Essays about: "Visual Transformers"
Showing results 1 - 5 of 11 essays containing the words Visual Transformers.
-
1. Using Machine Learning to Optimize Near-Earth Object Sighting Data at the Golden Ears Observatory
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: This research project focuses on improving Near-Earth Object (NEO) detection using advanced machine learning techniques, particularly Vision Transformers (ViTs). The study addresses challenges such as noise, limited data, and class imbalance.
-
2. Evaluation of deep learning methods for industrial automation
University essay from Umeå universitet/Institutionen för datavetenskap
Abstract: The adaptation of the transformer architecture from natural language processing to visual tasks has proven a useful and powerful tool. Subsequent architectures such as visual transformers (ViT) and shifting window (SWIN) transformers have proven comparable to, and often better than, convolutional neural networks (CNNs) in terms of accuracy.
-
3. AATrackT: A deep learning network using attentions for tracking fast-moving and tiny objects : (A)ttention (A)ugmented - (Track)ing on (T)iny objects
University essay from Jönköping University/JTH, Avdelningen för datavetenskap
Abstract: Recent advances in deep learning have made it possible to visually track objects in a video sequence. Moreover, as transformers were introduced in computer vision, new state-of-the-art performance was achieved in visual tracking.
-
4. Video Retargeting using Vision Transformers : Utilizing deep learning for video aspect ratio change
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: The diversity of video material, where a video is shot and produced using a single aspect ratio, and the variety of devices that can play video with screens in different aspect ratios make video retargeting a relevant topic. The process of fitting a video filmed in one aspect ratio to a screen in another aspect ratio is called video retargeting, and the retargeted video should ideally preserve the important content and structure of the original video as well as be free of visual artifacts.
-
5. News article segmentation using multimodal input : Using Mask R-CNN and sentence transformers
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: In this century and the last, serious efforts have been made to digitize the content housed by libraries across the world. In order to open up these volumes to content-based information retrieval, independent elements such as headlines, body text, bylines, images and captions ideally need to be connected semantically as article-level units.