Essays about: "Visual Transformer"
Showing results 1-5 of 10 essays containing the words Visual Transformer.
-
1. Where to Fuse
University essay from Lunds universitet/Matematisk statistik
Abstract: This thesis investigates fusion techniques in multimodal transformer models, focusing on enhancing the capabilities of large language models in understanding not just text, but also other modalities like images, audio, and sensor data. The study compares late fusion (concatenating modality tokens after separate encoding) and early fusion (concatenating before encoding) techniques, examining their respective advantages and disadvantages.
-
2. Evaluation of deep learning methods for industrial automation
University essay from Umeå universitet/Institutionen för datavetenskap
Abstract: The adaptation of the transformer architecture from natural language processing to visual tasks has proven a useful and powerful tool. Subsequent architectures such as visual transformers (ViT) and shifted window (Swin) transformers have proven comparable to, and often more accurate than, convolutional neural networks (CNNs).
-
3. Visual Bird's-Eye View Object Detection for Autonomous Driving
University essay from Linköpings universitet/Datorseende
Abstract: In the field of autonomous driving, a common scenario is to apply deep learning models on camera feeds to provide information about the surroundings. A recent trend is for such vision-based methods to be centralized, in that they fuse images from all cameras in one big model for a single comprehensive output.
-
4. Large-scale Exploratory Text Visualisation
University essay from Linköpings universitet/Medie- och Informationsteknik; Linköpings universitet/Tekniska fakulteten
Abstract: The amount of available text data has increased rapidly in recent years, making it difficult for an everyday user to find relevant information. To solve this, NLP and visualisation methods have been developed for extracting valuable information from text and presenting it to the user.
-
5. Handwritten Text Recognition Using a Vision Transformer
University essay from Uppsala universitet/Institutionen för informationsteknologi
Abstract: The aim of this project is to create a method for offline handwritten text recognition using a vision transformer. It consists of two parts: the first segments all words in a document into separate images, and the second recognizes the word in each image.