Essays about: "Visual Transformer"

Showing results 1 - 5 of 10 essays containing the words Visual Transformer.

  1. Where to Fuse

    University essay from Lunds universitet/Matematisk statistik

    Author : Lukas Petersson; [2024]
    Keywords : Technology and Engineering;

    Abstract : This thesis investigates fusion techniques in multimodal transformer models, focusing on enhancing the capabilities of large language models in understanding not just text, but also other modalities like images, audio, and sensor data. The study compares late fusion (concatenating modality tokens after separate encoding) and early fusion (concatenating before encoding) techniques, examining their respective advantages and disadvantages.

  2. Evaluation of deep learning methods for industrial automation

    University essay from Umeå universitet/Institutionen för datavetenskap

    Author : Ragnar Onning; [2023]
    Keywords : artificial intelligence; machine learning; deep learning; cnn; transformer; swin; swin transformer;

    Abstract : The adaptation of the transformer architecture from natural language processing to visual tasks has proven to be a useful and powerful tool. Subsequent architectures such as visual transformers (ViT) and shifted window (Swin) transformers have proven comparable to, and often more accurate than, convolutional neural networks (CNNs).

  3. Visual Bird's-Eye View Object Detection for Autonomous Driving

    University essay from Linköpings universitet/Datorseende

    Author : Erik Lidman; [2023]
    Keywords : computer vision; machine learning; neural networks; deep learning; autonomous driving; autonomous systems; datorseende; maskininlärning; neuronnät; djupinlärning; autonom körning; autonoma system;

    Abstract : In the field of autonomous driving, a common scenario is to apply deep learning models on camera feeds to provide information about the surroundings. A recent trend is for such vision-based methods to be centralized, in that they fuse images from all cameras in one big model for a single comprehensive output.

  4. Large-scale Exploratory Text Visualisation

    University essay from Linköpings universitet/Medie- och Informationsteknik; Linköpings universitet/Tekniska fakulteten

    Author : Wilma Axelsson; Nellie Engström; [2023]
    Keywords : Natural language processing; information visualisation; text visualisation; Swedish news articles; dynamic topic modeling; hierarchical topic modeling; BERTopic;

    Abstract : The amount of available text data has increased rapidly in recent years, making it difficult for an everyday user to find relevant information. To address this, NLP and visualisation methods have been developed for extracting valuable information from text and presenting it to the user.

  5. Handwritten Text Recognition Using a Vision Transformer

    University essay from Uppsala universitet/Institutionen för informationsteknologi

    Author : Jonathan Kurén; Martin Sundberg; [2023]
    Keywords : Handwritten text recognition; Vision transformer; machine learning; image analysis; neural network;

    Abstract : The aim of this project is to create a method for offline handwritten text recognition using a vision transformer. It consists of two parts: the first segments all words in a document into separate images, and the second recognizes the word in each image.