Essays about: "vision transformer"
Showing results 1-5 of 39 essays containing the words vision transformer.
1. Where to Fuse
University essay from Lunds universitet/Matematisk statistik
Abstract: This thesis investigates fusion techniques in multimodal transformer models, focusing on enhancing the capabilities of large language models in understanding not just text, but also other modalities like images, audio, and sensor data. The study compares late fusion (concatenating modality tokens after separate encoding) and early fusion (concatenating before encoding) techniques, examining their respective advantages and disadvantages.
2. Analyzing the Influence of Synthetic and Augmented Data on Segmentation Model
University essay from Luleå tekniska universitet/Institutionen för system- och rymdteknik
Abstract: The field of Artificial Intelligence (AI) has experienced unprecedented growth in recent years, thanks to the numerous applications related to speech recognition, natural language processing, and computer vision. However, one of the challenges facing AI is the requirement for large amounts of energy, time, and data to be effective and accurate.
3. Few-Shot Learning for Quality Inspection
University essay from Högskolan i Halmstad/Akademin för informationsteknologi
Abstract: The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models.
4. Convolution-compacted vision transformers for prediction of local wall heat flux at multiple Prandtl numbers in turbulent channel flow
University essay from KTH/Skolan för teknikvetenskap (SCI)
Abstract: Predicting wall heat flux accurately in wall-bounded turbulent flows is critical for a variety of engineering applications, including thermal management systems and energy-efficient designs. Traditional methods, which rely on expensive numerical simulations, are hampered by increasing complexity and extremely high computational cost.
5. Evaluation of deep learning methods for industrial automation
University essay from Umeå universitet/Institutionen för datavetenskap
Abstract: The rise and adaptation of the transformer architecture from natural language processing to visual tasks have proven a useful and powerful tool. Subsequent architectures such as vision transformers (ViT) and shifted window (Swin) transformers have proven comparable to, and oftentimes exceed, convolutional neural networks (CNNs) in terms of accuracy.