Essays about: "förstärkningsinlärning" (reinforcement learning)
Showing results 16-20 of 94 essays containing the word förstärkningsinlärning.
-
16. Using Reinforcement Learning to Correct Soft Errors of Deep Neural Networks
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Deep Neural Networks (DNNs) are becoming increasingly important in various aspects of human life, particularly in safety-critical areas such as autonomous driving and aerospace systems. However, soft errors including bit-flips can significantly impact the performance of these systems, leading to serious consequences.
-
17. Automated Image Pre-Processing for Optimized Text Extraction Using Reinforcement Learning and Genetic Algorithms
University essay
Abstract: This project aims to develop an automated image pre-processing chain to extract valuable information from appliance labels before recycling. The primary goal is to improve optical character recognition accuracy by addressing noise issues using reinforcement learning and an evolutionary algorithm.
-
18. Data Harvesting and Path Planning in UAV-aided Internet-of-Things Wireless Networks with Reinforcement Learning : KTH Thesis Report
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: In recent years, unmanned aerial vehicles (UAVs) have developed rapidly due to advances in aerospace technology and wireless communication systems. As a result of their versatility, cost-effectiveness, and flexibility of deployment, UAVs have been developed to accomplish a variety of large and complex tasks without terrain restrictions, such as battlefield operations, search and rescue under disaster conditions, and monitoring.
-
19. Tackling Non-Stationarity in Reinforcement Learning via Latent Representation : An application to Intraday Foreign Exchange Trading
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Reinforcement Learning has applications in various domains, but the typical assumption is of a stationary process. Hence, when this hypothesis does not hold, performance may be sub-optimal.
-
20. Comparison Between RLHF and RLAIF in Fine-Tuning a Large Language Model
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: This paper examines the advantages, disadvantages, and differences between reinforcement learning from human feedback (RLHF) and reinforcement learning from AI feedback (RLAIF) in the context of fine-tuning a large language model. RLHF has typically been used to align language models with human preferences by incorporating human feedback, whereas RLAIF proposes using an AI-based method to replace human feedback.