Essays about: "Reinforcement Learning from Human Feedback"

Found 3 essays containing the words Reinforcement Learning from Human Feedback.

  1. Comparison Between RLHF and RLAIF in Fine-Tuning a Large Language Model

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Samuel Höglund; Josef Khedri; [2023]
    Keywords : ;

    Abstract : This article examines the advantages, disadvantages, and differences between reinforcement learning from human feedback (RLHF) and reinforcement learning from AI feedback (RLAIF) in the context of fine-tuning a large language model. RLHF has typically been used to align language models with human preferences by incorporating human feedback, whereas RLAIF proposes using an AI-based method to replace human feedback.

  2. Fine-tuning a LLM using Reinforcement Learning from Human Feedback for a Therapy Chatbot Application

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Desirée Bill; Theodor Eriksson; [2023]
    Keywords : Ethics; Fine-tuning; Large Language Models; Machine learning; Psychology; Reinforcement Learning from Human Feedback;

    Abstract : The field of AI and machine learning has seen exponential growth over the last decade, and even more so in the past year, with considerable public interest in large language models (LLMs) such as ChatGPT. LLMs can be used for several purposes, but one possible application is fine-tuning a model to perform a particular function in a specific field.

  3. Machine Learning Technique for Uplink Link Adaptation in 5G NR RAN at Millimeter Wave Frequencies

    University essay from Lunds universitet/Institutionen för elektro- och informationsteknik

    Author : Hazem Mohamed Abdelwahab Ibrahim Elgabroun; [2019]
    Keywords : Machine Learning; Link Adaptation; Reinforcement Learning; Technology and Engineering;

    Abstract : The demands on wireless communications are continuously growing because, when higher network capabilities are delivered, new features and applications are created, calling for even higher requirements. To keep pace with these demands and to allow new applications to emerge, the limits of mobile networks must be pushed regularly.