Essays about: "thesis in reward systems"
Showing results 1 - 5 of 41 essays containing the words thesis in reward systems.
-
1. Scalable Reinforcement Learning for Formation Control with Collision Avoidance : Localized policy gradient algorithm with continuous state and action space
University essay from KTH/Skolan för teknikvetenskap (SCI); KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: In the last decades, significant theoretical advances have been made in the field of distributed multi-agent control theory. Among the most common systems that can be modelled as multi-agent systems are the so-called formation control problems, in which a network of mobile agents is controlled to move towards a desired final formation.
-
2. Challenges and opportunities of blockchain-based system for energy communities
University essay from Luleå tekniska universitet/Datavetenskap
Abstract: Blockchain technology has attracted a lot of interest recently. Many sectors around the world are now researching the technology to find prospective uses that could improve their businesses.
-
3. Deep Reinforcement Learning and Simulation for the Optimization of Production Systems
University essay from Uppsala universitet/Institutionen för informationsteknologi
Abstract: The main objective of this master's thesis project is to use deep reinforcement learning (DRL) and simulation to optimize production systems. In this project, the Deep Q-Network (DQN) algorithm is first used to optimize seven decision variables in Averill Law’s production system to find the best profit, with 99.
-
4. Investigating Multi-Objective Reinforcement Learning for Combinatorial Optimization and Scheduling Problems : Feature Identification for multi-objective Reinforcement Learning models
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Reinforcement Learning (RL) has in recent years become a core method for sequential decision making in complex dynamical systems, and it is of great interest for improving solutions to scheduling problems. This could prove important in areas of the newer generation of cellular networks.
-
5. Graph Bandits : Multi-Armed Bandits with Locality Constraints
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Multi-armed bandits (MABs) have been studied extensively in the literature and have applications in a wealth of domains, including recommendation systems, dynamic pricing, and investment management. On the one hand, the current MAB literature largely seems to focus on the setting where each arm is available to play at each time step, and ignores how agents move between the arms.