Essays about: "markov decision process"

Showing results 1-5 of 47 essays containing the words markov decision process.

  1. Risk-Averse Multi-Armed Bandit Problem with Multiple Plays

    University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

    Author : Siri Dahlgren; Nicholas Marriott; [2023-10-23]
    Keywords : MAB; Gittins; Markovian bandit; risk-aversion; policy iteration; multiple plays;

    Abstract : This study aims to construct an efficient heuristic, referred to as RA, for a risk-averse Markovian multi-armed bandit problem (MAB) with multiple plays. The RA incorporates risk-aversion and multiple plays by modifying the Gittins index strategy.

  2. Optimal Order Placement Using Markov Models of Limit Order Books

    University essay from KTH/Matematik (Avd.)

    Author : Max Oliveberg; [2023]
    Keywords : Optimal order placement; Limit order book; Markov; Optimal orderläggning; Orderbok; Markov;

    Abstract : We study optimal order placement in a limit order book. By modelling the limit order book dynamics as a Markov chain, we can frame the purchase of a single share as a Markov Decision Process. Within the framework of the model, we can estimate optimal decision policies numerically. The trade rate is varied using a running cost control variable.

  3. Decreasing Training Time of Reinforcement Learning Agents for Remote Tilt Optimization using a Surrogate Neural Network Approximator

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Jiaming Huang; [2023]

    Abstract : One possible application of reinforcement learning in the telecommunication field is antenna tilt optimization. However, one of the key challenges is that training reinforcement learning agents is often time-consuming when handcrafted simulators serve as the environments that provide information to the agents.

  4. S-MARL: An Algorithm for Single-To-Multi-Agent Reinforcement Learning : Case Study: Formula 1 Race Strategies

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Marinaro Davide; [2023]
    Keywords : Reinforcement Learning; Single-to-Multi-Agent; Learning Stability; Exploration-Exploitation trade-off; Race Strategy Optimization; Förstärkningsinlärning; Från en till flera agenter; Stabilitet vid inlärning; Utforskning-exploatering; Optimering av tävlingsstrategier;

    Abstract : A Multi-Agent System is a group of autonomous, intelligent, interacting agents sharing an environment that they observe through sensors and act upon with actuators. The behaviors of these agents can be either defined upfront by programmers or learned by trial and error through Reinforcement Learning.

  5. Random Edge is not faster than Random Facet on Linear Programs

    University essay from KTH/Matematik (Avd.)

    Author : Nicole Hedblom; [2023]
    Keywords : Simplex method; simplex; Random Edge; Linear Programming; Random Facet; randomized pivoting rule; Markov decision process; Simplexmetoden; Random Edge; linjärprogrammering; Random Facet; Markov-beslutsprocess;

    Abstract : A Linear Program is a problem where the goal is to maximize a linear function subject to a set of linear inequalities. Geometrically, this can be rephrased as finding the highest point on a polyhedron. The Simplex method is a commonly used algorithm to solve Linear Programs.