Essays about: "TD Learning"
Found 5 essays containing the words TD Learning.
-
1. On the notions and predictability of Technical Debt
University essay from Linnéuniversitetet/Institutionen för datavetenskap och medieteknik (DM)
Abstract: Technical debt (TD) is a by-product of short-term optimisation that results in long-term disadvantages. Because every system gets more complicated while it is evolving, technical debt can emerge naturally.
-
2. SELECTION OF FEATURES FOR ML BASED COMMANDING OF AUTONOMOUS VEHICLES
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Traffic coordination is an essential challenge in vehicle automation. The challenge is not only about maximizing the revenue/productivity of a fleet of vehicles, but also about avoiding non-feasible states such as collisions and low energy levels, which could make the fleet inoperable.
-
3. Robust Reinforcement Learning in Continuous Action/State Space
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: In this project we aim to apply Robust Reinforcement Learning algorithms, presented by Doya and Morimoto [1],[2], to control problems. Specifically, we train an agent to balance a pendulum in the unstable equilibrium, which is the inverted state. We investigate the performance of controllers based on two different function approximators.
-
4. APPLYING MACHINE LEARNING ALGORITHMS TO DETECT LINES OF CODE CONTRIBUTING TO TECHNICAL DEBT
University essay from Göteborgs universitet/Institutionen för data- och informationsteknik
Abstract: This paper shows the investigation of the viability of finding lines of code (LOC) contributing to technical debt (TD) using machine learning (ML), by trying to imitate the static code analysis tool SonarQube. This is approached by letting industry professionals choose the SonarQube rules, followed by training different classifiers with the help of CCFlex (a tool for training classifiers with lines of code), while using SonarQube as an oracle (a source of training sample data) which selects the faulty lines of code.
-
5. Reinforcement Learning – Intelligent Weighting of Monte Carlo and Temporal Differences
University essay from Lunds universitet/Institutionen för reglerteknik
Abstract: In reinforcement learning, the updating of the value functions determines the information spreading across the state/state-action space, which condenses the value-based control policy. It is important to have an information propagation across the value domain in a manner that is effective.