Scalable Reinforcement Learning for Linear-Quadratic Control of Networks

University essay from Lunds universitet/Institutionen för reglerteknik

Author: Johan Olsson (2023)

Keywords: Technology and Engineering

Abstract: Distributed optimal control is known to be challenging and can become intractable even for linear-quadratic regulator problems. In this work, we study a special class of such problems in which distributed state-feedback controllers can give near-optimal performance. More specifically, we consider networked linear-quadratic controllers with decoupled costs and spatially exponentially decaying dynamics. We aim to exploit the structure of the problem to design a scalable reinforcement learning algorithm for learning a distributed controller. Recent work has shown that the optimal controller can be well approximated using only information from a κ-neighbourhood of each agent. Motivated by this, we show that a similar property holds for the agents' individual value and action-value functions. We then design an algorithm, based on the actor-critic framework, that learns distributed controllers using only local information. Specifically, the action-value function is estimated by modifying the Least-Squares Temporal Difference for Q-functions (LSTD-Q) method to use only local information, and the policy is updated via gradient descent. Finally, the algorithm is evaluated in simulations, which suggest near-optimal performance.
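For concreteness, below is a minimal, centralized sketch (Python/NumPy) of the critic/actor split the abstract describes: an LSTD-Q critic that fits a quadratic Q-function from transition data, followed by a gradient step on a linear state-feedback gain. This is not the thesis's distributed algorithm; the κ-neighbourhood truncation and per-agent structure are omitted, a discounted formulation is assumed, and all dynamics matrices, dimensions, and step sizes are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 2                     # state and input dimensions (illustrative)
A = 0.9 * np.eye(n) + 0.05 * rng.standard_normal((n, n))   # stable-ish dynamics
B = rng.standard_normal((n, m))
Qc, Rc = np.eye(n), np.eye(m)   # decoupled quadratic costs
gamma = 0.95                    # discount factor (assumption)

def features(x, u):
    """Quadratic features: upper-triangular part of z z^T with z = [x; u]."""
    z = np.concatenate([x, u])
    zz = np.outer(z, z)
    idx = np.triu_indices(len(z))
    # Off-diagonal terms of z^T H z appear twice, so weight them by 2 here.
    w = np.where(idx[0] == idx[1], 1.0, 2.0)
    return w * zz[idx]

def lstdq(K, T=2000, sigma=0.5):
    """Estimate theta with Q(x, u) ~= features(x, u)^T theta under gain K."""
    d = (n + m) * (n + m + 1) // 2
    Amat, bvec = np.zeros((d, d)), np.zeros(d)
    x = rng.standard_normal(n)
    for _ in range(T):
        u = -K @ x + sigma * rng.standard_normal(m)   # exploratory action
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        phi = features(x, u)
        phi_next = features(x_next, -K @ x_next)      # on-policy follow-up action
        # LSTD-Q normal equations: phi (phi - gamma phi')^T theta = phi * cost
        Amat += np.outer(phi, phi - gamma * phi_next)
        bvec += phi * cost
        x = x_next
    return np.linalg.lstsq(Amat, bvec, rcond=None)[0]

def unpack_H(theta):
    """Rebuild the symmetric Q-function matrix H from the feature weights."""
    d = n + m
    Hu = np.zeros((d, d))
    Hu[np.triu_indices(d)] = theta
    return Hu + Hu.T - np.diag(np.diag(Hu))

K = np.zeros((m, n))            # initial stabilising gain
for it in range(20):
    H = unpack_H(lstdq(K))
    Huu, Hux = H[n:, n:], H[n:, :n]
    # Actor: gradient step on K; the gradient of Q(x, -Kx) w.r.t. K is
    # proportional to (Huu K - Hux), dropping the state covariance factor.
    K = K - 0.05 * (Huu @ K - Hux)
    print(f"iter {it}: ||K|| = {np.linalg.norm(K):.3f}")

The exact policy-improvement step would set K to the minimizer inv(Huu) @ Hux; the small gradient step above instead mirrors the gradient-descent update named in the abstract.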
