Bayesian Off-policy Sim-to-Real Transfer for Antenna Tilt Optimization

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Choosing the correct electrical tilt angle for a radio base station antenna is essential when optimizing for coverage and capacity. A reinforcement learning agent can be trained to make this choice. If training the agent in the real world is restricted or even impossible, alternative methods can be used. One option is to train in simulation using an approximation of the real world, which comes with a set of challenges associated with the reality gap. In this thesis, a method based on Bayesian optimization is implemented to tune the environment in which domain randomization is performed, with the aim of improving the quality of the simulation training. Because access to the real world is constrained, candidate settings are evaluated with two off-policy estimators, one based on inverse propensity scoring and one on direct method evaluation, in combination with an offline dataset of previously collected cell traces. The results show that using Bayesian optimization to find a good subset of parameters works even under this constraint. The method manages to find an isolated subspace of the whole domain that optimizes the randomization while still giving good performance in the target domain.
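As an illustration of the two off-policy estimators named in the abstract, the following is a minimal sketch in Python, assuming a contextual-bandit view of the logged data. It is not the thesis implementation: the function names, the array layout, and the toy numbers are all hypothetical. Inverse propensity scoring reweights each logged reward by the ratio of target-policy to logging-policy probability for the action that was taken, while the direct method averages the rewards predicted by a learned reward model under the target policy.

```python
import numpy as np


def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring (IPS) value estimate.

    rewards:       logged reward for each sample, shape (n,)
    logging_probs: probability the logging policy gave to the logged action, shape (n,)
    target_probs:  probability the target policy gives to the same action, shape (n,)
    """
    rewards = np.asarray(rewards, dtype=float)
    logging_probs = np.asarray(logging_probs, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    if np.any(logging_probs <= 0.0):
        raise ValueError("logging propensities must be strictly positive")
    weights = target_probs / logging_probs  # importance weights
    return float(np.mean(weights * rewards))


def dm_estimate(predicted_rewards, target_policy):
    """Direct method (DM) value estimate.

    predicted_rewards: reward-model prediction per sample and action, shape (n, k)
    target_policy:     target-policy action probabilities per sample, shape (n, k)
    """
    predicted_rewards = np.asarray(predicted_rewards, dtype=float)
    target_policy = np.asarray(target_policy, dtype=float)
    # Expected predicted reward under the target policy, averaged over samples.
    return float(np.mean(np.sum(predicted_rewards * target_policy, axis=1)))


# Toy data (hypothetical): three tilt actions, e.g. down-tilt, keep, up-tilt.
logged_rewards = [1.0, 0.0, 2.0, 1.0]
logging_probs = [0.5, 0.5, 0.5, 0.5]
target_probs = [1.0, 0.0, 0.5, 0.5]
ips_value = ips_estimate(logged_rewards, logging_probs, target_probs)

predicted = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
policy = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
dm_value = dm_estimate(predicted, policy)

print(ips_value, dm_value)
```

In a setup like the one the abstract describes, an estimate of this kind could serve as the objective that Bayesian optimization evaluates for each candidate randomization range, since it scores a simulation-trained policy against logged cell traces without deploying it.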
