AI-driven admission control with Deep Reinforcement Learning

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: 5G is expected to provide a high-performance and highly efficient network to prominent industry verticals, with ubiquitous access to a wide range of services and orders-of-magnitude improvements over 4G. Network slicing, which allocates network resources according to users' specific requirements, is a key feature for fulfilling the diversity of requirements in 5G networks. However, network slicing also adds orchestration complexity and makes monitoring and admission control more difficult. Although the problem of admission control has been extensively studied, existing research takes measurements for granted. A fixed high monitoring frequency can waste system resources, while a low monitoring frequency (a low level of observability) can provide insufficient information for good admission control decisions. To achieve efficient admission control in 5G, we consider the impact of configurable observability, i.e. controlling the observed information by configuring the measurement frequency, worth investigating. Generally, we believe more measurements provide more information about the monitored system, thus enabling a capable decision-maker to make better decisions. However, more measurements also bring more monitoring overhead. To study the problem of configurable observability, we dynamically decide which measurements to monitor and at what frequencies in order to achieve efficient admission control. In the problem of admission control with configurable observability, the objective is to minimize monitoring overhead while maintaining enough information to make proper admission control decisions. In this thesis, we propose using a Deep Reinforcement Learning (DRL) method to achieve efficient admission control in a simulated 5G end-to-end network, including a core network, a radio access network, and four dynamic UEs. The proposed method is evaluated by comparing it with baseline methods on different performance metrics, and the results are then discussed. In experiments, the proposed method demonstrates the ability to learn from interaction with the simulated environment, achieves good admission control performance, and uses low measurement frequencies. After 11000 steps of learning, the proposed DRL agents generally achieve better performance than the threshold-based baseline agent, which makes admission decisions based on combined threshold conditions on RTT and throughput. Furthermore, the DRL agents that take non-zero measurement costs into consideration use much lower measurement frequencies than DRL agents that treat measurement costs as zero.
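
To make the setup concrete, the sketch below illustrates the two ideas named in the abstract: a threshold-based admission rule combining RTT and throughput conditions, and a per-step reward that penalizes the chosen measurement frequency. It is a minimal illustration only; the threshold values, function names, and cost weight (rtt_max, throughput_min, measurement_cost) are assumptions for exposition and are not taken from the thesis implementation.

```python
# Illustrative sketch: names, thresholds, and the reward weight are
# assumptions, not the agents evaluated in the thesis.

def threshold_admission(rtt_ms: float, throughput_mbps: float,
                        rtt_max: float = 50.0,
                        throughput_min: float = 10.0) -> bool:
    """Baseline policy: admit a request only if the observed RTT and
    throughput both satisfy their threshold conditions."""
    return rtt_ms <= rtt_max and throughput_mbps >= throughput_min


def step_reward(admission_ok: bool, measurement_freq_hz: float,
                measurement_cost: float = 0.1) -> float:
    """Reward-shaping idea for configurable observability: reward good
    admission outcomes, but subtract a cost proportional to how often
    the agent measures the network."""
    return (1.0 if admission_ok else -1.0) - measurement_cost * measurement_freq_hz


if __name__ == "__main__":
    # Example: a request observed with 30 ms RTT and 20 Mbit/s throughput,
    # monitored at 2 measurements per second.
    admit = threshold_admission(rtt_ms=30.0, throughput_mbps=20.0)
    print("admit:", admit, "reward:", step_reward(admit, measurement_freq_hz=2.0))
```

With a non-zero measurement_cost, an agent maximizing such a reward has an incentive to lower its measurement frequency whenever the extra observations do not improve its admission decisions, which mirrors the behaviour reported for the cost-aware DRL agents.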
