Machine Learning Clustering and Classification of Network Deployment Scenarios in a Telecom Network setting

University essay from Linköpings universitet/Institutionen för datavetenskap

Abstract: Cellular network deployment scenarios refer to how cellular networks are implemented and deployed by network operators to provide wireless connectivity to end users. These scenarios can vary based on capacity requirements, type of geographical area, population density, and specific use cases. Radio Access Networks of different generations, such as 4G and 5G, may also have different deployments. Network deployment scenarios cover many aspects, but two major components are Configuration settings and Performance Measures, which refer to the network nodes, hardware build-up, and software settings, and to the end-user behavior and connectivity experience in the area covered by the wireless network, respectively.

In this master thesis, the aim is to understand how different area types, such as Rural, Suburban, and Urban, affect the cellular network deployment in such areas. A novel framework was developed to label each node (base station) with the area type it is associated with. The framework applies spatial analytics to the dataset provided by Ericsson for the LTE nodes working with 4G technology, in combination with open-source libraries and datasets, such as GeoPy and the H3 Kontur population dataset, respectively, to create area type labels. The area types are labeled based on the calculated population density served by each node and are considered true labels based on the manual sanity checks performed. A supervised machine learning model was used to predict the area type of each node from the CM and PM data, in order to understand the strength of the relationship between the features and the true labels. This thesis also includes analysis and insights about characteristic deployment scenarios under different area types. The main goal of this master thesis is to use machine learning to uncover the characteristic features of a variety of node groups inherent in a telecom network, which, in the long run, contributes to better service operation and optimization of existing cellular infrastructure.

Nodes (base stations) are labeled in the data so that their associated area type can be distinguished. In addition, clustering is performed to uncover the inherent characteristic behavior groups in the data and to compare them against the output from the classification model. Lastly, the potential impact of node placement, such as indoor or outdoor, on the corresponding features was investigated.

In conclusion, the study's results showed that a correlation exists between deployment scenarios and the different areas. There are a few prevalent common denominators between the node groups, such as Pathloss and NR Cell Relations, that drive the classification model to a better classification metric, the F1 score. Clustering of CM and PM data uncovers inherent patterns in different node groups under different area types and provides information about characteristic features of the groups: CM data displays two configuration setting clusters, and PM data shows three different user behavior patterns.
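
The labeling step described in the abstract, assigning an area type from the population density served by each node, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the thesis: the `Node` class, its fields, the example values, and both density thresholds are hypothetical, since the abstract does not say how the served population and area are computed or which cut-offs were applied.

```python
from dataclasses import dataclass

# Hypothetical cut-offs in inhabitants per square kilometre. The abstract does
# not state the thresholds actually used, so these values are assumptions.
RURAL_MAX_DENSITY = 300.0
URBAN_MIN_DENSITY = 1500.0


@dataclass
class Node:
    """A base station with the population and area it is assumed to serve."""
    node_id: str
    served_population: float  # e.g. summed from population grid cells near the node
    served_area_km2: float


def population_density(node: Node) -> float:
    """Return inhabitants per square kilometre for the area served by a node."""
    if node.served_area_km2 <= 0:
        raise ValueError("served_area_km2 must be positive")
    return node.served_population / node.served_area_km2


def label_area_type(density: float) -> str:
    """Map a population density to one of the three area types."""
    if density < RURAL_MAX_DENSITY:
        return "Rural"
    if density < URBAN_MIN_DENSITY:
        return "Suburban"
    return "Urban"


# Made-up example nodes, purely for illustration.
nodes = [
    Node("A", served_population=500, served_area_km2=10),
    Node("B", served_population=8000, served_area_km2=10),
    Node("C", served_population=40000, served_area_km2=10),
]

labels = {n.node_id: label_area_type(population_density(n)) for n in nodes}
print(labels)
```

Labels produced this way could then serve as the target variable for a supervised classifier trained on the CM and PM features, as the abstract outlines.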
