Distributed databases for Multi Mediation: Scalability, Availability & Performance

University essay from Blekinge Tekniska Högskola/Institutionen för kommunikationssystem

Abstract: Context: Multi Mediation is the process of collecting data from networks and network elements, pre-processing this data, and distributing it to various systems such as Big Data analysis, Billing Systems, Network Monitoring Systems, and Service Assurance. With the growing demand for networks and the emergence of new services, the volume of data collected from networks is growing. This data needs to be organized efficiently, which can be done using databases. Although RDBMSs offer scale-up solutions to handle voluminous data and concurrent requests, this approach is expensive, so alternatives such as distributed databases are an attractive solution. A suitable distributed database for Multi Mediation needs to be investigated. Objectives: In this research we analyze two distributed databases, MySQL Cluster 7.4.4 and Apache Cassandra 2.0.13, in terms of performance, scalability and availability. The inter-relations between the performance, scalability and availability of distributed databases are also analyzed. Performance, scalability and availability are quantified, and measurements are made in the context of a Multi Mediation system. Methods: The methods used to carry out this research are both qualitative and quantitative. A qualitative study is made to select the databases for evaluation. A benchmarking harness application is designed to quantitatively evaluate the performance of a distributed database in the context of Multi Mediation. Several experiments are designed and performed on the database cluster using the benchmarking harness. Results: The results collected include the average response time and average throughput of the distributed databases in various scenarios. The average throughput and average INSERT response time results favor the Apache Cassandra low-availability configuration. MySQL Cluster's average SELECT response time is better than Apache Cassandra's for larger numbers of client threads, in both high-availability and low-availability configurations. Conclusions: Although Apache Cassandra outperforms MySQL Cluster, support for transactions and ACID compliance should not be overlooked when selecting a database. Beyond the contextual benchmarks, factors such as organizational choices, development costs, and resource utilization are more influential in selecting a database within an organization. There is still a need for further evaluation of distributed databases.
