Improving Knowledge of Truck Fuel Consumption Using Data Analysis

University essay from Linköpings universitet/Reglerteknik; Linköpings universitet/Logistik- och kvalitetsutveckling


The large potential of big data and how it has brought value into various industries have been established in research. Since big data has such large potential if handled and analyzed in the right way, revealing information to support decision making in an organization, this thesis is conducted as a case study at an automotive manufacturer with access to large amounts of customer usage data of their vehicles. The reason for performing an analysis of this kind of data is based on the cornerstones of Total Quality Management with the end objective of increasing customer satisfaction of the concerned products or services.

The case study includes a data analysis exploring how and if patterns about what affects fuel consumption can be revealed from aggregated customer usage data of trucks linked to truck applications. Based on the case study, conclusions are drawn about how a company can use this type of analysis as well as how to handle the data in order to turn it into business value.

The data analysis reveals properties describing truck usage using Factor Analysis and Principal Component Analysis. Especially one property is concluded to be important as it appears in the result of both techniques. Based on these properties the trucks are clustered using k-means and Hierarchical Clustering which shows groups of trucks where the importance of the properties varies. Due to the homogeneity and complexity of the chosen data, the clusters of trucks cannot be linked to truck applications. This would require data that is more easily interpretable. Finally, the importance for fuel consumption in the clusters is explored using model estimation. A comparison of Principal Component Regression (PCR) and the two regularization techniques Lasso and Elastic Net is made. PCR results in poor models difficult to evaluate. The two regularization techniques however outperform PCR, both giving a higher and very similar explained variance. The three techniques do not show obvious similarities in the models and no conclusions can therefore be drawn concerning what is important for fuel consumption.

During the data analysis many problems with the data are discovered, which are linked to managerial and technical issues of big data. This leads to for example that some of the parameters interesting for the analysis cannot be used and this is likely to have an impact on the inability to get unanimous results in the model estimations. It is also concluded that the data was not originally intended for this type of analysis of large populations, but rather for testing and engineering purposes.

Nevertheless, this type of data still contains valuable information and can be used if managed in the right way. From the case study it can be concluded that in order to use the data for more advanced analysis a big-data plan is needed at a strategic level in the organization. The plan summarizes the suggested solution for the managerial issues of the big data for the organization. This plan describes how to handle the data, how the analytic models revealing the information should be designed and the tools and organizational capabilities needed to support the people using the information.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)