Characterizing Video Compression Using Convolutional Neural Networks

University essay from Luleå tekniska universitet/Datavetenskap

Abstract: Can the compression parameters used in video encoding be estimated from the visual information of the compressed video alone? If so, these parameters could improve existing parametric video quality estimation models. Today, parametric models use information such as bitrate to estimate the quality of a given video. This approach is inaccurate because it does not account for the coding complexity of the video. The constant rate factor (CRF) parameter for H.264 encoding aims to keep quality constant while varying the bitrate. If the CRF of a video is known together with its bitrate, a better quality estimate could potentially be achieved. In recent years, artificial neural networks, and convolutional neural networks in particular, have shown great promise in the field of image processing. In this thesis, convolutional neural networks are investigated as a way of estimating the CRF of a degraded video by identifying the compression artifacts and their relation to the CRF used. Using ResNet, a model for estimating the CRF of each frame of a video is derived. These per-frame predictions are then fed to a video classification model, which produces a single CRF prediction for the whole video. The results show that it is possible to find a relation between the visual encoding artifacts and the CRF used. The model achieves a top-5 accuracy of 61.9% with limited training data. Given that today's parametric, bitrate-based quality models have no information about coding complexity, even a rough estimate of the CRF could improve their precision.
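To make the two-stage idea and the top-5 metric concrete, the following is a minimal sketch, not the thesis implementation. It assumes each frame yields a probability distribution over CRF classes, and it replaces the thesis's video classification model with a plain average over frames, which is a simpler stand-in. All function names here are hypothetical.

```python
from typing import List, Sequence


def aggregate_frame_probs(frame_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame class probabilities into one video-level distribution.

    Assumption: simple averaging stands in for the learned video-level
    classifier described in the abstract.
    """
    if not frame_probs:
        raise ValueError("need at least one frame")
    n_classes = len(frame_probs[0])
    totals = [0.0] * n_classes
    for probs in frame_probs:
        if len(probs) != n_classes:
            raise ValueError("all frames must have the same number of classes")
        for i, p in enumerate(probs):
            totals[i] += p
    return [t / len(frame_probs) for t in totals]


def top_k_classes(probs: Sequence[float], k: int = 5) -> List[int]:
    """Return the indices of the k most probable classes, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def top_k_accuracy(
    video_probs: Sequence[Sequence[float]],
    true_classes: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of videos whose true class is among the k most probable."""
    if len(video_probs) != len(true_classes):
        raise ValueError("need one true class per video")
    hits = sum(
        1
        for probs, truth in zip(video_probs, true_classes)
        if truth in top_k_classes(probs, k)
    )
    return hits / len(true_classes)


if __name__ == "__main__":
    # Three frames, three hypothetical CRF classes (toy numbers, not thesis data).
    frames = [[0.25, 0.5, 0.25], [0.25, 0.25, 0.5], [0.0, 0.25, 0.75]]
    video = aggregate_frame_probs(frames)
    print(top_k_classes(video, k=2))
```

Under this metric, a prediction counts as correct when the true CRF class is among the five most probable classes, so a top-5 figure says the model narrows the CRF down to a short list rather than pinning it exactly.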
