Training Set Size for Skin Cancer Classification Using Google’s Inception v3

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Author: Patric Ridell; Henning Spett; [2017]


Abstract: Today, computer-aided diagnosis (CADx) is a common occurrence in hospitals. With image recognition, computers are able to detect signs of breast cancer and different kinds of lung diseases. For a convolutional neural network (CNN) that classifies images, accuracy depends on the amount of data it is trained on, and the network performs better as the amount of training data increases. This introduces a need for relevant images for the classes the classifier is supposed to differentiate between. However, as the amount of input data increases, so does the computational cost, leading to a trade-off between accuracy and computational time. In a study by Cho et al., the improvement in accuracy stagnated as the amount of training data grew. This creates interest in finding that point of stagnation, since a further increase in input data would lead to longer computational time but have little effect on accuracy. In this study, the pre-trained CNN Google Inception v3 is retrained with various amounts of skin lesion images. The objective is to detect whether an image represents a benign nevus or a malignant melanoma. Comparing the accuracy of these training sessions, it is concluded that accuracy increases when the network is trained with more data. However, a stagnation point for the accuracy is not found.
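
The search for a stagnation point described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' method: the function name `find_stagnation_point`, the `min_gain` threshold, and the rule "all later accuracy gains stay below the threshold" are assumptions made here for the example. Measured (training set size, accuracy) pairs from the retraining runs would have to be supplied by the reader.

```python
def find_stagnation_point(sizes, accuracies, min_gain=0.01):
    """Return the smallest training-set size after which every further
    increase in size improves accuracy by less than min_gain.

    Returns None when no such size exists, for example when accuracy
    is still rising at the largest size tested.
    min_gain is a hypothetical threshold chosen for illustration.
    """
    if len(sizes) != len(accuracies):
        raise ValueError("sizes and accuracies must have the same length")

    # Order the runs by training-set size.
    runs = sorted(zip(sizes, accuracies))

    for i in range(len(runs) - 1):
        # Accuracy gain of each later step, starting from run i.
        gains = [runs[j + 1][1] - runs[j][1] for j in range(i, len(runs) - 1)]
        if all(gain < min_gain for gain in gains):
            return runs[i][0]
    return None
```

Under this definition, a series of runs whose accuracy keeps rising by more than `min_gain` at every step yields `None`, which corresponds to the situation the abstract reports: accuracy improves with more data, but no stagnation point is found within the tested range.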
