Unsupervised Feature Extraction from CT Images for Clustering of Geological Drill Core Samples

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Miquel Sven Larsson Corominas; [2019]

Abstract: Computed tomography (CT) scans of drill cores provide a high-resolution view of the internal structure and composition of the rock, which is valuable for many kinds of analysis. However, this data is very high-dimensional and difficult to analyze automatically. This work studies how to reduce the dimensionality of these samples, with the objective of finding low-dimensional representations that can then be clustered into distinct, geologically meaningful groups. Due to the complex nature of the data, extensive preprocessing is required, involving thresholding, normalization, cropping, etc. First, to obtain a baseline for the clusters, the chemical compositions of the samples are clustered. Because of the continuous nature of the data, this yields overly simplistic clusters that only separate ore from the rest. For the CT data, two approaches are tested, IPCA and convolutional autoencoders, both of which successfully reduce the dimensionality of the scanned samples. For the latter, different bottleneck dimensions are tested in order to evaluate their effect on the resulting reconstruction errors. Nevertheless, when attempting to cluster the low-dimensional embeddings, the algorithms only manage to separate the ore from the rest of the samples, as in the chemical clustering, which is too simplistic. An alternative approach is tested to gain insight into the drill holes: using UMAP 3D projections as RGB color coordinates, which yields colored maps of the holes that make more geological sense and provide more information than the previous approach. Finally, an experiment is performed by creating eight distinct classes of synthetic volumetric data with different textures and grain sizes, which resemble rock material, in order to validate the approach of clustering the convolutional autoencoder latent representations. For a sufficient number of channels, all synthetic classes can be clearly separated. Interestingly, latent representations of classes with poor reconstructions can still be clustered satisfactorily.
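
The idea of using a 3D projection as RGB color coordinates can be illustrated with a minimal sketch. It is not taken from the essay: the function name `embedding_to_rgb`, the per-axis min-max scaling, and the random array standing in for a real UMAP projection are all assumptions made for illustration.

```python
import numpy as np


def embedding_to_rgb(embedding):
    """Scale each axis of a 3D embedding to [0, 1] so it can serve as R, G, B.

    Hypothetical helper: per-axis min-max scaling is an assumed choice,
    not necessarily the normalization used in the essay.
    """
    embedding = np.asarray(embedding, dtype=float)
    if embedding.ndim != 2 or embedding.shape[1] != 3:
        raise ValueError("expected an array of shape (n_samples, 3)")
    lo = embedding.min(axis=0)
    span = embedding.max(axis=0) - lo
    span[span == 0] = 1.0  # a constant axis maps to 0 instead of dividing by zero
    return (embedding - lo) / span


# Placeholder data: in practice this would be the 3D projection of the
# samples of one drill hole, ordered by depth (random values assumed here).
rng = np.random.default_rng(0)
projection = rng.normal(size=(200, 3))

colors = embedding_to_rgb(projection)  # one RGB triple per sample
```

Each row of `colors` can then be drawn as a colored band at the depth of its sample, so that samples lying close together in the projection receive similar colors along the hole.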
