Improving Soil Information with Generative and Machine Learning Models

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: Soil observations are among the most difficult data to collect. Low sample density, together with the high cost of sampling, has made current soil information, usually presented as maps, unusable for detailed applications such as modelling earth system dynamics, crop modelling, predicting natural hazards and assessing climate change impacts. In this research, a new method for augmenting already available data was used to improve the accuracy of current soil maps across Sri Lanka. Soil profile datasets were collected from custodians of national soil profile data and combined with data mined from published sources. In addition, two extra datasets were developed to augment the original sample. The first was a series of pseudo-samples generated by analysing over 600 images obtained from the NASA MODIS MCD34A4 product to locate sand dunes, which have a high percentage of sand and low organic carbon. The second was created by training and validating a spatial generative adversarial neural network (SpaceGAN) that can mimic the original distribution of soil properties. The augmented datasets were used to predict the values of soil elements at unsampled locations with machine learning models. The performance of the successful models was then compared across different levels of data augmentation. Results showed that data augmentation, particularly with data generated by SpaceGAN, can enhance the knowledge of some soil elements and could be explored as a viable option for improving the overall accuracy of soil information.
