Feature Selection for Microarray Data via Stochastic Approximation

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: This thesis explores the challenge of feature selection (FS) in machine learning, which involves reducing the dimensionality of data. Selecting a relevant subset of features from a larger pool has proven effective in enhancing the performance of various machine learning algorithms. By reducing noise, improving model interpretability, and minimizing computational costs, FS plays a crucial role in optimizing algorithm outcomes. This research specifically focuses on FS for microarray data, which frequently involves large, high-dimensional datasets. Microarray data is widely used in biological research. Filter-based methods, which rank features by their statistical properties, are commonly used to address this challenge, but they often overlook the interaction with the classification algorithm, resulting in suboptimal classification accuracy. To address this limitation, this study analyses the performance of a novel wrapper-based feature selection approach known as SPFSR, proposed in Akman et al. (2022) [1]. Unlike filter-based methods, SPFSR takes classification accuracy into account and is capable of handling large datasets. By incorporating the classification algorithm in the feature selection process, this approach aims to improve the overall performance and effectiveness of machine learning models in microarray data analysis.
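The distinction between filter and wrapper criteria can be illustrated with a minimal Python sketch. This is not SPFSR and is not taken from the thesis: the toy data, the mean-difference filter score, the nearest-centroid classifier, and the function names `filter_scores` and `wrapper_score` are all assumptions chosen only for illustration. A filter scores each feature from its statistics alone, while a wrapper scores a candidate subset by the accuracy of a classifier trained on it.

```python
from statistics import mean

# Toy data, invented for illustration: 6 samples, 3 features, binary labels.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.1, 4.5, 0.3],
    [2.9, 5.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]


def filter_scores(X, y):
    """Filter criterion (assumed): absolute difference of class means per feature.

    No classifier is involved; each feature is scored in isolation.
    """
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        scores.append(abs(m0 - m1))
    return scores


def wrapper_score(X, y, subset):
    """Wrapper criterion (assumed): leave-one-out accuracy of a
    nearest-centroid classifier that sees only the features in `subset`."""
    correct = 0
    for i in range(len(X)):
        # Hold out sample i and build class centroids from the rest.
        train = [(row, label) for k, (row, label) in enumerate(zip(X, y)) if k != i]
        centroids = {}
        for c in sorted({label for _, label in train}):
            rows = [row for row, label in train if label == c]
            centroids[c] = [mean(r[j] for r in rows) for j in subset]
        point = [X[i][j] for j in subset]
        # Predict the class whose centroid is closest in squared distance.
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += int(pred == y[i])
    return correct / len(X)


# Rank features by filter score, best first.
scores = filter_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Score one candidate subset with the wrapper criterion.
accuracy_feature_0 = wrapper_score(X, y, [0])
```

A wrapper method searches over subsets using a score like `wrapper_score`, so every evaluation requires training the classifier. That cost is why wrapper search over thousands of microarray features needs an efficient search strategy.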
