A comparison between two computational tools estimating tumor purity using NGS data

University essay from Linköpings universitet/Institutionen för fysik, kemi och biologi

Author: My Eles; [2023]


Abstract: In 2020, cancer accounted for almost 20% of all deaths in the United States. Cancer is highly individual, and individualized treatments are essential in the battle against the disease. The tumor microenvironment is complex, and the cancer genome contains the mutations that drive the cancer. Identification and inference of mutations in the cancer genome are important for individualized diagnosis, prognosis, and treatment decisions. NGS techniques make it possible to obtain information about a tumor at the DNA level. However, the NGS data must be analyzed to reveal that information. A tumor consists of both cancer and normal cells. When a tumor is analyzed, DNA from cancer and normal cells is intermixed, and the information about which DNA comes from which cell is lost. The analysis is complicated because the fraction of cancer cells is unknown. Tumor purity is defined as the fraction of cancer cells in a tumor. Traditionally, a pathologist determines the tumor purity by visually inspecting a tumor sample. As NGS techniques have developed, computational tools that distinguish between cancer and normal cells and estimate their fractions have emerged. The purpose of this master’s thesis was to study how precisely computational tools can estimate tumor purity using NGS data compared to a purity estimate made by a pathologist. To study the subject, a search was done for computational tools estimating tumor purity using NGS data. The software code had to be open, and the tools had to work on a single tumor specimen from a patient; papers using a normal sample from the patient were excluded. The search resulted in eight computational tools estimating tumor purity. Of these, two tools, ABSOLUTE and PureCN, were selected for comparison. An open access data set containing seven specimens was used. The data was filtered to imitate panel data targeting 250 genes. For some specimens, ABSOLUTE and PureCN produced estimates consistent with the pathologist’s estimates.
However, for most specimens, the purity estimates from the tools did not agree with those made by the pathologist. PureCN’s estimates were more consistent with the pathologist’s estimates than those of ABSOLUTE, but this cannot be concluded with certainty. The study in this master’s thesis could not show that the computational tools, ABSOLUTE and PureCN, are good enough at estimating tumor purity on the imitated panel data to be used in the clinic. The study included data from only seven tumors. Therefore, significant conclusions could not be drawn from it.
