Whole-Genome Sequencing of Two Swedish Individuals on PromethION

University essay from Lunds universitet/Examensarbeten i bioinformatik

Author: Nazeefa Fatima; [2019]

Keywords: Biology and Life Sciences

Abstract: Background: Chromosomes can undergo changes such as deletions, inversions, insertions, and translocations, which result in structural variation between individuals. Structural variants (SVs) are a common source of variability in the human genome and are known to be associated with common conditions such as autism and cancer, as well as with rare human diseases [1, 2]. However, they have not yet been studied extensively at high resolution. SVs are complex genomic components, partly because they are known to emerge in repetitive regions [3]. Aligning short reads to repetitive regions is ambiguous, which has made SV detection challenging in the past. Recent improvements in sequencing technologies have enabled new approaches to SV detection. In particular, the new long-read single-molecule sequencing instruments from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) produce a high yield in a short time while keeping the cost of library preparation low. These instruments make it possible to generate high-quality representations of whole genomes and enable reliable structural variant calling in human individuals [4, 5].
Objectives: A recent study that applied PacBio’s Single-Molecule Real-Time sequencing to two Swedish human genomes, Swe1 (male) and Swe2 (female), as part of the SweGen 1000 Genomes project (https://swefreq.nbis.se), uncovered over 17k SVs per individual, as well as various other genomic components [6] that are not detectable with short reads. As a follow-up study, we have now generated data for the same two Swedish individuals on ONT’s PromethION system, a new nanopore-based sequencing instrument known for its higher throughput compared with PacBio.
Results and Conclusion: We present a pilot study that evaluates nanopore data from whole-genome sequencing (WGS) on PromethION against Single-Molecule Real-Time (SMRT) reads obtained from the PacBio RS II platform. We compared the two single-molecule long-read technologies in terms of mappability and SV detection; SV calling yielded an average of 17k and 24k variants for the nanopore and SMRT datasets, respectively. The results will be useful for validating and comparing SVs in Swedish individuals within the large-scale SweGen project. In addition, the study provides a bioinformatics pipeline for future long-read data analyses and sets a basis for what to consider when designing future PromethION experiments.
