Next Next Gen Detection of Structural Variants

Accurate detection of complex structural variations using single-molecule sequencing

Fritz J. Sedlazeck, Philipp Rescheneder, Moritz Smolka, Han Fang, Maria Nattestad, Arndt von Haeseler, and Michael C. Schatz

Nature Methods (Research article)

Abstract—Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.



Connecting chromatin states (Epigenetics) to structural variation in human genomes

Chromatin organization modulates the origin of heritable structural variations in human genome 

Tanmoy Roychowdhury and Alexej Abyzov

Nucleic Acids Research (Article)

Abstract

“Structural variations (SVs) in the human genome originate from different mechanisms related to DNA repair, replication errors, and retrotransposition. Our analyses of 26 927 SVs from the 1000 Genomes Project revealed differential distributions and consequences of SVs of different origin, e.g. deletions from non-allelic homologous recombination (NAHR) are more prone to disrupt chromatin organization while processed pseudogenes can create accessible chromatin. Spontaneous double stranded breaks (DSBs) are the best predictor of enrichment of NAHR deletions in open chromatin. This evidence, along with strong physical interaction of NAHR breakpoints belonging to the same deletion suggests that majority of NAHR deletions are non-meiotic i.e. originate from errors during homology directed repair (HDR) of spontaneous DSBs. In turn, the origin of the spontaneous DSBs is associated with transcription factor binding in accessible chromatin revealing the vulnerability of functional, open chromatin. The chromatin itself is enriched with repeats, particularly fixed Alu elements that provide the homology required to maintain stability via HDR. Through co-localization of fixed Alus and NAHR deletions in open chromatin we hypothesize that old Alu expansion had a stabilizing role on the human genome.”

Population-specific structural variation

Genome maps across 26 human populations reveal population-specific patterns of structural variation

Abstract—Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.


Population-specific genome structure variation coverage in GenomeWeb

Human Genome Structural Variation Patterns Vary by Population, Optical Mapping Study Shows

NEW YORK (GenomeWeb) – Some large structural variants in the human genome exhibit population-specific patterns, according to a new analysis of more than 150 genome maps.

Large structural variants — those that are bigger than 2 kilobases — are difficult to detect, especially as short-read sequencing technologies are the most commonly used tools in genomic analysis.

For their study, Pui-Yan Kwok from the University of California, San Francisco and his colleagues analyzed optical genome maps generated for more than 150 individuals representing more than two dozen populations. A phylogenetic analysis of these maps indicated that some SVs and CNVs show variable population patterns. The researchers were also able to characterize SVs in typically intractable regions of the genome, including spots not covered by the human reference genome. Their results were published yesterday in Nature Communications.

