Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions t...

Bibliographic Details
Main Authors: Vembar, Shruthi Sridhar; Seetin, Matthew; Lambert, Christine; Nattestad, Maria; Schatz, Michael C.; Baybayan, Primo; Scherf, Artur; Smith, Melissa Laird
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4991835/
https://www.ncbi.nlm.nih.gov/pubmed/27345719
http://dx.doi.org/10.1093/dnares/dsw022
_version_ 1782448913799708672
author Vembar, Shruthi Sridhar
Seetin, Matthew
Lambert, Christine
Nattestad, Maria
Schatz, Michael C.
Baybayan, Primo
Scherf, Artur
Smith, Melissa Laird
author_sort Vembar, Shruthi Sridhar
collection PubMed
description The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (Ex: var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission.
format Online
Article
Text
id pubmed-4991835
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4991835 2016-08-22 DNA Res Full Papers Oxford University Press 2016-08 2016-06-26 /pmc/articles/PMC4991835/ /pubmed/27345719 http://dx.doi.org/10.1093/dnares/dsw022 Text en © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4991835/
https://www.ncbi.nlm.nih.gov/pubmed/27345719
http://dx.doi.org/10.1093/dnares/dsw022