Assembly of a pan-genome from deep sequencing of 910 humans of African descent

We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads t...

Bibliographic Details
Main authors: Sherman, Rachel M.; Forman, Juliet; Antonescu, Valentin; Puiu, Daniela; Daya, Michelle; Rafaels, Nicholas; Boorgula, Meher Preethi; Chavan, Sameer; Vergara, Candelaria; Ortega, Victor E.; Levin, Albert M.; Eng, Celeste; Yazdanbakhsh, Maria; Wilson, James G.; Marrugo, Javier; Lange, Leslie A.; Williams, L. Keoki; Watson, Harold; Ware, Lorraine B.; Olopade, Christopher O.; Olopade, Olufunmilayo; Oliveira, Ricardo R.; Ober, Carole; Nicolae, Dan L.; Meyers, Deborah A.; Mayorga, Alvaro; Knight-Madden, Jennifer; Hartert, Tina; Hansel, Nadia N.; Foreman, Marilyn G.; Ford, Jean G.; Faruque, Mezbah U.; Dunston, Georgia M.; Caraballo, Luis; Burchard, Esteban G.; Bleecker, Eugene R.; Araujo, Maria I.; Herrera-Paz, Edwin F.; Campbell, Monica; Foster, Cassandra; Taub, Margaret A.; Beaty, Terri H.; Ruczinski, Ingo; Mathias, Rasika A.; Barnes, Kathleen C.; Salzberg, Steven L.
Format: Online Article Text
Language: English
Published: 2018
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6309586/
https://www.ncbi.nlm.nih.gov/pubmed/30455414
http://dx.doi.org/10.1038/s41588-018-0273-y
author Sherman, Rachel M.
Forman, Juliet
Antonescu, Valentin
Puiu, Daniela
Daya, Michelle
Rafaels, Nicholas
Boorgula, Meher Preethi
Chavan, Sameer
Vergara, Candelaria
Ortega, Victor E.
Levin, Albert M.
Eng, Celeste
Yazdanbakhsh, Maria
Wilson, James G.
Marrugo, Javier
Lange, Leslie A.
Williams, L. Keoki
Watson, Harold
Ware, Lorraine B.
Olopade, Christopher O.
Olopade, Olufunmilayo
Oliveira, Ricardo R.
Ober, Carole
Nicolae, Dan L.
Meyers, Deborah A.
Mayorga, Alvaro
Knight-Madden, Jennifer
Hartert, Tina
Hansel, Nadia N.
Foreman, Marilyn G.
Ford, Jean G.
Faruque, Mezbah U.
Dunston, Georgia M.
Caraballo, Luis
Burchard, Esteban G.
Bleecker, Eugene R.
Araujo, Maria I.
Herrera-Paz, Edwin F.
Campbell, Monica
Foster, Cassandra
Taub, Margaret A.
Beaty, Terri H.
Ruczinski, Ingo
Mathias, Rasika A.
Barnes, Kathleen C.
Salzberg, Steven L.
collection PubMed
description We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the African-descended populations, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein-coding genes while the rest appear to be intergenic.
format Online
Article
Text
id pubmed-6309586
institution National Center for Biotechnology Information
language English
publishDate 2018
record_format MEDLINE/PubMed
spelling pubmed-6309586 2019-05-19 Nat Genet Article 2018-11-19 2019-01 /pmc/articles/PMC6309586/ /pubmed/30455414 http://dx.doi.org/10.1038/s41588-018-0273-y Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Assembly of a pan-genome from deep sequencing of 910 humans of African descent
topic Article