Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences

Bibliographic Details
Main Authors: de Filippo, Cesare; Meyer, Matthias; Prüfer, Kay
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6202837/
https://www.ncbi.nlm.nih.gov/pubmed/30359256
http://dx.doi.org/10.1186/s12915-018-0581-9
author de Filippo, Cesare
Meyer, Matthias
Prüfer, Kay
collection PubMed
description BACKGROUND: The study of ancient DNA is hampered by degradation, resulting in short DNA fragments. Advances in laboratory methods have made it possible to retrieve short DNA fragments, thereby improving access to DNA preserved in highly degraded, ancient material. However, such material contains large amounts of microbial contamination in addition to DNA fragments from the ancient organism. The resulting mixture of sequences constitutes a challenge for computational analysis, since microbial sequences are hard to distinguish from the ancient sequences of interest, especially when they are short. RESULTS: Here, we develop a method to quantify spurious alignments based on the presence or absence of rare variants. We find that spurious alignments are enriched for mismatches and insertion/deletion differences and lack substitution patterns typical of ancient DNA. The impact of spurious alignments can be reduced by filtering on these features and by imposing a sample-specific minimum length cutoff. We apply this approach to sequences from four ~430,000-year-old Sima de los Huesos hominin remains, which contain particularly short DNA fragments, and increase the amount of usable sequence data by 17–150%. This allows us to place a third specimen from the site on the Neandertal lineage. CONCLUSIONS: Our method maximizes the sequence data amenable to genetic analysis from highly degraded ancient material and avoids pitfalls that are associated with the analysis of ultra-short DNA sequences. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12915-018-0581-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6202837
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6202837 2018-11-01 Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences de Filippo, Cesare; Meyer, Matthias; Prüfer, Kay BMC Biol Research Article
BioMed Central 2018-10-25 /pmc/articles/PMC6202837/ /pubmed/30359256 http://dx.doi.org/10.1186/s12915-018-0581-9 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6202837/
https://www.ncbi.nlm.nih.gov/pubmed/30359256
http://dx.doi.org/10.1186/s12915-018-0581-9