A method for positive forensic identification of samples from extremely low-coverage sequence data
BACKGROUND: Determining whether two DNA samples originate from the same individual is difficult when the amount of retrievable DNA is limited. This is often the case for ancient, historic, and forensic samples. The most widely used approaches rely on amplification of a defined panel of multi-allelic...
Main authors: | Vohr, Samuel H.; Buen Abad Najar, Carlos Fernando; Shapiro, Beth; Green, Richard E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4672566/ https://www.ncbi.nlm.nih.gov/pubmed/26643904 http://dx.doi.org/10.1186/s12864-015-2241-6 |
_version_ | 1782404595241189376 |
---|---|
author | Vohr, Samuel H.; Buen Abad Najar, Carlos Fernando; Shapiro, Beth; Green, Richard E. |
author_sort | Vohr, Samuel H. |
collection | PubMed |
description | BACKGROUND: Determining whether two DNA samples originate from the same individual is difficult when the amount of retrievable DNA is limited. This is often the case for ancient, historic, and forensic samples. The most widely used approaches rely on amplification of a defined panel of multi-allelic markers and comparison to similar data from other samples. When the amount of retrievable DNA is low, these approaches fail. RESULTS: We describe a new method for assessing whether shotgun DNA sequence data from two samples are consistent with originating from the same or different individuals. Our approach makes use of the large catalogs of single nucleotide polymorphism (SNP) markers to maximize the chances of observing potentially discriminating alleles. We further reduce the amount of data required by taking advantage of patterns of linkage disequilibrium modeled by a reference panel of haplotypes to indirectly compare observations at pairs of linked SNPs. Using both coalescent simulations and real sequencing data from modern and ancient sources, we show that this approach is robust with respect to the reference panel and has power to detect positive identity from DNA libraries with less than 1% random and non-overlapping genome coverage in each sample. CONCLUSION: We present a powerful new approach that can determine whether DNA from two samples originated from the same individual even when only minute quantities of DNA are recoverable from each. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2241-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4672566 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4672566 2015-12-09 A method for positive forensic identification of samples from extremely low-coverage sequence data. Vohr, Samuel H.; Buen Abad Najar, Carlos Fernando; Shapiro, Beth; Green, Richard E. BMC Genomics. Methodology Article.
BioMed Central 2015-12-07 /pmc/articles/PMC4672566/ /pubmed/26643904 http://dx.doi.org/10.1186/s12864-015-2241-6 Text en © Vohr et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | A method for positive forensic identification of samples from extremely low-coverage sequence data |
topic | Methodology Article |