Detecting sample swaps in diverse NGS data types using linkage disequilibrium
As the number of genomics datasets grows rapidly, sample mislabeling has become a high stakes issue. We present CrosscheckFingerprints (Crosscheck), a tool for quantifying sample-relatedness and detecting incorrectly paired sequencing datasets from different donors. Crosscheck outperforms similar me...
Main authors: | Javed, Nauman; Farjoun, Yossi; Fennell, Tim J.; Epstein, Charles B.; Bernstein, Bradley E.; Shoresh, Noam |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7391710/ https://www.ncbi.nlm.nih.gov/pubmed/32728101 http://dx.doi.org/10.1038/s41467-020-17453-5 |
_version_ | 1783564704812105728 |
---|---|
author | Javed, Nauman Farjoun, Yossi Fennell, Tim J. Epstein, Charles B. Bernstein, Bradley E. Shoresh, Noam |
collection | PubMed |
description | As the number of genomics datasets grows rapidly, sample mislabeling has become a high stakes issue. We present CrosscheckFingerprints (Crosscheck), a tool for quantifying sample-relatedness and detecting incorrectly paired sequencing datasets from different donors. Crosscheck outperforms similar methods and is effective even when data are sparse or from different assays. Application of Crosscheck to 8851 ENCODE ChIP-, RNA-, and DNase-seq datasets enabled us to identify and correct dozens of mislabeled samples and ambiguous metadata annotations, representing ~1% of ENCODE datasets. |
format | Online Article Text |
id | pubmed-7391710 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7391710 2020-08-12 Detecting sample swaps in diverse NGS data types using linkage disequilibrium Javed, Nauman Farjoun, Yossi Fennell, Tim J. Epstein, Charles B. Bernstein, Bradley E. Shoresh, Noam Nat Commun Article As the number of genomics datasets grows rapidly, sample mislabeling has become a high stakes issue. We present CrosscheckFingerprints (Crosscheck), a tool for quantifying sample-relatedness and detecting incorrectly paired sequencing datasets from different donors. Crosscheck outperforms similar methods and is effective even when data are sparse or from different assays. Application of Crosscheck to 8851 ENCODE ChIP-, RNA-, and DNase-seq datasets enabled us to identify and correct dozens of mislabeled samples and ambiguous metadata annotations, representing ~1% of ENCODE datasets. Nature Publishing Group UK 2020-07-29 /pmc/articles/PMC7391710/ /pubmed/32728101 http://dx.doi.org/10.1038/s41467-020-17453-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Detecting sample swaps in diverse NGS data types using linkage disequilibrium |
topic | Article |