reGenotyper: Detecting mislabeled samples in genetic data
In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recov...
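Main authors: | Zych, Konrad; Snoek, Basten L.; Elvin, Mark; Rodriguez, Miriam; Van der Velde, K. Joeri; Arends, Danny; Westra, Harm-Jan; Swertz, Morris A.; Poulin, Gino; Kammenga, Jan E.; Breitling, Rainer; Jansen, Ritsert C.; Li, Yang |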
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5305221/ https://www.ncbi.nlm.nih.gov/pubmed/28192439 http://dx.doi.org/10.1371/journal.pone.0171324 |
_version_ | 1782507012801691648 |
---|---|
author | Zych, Konrad Snoek, Basten L. Elvin, Mark Rodriguez, Miriam Van der Velde, K. Joeri Arends, Danny Westra, Harm-Jan Swertz, Morris A. Poulin, Gino Kammenga, Jan E. Breitling, Rainer Jansen, Ritsert C. Li, Yang |
author_sort | Zych, Konrad |
collection | PubMed |
description | In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the “ideal” genotype and identify “best-matched” labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a “data cleaning” step before standard data analysis. |
format | Online Article Text |
id | pubmed-5305221 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-53052212017-02-28 reGenotyper: Detecting mislabeled samples in genetic data Zych, Konrad Snoek, Basten L. Elvin, Mark Rodriguez, Miriam Van der Velde, K. Joeri Arends, Danny Westra, Harm-Jan Swertz, Morris A. Poulin, Gino Kammenga, Jan E. Breitling, Rainer Jansen, Ritsert C. Li, Yang PLoS One Research Article In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the “ideal” genotype and identify “best-matched” labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a “data cleaning” step before standard data analysis. Public Library of Science 2017-02-13 /pmc/articles/PMC5305221/ /pubmed/28192439 http://dx.doi.org/10.1371/journal.pone.0171324 Text en © 2017 Zych et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | reGenotyper: Detecting mislabeled samples in genetic data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5305221/ https://www.ncbi.nlm.nih.gov/pubmed/28192439 http://dx.doi.org/10.1371/journal.pone.0171324 |