
reGenotyper: Detecting mislabeled samples in genetic data

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recov...


Bibliographic Details
Main Authors: Zych, Konrad, Snoek, Basten L., Elvin, Mark, Rodriguez, Miriam, Van der Velde, K. Joeri, Arends, Danny, Westra, Harm-Jan, Swertz, Morris A., Poulin, Gino, Kammenga, Jan E., Breitling, Rainer, Jansen, Ritsert C., Li, Yang
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5305221/
https://www.ncbi.nlm.nih.gov/pubmed/28192439
http://dx.doi.org/10.1371/journal.pone.0171324
author Zych, Konrad
Snoek, Basten L.
Elvin, Mark
Rodriguez, Miriam
Van der Velde, K. Joeri
Arends, Danny
Westra, Harm-Jan
Swertz, Morris A.
Poulin, Gino
Kammenga, Jan E.
Breitling, Rainer
Jansen, Ritsert C.
Li, Yang
collection PubMed
description In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the “ideal” genotype and identify “best-matched” labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a “data cleaning” step before standard data analysis.
format Online
Article
Text
id pubmed-5305221
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-53052212017-02-28 reGenotyper: Detecting mislabeled samples in genetic data Zych, Konrad Snoek, Basten L. Elvin, Mark Rodriguez, Miriam Van der Velde, K. Joeri Arends, Danny Westra, Harm-Jan Swertz, Morris A. Poulin, Gino Kammenga, Jan E. Breitling, Rainer Jansen, Ritsert C. Li, Yang PLoS One Research Article In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the “ideal” genotype and identify “best-matched” labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a “data cleaning” step before standard data analysis. Public Library of Science 2017-02-13 /pmc/articles/PMC5305221/ /pubmed/28192439 http://dx.doi.org/10.1371/journal.pone.0171324 Text en © 2017 Zych et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title reGenotyper: Detecting mislabeled samples in genetic data
topic Research Article