GBScleanR: robust genotyping error correction using a hidden Markov model with error pattern recognition

Reduced-representation sequencing (RRS) provides cost-effective and time-saving genotyping platforms. Despite the outstanding advantage of RRS in throughput, the obtained genotype data usually contain a large number of errors. Several error correction methods employing the hidden Markov model (HMM)...

Bibliographic Details
Main authors: Furuta, Tomoyuki; Yamamoto, Toshio; Ashikari, Motoyuki
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Investigation
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10213493/
https://www.ncbi.nlm.nih.gov/pubmed/36988327
http://dx.doi.org/10.1093/genetics/iyad055
collection PubMed
description Reduced-representation sequencing (RRS) provides cost-effective and time-saving genotyping platforms. Despite the outstanding advantage of RRS in throughput, the obtained genotype data usually contain a large number of errors. Several error correction methods employing the hidden Markov model (HMM) have been developed to overcome these issues. These methods assume that markers have a uniform error rate with no bias in the allele read ratio. However, bias does occur because of uneven amplification of genomic fragments and read mismapping. In this paper, we introduce an error correction tool, GBScleanR, which enables robust and precise error correction for noisy RRS-based genotype data by incorporating marker-specific error rates into the HMM. The results indicate that GBScleanR improves the accuracy by more than 25 percentage points at maximum compared to the existing tools in simulation data sets and achieves the most reliable genotype estimation in real data even with error-prone markers.
id pubmed-10213493
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-10213493 2023-05-27 Genetics Investigation Oxford University Press 2023-03-29 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of The Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Investigation