ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations

Bibliographic Details
Main Authors: Wright, Mark H., Tung, Chih-Wei, Zhao, Keyan, Reynolds, Andy, McCouch, Susan R., Bustamante, Carlos D.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2982150/
https://www.ncbi.nlm.nih.gov/pubmed/20926420
http://dx.doi.org/10.1093/bioinformatics/btq533
author Wright, Mark H.
Tung, Chih-Wei
Zhao, Keyan
Reynolds, Andy
McCouch, Susan R.
Bustamante, Carlos D.
author_sort Wright, Mark H.
collection PubMed
description Motivation: The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is current methods for automated calling of genotypes are based on clustering approaches which require a large number of samples to be analyzed simultaneously, or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly due to the inability to clearly identify a heterozygote cluster.
Results: As part of the development of two custom single nucleotide polymorphism genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called ‘ALCHEMY’ based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per sample basis allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously. Since clustering is not used explicitly, ALCHEMY performs well on small sample sizes with accuracy exceeding 99% with as few as 18 samples.
Availability: ALCHEMY is available for both commercial and academic use free of charge and distributed under the GNU General Public License at http://alchemy.sourceforge.net/
Contact: mhw6@cornell.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2982150
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2982150 2010-11-16 Bioinformatics Original Papers
Oxford University Press 2010-12-01 2010-10-05 /pmc/articles/PMC2982150/ /pubmed/20926420 http://dx.doi.org/10.1093/bioinformatics/btq533 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations
topic Original Papers