CAPG: comprehensive allopolyploid genotyper

Bibliographic Details
Main authors: Kulkarni, Roshan; Zhang, Yudi; Cannon, Steven B; Dorman, Karin S
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825759/
https://www.ncbi.nlm.nih.gov/pubmed/36367243
http://dx.doi.org/10.1093/bioinformatics/btac729
Description
Summary:
MOTIVATION: Genotyping by sequencing is a powerful tool for investigating genetic variation in plants, but many economically important plants are allopolyploids, where homoeologous similarity obscures the subgenomic origin of reads and confounds allelic and homoeologous SNPs. Recent polyploid genotyping methods use allelic frequencies, rate of heterozygosity, parental cross or other information to resolve read assignment, but good subgenomic references offer the most direct information. The typical strategy aligns reads to the joint reference, performs diploid genotyping within each subgenome, and filters the results, but persistent read misassignment results in an excess of false heterozygous calls.
RESULTS: We introduce the Comprehensive Allopolyploid Genotyper (CAPG), which formulates an explicit likelihood to weight read alignments against both subgenomic references and genotype individual allopolyploids from whole-genome resequencing data. We demonstrate CAPG in allotetraploids, where it performs better than Genome Analysis Toolkit’s HaplotypeCaller applied to reads aligned to the combined subgenomic references.
AVAILABILITY AND IMPLEMENTATION: Code and tutorials are available at https://github.com/Kkulkarni1/CAPG.git.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.