
Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization

The strategy of bulk DNA sampling has been a valuable method for studying large numbers of individuals through genetic markers. The application of this strategy for discrimination among germplasm sources was analyzed through information theory, considering the case of polymorphic alleles scored bina...


Bibliographic Details
Main authors: Reyes-Valdés, M. Humberto, Santacruz-Varela, Amalio, Martínez, Octavio, Simpson, June, Hayano-Kanashiro, Corina, Cortés-Romero, Celso
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3833943/
https://www.ncbi.nlm.nih.gov/pubmed/24260321
http://dx.doi.org/10.1371/journal.pone.0079936
_version_ 1782291918579826688
author Reyes-Valdés, M. Humberto
Santacruz-Varela, Amalio
Martínez, Octavio
Simpson, June
Hayano-Kanashiro, Corina
Cortés-Romero, Celso
author_facet Reyes-Valdés, M. Humberto
Santacruz-Varela, Amalio
Martínez, Octavio
Simpson, June
Hayano-Kanashiro, Corina
Cortés-Romero, Celso
author_sort Reyes-Valdés, M. Humberto
collection PubMed
description The strategy of bulk DNA sampling has been a valuable method for studying large numbers of individuals through genetic markers. The application of this strategy for discrimination among germplasm sources was analyzed through information theory, considering the case of polymorphic alleles scored binarily for their presence or absence in DNA pools. We defined the informativeness of a set of marker loci in bulks as the mutual information between genotype and population identity, composed by two terms: diversity and noise. The first term is the entropy of bulk genotypes, whereas the noise term is measured through the conditional entropy of bulk genotypes given germplasm sources. Thus, optimizing marker information implies increasing diversity and reducing noise. Simple formulas were devised to estimate marker information per allele from a set of estimated allele frequencies across populations. As an example, they allowed optimization of bulk size for SSR genotyping in maize, from allele frequencies estimated in a sample of 56 maize populations. It was found that a sample of 30 plants from a random mating population is adequate for maize germplasm SSR characterization. We analyzed the use of divided bulks to overcome the allele dilution problem in DNA pools, and concluded that samples of 30 plants divided into three bulks of 10 plants are efficient to characterize maize germplasm sources through SSR with a good control of the dilution problem. We estimated the informativeness of 30 SSR loci from the estimated allele frequencies in maize populations, and found a wide variation of marker informativeness, which positively correlated with the number of alleles per locus.
format Online
Article
Text
id pubmed-3833943
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3833943 2013-11-20 PLoS One Research Article
Public Library of Science 2013-11-19 /pmc/articles/PMC3833943/ /pubmed/24260321 http://dx.doi.org/10.1371/journal.pone.0079936 Text en © 2013 Reyes-Valdés et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Reyes-Valdés, M. Humberto
Santacruz-Varela, Amalio
Martínez, Octavio
Simpson, June
Hayano-Kanashiro, Corina
Cortés-Romero, Celso
Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title_full Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title_fullStr Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title_full_unstemmed Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title_short Analysis and Optimization of Bulk DNA Sampling with Binary Scoring for Germplasm Characterization
title_sort analysis and optimization of bulk dna sampling with binary scoring for germplasm characterization
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3833943/
https://www.ncbi.nlm.nih.gov/pubmed/24260321
http://dx.doi.org/10.1371/journal.pone.0079936
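
The description above defines marker informativeness as diversity minus noise: the entropy of bulk genotypes minus their conditional entropy given the germplasm source. A minimal sketch of that idea for a single allele scored as present or absent is given below. It is not the authors' published formula. It assumes random mating, so that a bulk of n diploid plants behaves like 2n independent gene copies, equally weighted populations, and perfect detection of any allele present in the bulk (so it ignores the dilution problem). The function names and the example frequencies are hypothetical.

```python
import math


def binary_entropy(q):
    """Shannon entropy (bits) of a present/absent score with P(present) = q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1.0 - q) * math.log2(1.0 - q))


def presence_probability(p, n_plants, ploidy=2):
    """P(allele is present in a bulk), assuming ploidy * n_plants independent
    gene copies drawn from a population where the allele has frequency p."""
    return 1.0 - (1.0 - p) ** (ploidy * n_plants)


def allele_information(freqs, n_plants):
    """Mutual information (bits) between the binary bulk score and population
    identity, assuming all populations are equally likely.

    diversity = entropy of the score pooled over populations
    noise     = average entropy of the score within each population
    """
    q = [presence_probability(p, n_plants) for p in freqs]
    diversity = binary_entropy(sum(q) / len(q))
    noise = sum(binary_entropy(x) for x in q) / len(q)
    return diversity - noise


if __name__ == "__main__":
    # Hypothetical allele frequencies in four populations (not data from the paper).
    freqs = [0.00, 0.02, 0.10, 0.40]
    for n in (1, 5, 10, 30, 100):
        print(n, round(allele_information(freqs, n), 4))
```

Under these assumptions, very large bulks push the presence probability toward 1 in every population where the allele occurs at all, which lowers diversity, while very small bulks leave the within-population score uncertain, which raises noise. Scanning bulk sizes as in the loop above is one way to look for a compromise between the two.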