
Simple regression models as a threshold for selecting AFLP loci with reduced error rates

BACKGROUND: Amplified fragment length polymorphism is a popular DNA marker technique that has applications in multiple fields of study. Technological improvements and decreasing costs have dramatically increased the number of markers that can be generated in an amplified fragment length polymorphism...


Bibliographic Details
Main Authors: Price, David L, Casler, Michael D
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534328/
https://www.ncbi.nlm.nih.gov/pubmed/23072295
http://dx.doi.org/10.1186/1471-2105-13-268
_version_ 1782475315587579904
author Price, David L
Casler, Michael D
author_facet Price, David L
Casler, Michael D
author_sort Price, David L
collection PubMed
description BACKGROUND: Amplified fragment length polymorphism is a popular DNA marker technique that has applications in multiple fields of study. Technological improvements and decreasing costs have dramatically increased the number of markers that can be generated in an amplified fragment length polymorphism experiment. As datasets increase in size, the number of genotyping errors also increases. Error within a DNA marker dataset can result in reduced statistical power, incorrect conclusions, and decreased reproducibility. It is essential that error within a dataset be recognized and reduced where possible, while still balancing the need for genomic diversity.
RESULTS: Using simple regression with a second-degree polynomial term, a model was fit to describe the relationship between locus-specific error rate and the frequency of present alleles. This model was then used to set a moving error rate threshold that varied based on the frequency of present alleles at a given locus. Loci with error rates greater than the threshold were removed from further analyses. This method of selecting loci is advantageous, as it accounts for differences in error rate between loci of varying frequencies of present alleles. An example using this method to select loci is demonstrated in an amplified fragment length polymorphism dataset generated from the North American prairie species big bluestem. Within this dataset the error rate was reduced from 12.5% to 8.8% by removal of loci with error rates greater than the defined threshold. By repeating the method on selected loci, the error rate was further reduced to 5.9%. This reduction in error resulted in a substantial increase in the amount of genetic variation attributable to regional and population variation.
CONCLUSIONS: This paper demonstrates a logical and computationally simple method for selecting loci with a reduced error rate. In the context of a genetic diversity study, this method resulted in an increased ability to detect differences between populations. Further application of this locus selection method, in addition to error-reducing methodological precautions, will result in amplified fragment length polymorphism datasets with reduced error rates. This reduction in error rate should result in greater power to detect differences and increased reproducibility.
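The locus selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the fitted second-degree polynomial itself serves as the moving threshold, that the dataset error rate is the unweighted mean of the locus-specific error rates, and that NumPy is available. The function name select_loci and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np


def select_loci(freq, err, degree=2):
    """Fit err ~ polynomial(freq) by least squares and keep loci at or below the curve.

    Assumption: the fitted curve is used directly as the moving threshold.
    Returns a boolean keep-mask and the polynomial coefficients (highest power first).
    """
    freq = np.asarray(freq, dtype=float)
    err = np.asarray(err, dtype=float)
    coeffs = np.polyfit(freq, err, degree)
    threshold = np.polyval(coeffs, freq)  # threshold varies with allele frequency
    return err <= threshold, coeffs


# Synthetic, made-up data: error peaks at intermediate frequencies of present alleles.
rng = np.random.default_rng(0)
freq = rng.uniform(0.05, 0.95, size=200)
noise = rng.normal(0.0, 0.03, size=200)
err = np.clip(0.05 + 0.3 * freq * (1.0 - freq) + noise, 0.0, 1.0)

# Apply the selection twice, mirroring "repeating the method on selected loci".
rates = [float(err.mean())]
for _ in range(2):
    keep, _coeffs = select_loci(freq, err)
    freq, err = freq[keep], err[keep]
    rates.append(float(err.mean()))

print("mean locus error rate per round:", rates)
```

Because the threshold is evaluated at each locus's own frequency, a locus with an intermediate frequency of present alleles is held to a different standard than a rare-allele locus, which is the advantage the abstract attributes to the method.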
format Online
Article
Text
id pubmed-3534328
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3534328 2013-01-03 Simple regression models as a threshold for selecting AFLP loci with reduced error rates Price, David L Casler, Michael D BMC Bioinformatics Methodology Article BioMed Central 2012-10-16 /pmc/articles/PMC3534328/ /pubmed/23072295 http://dx.doi.org/10.1186/1471-2105-13-268 Text en Copyright ©2012 Price and Casler; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Price, David L
Casler, Michael D
Simple regression models as a threshold for selecting AFLP loci with reduced error rates
title Simple regression models as a threshold for selecting AFLP loci with reduced error rates
title_sort simple regression models as a threshold for selecting aflp loci with reduced error rates
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534328/
https://www.ncbi.nlm.nih.gov/pubmed/23072295
http://dx.doi.org/10.1186/1471-2105-13-268
work_keys_str_mv AT pricedavidl simpleregressionmodelsasathresholdforselectingaflplociwithreducederrorrates
AT caslermichaeld simpleregressionmodelsasathresholdforselectingaflplociwithreducederrorrates