Homoplasy corrected estimation of genetic similarity from AFLP bands, and the effect of the number of bands on the precision of estimation

AFLP is a DNA fingerprinting technique, resulting in binary band presence–absence patterns, called profiles, with known or unknown band positions. We model AFLP as a sampling procedure of fragments, with lengths sampled from a distribution. Bands represent fragments of specific lengths. We focus on...

Bibliographic Details
Main authors: Gort, Gerrit, van Hintum, Theo, van Eeuwijk, Fred
Format: Text
Language: English
Published: Springer-Verlag 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2715462/
https://www.ncbi.nlm.nih.gov/pubmed/19436989
http://dx.doi.org/10.1007/s00122-009-1047-9
_version_ 1782169782048522240
author Gort, Gerrit
van Hintum, Theo
van Eeuwijk, Fred
collection PubMed
description AFLP is a DNA fingerprinting technique, resulting in binary band presence–absence patterns, called profiles, with known or unknown band positions. We model AFLP as a sampling procedure of fragments, with lengths sampled from a distribution. Bands represent fragments of specific lengths. We focus on estimation of pairwise genetic similarity, defined as average fraction of common fragments, by AFLP. Usual estimators are Dice (D) or Jaccard coefficients. D overestimates genetic similarity, since identical bands in profile pairs may correspond to different fragments (homoplasy). Another complicating factor is the occurrence of different fragments of equal length within a profile, appearing as a single band, which we call collision. The bias of D increases with larger numbers of bands, and lower genetic similarity. We propose two homoplasy- and collision-corrected estimators of genetic similarity. The first is a modification of D, replacing band counts by estimated fragment counts. The second is a maximum likelihood estimator, only applicable if band positions are available. Properties of the estimators are studied by simulation. Standard errors and confidence intervals for the first are obtained by bootstrapping, and for the second by likelihood theory. The estimators are nearly unbiased, and have for most practical cases smaller standard error than D. The likelihood-based estimator generally gives the highest precision. The relationship between fragment counts and precision is studied using simulation. The usual range of band counts (50–100) appears nearly optimal. The methodology is illustrated using data from a phylogenetic study on lettuce.
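The description above names the Dice (D) and Jaccard coefficients as the usual estimators of pairwise similarity from band profiles. A minimal sketch of both, assuming each profile is a 0/1 list over the same ordered band positions; the function names and example profiles are hypothetical and do not come from the paper. It shows only the uncorrected band-based coefficients, not the homoplasy- and collision-corrected estimators the authors propose.

```python
from typing import Sequence, Tuple


def band_counts(a: Sequence[int], b: Sequence[int]) -> Tuple[int, int, int]:
    """Return (bands in a, bands in b, bands shared) for two 0/1 profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same band positions")
    n_a = sum(1 for x in a if x)
    n_b = sum(1 for y in b if y)
    n_ab = sum(1 for x, y in zip(a, b) if x and y)
    return n_a, n_b, n_ab


def dice(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice coefficient: 2 * shared bands / (bands in a + bands in b)."""
    n_a, n_b, n_ab = band_counts(a, b)
    if n_a + n_b == 0:
        raise ValueError("both profiles are empty")
    return 2 * n_ab / (n_a + n_b)


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard coefficient: shared bands / bands present in either profile."""
    n_a, n_b, n_ab = band_counts(a, b)
    if n_a + n_b - n_ab == 0:
        raise ValueError("both profiles are empty")
    return n_ab / (n_a + n_b - n_ab)


# Hypothetical profiles over six band positions (1 = band present).
profile_1 = [1, 1, 0, 1, 0, 1]
profile_2 = [1, 0, 0, 1, 1, 1]

d = dice(profile_1, profile_2)
j = jaccard(profile_1, profile_2)
print(f"Dice = {d:.2f}, Jaccard = {j:.2f}")
```

Both coefficients count a shared band as a shared fragment, which is the source of the upward bias under homoplasy that the description mentions. The two are linked by J = D / (2 - D), so either can be recovered from the other.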
format Text
id pubmed-2715462
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Springer-Verlag
record_format MEDLINE/PubMed
spelling pubmed-2715462 2009-07-29 Theor Appl Genet Original Paper Springer-Verlag 2009-05-13 2009-08 /pmc/articles/PMC2715462/ /pubmed/19436989 http://dx.doi.org/10.1007/s00122-009-1047-9 Text en © The Author(s) 2009
title Homoplasy corrected estimation of genetic similarity from AFLP bands, and the effect of the number of bands on the precision of estimation
topic Original Paper