UPIC: Perl scripts to determine the number of SSR markers to run

We introduce here the concept of Unique Pattern Informative Combinations (UPIC), a decision tool for the cost-effective design of DNA fingerprinting/genotyping experiments using simple-sequence/tandem repeat (SSR/STR) markers. After the first screening of SSR-markers tested on a subset of DNA sample...

Bibliographic Details
Main authors: Arias, Renee S, Ballard, Linda L, Scheffler, Brian E
Format: Text
Language: English
Published: Biomedical Informatics Publishing Group 2009
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2720665/
https://www.ncbi.nlm.nih.gov/pubmed/19707300
author Arias, Renee S
Ballard, Linda L
Scheffler, Brian E
collection PubMed
description We introduce the concept of Unique Pattern Informative Combinations (UPIC), a decision tool for the cost-effective design of DNA fingerprinting/genotyping experiments using simple-sequence/tandem repeat (SSR/STR) markers. After a first screening of SSR markers on a subset of DNA samples, the user can apply UPIC to find marker combinations that maximize the genetic information obtained from a minimum or desired number of markers. This allows cost-effective planning of future experiments. We have developed Perl scripts that calculate all possible subset combinations of SSR markers and determine, based on unique patterns or alleles, which combinations can discriminate among all DNA samples included in a test. This makes UPIC an essential tool for optimizing resources when working with microsatellites. An example using real data from eight markers and 12 genotypes shows that UPIC detected groups of as few as three markers sufficient to discriminate all 12 DNA samples. If markers for future experiments are chosen based only on polymorphism information content (PIC), the number of markers needed to discriminate all samples cannot be determined. We also show that, when markers are chosen using UPIC, an informative combination of four markers can provide information similar to that of a combination of six markers (23 vs. 25 patterns, respectively), allowing more efficient planning of experiments. Perl scripts with documentation are also included to calculate the percentage of heterozygous loci in the DNA samples tested and to calculate three PIC values depending on the type of fertilization and allele frequency of the organism. AVAILABILITY: Perl scripts are freely available for download from http://www.ars.usda.gov/msa/jwdsrc/gbru.
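The subset search described in the abstract can be illustrated with a minimal sketch. This is not the authors' Perl code: it is a Python illustration under stated assumptions. The function name `discriminating_combinations`, the dictionary layout, and the toy genotypes (four samples, three markers, allele sizes in base pairs) are all hypothetical, and a "pattern" is taken here to mean the pair of alleles a sample shows at a marker.

```python
from itertools import combinations

# Hypothetical toy data: sample -> marker -> observed allele pair (fragment sizes).
# These values are invented for illustration and are not from the paper.
genotypes = {
    "S1": {"M1": (150, 150), "M2": (200, 204), "M3": (310, 310)},
    "S2": {"M1": (150, 150), "M2": (200, 200), "M3": (310, 314)},
    "S3": {"M1": (150, 154), "M2": (200, 204), "M3": (310, 314)},
    "S4": {"M1": (150, 154), "M2": (200, 200), "M3": (310, 310)},
}
markers = ["M1", "M2", "M3"]


def discriminating_combinations(genotypes, markers):
    """Map each subset size k to the marker subsets that separate all samples.

    A subset discriminates all samples when the combined allele pattern
    across its markers is different for every sample.
    """
    results = {}
    for k in range(1, len(markers) + 1):
        hits = []
        for combo in combinations(markers, k):
            # Build one combined profile per sample for this marker subset.
            profiles = [
                tuple(sample[m] for m in combo) for sample in genotypes.values()
            ]
            # All profiles distinct means every sample is uniquely identified.
            if len(set(profiles)) == len(profiles):
                hits.append(combo)
        if hits:
            results[k] = hits
    return results


results = discriminating_combinations(genotypes, markers)
smallest = min(results)
print(f"Smallest discriminating subset size: {smallest}")
for combo in results[smallest]:
    print("  " + " + ".join(combo))
```

In this toy data no single marker separates all four samples, but every pair of markers does, so the smallest informative combination has two markers. The exhaustive search visits 2^n - 1 subsets for n markers, which is manageable for a screening set such as the eight markers mentioned in the abstract (255 subsets) but grows quickly for larger panels.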
format Text
id pubmed-2720665
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Biomedical Informatics Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-2720665 2009-08-25 UPIC: Perl scripts to determine the number of SSR markers to run Arias, Renee S Ballard, Linda L Scheffler, Brian E Bioinformation Software
Biomedical Informatics Publishing Group 2009-04-21 /pmc/articles/PMC2720665/ /pubmed/19707300 Text en © 2009 Biomedical Informatics Publishing Group This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
title UPIC: Perl scripts to determine the number of SSR markers to run
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2720665/
https://www.ncbi.nlm.nih.gov/pubmed/19707300