Statistical modeling of STR capillary electrophoresis signal
BACKGROUND: In order to isolate an individual’s genotype from a sample of biological material, most laboratories use PCR and Capillary Electrophoresis (CE) to construct a genetic profile based on polymorphic loci known as Short Tandem Repeats (STRs). The resulting profile consists of CE signal which...
Main Authors: | Karkar, Slim; Alfonse, Lauren E.; Grgicak, Catherine M.; Lun, Desmond S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886162/ https://www.ncbi.nlm.nih.gov/pubmed/31787097 http://dx.doi.org/10.1186/s12859-019-3074-0 |
_version_ | 1783474828514164736 |
---|---|
author | Karkar, Slim Alfonse, Lauren E. Grgicak, Catherine M. Lun, Desmond S. |
author_facet | Karkar, Slim Alfonse, Lauren E. Grgicak, Catherine M. Lun, Desmond S. |
author_sort | Karkar, Slim |
collection | PubMed |
description | BACKGROUND: In order to isolate an individual’s genotype from a sample of biological material, most laboratories use PCR and Capillary Electrophoresis (CE) to construct a genetic profile based on polymorphic loci known as Short Tandem Repeats (STRs). The resulting profile consists of CE signal which contains information about the length and number of STR units amplified. For samples collected from the environment, interpretation of the signal can be challenging given that information regarding the quality and quantity of the DNA is often limited. Interpretation can be further complicated by the presence of noise and PCR artifacts such as stutter, which can mask or mimic biological alleles. Because manual interpretation methods cannot comprehensively account for such nuances, it would be valuable to develop a signal model that can effectively characterize the various components of STR signal independent of a priori knowledge of the quantity or quality of DNA. RESULTS: First, we seek to mathematically characterize the quality of the profile by measuring changes in the signal with respect to amplicon size. Next, we examine the noise, allele, and stutter components of the signal and develop distinct models for each. Using cross-validation and model selection, we identify a model that can be effectively utilized for downstream interpretation. Finally, we show an implementation of the model in NOCIt, a software system that calculates the a posteriori probability distribution on the number of contributors. CONCLUSION: The model was selected using a large, diverse set of DNA samples obtained from 144 different laboratory conditions, with DNA amounts ranging from a single copy of DNA to hundreds of copies, and the quality of the profiles ranging from pristine to highly degraded. Implemented in NOCIt, the model enables a probabilistic approach to estimating the number of contributors to complex, environmental samples. |
format | Online Article Text |
id | pubmed-6886162 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6886162 2019-12-11 Statistical modeling of STR capillary electrophoresis signal Karkar, Slim Alfonse, Lauren E. Grgicak, Catherine M. Lun, Desmond S. BMC Bioinformatics Research BioMed Central 2019-12-02 /pmc/articles/PMC6886162/ /pubmed/31787097 http://dx.doi.org/10.1186/s12859-019-3074-0 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Karkar, Slim Alfonse, Lauren E. Grgicak, Catherine M. Lun, Desmond S. Statistical modeling of STR capillary electrophoresis signal |
title | Statistical modeling of STR capillary electrophoresis signal |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886162/ https://www.ncbi.nlm.nih.gov/pubmed/31787097 http://dx.doi.org/10.1186/s12859-019-3074-0 |
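
The abstract describes NOCIt as computing an a posteriori probability distribution on the number of contributors. As a rough illustration only, and not NOCIt's actual implementation, the sketch below applies Bayes' rule to a set of per-hypothesis log-likelihoods. The function name, the example log-likelihood values, and the default uniform prior are all assumptions introduced here; in practice the likelihoods would come from a signal model such as the one the article selects.

```python
import math


def posterior_over_contributors(log_likelihoods, priors=None):
    """Return P(n | evidence) for each candidate number of contributors n.

    log_likelihoods: dict mapping n -> log P(evidence | n).
        These are hypothetical inputs for illustration.
    priors: dict mapping n -> P(n), all strictly positive.
        Defaults to a uniform prior (an assumption of this sketch).
    """
    ns = sorted(log_likelihoods)
    if priors is None:
        priors = {n: 1.0 / len(ns) for n in ns}

    # Unnormalized log posterior: log P(E | n) + log P(n).
    log_post = {n: log_likelihoods[n] + math.log(priors[n]) for n in ns}

    # Normalize with log-sum-exp so very negative log-likelihoods
    # do not underflow to zero before normalization.
    m = max(log_post.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in log_post.values()))
    return {n: math.exp(log_post[n] - log_z) for n in ns}


# Made-up log-likelihoods for 1, 2, and 3 contributors.
example = posterior_over_contributors({1: -120.0, 2: -95.0, 3: -96.5})
for n, p in example.items():
    print(f"P(n={n} | evidence) = {p:.4f}")
```

Working in log space is a common design choice for this kind of calculation, because likelihoods of full electropherograms are products of many small terms and can be far smaller than the smallest representable floating-point number.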