A unified STR profiling system across multiple species with whole genome sequencing data
BACKGROUND: Short tandem repeats (STRs) serve as genetic markers in forensic scenes due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly...
Main authors: | Liu, Yilin; Xu, Jiao; Chen, Miaoxia; Wang, Changfa; Li, Shuaicheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923897/ https://www.ncbi.nlm.nih.gov/pubmed/31861983 http://dx.doi.org/10.1186/s12859-019-3246-y |
_version_ | 1783481616372334592 |
---|---|
author | Liu, Yilin Xu, Jiao Chen, Miaoxia Wang, Changfa Li, Shuaicheng |
author_facet | Liu, Yilin Xu, Jiao Chen, Miaoxia Wang, Changfa Li, Shuaicheng |
author_sort | Liu, Yilin |
collection | PubMed |
description | BACKGROUND: Short tandem repeats (STRs) serve as genetic markers in forensic scenes due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the massive amount of whole-genome data now available for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that can be applied simultaneously to multiple species. RESULT: To find a unified STR set, we collected whole genome sequence data for the species concerned and mapped them to the human reference genome. We then extracted the STR loci across the species. From these loci, we proposed an algorithm that selects a subset of loci by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, > 1 − 10^(−9), for both individual species and the mixed population, as well as a random-match probability < 10^(−7) for all the species involved, indicating that the identified set of STR loci could be applied to multiple species. CONCLUSIONS: We identified a set of STR loci that is shared by multiple species. This implies that a unified STR profiling system is possible for these species in forensic scenes. The system can be applied to individual identification or paternity testing in each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate loci sets under different forensic parameters and for a specific combination of species. |
format | Online Article Text |
id | pubmed-6923897 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6923897 2019-12-30 A unified STR profiling system across multiple species with whole genome sequencing data Liu, Yilin Xu, Jiao Chen, Miaoxia Wang, Changfa Li, Shuaicheng BMC Bioinformatics Methodology BioMed Central 2019-12-20 /pmc/articles/PMC6923897/ /pubmed/31861983 http://dx.doi.org/10.1186/s12859-019-3246-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Liu, Yilin Xu, Jiao Chen, Miaoxia Wang, Changfa Li, Shuaicheng A unified STR profiling system across multiple species with whole genome sequencing data |
title | A unified STR profiling system across multiple species with whole genome sequencing data |
title_full | A unified STR profiling system across multiple species with whole genome sequencing data |
title_fullStr | A unified STR profiling system across multiple species with whole genome sequencing data |
title_full_unstemmed | A unified STR profiling system across multiple species with whole genome sequencing data |
title_short | A unified STR profiling system across multiple species with whole genome sequencing data |
title_sort | unified str profiling system across multiple species with whole genome sequencing data |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923897/ https://www.ncbi.nlm.nih.gov/pubmed/31861983 http://dx.doi.org/10.1186/s12859-019-3246-y |
work_keys_str_mv | AT liuyilin aunifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT xujiao aunifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT chenmiaoxia aunifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT wangchangfa aunifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT lishuaicheng aunifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT liuyilin unifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT xujiao unifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT chenmiaoxia unifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT wangchangfa unifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata AT lishuaicheng unifiedstrprofilingsystemacrossmultiplespecieswithwholegenomesequencingdata |
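
Illustrative note on the abstract: the record describes a greedy loci selection driven by the combined power of discrimination but gives no algorithmic detail. The sketch below is therefore not the authors' method. It assumes the textbook definitions (per-locus PD = 1 − Σ p_g² over genotype frequencies; CPD = 1 − Π(1 − PD_i) for independent loci), and the function names, the `pd_table` layout, and the stopping rule on the worst-case match probability are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import prod


def power_of_discrimination(genotypes):
    """Textbook PD for one locus: 1 minus the sum of squared genotype frequencies."""
    counts = Counter(genotypes)
    n = len(genotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def combined_pd(pds):
    """CPD = 1 - product of (1 - PD_i), assuming the loci are independent."""
    return 1.0 - prod(1.0 - pd for pd in pds)


def greedy_select(pd_table, max_match_prob):
    """Hypothetical greedy selection (an assumption, not the published algorithm).

    pd_table maps locus -> {species: PD}. At each step, add the locus that
    most reduces the worst per-species match probability (1 - CPD), and stop
    once every species is at or below max_match_prob or no locus helps.
    Returns (chosen loci in order, {species: CPD}).
    """
    species = sorted({s for row in pd_table.values() for s in row})
    if not species:
        return [], {}
    match_prob = {s: 1.0 for s in species}  # running product of (1 - PD)
    chosen, remaining = [], set(pd_table)

    def worst_after(locus):
        # Worst-case match probability if this locus were added next.
        return max(match_prob[s] * (1.0 - pd_table[locus].get(s, 0.0))
                   for s in species)

    while remaining and max(match_prob.values()) > max_match_prob:
        best = min(sorted(remaining), key=worst_after)  # sorted: stable ties
        if worst_after(best) >= max(match_prob.values()):
            break  # no remaining locus improves the worst species
        for s in species:
            match_prob[s] *= 1.0 - pd_table[best].get(s, 0.0)
        chosen.append(best)
        remaining.remove(best)
    return chosen, {s: 1.0 - p for s, p in match_prob.items()}


if __name__ == "__main__":
    # Made-up PD values for three hypothetical loci and two species.
    demo = {
        "L1": {"dog": 0.9, "cat": 0.1},
        "L2": {"dog": 0.5, "cat": 0.5},
        "L3": {"dog": 0.1, "cat": 0.9},
    }
    print(greedy_select(demo, 0.1))
```

Tracking the running product of (1 − PD) rather than the CPD itself keeps thresholds such as 10^(−9) comparable without subtracting from a value very close to 1. Optimizing the worst species is only one possible objective; a mixed-population criterion would need pooled genotype frequencies instead.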