A unified STR profiling system across multiple species with whole genome sequencing data

BACKGROUND: Short tandem repeats (STRs) serve as genetic markers in forensic settings because of their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly...

Bibliographic Details
Main authors: Liu, Yilin; Xu, Jiao; Chen, Miaoxia; Wang, Changfa; Li, Shuaicheng
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923897/
https://www.ncbi.nlm.nih.gov/pubmed/31861983
http://dx.doi.org/10.1186/s12859-019-3246-y
author Liu, Yilin
Xu, Jiao
Chen, Miaoxia
Wang, Changfa
Li, Shuaicheng
collection PubMed
description BACKGROUND: Short tandem repeats (STRs) serve as genetic markers in forensic settings because of their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the availability of massive amounts of whole-genome data for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that could be simultaneously applied to multiple species. RESULT: To find a unified STR set, we collected the whole genome sequence data of the species concerned and mapped them to the human genome reference. Then we extracted the STR loci across the species. From these loci, we proposed an algorithm that selects a subset of loci by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, > 1 − 10^-9, for both individual species and the mixed population, and a random-match probability < 10^-7 for all the species involved, indicating that the identified set of STR loci could be applied to multiple species. CONCLUSIONS: We identified a set of STR loci that are shared by multiple species. This implies that a unified STR profiling system is possible for these species in forensic settings. The system can be applied to individual identification or paternity testing for each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate the loci under different forensic parameters and for a specific combination of species.
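The abstract mentions a greedy loci selection driven by the combined power of discrimination but gives no details. The following is only a minimal sketch of how such a selection could look, not the authors' algorithm. It assumes the usual forensic definitions (per-locus PD = 1 − sum of squared genotype frequencies; combined PD = 1 − product of (1 − PD_i) over independent loci), and it assumes a hypothetical selection rule: at each step, pick the locus that most reduces the worst per-species match probability. All function names, the input layout, and the toy values are illustrative.

```python
from math import prod


def power_of_discrimination(genotype_freqs):
    """Per-locus PD: 1 minus the sum of squared genotype frequencies."""
    return 1.0 - sum(f * f for f in genotype_freqs)


def combined_pd(pds):
    """Combined PD over independent loci: 1 - prod(1 - PD_i)."""
    return 1.0 - prod(1.0 - pd for pd in pds)


def greedy_select(locus_pd, target_match_prob):
    """Greedily pick loci until every species meets the target.

    locus_pd: {locus: {species: PD}} (hypothetical input layout).
    We track prod(1 - PD_i) per species rather than the combined PD
    itself, because values like 1 - 1e-9 lose precision in floats.
    Assumed rule: minimise the worst-case (maximum) value over species.
    """
    species = sorted({s for per in locus_pd.values() for s in per})
    match = {s: 1.0 for s in species}
    remaining = set(locus_pd)
    chosen = []

    def worst_after(locus):
        # Worst per-species match probability if this locus were added.
        return max(match[s] * (1.0 - locus_pd[locus].get(s, 0.0))
                   for s in species)

    while remaining and max(match.values()) > target_match_prob:
        best = min(sorted(remaining), key=worst_after)
        if worst_after(best) >= max(match.values()):
            break  # no remaining locus improves the worst species
        for s in species:
            match[s] *= 1.0 - locus_pd[best].get(s, 0.0)
        chosen.append(best)
        remaining.remove(best)
    return chosen, match


# Toy values, invented purely for illustration.
toy = {
    "A": {"dog": 0.9, "cat": 0.5},
    "B": {"dog": 0.5, "cat": 0.9},
    "C": {"dog": 0.1, "cat": 0.1},
}
chosen, match = greedy_select(toy, target_match_prob=0.06)
```

A threshold such as combined PD > 1 − 10^-9 corresponds here to `target_match_prob=1e-9`. Real use would need per-species genotype frequencies estimated from the mapped sequencing data, which this record does not provide.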
format Online
Article
Text
id pubmed-6923897
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6923897 2019-12-30 A unified STR profiling system across multiple species with whole genome sequencing data Liu, Yilin Xu, Jiao Chen, Miaoxia Wang, Changfa Li, Shuaicheng BMC Bioinformatics Methodology BioMed Central 2019-12-20 /pmc/articles/PMC6923897/ /pubmed/31861983 http://dx.doi.org/10.1186/s12859-019-3246-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A unified STR profiling system across multiple species with whole genome sequencing data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923897/
https://www.ncbi.nlm.nih.gov/pubmed/31861983
http://dx.doi.org/10.1186/s12859-019-3246-y