Towards Development of Clustering Applications for Large-Scale Comparative Genotyping and Kinship Analysis Using Y-Short Tandem Repeats

Y-chromosome short tandem repeats (Y-STRs) are genetic markers with practical applications in human identification. However, where mass identification is required (e.g., in the aftermath of disasters with significant fatalities), the efficiency of the process could be improved with new statistical a...

Bibliographic Details
Main Authors:	Seman, Ali; Sapawi, Azizian Mohd; Salleh, Mohd Zaki
Format:	Online Article Text
Language:	English
Published:	Mary Ann Liebert, Inc. 2015
Subjects:	Original Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4486443/ https://www.ncbi.nlm.nih.gov/pubmed/25945508 http://dx.doi.org/10.1089/omi.2014.0136

author	Seman, Ali; Sapawi, Azizian Mohd; Salleh, Mohd Zaki
collection	PubMed
description	Y-chromosome short tandem repeats (Y-STRs) are genetic markers with practical applications in human identification. However, where mass identification is required (e.g., in the aftermath of disasters with significant fatalities), the efficiency of the process could be improved with new statistical approaches. Clustering applications are relatively new tools for large-scale comparative genotyping, and the k-Approximate Modal Haplotype (k-AMH), an efficient algorithm for clustering large-scale Y-STR data, represents a promising method for developing these tools. In this study we improved the k-AMH and produced three new algorithms: the Nk-AMH I (including a new initial cluster center selection), the Nk-AMH II (including a new dominant weighting value), and the Nk-AMH III (combining I and II). The Nk-AMH III was the superior algorithm, with mean clustering accuracy that increased in four out of six datasets and remained at 100% in the other two. Additionally, the Nk-AMH III achieved a 2% higher overall mean clustering accuracy score than the k-AMH, as well as optimal accuracy for all datasets (0.84–1.00). With inclusion of the two new methods, the Nk-AMH III produced an optimal solution for clustering Y-STR data; thus, the algorithm has potential for further development towards fully automatic clustering of any large-scale genotypic data.
format	Online Article Text
id	pubmed-4486443
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Mary Ann Liebert, Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-4486443 2015-09-23 OMICS Original Articles Mary Ann Liebert, Inc. 2015-06-01 /pmc/articles/PMC4486443/ /pubmed/25945508 http://dx.doi.org/10.1089/omi.2014.0136 Text en © The Author(s) 2015; Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title	Towards Development of Clustering Applications for Large-Scale Comparative Genotyping and Kinship Analysis Using Y-Short Tandem Repeats
topic	Original Articles
