Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data

Bibliographic Details
Main authors: Hung, Che-Lun, Chen, Wen-Pei, Hua, Guan-Jie, Zheng, Huiru, Tsai, Suh-Jen Jane, Lin, Yaw-Ling
Format: Online Article Text
Language: English
Published: MDPI 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307292/
https://www.ncbi.nlm.nih.gov/pubmed/25569088
http://dx.doi.org/10.3390/ijms16011096
author Hung, Che-Lun
Chen, Wen-Pei
Hua, Guan-Jie
Zheng, Huiru
Tsai, Suh-Jen Jane
Lin, Yaw-Ling
collection PubMed
description Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used.
format Online
Article
Text
id pubmed-4307292
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4307292 2015-02-02 Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data Hung, Che-Lun Chen, Wen-Pei Hua, Guan-Jie Zheng, Huiru Tsai, Suh-Jen Jane Lin, Yaw-Ling Int J Mol Sci Article MDPI 2015-01-05 /pmc/articles/PMC4307292/ /pubmed/25569088 http://dx.doi.org/10.3390/ijms16011096 Text en © 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
title Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307292/
https://www.ncbi.nlm.nih.gov/pubmed/25569088
http://dx.doi.org/10.3390/ijms16011096