Efficient techniques for genotype‐phenotype correlational analysis
BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are sequence variations found in individuals at some specific points in the genomic sequence. As SNPs are highly conserved throughout evolution and within a population, the map of SNPs serves as an excellent genotypic marker. Conventional SNPs analy...
Main Authors: | Saha, Subrata; Rajasekaran, Sanguthevar; Bi, Jinbo; Pathak, Sudipta |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686582/ https://www.ncbi.nlm.nih.gov/pubmed/23557276 http://dx.doi.org/10.1186/1472-6947-13-41 |
_version_ | 1782273795165257728 |
---|---|
author | Saha, Subrata Rajasekaran, Sanguthevar Bi, Jinbo Pathak, Sudipta |
author_facet | Saha, Subrata Rajasekaran, Sanguthevar Bi, Jinbo Pathak, Sudipta |
author_sort | Saha, Subrata |
collection | PubMed |
description | BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are sequence variations found in individuals at some specific points in the genomic sequence. As SNPs are highly conserved throughout evolution and within a population, the map of SNPs serves as an excellent genotypic marker. Conventional SNP analysis mechanisms suffer from large run times, inefficient memory usage, and frequent overestimation. In this paper, we propose efficient, scalable, and reliable algorithms to select a small subset of SNPs from a large set of SNPs which can together be employed to perform phenotypic classification. METHODS: Our algorithms exploit the techniques of gene selection and random projections to identify a meaningful subset of SNPs. To the best of our knowledge, these techniques have not been employed before in the context of genotype‐phenotype correlations. Random projections are used to project the input data into a lower dimensional space (closely preserving distances). Gene selection is then applied to the projected data to identify a subset of the most relevant SNPs. RESULTS: We have compared the performance of our algorithms with Multifactor Dimensionality Reduction (MDR), one of the best currently known algorithms, and with the Principal Component Analysis (PCA) technique. Experimental results demonstrate that our algorithms are superior in terms of accuracy as well as run time. CONCLUSIONS: In our proposed techniques, random projection is used to map data from a high dimensional space to a lower dimensional space, which overcomes the curse of dimensionality problem. From this space of reduced dimension, we select the best subset of attributes. It is a unique mechanism in the domain of SNP analysis, and to the best of our knowledge it has not been employed before. As revealed by our experimental results, our proposed techniques offer the potential of high accuracies while keeping the run times low. |
format | Online Article Text |
id | pubmed-3686582 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3686582 2013-06-25 Efficient techniques for genotype‐phenotype correlational analysis Saha, Subrata Rajasekaran, Sanguthevar Bi, Jinbo Pathak, Sudipta BMC Med Inform Decis Mak Research Article BioMed Central 2013-04-04 /pmc/articles/PMC3686582/ /pubmed/23557276 http://dx.doi.org/10.1186/1472-6947-13-41 Text en Copyright © 2013 Saha et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Saha, Subrata Rajasekaran, Sanguthevar Bi, Jinbo Pathak, Sudipta Efficient techniques for genotype‐phenotype correlational analysis |
title | Efficient techniques for genotype‐phenotype correlational analysis |
title_full | Efficient techniques for genotype‐phenotype correlational analysis |
title_fullStr | Efficient techniques for genotype‐phenotype correlational analysis |
title_full_unstemmed | Efficient techniques for genotype‐phenotype correlational analysis |
title_short | Efficient techniques for genotype‐phenotype correlational analysis |
title_sort | efficient techniques for genotype‐phenotype correlational analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3686582/ https://www.ncbi.nlm.nih.gov/pubmed/23557276 http://dx.doi.org/10.1186/1472-6947-13-41 |
work_keys_str_mv | AT sahasubrata efficienttechniquesforgenotypephenotypecorrelationalanalysis AT rajasekaransanguthevar efficienttechniquesforgenotypephenotypecorrelationalanalysis AT bijinbo efficienttechniquesforgenotypephenotypecorrelationalanalysis AT pathaksudipta efficienttechniquesforgenotypephenotypecorrelationalanalysis |
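
Illustrative note on the METHODS summary in the description field: the record says the input data are randomly projected into a lower dimensional space that closely preserves distances, and that a selection step is then applied to the projected data. The sketch below is not the authors' implementation. It assumes a Gaussian random matrix for the projection and, as a stand-in for the unspecified "gene selection" step, ranks projected features by absolute Pearson correlation with the phenotype. All function names and parameters (`random_projection`, `distance_ratios`, `rank_features`, `k`, `m`) are hypothetical.

```python
import numpy as np


def random_projection(X, k, rng):
    """Project the rows of X (n samples x d SNPs) into k dimensions.

    Assumption: a Gaussian random matrix with entries N(0, 1/k), so that
    squared Euclidean distances are preserved in expectation.
    """
    d = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R, R


def pairwise_distances(A):
    """Euclidean distances between all pairs of rows of A."""
    sq = np.sum(A * A, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(d2, 0.0))


def distance_ratios(X, Z):
    """Projected / original distance for every distinct pair of samples."""
    iu = np.triu_indices(X.shape[0], k=1)
    return pairwise_distances(Z)[iu] / pairwise_distances(X)[iu]


def rank_features(Z, y, m):
    """Return indices of the m columns of Z most correlated with y.

    Stand-in for the selection step: absolute Pearson correlation.
    """
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Zc, axis=0) * np.linalg.norm(yc)
    # Constant columns get a score of 0 instead of a division by zero.
    scores = np.abs(Zc.T @ yc) / np.where(denom > 0, denom, 1.0)
    order = np.argsort(scores)[::-1]
    return order[:m], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic genotypes coded 0/1/2 and a synthetic binary phenotype.
    X = rng.integers(0, 3, size=(40, 2000)).astype(float)
    y = rng.integers(0, 2, size=40).astype(float)
    Z, _ = random_projection(X, 500, rng)
    print("median distance ratio:", np.median(distance_ratios(X, Z)))
    top, _ = rank_features(Z, y, 10)
    print("top projected features:", top)
```

Each projected column is a random mixture of all SNPs, so this ranking is over projected features. How the paper maps the result back to individual SNPs is not stated in this record.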