DNA Molecule Classification Using Feature Primitives
BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-Hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classifi...
Main authors: | Iqbal, Raja Tanveer; Landry, Matthew; Winters-Hilt, Stephen |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1683559/ https://www.ncbi.nlm.nih.gov/pubmed/17118136 http://dx.doi.org/10.1186/1471-2105-7-S2-S15 |
_version_ | 1782131169225080832 |
---|---|
author | Iqbal, Raja Tanveer; Landry, Matthew; Winters-Hilt, Stephen |
author_facet | Iqbal, Raja Tanveer; Landry, Matthew; Winters-Hilt, Stephen |
author_sort | Iqbal, Raja Tanveer |
collection | PubMed |
description | BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-Hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classification problems, practitioners usually adopt approaches that use decision trees consisting of binary classifiers. Finding the best tree topology requires exploring all possible tree topologies and is computationally prohibitive. We propose a computational framework based on feature primitives that eliminates the need for a decision tree of binary classifiers. In the first phase, we generate a pool of weak features from nanopore blockade current measurements by using HMM analysis, principal component analysis and various wavelet filters. In the next phase, feature selection is performed using AdaBoost. AdaBoost provides an ensemble of weak learners of various types learned from feature primitives. RESULTS AND CONCLUSION: We show that our technique, despite its inherent simplicity, provides performance comparable to recent multi-class DNA molecule classification results. Unlike the approach presented by Winters-Hilt et al., where weaker data is dropped to obtain better classification, the proposed approach provides comparable classification accuracy without any need to reject weak data. A weakness of this approach, on the other hand, is the very "hands-on" tuning and feature selection that is required to obtain good generalization. Simply put, this method obtains a more informed set of features and provides better results for that reason. The strength of this approach appears to be its ability to identify strong features, an area where further results are actively being sought. |
format | Text |
id | pubmed-1683559 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1683559 2006-12-05 DNA Molecule Classification Using Feature Primitives Iqbal, Raja Tanveer Landry, Matthew Winters-Hilt, Stephen BMC Bioinformatics Proceedings BioMed Central 2006-09-26 /pmc/articles/PMC1683559/ /pubmed/17118136 http://dx.doi.org/10.1186/1471-2105-7-S2-S15 Text en Copyright © 2006 Iqbal et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Iqbal, Raja Tanveer Landry, Matthew Winters-Hilt, Stephen DNA Molecule Classification Using Feature Primitives |
title | DNA Molecule Classification Using Feature Primitives |
title_full | DNA Molecule Classification Using Feature Primitives |
title_fullStr | DNA Molecule Classification Using Feature Primitives |
title_full_unstemmed | DNA Molecule Classification Using Feature Primitives |
title_short | DNA Molecule Classification Using Feature Primitives |
title_sort | dna molecule classification using feature primitives |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1683559/ https://www.ncbi.nlm.nih.gov/pubmed/17118136 http://dx.doi.org/10.1186/1471-2105-7-S2-S15 |
work_keys_str_mv | AT iqbalrajatanveer dnamoleculeclassificationusingfeatureprimitives AT landrymatthew dnamoleculeclassificationusingfeatureprimitives AT wintershiltstephen dnamoleculeclassificationusingfeatureprimitives |
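
Illustrative note: the abstract describes a two-phase pipeline, a pool of weak features followed by AdaBoost-based feature selection. The sketch below shows one way that second phase could look. It is not the authors' implementation. It assumes SAMME, a common multi-class AdaBoost variant (the record does not say which variant was used), one-feature threshold stumps as weak learners, and a synthetic three-class feature pool standing in for the HMM, PCA, and wavelet features. All function names (`best_stump`, `fit_samme`, `selected_features`, `make_demo_data`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def stump_predict(row, j, t, left, right):
    """One-feature threshold rule: the weak learner built on feature j."""
    return left if row[j] <= t else right


def best_stump(X, y, w, classes):
    """Search every feature in the pool for the stump with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            # Weighted class mass on each side of the threshold.
            mass = {True: defaultdict(float), False: defaultdict(float)}
            for row, label, wi in zip(X, y, w):
                mass[row[j] <= t][label] += wi
            left = max(classes, key=lambda c: mass[True][c])
            right = max(classes, key=lambda c: mass[False][c])
            err = sum(wi for row, label, wi in zip(X, y, w)
                      if stump_predict(row, j, t, left, right) != label)
            if best is None or err < best[0]:
                best = (err, j, t, left, right)
    return best


def fit_samme(X, y, rounds=10):
    """Multi-class AdaBoost (SAMME variant, an assumption) over decision stumps."""
    classes = sorted(set(y))
    k = len(classes)
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        stump = best_stump(X, y, w, classes)
        if stump is None:
            break  # no feature offers a usable threshold
        err, j, t, left, right = stump
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        if err >= 1.0 - 1.0 / k:
            break  # weak learner is no better than random guessing
        alpha = math.log((1.0 - err) / err) + math.log(k - 1.0)
        ensemble.append((alpha, j, t, left, right))
        # Up-weight the examples this stump got wrong, then renormalize.
        w = [wi * math.exp(alpha)
             if stump_predict(row, j, t, left, right) != label else wi
             for row, label, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps; no tree of binary classifiers is needed."""
    votes = defaultdict(float)
    for alpha, j, t, left, right in ensemble:
        votes[stump_predict(row, j, t, left, right)] += alpha
    return max(votes, key=votes.get)


def selected_features(ensemble):
    """Indices of the pool features that boosting actually used."""
    return sorted({j for _, j, _, _, _ in ensemble})


def make_demo_data(seed=0, per_class=20):
    """Synthetic pool: feature 0 tracks the class, features 1-3 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(3):
        for _ in range(per_class):
            X.append([c + rng.gauss(0, 0.1), rng.random(), rng.random(), rng.random()])
            y.append(c)
    return X, y


X, y = make_demo_data()
ensemble = fit_samme(X, y, rounds=10)
accuracy = sum(predict(ensemble, r) == l for r, l in zip(X, y)) / len(y)
print("features picked from the pool:", selected_features(ensemble))
print("training accuracy:", accuracy)
```

The list returned by `selected_features` is the point of the exercise: boosting doubles as feature selection, because only the pool features that reduce weighted error in some round enter the ensemble. Real nanopore features would replace `make_demo_data`, and held-out data would be needed to judge generalization.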