
DNA Molecule Classification Using Feature Primitives

BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-Hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classifi...


Bibliographic Details
Main authors: Iqbal, Raja Tanveer, Landry, Matthew, Winters-Hilt, Stephen
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1683559/
https://www.ncbi.nlm.nih.gov/pubmed/17118136
http://dx.doi.org/10.1186/1471-2105-7-S2-S15
_version_ 1782131169225080832
author Iqbal, Raja Tanveer
Landry, Matthew
Winters-Hilt, Stephen
author_facet Iqbal, Raja Tanveer
Landry, Matthew
Winters-Hilt, Stephen
author_sort Iqbal, Raja Tanveer
collection PubMed
description BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-Hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classification problems, practitioners usually adopt approaches that use decision trees consisting of binary classifiers. Finding the best tree topology requires exploring all possible tree topologies and is computationally prohibitive. We propose a computational framework based on feature primitives that eliminates the need of a decision tree of binary classifiers. In the first phase, we generate a pool of weak features from nanopore blockade current measurements by using HMM analysis, principal component analysis and various wavelet filters. In the next phase, feature selection is performed using AdaBoost. AdaBoost provides an ensemble of weak learners of various types learned from feature primitives. RESULTS AND CONCLUSION: We show that our technique, despite its inherent simplicity, provides a performance comparable to recent multi-class DNA molecule classification results. Unlike the approach presented by Winters-Hilt et al., where weaker data is dropped to obtain better classification, the proposed approach provides comparable classification accuracy without any need for rejection of weak data. A weakness of this approach, on the other hand, is the very "hands-on" tuning and feature selection that is required to obtain good generalization. Simply put, this method obtains a more informed set of features and provides better results for that reason. The strength of this approach appears to be in its ability to identify strong features, an area where further results are actively being sought.
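The second phase described in the abstract, AdaBoost choosing among a pool of weak features, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary task with labels +1/-1 (the paper addresses five classes), decision stumps as the weak learners, and a synthetic feature pool standing in for the HMM, PCA, and wavelet features. The names `best_stump`, `adaboost`, and `predict` are hypothetical.

```python
import math
import random


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: +polarity if the feature exceeds the threshold, else -polarity."""
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=5):
    """Binary AdaBoost; each round selects one feature from the pool via a stump."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Synthetic feature pool (an assumption, not nanopore data):
# column 2 is informative, columns 0, 1 and 3 are noise.
rng = random.Random(0)
n_samples = 40
X = [[rng.random(), rng.random(), i / n_samples, rng.random()] for i in range(n_samples)]
y = [1 if row[2] > 0.5 else -1 for row in X]

ensemble = adaboost(X, y, rounds=5)
selected = {feature for _, feature, _, _ in ensemble}
accuracy = sum(predict(ensemble, row) == yi for row, yi in zip(X, y)) / n_samples
print("selected feature indices:", sorted(selected))
print("training accuracy:", accuracy)
```

The set of feature indices that appear in the ensemble is the feature selection result: columns that never yield a useful stump are never chosen. A multi-class variant would replace the binary vote with a multi-class boosting scheme, which this sketch does not attempt.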
format Text
id pubmed-1683559
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1683559 2006-12-05 DNA Molecule Classification Using Feature Primitives Iqbal, Raja Tanveer Landry, Matthew Winters-Hilt, Stephen BMC Bioinformatics Proceedings
BioMed Central 2006-09-26 /pmc/articles/PMC1683559/ /pubmed/17118136 http://dx.doi.org/10.1186/1471-2105-7-S2-S15 Text en Copyright © 2006 Iqbal et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Iqbal, Raja Tanveer
Landry, Matthew
Winters-Hilt, Stephen
DNA Molecule Classification Using Feature Primitives
title DNA Molecule Classification Using Feature Primitives
title_sort dna molecule classification using feature primitives
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1683559/
https://www.ncbi.nlm.nih.gov/pubmed/17118136
http://dx.doi.org/10.1186/1471-2105-7-S2-S15