Boosting the prediction and understanding of DNA-binding domains from sequence
DNA-binding proteins perform vital functions related to transcription, repair and replication. We have developed a new sequence-based machine learning protocol to identify DNA-binding proteins. We compare our method with an extensive benchmark of previously published structure-based machine learning...
Main Authors: | Langlois, Robert E., Lu, Hui |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Computational Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2879530/ https://www.ncbi.nlm.nih.gov/pubmed/20156993 http://dx.doi.org/10.1093/nar/gkq061 |
_version_ | 1782181937645879296 |
---|---|
author | Langlois, Robert E. Lu, Hui |
author_facet | Langlois, Robert E. Lu, Hui |
author_sort | Langlois, Robert E. |
collection | PubMed |
description | DNA-binding proteins perform vital functions related to transcription, repair and replication. We have developed a new sequence-based machine learning protocol to identify DNA-binding proteins. We compare our method with an extensive benchmark of previously published structure-based machine learning methods as well as a standard sequence alignment technique, BLAST. Furthermore, we elucidate important feature interactions found in a learned model and analyze how specific rules capture general mechanisms that extend across DNA-binding motifs. This analysis is carried out using the malibu machine learning workbench available at http://proteomics.bioengr.uic.edu/malibu and the corresponding data sets and features are available at http://proteomics.bioengr.uic.edu/dna. |
format | Text |
id | pubmed-2879530 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2879530 2010-06-02 Boosting the prediction and understanding of DNA-binding domains from sequence Langlois, Robert E. Lu, Hui Nucleic Acids Res Computational Biology DNA-binding proteins perform vital functions related to transcription, repair and replication. We have developed a new sequence-based machine learning protocol to identify DNA-binding proteins. We compare our method with an extensive benchmark of previously published structure-based machine learning methods as well as a standard sequence alignment technique, BLAST. Furthermore, we elucidate important feature interactions found in a learned model and analyze how specific rules capture general mechanisms that extend across DNA-binding motifs. This analysis is carried out using the malibu machine learning workbench available at http://proteomics.bioengr.uic.edu/malibu and the corresponding data sets and features are available at http://proteomics.bioengr.uic.edu/dna. Oxford University Press 2010-06 2010-02-15 /pmc/articles/PMC2879530/ /pubmed/20156993 http://dx.doi.org/10.1093/nar/gkq061 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Langlois, Robert E. Lu, Hui Boosting the prediction and understanding of DNA-binding domains from sequence |
title | Boosting the prediction and understanding of DNA-binding domains from sequence |
title_full | Boosting the prediction and understanding of DNA-binding domains from sequence |
title_fullStr | Boosting the prediction and understanding of DNA-binding domains from sequence |
title_full_unstemmed | Boosting the prediction and understanding of DNA-binding domains from sequence |
title_short | Boosting the prediction and understanding of DNA-binding domains from sequence |
title_sort | boosting the prediction and understanding of dna-binding domains from sequence |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2879530/ https://www.ncbi.nlm.nih.gov/pubmed/20156993 http://dx.doi.org/10.1093/nar/gkq061 |
work_keys_str_mv | AT langloisroberte boostingthepredictionandunderstandingofdnabindingdomainsfromsequence AT luhui boostingthepredictionandunderstandingofdnabindingdomainsfromsequence |