Identification of degenerate motifs using position restricted selection and hybrid ranking combination

Bibliographic Details
Main authors: Peng, Chien-Hua; Hsu, Jeh-Ting; Chung, Yun-Sheng; Lin, Yen-Jen; Chow, Wei-Yuan; Hsu, D. Frank; Tang, Chuan Yi
Format: Text
Language: English
Published: Oxford University Press 2006
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1702486/
https://www.ncbi.nlm.nih.gov/pubmed/17130169
http://dx.doi.org/10.1093/nar/gkl658
author Peng, Chien-Hua
Hsu, Jeh-Ting
Chung, Yun-Sheng
Lin, Yen-Jen
Chow, Wei-Yuan
Hsu, D. Frank
Tang, Chuan Yi
collection PubMed
description The identification of regulatory elements recognized by transcription factors and chromatin remodeling factors is essential to studying the regulation of gene expression. When no auxiliary data, such as orthologous sequences or expression profiles, are used, the accuracy of most motif discovery tools is strongly influenced by motif degeneracy and by the length of the input sequences. Since suitable auxiliary data may not always be available, more work is needed to improve the performance of tools that identify transcription elements in metazoans. A non-alignment-based algorithm, MotifSeeker, is proposed to improve the accuracy of discovering degenerate motifs. MotifSeeker exploits the property that the variable sites of transcription elements are usually position-specific, which reduces its exposure to noise and thereby improves the efficiency and accuracy of motif identification. Using data fusion, the ranking process integrates two measures of motif significance, resulting in a more robust significance measure. Results on synthetic data show that the accuracy of MotifSeeker is less sensitive to motif degeneracy and to the length of the input sequences. Furthermore, MotifSeeker has been tested on a well-known benchmark [M. Tompa, N. Li, T.L. Bailey, G.M. Church, B. De Moor, E. Eskin, A.V. Favorov, M.C. Frith, Y. Fu, W.J. Kent, et al. (2005) Nat. Biotechnol., 23, 137–144], yielding a correlation coefficient of 0.262, which compares favorably with those of other tools. The applicability of MotifSeeker to biological data is further demonstrated on regulons of Saccharomyces cerevisiae and on liver-specific genes with experimentally verified regulatory elements.
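The ranking step described above, in which two measures of motif significance are fused into one ranking, can be illustrated with a minimal sketch. This is not MotifSeeker's actual procedure: the abstract does not name the two measures or the combination rule, so the function names (`ranks`, `combine_by_rank`), the example scores, and the choice of averaging ranks are all assumptions made for illustration only.

```python
from typing import Dict, List


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank motifs under one measure (1 = best); higher score is assumed better."""
    # Ties in score are broken alphabetically so the result is deterministic.
    ordered = sorted(scores, key=lambda motif: (-scores[motif], motif))
    return {motif: position + 1 for position, motif in enumerate(ordered)}


def combine_by_rank(score_a: Dict[str, float], score_b: Dict[str, float]) -> List[str]:
    """Order motifs by the average of their ranks under two measures.

    Averaging ranks is an assumed fusion rule, chosen only because it is
    simple; it is not taken from the MotifSeeker paper.
    """
    if set(score_a) != set(score_b):
        raise ValueError("both measures must score the same set of motifs")
    rank_a, rank_b = ranks(score_a), ranks(score_b)
    combined = {m: (rank_a[m] + rank_b[m]) / 2 for m in score_a}
    # Lower combined rank is better; ties are broken alphabetically.
    return sorted(combined, key=lambda motif: (combined[motif], motif))


# Hypothetical candidate motifs with two made-up significance scores each.
measure_a = {"TGACTCA": 9.1, "TGASTCA": 9.3, "AAAAAAA": 9.5}
measure_b = {"TGACTCA": 0.80, "TGASTCA": 0.95, "AAAAAAA": 0.10}
print(combine_by_rank(measure_a, measure_b))
```

In this toy input, the candidate that leads under only one measure is overtaken by the one that ranks well under both, which is the intuition behind combining rankings rather than relying on a single score.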
format Text
id pubmed-1702486
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1702486 2006-12-26 Nucleic Acids Res Computational Biology
Oxford University Press 2006-12 2006-11-27 Text en © 2006 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identification of degenerate motifs using position restricted selection and hybrid ranking combination
topic Computational Biology