PCA-HPR: A principle component analysis model for human promoter recognition
We describe a promoter recognition method named PCA-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed codon (3-mer) and pentamer (5-mer) frequencies and created codon and pentamer frequency feature matrices to extract informative and discriminative f...
Main authors: | Li, Xiaomeng; Zeng, Jia; Yan, Hong |
---|---|
Format: | Text |
Language: | English |
Published: | Biomedical Informatics Publishing Group, 2008 |
Subjects: | Prediction Model |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533055/ https://www.ncbi.nlm.nih.gov/pubmed/18795109 |
_version_ | 1782159014923075584 |
---|---|
author | Li, Xiaomeng Zeng, Jia Yan, Hong |
author_facet | Li, Xiaomeng Zeng, Jia Yan, Hong |
author_sort | Li, Xiaomeng |
collection | PubMed |
description | We describe a promoter recognition method named PCA-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed codon (3-mer) and pentamer (5-mer) frequencies and created codon and pentamer frequency feature matrices to extract informative and discriminative features for effective classification. Principal component analysis (PCA) is applied to the feature matrices, and a subset of principal components (PCs) is selected for classification. Our system uses three neural network classifiers to distinguish promoters from exons, promoters from introns, and promoters from 3' untranslated regions (3' UTRs). We compared PCA-HPR with three well-known existing promoter prediction systems: DragonGSF, Eponine, and FirstEF. Validation on three test sets shows that PCA-HPR achieves the best performance of the four predictive systems. |
format | Text |
id | pubmed-2533055 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | Biomedical Informatics Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-25330552008-09-15 PCA-HPR: A principle component analysis model for human promoter recognition Li, Xiaomeng Zeng, Jia Yan, Hong Bioinformation Prediction Model We describe a promoter recognition method named PCA-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed codon (3-mer) and pentamer (5-mer) frequencies and created codon and pentamer frequency feature matrices to extract informative and discriminative features for effective classification. Principal component analysis (PCA) is applied to the feature matrices and a subset of principal components (PCs) are selected for classification. Our system uses three neural network classifiers to distinguish promoters versus exons, promoters versus introns, and promoters versus 3' un-translated region (3'UTR). We compared PCA-HPR with three well-known existing promoter prediction systems such as DragonGSF, Eponine and FirstEF. Validation shows that PCA-HPR achieves the best performance with three test sets for all the four predictive systems. Biomedical Informatics Publishing Group 2008-06-19 /pmc/articles/PMC2533055/ /pubmed/18795109 Text en © 2008 Biomedical Informatics Publishing Group This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited. |
spellingShingle | Prediction Model Li, Xiaomeng Zeng, Jia Yan, Hong PCA-HPR: A principle component analysis model for human promoter recognition |
title | PCA-HPR: A principle component analysis model for human promoter recognition |
title_full | PCA-HPR: A principle component analysis model for human promoter recognition |
title_fullStr | PCA-HPR: A principle component analysis model for human promoter recognition |
title_full_unstemmed | PCA-HPR: A principle component analysis model for human promoter recognition |
title_short | PCA-HPR: A principle component analysis model for human promoter recognition |
title_sort | pca-hpr: a principle component analysis model for human promoter recognition |
topic | Prediction Model |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533055/ https://www.ncbi.nlm.nih.gov/pubmed/18795109 |
work_keys_str_mv | AT lixiaomeng pcahpraprinciplecomponentanalysismodelforhumanpromoterrecognition AT zengjia pcahpraprinciplecomponentanalysismodelforhumanpromoterrecognition AT yanhong pcahpraprinciplecomponentanalysismodelforhumanpromoterrecognition |
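
The description above mentions codon (3-mer) and pentamer (5-mer) frequency feature matrices. As a rough illustration of that feature-extraction step only, the following minimal sketch counts k-mers in DNA sequences and returns one frequency vector per sequence. The function names `kmer_frequencies` and `feature_matrix`, the use of overlapping windows, and the skipping of windows with ambiguous bases are assumptions made for this example; the record does not state how the authors counted k-mers, and the PCA and neural network stages are not shown.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of each of the 4**k DNA k-mers in seq.

    Assumption: overlapping windows with a step of one base.
    The result is ordered lexicographically over the alphabet ACGT.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows containing N or other ambiguity codes are skipped.
        if window in counts:
            counts[window] += 1
            total += 1
    # Return all zeros if no valid window was found.
    return [counts[m] / total if total else 0.0 for m in kmers]


def feature_matrix(seqs, k):
    """One row of k-mer frequencies per input sequence."""
    return [kmer_frequencies(s, k) for s in seqs]
```

Calling `feature_matrix(sequences, 3)` yields rows of 64 values and `feature_matrix(sequences, 5)` yields rows of 1024 values; matrices of this shape are the kind of input to which PCA could then be applied.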