M6A-BiNP: predicting N(6)-methyladenosine sites based on bidirectional position-specific propensities of polynucleotides and pointwise joint mutual information
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Taylor & Francis 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8632114/ https://www.ncbi.nlm.nih.gov/pubmed/34161188 http://dx.doi.org/10.1080/15476286.2021.1930729 |
Summary: | N(6)-methyladenosine (m(6)A) plays an important role in various biological processes. Identifying m(6)A sites is a key step in exploring its biological functions. One of the biggest challenges in identifying m(6)A sites is extracting features that carry enough categorical information to distinguish m(6)A from non-m(6)A sites. To address this challenge, we propose bidirectional dinucleotide and trinucleotide position-specific propensities. Based on these, we propose two feature-encoding algorithms: Position-Specific Propensities and Pointwise Mutual Information (PSP-PMI) and Position-Specific Propensities and Pointwise Joint Mutual Information (PSP-PJMI). PSP-PMI is based on the bidirectional dinucleotide position-specific propensity and pointwise mutual information, while PSP-PJMI is based on the bidirectional trinucleotide position-specific propensity and the pointwise joint mutual information proposed in this paper. We introduce parameters [Image: see text] and [Image: see text] in PSP-PMI and PSP-PJMI, respectively, to represent the distance from a nucleotide to its forward or backward adjacent nucleotide or dinucleotide, so that the extracted features contain both local and global classification information. Finally, we propose the M6A-BiNP predictor, which combines PSP-PMI or PSP-PJMI with an SVM classifier. The 10-fold cross-validation results on benchmark datasets of non-single-base resolution and single-base resolution demonstrate that PSP-PMI and PSP-PJMI extract features with strong capabilities to distinguish m(6)A from non-m(6)A sites. The M6A-BiNP predictor based on our proposed feature-encoding algorithm PSP-PJMI outperforms the state-of-the-art predictors and is, to date, the best model for identifying m(6)A and non-m(6)A sites. |
---|
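The summary describes features built from pointwise mutual information between a nucleotide and a neighbour at a fixed distance. The record does not give the PSP-PMI formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm: it computes plain PMI for the nucleotide pair at positions (pos, pos + gap) and encodes a sequence as the difference between the PMI estimated on positive and on negative training sequences. The function names, the `gap` parameter name, and the 0.0 default for unseen pairs are all assumptions made for illustration.

```python
import math
from collections import Counter


def position_pair_pmi(seqs, pos, gap):
    """Plain PMI for each nucleotide pair observed at (pos, pos + gap).

    Hypothetical helper: standard PMI, not the paper's PSP-PMI definition.
    """
    n = len(seqs)
    pair_counts = Counter((s[pos], s[pos + gap]) for s in seqs)
    first_counts = Counter(s[pos] for s in seqs)
    second_counts = Counter(s[pos + gap] for s in seqs)
    pmi = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = first_counts[a] / n
        p_b = second_counts[b] / n
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def encode(seq, pos_seqs, neg_seqs, gap):
    """One feature per position: PMI in the positive set minus PMI in the negative set.

    Pairs never seen in a training set contribute 0.0 (an assumption).
    """
    features = []
    for i in range(len(seq) - gap):
        pair = (seq[i], seq[i + gap])
        pos_pmi = position_pair_pmi(pos_seqs, i, gap).get(pair, 0.0)
        neg_pmi = position_pair_pmi(neg_seqs, i, gap).get(pair, 0.0)
        features.append(pos_pmi - neg_pmi)
    return features


# Toy data, invented for illustration only.
positives = ["AC", "AC", "GT", "GT"]
negatives = ["AC", "AT", "GC", "GT"]
vector = encode("AC", positives, negatives, gap=1)
```

A small `gap` captures local context and a larger one captures longer-range dependence, which is the role the summary assigns to its distance parameters. The resulting vectors could then be passed to any classifier, such as the SVM the summary mentions.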