Deep and accurate detection of m(6)A RNA modifications using miCLIP2 and m6Aboost machine learning
Main authors: | |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8450095/ https://www.ncbi.nlm.nih.gov/pubmed/34157120 http://dx.doi.org/10.1093/nar/gkab485 |
Summary: | N6-methyladenosine (m(6)A) is the most abundant internal RNA modification in eukaryotic mRNAs and influences many aspects of RNA processing. miCLIP (m(6)A individual-nucleotide resolution UV crosslinking and immunoprecipitation) is an antibody-based approach to map m(6)A sites with single-nucleotide resolution. However, due to broad antibody reactivity, reliable identification of m(6)A sites from miCLIP data remains challenging. Here, we present miCLIP2 in combination with machine learning to significantly improve m(6)A detection. The optimized miCLIP2 results in high-complexity libraries from less input material. Importantly, we established a robust computational pipeline to tackle the inherent issue of false positives in antibody-based m(6)A detection. The analyses were calibrated with Mettl3 knockout cells to learn the characteristics of m(6)A deposition, including m(6)A sites outside of DRACH motifs. To make our results universally applicable, we trained a machine learning model, m6Aboost, based on the experimental and RNA sequence features. Importantly, m6Aboost allows prediction of genuine m(6)A sites in miCLIP2 data without filtering for DRACH motifs or the need for Mettl3 depletion. Using m6Aboost, we identify thousands of high-confidence m(6)A sites in different murine and human cell lines, which provide a rich resource for future analysis. Collectively, our combined experimental and computational methodology greatly improves m(6)A identification. |
---|