Cargando…

Paired-end small RNA sequencing reveals a possible overestimation in the isomiR sequence repertoire previously reported from conventional single read data analysis

BACKGROUND: Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing o...

Descripción completa

Detalles Bibliográficos
Autores principales:	Sanchez Herrero, Jose Francisco, Pluvinet, Raquel, Luna de Haro, Antonio, Sumoy, Lauro
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	BioMed Central 2021
Materias:	Research Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8077951/ https://www.ncbi.nlm.nih.gov/pubmed/33902448 http://dx.doi.org/10.1186/s12859-021-04128-1

Descripción
Sumario:	BACKGROUND: Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomiR variation reported so far could be due to systematic technical noise and not real. RESULTS: We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomiR level. We exploited its ability to use single or merged reads to compare isomiR results derived from paired-end (PE) reads with those from single reads (SR) to address whether detectable sequence differences relative to canonical miRNAs found in isomiRs are true biological variations or the result of errors in sequencing. We have detected non-negligible systematic differences between SR and PE data which primarily affect putative internally edited isomiRs, and at a much smaller frequency terminal length changing isomiRs. This is relevant for the identification of true isomiRs in small RNA sequencing datasets. CONCLUSIONS: We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of abundance and diversity of miRNA isoforms. Efforts in annotating the isomiRnome should take this into account. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04128-1.

Paired-end small RNA sequencing reveals a possible overestimation in the isomiR sequence repertoire previously reported from conventional single read data analysis

Ejemplares similares