LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads

Bibliographic Details
Main authors: Zhu, Wufei; Liao, Xingyu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10245045/
https://www.ncbi.nlm.nih.gov/pubmed/37292144
http://dx.doi.org/10.3389/fgene.2023.1166975
_version_ 1785054777499975680
author Zhu, Wufei
Liao, Xingyu
author_facet Zhu, Wufei
Liao, Xingyu
author_sort Zhu, Wufei
collection PubMed
description As the carrier of genetic information, RNA carries the information from genes to proteins. Transcriptome sequencing technology is an important way to obtain transcriptome sequences, and it is also the basis for transcriptome research. With the development of third-generation sequencing, long reads can cover full-length transcripts and reflect the composition of different isoforms. However, the high error rate of third-generation sequencing affects the accuracy of long reads and downstream analysis. Current error correction methods seldom consider the existence of different isoforms in RNA, which leads to a serious loss of isoform diversity. Here, we introduce LCAT (long-read error correction algorithm for transcriptome sequencing data), a wrapper algorithm of MECAT, to reduce the loss of isoform diversity while keeping MECAT’s error correction performance. The experimental results show that LCAT can not only improve the quality of transcriptome sequencing long reads but also retain the diversity of isoforms.
format Online
Article
Text
id pubmed-10245045
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10245045 2023-06-08 LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads Zhu, Wufei Liao, Xingyu Front Genet Genetics Frontiers Media S.A. 2023-05-24 /pmc/articles/PMC10245045/ /pubmed/37292144 http://dx.doi.org/10.3389/fgene.2023.1166975 Text en Copyright © 2023 Zhu and Liao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Zhu, Wufei
Liao, Xingyu
LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title_full LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title_fullStr LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title_full_unstemmed LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title_short LCAT: an isoform-sensitive error correction for transcriptome sequencing long reads
title_sort lcat: an isoform-sensitive error correction for transcriptome sequencing long reads
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10245045/
https://www.ncbi.nlm.nih.gov/pubmed/37292144
http://dx.doi.org/10.3389/fgene.2023.1166975
work_keys_str_mv AT zhuwufei lcatanisoformsensitiveerrorcorrectionfortranscriptomesequencinglongreads
AT liaoxingyu lcatanisoformsensitiveerrorcorrectionfortranscriptomesequencinglongreads