
Accurate quantification of transcriptome from RNA-Seq data by effective length normalization

We propose a novel, efficient and intuitive approach of estimating mRNA abundances from the whole transcriptome shotgun sequencing (RNA-Seq) data. Our method, NEUMA (Normalization by Expected Uniquely Mappable Area), is based on effective length normalization using uniquely mappable areas of gene an...


Bibliographic Details
Main authors: Lee, Soohyun, Seo, Chae Hwa, Lim, Byungho, Yang, Jin Ok, Oh, Jeongsu, Kim, Minjin, Lee, Sooncheol, Lee, Byungwook, Kang, Changwon, Lee, Sanghyuk
Format: Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3025570/
https://www.ncbi.nlm.nih.gov/pubmed/21059678
http://dx.doi.org/10.1093/nar/gkq1015
_version_ 1782196925952425984
author Lee, Soohyun
Seo, Chae Hwa
Lim, Byungho
Yang, Jin Ok
Oh, Jeongsu
Kim, Minjin
Lee, Sooncheol
Lee, Byungwook
Kang, Changwon
Lee, Sanghyuk
author_sort Lee, Soohyun
collection PubMed
description We propose a novel, efficient and intuitive approach to estimating mRNA abundances from whole transcriptome shotgun sequencing (RNA-Seq) data. Our method, NEUMA (Normalization by Expected Uniquely Mappable Area), is based on effective length normalization using uniquely mappable areas of gene and mRNA isoform models. Using a known transcriptome sequence model such as RefSeq, NEUMA pre-computes the numbers of all possible gene-wise and isoform-wise informative reads: the former being sequences mapped exclusively to all mRNA isoforms of a single gene and the latter being sequences uniquely mapped to a single mRNA isoform. The results are used to estimate the effective length of genes and transcripts, taking experimental distributions of fragment size into consideration. Quantitative RT–PCR based on 27 randomly selected genes in two human cell lines and computer simulation experiments demonstrated superior accuracy of NEUMA over other recently developed methods. NEUMA covers a large proportion of genes and mRNA isoforms and offers a measure of consistency (‘consistency coefficient’) for each gene between an independently measured gene-wise level and the sum of the isoform levels. NEUMA is applicable to both paired-end and single-end RNA-Seq data. We propose that NEUMA could become a standard method for quantifying gene transcript levels from RNA-Seq data.
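The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the NEUMA implementation: it assumes single-end reads of a fixed length, exact matching, and no fragment size distribution. The function names, the toy sequences, and the per-kilobase, per-million scaling in normalized_level are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def informative_read_counts(transcripts, gene_of, read_len):
    """Count distinct read-length sequences that are isoform-wise or
    gene-wise informative, following the definitions in the abstract."""
    # Map every read-length substring to the set of isoforms containing it.
    owners = defaultdict(set)
    for iso, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            owners[seq[i:i + read_len]].add(iso)

    # Group isoforms by gene so "all isoforms of a gene" can be checked.
    isoforms_of = defaultdict(set)
    for iso in transcripts:
        isoforms_of[gene_of[iso]].add(iso)

    iso_count = defaultdict(int)
    gene_count = defaultdict(int)
    for isos in owners.values():
        # Isoform-wise informative: found in exactly one isoform.
        if len(isos) == 1:
            iso_count[next(iter(isos))] += 1
        # Gene-wise informative: found in all isoforms of one gene
        # and in no isoform of any other gene.
        genes = {gene_of[i] for i in isos}
        if len(genes) == 1:
            gene = next(iter(genes))
            if isos == isoforms_of[gene]:
                gene_count[gene] += 1
    return dict(iso_count), dict(gene_count)


def normalized_level(informative_reads, effective_length, total_reads):
    """Reads per kilobase of effective length per million reads
    (hypothetical scaling, not taken from the record)."""
    if effective_length == 0:
        return None  # nothing uniquely mappable, so no estimate
    return informative_reads * 1e9 / (effective_length * total_reads)


# Toy transcriptome: gene A has two isoforms, gene B has one.
transcripts = {"A1": "ACGTACGGTT", "A2": "ACGTACGCCA", "B1": "TTGACCGTAA"}
gene_of = {"A1": "A", "A2": "A", "B1": "B"}
iso_counts, gene_counts = informative_read_counts(transcripts, gene_of, 4)
print(iso_counts, gene_counts)
```

In this toy case the sequence CGTA occurs in isoforms of both genes, so it contributes to no count, which is the effect the uniquely mappable area is meant to capture.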
format Text
id pubmed-3025570
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3025570 2011-01-24 Accurate quantification of transcriptome from RNA-Seq data by effective length normalization Lee, Soohyun Seo, Chae Hwa Lim, Byungho Yang, Jin Ok Oh, Jeongsu Kim, Minjin Lee, Sooncheol Lee, Byungwook Kang, Changwon Lee, Sanghyuk Nucleic Acids Res Methods Online Oxford University Press 2011-01 2010-11-08 /pmc/articles/PMC3025570/ /pubmed/21059678 http://dx.doi.org/10.1093/nar/gkq1015 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accurate quantification of transcriptome from RNA-Seq data by effective length normalization
topic Methods Online