Insufficiently complex unique-molecular identifiers (UMIs) distort small RNA sequencing

The attachment of unique molecular identifiers (UMIs) to RNA molecules prior to PCR amplification and sequencing makes it possible to amplify libraries to a level that is sufficient to identify rare molecules, whilst simultaneously eliminating PCR bias through the identification of duplicated reads...

Bibliographic Details
Main Authors: Saunders, Klay, Bert, Andrew G., Dredge, B. Kate, Toubia, John, Gregory, Philip A., Pillman, Katherine A., Goodall, Gregory J., Bracken, Cameron P.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7471316/
https://www.ncbi.nlm.nih.gov/pubmed/32884024
http://dx.doi.org/10.1038/s41598-020-71323-0
_version_ 1783578754781544448
author Saunders, Klay
Bert, Andrew G.
Dredge, B. Kate
Toubia, John
Gregory, Philip A.
Pillman, Katherine A.
Goodall, Gregory J.
Bracken, Cameron P.
author_facet Saunders, Klay
Bert, Andrew G.
Dredge, B. Kate
Toubia, John
Gregory, Philip A.
Pillman, Katherine A.
Goodall, Gregory J.
Bracken, Cameron P.
author_sort Saunders, Klay
collection PubMed
description The attachment of unique molecular identifiers (UMIs) to RNA molecules prior to PCR amplification and sequencing makes it possible to amplify libraries to a level that is sufficient to identify rare molecules, whilst simultaneously eliminating PCR bias through the identification of duplicated reads. Accurate de-duplication is dependent upon a sufficiently complex pool of UMIs to allow unique labelling. In applications dealing with complex libraries, such as total RNA-seq, only a limited variety of UMIs is required as the variation in molecules to be sequenced is enormous. However, when sequencing a less complex library, such as small RNAs for which there is a more limited range of possible sequences, we find that increased variation in UMIs is required, even beyond that provided in a commercial kit specifically designed for the preparation of small RNA libraries for sequencing. We show that a pool of UMIs randomly varying across eight nucleotides is not of sufficient depth to uniquely tag the microRNAs to be sequenced. This results in excessive de-duplication of reads and a marked under-estimation of the expression of the more abundant microRNAs. Whilst still arguing for the utility of UMIs, this work demonstrates the importance of their considered design to avoid errors in the estimation of gene expression in libraries derived from select regions of the transcriptome or small genomes.
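The saturation effect described in the abstract can be illustrated with a simple occupancy calculation. The sketch below is not the authors' analysis: it assumes each molecule receives a UMI drawn uniformly and independently from all 4^k sequences of length k, and it ignores PCR and sequencing errors. The function names are hypothetical and chosen only for this illustration.

```python
import math


def umi_pool_size(umi_length: int) -> int:
    """Number of distinct UMIs when each position is one of A, C, G, T."""
    return 4 ** umi_length


def expected_unique_umis(molecules: int, umi_length: int) -> float:
    """Expected number of distinct UMIs seen on `molecules` identical RNAs.

    Assumes uniform, independent UMI assignment (an idealisation).
    Uses E = N * (1 - (1 - 1/N)^n), computed with log1p/expm1 for
    numerical stability when N is large.
    """
    pool = umi_pool_size(umi_length)
    return -pool * math.expm1(molecules * math.log1p(-1.0 / pool))


if __name__ == "__main__":
    # Compare an 8-nt UMI with a longer, purely illustrative 12-nt UMI.
    for molecules in (1_000, 100_000, 1_000_000):
        for length in (8, 12):
            unique = expected_unique_umis(molecules, length)
            print(
                f"{molecules:>9} molecules, {length}-nt UMI: "
                f"~{unique:,.0f} counted after de-duplication"
            )
```

Under these assumptions an 8-nucleotide UMI offers 4^8 = 65,536 distinct tags, so the de-duplicated count for any single sequence cannot exceed 65,536 however many molecules were actually present. Abundant microRNAs therefore appear under-expressed once their true copy number approaches or exceeds the size of the pool.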
format Online
Article
Text
id pubmed-7471316
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7471316 2020-09-04 Insufficiently complex unique-molecular identifiers (UMIs) distort small RNA sequencing Saunders, Klay Bert, Andrew G. Dredge, B. Kate Toubia, John Gregory, Philip A. Pillman, Katherine A. Goodall, Gregory J. Bracken, Cameron P. Sci Rep Article The attachment of unique molecular identifiers (UMIs) to RNA molecules prior to PCR amplification and sequencing makes it possible to amplify libraries to a level that is sufficient to identify rare molecules, whilst simultaneously eliminating PCR bias through the identification of duplicated reads. Accurate de-duplication is dependent upon a sufficiently complex pool of UMIs to allow unique labelling. In applications dealing with complex libraries, such as total RNA-seq, only a limited variety of UMIs is required as the variation in molecules to be sequenced is enormous. However, when sequencing a less complex library, such as small RNAs for which there is a more limited range of possible sequences, we find that increased variation in UMIs is required, even beyond that provided in a commercial kit specifically designed for the preparation of small RNA libraries for sequencing. We show that a pool of UMIs randomly varying across eight nucleotides is not of sufficient depth to uniquely tag the microRNAs to be sequenced. This results in excessive de-duplication of reads and a marked under-estimation of the expression of the more abundant microRNAs. Whilst still arguing for the utility of UMIs, this work demonstrates the importance of their considered design to avoid errors in the estimation of gene expression in libraries derived from select regions of the transcriptome or small genomes.
Nature Publishing Group UK 2020-09-03 /pmc/articles/PMC7471316/ /pubmed/32884024 http://dx.doi.org/10.1038/s41598-020-71323-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Saunders, Klay
Bert, Andrew G.
Dredge, B. Kate
Toubia, John
Gregory, Philip A.
Pillman, Katherine A.
Goodall, Gregory J.
Bracken, Cameron P.
Insufficiently complex unique-molecular identifiers (UMIs) distort small RNA sequencing
title Insufficiently complex unique-molecular identifiers (UMIs) distort small RNA sequencing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7471316/
https://www.ncbi.nlm.nih.gov/pubmed/32884024
http://dx.doi.org/10.1038/s41598-020-71323-0