False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors

Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. While this is well known, the downstream consequences of misalignment have not been fully characterized. We assessed the potential for incorrect alignment of RNA-sequen...

Bibliographic Details
Main Authors: Saha, Ashis; Battle, Alexis
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6305209/
https://www.ncbi.nlm.nih.gov/pubmed/30613398
http://dx.doi.org/10.12688/f1000research.17145.2
_version_ 1783382514857934848
author Saha, Ashis
Battle, Alexis
author_facet Saha, Ashis
Battle, Alexis
author_sort Saha, Ashis
collection PubMed
description Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. While this is well known, the downstream consequences of misalignment have not been fully characterized. We assessed the potential for incorrect alignment of RNA-sequencing reads to cause false positives in both gene expression quantitative trait locus (eQTL) and co-expression analyses. Trans-eQTLs identified from human RNA-sequencing studies appeared to be particularly affected by this phenomenon, even when only uniquely aligned reads are considered. Over 75% of trans-eQTLs using a standard pipeline occurred between regions of sequence similarity and therefore could be due to alignment errors. Further, associations due to mapping errors are likely to misleadingly replicate between studies. To help address this problem, we quantified the potential for "cross-mapping" to occur between every pair of annotated genes in the human genome. Such cross-mapping data can be used to filter or flag potential false positives in both trans-eQTL and co-expression analyses. Such filtering substantially alters the detection of significant associations and can have an impact on the assessment of false discovery rate, functional enrichment, and replication for RNA-sequencing association studies.
format Online
Article
Text
id pubmed-6305209
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-6305209 2019-01-03 False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors Saha, Ashis Battle, Alexis F1000Res Method Article Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. While this is well known, the downstream consequences of misalignment have not been fully characterized. We assessed the potential for incorrect alignment of RNA-sequencing reads to cause false positives in both gene expression quantitative trait locus (eQTL) and co-expression analyses. Trans-eQTLs identified from human RNA-sequencing studies appeared to be particularly affected by this phenomenon, even when only uniquely aligned reads are considered. Over 75% of trans-eQTLs using a standard pipeline occurred between regions of sequence similarity and therefore could be due to alignment errors. Further, associations due to mapping errors are likely to misleadingly replicate between studies. To help address this problem, we quantified the potential for "cross-mapping" to occur between every pair of annotated genes in the human genome. Such cross-mapping data can be used to filter or flag potential false positives in both trans-eQTL and co-expression analyses. Such filtering substantially alters the detection of significant associations and can have an impact on the assessment of false discovery rate, functional enrichment, and replication for RNA-sequencing association studies. F1000 Research Limited 2019-04-08 /pmc/articles/PMC6305209/ /pubmed/30613398 http://dx.doi.org/10.12688/f1000research.17145.2 Text en Copyright: © 2019 Saha A and Battle A http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Method Article
Saha, Ashis
Battle, Alexis
False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title_full False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title_fullStr False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title_full_unstemmed False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title_short False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors
title_sort false positives in trans-eqtl and co-expression analyses arising from rna-sequencing alignment errors
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6305209/
https://www.ncbi.nlm.nih.gov/pubmed/30613398
http://dx.doi.org/10.12688/f1000research.17145.2