Ancestry patterns inferred from massive RNA-seq data

There is a growing body of evidence suggesting that patterns of gene expression vary within and between human populations. However, the impact of this variation in human diseases has been poorly explored, in part owing to the lack of a standardized protocol to estimate biogeographical ancestry from...

Bibliographic Details
Main authors: Barral-Arca, Ruth; Pardo-Seco, Jacobo; Bello, Xabi; Martinón-Torres, Federico; Salas, Antonio
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6573782/
https://www.ncbi.nlm.nih.gov/pubmed/31010885
http://dx.doi.org/10.1261/rna.070052.118
_version_ 1783427783643365376
author Barral-Arca, Ruth
Pardo-Seco, Jacobo
Bello, Xabi
Martinón-Torres, Federico
Salas, Antonio
author_facet Barral-Arca, Ruth
Pardo-Seco, Jacobo
Bello, Xabi
Martinón-Torres, Federico
Salas, Antonio
author_sort Barral-Arca, Ruth
collection PubMed
description There is a growing body of evidence suggesting that patterns of gene expression vary within and between human populations. However, the impact of this variation in human diseases has been poorly explored, in part owing to the lack of a standardized protocol to estimate biogeographical ancestry from gene expression studies. Here we examine several studies that provide new solid evidence indicating that the ancestral background of individuals impacts gene expression patterns. Next, we test a procedure to infer genetic ancestry from RNA-seq data in 25 data sets where information on ethnicity was reported. Genome data of reference continental populations retrieved from The 1000 Genomes Project were used for comparisons. Remarkably, only eight out of 25 data sets passed FastQC default filters. We demonstrate that, for these eight population sets, the ancestral background of donors could be inferred very efficiently, even in data sets including samples with complex patterns of admixture (e.g., American-admixed populations). For most of the gene expression data sets of suboptimal quality, ancestral inference yielded odd patterns. The present study thus brings a cautionary note for gene expression studies highlighting the importance to control for the potential confounding effect of ancestral genetic background.
format Online
Article
Text
id pubmed-6573782
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6573782 2020-07-01 Ancestry patterns inferred from massive RNA-seq data Barral-Arca, Ruth Pardo-Seco, Jacobo Bello, Xabi Martinón-Torres, Federico Salas, Antonio RNA Article Cold Spring Harbor Laboratory Press 2019-07 /pmc/articles/PMC6573782/ /pubmed/31010885 http://dx.doi.org/10.1261/rna.070052.118 Text en © 2019 Barral-Arca et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml).
After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
spellingShingle Article
Barral-Arca, Ruth
Pardo-Seco, Jacobo
Bello, Xabi
Martinón-Torres, Federico
Salas, Antonio
Ancestry patterns inferred from massive RNA-seq data
title Ancestry patterns inferred from massive RNA-seq data
title_full Ancestry patterns inferred from massive RNA-seq data
title_fullStr Ancestry patterns inferred from massive RNA-seq data
title_full_unstemmed Ancestry patterns inferred from massive RNA-seq data
title_short Ancestry patterns inferred from massive RNA-seq data
title_sort ancestry patterns inferred from massive rna-seq data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6573782/
https://www.ncbi.nlm.nih.gov/pubmed/31010885
http://dx.doi.org/10.1261/rna.070052.118
work_keys_str_mv AT barralarcaruth ancestrypatternsinferredfrommassivernaseqdata
AT pardosecojacobo ancestrypatternsinferredfrommassivernaseqdata
AT belloxabi ancestrypatternsinferredfrommassivernaseqdata
AT martinontorresfederico ancestrypatternsinferredfrommassivernaseqdata
AT salasantonio ancestrypatternsinferredfrommassivernaseqdata