
Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population

BACKGROUND: Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variations in the genome, but their biological importance or disease associati...


Bibliographic Details
Main Authors: Wong, Jing Hao; Shigemizu, Daichi; Yoshii, Yukiko; Akiyama, Shintaro; Tanaka, Azusa; Nakagawa, Hidewaki; Narumiya, Shu; Fujimoto, Akihiro
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6657090/
https://www.ncbi.nlm.nih.gov/pubmed/31340865
http://dx.doi.org/10.1186/s13073-019-0656-4
author Wong, Jing Hao
Shigemizu, Daichi
Yoshii, Yukiko
Akiyama, Shintaro
Tanaka, Azusa
Nakagawa, Hidewaki
Narumiya, Shu
Fujimoto, Akihiro
author_sort Wong, Jing Hao
collection PubMed
description BACKGROUND: Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variation in the genome, but their biological importance or disease association is not well studied, especially for deletions of intermediate sizes.
METHODS: We identified intermediate-sized deletions from whole-genome sequencing (WGS) data of Japanese samples (n = 174) with a novel deletion calling method which considered multiple samples. These deletions were used to construct a reference panel for use in imputation. Imputation was then conducted using the reference panel and data from 82 publicly available Japanese samples with gene expression data. The accuracy of the deletion calling and imputation was examined with Nanopore long-read sequencing technology. We also conducted an expression quantitative trait loci (eQTL) association analysis using the deletions to infer their functional impacts on genes, before characterizing the deletions causal for gene expression level changes.
RESULTS: We obtained a set of 4378 polymorphic high-confidence deletions and constructed a reference panel. The deletions were successfully imputed into the Japanese samples with high accuracy (97.3%). The eQTL analysis identified 181 deletions (4.1%) suggested as causal for gene expression level changes. The causal deletion candidates were significantly enriched in promoters, super-enhancers, and transcription elongation chromatin states. Generation of deletions in a cell line with the CRISPR-Cas9 system confirmed that they were indeed causative variants for gene expression change. Furthermore, one of the deletions was observed to affect the gene expression levels of a gene it was not located in.
CONCLUSIONS: This paper reports an accurate deletion calling method for genotype imputation at the whole-genome level and shows the importance of intermediate-sized deletions in the human population.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13073-019-0656-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6657090
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6657090 2019-07-31 Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population Wong, Jing Hao Shigemizu, Daichi Yoshii, Yukiko Akiyama, Shintaro Tanaka, Azusa Nakagawa, Hidewaki Narumiya, Shu Fujimoto, Akihiro Genome Med Research BACKGROUND: Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variation in the genome, but their biological importance or disease association is not well studied, especially for deletions of intermediate sizes. METHODS: We identified intermediate-sized deletions from whole-genome sequencing (WGS) data of Japanese samples (n = 174) with a novel deletion calling method which considered multiple samples. These deletions were used to construct a reference panel for use in imputation. Imputation was then conducted using the reference panel and data from 82 publicly available Japanese samples with gene expression data. The accuracy of the deletion calling and imputation was examined with Nanopore long-read sequencing technology. We also conducted an expression quantitative trait loci (eQTL) association analysis using the deletions to infer their functional impacts on genes, before characterizing the deletions causal for gene expression level changes. RESULTS: We obtained a set of 4378 polymorphic high-confidence deletions and constructed a reference panel. The deletions were successfully imputed into the Japanese samples with high accuracy (97.3%). The eQTL analysis identified 181 deletions (4.1%) suggested as causal for gene expression level changes. The causal deletion candidates were significantly enriched in promoters, super-enhancers, and transcription elongation chromatin states. Generation of deletions in a cell line with the CRISPR-Cas9 system confirmed that they were indeed causative variants for gene expression change. Furthermore, one of the deletions was observed to affect the gene expression levels of a gene it was not located in. CONCLUSIONS: This paper reports an accurate deletion calling method for genotype imputation at the whole-genome level and shows the importance of intermediate-sized deletions in the human population. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13073-019-0656-4) contains supplementary material, which is available to authorized users. BioMed Central 2019-07-24 /pmc/articles/PMC6657090/ /pubmed/31340865 http://dx.doi.org/10.1186/s13073-019-0656-4 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6657090/
https://www.ncbi.nlm.nih.gov/pubmed/31340865
http://dx.doi.org/10.1186/s13073-019-0656-4