Computational Inference, Validation, and Analysis of 5’UTR-Leader Sequences of Alleles of Immunoglobulin Heavy Chain Variable Genes

Upstream and downstream sequences of immunoglobulin genes may affect the expression of such genes. However, these sequences are rarely studied or characterized in most studies of immunoglobulin repertoires. Inference from large, rearranged immunoglobulin transcriptome data sets offers an opportunity...

Bibliographic Details
Main authors: Huang, Yixun; Thörnqvist, Linnea; Ohlin, Mats
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Immunology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521166/
https://www.ncbi.nlm.nih.gov/pubmed/34671351
http://dx.doi.org/10.3389/fimmu.2021.730105
author Huang, Yixun
Thörnqvist, Linnea
Ohlin, Mats
collection PubMed
description Upstream and downstream sequences of immunoglobulin genes may affect the expression of such genes. However, these sequences are rarely studied or characterized in most studies of immunoglobulin repertoires. Inference from large, rearranged immunoglobulin transcriptome data sets offers an opportunity to define the upstream regions (5’-untranslated regions and leader sequences). We have now established a new data pre-processing procedure to eliminate artifacts caused by a 5’-RACE library generation process, reanalyzed a previously studied data set defining human immunoglobulin heavy chain genes, and identified novel upstream regions, as well as previously identified upstream regions that may have been identified in error. Upstream sequences were also identified for a set of previously uncharacterized germline gene alleles. Several novel upstream region variants were validated, for instance by their segregation to a single haplotype in heterozygotic subjects. SNPs representing several sequence variants were identified from population data. Finally, based on the outcomes of the analysis, we define a set of testable hypotheses with respect to the placement of particular alleles in complex IGHV locus haplotypes, and discuss the evolutionary relatedness of particular heavy chain variable genes based on sequences of their upstream regions.
format Online
Article
Text
id pubmed-8521166
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8521166 2021-10-19 Front Immunol Immunology Frontiers Media S.A. 2021-10-04 /pmc/articles/PMC8521166/ /pubmed/34671351 http://dx.doi.org/10.3389/fimmu.2021.730105 Text en Copyright © 2021 Huang, Thörnqvist and Ohlin https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Computational Inference, Validation, and Analysis of 5’UTR-Leader Sequences of Alleles of Immunoglobulin Heavy Chain Variable Genes
topic Immunology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521166/
https://www.ncbi.nlm.nih.gov/pubmed/34671351
http://dx.doi.org/10.3389/fimmu.2021.730105