Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information

Inter-sample comparisons of T-cell receptor (TCR) repertoires are crucial for gaining a better understanding of the immunological states determined by different collections of T cells from different donor sites, cell types, and genetic and pathological backgrounds. For quantitative comparison, most...

Bibliographic Details
Main authors: Yokota, Ryo; Kaminaga, Yuki; Kobayashi, Tetsuya J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2017
Subjects: Immunology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5694755/
https://www.ncbi.nlm.nih.gov/pubmed/29187849
http://dx.doi.org/10.3389/fimmu.2017.01500
author Yokota, Ryo
Kaminaga, Yuki
Kobayashi, Tetsuya J.
collection PubMed
description Inter-sample comparisons of T-cell receptor (TCR) repertoires are crucial for gaining a better understanding of the immunological states determined by different collections of T cells from different donor sites, cell types, and genetic and pathological backgrounds. For quantitative comparison, most previous studies utilized conventional methods in ecology, which focus on TCR sequences that overlap between pairwise samples. Some recent studies attempted another approach that is categorized into Poisson abundance models using the abundance distribution of observed TCR sequences. However, these methods ignore the details of the measured sequences and are consequently unable to identify sub-repertoires that might have important contributions to the observed inter-sample differences. Moreover, the sparsity of sequence data due to the huge diversity of repertoires hampers the performance of these methods, especially when few overlapping sequences exist. In this paper, we propose a new approach for REpertoire COmparison in Low Dimensions (RECOLD) based on TCR sequence information, which can estimate the low-dimensional structure by embedding the pairwise sequence dissimilarities in high-dimensional sequence space. The inter-sample differences between repertoires are then quantified by information-theoretic measures among the distributions of data estimated in the embedded space. Using datasets of mouse and human TCR repertoires, we demonstrate that RECOLD can accurately identify the inter-sample hierarchical structures, which have a good correspondence with our intuitive understanding about sample conditions. Moreover, for the dataset of transgenic mice that have strong restrictions on the diversity of their repertoires, our estimated inter-sample structure was consistent with the structure estimated by previous methods based on abundance or overlapping sequence information. 
For the dataset of human healthy donors and Sézary syndrome patients, our method also showed robust estimation performance even under the condition of high sparsity in TCR sequences, while previous studies failed to estimate the structure. In addition, we identified the sequences that contribute to the pairwise-sample differences between the repertoires with the different genetic backgrounds of mice. Such identification of the sequences contributing to variation in immune cell repertoires may provide substantial insight for the development of new immunotherapies and vaccines.
format Online
Article
Text
id pubmed-5694755
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5694755 2017-11-29 Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information. Yokota, Ryo; Kaminaga, Yuki; Kobayashi, Tetsuya J. Front Immunol. Immunology. Frontiers Media S.A. 2017-11-15 /pmc/articles/PMC5694755/ /pubmed/29187849 http://dx.doi.org/10.3389/fimmu.2017.01500 Text en Copyright © 2017 Yokota, Kaminaga and Kobayashi. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information
topic Immunology