
Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis



Bibliographic Details
Main Authors:	Ostmeyer, Jared; Christley, Scott; Rounds, William H.; Toby, Inimary; Greenberg, Benjamin M.; Monson, Nancy L.; Cowell, Lindsay G.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588725/ https://www.ncbi.nlm.nih.gov/pubmed/28882107 http://dx.doi.org/10.1186/s12859-017-1814-6

author	Ostmeyer, Jared; Christley, Scott; Rounds, William H.; Toby, Inimary; Greenberg, Benjamin M.; Monson, Nancy L.; Cowell, Lindsay G.
author_sort	Ostmeyer, Jared
collection	PubMed
description	BACKGROUND: Deep sequencing of lymphocyte receptor repertoires has made it possible to comprehensively profile the clonal composition of lymphocyte populations. This opens the door for novel approaches to diagnose and prognosticate diseases with a driving immune component by identifying repertoire sequence patterns associated with clinical phenotypes. Indeed, recent studies support the feasibility of this, demonstrating an association between repertoire-level summary statistics (e.g., diversity) and patient outcomes for several diseases. In our own prior work, we have shown that six codons in VH4-containing genes in B cells from the cerebrospinal fluid of patients with relapsing remitting multiple sclerosis (RRMS) have higher replacement mutation frequencies than observed in healthy controls or patients with other neurological diseases. However, prior methods to date have been limited to focusing on repertoire-level summary statistics, ignoring the vast amounts of information in the millions of individual immune receptors comprising a repertoire. We have developed a novel method that addresses this limitation by using innovative approaches for accommodating the extraordinary sequence diversity of immune receptors and widely used machine learning approaches. We applied our method to RRMS, an autoimmune disease that is notoriously difficult to diagnose.
RESULTS: We use the biochemical features encoded by the complementarity determining region 3 of each B cell receptor heavy chain in every patient repertoire as input to a detector function, which is fit to give the correct diagnosis for each patient using maximum likelihood optimization methods. The resulting statistical classifier assigns patients to one of two diagnosis categories, RRMS or other neurological disease, with 87% accuracy by leave-one-out cross-validation on training data (N = 23) and 72% accuracy on unused data from a separate study (N = 102).
CONCLUSIONS: Our method is the first to apply statistical learning to immune repertoires to aid disease diagnosis, learning repertoire-level labels from the set of individual immune repertoire sequences. This method produced a repertoire-based statistical classifier for diagnosing RRMS that provides a high degree of diagnostic capability, rivaling the accuracy of diagnosis by a clinical expert. Additionally, this method points to a diagnostic biochemical motif in the antibodies of RRMS patients, which may offer insight into the disease process.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1814-6) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5588725
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5588725 2017-09-14 Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis Ostmeyer, Jared; Christley, Scott; Rounds, William H.; Toby, Inimary; Greenberg, Benjamin M.; Monson, Nancy L.; Cowell, Lindsay G. BMC Bioinformatics Research Article BioMed Central 2017-09-07 /pmc/articles/PMC5588725/ /pubmed/28882107 http://dx.doi.org/10.1186/s12859-017-1814-6 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588725/ https://www.ncbi.nlm.nih.gov/pubmed/28882107 http://dx.doi.org/10.1186/s12859-017-1814-6
