Likelihood-Based Inference of B Cell Clonal Families

The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called “rearrangement” forming progenitor B cells, then a Darwinian process of lineage diversification and selection called “af...

Bibliographic Details
Main authors: Ralph, Duncan K., Matsen, Frederick A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5066976/
https://www.ncbi.nlm.nih.gov/pubmed/27749910
http://dx.doi.org/10.1371/journal.pcbi.1005086
_version_ 1782460574051860480
author Ralph, Duncan K.
Matsen, Frederick A.
collection PubMed
description The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called “rearrangement” forming progenitor B cells, then a Darwinian process of lineage diversification and selection called “affinity maturation.” The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem “clonal family inference.” In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
format Online
Article
Text
id pubmed-5066976
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5066976 2016-10-27 Likelihood-Based Inference of B Cell Clonal Families Ralph, Duncan K. Matsen, Frederick A. PLoS Comput Biol Research Article
Public Library of Science 2016-10-17 /pmc/articles/PMC5066976/ /pubmed/27749910 http://dx.doi.org/10.1371/journal.pcbi.1005086 Text en © 2016 Ralph, Matsen http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Ralph, Duncan K.
Matsen, Frederick A.
Likelihood-Based Inference of B Cell Clonal Families
title Likelihood-Based Inference of B Cell Clonal Families
topic Research Article
work_keys_str_mv AT ralphduncank likelihoodbasedinferenceofbcellclonalfamilies
AT matsenfredericka likelihoodbasedinferenceofbcellclonalfamilies
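
The description above mentions an agglomerative algorithm for finding a maximum likelihood clustering. The general idea of likelihood-driven agglomeration can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the multi-HMM likelihood from the paper is replaced by a toy model in which every cluster has one unknown ancestor and each site mutates independently with probability `mu`. The function names `log_likelihood` and `agglomerate`, the `mu` parameter, and the example sequences are all hypothetical.

```python
import math
from itertools import combinations

BASES = "ACGT"


def log_likelihood(seqs, mu=0.05):
    """Toy log marginal likelihood that all seqs share one unknown ancestor.

    Assumption (not from the paper): equal-length sequences, a uniform
    prior on the ancestral base at each site, and independent per-site
    substitution with probability mu.
    """
    total = 0.0
    for column in zip(*seqs):
        site = 0.0
        for ancestor in BASES:
            p = 0.25  # uniform prior over the ancestral base
            for base in column:
                p *= (1.0 - mu) if base == ancestor else mu / 3.0
            site += p
        total += math.log(site)
    return total


def agglomerate(seqs, mu=0.05):
    """Greedily merge the pair of clusters with the largest likelihood gain.

    Starts from singletons and stops when no merge raises the total
    log likelihood of the partition.
    """
    clusters = [[i] for i in range(len(seqs))]

    def ll(cluster):
        return log_likelihood([seqs[i] for i in cluster], mu)

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(range(len(clusters)), 2):
            # Log likelihood ratio of "one family" versus "two families".
            gain = ll(clusters[a] + clusters[b]) - ll(clusters[a]) - ll(clusters[b])
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # every remaining merge would lower the likelihood
        a, b = best_pair
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    reads = ["AAAAAAAA", "AAAAAAAT", "CCCCCCCC", "CCCCCCCG"]
    print(agglomerate(reads))
```

With the four example reads, the two near-identical pairs are merged and the two resulting clusters are kept apart, because joining them would lower the partition likelihood. The toy likelihood multiplies raw probabilities, so it is only suitable for small inputs.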