
The CAIRR Pipeline for Submitting Standards-Compliant B and T Cell Receptor Repertoire Sequencing Studies to the National Center for Biotechnology Information Repositories

The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of ada...


Bibliographic Details
Main authors: Bukhari, Syed Ahmad Chan; O’Connor, Martin J.; Martínez-Romero, Marcos; Egyedi, Attila L.; Willrett, Debra; Graybeal, John; Musen, Mark A.; Rubelt, Florian; Cheung, Kei-Hoi; Kleinstein, Steven H.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Immunology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105692/
https://www.ncbi.nlm.nih.gov/pubmed/30166985
http://dx.doi.org/10.3389/fimmu.2018.01877
author Bukhari, Syed Ahmad Chan
O’Connor, Martin J.
Martínez-Romero, Marcos
Egyedi, Attila L.
Willrett, Debra
Graybeal, John
Musen, Mark A.
Rubelt, Florian
Cheung, Kei-Hoi
Kleinstein, Steven H.
collection PubMed
description The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of adaptive immune responses in vaccinology, infectious disease, autoimmunity, and cancer. The increasingly wide application of AIRR-seq is leading to a critical mass of studies being deposited in the public domain, offering the possibility of novel scientific insights through secondary analyses and meta-analyses. However, effective sharing of these large-scale data remains a challenge. The AIRR community has proposed minimal information about adaptive immune receptor repertoire (MiAIRR), a standard for reporting AIRR-seq studies. The MiAIRR standard has been operationalized using the National Center for Biotechnology Information (NCBI) repositories. Submissions of AIRR-seq data to the NCBI repositories typically use a combination of web-based and flat-file templates and include only a minimal amount of terminology validation. As a result, AIRR-seq studies at the NCBI are often described using inconsistent terminologies, limiting scientists’ ability to access, find, interoperate, and reuse the data sets. In order to improve metadata quality and ease submission of AIRR-seq studies to the NCBI, we have leveraged the software framework developed by the Center for Expanded Data Annotation and Retrieval (CEDAR), which develops technologies involving the use of data standards and ontologies to improve metadata quality. The resulting CEDAR-AIRR (CAIRR) pipeline enables data submitters to: (i) create web-based templates whose entries are controlled by ontology terms, (ii) generate and validate metadata, and (iii) submit the ontology-linked metadata and sequence files (FASTQ) to the NCBI BioProject, BioSample, and Sequence Read Archive databases. 
Overall, CAIRR provides a web-based metadata submission interface that supports compliance with the MiAIRR standard. This pipeline is available at http://cairr.miairr.org, and will facilitate the NCBI submission process and improve the metadata quality of AIRR-seq studies.
format Online
Article
Text
id pubmed-6105692
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6105692 2018-08-30 Front Immunol Immunology Frontiers Media S.A. 2018-08-16 /pmc/articles/PMC6105692/ /pubmed/30166985 http://dx.doi.org/10.3389/fimmu.2018.01877 Text en Copyright © 2018 Bukhari, O’Connor, Martínez-Romero, Egyedi, Willrett, Graybeal, Musen, Rubelt, Cheung and Kleinstein. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The CAIRR Pipeline for Submitting Standards-Compliant B and T Cell Receptor Repertoire Sequencing Studies to the National Center for Biotechnology Information Repositories
topic Immunology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105692/
https://www.ncbi.nlm.nih.gov/pubmed/30166985
http://dx.doi.org/10.3389/fimmu.2018.01877