The CAIRR Pipeline for Submitting Standards-Compliant B and T Cell Receptor Repertoire Sequencing Studies to the National Center for Biotechnology Information Repositories
The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of ada...
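Main authors: | Bukhari, Syed Ahmad Chan; O’Connor, Martin J.; Martínez-Romero, Marcos; Egyedi, Attila L.; Willrett, Debra; Graybeal, John; Musen, Mark A.; Rubelt, Florian; Cheung, Kei-Hoi; Kleinstein, Steven H. |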
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Immunology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105692/ https://www.ncbi.nlm.nih.gov/pubmed/30166985 http://dx.doi.org/10.3389/fimmu.2018.01877 |
_version_ | 1783349678267432960 |
---|---|
author | Bukhari, Syed Ahmad Chan O’Connor, Martin J. Martínez-Romero, Marcos Egyedi, Attila L. Willrett, Debra Graybeal, John Musen, Mark A. Rubelt, Florian Cheung, Kei-Hoi Kleinstein, Steven H. |
author_sort | Bukhari, Syed Ahmad Chan |
collection | PubMed |
description | The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of adaptive immune responses in vaccinology, infectious disease, autoimmunity, and cancer. The increasingly wide application of AIRR-seq is leading to a critical mass of studies being deposited in the public domain, offering the possibility of novel scientific insights through secondary analyses and meta-analyses. However, effective sharing of these large-scale data remains a challenge. The AIRR community has proposed minimal information about adaptive immune receptor repertoire (MiAIRR), a standard for reporting AIRR-seq studies. The MiAIRR standard has been operationalized using the National Center for Biotechnology Information (NCBI) repositories. Submissions of AIRR-seq data to the NCBI repositories typically use a combination of web-based and flat-file templates and include only a minimal amount of terminology validation. As a result, AIRR-seq studies at the NCBI are often described using inconsistent terminologies, limiting scientists’ ability to access, find, interoperate, and reuse the data sets. In order to improve metadata quality and ease submission of AIRR-seq studies to the NCBI, we have leveraged the software framework developed by the Center for Expanded Data Annotation and Retrieval (CEDAR), which develops technologies involving the use of data standards and ontologies to improve metadata quality. The resulting CEDAR-AIRR (CAIRR) pipeline enables data submitters to: (i) create web-based templates whose entries are controlled by ontology terms, (ii) generate and validate metadata, and (iii) submit the ontology-linked metadata and sequence files (FASTQ) to the NCBI BioProject, BioSample, and Sequence Read Archive databases. 
Overall, CAIRR provides a web-based metadata submission interface that supports compliance with the MiAIRR standard. This pipeline is available at http://cairr.miairr.org, and will facilitate the NCBI submission process and improve the metadata quality of AIRR-seq studies. |
format | Online Article Text |
id | pubmed-6105692 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6105692 2018-08-30 Front Immunol Immunology Frontiers Media S.A. 2018-08-16 /pmc/articles/PMC6105692/ /pubmed/30166985 http://dx.doi.org/10.3389/fimmu.2018.01877 Text en Copyright © 2018 Bukhari, O’Connor, Martínez-Romero, Egyedi, Willrett, Graybeal, Musen, Rubelt, Cheung and Kleinstein. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | The CAIRR Pipeline for Submitting Standards-Compliant B and T Cell Receptor Repertoire Sequencing Studies to the National Center for Biotechnology Information Repositories |
topic | Immunology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105692/ https://www.ncbi.nlm.nih.gov/pubmed/30166985 http://dx.doi.org/10.3389/fimmu.2018.01877 |