
The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data

The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) proje...


Bibliographic Details
Main authors: Wilks, Christopher, Cline, Melissa S., Weiler, Erich, Diehkans, Mark, Craft, Brian, Martin, Christy, Murphy, Daniel, Pierce, Howdy, Black, John, Nelson, Donavan, Litzinger, Brian, Hatton, Thomas, Maltbie, Lori, Ainsworth, Michael, Allen, Patrick, Rosewood, Linda, Mitchell, Elizabeth, Smith, Bradley, Warner, Jim, Groboske, John, Telc, Haifang, Wilson, Daniel, Sanford, Brian, Schmidt, Hannes, Haussler, David, Maltbie, Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4178372/
https://www.ncbi.nlm.nih.gov/pubmed/25267794
http://dx.doi.org/10.1093/database/bau093
author Wilks, Christopher
Cline, Melissa S.
Weiler, Erich
Diehkans, Mark
Craft, Brian
Martin, Christy
Murphy, Daniel
Pierce, Howdy
Black, John
Nelson, Donavan
Litzinger, Brian
Hatton, Thomas
Maltbie, Lori
Ainsworth, Michael
Allen, Patrick
Rosewood, Linda
Mitchell, Elizabeth
Smith, Bradley
Warner, Jim
Groboske, John
Telc, Haifang
Wilson, Daniel
Sanford, Brian
Schmidt, Hannes
Haussler, David
Maltbie, Daniel
author_sort Wilks, Christopher
collection PubMed
description The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. The CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, enforce patient genome confidentiality in data storage and transmission and optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu
format Online
Article
Text
id pubmed-4178372
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4178372 2014-10-03 Database (Oxford) Original Article Oxford University Press 2014-09-27 Text en Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
title The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data
title_sort cancer genomics hub (cghub): overcoming cancer through the power of torrential data
topic Original Article