
GenArk: Towards a million UCSC genome browsers

Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. T...


Bibliographic Details
Main authors: Clawson, Hiram; Lee, Brian T; Raney, Brian J; Barber, Galt P; Casper, Jonathan; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Christopher M; Nassar, Luis R; Perez, Gerardo; Wick, Brittney; Schmelter, Daniel; Speir, Matthew L; Armstrong, Joel; Zweig, Ann S; Kuhn, Robert M; Kirilenko, Bogdan M.; Hiller, Michael; Haussler, David; Kent, W James; Haeussler, Maximilian
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104252/
https://www.ncbi.nlm.nih.gov/pubmed/37066427
http://dx.doi.org/10.21203/rs.3.rs-2697398/v1
_version_ 1785025999760523264
author Clawson, Hiram
Lee, Brian T
Raney, Brian J
Barber, Galt P
Casper, Jonathan
Diekhans, Mark
Fischer, Clay
Gonzalez, Jairo Navarro
Hinrichs, Angie S
Lee, Christopher M
Nassar, Luis R
Perez, Gerardo
Wick, Brittney
Schmelter, Daniel
Speir, Matthew L
Armstrong, Joel
Zweig, Ann S
Kuhn, Robert M
Kirilenko, Bogdan M.
Hiller, Michael
Haussler, David
Kent, W James
Haeussler, Maximilian
author_sort Clawson, Hiram
collection PubMed
description Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. The results can push the limits of the existing data storage infrastructure. To facilitate managing this increasing number of genome assemblies, we created the Genome Archive (GenArk) collection of UCSC Genome Browsers from assemblies hosted at NCBI (1). Built on our established assembly hub system, this collection enables fast, on-demand visualization of chromosome regions without requiring a database server. Available annotations include gene models, some mapped through whole-genome alignments, repeat masks, GC content, and others. We also modified our popular BLAT (2) aligner and in-silico PCR to support a high number of genomes using limited RAM. Users can upload additional annotations themselves via track hubs (3) and custom tracks. We can import more annotations in bulk from third-party resources, demonstrated here with TOGA (4) gene models. Our system overcomes previous technical limits on the number of genomes and annotations. At the time of writing, 2,430 GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be found by searching on the main UCSC gateway page. We will continue to add all human high-quality assemblies and for other organisms, we are looking forward to receiving requests from the research community for ever more browsers and whole-genome alignments via http://genome.ucsc.edu/assemblyRequest.html.
format Online
Article
Text
id pubmed-10104252
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-101042522023-04-15 GenArk: Towards a million UCSC genome browsers Clawson, Hiram Lee, Brian T Raney, Brian J Barber, Galt P Casper, Jonathan Diekhans, Mark Fischer, Clay Gonzalez, Jairo Navarro Hinrichs, Angie S Lee, Christopher M Nassar, Luis R Perez, Gerardo Wick, Brittney Schmelter, Daniel Speir, Matthew L Armstrong, Joel Zweig, Ann S Kuhn, Robert M Kirilenko, Bogdan M. Hiller, Michael Haussler, David Kent, W James Haeussler, Maximilian Res Sq Article Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. The results can push the limits of the existing data storage infrastructure. To facilitate managing this increasing number of genome assemblies, we created the Genome Archive (GenArk) collection of UCSC Genome Browsers from assemblies hosted at NCBI (1). Built on our established assembly hub system, this collection enables fast, on-demand visualization of chromosome regions without requiring a database server. Available annotations include gene models, some mapped through whole-genome alignments, repeat masks, GC content, and others. We also modified our popular BLAT (2) aligner and in-silico PCR to support a high number of genomes using limited RAM. Users can upload additional annotations themselves via track hubs (3) and custom tracks. We can import more annotations in bulk from third-party resources, demonstrated here with TOGA (4) gene models. Our system overcomes previous technical limits on the number of genomes and annotations. At the time of writing, 2,430 GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be found by searching on the main UCSC gateway page. 
We will continue to add all human high-quality assemblies and for other organisms, we are looking forward to receiving requests from the research community for ever more browsers and whole-genome alignments via http://genome.ucsc.edu/assemblyRequest.html. American Journal Experts 2023-04-03 /pmc/articles/PMC10104252/ /pubmed/37066427 http://dx.doi.org/10.21203/rs.3.rs-2697398/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title GenArk: Towards a million UCSC genome browsers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104252/
https://www.ncbi.nlm.nih.gov/pubmed/37066427
http://dx.doi.org/10.21203/rs.3.rs-2697398/v1