GenArk: Towards a million UCSC genome browsers
Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. T...
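Main authors: | Clawson, Hiram; Lee, Brian T; Raney, Brian J; Barber, Galt P; Casper, Jonathan; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Christopher M; Nassar, Luis R; Perez, Gerardo; Wick, Brittney; Schmelter, Daniel; Speir, Matthew L; Armstrong, Joel; Zweig, Ann S; Kuhn, Robert M; Kirilenko, Bogdan M.; Hiller, Michael; Haussler, David; Kent, W James; Haeussler, Maximilian |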
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Journal Experts, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104252/ https://www.ncbi.nlm.nih.gov/pubmed/37066427 http://dx.doi.org/10.21203/rs.3.rs-2697398/v1 |
_version_ | 1785025999760523264 |
---|---|
author | Clawson, Hiram; Lee, Brian T; Raney, Brian J; Barber, Galt P; Casper, Jonathan; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Christopher M; Nassar, Luis R; Perez, Gerardo; Wick, Brittney; Schmelter, Daniel; Speir, Matthew L; Armstrong, Joel; Zweig, Ann S; Kuhn, Robert M; Kirilenko, Bogdan M.; Hiller, Michael; Haussler, David; Kent, W James; Haeussler, Maximilian |
author_facet | Clawson, Hiram; Lee, Brian T; Raney, Brian J; Barber, Galt P; Casper, Jonathan; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Christopher M; Nassar, Luis R; Perez, Gerardo; Wick, Brittney; Schmelter, Daniel; Speir, Matthew L; Armstrong, Joel; Zweig, Ann S; Kuhn, Robert M; Kirilenko, Bogdan M.; Hiller, Michael; Haussler, David; Kent, W James; Haeussler, Maximilian |
author_sort | Clawson, Hiram |
collection | PubMed |
description | Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. The results can push the limits of the existing data storage infrastructure. To facilitate managing this increasing number of genome assemblies, we created the Genome Archive (GenArk) collection of UCSC Genome Browsers from assemblies hosted at NCBI (1). Built on our established assembly hub system, this collection enables fast, on-demand visualization of chromosome regions without requiring a database server. Available annotations include gene models, some mapped through whole-genome alignments, repeat masks, GC content, and others. We also modified our popular BLAT (2) aligner and in-silico PCR to support a large number of genomes using limited RAM. Users can upload additional annotations themselves via track hubs (3) and custom tracks. We can import more annotations in bulk from third-party resources, demonstrated here with TOGA (4) gene models. Our system overcomes previous technical limits on the number of genomes and annotations. At the time of writing, 2,430 GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be found by searching on the main UCSC gateway page. We will continue to add all high-quality human assemblies; for other organisms, we look forward to receiving requests from the research community for more browsers and whole-genome alignments via http://genome.ucsc.edu/assemblyRequest.html. |
format | Online Article Text |
id | pubmed-10104252 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Journal Experts |
record_format | MEDLINE/PubMed |
spelling | pubmed-101042522023-04-15 GenArk: Towards a million UCSC genome browsers Clawson, Hiram Lee, Brian T Raney, Brian J Barber, Galt P Casper, Jonathan Diekhans, Mark Fischer, Clay Gonzalez, Jairo Navarro Hinrichs, Angie S Lee, Christopher M Nassar, Luis R Perez, Gerardo Wick, Brittney Schmelter, Daniel Speir, Matthew L Armstrong, Joel Zweig, Ann S Kuhn, Robert M Kirilenko, Bogdan M. Hiller, Michael Haussler, David Kent, W James Haeussler, Maximilian Res Sq Article Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. The results can push the limits of the existing data storage infrastructure. To facilitate managing this increasing number of genome assemblies, we created the Genome Archive (GenArk) collection of UCSC Genome Browsers from assemblies hosted at NCBI (1). Built on our established assembly hub system, this collection enables fast, on-demand visualization of chromosome regions without requiring a database server. Available annotations include gene models, some mapped through whole-genome alignments, repeat masks, GC content, and others. We also modified our popular BLAT (2) aligner and in-silico PCR to support a high number of genomes using limited RAM. Users can upload additional annotations themselves via track hubs (3) and custom tracks. We can import more annotations in bulk from third-party resources, demonstrated here with TOGA (4) gene models. Our system overcomes previous technical limits on the number of genomes and annotations. At the time of writing, 2,430 GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be found by searching on the main UCSC gateway page. 
We will continue to add all high-quality human assemblies; for other organisms, we look forward to receiving requests from the research community for more browsers and whole-genome alignments via http://genome.ucsc.edu/assemblyRequest.html. American Journal Experts 2023-04-03 /pmc/articles/PMC10104252/ /pubmed/37066427 http://dx.doi.org/10.21203/rs.3.rs-2697398/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Clawson, Hiram Lee, Brian T Raney, Brian J Barber, Galt P Casper, Jonathan Diekhans, Mark Fischer, Clay Gonzalez, Jairo Navarro Hinrichs, Angie S Lee, Christopher M Nassar, Luis R Perez, Gerardo Wick, Brittney Schmelter, Daniel Speir, Matthew L Armstrong, Joel Zweig, Ann S Kuhn, Robert M Kirilenko, Bogdan M. Hiller, Michael Haussler, David Kent, W James Haeussler, Maximilian GenArk: Towards a million UCSC genome browsers |
title | GenArk: Towards a million UCSC genome browsers |
title_full | GenArk: Towards a million UCSC genome browsers |
title_fullStr | GenArk: Towards a million UCSC genome browsers |
title_full_unstemmed | GenArk: Towards a million UCSC genome browsers |
title_short | GenArk: Towards a million UCSC genome browsers |
title_sort | genark: towards a million ucsc genome browsers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104252/ https://www.ncbi.nlm.nih.gov/pubmed/37066427 http://dx.doi.org/10.21203/rs.3.rs-2697398/v1 |
work_keys_str_mv | AT clawsonhiram genarktowardsamillionucscgenomebrowsers AT leebriant genarktowardsamillionucscgenomebrowsers AT raneybrianj genarktowardsamillionucscgenomebrowsers AT barbergaltp genarktowardsamillionucscgenomebrowsers AT casperjonathan genarktowardsamillionucscgenomebrowsers AT diekhansmark genarktowardsamillionucscgenomebrowsers AT fischerclay genarktowardsamillionucscgenomebrowsers AT gonzalezjaironavarro genarktowardsamillionucscgenomebrowsers AT hinrichsangies genarktowardsamillionucscgenomebrowsers AT leechristopherm genarktowardsamillionucscgenomebrowsers AT nassarluisr genarktowardsamillionucscgenomebrowsers AT perezgerardo genarktowardsamillionucscgenomebrowsers AT wickbrittney genarktowardsamillionucscgenomebrowsers AT schmelterdaniel genarktowardsamillionucscgenomebrowsers AT speirmatthewl genarktowardsamillionucscgenomebrowsers AT armstrongjoel genarktowardsamillionucscgenomebrowsers AT zweiganns genarktowardsamillionucscgenomebrowsers AT kuhnrobertm genarktowardsamillionucscgenomebrowsers AT kirilenkobogdanm genarktowardsamillionucscgenomebrowsers AT hillermichael genarktowardsamillionucscgenomebrowsers AT hausslerdavid genarktowardsamillionucscgenomebrowsers AT kentwjames genarktowardsamillionucscgenomebrowsers AT haeusslermaximilian genarktowardsamillionucscgenomebrowsers |