CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure
CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure p...
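Main Authors: | Varabyou, Ales, Sommer, Markus J., Erdogdu, Beril, Shinder, Ida, Minkin, Ilia, Chao, Kuan-Hao, Park, Sukhwan, Heinz, Jakob, Pockrandt, Christopher, Shumate, Alaina, Rincon, Natalia, Puiu, Daniela, Steinegger, Martin, Salzberg, Steven L., Pertea, Mihaela |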
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Database |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614308/ https://www.ncbi.nlm.nih.gov/pubmed/37904256 http://dx.doi.org/10.1186/s13059-023-03088-4 |
_version_ | 1785129000473985024 |
---|---|
author | Varabyou, Ales Sommer, Markus J. Erdogdu, Beril Shinder, Ida Minkin, Ilia Chao, Kuan-Hao Park, Sukhwan Heinz, Jakob Pockrandt, Christopher Shumate, Alaina Rincon, Natalia Puiu, Daniela Steinegger, Martin Salzberg, Steven L. Pertea, Mihaela |
author_facet | Varabyou, Ales Sommer, Markus J. Erdogdu, Beril Shinder, Ida Minkin, Ilia Chao, Kuan-Hao Park, Sukhwan Heinz, Jakob Pockrandt, Christopher Shumate, Alaina Rincon, Natalia Puiu, Daniela Steinegger, Martin Salzberg, Steven L. Pertea, Mihaela |
author_sort | Varabyou, Ales |
collection | PubMed |
description | CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure prediction methods. CHESS 3 contains 41,356 genes, including 19,839 protein-coding genes and 158,377 transcripts, with 14,863 protein-coding transcripts not in other catalogs. It includes all MANE transcripts and at least one transcript for most RefSeq and GENCODE genes. On the CHM13 human genome, the CHESS 3 catalog contains an additional 129 protein-coding genes. CHESS 3 is available at http://ccb.jhu.edu/chess. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03088-4. |
format | Online Article Text |
id | pubmed-10614308 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-106143082023-10-31 CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure Varabyou, Ales Sommer, Markus J. Erdogdu, Beril Shinder, Ida Minkin, Ilia Chao, Kuan-Hao Park, Sukhwan Heinz, Jakob Pockrandt, Christopher Shumate, Alaina Rincon, Natalia Puiu, Daniela Steinegger, Martin Salzberg, Steven L. Pertea, Mihaela Genome Biol Database CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure prediction methods. CHESS 3 contains 41,356 genes, including 19,839 protein-coding genes and 158,377 transcripts, with 14,863 protein-coding transcripts not in other catalogs. It includes all MANE transcripts and at least one transcript for most RefSeq and GENCODE genes. On the CHM13 human genome, the CHESS 3 catalog contains an additional 129 protein-coding genes. CHESS 3 is available at http://ccb.jhu.edu/chess. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03088-4. BioMed Central 2023-10-30 /pmc/articles/PMC10614308/ /pubmed/37904256 http://dx.doi.org/10.1186/s13059-023-03088-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Database Varabyou, Ales Sommer, Markus J. Erdogdu, Beril Shinder, Ida Minkin, Ilia Chao, Kuan-Hao Park, Sukhwan Heinz, Jakob Pockrandt, Christopher Shumate, Alaina Rincon, Natalia Puiu, Daniela Steinegger, Martin Salzberg, Steven L. Pertea, Mihaela CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title | CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title_full | CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title_fullStr | CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title_full_unstemmed | CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title_short | CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
title_sort | chess 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure |
topic | Database |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614308/ https://www.ncbi.nlm.nih.gov/pubmed/37904256 http://dx.doi.org/10.1186/s13059-023-03088-4 |
work_keys_str_mv | AT varabyouales chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT sommermarkusj chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT erdogduberil chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT shinderida chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT minkinilia chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT chaokuanhao chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT parksukhwan chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT heinzjakob chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT pockrandtchristopher chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT shumatealaina chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT rinconnatalia chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT puiudaniela chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT steineggermartin chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT salzbergstevenl 
chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure AT perteamihaela chess3animprovedcomprehensivecatalogofhumangenesandtranscriptsbasedonlargescaleexpressiondataphylogeneticanalysisandproteinstructure |