
SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences

Single-exon coding sequences (CDSs), also known as ‘single-exon genes’ (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of h...

Bibliographic Details
Main authors: Jorquera, R, González, C, Clausen, P T L C, Petersen, B, Holmes, D S
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7904048/
https://www.ncbi.nlm.nih.gov/pubmed/33507271
http://dx.doi.org/10.1093/database/baab002
_version_ 1783654853384339456
author Jorquera, R
González, C
Clausen, P T L C
Petersen, B
Holmes, D S
author_sort Jorquera, R
collection PubMed
description Single-exon coding sequences (CDSs), also known as ‘single-exon genes’ (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB that houses DNA and protein sequence information of SEGs from 10 mammalian genomes including human. SinEx DB includes their functional predictions (KOG (euKaryotic Orthologous Groups)) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information of the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignations are available for downloading. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution, function of SEGs and their associations with disorders including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/
format Online
Article
Text
id pubmed-7904048
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-79040482021-03-01 SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences Jorquera, R González, C Clausen, P T L C Petersen, B Holmes, D S Database (Oxford) Database Update Single-exon coding sequences (CDSs), also known as ‘single-exon genes’ (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB that houses DNA and protein sequence information of SEGs from 10 mammalian genomes including human. SinEx DB includes their functional predictions (KOG (euKaryotic Orthologous Groups)) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information of the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignations are available for downloading. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution, function of SEGs and their associations with disorders including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/ Oxford University Press 2021-01-28 /pmc/articles/PMC7904048/ /pubmed/33507271 http://dx.doi.org/10.1093/database/baab002 Text en © The Author(s) 2021. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences
topic Database Update
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7904048/
https://www.ncbi.nlm.nih.gov/pubmed/33507271
http://dx.doi.org/10.1093/database/baab002