Dataset for genome sequencing and de novo assembly of the Vietnamese bighead catfish (Clarias macrocephalus Günther, 1864)

Freshwater catfish of the genus Clarias, known as airbreathing catfish, are widespread and important for food security through small-scale inland fisheries and aquaculture. Limited genomic data are available for this important group of fishes. The bighead catfish (Clarias macrocephalus) is a com...

Bibliographic Details
Main authors: Duong, Thuy-Yen, Tan, Mun Hua, Lee, Yin Peng, Croft, Larry, Austin, Christopher M.
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7326715/
https://www.ncbi.nlm.nih.gov/pubmed/32637481
http://dx.doi.org/10.1016/j.dib.2020.105861
author Duong, Thuy-Yen
Tan, Mun Hua
Lee, Yin Peng
Croft, Larry
Austin, Christopher M.
author_sort Duong, Thuy-Yen
collection PubMed
description Freshwater catfish of the genus Clarias, known as airbreathing catfish, are widespread and important for food security through small-scale inland fisheries and aquaculture. Limited genomic data are available for this important group of fishes. The bighead catfish (Clarias macrocephalus) is a commercially farmed species in Southeast Asia that is threatened in its natural environment by habitat destruction, over-exploitation and competition from other introduced Clarias species. Despite its commercial importance and the threats to natural populations, public databases do not include any genomic data for C. macrocephalus. We present the first genomic data for the bighead catfish, generated by Illumina sequencing. A total of 128 Gb of sequence data in paired-end 150 bp reads were assembled de novo, generating a final assembly of 883 Mbp contained in 27,833 scaffolds (N50 length: 80.8 kbp), with BUSCO completeness assessments of 96.3% and 87.6% based on the metazoan and Actinopterygii ortholog datasets, respectively. Annotation of the genome predicted 21,124 gene sequences, which were assigned putative functions based on homology to existing protein sequences in public databases. Raw fastq reads and the final version of the genome assembly have been deposited in NCBI (BioProject: PRJNA604477, WGS: JAAGKR000000000, SRA: SRR11188453). The complete C. macrocephalus mitochondrial genome was also recovered from the same sequence read dataset and is available on NCBI (accession: MT109097), representing the first mitogenome for this species. Lastly, we find an expansion of the mb and ora1 genes, which is thought to be associated with adaptations to air-breathing and a semi-terrestrial lifestyle in this genus of catfish.
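The scaffold N50 quoted in the description is the length L such that scaffolds of length L or longer together hold at least half of the assembled bases. The sketch below is a minimal illustration of that definition, not the authors' pipeline: the function name `n50` and the toy scaffold lengths are hypothetical, and the depth estimate assumes that "128 Gb" means 128e9 bases and that the 883 Mbp assembly size approximates the genome size.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (bp); NOT taken from the
# C. macrocephalus assembly.
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)

# Rough read depth from the figures quoted in the description, assuming
# "128 Gb" means 128e9 bases and using the 883 Mbp assembly size as a
# stand-in for the true genome size.
approx_depth = 128e9 / 883e6
```

To reproduce the reported 80.8 kbp value, the same function would be applied to the scaffold lengths of the deposited assembly (WGS: JAAGKR000000000); the depth figure is only an order-of-magnitude check, since it ignores read trimming and filtering.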
format Online
Article
Text
id pubmed-7326715
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7326715 2020-07-06 Dataset for genome sequencing and de novo assembly of the Vietnamese bighead catfish (Clarias macrocephalus Günther, 1864) Duong, Thuy-Yen Tan, Mun Hua Lee, Yin Peng Croft, Larry Austin, Christopher M. Data Brief Biochemistry, Genetics and Molecular Biology Elsevier 2020-06-16 /pmc/articles/PMC7326715/ /pubmed/32637481 http://dx.doi.org/10.1016/j.dib.2020.105861 Text en © 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Dataset for genome sequencing and de novo assembly of the Vietnamese bighead catfish (Clarias macrocephalus Günther, 1864)
topic Biochemistry, Genetics and Molecular Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7326715/
https://www.ncbi.nlm.nih.gov/pubmed/32637481
http://dx.doi.org/10.1016/j.dib.2020.105861