
Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome

The domestic dog has evolved to be an important biomedical model for studies regarding the genetic basis of disease, morphology and behavior. Genetic studies in the dog have relied on a draft reference genome of a purebred female boxer dog named “Tasha” initially published in 2005. Derived from a Sa...


Bibliographic Details
Main Authors: Jagannathan, Vidhya; Hitte, Christophe; Kidd, Jeffrey M.; Masterson, Patrick; Murphy, Terence D.; Emery, Sarah; Davis, Brian; Buckley, Reuben M.; Liu, Yan-Hu; Zhang, Xiang-Quan; Leeb, Tosso; Zhang, Ya-Ping; Ostrander, Elaine A.; Wang, Guo-Dong
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8228171/
https://www.ncbi.nlm.nih.gov/pubmed/34070911
http://dx.doi.org/10.3390/genes12060847
author Jagannathan, Vidhya
Hitte, Christophe
Kidd, Jeffrey M.
Masterson, Patrick
Murphy, Terence D.
Emery, Sarah
Davis, Brian
Buckley, Reuben M.
Liu, Yan-Hu
Zhang, Xiang-Quan
Leeb, Tosso
Zhang, Ya-Ping
Ostrander, Elaine A.
Wang, Guo-Dong
collection PubMed
description The domestic dog has evolved to be an important biomedical model for studies regarding the genetic basis of disease, morphology and behavior. Genetic studies in the dog have relied on a draft reference genome of a purebred female boxer dog named “Tasha” initially published in 2005. Derived from a Sanger whole genome shotgun sequencing approach coupled with limited clone-based sequencing, the initial assembly and subsequent updates have served as the predominant resource for canine genetics for 15 years. While the initial assembly produced a good-quality draft, as with all assemblies produced at the time, it contained gaps, assembly errors and missing sequences, particularly in GC-rich regions, which are found at many promoters and in the first exons of protein-coding genes. Here, we present Dog10K_Boxer_Tasha_1.0, an improved chromosome-level highly contiguous genome assembly of Tasha created with long-read technologies that increases sequence contiguity >100-fold, closes >23,000 gaps of the CanFam3.1 reference assembly and improves gene annotation by identifying >1200 new protein-coding transcripts. The assembly and annotation are available at NCBI under the accession GCF_000002285.5.
format Online
Article
Text
id pubmed-8228171
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8228171 2021-06-26 Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome. Genes (Basel). Article. MDPI 2021-05-30. /pmc/articles/PMC8228171/ /pubmed/34070911 http://dx.doi.org/10.3390/genes12060847 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome
topic Article