An improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome

BACKGROUND: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Furthermore, a high-quality reference...

Bibliographic Details
Main authors: Davenport, Kimberly M, Bickhart, Derek M, Worley, Kim, Murali, Shwetha C, Salavati, Mazdak, Clark, Emily L, Cockett, Noelle E, Heaton, Michael P, Smith, Timothy P L, Murdoch, Brenda M, Rosen, Benjamin D
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848310/
https://www.ncbi.nlm.nih.gov/pubmed/35134925
http://dx.doi.org/10.1093/gigascience/giab096
author Davenport, Kimberly M
Bickhart, Derek M
Worley, Kim
Murali, Shwetha C
Salavati, Mazdak
Clark, Emily L
Cockett, Noelle E
Heaton, Michael P
Smith, Timothy P L
Murdoch, Brenda M
Rosen, Benjamin D
author_sort Davenport, Kimberly M
collection PubMed
description BACKGROUND: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Furthermore, a high-quality reference genome allows for precise functional annotation of gene regulatory elements. The rapid advances in genome assembly algorithms and emergence of sequencing technologies with increasingly long reads provide the opportunity for an improved de novo assembly of the sheep reference genome. FINDINGS: Short-read Illumina (55× coverage), long-read Pacific Biosciences (75× coverage), and Hi-C data from this ewe retrieved from public databases were combined with an additional 50× coverage of Oxford Nanopore data and assembled with canu v1.9. The assembled contigs were scaffolded using Hi-C data with Salsa v2.2, gaps filled with PBsuite v15.8.24, and polished with Nanopolish v0.12.5. After duplicate contig removal with PurgeDups v1.0.1, chromosomes were oriented and polished with 2 rounds of a pipeline that consisted of freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus fasta. The ARS-UI_Ramb_v2.0 assembly is 2.63 Gb in length and has improved continuity (contig NG50 of 43.18 Mb), with a 19- and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies. CONCLUSIONS: The ARS-UI_Ramb_v2.0 assembly is a substantial improvement in contiguity that will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits in sheep.
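The description above reports continuity as a contig NG50. As a rough illustration of how that statistic is usually defined, the sketch below sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of an assumed genome size. The function name `ng50`, the toy contig lengths, and the toy genome size are hypothetical and chosen only for illustration; they are not values from the article, and this is not the authors' evaluation code.

```python
def ng50(contig_lengths, genome_size):
    """Return the contig length at which the cumulative length of contigs,
    taken from longest to shortest, first reaches half of genome_size.
    Returns None if the contigs never cover half of genome_size."""
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return None


# Hypothetical toy data (arbitrary units), not taken from the article.
toy_contigs = [40, 30, 20, 10, 5]
toy_genome_size = 160

# NG50 uses the assumed genome size as the denominator.
print("NG50:", ng50(toy_contigs, toy_genome_size))

# Passing the assembly's own total length instead gives the plain N50.
print("N50:", ng50(toy_contigs, sum(toy_contigs)))
```

Because NG50 is tied to a fixed genome size rather than to each assembly's own total length, it is generally the more comparable figure when assemblies of the same species differ in total size.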
format Online
Article
Text
id pubmed-8848310
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8848310 2022-02-17 Gigascience Data Note Oxford University Press 2022-02-04 /pmc/articles/PMC8848310/ /pubmed/35134925 http://dx.doi.org/10.1093/gigascience/giab096 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome
title_sort improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome
topic Data Note