
A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling

Inspired by the production of reference data sets in the Genome in a Bottle project, we sequenced one Charolais heifer with different technologies: Illumina paired-end, Oxford Nanopore, Pacific Biosciences (HiFi and CLR), 10X Genomics linked-reads, and Hi-C. In order to generate haplotypic assemblie...


Bibliographic Details
Main Authors: Eché, Camille; Iampietro, Carole; Birbes, Clément; Dréau, Andreea; Kuchly, Claire; Di Franco, Arnaud; Klopp, Christophe; Faraut, Thomas; Djebali, Sarah; Castinel, Adrien; Zytnicki, Matthias; Denis, Erwan; Boussaha, Mekki; Grohs, Cécile; Boichard, Didier; Gaspin, Christine; Milan, Denis; Donnadieu, Cécile
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10250393/
https://www.ncbi.nlm.nih.gov/pubmed/37291142
http://dx.doi.org/10.1038/s41597-023-02249-1
author Eché, Camille
Iampietro, Carole
Birbes, Clément
Dréau, Andreea
Kuchly, Claire
Di Franco, Arnaud
Klopp, Christophe
Faraut, Thomas
Djebali, Sarah
Castinel, Adrien
Zytnicki, Matthias
Denis, Erwan
Boussaha, Mekki
Grohs, Cécile
Boichard, Didier
Gaspin, Christine
Milan, Denis
Donnadieu, Cécile
author_sort Eché, Camille
collection PubMed
description Inspired by the production of reference data sets in the Genome in a Bottle project, we sequenced one Charolais heifer with different technologies: Illumina paired-end, Oxford Nanopore, Pacific Biosciences (HiFi and CLR), 10X Genomics linked-reads, and Hi-C. To generate haplotypic assemblies, we also sequenced both parents with short reads. From these data, we built two haplotyped, trio-based, high-quality reference genomes and a consensus assembly, using up-to-date software packages. The assemblies obtained using PacBio HiFi reach a size of 3.2 Gb, which is significantly larger than the 2.7 Gb ARS-UCD1.2 reference. The consensus assembly reaches a BUSCO completeness of 95.8% for highly conserved mammalian genes. We also identified 35,866 structural variants larger than 50 base pairs. This assembly is a contribution to the bovine pangenome for the “Charolais” breed. These datasets will be useful resources, enabling the community to gain additional insight into sequencing technologies for applications such as SNP, indel, or structural variant calling, and de novo assembly.
format Online
Article
Text
id pubmed-10250393
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10250393 2023-06-10 A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling Eché, Camille; Iampietro, Carole; Birbes, Clément; Dréau, Andreea; Kuchly, Claire; Di Franco, Arnaud; Klopp, Christophe; Faraut, Thomas; Djebali, Sarah; Castinel, Adrien; Zytnicki, Matthias; Denis, Erwan; Boussaha, Mekki; Grohs, Cécile; Boichard, Didier; Gaspin, Christine; Milan, Denis; Donnadieu, Cécile Sci Data Data Descriptor
Nature Publishing Group UK 2023-06-08 /pmc/articles/PMC10250393/ /pubmed/37291142 http://dx.doi.org/10.1038/s41597-023-02249-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling
topic Data Descriptor