Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture

BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20 × data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting th...

Bibliographic Details
Main authors: Meadows, Jennifer R. S., Kidd, Jeffrey M., Wang, Guo-Dong, Parker, Heidi G., Schall, Peter Z., Bianchi, Matteo, Christmas, Matthew J., Bougiouri, Katia, Buckley, Reuben M., Hitte, Christophe, Nguyen, Anthony K., Wang, Chao, Jagannathan, Vidhya, Niskanen, Julia E., Frantz, Laurent A. F., Arumilli, Meharji, Hundi, Sruthi, Lindblad-Toh, Kerstin, Ginja, Catarina, Agustina, Kadek Karang, André, Catherine, Boyko, Adam R., Davis, Brian W., Drögemüller, Michaela, Feng, Xin-Yao, Gkagkavouzis, Konstantinos, Iliopoulos, Giorgos, Harris, Alexander C., Hytönen, Marjo K., Kalthoff, Daniela C., Liu, Yan-Hu, Lymberakis, Petros, Poulakakis, Nikolaos, Pires, Ana Elisabete, Racimo, Fernando, Ramos-Almodovar, Fabian, Savolainen, Peter, Venetsani, Semina, Tammen, Imke, Triantafyllidis, Alexandros, vonHoldt, Bridgett, Wayne, Robert K., Larson, Greger, Nicholas, Frank W., Lohi, Hannes, Leeb, Tosso, Zhang, Ya-Ping, Ostrander, Elaine A.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426128/
https://www.ncbi.nlm.nih.gov/pubmed/37582787
http://dx.doi.org/10.1186/s13059-023-03023-7
_version_ 1785089989677154304
author Meadows, Jennifer R. S.
Kidd, Jeffrey M.
Wang, Guo-Dong
Parker, Heidi G.
Schall, Peter Z.
Bianchi, Matteo
Christmas, Matthew J.
Bougiouri, Katia
Buckley, Reuben M.
Hitte, Christophe
Nguyen, Anthony K.
Wang, Chao
Jagannathan, Vidhya
Niskanen, Julia E.
Frantz, Laurent A. F.
Arumilli, Meharji
Hundi, Sruthi
Lindblad-Toh, Kerstin
Ginja, Catarina
Agustina, Kadek Karang
André, Catherine
Boyko, Adam R.
Davis, Brian W.
Drögemüller, Michaela
Feng, Xin-Yao
Gkagkavouzis, Konstantinos
Iliopoulos, Giorgos
Harris, Alexander C.
Hytönen, Marjo K.
Kalthoff, Daniela C.
Liu, Yan-Hu
Lymberakis, Petros
Poulakakis, Nikolaos
Pires, Ana Elisabete
Racimo, Fernando
Ramos-Almodovar, Fabian
Savolainen, Peter
Venetsani, Semina
Tammen, Imke
Triantafyllidis, Alexandros
vonHoldt, Bridgett
Wayne, Robert K.
Larson, Greger
Nicholas, Frank W.
Lohi, Hannes
Leeb, Tosso
Zhang, Ya-Ping
Ostrander, Elaine A.
author_sort Meadows, Jennifer R. S.
collection PubMed
description BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20 × data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. RESULTS: We report the analysis of > 48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. CONCLUSIONS: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03023-7.
format Online
Article
Text
id pubmed-10426128
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-104261282023-08-16 Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture Meadows, Jennifer R. S. Kidd, Jeffrey M. Wang, Guo-Dong Parker, Heidi G. Schall, Peter Z. Bianchi, Matteo Christmas, Matthew J. Bougiouri, Katia Buckley, Reuben M. Hitte, Christophe Nguyen, Anthony K. Wang, Chao Jagannathan, Vidhya Niskanen, Julia E. Frantz, Laurent A. F. Arumilli, Meharji Hundi, Sruthi Lindblad-Toh, Kerstin Ginja, Catarina Agustina, Kadek Karang André, Catherine Boyko, Adam R. Davis, Brian W. Drögemüller, Michaela Feng, Xin-Yao Gkagkavouzis, Konstantinos Iliopoulos, Giorgos Harris, Alexander C. Hytönen, Marjo K. Kalthoff, Daniela C. Liu, Yan-Hu Lymberakis, Petros Poulakakis, Nikolaos Pires, Ana Elisabete Racimo, Fernando Ramos-Almodovar, Fabian Savolainen, Peter Venetsani, Semina Tammen, Imke Triantafyllidis, Alexandros vonHoldt, Bridgett Wayne, Robert K. Larson, Greger Nicholas, Frank W. Lohi, Hannes Leeb, Tosso Zhang, Ya-Ping Ostrander, Elaine A. Genome Biol Research BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20 × data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. RESULTS: We report the analysis of > 48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. 
On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. CONCLUSIONS: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03023-7. BioMed Central 2023-08-15 /pmc/articles/PMC10426128/ /pubmed/37582787 http://dx.doi.org/10.1186/s13059-023-03023-7 Text en © The Author(s) 2023, corrected publication 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. 
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture
title_sort genome sequencing of 2000 canids by the dog10k consortium advances the understanding of demography, genome function and architecture
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10426128/
https://www.ncbi.nlm.nih.gov/pubmed/37582787
http://dx.doi.org/10.1186/s13059-023-03023-7