Classification of cannabis strains in the Canadian market with discriminant analysis of principal components using genome-wide single nucleotide polymorphisms

The cannabis community typically uses the terms “Sativa” and “Indica” to characterize drug strains with high tetrahydrocannabinol (THC) levels. Because of large-scale, extensive, and unrecorded hybridization over the past 40 years, this vernacular naming convention has become unreliable and inadequate for...

Bibliographic Details
Main authors: Jin, Dan; Henry, Philippe; Shan, Jacqueline; Chen, Jie
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8238227/
https://www.ncbi.nlm.nih.gov/pubmed/34181676
http://dx.doi.org/10.1371/journal.pone.0253387
_version_ 1783714859409473536
author Jin, Dan
Henry, Philippe
Shan, Jacqueline
Chen, Jie
author_facet Jin, Dan
Henry, Philippe
Shan, Jacqueline
Chen, Jie
author_sort Jin, Dan
collection PubMed
description The cannabis community typically uses the terms “Sativa” and “Indica” to characterize drug strains with high tetrahydrocannabinol (THC) levels. Because of large-scale, extensive, and unrecorded hybridization over the past 40 years, this vernacular naming convention has become unreliable and inadequate for identifying or selecting strains for clinical research and medicinal production. Additionally, cannabidiol (CBD)-dominant strains and balanced (intermediate) strains, which have intermediate levels of THC and CBD, are not included in current classification studies despite the increasing research interest in the therapeutic potential of CBD. This paper is the first in a series of studies proposing that a new classification system be established based on genome-wide variation and supplemented by data on secondary metabolites and morphological characteristics. This study performed whole-genome sequencing of 23 cannabis strains marketed in Canada, aligned the sequences to a reference genome, and, after filtering for a minor allele frequency of 10%, identified 137,858 single nucleotide polymorphisms (SNPs). Discriminant analysis of principal components (DAPC) was applied to these SNPs and further identified 344 structural SNPs, which classified individual strains into five chemotype-aligned groups: one CBD-dominant, one balanced, and three THC-dominant clusters. These structural SNPs were all multiallelic and were predominantly tri-allelic (339/344). The largest portion of these SNPs (37%) occurred on the chromosome that contains the genes for CBD acid synthases (CBDAS) and THC acid synthases (THCAS). The remainder (63%) were located on the other nine chromosomes. These results showed that the genetic differences between modern cannabis strains were at a whole-genome level and not limited to THC or CBD production. These SNPs contained enough genetic variation for classifying individual strains into corresponding chemotypes.
In an effort to elucidate the confused genetic backgrounds of commercially available cannabis strains, this classification attempt investigated the utility of DAPC for classifying modern cannabis strains and for identifying structural SNPs.
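The abstract mentions filtering SNPs at a minor allele frequency (MAF) of 10% but gives no implementation details. The following is only a minimal sketch of what such a filter could look like, not the authors' pipeline. The genotype encoding (tuples of allele indices, `None` for a missing call), the function names, the toy data, and the choice to keep sites with MAF at or above the cutoff are all assumptions made for illustration. For multiallelic sites, MAF is taken here as the frequency of the second most common allele, which is one common convention.

```python
from collections import Counter

MAF_THRESHOLD = 0.10  # the 10% cutoff stated in the abstract


def minor_allele_frequency(genotypes):
    """Return the frequency of the second most common allele at one site.

    `genotypes` is a list of diploid calls such as [(0, 1), (1, 1), None];
    None marks a missing call. This encoding is an assumption for illustration.
    """
    counts = Counter(
        allele for call in genotypes if call is not None for allele in call
    )
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data, or only one allele observed (monomorphic)
    second_most_common = counts.most_common(2)[1][1]
    return second_most_common / total


def filter_sites(sites, threshold=MAF_THRESHOLD):
    """Keep sites whose minor allele frequency is at least `threshold`
    (the direction of the comparison is an assumption)."""
    return {
        name: calls
        for name, calls in sites.items()
        if minor_allele_frequency(calls) >= threshold
    }


# Toy data: three hypothetical sites genotyped in five hypothetical samples.
toy_sites = {
    "site_a": [(0, 0), (0, 1), (1, 1), (0, 1), (0, 0)],  # allele 1: 4/10 = 0.40
    "site_b": [(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],  # monomorphic: 0.0
    "site_c": [(0, 0), (0, 0), (0, 2), None, (0, 0)],    # allele 2: 1/8 = 0.125
}
kept = filter_sites(toy_sites)
print(sorted(kept))
```

In this toy example, `site_b` is dropped because only one allele is observed, while `site_a` and `site_c` pass the 10% cutoff. A real analysis would read genotypes from a variant file and would normally rely on an established tool for this step.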
format Online
Article
Text
id pubmed-8238227
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8238227 2021-07-09 Classification of cannabis strains in the Canadian market with discriminant analysis of principal components using genome-wide single nucleotide polymorphisms Jin, Dan Henry, Philippe Shan, Jacqueline Chen, Jie PLoS One Research Article Public Library of Science 2021-06-28 /pmc/articles/PMC8238227/ /pubmed/34181676 http://dx.doi.org/10.1371/journal.pone.0253387 Text en © 2021 Jin et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Jin, Dan
Henry, Philippe
Shan, Jacqueline
Chen, Jie
Classification of cannabis strains in the Canadian market with discriminant analysis of principal components using genome-wide single nucleotide polymorphisms
title Classification of cannabis strains in the Canadian market with discriminant analysis of principal components using genome-wide single nucleotide polymorphisms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8238227/
https://www.ncbi.nlm.nih.gov/pubmed/34181676
http://dx.doi.org/10.1371/journal.pone.0253387