Date Published: January 20, 2017
Publisher: Public Library of Science
Author(s): Christophe Dufresnes, Catherine Jan, Friederike Bienert, Jérôme Goudet, Luca Fumagalli, Monica Scali.
Cannabis (hemp and marijuana) is an iconic yet controversial crop. On the one hand, it represents a growing market for pharmaceutical and agricultural sectors. On the other hand, plants synthesizing the psychoactive THC produce the most widespread illicit drug in the world. Yet, the difficulty of reliably distinguishing between Cannabis varieties based on morphological or biochemical criteria impedes the development of promising industrial programs and hinders the fight against narcotrafficking. Genetics offers an appropriate alternative to characterize drug vs. non-drug Cannabis. However, forensic applications require rapid and affordable genotyping of informative and reliable molecular markers for which a broad-scale reference database, representing both intra- and inter-variety variation, is available. Here we provide such a resource for Cannabis, by genotyping 13 microsatellite loci (STRs) in 1 324 samples selected specifically for fibre (24 hemp varieties) and drug (15 marijuana varieties) production. We showed that these loci are sufficient to capture most of the genome-wide diversity patterns recently revealed by NGS data. We recovered strong genetic structure between marijuana and hemp and demonstrated that anonymous samples can be confidently assigned to either plant type. Fibres appear genetically homogeneous whereas drugs show low (often clonal) diversity within varieties, but very high genetic differentiation between them, likely resulting from breeding practices. Based on an additional test dataset including samples from 41 local police seizures, we showed that the genetic signature of marijuana cultivars could be used to trace crime scene evidence. To date, our study provides the most comprehensive genetic resource for Cannabis forensics worldwide.
Cannabis is one of humanity’s oldest cultivated plants. It is thought to have originated in central Asia and was domesticated as early as 8 000 BP for food, fibre, oil, medicines and as an inebriant. This crop has since been distributed across the world over the last two millennia and, due to its recent legalization in several countries, is increasingly exploited by several industrial sectors (hemp) and as a recreational drug (marijuana). The taxonomic status of Cannabis has always been disputed, as it encompasses multiple cultural, geographic, historical and functional aspects (reviewed in [1–4]). Whereas most authors now consider it a monotypic panmictic taxon, Cannabis sativa, three species or subspecies (sativa, indica and ruderalis) are often mentioned but without a comprehensive taxonomic grouping so far. The nomenclature may thus differ depending on whether it refers to morphological or chemical variation, geographic distribution, ecotype, as well as crop-use characteristics and intoxicant properties resulting from human selection [4–7]. Cannabis presumably diversified following selection for traits enhancing fibre and seed production (“hemp”) or psychoactive properties (“drug”). Importantly, Cannabis types differ in their absolute and relative amounts of terpenophenolic cannabinoids, notably Δ1-tetrahydrocannabinol (THC), the well-known psychoactive compound of marijuana, and the non-psychoactive cannabidiol (CBD). In this context, drug-type Cannabis (marijuana) is broadly characterized by a higher overall cannabinoid content than fibre-types. However, the most widely recognized criterion to assign a Cannabis plant to either “drug” or “hemp” type is the THC:CBD ratio, according to which three main chemical phenotype (chemotype) classes are recognized: hemp-type plants with a low ratio (THC:CBD < 1), drug-type plants with a high ratio (THC:CBD > 1), and intermediate-type plants with a ratio close to one [6, 8].
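The three chemotype classes defined by the THC:CBD ratio can be illustrated with a minimal sketch. The function name `classify_chemotype` and the `tolerance` parameter are hypothetical: the text only states that intermediate-type plants have a ratio "close to one" and does not give a numerical band, so the value of 0.2 used here is an illustrative assumption, not a published threshold.

```python
def classify_chemotype(thc: float, cbd: float, tolerance: float = 0.2) -> str:
    """Assign a chemotype class from THC and CBD contents.

    thc, cbd: cannabinoid contents in the same unit (e.g. % of dry weight).
    tolerance: ASSUMED half-width of the "close to one" band; the source
    text does not specify it, so 0.2 is only a placeholder.
    """
    if thc < 0 or cbd < 0:
        raise ValueError("cannabinoid contents must be non-negative")
    if cbd == 0:
        if thc == 0:
            raise ValueError("THC:CBD ratio is undefined when both are zero")
        # No CBD at all: the ratio is unbounded, hence drug-type.
        return "drug"

    ratio = thc / cbd
    if abs(ratio - 1.0) <= tolerance:
        return "intermediate"  # THC:CBD close to one
    return "drug" if ratio > 1.0 else "hemp"


if __name__ == "__main__":
    # Invented example values, not measurements from the study.
    for thc, cbd in [(15.0, 0.5), (0.2, 2.0), (5.0, 5.0)]:
        print(thc, cbd, classify_chemotype(thc, cbd))
```

In practice the width of the intermediate band would have to be set from the chemotype literature or from regulatory definitions rather than from this placeholder.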
The informal designations sativa and indica may have various, controversial meanings. Morphologically, the name sativa designates tall plants with narrow leaves, while indica refers to short plants with wide leaves. Within the marijuana community, however, sativa instead refers to equatorial varieties producing stimulating psychoactive effects (THC:CBD ≈ 1), whereas indica-type plants from Central Asia are used for relaxing and sedative drugs (THC:CBD > 1).
The selected STR markers (detailed in S2 Table) unanimously recovered the strong structure between fibre and drug Cannabis samples. This is clearly depicted by a Principal Component Analysis (PCA, Fig 1A), genetic distances between accessions (Fst, S1 Fig) and genotype clustering by STRUCTURE (Fig 1B), where two groups appear as the best clustering solution (ΔK2 = 1205.6). As recently evidenced from NGS data, this pattern reflects differentiation between hemp and marijuana over the entire genome, not only at genes underlying THC and fibre synthesis. Some drugs and fibres show weak signs of genetic admixture (intermediate PCA scores and STRUCTURE probabilities, Fig 1; lower Fst, S1 Fig), which might stem from introgressive crossbreeding, as reported elsewhere. Interestingly, except for RI (indica/ruderalis hybrid), all drug varieties closely related to hemp varieties are of sativa ancestry (HMW, HA, SWA, MS; based on available information from suppliers). This would support the common assumption that hemp varieties selected for fibre and seed production derived from sativa, although this view has been challenged by other studies that found more similarities between hemp and indica [7, 23, 36]. Alternatively, sativa drugs, which are nowadays distributed in more equatorial regions, may be frequently crossbred with indica and agricultural varieties to facilitate their cultivation in temperate countries. In any case, marijuana genetic diversity seems weakly associated with the documented breeding history: we also performed a PCA solely on drugs, which only marginally clustered according to their main sativa and indica pedigree (S2 Fig). Some cultivars of the same appellation appear genetically distinct (e.g. Alpine Rocket, ARa and ARb, FST = 0.36) whereas others harboring different names are genetically identical (e.g. PM, T44, BS, FST = 0.00; identical clones shared by ARa and B52, S1 Table).
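To give an intuition for how pairwise Fst values such as those reported above relate to allele frequencies, the following sketch computes a simple heterozygosity-based Fst, (Ht − Hs) / Ht, at a single locus for two equally weighted populations. This is an assumption-laden illustration: the text does not state which Fst estimator was used, multi-locus STR analyses typically rely on sample-size-corrected estimators, and the function names and allele frequencies below are invented for the example.

```python
def expected_heterozygosity(freqs: dict) -> float:
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_single_locus(pop_a: dict, pop_b: dict) -> float:
    """Heterozygosity-based Fst = (Ht - Hs) / Ht for two populations.

    pop_a, pop_b: mappings from allele (e.g. STR fragment size) to frequency,
    each summing to 1. Populations are weighted equally (a simplification).
    """
    alleles = set(pop_a) | set(pop_b)
    # Pooled allele frequencies across the two populations.
    pooled = {a: (pop_a.get(a, 0.0) + pop_b.get(a, 0.0)) / 2 for a in alleles}
    # Mean within-population heterozygosity.
    h_s = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2
    # Heterozygosity of the pooled population.
    h_t = expected_heterozygosity(pooled)
    if h_t == 0:
        return 0.0  # both populations fixed for the same allele
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies, not data from the study.
    identical = {150: 0.5, 154: 0.5}
    print(fst_single_locus(identical, identical))
    print(fst_single_locus({150: 0.8, 154: 0.2}, {150: 0.2, 154: 0.8}))
    print(fst_single_locus({150: 1.0}, {154: 1.0}))
```

Two populations with identical allele frequencies give a value of zero, as expected for cultivars that share the same clones, while populations fixed for different alleles give the maximum value of one.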
Overall, these observations are in line with the general conclusions of Sawler et al. that information on drug varieties is often unreliable due to the clandestine nature of Cannabis breeding over the last century, and that names do not necessarily reflect a meaningful genetic identity. In addition, hemp varieties grouped according to reproductive characteristics (dioecious versus monoecious; S1 Table), as expected from their breeding history (illustrated on the PCA, Fig 1; Fst tree, S1 Fig).