Date Published: January 26, 2017
Publisher: Public Library of Science
Author(s): Annarita Marrano, Giovanni Birolo, Maria Lucia Prazzoli, Silvia Lorenzi, Giorgio Valle, Maria Stella Grando, Sara Amancio.
Whole-genome comparisons of Vitis vinifera subsp. sativa and V. vinifera subsp. sylvestris are expected to provide a better estimate of the valuable genetic diversity still present in grapevine, and to help reconstruct the evolutionary history of a major crop worldwide. To this end, increasing the density of molecular markers across the grapevine genome is fundamental. Here we describe SNP discovery in a grapevine germplasm collection of 51 cultivars and 44 wild accessions through a novel protocol of restriction-site associated DNA (RAD) sequencing. By resequencing 1.1% of the grapevine genome at high coverage, we recovered 34K unique BamHI restriction sites, of which 6.8% were absent in the ‘PN40024’ reference genome. Moreover, we identified 37,748 single nucleotide polymorphisms (SNPs), 93% of which belonged to the 19 assembled chromosomes, with an average of 1.8K SNPs per chromosome. Nearly half of the SNPs fell in genic regions mostly assigned to the functional categories of metabolism and regulation, whereas some nonsynonymous variants were identified in genes related to the detection of and response to environmental stimuli. SNP validation was carried out, showing the ability of RAD-seq to accurately determine genotypes in a highly heterozygous species. To test the usefulness of our SNP panel, the main diversity statistics were evaluated, highlighting how the wild grapevine retained less genetic variability than the cultivated form. Furthermore, the analysis of Linkage Disequilibrium (LD) in the two subspecies separately revealed that LD decays faster within the domesticated grapevine than in its wild relative. Being the first application of RAD-seq in a diverse grapevine germplasm collection, our approach holds great promise for exploiting the genetic resources available in one of the most economically important fruit crops.
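As a quick consistency check on the figures quoted above, the following minimal sketch recomputes the derived counts from the reported totals. The function name and dictionary keys are illustrative assumptions, not part of the study's pipeline, and "34K" is a rounded value, so the restriction-site count is only approximate.

```python
def abstract_estimates(
    total_snps=37_748,          # SNPs reported in the abstract
    frac_on_chromosomes=0.93,   # share located on assembled chromosomes
    n_chromosomes=19,           # assembled grapevine chromosomes
    restriction_sites=34_000,   # "34K" BamHI sites (rounded in the text)
    frac_absent=0.068,          # share absent from the PN40024 reference
):
    """Recompute the derived counts implied by the abstract's percentages."""
    on_chromosomes = total_snps * frac_on_chromosomes
    return {
        "snps_on_chromosomes": round(on_chromosomes),
        "mean_snps_per_chromosome": round(on_chromosomes / n_chromosomes),
        "sites_absent_from_reference": round(restriction_sites * frac_absent),
    }


if __name__ == "__main__":
    for name, value in abstract_estimates().items():
        print(f"{name}: {value}")
```

Under these inputs the mean comes out close to the 1.8K SNPs per chromosome stated in the text, and roughly 2.3K restriction sites would be missing from the reference genome.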
The introduction of molecular markers in plant breeding has enabled remarkable advances in agricultural production thanks to the discovery of genes associated with major agronomic traits, the study of species diversity and evolution, and the characterization of plant genetic resources. During the last ten years, Single Nucleotide Polymorphisms (SNPs) have become the most widely used markers due to their abundance in genomes. They compensate for their biallelic nature by being ubiquitous and amenable to high-throughput automation. The advent of Next Generation Sequencing (NGS) has increased the possibilities for de novo and reference-based SNP discovery in a cost-effective and parallel manner. At the same time, huge progress has been made in high-throughput SNP genotyping thanks to the introduction of array-based technologies, which can screen several thousand SNPs per assay. SNP arrays rely on the prior production of sequence information, the identification and validation of polymorphisms and, finally, the array construction. Myles et al. designed the first SNP array for grape (Illumina Vitis9KSNP chip) using a discovery panel of 17 genomic DNA samples from V. vinifera cultivars and wild Vitis species. The second high-throughput SNP array (Illumina Vitis18KSNP array) was produced in grapevine as part of the GrapeReSeq Consortium. Many experiments have shown that applying these array-based technologies to population genetic studies may underestimate the real genetic diversity of the investigated populations, especially when the discovery panel is evolutionarily divergent from the studied accessions [7–8].
All modern grapevine cultivars belong to the species V. vinifera, one of the most important crops worldwide and the only endemic taxon of the family Vitaceae in Eurasia and the Maghreb. The Eurasian grapevine exists nowadays as the cultivated form (V. v. subsp. sativa) and the wild form (V. v. subsp. sylvestris), the latter being the presumed ancestor of present varieties. The genetic relationship between the two subspecies sativa and sylvestris is still controversial [34–35]. The creation of genomic databases of reference sativa and sylvestris accessions will make it possible to characterize the relatedness between wild and cultivated grapes at the genomic level and to explore in depth the natural genetic variation still preserved within wild grapevine populations. In this regard, we applied a novel RAD-seq protocol to a germplasm collection of wild and cultivated grapevine individuals.
We obtained 36.8 Gb of sequences, of which over 40% either did not align successfully or mapped to multiple locations on the 12X V. vinifera reference genome (Fig 3). The same rate of unaligned reads was observed even for the sample PN40024 (Fig 3, sample 51). This may be due to the incomplete assembly of the reference genome or, with the exception of the reference sample itself, to the high levels of genetic variation between PN40024 and the investigated grapevine accessions. Similar findings have also emerged from the comparison of both the “Tannat” and “Sultanina” de novo assembled grapevine genomes with the reference genome [24–25]. This may be even more evident in our study, since nearly half of the population belongs to the wild Eurasian vine sylvestris, whose genome has not yet been thoroughly investigated. It is now well accepted that plant genomes contain core sequences that are common to all individuals, as well as dispensable sequences comprising partially shared and non-shared genes that contribute to intraspecific variation.
Moreover, the heterozygous cultivar Pinot Noir showed a substantial portion of hemizygous DNA, which confirms that the grape genome exists in a dynamic state mediated in part by transposable elements. Advances in sequencing technologies (e.g. Single Molecule Real-Time (SMRT) sequencing) and the development of novel algorithms and software will help overcome the difficulties that have emerged in the assembly and alignment of grapevine genomes, as recently reported by Chin et al. Moreover, it will become possible in grapevine to move from a single reference genome to multiple reference genomes, helping to reconstruct the evolutionary history of viticulture as well as to better interpret and eventually exploit the phenotypic variation observed nowadays in natural populations.