Research Article: Development, Characterization and Experimental Validation of a Cultivated Sunflower (Helianthus annuus L.) Gene Expression Oligonucleotide Microarray

Date Published: October 26, 2012

Publisher: Public Library of Science

Author(s): Paula Fernandez, Marcelo Soria, David Blesa, Julio DiRienzo, Sebastian Moschen, Maximo Rivarola, Bernardo Jose Clavijo, Sergio Gonzalez, Lucila Peluffo, Dario Príncipi, Guillermo Dosio, Luis Aguirrezabal, Francisco García-García, Ana Conesa, Esteban Hopp, Joaquín Dopazo, Ruth Amelia Heinz, Norma Paniego, Cynthia Gibas.


Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower (H. annuus L.), which lacks a full genome sequence. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection, and the design and validation of a custom sunflower oligonucleotide-based microarray. A large-scale curation, assembly and sequence annotation of ESTs (>130,000 ESTs) was performed using Blast2GO. The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 sunflower control probes replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of the Agilent microarrays were performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; of these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and reliable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement.
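The counts reported in the abstract can be cross-checked with simple arithmetic. The short Python sketch below is only an illustration: the variable names are hypothetical and not taken from the study, and the numbers are copied from the text above. It checks that contigs plus singletons give the stated number of putative transcripts, and that the three feature classes add up to the stated array size.

```python
# Counts copied from the abstract; variable names are illustrative
# assumptions, not identifiers used by the authors.
contigs = 12_924
singletons = 28_089
putative_transcripts = contigs + singletons

agilent_controls = 1_417
sunflower_control_probes = 74
control_replicates = 10
non_control_probes = 40_169

# Each sunflower control probe is spotted 10 times on the array.
sunflower_control_features = sunflower_control_probes * control_replicates

# Total features = Agilent controls + replicated sunflower controls
# + distinct non-control probes.
total_features = (
    agilent_controls + sunflower_control_features + non_control_probes
)

print(putative_transcripts)        # 41013
print(sunflower_control_features)  # 740
print(total_features)              # 42326
```

Both totals agree with the figures given in the abstract (41,013 putative transcripts and 42,326 features).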

Partial Text

Sunflower (Helianthus annuus L.) is an important source of edible oil and its uses have expanded to include new markets like biofuels, biolubricants and biopharmaceuticals [1]. Sunflower breeding for agronomic trait improvement, including yield and resistance to herbicides and to abiotic and biotic stresses, has contributed to yield maintenance, counteracting the competition for favorable agro-ecological environments imposed during the last 10 years by increasing soybean and maize production. Advances in sunflower genomics since 1995 have greatly enhanced the development and application of new tools for crop improvement [2], [3], [4], and promoted the expansion of sunflower uses. However, sunflower genome sequencing was not approached until the advent of next-generation sequencing technologies [5] and is still in progress. In this context, providing new insights into the sunflower genome is essential to enable efficient transcriptome analyses and molecular breeding. For transcriptional studies, during the last decade, cDNA macro- and microarrays were developed to study cultivated sunflower seed development [6] and the responses to biotic [7] and abiotic stresses [8], [9], [10]. Arrays based on cDNAs were also developed to carry out studies in wild Helianthus [11], [12], including hybrid species [13]. This approach, although useful, is confined to the analysis of a limited set of genes. Currently, the shortage of candidate genes underlying agronomically important traits represents one of the main drawbacks in sunflower molecular breeding. In this context, functional tools such as a high-density oligonucleotide microarray would enable the discovery and characterization of important novel genes affecting key agronomic traits. Oligonucleotide-based chips are considered more accurate than cDNA-based chips because they require fewer manipulation steps, ensuring higher reproducibility [14].
The possibility of implementing this technology in a custom array system like Agilent, NimbleGen, or others, has the potential to create a highly useful tool for gene discovery in non-model crops [15], [16]. Most plant species still do not have microarrays available for gene expression analysis, although a number of attempts are being made, mainly for plants with unsequenced genomes [15]. Condensed gene indexes based on small- and large-scale assemblies were successfully used for the development of microarray analysis in model plants (e.g., Arabidopsis, tobacco, melon and rice [17], [18], [19], [20]) and in non-model, economically relevant plants such as maize, tomato, cotton, citrus, cucumber, Brassica, wheat, flax and coffee [21], [22], [23], [24], [25].

Microarray technology first opened a new era of high-throughput transcriptome analysis approximately fifteen years ago [53], [54]. Although next-generation sequencing technologies can explore and analyze transcriptomes from large genomes, for many species the lack of a reference genome poses a major constraint to extracting significant biological information. Because of this constraint, non-model species are excellent targets for genome studies using different strategies for gene-index construction [55]; these studies thus contribute concise transcriptomic data to improve our biological understanding of diverse processes [56]. In this context, improving coverage by accurate microarray design seems to be the most desirable application for these technologies [57]. In the particular case of sunflower, even though genome sequencing is in progress [5], a reference genome is not yet available. Recently, two platforms for high-throughput expression studies of the Compositae family have been developed based on proprietary designs [30], [31]. The Affymetrix Sunflower Array includes genes from the genus Helianthus and has been applied to study seed dormancy regulation in cultivated sunflower [31]. The NimbleGen Compositae Microarray Platform, based on unigenes derived from Ambrosia artemisiifolia, Centaurea diffusa, Centaurea solstitialis, Cirsium arvense, Helianthus sp. and H. annuus L., has been developed as a genomic tool and resource for population and comparative genomic analyses within the Compositae family [30].