Research Article: A combined strategy of neuropeptide prediction and tandem mass spectrometry identifies evolutionarily conserved ancient neuropeptides in the sea anemone Nematostella vectensis

Date Published: September 23, 2019

Publisher: Public Library of Science

Author(s): Eisuke Hayakawa, Hiroshi Watanabe, Gerben Menschaert, Thomas W. Holstein, Geert Baggerman, Liliane Schoofs, Hubert Vaudry.


Neuropeptides are a class of bioactive peptides shown to be involved in various physiological processes, including metabolism, development, and reproduction. Although neuropeptide candidates have been predicted from genomic and transcriptomic data, comprehensive characterization of neuropeptide repertoires remains a challenge owing to their small size and variable sequences. De novo prediction of neuropeptides from genome or transcriptome data is difficult and is usually efficient only for peptides that have identified orthologs in other animal species. Recent peptidomics technology has enabled systematic structural identification of neuropeptides by combining liquid chromatography and tandem mass spectrometry. However, reliable identification of naturally occurring peptides with the conventional tandem mass spectrometry approach of searching spectra against a protein database remains difficult: because no cleavage enzyme can be specified, a large search space must be scanned. We developed a pipeline consisting of in silico prediction of candidate neuropeptides followed by peptide-spectrum matching. This approach enables highly sensitive and reliable neuropeptide identification, as the search space for peptide-spectrum matching is greatly reduced. Nematostella vectensis is a basal eumetazoan with one of the most ancient nervous systems. We scanned the Nematostella protein database for sequences displaying structural hallmarks typical of eumetazoan neuropeptide precursors, including amino- and carboxy-terminal motifs and associated modifications. Peptide-spectrum matching was performed against a dataset of peptides cleaved in silico from these putative peptide precursors. The dozens of newly identified neuropeptides display structural similarities to bilaterian neuropeptides, including tachykinin, myoinhibitory peptide, and neuromedin-U/pyrokinin, suggesting that these neuropeptides occurred in the eumetazoan ancestor of all animal species.
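
The in silico prediction step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes, purely for illustration, that precursors are cleaved at runs of basic residues (K/R) and that only fragments ending in glycine are kept as amidated candidates. The function name `candidate_peptides`, the `min_len` threshold, and the toy sequence are all hypothetical.

```python
import re

# Assumption: runs of basic residues (K/R) stand in for the real cleavage motifs.
BASIC_SITE = re.compile(r"[KR]+")


def candidate_peptides(precursor, min_len=3):
    """Return amidated peptide candidates from one precursor sequence.

    The precursor is split at basic residues, and only fragments ending in
    glycine (the amide donor) are kept; the glycine is replaced by "-NH2".
    """
    candidates = []
    for fragment in BASIC_SITE.split(precursor):
        core = fragment[:-1]  # sequence without the C-terminal glycine
        if fragment.endswith("G") and len(core) >= min_len:
            candidates.append(core + "-NH2")
    return candidates


# Hypothetical precursor (signal peptide already removed), not a real sequence.
toy_precursor = "DEQPWSGKRAAPLFGRRDDE"
print(candidate_peptides(toy_precursor))  # ['DEQPWS-NH2', 'AAPLF-NH2']
```

Restricting the search to such candidates, rather than to every possible substring of every protein, is what shrinks the search space for the subsequent peptide-spectrum matching.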

Partial Text

Neuropeptides are a highly diverse group of messenger molecules involved in neurotransmission. They are essential for many physiological processes, such as muscle contraction, food digestion, growth, development, and reproduction, as well as more complex behaviours, such as adaptation, learning and memory, and ageing [1]. A neuropeptide is usually encoded in a larger neuropeptide precursor gene, which also encodes an N-terminal signal peptide. The precursor is translated on the rough endoplasmic reticulum, and the signal peptide is removed by a signal peptidase. Afterwards, processing enzymes in the immature secretory granules typically produce one or more mature neuropeptides by processing the protein precursor at cleavage sites. A number of cleavage enzymes, the so-called prohormone convertases of the furin/subtilisin family, which have specific recognition patterns, have been identified [2,3]. After cleavage, the processed peptides can undergo various posttranslational modifications (PTMs) [4]. The N-terminal and C-terminal residues in particular are often modified. A frequently observed modification is amidation at the C-terminus of mature neuropeptides, which is usually required for their biological activity. C-terminal amidation involves the enzymatic transformation of a glycine into an alpha-amide by peptidylglycine alpha-amidating monooxygenase [5]. First, peptidylglycine is transformed into peptidyl-alpha-hydroxyglycine in the presence of copper, ascorbate, and molecular oxygen; this intermediate is subsequently converted to peptide alpha-amide and glyoxylate. C-terminal amidation protects the mature peptide from enzymatic degradation by carboxypeptidases. Pyroglutamic acid formation is a posttranslational modification that is widely observed at the N-termini of various neuropeptides and protects the peptide chain from enzymatic degradation by aminopeptidases. In addition to PTMs, many bioactive peptides contain a proline residue in the second or third position from the N-terminus. This feature also protects the peptide from peptidase activity at the N-terminus [6]. All these structural characteristics are widely observed in neuropeptide sequences across the Animal Kingdom [7].
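
The two terminal modifications described above change the peptide mass by fixed amounts, which is what makes them detectable by mass spectrometry. The following sketch computes the monoisotopic mass of a peptide with optional C-terminal amidation and N-terminal pyroglutamate. The residue masses and mass shifts are standard monoisotopic values; the function name `peptide_mass` and its parameters are hypothetical and not taken from the article.

```python
# Standard monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565        # H2O added for the free N- and C-termini
AMIDATION = -0.984016    # C-terminal -OH replaced by -NH2
PYRO_GLU = {"Q": -17.026549, "E": -18.010565}  # loss of NH3 (Gln) or H2O (Glu)


def peptide_mass(sequence, amidated=False, pyroglu=False):
    """Monoisotopic neutral mass of a mature peptide.

    `sequence` is the mature peptide, i.e. without the glycine that donates
    the amide group during C-terminal amidation.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if amidated:
        mass += AMIDATION
    if pyroglu:
        if sequence[0] not in PYRO_GLU:
            raise ValueError("pyroglutamate requires N-terminal Gln or Glu")
        mass += PYRO_GLU[sequence[0]]
    return mass


# Hypothetical tripeptide, with and without both modifications.
print(round(peptide_mass("QPF"), 4))
print(round(peptide_mass("QPF", amidated=True, pyroglu=True), 4))
```

A search that ignores these shifts would miss the modified forms, so candidate lists for neuropeptides typically carry both modifications as allowed variants.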

We studied the neuropeptidome of Nematostella vectensis, a sea anemone that belongs to the phylum Cnidaria and possesses one of the most basal nervous systems in the Animal Kingdom. Recent advances in mass spectrometry technologies have greatly enhanced the identification of neuropeptidomes from biological matrices [14,74–76]. However, an effective bioinformatics solution for neuropeptide identification is still missing, which makes the interpretation of tandem mass spectral data difficult. We overcame this difficulty by performing peptide-spectrum matching against a narrowed database comprising in silico cleaved putative neuropeptide sequences. We successfully identified 20 peptides encoded in six neuropeptide precursor genes from Nematostella and experimentally confirmed that at least four precursor genes are expressed in neurons exhibiting distinct neurophysiological activities. Our approach can easily be combined with recent advanced technologies in peptidomics, and such a combination will greatly improve the analysis of the neuropeptidome in a broad range of biological samples.
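
As a rough illustration of why a narrowed database helps, the sketch below shows only the first stage of peptide-spectrum matching: selecting candidates whose theoretical mass agrees with an observed precursor mass within a tolerance in parts per million (ppm). Real search engines go on to score fragment ions, which is omitted here. The function names, the 10 ppm default, and the candidate masses are hypothetical and not taken from the article.

```python
def ppm_error(observed, theoretical):
    """Signed mass error of an observation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_precursor(observed_mass, candidates, tol_ppm=10.0):
    """Return names of candidates whose mass lies within tol_ppm of the observation.

    `candidates` maps a peptide name to its theoretical neutral mass in Da.
    """
    return sorted(
        name
        for name, mass in candidates.items()
        if abs(ppm_error(observed_mass, mass)) <= tol_ppm
    )


# Hypothetical candidate database with made-up masses.
toy_candidates = {"pepA": 1000.000, "pepB": 1000.005, "pepC": 1200.000}
print(match_precursor(1000.004, toy_candidates))             # ['pepA', 'pepB']
print(match_precursor(1000.004, toy_candidates, tol_ppm=2))  # ['pepB']
```

The fewer candidates that fall inside the tolerance window, the fewer chance matches have to be ruled out by fragment-ion scoring, which is the benefit claimed for the narrowed database.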