Research Article: Identification of Novel Short C-Terminal Transcripts of Human SERPINA1 Gene

Date Published: January 20, 2017

Publisher: Public Library of Science

Author(s): Nerea Matamala, Nupur Aggarwal, Paolo Iadarola, Marco Fumagalli, Gema Gomez-Mariano, Beatriz Lara, Maria Teresa Martinez, Isabel Cuesta, Jan Stolk, Sabina Janciauskiene, Beatriz Martinez-Delgado, Pavel Strnad.


The human SERPINA1 gene is located on chromosome 14q31-32.3 and is organized into three non-coding exons (IA, IB, and IC) and four coding exons (II, III, IV, and V). This gene encodes α1-antitrypsin (A1AT), a prototypical member of the serpin superfamily of proteins. We demonstrate that human peripheral blood leukocytes express not only the transcript coding for the full-length A1AT protein but also two short transcripts of A1AT (ST1C4 and ST1C5). In silico sequence analysis revealed that the last exon of the short transcripts contains an open reading frame (ORF) and thus can putatively produce peptides. We found ST1C4 expression across different human tissues, whereas ST1C5 expression was mainly restricted to leukocytes, specifically neutrophils. A strong (10-fold) up-regulation of the short transcripts was observed in isolated human blood neutrophils after activation with lipopolysaccharide. Parallel analyses by liquid chromatography-mass spectrometry identified peptides corresponding to the C-terminal region of A1AT in supernatants of activated but not naïve neutrophils. Herein we report for the first time the tissue-specific expression and regulation of short transcripts of the SERPINA1 gene, and the presence of C-terminal peptides in supernatants from activated neutrophils in vitro. These findings provide novel insight into the transcription of the SERPINA1 gene.

Partial Text

Gene expression is responsible for the synthesis of functional gene products, typically proteins. The question of how many genes are present in the human genome led to the cataloguing of known protein-coding genes not only in human but also in Arabidopsis, worm, fly and mouse [1–4]. The findings revealed that genomes produce considerably more transcriptional output than anticipated from the collection of annotated protein-coding transcripts [5]. New high-throughput proteomic technologies provide strong evidence for the existence of non-canonical protein isoforms generated by translation outside annotated protein-coding genes [6,7]. A variety of short open reading frames (ORFs), long noncoding RNAs and pseudogenes give rise to short transcripts [8,9] that may have as yet unrecognized biological functions.
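
To illustrate what detecting a short ORF in a transcript sequence involves, the following is a minimal Python sketch. It is not the in silico pipeline used in this study; the function name `find_orfs`, the `min_codons` threshold, and the toy sequence are hypothetical and chosen for illustration only. The sketch scans the three forward reading frames for a start codon (ATG) followed in frame by a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=10):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here runs from the first in-frame ATG to the next in-frame
    stop codon. `start` is 0-based, `end` is exclusive and includes the
    stop codon. `min_codons` counts codons before the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence (made up, not SERPINA1): ATG AAA TTT GGG TAA in frame 2.
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy, min_codons=3))
```

A real analysis would typically also scan the reverse complement and consider the sequence context of the start codon; both are omitted here to keep the sketch short.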