Date Published: September 18, 2017
Publisher: Public Library of Science
Author(s): Andrzej Stanisław Cieplak, Eugene A. Permyakov.
Proteins associated with neurodegenerative diseases are highly pleiomorphic and may adopt an all-α-helical fold in one environment, assemble into all-β-sheet or collapse into a coil in another, and rapidly polymerize in yet another via divergent aggregation pathways that yield a broad diversity of aggregate morphologies. A thorough understanding of this behaviour may be necessary to develop a treatment for Alzheimer’s and related disorders. Unfortunately, our present comprehension of folding and misfolding is limited for want of a physicochemical theory of protein secondary and tertiary structure. Here we demonstrate that electronic configuration and hyperconjugation of the peptide amide bonds ought to be taken into account to advance such a theory. To capture the effect of polarization of peptide linkages on conformational and H-bonding propensity of the polypeptide backbone, we introduce a function of shielding tensors of the Cα atoms. Carrying no information about side chain-side chain interactions, this function nonetheless identifies basic features of the secondary and tertiary structure, establishes sequence correlates of the metamorphic and pH-driven equilibria, relates binding affinities and folding rate constants to secondary structure preferences, and manifests common patterns of backbone density distribution in amyloidogenic regions of Alzheimer’s amyloid β and tau, Parkinson’s α-synuclein and prions. Based on those findings, a split-intein-like mechanism of molecular recognition is proposed to underlie dimerization of Aβ, tau, αS and PrPC, and divergent pathways for subsequent association of dimers are outlined; a related mechanism is proposed to underlie formation of PrPSc fibrils.
The model accounts for: (i) structural features of paranuclei, off-pathway oligomers, non-fibrillar aggregates and fibrils; (ii) effects of incubation conditions, point mutations, isoform lengths, small-molecule assembly modulators and chirality of the solid-liquid interface on the rate and morphology of aggregation; (iii) fibril-surface catalysis of secondary nucleation; and (iv) self-propagation of infectious strains of mammalian prions.
To understand protein folding, one needs to understand protein structure. And yet, in spite of considerable interest and effort, even the most rudimentary issues of proteins’ conformational behaviour remain unresolved: ‘Surprisingly, the field lacks a physicochemical theory of protein secondary structure’. Indeed, for the chemist seeking insight, protein study is in want of a theory which would explain what local backbone interactions make a residue a helix-breaker or a helix-former and how that propensity depends on the context: why a residue which is a helix-breaker in water becomes a helix-former in lipid or vacuum, or why a residue which is a helix-former in the folding intermediate becomes a sheet-former in the native state. Pressing puzzles, such as why certain sequences of helix-breakers and helix-formers may adopt an all-α fold in one environment, an all-β fold in another, and collapse into a coil in yet another, are bound to remain unsolved unless these questions are answered.
Our choice of the reference system for the PMO theory-informed analysis is a simplified structure of the polypeptide backbone which can be converted into a complete protein chain by ‘switching on’ three standard perturbations: (1) the geometry perturbation (by changing conformation); (2) the atomic substitution/electronegativity perturbation (by plugging in the side chains); and (3) the intermolecular perturbation (by embedding the backbone chain in a polarizing or depolarizing environment and allowing for interactions with co-solutes, other protein chains, surfaces of biopolyelectrolytes etc.). The simplified backbone structure comprises the (-Cα-NH-C(=O)-)n chain and the localized MOs of the peptide amide bonds and their ligands. It is assumed that the sterically-allowed regions of the ψ/φ space are approximately equivalent in energy unless the stereoelectronic and electrostatic interactions are introduced [27,45]. The relative importance of those interactions, that is, which ones actually occur, depends on the electronic configuration of the peptide amide bonds, which in turn depends on the electronic effects of the side chains (atomic substitution/electronegativity perturbation) and on the mutual polarization of the protein and the medium (intermolecular perturbation). Given that each interaction of the peptide amide bonds is maximized in a specific region of the ψ/φ space (geometry perturbation), the amino acid sequence and the medium may determine the conformational and H-bonding propensity of the polypeptide backbone by ‘switching on’ and ‘switching off’ certain combinations of those interactions.
To assess the effects of these perturbations, we employ (i) ab initio and DFT studies of secondary structure (geometry optimizations; NBO analysis of the donor-acceptor interactions of the localized natural bond orbitals using Weinhold’s ΔE(2) energies (the BLW-ED energies are lower, but the qualitative trends are expected to be the same); SCRF modeling of solvent effects; GIAO calculations of the NMR shielding tensors σ(Cα)Xaa of Cα atoms), and (ii) qualitative concepts of two theories of solutions: the Onsager theory of solute-solvent polarization and the Debye-Hückel theory of dilute solutions of strong electrolytes. First, to assess the side chains’ effect on the distribution of backbone density, the NMR shielding tensors σ(Cα)Xaa of the Cα atoms were calculated using models of helix and sheet structures in the gas phase (the effect of the medium is treated as a separate perturbation). Thus, two oligopeptides, N-acetyl hexaglycyl N-methylamide AcGGGGGGNHMe and N-acetyl pentaglycyl amide AcGGGGGNH2, were initially optimized in the conformations corresponding to the hairpin with the type Ib reverse turn and to the 3₁₀-helix, respectively, at the B3LYP/D95** level of theory. The protocol involved folding the peptide chain into the starting conformer using the standard φi and ψi values, followed by an unconstrained optimization. The searches for the minima were run to the default convergence criteria of Gaussian 98 (Revisions A.3, A.7 and A.11.2). The sixth residue of the hexapeptide hairpin and the second residue of the pentapeptide helix were then systematically varied to generate congener structures with the canonical and covalently modified residues. The side chains were set into the trans and -gauche conformations about the Cα-Cβ bond when needed, in both the neutral and the ionized state when appropriate, and a number of structures were partially constrained as noted in S1 Table.
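For orientation, the Weinhold ΔE(2) energies named in the protocol above are, in standard NBO analysis, second-order perturbation estimates of the stabilization gained when a filled (donor) natural bond orbital i delocalizes into a vacant (acceptor) orbital j*. The expression below is the textbook form, not a formula quoted from this paper; the symbols q_i (donor occupancy), F(i,j) (off-diagonal Fock matrix element between the two NBOs) and ε (orbital energies) are the conventional NBO notation and are introduced here only for illustration.

```latex
\[
\Delta E^{(2)}_{i \to j^{*}}
  \;=\; -\, q_i \,
  \frac{F(i,j)^{2}}{\varepsilon_{j^{*}} - \varepsilon_{i}}
\]
```

In this form the stabilization grows with the square of the donor-acceptor coupling F(i,j) and shrinks as the energy gap between the two orbitals widens. NBO program output usually lists the magnitude of this quantity as a positive number.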
Again, all the searches for the minima were run to the default convergence criteria. The calculations yielded a total of 141 AcGGGGGXaaNHMe and AcGXaaGGGNH2 structures. Atomic coordinates of the obtained structures were used to compute the NMR shielding tensors using the B3LYP/D95** and GIAO (Gauge-Independent Atomic Orbital) methods (S1 Table). The mean values of the obtained shielding tensors of the Cα atoms were converted into the folding constants σXaa as described in Results and Discussion. To assess the effect of the geometry perturbation on backbone conjugation and backbone-backbone H-bonding, we examined the Weinhold energies ΔE(2) of the hyperconjugative interactions which differentiate the sterically-allowed regions of the Ramachandran map. To this end, the protocol described above was employed to optimize five models of secondary structure, 1–5 (B3LYP/6-31G*): the decapeptide AcAAAAAAAAANH2 (1) in the 3₁₀-helix conformation, the pentadecapeptide AcAAAAAAAAAAAAAANH2 (2) in the α-helix conformation, the hexapeptide AcAAAAANHMe (3) in the 2₇-ribbon conformation, the ternary antiparallel complex of the tripeptide (AcAANHMe)3 (4), and the ternary parallel complex of the tripeptide (AcAANHMe)3 (5). In addition, the binary complexes of the oligopeptides (AcAAANHMe)2 (6a-6d) and (AcAAAAANHMe)2 (7a-7b/8a-8c) were obtained by unconstrained optimization (as described above, at the B3LYP/6-31G* level of theory, run to the default convergence criteria of Gaussian 98). To assess the effect of a polar dielectric on the distribution of backbone density in a globular protein molecule, the model of TC5b, AcAAAAAAAAGGPAAGAPPPA-NH2 (9), was obtained by full unconstrained optimization (HF/3-21G, gas phase) of the peptide chain placed in the conformation defined by the φ/ψ angles reported for the NMR ensemble of TC5b.
The final gas-phase structure 9a was re-optimized in water taken as the continuous dielectric (HF/3-21G, the Onsager model as implemented in the Gaussian suite with ε0 = 78.39 and the radius of the spherical solvent cavity a0 = 8.43 Å) until the default convergence criteria were fully met again to obtain the structure 9b. The Cartesian atomic coordinates and the total energies of all the secondary and tertiary structure models (156 entries) are included in S1 Dataset.
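To give a feel for the two Onsager parameters quoted above (ε0 = 78.39, a0 = 8.43 Å), the following is a minimal sketch of the textbook Onsager reaction-field factor for a point dipole in a spherical cavity. It is not the self-consistent SCRF procedure run in Gaussian; the function names and the 10 D dipole are hypothetical and chosen only for illustration.

```python
"""Textbook Onsager reaction-field factor (illustrative sketch only)."""

# 1 D^2 / angstrom^3 expressed in kcal/mol (cgs unit conversion).
KCAL_PER_MOL_PER_D2_A3 = 14.393


def onsager_factor(eps: float, a0: float) -> float:
    """Return g = 2(eps - 1) / ((2 eps + 1) a0^3) in angstrom^-3.

    eps is the static dielectric constant of the medium and a0 the
    radius of the spherical solute cavity in angstrom.
    """
    if eps < 1.0 or a0 <= 0.0:
        raise ValueError("need eps >= 1 and a0 > 0")
    return 2.0 * (eps - 1.0) / ((2.0 * eps + 1.0) * a0 ** 3)


def dipole_stabilization(mu_debye: float, eps: float, a0: float) -> float:
    """Return -(1/2) g mu^2 in kcal/mol for a fixed, non-polarizable dipole.

    This ignores the solute polarization that a self-consistent
    reaction-field calculation would include.
    """
    g = onsager_factor(eps, a0)
    return -0.5 * g * mu_debye ** 2 * KCAL_PER_MOL_PER_D2_A3


if __name__ == "__main__":
    # Parameters quoted in the text for the re-optimization in water.
    g = onsager_factor(78.39, 8.43)
    print(f"reaction-field factor g = {g:.4e} angstrom^-3")
    # Hypothetical 10 D dipole, chosen only to show the order of magnitude.
    print(f"stabilization = {dipole_stabilization(10.0, 78.39, 8.43):.3f} kcal/mol")
```

Because the factor scales as 1/a0^3, the cavity radius matters far more than the exact dielectric constant once ε is large: the ratio 2(ε - 1)/(2ε + 1) is already close to 1 for water.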
The qualitative PMO theory of organic structure and reactivity provides simplified models which, inevitably, sacrifice some aspects of reality. The folding potential function, FPi, that we employed in this study is designed to capture the effect of distribution of charge density on the conformational and H-bonding propensity of the polypeptide backbone; it does not carry any information about the constraints introduced e.g. by the hydrophilic residues or by proline, and obviously does not take into account any side chain-side chain interactions. Indeed, on several occasions in the preceding discussion, it was clearly useful to complement the FPi plots by showing the distribution of the ionized side chains and potential ‘steric zipper’ segments or the presence of proline and cysteine residues in the sequence. However, simplification may also underlie the strength of the PMO theory models. And so, surprisingly, the FPi and FPi vs. ΔFPi-1→i+1 plots are found to identify basic features of secondary and tertiary structure, establish sequence correlates of metamorphic and pH-driven equilibria, relate binding affinities and folding rate constants to secondary-structure preferences, and manifest common backbone polarization patterns in amyloidogenic regions of Alzheimer’s Aβ and tau, Parkinson’s α-synuclein and TSEs’ prions. Thus, by stripping away the side chain-side chain interactions, we discover that the main chain H-bonding and hyperconjugation, described in terms of two-electron stabilizing interactions of the filled and vacant molecular orbitals [7–15], make significant and sometimes decisive contributions to the stability of both native and transition states. The new vocabulary and grammar of encoding the 3D structure of proteins which emerge amidst these findings readily define hitherto elusive pathways of misfolding and aggregation associated with brain proteinopathies.
One cannot account, it appears, for a wide range of folding phenomena without taking into consideration stereoelectronic effects such as the generalized anomeric effect and the homohyperconjugation of peptide linkages [16–22,28–33,51–58]. This conclusion is in line with the arguments of the backbone-based theory of protein folding, and particularly concerns the conformational behaviour of highly pleiomorphic proteins associated with neurodegenerative diseases.