Predicting the effects of disease-associated variants on the stability of calmodulin
Variant data: registered users only
Last updated: 12 October 2021
This challenge is closed.
Calmodulin (CaM) is a ubiquitous calcium (Ca2+) sensor protein interacting with more than 200 molecular partners, thereby regulating a variety of biological processes. Missense point mutations in the genes encoding CaM have been associated with ventricular tachycardia and sudden cardiac death. A library encompassing up to 17 point mutations was assessed by far-UV circular dichroism (CD) by measuring melting temperature (Tm) and percentage of unfolding (%unfold) upon thermal denaturation at pH and salt concentration that mimic the physiological conditions. The challenge is to predict: (1) the Tm and %unfold values for isolated CaM variants under Ca2+-saturating conditions (Ca2+-CaM) and in the Ca2+-free (apo) state; (2) whether the point mutation stabilizes or destabilizes the protein (based on Tm and %unfold).
CaM is a 17 kDa calcium sensor protein that is ubiquitously expressed and highly conserved across the kingdoms of life. In humans, three genes (CALM1, CALM2 and CALM3) located on three different chromosomes encode CaM, all yielding the same final protein sequence. CaM is formed by two lobes (N-lobe and C-lobe), each consisting of two EF-hand (helix-loop-helix) motifs. The two lobes exhibit different Ca2+ affinities, allowing CaM to respond over a wide range of Ca2+ concentrations and thus to play roles in many signaling pathways, such as cytoskeleton remodeling, cell motility, proliferation, apoptosis, ion transport and protein folding (Chin & Means, 2000). Owing to its plasticity, CaM interacts with more than 200 target molecules, among them the ryanodine receptors (RyRs), giant tetrameric channels located in the sarcoplasmic reticulum of skeletal muscle (RyR1) and heart muscle (RyR2). CaM regulates the opening of RyR channels in different ways, leading to Ca2+-induced Ca2+ release, which is fundamental for the excitation-contraction mechanism (Van Petegem, 2015).
The high conservation and redundancy of CaM genes in the human genome make it reasonable to assume that mutations in CaM are incompatible with life. However, in 2012 two missense point mutations were found to be associated with sudden heart failure (Nyegaard et al., 2012). To date, 17 variants are known to be associated with such diseases. Interestingly, most of these mutations are located in the C-lobe, the lobe with the higher Ca2+ affinity. Moreover, it has been demonstrated that the presence of mutant CaM at one sixth of the total pool (one mutated allele out of six) is more severe than the complete loss of expression of one allele, suggesting that the integrity of the CaM pool is fundamental to avoid erroneous signal transmission, at least in cardiomyocytes (Hwang et al., 2014).
Wild-type human CaM and its disease-associated variants were expressed in a heterologous system (E. coli). The expressed protein was purified by affinity chromatography, and the 6xHis-tag at the N-terminus was then removed by specific cleavage with Tobacco Etch Virus protease. Thermal denaturation profiles for each variant were collected by far-UV circular dichroism, following the change in ellipticity at 222 nm with temperature over the range 4 to 120°C (Dal Cortivo et al., 2020). These measurements were performed in two conditions: apo CaM and Ca2+-CaM. The working buffer had pH = 7.4 and contained 150 mM KCl, mimicking physiological ionic strength.
For each variant, participants are asked to predict the Tm and %unfold values for both apo CaM and Ca2+-CaM. %unfold is defined as (θ222(Tmax) – θ222(Tmin)) / θ222(Tmin), where θ222 is the ellipticity measured at 222 nm, a wavelength that is a hallmark of alpha-helical secondary structure in circular dichroism spectroscopy, and Tmax and Tmin are the maximal and minimal temperatures scanned in the thermal denaturation profile (Tmin = 4°C; Tmax up to 120°C; see Dal Cortivo et al., 2020). Participants are also asked to predict, for each variant, the stabilization/destabilization effect in terms of %unfold.
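The %unfold definition above can be illustrated with a minimal Python sketch. The function name `percent_unfold` and the example readings are hypothetical, not part of the challenge materials. The sketch returns the ratio exactly as the formula is written; whether the organizers additionally scale it by 100 is not stated here, so no scaling is applied.

```python
def percent_unfold(temps, theta222):
    """Return (theta222(Tmax) - theta222(Tmin)) / theta222(Tmin).

    temps    : temperatures scanned, in degrees Celsius (any order)
    theta222 : ellipticity at 222 nm measured at each temperature
    """
    if len(temps) != len(theta222) or len(temps) < 2:
        raise ValueError("need at least two paired measurements")
    # Sort by temperature so the first and last points are Tmin and Tmax.
    pairs = sorted(zip(temps, theta222))
    theta_tmin = pairs[0][1]
    theta_tmax = pairs[-1][1]
    if theta_tmin == 0:
        raise ValueError("theta222(Tmin) is zero; the ratio is undefined")
    return (theta_tmax - theta_tmin) / theta_tmin


# Hypothetical readings: a helical protein has negative ellipticity at 222 nm,
# and the signal moves towards zero as the helices melt.
example_temps = [4, 60, 120]
example_theta = [-20.0, -14.0, -8.0]
example_value = percent_unfold(example_temps, example_theta)
```

With a negative θ222 at both ends of the scan, the ratio as written comes out negative, so it is worth checking the sign convention against the training data before submitting.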
The training dataset, which will be provided to registered participants, consists of:
Prediction submission format
The prediction submission is a tab-delimited text file. Organizers provide a template file, which must be used for submission. In addition, a validation script is provided, and predictors must check the correctness of the format before submitting their predictions. Each data row in the submitted file must include the following columns:
In the template file, cells in columns 2-7 are marked with a "*". Submit your predictions by replacing the "*" with your value; note that you can choose to predict the melting temperature but not %unfold, or vice versa. No empty cells are allowed in the submission. For a given subset, you must submit predictions and standard deviations for all or none of the variants; if you are not confident in a prediction for a variant, enter an appropriately large standard deviation for that prediction. Optionally, enter a brief comment on the basis of the prediction. If you do not enter a comment on a prediction, leave the "*" in those cells. Please make sure you follow the submission guidelines strictly.
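As a rough pre-check before running the organizers' validation script, the "no empty cells" rule can be sketched in Python. This is not the official validator: the function name `find_format_problems` is hypothetical, and the default of exactly seven tab-separated columns is an assumption inferred from the "columns 2-7" wording above.

```python
def find_format_problems(text, n_columns=7):
    """List (line_number, message) pairs for basic format violations.

    text      : full contents of a tab-delimited submission file
    n_columns : expected number of columns per row (assumed to be 7)
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        cells = line.split("\t")
        if len(cells) != n_columns:
            problems.append(
                (lineno, f"expected {n_columns} columns, found {len(cells)}")
            )
        for col, cell in enumerate(cells, start=1):
            # A cell left without a prediction should still hold "*", never be blank.
            if cell.strip() == "":
                problems.append((lineno, f"empty cell in column {col}"))
    return problems


# Hypothetical rows: the second one has a blank cell in column 3.
good_row = "V1\t50.0\t1.0\t-0.5\t0.1\t*\t*"
bad_row = "V2\t48.0\t\t-0.4\t0.1\t*\t*"
report = find_format_problems(good_row + "\n" + bad_row)
```

The sketch does not check the all-or-none rule per subset, because that depends on the column layout defined in the template file.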
In addition, your submission should include a detailed description of the method used to make the predictions, similar to the style of the Methods section in a scientific article. This information must be submitted as a separate file.
CAGI allows submission of up to six models per team, of which model 1 is considered primary. You can upload predictions for each model multiple times; for each model, the last submission before the deadline will be evaluated.
Use the following format for your submissions: <teamname>_<type>_model_(1|2|3|4|5|6).(tsv|txt), where
<type> = “apo” or “holo” (Ca2+-bound).
To include a description of your method, use the following filename: <teamname>_desc.*
Example: if your team’s name is “bestincagi” and you are submitting predictions for the apo form of calmodulin using your model number 3, your filename should be bestincagi_apo_model_3.txt.
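The naming rule for prediction files can be checked with a short regular expression. The sketch below is illustrative only; the function name `parse_submission_filename` is hypothetical, and it covers prediction files, not the separate `<teamname>_desc.*` description file.

```python
import re

# <teamname>_<type>_model_(1|2|3|4|5|6).(tsv|txt), with <type> = apo or holo
SUBMISSION_NAME = re.compile(
    r"(?P<team>.+)_(?P<type>apo|holo)_model_(?P<model>[1-6])\.(?P<ext>tsv|txt)"
)


def parse_submission_filename(filename):
    """Return the parts of a valid prediction filename, or None if invalid."""
    match = SUBMISSION_NAME.fullmatch(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["model"] = int(parts["model"])
    return parts


parsed = parse_submission_filename("bestincagi_apo_model_3.txt")
rejected = parse_submission_filename("bestincagi_apo_model_7.txt")
```

Here `parsed` holds the team name, form, model number and extension of the example filename, while `rejected` is `None` because only models 1 to 6 are allowed.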
Additional information about human CaM is available in UniProtKB under the identifiers P0DP23, P0DP24 and P0DP25, corresponding to the CALM1, CALM2 and CALM3 genes, respectively. Note, however, that the calmodulin protein used in the experimental setting (after His-tag removal) includes the residues GA (glycine and alanine) at the N-terminus. Nevertheless, for this challenge, the mutated positions are provided according to the calmodulin sequence P0DP23.
The predictions will be evaluated by several methods, including those that consider the difference between the predicted and experimental Tm and %unfold values. The challenge will be assessed by Emidio Capriotti, University of Bologna, Italy.
Dataset provided by
Giuditta Dal Cortivo and Daniele Dell’Orco, University of Verona, Italy.
Chin D, Means AR. Calmodulin: a prototypical calcium sensor. Trends Cell Biol (2000) 10(8):322-328.
Dal Cortivo G, et al. Missense mutations affecting Ca2+-coordination in GCAP1 lead to cone-rod dystrophies by altering protein structural and functional properties. Biochim Biophys Acta Mol Cell Res (2020) 1867(10):118794.
Hwang HS, et al. Divergent regulation of ryanodine receptor 2 calcium release channels by arrhythmogenic human calmodulin missense mutants. Circ Res (2014) 114(7):1114-1124.
Nyegaard M, et al. Mutations in calmodulin cause ventricular tachycardia and sudden cardiac death. Am J Hum Genet (2012) 91(4):703-712.
Van Petegem F. Ryanodine receptors: allosteric ion channel giants. J Mol Biol (2015) 427(1):31-53.
03 May 2021: initial release
08 June 2021: challenge opens
14 August 2021: submission deadline extended to September 30
21 August 2021: minor changes in text on prediction submission format
12 September 2021: file naming for the submissions updated to include apo and holo forms.
30 September 2021: submission deadline extended to October 11
11 October 2021: challenge closed