Acid alpha-glucosidase (GAA): predict the effect of naturally occurring missense mutations on enzymatic activity
Challenge: GAA missense
Dataset description: public
Variant data: public
Dataset availability: public
Last updated: 24 April 2018
This challenge will tentatively close at 9:00 PM PST (Pacific Standard Time) on 25 Apr 2018.
Download answer key and predictions: accessible to registered users only; use is limited by the CAGI Data Use Agreement. Please log in to access the file.
Presentations from the CAGI 5 conference: accessible to registered users only; use is limited by the CAGI Data Use Agreement. Please log in to access the file.
Acid alpha-glucosidase (GAA) is a lysosomal alpha-glucosidase. Some mutations in GAA cause a rare disorder, Pompe disease (Glycogen Storage Disease II). Rare GAA missense variants found in a human population sample have been assayed for enzymatic activity in transfected cell lysates. The challenge is to predict the fractional enzyme activity of each mutant protein compared to the wild-type enzyme. The assessment of this challenge will include evaluations that recognize novelty of approach.
Acid alpha-glucosidase (GAA, NP_000143.2) is a lysosomal enzyme that hydrolyzes terminal, non-reducing end α1-4 and α1-6 linkages in glycogen. Deficiency of GAA causes Pompe disease (Glycogen Storage Disease II; GSD-II, OMIM #232300), an autosomal recessive lysosomal storage disorder in which lysosomal glycogen accumulation results in myopathy (Hirschhorn et al., 2010). GAA is a member of glycoside hydrolase family 31 (Carbohydrate Active Enzymes database, http://www.cazy.org/) (Cantarel et al., 2009). GAA is synthesized as a 952 amino acid polypeptide that contains an N-terminal signal peptide. The mature, lysosomal form of GAA is produced by extensive proteolytic processing at both the N- and C-terminal ends. The fully processed, mature protein consists of 4 polypeptides derived from the precursor protein: a catalytic polypeptide of 70 kDa, as well as polypeptides of 19.4, 10.3, and 3.9 kDa (Moreland et al., 2005). The precursor form of GAA, lacking the N-terminal signal peptide, is enzymatically active (Van Hove et al., 1996). GAA functions as a monomer, and a recently released 2.0 Å resolution structure (5KZW) can be found in the PDB.
Deficiency of GAA results in lysosomal glycogen accumulation in multiple tissues, with cardiac and skeletal muscle tissues most seriously affected. Pompe disease is a spectrum disorder with a broad range of clinical manifestations. The most severe form is infantile-onset Pompe disease which presents with prominent cardiomegaly, myocardial failure, generalized muscle hypotonia without muscle-wasting, and death prior to 1 to 2 years of age. Infantile patients have little to no detectable GAA activity. At the other extreme is the slowly progressive adult-onset form of the disease. The late-onset form is generally characterized by slowly progressive proximal muscle weakness and respiratory insufficiency, and can present anytime from childhood until as late as the 2nd to 6th decade of life. It is distinguished from the infantile-onset form by the absence of severe cardiac involvement. Late-onset patients may have residual GAA activity up to 30% of normal (van der Ploeg & Reuser, 2008). It is estimated that approximately one third of those with Pompe disease have the rapidly fatal infantile-onset form, while the majority of patients present with late-onset Pompe disease (Hirschhorn et al., 2010). While life expectancy can vary, death generally occurs due to respiratory failure (Hirschhorn et al., 2010). The incidence of Pompe disease is believed to be approximately 1:40,000 births (Martiniuk et al., 1998).
BioMarin has functionally assessed the enzymatic activity of the 357 novel missense mutations in the ExAC dataset. Plasmids containing cDNAs encoding each of the mutant proteins were transfected into an immortalized Pompe patient fibroblast cell line. This cell line has no GAA activity. After 72 hours, cells were lysed, and GAA activity in the lysate was assessed using the fluorogenic substrate 4-methylumbelliferyl α-D-glucoside. The activity units are pmol/min/µg protein. Background-subtracted enzyme activity for each mutant was normalized to the background-subtracted activity in a cell lysate from cells transfected with the wild-type cDNA and reported as percent wild-type GAA activity. Each mutant was assayed in at least three independent transfection experiments. The results from these determinations were averaged, and the standard deviation was calculated.
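The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BioMarin's analysis code: the function names and the raw readings are hypothetical, each replicate is assumed to be normalized to the wild-type lysate from the same transfection experiment before averaging, and the sample (n-1) standard deviation is assumed because the description does not say which estimator was used.

```python
from statistics import mean, stdev


def percent_wild_type(mutant_raw, wt_raw, background):
    """Background-subtract both readings and express the mutant as % of wild type.

    All three arguments are activities in the same units (e.g. pmol/min/µg protein).
    """
    wt_net = wt_raw - background
    if wt_net <= 0:
        raise ValueError("wild-type activity must exceed background")
    return 100.0 * (mutant_raw - background) / wt_net


def summarize(replicates):
    """Average per-experiment percentages and compute their standard deviation.

    replicates: list of (mutant_raw, wt_raw, background) tuples,
    one tuple per independent transfection experiment.
    """
    if len(replicates) < 3:
        raise ValueError("at least three independent experiments are expected")
    values = [percent_wild_type(m, w, b) for m, w, b in replicates]
    # Assumption: sample standard deviation (n-1 denominator).
    return mean(values), stdev(values)


# Hypothetical raw readings for one mutant across three experiments.
replicates = [(55.0, 105.0, 5.0), (45.0, 105.0, 5.0), (65.0, 125.0, 5.0)]
avg, sd = summarize(replicates)
print(f"{avg:.1f}% of wild type, SD {sd:.1f}")
```

Negative percentages can arise when a mutant reading falls below background; the sketch leaves them as they are rather than clipping to zero, since the description does not say how such values were handled.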
Participants are asked to submit predictions on the effect of the variants on GAA enzymatic activity. The submitted prediction should be a numeric value ranging from 0 (no activity) to 1 (wild-type level of activity), or >1 if the predicted activity is greater than wild-type activity (e.g., 0.7 means 70% of wild-type and 1.3 means 130% of wild-type activity). Each predicted activity must include a standard deviation. Optionally, a comment on the basis of the prediction may be given. The predictions will be assessed against the numeric values actually measured for each mutation in the enzyme assay. In previous challenges, it has been observed that predictions often cluster more with other predictions than with the experimental values. Assessment will include metrics that recognize prediction sets that differ substantially from results provided by standard methods such as PolyPhen-2 and SIFT.
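As a rough illustration of comparing predictions with measured values on the same scale, the sketch below converts percent wild-type activity to the 0 to 1 fraction used for predictions and computes a Pearson correlation and a root-mean-square error. The choice of these two metrics is an assumption made for illustration only; the challenge description does not name the metrics the assessors use, and all numbers here are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical measured activities, reported as percent of wild type.
measured_percent = [0.0, 50.0, 100.0, 130.0]
# Predictions are submitted as fractions, so convert before comparing.
measured = [p / 100.0 for p in measured_percent]
# Hypothetical predictions for the same four variants.
predicted = [0.1, 0.4, 0.9, 1.2]

print(f"Pearson r = {pearson(predicted, measured):.3f}")
print(f"RMSE      = {rmse(predicted, measured):.3f}")
```

The same two functions could be applied between two prediction sets to see whether they agree with each other more closely than either agrees with the measurements.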
Download dataset: 5-GAA_dataset.txt (4.2 KB)
Download submission template: This submission template file is available only to registered users. Please log in to access the file.
Download submission validation script: This submission validation script is available only to registered users. Please log in to access the file.
Prediction submission format
The prediction submission is a tab-delimited text file. Organizers provide a template file, which should be used for submission. In addition, a validation script is provided, and predictors should check the correctness of the format before submitting their predictions. In the submitted file, each row includes the following columns:
In the template file, cells in columns 2-4 are marked with a "*". Submit your predictions by replacing the "*" with your value. No empty cells are allowed in the submission. You must enter a prediction and standard deviation for every mutant; if you are not confident in a prediction for a mutant, enter a large standard deviation for the prediction. Optionally, enter a brief comment on the basis of the prediction; otherwise, leave the "*" in the comment cell. Please make sure you follow the submission guidelines strictly.
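Filling the template programmatically might look like the sketch below. It is a minimal illustration, not the organizers' tooling: it assumes column 1 holds the variant identifier and columns 2-4 hold the prediction, standard deviation, and comment in that order, and the header row, variant names, and `fill_template` function are hypothetical. The official validation script remains the authority on the format.

```python
def fill_template(template_text, predictions):
    """Replace the "*" placeholders in a tab-delimited template.

    predictions maps a variant identifier to (activity, sd, comment);
    comment may be None, in which case the "*" is left in place.
    """
    filled = []
    for line in template_text.splitlines():
        cells = line.split("\t")
        # Lines without a "*" in column 2 (e.g. a header) pass through unchanged.
        if len(cells) < 4 or cells[1] != "*":
            filled.append(line)
            continue
        variant = cells[0]
        if variant not in predictions:
            # Every mutant needs a prediction and a standard deviation.
            raise ValueError(f"no prediction supplied for {variant}")
        activity, sd, comment = predictions[variant]
        if activity < 0 or sd < 0:
            raise ValueError(f"negative value for {variant}")
        cells[1] = f"{activity:.3f}"
        cells[2] = f"{sd:.3f}"
        cells[3] = comment if comment else "*"  # no empty cells allowed
        filled.append("\t".join(cells))
    return "\n".join(filled) + "\n"


# Hypothetical template and predictions (placeholder variant names).
template = (
    "variant\tprediction\tsd\tcomment\n"
    "p.Ala1Val\t*\t*\t*\n"
    "p.Gly2Ser\t*\t*\t*\n"
)
predictions = {
    "p.Ala1Val": (0.7, 0.1, None),
    "p.Gly2Ser": (1.3, 0.5, "low confidence, large SD"),
}
submission = fill_template(template, predictions)
print(submission, end="")
```

Raising an error for a missing variant, rather than silently leaving the "*", mirrors the rule that every mutant must receive a prediction and a standard deviation.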
In addition, your submission must include a detailed description of the method used to make the predictions, similar to the style of the Methods section in a scientific article. This information will be submitted as a separate file.
To submit predictions, you need to create or be part of a CAGI User group. Submit your predictions by accessing the link "All submission forms" from the front page of your group. For more details, please read the FAQ page.
Cantarel BL, et al. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res (2009) 37(Database issue):D233-D238.
Hirschhorn R, et al. Glycogen storage disease type II: acid alpha-glucosidase (acid maltase) deficiency, in The Metabolic and Molecular Basis of Inherited Disease. New York: McGraw-Hill. pp. 3389-3420 (2010).
Kroos M, et al. The genotype-phenotype correlation in Pompe disease. Am J Med Genet C Semin Med Genet (2012) 160C(1):59-68.
Martiniuk F, et al. Carrier frequency for glycogen storage disease type II in New York and estimates of affected individuals born with the disease. Am J Med Genet (1998) 79(1):69-72.
Moreland RJ, et al. Lysosomal acid α-glucosidase consists of four different peptides processed from a single chain precursor. J Biol Chem (2005) 280(8):6780-6791.
Stenson PD, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat (2003) 21(6):577-581.
van der Ploeg AT, Reuser AJJ. Pompe’s disease. Lancet (2008) 372(9646):1342-1353.
Van Hove JL, et al. High-level production of recombinant human lysosomal acid alpha-glucosidase in Chinese hamster ovary cells which targets to heart muscle and corrects glycogen accumulation in fibroblasts from patients with Pompe disease. Proc Natl Acad Sci U S A (1996) 93(1):65-70.
Data provided by
Wyatt Clark, Kevin Ru, Karen Yu, Jonathan H. LeBowitz
BioMarin Pharmaceutical, Inc. 105 Digital Drive, Novato CA 94949
9 Nov 2017 (v01): initial release
13 Nov 2017 (v02): revised description of enzyme assay
30 Nov 2017 (v03): more details on the submission template and variant chosen added
11 Jan 2018 (v04): closing date extended
16 Apr 2018 (v05): UniProtKB/Swiss-Prot: P10253.4 and closing date added
24 Sep 2018 (v06): Dataset availability added