CAGI 2015 about to launch!

We expect to begin releasing CAGI 2015 challenges on 3 August 2015. The list of challenges under consideration follows. Please distribute widely and follow our Twitter feed @CAGInews and the web site for updates. Because challenges are currently under development, we cannot guarantee the release of each challenge. The likelihood of release is indicated: + low, ++ moderate, and +++ high likelihood. Please see below regarding CAGI timing.

Protein-Coding Variants
+++ N-Acetyl-Glucosaminidase (NAGLU) mutations cause mucopolysaccharidosis IIIB (data provided by Jon LeBowitz, BioMarin Pharmaceutical). Challenge: predict the effect of naturally occurring single-amino-acid mutations on enzyme activity in cell extracts.

+++ Pyruvate kinase mutations cause hereditary non-spherocytic hemolytic anemia (data provided by Aron Fenton, The University of Kansas Medical School). Challenge: predict the effects of single-amino-acid mutations on enzyme activity and on allosteric activation and inhibition in cell extracts.

+++ Anaplastic Lymphoma Kinase (ALK) (data provided by Paolo Bonvini and Federica Lovisa, Padua Children's Hospital). Challenge: predict the effects of kinase domain mutations in the oncogenic NPM-ALK fusion gene on kinase activity and Hsp90 binding affinity in transfected cell extracts.

+++ Human SUMO ligase (Ube2I) (data provided by Fritz Roth, University of Toronto). Challenge: predict the effects of a library of amino-acid mutations on competitive growth in a high-throughput yeast complementation assay.

Splicing Variants
+++ Pre-mRNA splicing (data provided by William Fairbrother, Brown University). Challenge: predict the effect of naturally occurring intronic variants near 3’ splice sites of medically actionable genes in a high-throughput splicing assay.

Regulatory Variants
++ Regulatory sequences and variants associated with eQTLs (data provided by Ryan Tewhey and Pardis Sabeti, Broad Institute). Challenge: predict the effects of eQTL-associated variants on activation of transcription in a high-throughput reporter assay.

Clinical Sequences
++ Genome sequences and microarrays from sick children (provided by M. Stephen Meyn, Hospital for Sick Children). Challenge: match patients’ genomes to their clinical descriptions and predict causal pathogenic variants.

++ Clinical gene panel sequences (provided by Garry Cutting, Johns Hopkins). Challenge: match the patients’ gene panel sequences to their clinical descriptions and predict the causal pathogenic variants.

Research Exomes: predict phenotypes and mutations
+++ Exomes of Crohn's disease patients and healthy individuals (provided by Andre Franke). Challenge: predict which individuals have Crohn's disease.

++ Exomes, RNA-seq profiles, and microbiomes from pre-diabetes patients and healthy individuals (provided by Mike Snyder, Stanford University). Challenge: predict each individual’s fasting glucose level, glucose metabolic rate, and insulin response.

++ Exomes of patients on warfarin (provided by Roxana Daneshjou and Russ Altman, Stanford University School of Medicine). Challenge: predict each patient’s stable warfarin dose.

+ Harvard Personal Genomes Project (provided by George Church, Harvard Medical School). Challenge: predict clinical phenotypes from genome sequences and match individuals’ genomes to their health records.

+ Exomes of bipolar disorder patients and healthy individuals (provided by Peter Zandi, James B. Potash, and Richard McCombie). Challenge: predict which individuals have bipolar disorder.

*** Timeline for CAGI 2015 ***
The prediction season start date of 3 August 2015 is later than planned. We would now like your feedback on two options for the full experiment timeline:

Option 1: Hold the CAGI conference around 12 December 2015 as originally planned, with the prediction season ending in early October (i.e. two months in total, with less time for some challenges). Such a season is short, potentially reducing the number of participants and the number of challenges each participant can attempt. It also leaves a short assessment season.

Option 2: Hold the CAGI conference later, perhaps March 2016, allowing time for a full-length prediction season (ending some time in November) and an adequate assessment season, providing the assessors time to do a deeper analysis of the results. In addition, a full-length prediction season will allow the CAGI organizers to provide updates on the details of how challenges will be assessed.

Please go to the following URL and let us know which you think is better.

Further information:
This is the second newsletter for CAGI 2015. Please see the CAGI website if you missed the first one. Briefly, CAGI has now received funding from NHGRI and NCI, and so we expect to conduct one experiment a year for the next three years.

The Critical Assessment of Genome Interpretation (CAGI) is a community experiment to assess computational methods for predicting the phenotypic impacts of genomic variation. CAGI 2015 will have approximately 10 challenges, which will be made available on our web site.

In the CAGI experiment, participants are provided genetic variants and make predictions of resulting phenotypes. Independent assessors then evaluate these predictions against experimental characterizations. The primary goals of the experiment are to establish the current state of the art, identify bottlenecks in genome interpretation, inform critical areas of future research, and connect researchers from diverse disciplines whose expertise is essential for advancing methods for interpreting genomic variation. Anonymous submissions, with limitations, are possible.
In order to access the challenges and submit predictions for CAGI 2015, please register as soon as possible at our web site.

We encourage use of both established methods and experimental approaches, and we welcome predictors of all backgrounds. An NHGRI R13 grant will help support travel and participation in the meeting for junior investigators.
Previous CAGI experiments have highlighted striking breakthroughs as well as disappointing failures. Registered users have access to slide and poster presentations describing the results of previous CAGI experiments on our web site.
The results from the CAGI 2015 challenges will be posted as well, and publication of results from the experiment is planned and encouraged.

We look forward to your participation and predictions in CAGI 2015!

Roger Hoskins, CAGI Organizer
John Moult, CAGI Co-Chair
Steven Brenner, CAGI Chair
email: ""