Critical Assessment of Genome Interpretation (CAGI)--Open Challenges and Conference

Following is an omnibus update on CAGI, including new information about
the CAGI conference in July and the recently added anonymity policy.
Please distribute widely and follow our Twitter feed @CAGInews.

The Critical Assessment of Genome Interpretation (CAGI) is a community
experiment to assess computational methods for predicting the
phenotypic impacts of genomic variation. The current CAGI experiment
has eight open challenges, available on the CAGI website.

In the CAGI experiment, participants are provided genetic variants and
make predictions of resulting phenotypes. Independent assessors then
evaluate these predictions against experimental characterizations.
The primary goals of the experiment are to establish the current
state of the art, identify bottlenecks in genome interpretation,
inform critical areas of future research, and connect researchers
from diverse disciplines whose expertise is essential for advancing
methods for interpreting genomic variation.

The deadline for current CAGI predictions is
***EDIT: extended to 25 April 2013.***
Anonymous submissions, with limitations, are allowed this year.
We encourage use of both established methods and experimental
approaches, and we welcome predictors of all backgrounds.

The current CAGI experiment will culminate in a conference in Berlin,
on 17-18 July 2013, immediately before the ISMB SIGs. An NHGRI R13
grant will help support travel and participation in the meeting.

Previous CAGI experiments have highlighted striking breakthroughs
as well as disappointing failures. Publications from the previous
CAGI are underway; slide and poster presentations about CAGI may
be found at:
The results from the current CAGI challenge will be published as well.

The currently open CAGI challenges are:

+ Seventy-seven PGP genomes (provided by George Church).
Challenge: predict clinical phenotypes from genome data, and match
individuals to their health records.

+ Exomes of Crohn's disease patients and healthy individuals (provided
by Andre Franke). Challenge: predict which individuals have Crohn's.

+ Exomes from two families with lipid metabolism disorders (provided
by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles
and a causative variant.

+ Variants in DNA double-strand break repair genes (provided by Sean
Tavtigian). Challenge: predict the probability of each variant occurring
in a breast cancer case versus a healthy control.

+ Mutations in p53 gene exons affecting mRNA splicing (provided by
Jeremy Sanford). Challenge: predict how variants impact splicing.

+ Variants of a p16 tumor suppressor protein (provided by Silvio
Tosatto). Challenge: predict how well variants inhibit cell
proliferation.

+ Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin).
Challenge: predict impact of microbial gene disruptions on cell
growth under stress conditions.

+ riskSNPs disease-associated loci (provided by John Moult). Challenge:
identify potential causative SNPs.

We are also soliciting challenges for the next CAGI. Please contact us
with proposals for suitable datasets.

In order to access the current challenges and submit predictions for CAGI,
please register on the CAGI website.

Registered users also have access to presentations from the previous
CAGI conferences, as well as posters and talk slides that summarize
the results.


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair