CAGI 2012 prediction season is now open

The Critical Assessment of Genome Interpretation (CAGI) is a community experiment to assess computational methods for predicting the phenotypic impacts of genomic variation. The CAGI 2012 experiment is planned to include ten challenges. The prediction season is now open with the first dataset available, and the remaining datasets will be released soon on the CAGI website.

Most prediction deadlines will be in early November.

In the CAGI experiment, participants are provided with genetic variants and make predictions of the resulting phenotypes. These predictions are evaluated against experimental characterizations, with independent assessors performing the evaluations. The primary goals of the experiment are to identify bottlenecks in genome interpretation, inform critical areas of future research, and connect researchers from diverse disciplines whose expertise is essential for advancing methods of genome interpretation.

The CAGI 2012 experiment will culminate in a conference planned for 15-16 December 2012 at UCSF. An NHGRI R13 grant will help support travel and participation in the meeting.

The anticipated CAGI 2012 challenges are:

  • Exomes of Crohn's disease patients and healthy individuals (provided by Andre Franke). Challenge: predict which individuals have Crohn's.
    Released NOW: Crohn's Disease
  • Variants in DNA double-strand break repair genes (provided by Sean Tavtigian). Challenge: predict the probability of each variant occurring in a breast cancer case versus a healthy control.
  • Variants of BRCA1 and BRCA2 (provided by Robert Nussbaum). Challenge: predict which variants are associated with increased risk for breast cancer.
  • Mutations in p53 gene exons affecting mRNA splicing (provided by Jeremy Sanford). Challenge: predict how variants impact splicing.
  • New PGP genomes (provided by George Church). Challenge: predict clinical phenotypes from genome data, and match individuals to their health records.
  • Exomes from two families with lipid metabolism disorders (provided by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles and a causative variant.
  • Variants of a p16 tumor suppressor protein (provided by Silvio Tosatto). Challenge: predict how well variants inhibit cell proliferation.
  • Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin). Challenge: predict the impact of microbial gene disruptions on cell growth under stress conditions.
  • Whole genomes of a family affected by primary congenital glaucoma (provided by Luba Kalaydjieva and Michael Snyder). Challenge: discover the genetic basis of the disease.
  • riskSNPs disease-associated loci (provided by John Moult). Challenge: identify potential causative SNPs.

This year there will be a hiatus in the CAGI Cancer Pharmacogenomics challenge (provided in 2010-2011 by Joe Gray), as DREAM is using these data for the equivalent challenge this year.

To access the challenges and submit predictions for CAGI 2012, please register on the CAGI website.

Registered users also have access to presentations from the previous CAGI conferences, as well as posters and talk slides that summarize the results.


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair