CAGI Newsletter

New CAGI challenges and updates: Mechanisms of risk SNPs, bacterial fitness, Crohn's, and splicing

Crohn's challenge update: The data providers are in the process of re-basecalling all samples together so as to provide one VCF for all the data. The revised dataset will provide all variants in the capture region (exome target) as well as the particular capture technology. It is expected that this updated dataset will be available in mid-December.

Splicing challenge update: The TP53 minigene experiments were done in HEK293T cells. Previously, the particular cell line was not specified.

We have also just released two new CAGI 2012 challenges:

1. Assign possible mechanisms for SNPs associated with risk of seven complex trait diseases.

2. Predict impact of Shewanella oneidensis MR-1 gene disruptions on cell growth under stress conditions.

Coming soon: We will shortly be releasing a "PGP challenge" wherein predictors are asked to match individual genomes to individual trait profiles.

Feedback requested for the CAGI 2012 meeting plans

We would like to solicit your input regarding the postponed CAGI 2012 meeting. Please take this brief survey so that we can decide whether, when, and where to hold a separate meeting to discuss the CAGI 2012 experiment.

ASHG attendees: Informal CAGI gathering Thursday at 1:15pm

An informal CAGI gathering will be held Thursday, Nov. 8, 1:15 to 2:15 pm in Moscone South. Please write for location details. Also please write if you want to meet with Steven Brenner but cannot make that time.

Two new CAGI 2012 challenges: lipid metabolism diseases FCH and HA

We have recently released two new CAGI 2012 challenges:

1. Using exome sequencing data from a family, identify which individual has hypoalphalipoproteinemia (HA) and other disease phenotypes.

2. In a family affected by familial combined hyperlipidemia (FCH), identify mutation(s) conferring low-density lipoprotein cholesterol (LDL-C) disease phenotype and identify individuals with abnormal triglycerides (TG) and high-density lipoprotein cholesterol (HDL-C) levels.

Important! Since this challenge was first posted, there have been two updates:
(a) a correction regarding who has the elevated LDL-C phenotype (Patient-17 is the unaffected daughter) and
(b) a clarification about how the variant calling was performed.

CAGI 2012 meeting postponed

Dear CAGI participant:

As you may have realized, the late release of CAGI challenges has made it impractical to hold the conference on the tentatively proposed dates in December 2012. There is simply not enough time for engaged predictions or for assessment.

We are now considering alternative options and will soon solicit your input. As a result of the postponement, most challenges will now remain open for predictions until at least 10 January 2013. (Please do check expiry dates for exceptions, such as the BRCA mutation challenge expiring on 31 Oct 2012 due to the imminent public release of the data.)

We apologize that the postponement may be disruptive to your plans. If you did make travel arrangements, please let us know, as we would like to mitigate the impact. Hopefully the greater time to address challenges and the broader range of challenges that can be considered (for example, it will now be possible to include a nice set of PGP genomes) will be at least a partial compensation.

We will endeavor to provide firm prediction deadlines and meeting plans as soon as possible, probably in about two weeks.

Steven Brenner
John Moult
Daniel Barsky

BRCA challenge closes Wednesday, 31 October 2012

Reminder: The CAGI 2012 BRCA challenge predictions are due on Wednesday, 31 October 2012, 11:59pm (UTC-12).

Other challenges will be due after mid-November; more details will be posted soon.


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

Two new CAGI 2012 challenges released

We have recently released two new CAGI 2012 challenges:

1. Predict the probability of observed variants in DNA double-strand break repair genes to occur in a breast cancer case versus healthy control.

2. Predict how mutations in p53 gene exons affect mRNA splicing of the gene.


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

BRCA Challenge is released now!

CAGI continues its rollout of 2012 challenges. We have released the BRCA challenge, to predict which variants of BRCA1 and BRCA2 are associated with increased risk of breast cancer.

Details of the challenge are at:


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

CAGI 2012 prediction season is now open

The Critical Assessment of Genome Interpretation (CAGI) is a community experiment to assess computational methods for predicting the phenotypic impacts of genomic variation. The CAGI 2012 experiment is planned to include ten challenges. The prediction season is now open with the first dataset available, and the other datasets will be rolling out soon on the CAGI website:

Most prediction deadlines will be in early November.

In the CAGI experiment, participants are provided genetic variants and make predictions of resulting phenotypes. These predictions are evaluated against experimental characterizations, with independent assessors performing the evaluations. The primary goals of the experiment are to identify bottlenecks in genome interpretation, inform critical areas of future research, and connect researchers from diverse disciplines whose expertise is essential for advancing methods of genome interpretation.

The CAGI 2012 experiment will culminate in a conference planned for 15-16 December 2012 at UCSF. An NHGRI R13 grant will help support travel and participation in the meeting.

The anticipated CAGI 2012 challenges are:

  • Exomes of Crohn's disease patients and healthy individuals (provided by Andre Franke). Challenge: predict which individuals have Crohn's.
    Released NOW: Crohn's Disease
  • Variants in DNA double-strand break repair genes (provided by Sean Tavtigian). Challenge: predict probability of each variant occurring in a breast cancer case versus healthy control.
  • Variants of BRCA1 and BRCA2 (provided by Robert Nussbaum). Challenge: predict which variants are associated with increased risk for breast cancer.
  • Mutations in p53 gene exons affecting mRNA splicing (provided by Jeremy Sanford). Challenge: predict how variants impact splicing.
  • New PGP genomes (provided by George Church). Challenge: Predict clinical phenotypes from genome data, and match individuals to their health records.
  • Exomes from two families with lipid metabolism disorders (provided by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles and a causative variant.
  • Variants of a p16 tumor suppressor protein (provided by Silvio Tosatto). Challenge: predict how well variants inhibit cell proliferation.
  • Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin). Challenge: Predict impact of microbial gene disruptions on cell growth under stress conditions.
  • Whole genomes of a family affected by primary congenital glaucoma (provided by Luba Kalaydjieva and Michael Snyder). Challenge: Discover the genetic basis of the disease.
  • riskSNPs disease-associated loci (provided by John Moult). Challenge: identify potential causative SNPs.

This year there will be a hiatus in the CAGI Cancer Pharmacogenomics challenge (provided in 2010-2011 by Joe Gray), as DREAM is using these data for the equivalent challenge this year:

In order to access the challenges and submit predictions for CAGI 2012, please register at

Registered users also have access to presentations from the previous CAGI conferences, as well as posters and talk slides that summarize the results.


Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair