Critical Assessment of Genome Interpretation

CAGI3 Challenge

All challenges have now closed for submissions. The deadline for most prediction submissions for the CAGI 2013 challenges was 25 April 2013 at 11:59 pm PDT.

Overview

The Critical Assessment of Genome Interpretation (CAGI, \'kā-jē\) is a community experiment to objectively assess computational methods for predicting phenotypic impacts of genomic variation and to inform future research directions. In this assessment, participants are provided genetic variants and make predictions of resulting phenotype. These predictions are evaluated against experimental characterizations by independent assessors. The CAGI experiment culminates with a community workshop and publications to disseminate results.

The Experimental Goals are:

To evaluate the capability of state-of-the-art methods to make useful predictions of molecular, cellular, or organismal phenotypes from genomic data.
To identify bottlenecks in genome interpretation that suggest especially critical areas of future research.
To highlight innovations.
To standardize the field by suggesting appropriate assessment methods and defining what is required for accurate prediction.
To engage and connect researchers from the diverse disciplines whose expertise is essential to methods for genome interpretation.

Motivation

The acquisition of large numbers of personal genomes has long been the aspiration of genomics researchers, and sequencing technologies promise to make this affordable within the next several years. Already, large-scale genotyping arrays are widely used in research, and retail DNA tests of genetic markers have captured the public’s imagination. Unfortunately, personal genomes presently have limited research or medical value due to a variety of scientific, technical, legal, sociological, and ethical challenges. Yet whole genomes are also providing tremendous breakthroughs in basic science, such as revealing the genetic basis of Mendelian diseases that had proven refractory to traditional genetics for decades, and helping to unravel the mechanisms by which cancer emerges and evolves.

The CAGI experiment is timely and of wide relevance because of the burgeoning availability of individuals’ genomes, and the desire to interpret them for research and clinical applications. Currently, the field lacks a consensus on the absolute and relative suitability of the panoply of different methods for prediction of the phenotypic impact of genomic variation. The results from CAGI will help the broader community understand the appropriate level of confidence they should have in variant prediction methods, and which classes of approaches are most suitable to a particular application.

CAGI follows in the spirit of the long-running Critical Assessment of Structure Prediction (CASP). Organizers are in the process of collecting unpublished genomic data with associated phenotype characteristics. During the prediction season, participating groups will submit predictions in these areas based on data provided. The prediction accuracy will be evaluated by assessors and results will be revealed at the CAGI conference.

Key Dates

October-December 2012: Challenges and datasets posted on the CAGI website

24/25 April 2013: Predictions due

31 May 2013: The deadline for fellowship applications.

1 June 2013: Preliminary, blind assessments completed.

15 June 2013: Early conference registration closes.

15 June 2013: The deadline for abstract submissions.

21 June 2013: Regular conference registration closes.

17-18 July 2013: The CAGI 2013 Conference in Berlin.

Updates are available through the CAGI newsletter or via Twitter @CAGInews.

Challenges

1. Crohn's disease: exomes of Crohn's disease patients and healthy individuals (provided by Andre Franke). Challenge: predict which individuals have Crohn's disease.

2. Variants in DNA double-strand break repair genes (provided by Sean Tavtigian). Challenge: predict probability of each variant occurring in a breast cancer case versus healthy control.

3. Variants of BRCA1 and BRCA2 (provided by Robert Nussbaum). Challenge: predict which variants are associated with increased risk for breast cancer.

4. Mutations in p53 gene exons affecting mRNA splicing (provided by Jeremy Sanford). Challenge: predict how variants impact splicing.

5. New PGP genomes (provided by George Church). Challenge: predict clinical phenotypes from genome data, and match individuals to their health records.

6. Exomes from a family with familial combined hyperlipidemia and a family with hypoalphalipoproteinemia, both lipid metabolism disorders (provided by John Kane and Pui-Yan Kwok). Challenges: identify affected individuals, identify a causative variant for elevated LDL-C, and optionally predict lipid profiles of individuals.

7. Variants of a p16 tumor suppressor protein (provided by Silvio Tosatto). Challenge: predict how well variants inhibit cell proliferation.

8. Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin). Challenge: predict impact of microbial gene disruptions on cell growth under stress conditions.

9. riskSNPs disease-associated loci (provided by John Moult). Challenge: identify potential causative SNPs.

For challenge updates please see the CAGI Newsletter, subscribe to Frequent Updates, or follow @CAGInews on Twitter.

Latest news

The CAGI 2013 predictors were: Yesim Aydin Son, Benjamin Bachman, Brady Bernard, Marcus Breese, Yana Bromberg, Chen Cao, Emidio Capriotti, Rita Casadio, Chien-Yuan Chen, Shann-Ching Chen, Yun-Ching Chen, Carla Davis, Christopher Douville, Roland Dunbrack, Carlo Ferrari, Adam Frankish, Manuel Giollo, Nina Gonzaludo, Julian Gough, Jennifer Harrow, Tadashi Imanishi, Chan-Seok Jeong, Yuxiang Jiang, Rachel Karchin, Panagiotis Katsonis, Dongsup Kim, Michael Kleyman, Pietro Di Lena, Emanuela Leonardi, Biao Li, Jun Li, Olivier Lichtarge, Chiao-Feng Lin, Rhonald Lua, Angel Mak, Pier Luigi Martelli, Sean Mooney, Zev Medoff, Matthew Mort, John Moult, Steve Mount, Eliseos Mucaki, Jonathan Mudge, Katsuhiko Murakami, Yoko Nagai, Abhishek Niroula, Yanay Ofran, Kymberleigh Pagel, Nathaniel Pearson, Vikas Pejaver, Alexandra Piryatinska, Catherine Plotts, Predrag Radivojac, Aliz Rao, Lipika Ray, Graham Ritchie, Aharon Brodie, Peter Rogan, Jana Marie Schwarz, George Shackelford, Nuttinee Teerakulkittipong, Janita Thusberg, Silvio Tosatto, Ron Unger, Gurkan Ustunkar, Jouni Valiaho, Mauno Vihinen, Mary Wahl, Qiong Wei, Yuedong Yang, Christopher Yates, Yizhou Yin, Chen-Hsin Yu, Dejian Yuan, Maya Zuhl.

The CAGI 2013 Conference took place 17-18 July 2013 at the Max Planck Institute for Molecular Genetics in Berlin, Germany. The CAGI 2013 results are now posted. The conference program page contains the full set of slides that were presented at the meeting. The abstract book contains abstracts describing the prediction methods. Each challenge page has the slides for talks given for that challenge as well as the challenge's answer key, assessor summary, and predictions.

The Challenges and Assessors

Crohn’s disease: 56 submissions. Assessor: Alexander Morgan.
PGP: 16 submissions. Assessor: Sean Mooney.
MRN variants: 22 submissions. Assessor: Sean Tavtigian.
P16 variants: 22 submissions. Assessor: Silvio Tosatto.
BRCA variants: 14 submissions. Assessor: Robert Nussbaum.
Splicing: 5 submissions. Assessor: Jeremy Sanford.
FCH and HA: 39 submissions. Assessor: Shamil Sunyaev.
MR-1 fitness: 0 submissions.
RiskSNPs: 12 submissions. Assessor: John Moult.

The 2013 prediction season ended on 25 April 2013. The meeting to discuss the results was held in Berlin, 17-18 July 2013 (the two days preceding the ISMB SIGs).

Anonymity Policy

See CAGI anonymity policy.

Center for Critical Assessment of Genome Interpretation
