Welcome to the CAGI experiment!

The Critical Assessment of Genome Interpretation (CAGI, \'kā-jē\) is a community experiment to objectively assess computational methods for predicting the phenotypic impacts of genomic variation and to inform future research directions. In this assessment, participants are provided with genetic variants and predict the resulting phenotype. Independent assessors then evaluate these predictions against experimental characterizations. The CAGI experiment culminates in a community workshop and publications to disseminate the results.

The CAGI goals are:

  1. To evaluate the capability of state-of-the-art methods to make useful predictions of molecular, cellular, or organismal phenotypes from genomic data.
  2. To identify bottlenecks in genome interpretation that suggest especially critical areas of future research.
  3. To highlight innovations.
  4. To standardize the field by suggesting appropriate assessment methods and defining what is required for accurate prediction.
  5. To engage and connect researchers from the diverse disciplines whose expertise is essential to methods for genome interpretation.

CAGI 5
We are currently looking for new data sets and new data providers for CAGI 5.
If you have ideas for contacts or recommendations for CAGI challenge datasets, we would greatly appreciate hearing about them. We already have a few leads on challenges, including some that provide continuity with previous experiments.

CAGI publications and presentations
There will be a CAGI special issue with papers in the journals Human Mutation and the Annals of Human Genetics. A flagship manuscript is in preparation. A list of past and future presentations about CAGI is available here, with downloadable posters and slides.

Motivation
The acquisition of large numbers of personal genomes has long been the aspiration of genomics researchers, and sequencing technologies promise to make this affordable within the next several years. Already, large-scale genotyping arrays are widely used in research, and retail DNA tests of genetic markers have captured the public’s imagination. Unfortunately, personal genomes presently have limited research or medical value due to a variety of scientific, technical, legal, sociological, and ethical challenges. Yet whole genomes are also providing tremendous breakthroughs in basic science, such as revealing the genetic basis of Mendelian diseases that had proven refractory to traditional genetics for decades, and helping to unravel the mechanisms by which cancer emerges and evolves.

The CAGI experiment is timely and of wide relevance because of the burgeoning availability of individuals’ genomes and the desire to interpret them for research and clinical applications. Currently, the field lacks a consensus on the absolute and relative suitability of the many different methods for predicting the phenotypic impact of genomic variation. The results from CAGI will help the broader community understand how much confidence to place in variant prediction methods, and which classes of approaches are most suitable for a particular application.

CAGI follows in the spirit of the long-running Critical Assessment of Structure Prediction (CASP). Organizers are in the process of collecting unpublished genomic data with associated phenotype characteristics. During the prediction season, participating groups will submit predictions in these areas based on the data provided. Assessors will evaluate the accuracy of the predictions, and the results will be revealed at the CAGI conference.