CAGI Newsletter

CAGI2012-13 challenges close tonight--send your predictions before 11:59 PM PDT!

Dear CAGI Participants,

This is a reminder that today is the last day to send your predictions for CAGI2012-13. The submission site will close at 11:59 PM PDT!
https://genomeinterpretation.org/all-submission-forms

(Note that the PGP challenge already closed yesterday.)

Hope to see many of you in Berlin. Details will be posted in the next few weeks.
https://genomeinterpretation.org/content/cagi-2012-conference

Sincerely,

CAGI Organizers

PGP Challenge deadline is 24 April--ONE DAY EARLIER than others; PGP templates have been modified

Dear CAGI participant,

Please note that we have had to bring the deadline for the PGP challenge forward 24 hours to 11:59PM PST on 24 April 2013. Please accept our apologies for this additional change and notification, but this is necessary to allow free exchange of the genomic information the next morning at the GET meeting (25 Apr).

Please also note that the two submission template files have been modified slightly as follows: 1) A training genome (474576e34ad0f39488d9c9b75946f7a7a4248427) that should not have been in the templates has been replaced by the missing challenge genome (fb74440463ffaaedecdac54b38ba2db7915965a0). 2) The templates have now been alpha-numerically sorted. Please download and use the corrected submission templates.
https://genomeinterpretation.org/content/PGP2012#templates

Sincerely,

CAGI organizers

CAGI prediction season extended to April 25; submission forms ready; various errata

There is additional time to tackle CAGI challenges! After consulting with predictors, we are making a final extension of the CAGI prediction deadline for all open challenges until 11:59 pm PDT on 25 April 2013.

Prediction submission forms are now available. You may make up to 6 predictions for each challenge, and any of these may be anonymous.
https://genomeinterpretation.org/all-submission-forms
(You must be registered and logged in to access the submission page.)

Following are additions, clarifications, and errata for several challenges:

1. MRN challenge: Previous versions of the datasets did not designate the anomalous variants as described in the challenge description (there is one such variant for MRE11 and five for NBS1). The datasets have been updated accordingly. In addition, there was an error in a previous version of the NBS1 Variant Dataset where the mutation p.V314E was incorrectly denoted c.941T>A.

2. HA challenge: We have been asked to explain why members of the family have variants reported on the Y chromosome. We note that this occasionally occurs in females due to mismapping to pseudo-autosomal regions of sex chromosomes.

3. FCH challenge: When identifying a specific gene, please include the gene name in front of the variant with a colon (e.g., NM_004006.1(DMD):c.3G>T).

4. The p16 challenge now includes training data.
https://genomeinterpretation.org/sites/default/files/protected_files/CAG...
(Only available to registered and signed-in users)
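The FCH naming convention in item 3 can be checked before submitting. The following is only a minimal sketch in Python: the pattern is an assumption inferred from the single example NM_004006.1(DMD):c.3G>T, and the names VARIANT_RE, parse_variant, and format_variant are hypothetical. It is not the validation used by the CAGI submission site.

```python
import re

# Hypothetical pattern for "<transcript>(<gene>):<change>", modelled only on
# the newsletter's example NM_004006.1(DMD):c.3G>T. This is an assumption.
VARIANT_RE = re.compile(
    r"^(?P<transcript>[A-Z]{2}_\d+\.\d+)"  # RefSeq-style accession with version
    r"\((?P<gene>[A-Za-z0-9-]+)\)"         # gene name in parentheses
    r":(?P<change>c\.\S+)$"                # coding-DNA change after the colon
)


def parse_variant(text):
    """Return the parts of a gene-qualified variant, or None if it does not match."""
    match = VARIANT_RE.match(text.strip())
    return match.groupdict() if match else None


def format_variant(transcript, gene, change):
    """Build a variant string with the gene name placed before the colon."""
    return f"{transcript}({gene}):{change}"


if __name__ == "__main__":
    print(parse_variant("NM_004006.1(DMD):c.3G>T"))
    print(parse_variant("NM_004006.1:c.3G>T"))  # gene name missing
```

A string without the gene name in parentheses does not match, which is the case item 3 asks predictors to avoid.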

Critical Assessment of Genome Interpretation (CAGI)--Open Challenges and Conference

Following is an omnibus update on CAGI, including new information about
the CAGI conference in July and the recently added anonymity policy.
Please distribute widely and follow our Twitter feed @CAGInews.

The Critical Assessment of Genome Interpretation (CAGI) is a community
experiment to assess computational methods for predicting the
phenotypic impacts of genomic variation. The current CAGI experiment
has eight open challenges, available on the CAGI website:
https://genomeinterpretation.org/

In the CAGI experiment, participants are provided genetic variants and
make predictions of resulting phenotypes. Independent assessors then
evaluate these predictions against experimental characterizations.
The primary goals of the experiment are to establish the current
state of the art, identify bottlenecks in genome interpretation,
inform critical areas of future research, and connect researchers
from diverse disciplines whose expertise is essential for advancing
methods for interpreting genomic variation.

The deadline for current CAGI predictions is
***EDIT: extended to 25 April 2013.***
Anonymous submissions, with limitations, are allowed this year.
https://genomeinterpretation.org/content/anonymity-policy
We encourage use of both established methods and experimental
approaches, and we welcome predictors of all backgrounds.

The current CAGI experiment will culminate in a conference in Berlin,
on 17-18 July 2013, immediately before the ISMB SIGs. An NHGRI R13
grant will help support travel and participation in the meeting.
https://genomeinterpretation.org/content/cagi-2012-conference

Previous CAGI experiments have highlighted striking breakthroughs
as well as disappointing failures. Publications from the previous
CAGI are underway; slide and poster presentations about CAGI may
be found at:
https://genomeinterpretation.org/content/cagi-presentations
The results from the current CAGI challenge will be published as well.

The currently open CAGI challenges are:

+ Seventy-seven PGP genomes (provided by George Church).
Challenge: Predict clinical phenotypes from genome data, and match
individuals to their health records.
https://genomeinterpretation.org/content/PGP2012

+ Exomes of Crohn's disease patients and healthy individuals (provided
by Andre Franke). Challenge: predict which individuals have Crohn's.
https://genomeinterpretation.org/content/new-crohns-dataset

+ Exomes from two families with lipid metabolism disorders (provided
by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles
and a causative variant.
https://genomeinterpretation.org/content/FCH
https://genomeinterpretation.org/content/HA

+ Variants in DNA double-strand break repair genes (provided by Sean
Tavtigian). Challenge: predict probability of each variant occurring
in a breast cancer case versus healthy control.
https://genomeinterpretation.org/content/MRN

+ Mutations in p53 gene exons affecting mRNA splicing (provided by
Jeremy Sanford). Challenge: predict how variants impact splicing.
https://genomeinterpretation.org/content/Splicing-2012

+ Variants of a p16 tumor suppressor protein (provided by Silvio
Tosatto). Challenge: predict how well variants inhibit cell
proliferation.
https://genomeinterpretation.org/content/p16_2012

+ Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin).
Challenge: Predict impact of microbial gene disruptions on cell
growth under stress conditions.
https://genomeinterpretation.org/content/MR-1_2012

+ riskSNPs disease-associated loci (provided by John Moult). Challenge:
identify potential causative SNPs.
https://genomeinterpretation.org/content/risksnps2012

We are also soliciting challenges for the next CAGI. Please contact us
at cagi@genomeinterpretation.org with proposals for suitable datasets.

In order to access the current challenges and submit predictions for CAGI,
please register at https://genomeinterpretation.org/.

Registered users also have access to presentations from the previous
CAGI conferences, as well as posters and talk slides that summarize
the results.

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi@genomeinterpretation.org

CAGI News: deadlines, meeting, PGP, p16, and Crohn's challenges

Dear CAGI Community,

We have several updates for the CAGI 2012 experiment, including the prediction submission deadline, two challenge releases, one major challenge update, and tentative CAGI meeting plans.

Deadline for prediction submissions:
The deadline for prediction submissions for the CAGI 2012 challenges is 28 March 2013 at 11:59 pm PDT. All CAGI 2012 challenges (except BRCA, which already closed) will be open until that time.

PGP challenge released:
The PGP challenge has just been released. This year the challenge is to submit predictions that match 54 genomes to phenotypic profiles (trait lists). Please visit the challenge page for details and to download the data: https://genomeinterpretation.org/content/PGP2012

p16 challenge released:
The p16 challenge has also just been released. The challenge is to evaluate how different variants of the p16 tumor suppressor protein impact its ability to block cell proliferation.
https://genomeinterpretation.org/content/p16_2012

Reworked Crohn's challenge data posted:
The reworked Crohn's disease dataset is now available on the Crohn's challenge page. The reworked dataset consists of variant calls made for all 66 exomes together, providing better quality variant calls and critical information for interpreting relatives. Please visit the challenge page for more information and to download the data: https://genomeinterpretation.org/content/new-crohns-dataset

News on the meeting for CAGI 2012 challenges:
We are looking into having the postponed CAGI 2012 meeting in Berlin, on 18-19 July 2013, immediately before the ISMB SIGs. More information will be available on the CAGI website as the plans develop. If you would be interested in helping with the meeting in any capacity, please let us know.

CAGI 2013 plans:
Pending funding from NIH, we hope to hold CAGI 2013 on the usual cycle (predictions in summer, meeting in December). To that end, we are beginning to solicit data for 2013 challenges. Please let us know if you know of any datasets appropriate for next year's CAGI experiment.

The CAGI organizers wish you and your families a joyous holiday season and happy 2013!

New CAGI challenges and updates: Mechanisms of risk SNPs, bacterial fitness, Crohn's, and splicing

Crohn's challenge update: The data providers are in the process of re-basecalling all samples together so as to provide one VCF for all the data. The revised dataset will provide all variants in the capture region (exome target) as well as the particular capture technology. It is expected that this updated dataset will be available in mid-December.

Splicing challenge update: The TP53 minigene experiments were done in HEK293T cells. Previously, the particular cell line was not specified.

We have also just released two new CAGI 2012 challenges:

1. Assign possible mechanisms for SNPs associated with risk of seven complex trait diseases. https://genomeinterpretation.org/content/risksnps2012

2. Predict impact of Shewanella oneidensis MR-1 gene disruptions on cell growth under stress conditions. https://genomeinterpretation.org/content/MR-1_2012

Coming soon: We will shortly be releasing a "PGP challenge" wherein predictors are asked to match individual genomes to individual trait profiles.

Feedback requested for the CAGI 2012 meeting plans

We would like to solicit your input regarding the postponed CAGI 2012 meeting. Please take this brief survey so that we can decide whether, when, and where to hold a separate meeting to discuss the CAGI 2012 experiment.
http://www.surveymonkey.com/s/NB3CBW5

ASHG attendees: Informal CAGI gathering Thursday at 1:15pm

An informal CAGI gathering will be held Thursday, Nov. 8, 1:15 to 2:15 pm in Moscone South. Please write to cagi@genomeinterpretation.org for location details. Please also write if you want to meet with Steven Brenner but cannot make that time.

Two new CAGI 2012 challenges: lipid metabolism diseases FCH and HA

We have recently released two new CAGI 2012 challenges:

1. Using exome sequencing data from a family, identify which individual has hypoalphalipoproteinemia (HA) and other disease phenotypes.
https://genomeinterpretation.org/content/HA-2012

2. In a family affected by familial combined hyperlipidemia (FCH), identify mutation(s) conferring low-density lipoprotein cholesterol (LDL-C) disease phenotype and identify individuals with abnormal triglycerides (TG) and high-density lipoprotein cholesterol (HDL-C) levels.
https://genomeinterpretation.org/content/FCH

Important! Since this challenge was first posted, there have been two updates, namely
(a) a correction regarding who has the elevated LDL-C phenotype (Patient-17 is the unaffected daughter) and
(b) a clarification about how the variant calling was performed.

CAGI 2012 meeting postponed

Dear CAGI participant:

As you may have realized, the late release of CAGI challenges has made it impractical to hold the conference on the tentatively proposed dates in December 2012. There is simply not enough time for engaged predictions or for assessment.

We are now considering alternative options and will soon solicit your input. As a result of the postponement, most challenges will now remain open for predictions until at least 10 January 2013. (Please do check expiry dates for exceptions, such as the BRCA mutation challenge, which expires on 31 Oct 2012 due to the imminent public release of the data.)

We apologize that the postponement may be disruptive to your plans. If you did make travel arrangements, please let us know, as we would like to mitigate the impact. We hope that the additional time to address challenges and the broader range of challenges that can be considered (for example, it will now be possible to include a nice set of PGP genomes) will be at least partial compensation.

We will endeavor to provide firm prediction deadlines and meeting plans as soon as possible, probably in about two weeks.

Steven Brenner
John Moult
Daniel Barsky

BRCA challenge closes Wednesday, 31 October 2012

Reminder: The CAGI 2012 BRCA challenge predictions are due on Wednesday, 31 October 2012, 11:59pm (UTC-12).

Other challenges will be due after mid-November; more details will be posted soon.
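For predictors in other time zones, the BRCA deadline stated above can be converted to UTC. This is only an illustrative sketch in Python: it assumes "UTC-12" means a fixed offset of minus twelve hours, and the names UTC_MINUS_12, deadline_local, deadline_utc, and is_open are hypothetical. The submission site's own clock remains authoritative.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zone for "UTC-12" as written in the reminder (assumption).
UTC_MINUS_12 = timezone(timedelta(hours=-12))

# 31 October 2012, 11:59 pm in that zone.
deadline_local = datetime(2012, 10, 31, 23, 59, tzinfo=UTC_MINUS_12)

# The same instant expressed in UTC.
deadline_utc = deadline_local.astimezone(timezone.utc)


def is_open(now):
    """Return True while the timezone-aware datetime `now` is at or before the deadline."""
    return now <= deadline_local


if __name__ == "__main__":
    print(deadline_utc.isoformat())
    print(is_open(datetime.now(timezone.utc)))
```

Under that assumption the deadline falls at 11:59 UTC on 1 November 2012, that is, twelve hours after 11:59 pm UTC on 31 October.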

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi@genomeinterpretation.org

Two new CAGI 2012 challenges released

We have recently released two new CAGI 2012 challenges:
1. Predict the probability of observed variants in DNA double-strand break repair genes to occur in a breast cancer case versus healthy control.
https://genomeinterpretation.org/content/MRN

2. Predict how mutations in p53 gene exons affect mRNA splicing of the gene.
https://genomeinterpretation.org/content/splicing-2012

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi@genomeinterpretation.org

BRCA Challenge is released now!

CAGI continues its rollout of 2012 challenges. We have released the BRCA challenge, to predict which variants of BRCA1 and BRCA2 are associated with increased risk of breast cancer.

Details of the challenge are at:
https://genomeinterpretation.org/content/BRCA-2012

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi@genomeinterpretation.org

CAGI 2012 prediction season is now open

The Critical Assessment of Genome Interpretation (CAGI) is a community experiment to assess computational methods for predicting the phenotypic impacts of genomic variation. The CAGI 2012 experiment is planned to include ten challenges. The prediction season is now open with the first dataset available, and the other datasets will be rolling out soon on the CAGI website: https://genomeinterpretation.org/

Most prediction deadlines will be in early November.

In the CAGI experiment, participants are provided genetic variants and make predictions of resulting phenotypes. These predictions are evaluated against experimental characterizations, with independent assessors performing the evaluations. The primary goals of the experiment are to identify bottlenecks in genome interpretation, inform critical areas of future research, and connect researchers from diverse disciplines whose expertise is essential for advancing methods of genome interpretation.

The CAGI 2012 experiment will culminate in a conference planned for 15-16 December 2012 at UCSF. An NHGRI R13 grant will help support travel and participation in the meeting.

The anticipated CAGI 2012 challenges are:

  • Exomes of Crohn's disease patients and healthy individuals (provided by Andre Franke). Challenge: predict which individuals have Crohn's.
    Released NOW: Crohn's Disease
  • Variants in DNA double-strand break repair genes (provided by Sean Tavtigian).
    Challenge: predict probability of each variant occurring in a breast cancer case versus healthy control.
  • Variants of BRCA1 and BRCA2 (provided by Robert Nussbaum). Challenge: predict which variants are associated with increased risk for breast cancer.
  • Mutations in p53 gene exons affecting mRNA splicing (provided by Jeremy Sanford). Challenge: predict how variants impact splicing.
  • New PGP genomes (provided by George Church). Challenge: Predict clinical phenotypes from genome data, and match individuals to their health records.
  • Exomes from two families with lipid metabolism disorders (provided by John Kane and Pui-Yan Kwok). Challenge: predict lipid profiles and a causative variant.
  • Variants of a p16 tumor suppressor protein (provided by Silvio Tosatto). Challenge: predict how well variants inhibit cell proliferation.
  • Shewanella oneidensis MR-1 gene disruptions (provided by Adam Arkin). Challenge: Predict impact of microbial gene disruptions on cell growth under stress conditions.
  • Whole genomes of a family affected by primary congenital glaucoma (provided by Luba Kalaydjieva and Michael Snyder). Challenge: Discover the genetic basis of the disease.
  • riskSNPs disease-associated loci (provided by John Moult). Challenge: identify potential causative SNPs.

This year there will be a hiatus in the CAGI Cancer Pharmacogenomics challenge (provided in 2010-2011 by Joe Gray, https://genomeinterpretation.org/content/breast-cancer-cell-line-pharmac...), as DREAM is using these data for the equivalent challenge this year:
http://www.the-dream-project.org/challenges/nci-dream-drug-sensitivity-p...

In order to access the challenges and submit predictions for CAGI 2012, please register at https://genomeinterpretation.org/.

Registered users also have access to presentations from the previous CAGI conferences, as well as posters and talk slides that summarize the results.

Sincerely,

Daniel Barsky, CAGI 2012 Organizer
Steven E. Brenner, CAGI Chair
John Moult, CAGI Chair

cagi@genomeinterpretation.org