Variants of BRCA1 and BRCA2: predict which variants are associated with increased risk of breast cancer
Dataset description: public
Exome sequence data: registered users only, limited by CAGI Data Use Agreement
This challenge closed on 31 October 2012.
Assessor summary (688 KB, zip): registered users only, limited by CAGI Data Use Agreement
Slides from the CAGI conference: registered users only, limited by CAGI Data Use Agreement
Roland Dunbrack: Assessment (2.6 MB, remixable ppt)
Background
In normal cells, the BRCA1 and BRCA2 genes are involved in homologous recombination for double-strand break repair and ensure the stability of a cell's genetic material. Mutations in these genes have been linked to the development of breast and ovarian cancer (references below). Myriad Genetics created the BRACAnalysis test to assess a woman’s risk of developing hereditary breast or ovarian cancer based on detection of mutations in the BRCA1 and BRCA2 genes. This test, which is based on proprietary methods, has become the standard of care for identifying individuals with hereditary breast and ovarian cancer (HBOC) syndrome. For each variant in the dataset, Myriad Genetics made one of four classifications: deleterious, benign, VFP, or VUS.
These designations are based on a database of patient testing, including the frequency of the variants in populations and the segregation of variants with disease in families. They are the gold standard in medical diagnosis; patients make life-changing decisions (e.g., whether or not to operate) based on these data. Precisely how Myriad Genetics assigns these designations is proprietary, as is its complete database of assignments. Nevertheless, using BRACAnalysis patient test results from clinics where the tests were ordered, it was possible to determine these assignments for the variants observed in patients. These variants and their associated pathogenicity assessments were not found in the public domain.
Prediction challenge
For each variant, provide the probability that Myriad Genetics has classified it as deleterious (a probability between 0 and 1, with a standard deviation).
Optional sub-challenge
Additionally, provide the probability that Myriad Genetics has assigned each of the other designations: benign, VFP, and VUS (a probability between 0 and 1, with a standard deviation, for each).
Predictions on variants designated VUS will not be included in evaluating the accuracy of the predictions. The assessor will, however, compare different predictions on the VUS variants to understand how different methods approach the question and the extent to which their predictions are consistent.
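The exclusion of VUS variants from accuracy evaluation can be illustrated with a minimal sketch. The challenge text does not state which accuracy measure the assessor uses; the rank-based AUC below is an assumption chosen only for illustration, and the function name and the dictionary inputs are hypothetical.

```python
def auc_excluding_vus(predictions, labels):
    """Rank-based AUC for P(deleterious), ignoring variants labeled VUS.

    predictions: dict mapping variant id -> predicted P(deleterious)
    labels:      dict mapping variant id -> "deleterious", "benign", "VFP" or "VUS"
    Both inputs and the AUC metric are assumptions made for this sketch.
    """
    # Drop VUS variants (and any variant without a label) before scoring.
    scored = [
        (predictions[v], labels[v] == "deleterious")
        for v in predictions
        if v in labels and labels[v] != "VUS"
    ]
    positives = [p for p, is_deleterious in scored if is_deleterious]
    negatives = [p for p, is_deleterious in scored if not is_deleterious]
    if not positives or not negatives:
        raise ValueError("need at least one deleterious and one non-deleterious variant")

    # Fraction of (deleterious, non-deleterious) pairs ranked correctly;
    # ties count as half a correct pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

In this sketch a VUS variant never influences the score, whatever probability was predicted for it, which mirrors the rule stated above.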
Dataset information: A set of 100 variants (36 in BRCA1; 64 in BRCA2).
Download dataset: The dataset file is available only to registered users; please log in to access the file.
Prediction submission format
A flat file containing a list of variants will be provided. Eight columns are designated for predictions: a probability and a standard deviation for each of the four designations (deleterious, benign, VFP, and VUS). Please use the submission file template provided for your submission. A validation script is also provided, and predictors should use it to check the format of their file before submitting their predictions.
The submission template file is available only to registered users; please log in to access the file.
Download BRCA validation script (not available).
Prediction submission template: The prediction submission file is a tab-delimited text file. Each row beyond the header row should include the following columns:
Here is a summary of the column designations (P = probability, SD = standard deviation).
Column    Header         Content
1         Gene           BRCA#
2         DNA Variant    ##A->B
3         Variant        X##Y
4, 5      Deleterious    P, SD
6, 7      Benign         P, SD
8, 9      VFP            P, SD
10, 11    VUS            P, SD
In the template file, cells in columns 4-11 are marked with an "*". Submit your predictions by replacing the "*" with your value. No empty cells are allowed in the submission; if you cannot submit a prediction for a variant, leave the "*" symbol in those cells. Please follow the submission guidelines strictly.
In addition, your submission should include a detailed description of the method used to make the predictions (similar in style to the Methods section of a scientific article). This description is submitted as a separate file.
To submit predictions, you need to create or be part of a CAGI User group. Submit your predictions through the "All submission forms" link on the front page of your group. For more details, please read the FAQ page.
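The format rules above (tab-delimited rows, 11 columns, columns 4-11 holding either a value or "*", no empty cells) can be checked with a short script. The following is a minimal sketch, not the official BRCA validation script; the function name is hypothetical, and the range checks (probabilities within 0 to 1, non-negative standard deviations) are assumptions inferred from the challenge description.

```python
import csv
import io
import math

N_COLUMNS = 11     # Gene, DNA Variant, Variant, then P and SD for four designations
PLACEHOLDER = "*"  # marks a prediction that is not being submitted


def validate_submission(text):
    """Return a list of format errors; an empty list means all checks passed."""
    errors = []
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    for line_no, row in enumerate(rows[1:], start=2):  # skip the header row
        if len(row) != N_COLUMNS:
            errors.append(f"line {line_no}: expected {N_COLUMNS} columns, found {len(row)}")
            continue
        for col in range(3, N_COLUMNS):  # zero-based indices of columns 4-11
            cell = row[col].strip()
            where = f"line {line_no}, column {col + 1}"
            if cell == PLACEHOLDER:
                continue
            if cell == "":
                errors.append(f"{where}: empty cell (use '*' if no prediction)")
                continue
            try:
                value = float(cell)
            except ValueError:
                errors.append(f"{where}: '{cell}' is not a number")
                continue
            if not math.isfinite(value):
                errors.append(f"{where}: value must be finite")
            elif (col - 3) % 2 == 0:  # columns 4, 6, 8, 10 hold probabilities
                if not 0.0 <= value <= 1.0:
                    errors.append(f"{where}: probability {value} outside 0-1")
            elif value < 0.0:  # columns 5, 7, 9, 11 hold standard deviations
                errors.append(f"{where}: standard deviation {value} is negative")
    return errors
```

Calling `validate_submission(open("submission.txt").read())` (the file name is a placeholder) and confirming that the returned list is empty would be one way to check a file before uploading it.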
References
Deadline: 31 October 2012. On 1 November 2012, these variants and their assignments will be released by ClinVar.
Data Provider
Robert Nussbaum, University of California, San Francisco
Assessment
This challenge is being assessed by Robert Nussbaum, University of California, San Francisco.