Estimate patients’ therapeutic warfarin doses from their exome sequences
Challenge: Warfarin exomes
Dataset description: public
Dataset availability: encrypted
Exome sequence data: registered users only, limited by CAGI Data Use Agreement
Last updated: 14 April 2016
This challenge will tentatively close at 8:00 PM PST (Pacific Standard Time) on 8 December 2015.
Download answer key, predictions, and assessment: registered users only, limited by CAGI Data Use Agreement
Presentations from the CAGI 4 conference: registered users only, limited by CAGI Data Use Agreement
With over 33 million prescriptions in 2011, warfarin is the most commonly used anticoagulant for preventing thromboembolic events. Warfarin has a twenty-fold inter-individual dose variability and a narrow therapeutic index, and it is responsible for a third of adverse drug event hospitalizations in older Americans. Alternatives to warfarin, such as direct thrombin inhibitors and factor Xa inhibitors, are now available. However, these are more expensive and irreversible, and they may cause a higher rate of acute coronary events than warfarin [3,4]. Thus, warfarin remains a mainstay of anticoagulant therapy, and better methods of dosing warfarin will lead to fewer adverse events due to overcoagulation.
Both clinical modifiers and genetic polymorphisms are known to affect an individual’s stable therapeutic warfarin dose. Warfarin dose prediction algorithms have been formulated previously; however, these algorithms are less predictive in diverse populations.
With the provided exome data and clinical covariates, predict the therapeutic warfarin dose for 53 individuals.
The data set contains the following components:
A description of how the genomic data were collected is available in the methods section of the reference at http://www.ncbi.nlm.nih.gov/pubmed/25079360. The reference contains a large amount of other relevant information, including an analysis and prediction model developed by the dataset providers.
Download dataset: This dataset file is available only to registered users. Please log in to access the file.
Download submission template: This submission template file is available only to registered users. Please log in to access the file.
Download validation script: This submission validation script is available only to registered users. Please log in to access the file.
Prediction submission format
The prediction submission is a tab-delimited text file. Organizers provide a file template, which should be used for submission. A validation script is provided, and predictors should check the correctness of the format before submitting their predictions.
In the submitted file, each of the 53 rows includes the following columns:
In the template file, cells in columns 2-4 are marked with an "*". Submit your predictions by replacing the "*" with your value. No empty cells are allowed in the submission. You must enter a prediction and standard deviation for every individual; if you are not confident in a prediction for an individual, enter a large standard deviation for the prediction. Optionally, enter a brief comment indicating the basis of each prediction; otherwise, leave the "*" in these cells. Please make sure you follow the submission guidelines strictly.
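The format rules above can be checked locally before running the official validation script. The following Python sketch is not that script and may differ from it. It assumes a four-column layout (identifier, predicted dose, standard deviation, comment) with no header row, and it assumes doses and standard deviations must be finite and non-negative. The column layout, the function name `check_submission`, and the constants are illustrative assumptions, not part of the challenge materials.

```python
import math

EXPECTED_ROWS = 53     # stated in the challenge text
EXPECTED_COLUMNS = 4   # assumption: ID, dose, standard deviation, comment


def check_submission(text):
    """Return a list of problems found in a tab-delimited submission.

    An empty list means this sketch found nothing wrong; it does not
    replace the organizers' validation script.
    """
    problems = []
    rows = [line for line in text.splitlines() if line != ""]
    if len(rows) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        cells = row.split("\t")
        if len(cells) != EXPECTED_COLUMNS:
            problems.append(
                f"row {number}: expected {EXPECTED_COLUMNS} columns, "
                f"found {len(cells)}"
            )
            continue
        if any(cell.strip() == "" for cell in cells):
            problems.append(f"row {number}: empty cell")
            continue
        # Columns 2 and 3 must be numbers; a leftover "*" is only
        # acceptable in the optional comment column.
        for label, cell in (("dose", cells[1]),
                            ("standard deviation", cells[2])):
            try:
                value = float(cell)
            except ValueError:
                problems.append(f"row {number}: {label} is not a number: {cell!r}")
                continue
            if not math.isfinite(value) or value < 0:
                problems.append(f"row {number}: {label} out of range: {cell!r}")
    return problems
```

Calling `check_submission(open("submission.txt").read())` on a hypothetical file name would return an empty list when none of these checks fail.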
Note that although a numerical dose prediction is required and will be assessed, the evaluation will likely also include an assessment based on a binary prediction of high/low dose. Values at or below 44 will be considered low, and values above 44 will be considered high.
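As a small illustration of the high/low rule, the Python sketch below maps a numerical dose prediction to its binary class using the threshold of 44 given above. The function name `dose_class` is hypothetical, and how the assessors actually apply the threshold is not specified here.

```python
DOSE_THRESHOLD = 44  # from the challenge text; units are not stated there


def dose_class(dose):
    """Classify a predicted dose as "low" (at or below 44) or "high" (above 44)."""
    return "low" if dose <= DOSE_THRESHOLD else "high"
```

A prediction of exactly 44 falls in the low class, so predictions close to the threshold are worth a second look before submission.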
In addition, your submission should include a detailed description of the method used to make the predictions, similar to the style of the Methods section in a scientific article. This information will be submitted as a separate file.
To submit predictions, you need to create or be part of a CAGI User group. Submit your predictions by accessing the link: "All submission forms" from the front page of your group. For more details, please read the FAQ page.
Dataset provided by
Roxana Daneshjou and Russ Altman, Stanford University School of Medicine
6 Aug 2015 (v01): initial release
4 Sep 2015 (v02): challenge close date added
28 Oct 2015 (v03): submission instructions and template updated, validation script provided
7 Nov 2015 (v04): submission deadline extended
12 Nov 2015 (v05): improved validation script provided
18 Dec 2015 (v06): answer key provided
18 Mar 2016 (v06): predictions provided
14 Apr 2016 (v07): assessment and conference presentations provided