Predict the classification of novel variants encountered across genetic tests conducted by Invitae

Challenge: Sherloc clinical classification

Genotype data: registered users only

Last updated: 08 July 2021

This challenge is open. This challenge will close on 15 September 2021.

How to participate in CAGI6?

Download data & submit predictions on Synapse

Make sure you understand our Data Use Agreement and Anonymity Policy


Invitae is a genetic testing company that publishes its variant interpretations to ClinVar. In this challenge, over 122,000 previously uncharacterized variants are provided, spanning the range of effects seen in the clinic. Following the close of this challenge, Invitae will submit its interpretations for these variants to ClinVar. Predictors are asked to interpret the pathogenicity of these variants, and Invitae will assess the clinical utility of the predictions across multiple categories.


Variants at Invitae are classified according to the Sherloc interpretation framework (Nykamp et al., 2017), which refines ACMG/AMP criteria (Richards et al., 2015) into a set of discrete, weighted rules for classifying variants based on available evidence. Evidence can include previous observations of the variant in individuals, the type of variant and the mechanism of disease, experimental evidence, and indirect and predictive evidence.

Invitae classifies most variants according to the ACMG-recommended five-tier classification system. 

Read more about Invitae’s variant classification here

Prediction challenge

Over 122,000 coding (missense, silent, frameshift, stop gained, in-frame codon loss) and non-coding (intronic, splice donor/acceptor) variants are provided. Predictors are asked to submit predictions on the probability that a variant is pathogenic.

Given the specifics of the dataset, the organizers encourage predictors to primarily focus on three subcategories of variants: 

Finally, the organizers have provided an additional file of 420,000+ variant classifications from Invitae published to ClinVar. The balance of benign and pathogenic variants between training variants and challenge variants is not necessarily consistent. 

Variants will be provided in the following format with GRCh38 coordinates: 

The training variant file is provided in a similar format to the challenge variants, along with a fifth column with the published classification. Please note that the data provided for this challenge are for research purposes only, and are not intended for clinical application. 

Progress tracker

A progress tracker feature will become available in the last month of the challenge. This will look similar to a leaderboard, but may include different metrics than those used for final evaluation, and is not meant to forecast the outcome of the challenge. The purpose of the progress tracker is to provide useful feedback on participants’ progress. An email will be sent out to registered participants to notify them when it goes live.  More information about the progress tracker metrics is given below in the assessment section.

Prediction submission format

The prediction submission is a tab-delimited text file. Organizers provide a template file, which must be used for submission. Each row in the submitted file will correspond to a variant.  Each row will include the following columns: 

Please name the files with group ID, submission ID and the date. CAGI allows submission of up to six models per team, of which model 1 is considered primary. You can upload predictions for each model multiple times; the last submission before the deadline will be evaluated for each model.

In addition, your submission must include a detailed description of the method used to make the predictions, written in the style of the Methods section of a scientific article. You are also encouraged to describe how your model could be used in clinical genetic testing (screening tool, diagnostic tool, etc.). This information must be submitted as a separate file.

Predictors are welcome to submit multiple models on any subset of variants; however, predictors are strongly encouraged to focus on one or more of the three subcategories described above. If you would like your submission to be evaluated on a subset of variants other than those described in the challenge above, please describe the subcategory in your submission and the evaluators will make reasonable effort to accommodate.

File naming

Use the following format for your submissions: <teamname>_model_(1|2|3|4|5|6).(tsv|txt)

To include a description of your method, use the following filename: <teamname>_desc.*

Example: if your team’s name is “bestincagi” and you are submitting predictions for your model number 3, your filename should be bestincagi_model_3.txt.
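
The naming rules above can be checked before upload. The following is a minimal sketch, not an official validator: the function name is hypothetical, and the set of characters allowed in a team name (letters, digits, hyphens) is an assumption, since the challenge text does not specify it.

```python
import re

# Patterns follow the naming rule stated above. The allowed team-name
# characters are an assumption; the challenge text does not define them.
PREDICTION_RE = re.compile(r"(?P<team>[A-Za-z0-9-]+)_model_(?P<model>[1-6])\.(?:tsv|txt)")
DESCRIPTION_RE = re.compile(r"(?P<team>[A-Za-z0-9-]+)_desc\..+")


def classify_submission_filename(filename):
    """Return ("prediction", team, model) or ("description", team, None).

    Returns None when the name matches neither pattern.
    """
    match = PREDICTION_RE.fullmatch(filename)
    if match:
        return ("prediction", match.group("team"), int(match.group("model")))
    match = DESCRIPTION_RE.fullmatch(filename)
    if match:
        return ("description", match.group("team"), None)
    return None
```

For instance, `classify_submission_filename("bestincagi_model_3.txt")` identifies a prediction file for model 3, while a model number outside 1 to 6 or an extension other than `.tsv` or `.txt` is rejected.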


Assessment

Assessment will be based on both a quantitative evaluation of model performance and a qualitative evaluation of the method’s originality and clinical relevance. Rather than declaring an overall winner, assessors will highlight globally high-performing models; high-performing models across biologically meaningful subsets of variants; and inventive models that introduce new predictive techniques that future researchers can build on.

By default, all predictors will be evaluated across four groups of variants: each of the three variant subcategories described above, as well as the full set of variants in the challenge. For each variant group, predictors will be evaluated based on the classification reached by Invitae.

A range of evaluation metrics will be considered. Emphasis will be placed on computing the recall at several different levels of precision. Predictions of pathogenicity and benignity will both be considered. Other measures, such as area under the ROC curve, may be computed as appropriate. In each of these measures, the assessors seek to highlight predictors that perform well on a large number of variants.
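
To make "recall at a given level of precision" concrete, here is a minimal sketch of one way to compute it. This is an illustration only, not the assessors' implementation: the function name, the 0/1 label encoding, and the handling of tied scores are all assumptions.

```python
def recall_at_precision(labels, scores, min_precision):
    """Highest recall over all score thresholds whose precision is >= min_precision.

    labels: 1 for the class of interest (e.g. pathogenic), 0 otherwise
            (an assumed encoding).
    scores: predicted probability for that class, in the same order as labels.
    Returns 0.0 if no threshold reaches the requested precision.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")

    # Rank variants from the most to the least confident prediction.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    true_positives = 0
    i = 0
    while i < len(ranked):
        # Consume every variant tied at this score, so that a threshold
        # never separates variants with identical predictions.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_positives += ranked[j][1]
            j += 1
        precision = true_positives / j  # j variants are called positive so far
        if precision >= min_precision:
            best_recall = max(best_recall, true_positives / total_positives)
        i = j
    return best_recall


# Toy data, invented for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.3, 0.1]
print(recall_at_precision(labels, scores, 0.9))
```

Under the same assumptions, benign predictions can be scored by flipping both inputs, that is, passing `[1 - y for y in labels]` and `[1 - s for s in scores]`.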

Models will also be qualitatively evaluated based on potential clinical utility as described in the written portion of the submission. Within reason, evaluators will take into account information from the submitters on the model’s scope and use case (for example, a model described as a screening tool rather than a diagnostic tool may be penalized less heavily for false positives, or a model specializing in a subset of genes from a particular metabolic pathway will be scored much more heavily on its performance on variants within these genes). We will also give special consideration to models using distinctive methods (not based on current state-of-the-art predictors).

A progress tracker will be made available around August 1st, 2021. The progress assessment will return multiple metrics for each subcategory of method, including area under the ROC curve, precision-recall curves, and values of recall at several levels of precision. Roughly 20% of the evaluation dataset will be used for the progress tracker. Again, this is not meant to reflect final scoring of the challenge, but should be considered a useful signal for contestants as they experiment with methods and fine-tune their models.

Data provided by Invitae

Special thanks to Rachel Hovde, Naomi Fox, Alex Colavin, Kathryn Hatchell, John Garcia, Yuya Kobayashi, Rebecca Truty, and Keith Nykamp for their help in preparing the challenge data.


References

Nykamp K, et al. Sherloc: a comprehensive refinement of the ACMG-AMP variant classification criteria. Genet Med (2017) 19(10): 1105-1117. Erratum in: Genet Med (2020) 22(1): 240-242.

Richards S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med (2015) 17(5): 405-424.

Revision history 

08 July 2021: initial release, challenge opens