
Predict how variants of p16 tumor suppressor protein affect cell proliferation

Dataset description: public

Exome sequence data: registered users only, limited by CAGI Data Use Agreement

This challenge closed on 25 April 2013.

p16 Challenge answer key (1 KB, txt): registered users only, limited by CAGI Data Use Agreement

Assessor summary (555 KB, zip): registered users only, limited by CAGI Data Use Agreement

Slides from the CAGI conference: registered users only, limited by CAGI Data Use Agreement

      Maria Chiara Scaini: Data Provider Talk (5.3 MB, remixable ppt) 

      Silvio Tosatto: Assessment (19 MB, remixable ppt) 

      Emidio Capriotti: Predictor Talk (3.6 MB, remixable ppt)

Predictions (1.6 MB, zip): registered users only, limited by CAGI Data Use Agreement


CDKN2A is the most common high-penetrance susceptibility gene identified to date in familial malignant melanoma. The CDKN2A locus maps to chromosome 9p21 and encodes, through alternative splicing of different first exons (1α and 1β), two oncosuppressors involved in cell cycle regulation. These two proteins are p16INK4A, which promotes cell cycle arrest by inhibiting the cyclin-dependent kinases CDK4 and CDK6, and p14ARF, which inhibits the oncogenic action of MDM2 by blocking MDM2-induced degradation of p53. p16INK4A is composed of four ankyrin-type repeats, each containing a pair of antiparallel helices and a loop, creating a cleft that binds to and blocks the function of CDK4 and CDK6. When associated with D-type cyclins, CDK4/6 promote cell cycle progression through the G1 phase by contributing to the phosphorylation and functional inactivation of the retinoblastoma gene product, pRb. Many p16 mutations have been shown to compromise the CDK4-inhibitory activity of p16 and/or its ability to block the cell cycle; constitutional and somatic inactivating p16 mutations are common in malignant melanoma.

Besides bona fide pathogenic mutations, over 150 different sequence variants (Variants of Uncertain Significance, VUS) at this locus have been identified worldwide, and most of them await adequate functional characterization. These alterations might have subtle effects on the folding and interactions of the p16INK4A protein domain (ankyrin units) as well as on its ability to associate with its targets (CDK4/6). Nonetheless, the most important criterion for assessing the pathogenic role of a CDKN2A variant remains the evaluation of its effect on cell cycle regulation.

Prediction challenge

Evaluate how different variants of p16 protein impact its ability to block cell proliferation.

Experimental overview

In this study we chose an "a priori" approach for VUS evaluation, identifying p16 variants that should result in altered in vitro protein properties. Saturation mutagenesis experiments were carried out on a few pivotal positions in the p16 protein and mutants at these positions were analysed along with some proband-related missense mutations, i.e., variants identified in melanoma-prone families.

In detail, p16 cDNA was cloned in an expression vector and used as a substrate for site-specific saturation mutagenesis. The functional evaluations of the p16 variants were carried out by expressing them in U2-OS, a p16-null human osteosarcoma cell line, along with both negative (p16 wild-type) and positive (mutation-like) controls. Transfected cells were selected on G418 and cell counts (average of two replicates) were taken at different time points. Cell counts were normalised to the day 1 count (number of cells / number of cells at day 1) to give the proliferation rate. Each variant was tested at least twice, and most variants were tested four times or more.
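The normalisation described above can be sketched in a few lines of Python. This is a minimal illustration, not the data providers' analysis code: the function name, the input layout (a mapping from day number to replicate counts) and the cell counts are all assumptions made for the example.

```python
from statistics import mean


def proliferation_rate(counts_by_day, baseline_day=1):
    """Normalise replicate-averaged cell counts to the count at baseline_day.

    counts_by_day maps a day number to the list of replicate counts
    taken on that day (hypothetical input layout).
    """
    baseline = mean(counts_by_day[baseline_day])
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    # Average the replicates for each day, then divide by the day 1 average.
    return {
        day: mean(replicates) / baseline
        for day, replicates in sorted(counts_by_day.items())
    }


# Illustrative numbers only, not taken from the challenge data.
counts = {1: [1000, 1200], 4: [2100, 2300], 7: [4300, 4500]}
rates = proliferation_rate(counts)
print(rates)
```

By construction the rate at day 1 is 1, so later values read directly as fold increase in cell number.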

Considerations for classification

The proliferation rate of the mutation-like (positive) control cells was set as 100%. The proliferation rate for p16 wild-type (negative) control cells was approximately 50%. There was no obvious cut-off point between wild-type and damaging variants; rather, the distribution forms a continuum.
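As a rough illustration of this scale, the sketch below expresses a proliferation rate as a percentage of the positive control. It assumes the scaling is a simple ratio to the positive control measured at the same time point; the page does not spell out the exact procedure, and the function name and numbers are hypothetical.

```python
def percent_of_positive_control(variant_rate, positive_control_rate):
    """Express a proliferation rate as a percentage of the positive control.

    Assumption: both rates were measured at the same time point.
    """
    if positive_control_rate <= 0:
        raise ValueError("positive control rate must be positive")
    return 100.0 * variant_rate / positive_control_rate


# Illustrative fold-increase values only, not challenge data.
raw_rates = {"positive_control": 4.0, "wild_type": 2.0, "variant": 3.0}
scaled = {
    name: percent_of_positive_control(rate, raw_rates["positive_control"])
    for name, rate in raw_rates.items()
}
print(scaled)
```

With these made-up inputs the wild-type control lands at 50% and the variant falls in between the two controls, which is the continuum the classification note refers to.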

Training data is available to registered users; please log in to access it.

Dataset: available to registered users only; please log in to access it.

Prediction submission format 

The template submission file contains one line per variant and three tab-delimited fields per line.
1. The first field contains the variant name and should be left unaltered.
2. The second field should contain the predicted proliferation rate, bearing in mind that negative controls (wt-like) are 50% and positive controls (mutation-like) are 100%.
3. The third field should contain the standard deviation (SD), which expresses the confidence of the prediction in column 2. A high SD means low confidence, while a small SD means that the predictor is confident about the submitted value.

In the template file, all blank cells are marked with an "*". Submit your predictions by replacing the "*" with your value in columns 2 and 3. No empty cells are allowed in the submission; if you cannot submit a prediction for a variant, leave the "*" in these cells. Please make sure you follow these submission guidelines strictly.
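The format rules above can be sketched as a small Python helper that fills in a template. This is only an illustration under stated assumptions: the variant names VAR_A and VAR_B are placeholders rather than names from the real template, the function name is hypothetical, and the one-decimal number formatting is a choice made for the example, not a CAGI requirement.

```python
def format_submission(variant_names, predictions):
    """Build tab-delimited submission text, one line per variant.

    variant_names: names in template order, written out unaltered.
    predictions: maps a variant name to (predicted rate in %, SD).
    Variants without a prediction keep "*" in columns 2 and 3.
    """
    lines = []
    for name in variant_names:
        if name in predictions:
            rate, sd = predictions[name]
            if sd < 0:
                raise ValueError(f"SD for {name} must be non-negative")
            fields = [name, f"{rate:.1f}", f"{sd:.1f}"]
        else:
            fields = [name, "*", "*"]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Placeholder variant names and values, for illustration only.
text = format_submission(["VAR_A", "VAR_B"], {"VAR_A": (72.5, 10.0)})
print(text, end="")
```

Keeping the "*" for unpredicted variants means every line still has three non-empty fields, which is what the guidelines require.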

In addition, your submission should include a detailed description of the method used to make the predictions (similar to the style of the Methods section in a scientific article). This information will be submitted as a separate file.

To submit predictions, you need to create or be part of a CAGI User group. Submit your predictions by following the "All submission forms" link on the front page of your group. For more details, please read the FAQ page.

The template file is available to registered users only; please log in to access it.

References and Additional information 

Kannengiesser C, Brookes S, del Arroyo AG, Pham D, Bombled J, Barrois M, Mauffret O, Avril MF, Chompret A, Lenoir GM, Sarasin A; French Hereditary Melanoma Study Group, Peters G, Bressac-de Paillerets B. Functional, structural, and genetic evaluation of 20 CDKN2A germ line mutations identified in melanoma-prone families or patients. Hum Mutat. 2009 Apr;30(4):564-74.

McKenzie HA, Fung C, Becker TM, Irvine M, Mann GJ, Kefford RF, Rizos H. Predicting functional significance of cancer-associated p16(INK4a) mutations in CDKN2A. Hum Mutat. 2010 Jun;31(6):692-701.

Miller PJ, Duraisamy S, Newell JA, Chan PA, Tie MM, Rogers AE, Ankuda CK, von Walstrom GM, Bond JP, Greenblatt MS. Classifying variants of CDKN2A using computational and laboratory studies. Hum Mutat. 2011 Aug;32(8):900-11.

CDKN2A variant databases: LOVD (Leiden Open Variation Database); UVM BioDesktop (CDKN2a Database Project)

CDKN2A gene sequence NCBI Reference Sequence NG_007485.1

p16INK4A mRNA used for the experimental set-up, Homo sapiens cyclin-dependent kinase inhibitor 2A (CDKN2A), transcript variant 1; NCBI Reference Sequence: NM_000077.4

The structure of p16 bound to CDK6 has been determined by X-ray crystallography and is available from the PDB with code 1bi7.

Data provided by

Maria Chiara Scaini, Lisa Elefanti, Emma D’Andrea, Chiara Menin and Silvio CE Tosatto


This challenge is being assessed by Silvio Tosatto, University of Padua, Italy