Predict gain-of-function variants in the fibroblast growth factor receptors

Challenge: FGFR

Variant data: registered users only

Last updated: 12 August 2025

This challenge is open. The challenge closes on September 30, 2025. 

How to participate in CAGI7?

Download data & submit predictions on Synapse

Make sure you understand our Data Use Agreement and Anonymity Policy

Summary 

Aberrantly activated fibroblast growth factor receptors (FGFRs) frequently drive tumorigenesis via activating, gain-of-function (GoF) mutations. The challenge involves predicting the functional impact of all possible missense variants (derived from single-nucleotide variants) in the kinase domains of human FGFR1, FGFR2, FGFR3, and FGFR4. These variants pose significant challenges for variant interpretation and precision oncology due to the lack of functional and clinical data, as well as the currently limited repertoire of approaches for predicting GoF. In addition to predicting whether variants cause activation or inactivation, the challenge also optionally involves predicting drug resistance. Predictions will be assessed against a high-throughput functional genomics saturation mutational scanning dataset.

Background 

The fibroblast growth factor receptor (FGFR) family (FGFR1, FGFR2, FGFR3, FGFR4) encodes receptor tyrosine kinases with key roles in cell growth and differentiation, with evidence that family members act as driver oncogenes (Wesche et al., 2011; Helsten et al., 2016; Murugesan et al., 2022; Dut et al., 2008; Lei & Deng, 2017; Babina & Turner, 2017; Silverman et al., 2021; Greenman et al., 2007; Rosty et al., 2005; Wang & Anderson, 2022; Mohammadi et al., 1996). FGFR point mutations (especially in the kinase domain) occur across a wide range of cancers, and some are actionable targets for FGFR inhibitors. However, functional data exist for fewer than 2% of all possible kinase domain variants, and most variants detected in patients remain variants of unknown significance (VUS), limiting clinical interpretation and the scope of precision therapy. Two inhibitors, pemigatinib (Hoy, 2020) and futibatinib (Syed, 2022), are clinically approved for tumors with FGFR rearrangements or fusions, but not generally for point mutations. This is largely because the functional impact of most point mutations was unknown. Notably, the impact of a mutation may differ between homologous positions of FGFR family members, and some activating variants are sensitive to one drug but resistant to another.

Understanding the functional impact of mutations and mapping the spectrum of resistance directly informs clinical decision-making. Recent studies have performed saturation mutational scanning for all kinase domain point mutations in the FGFR family, using pooled functional genomics to empirically define activation, inactivation, drug sensitivity, and resistance (Tangermann et al., 2025). This challenge leverages such comprehensive functional datasets to drive systematic variant annotation.

Experiment

Saturation Mutational Scanning Platform

The study performed saturation mutagenesis of the kinase domains of all four human FGFR family members (FGFR1, FGFR2, FGFR3, FGFR4). All possible single-nucleotide substitutions (totaling 11,520) were generated, covering all missense, nonsense, and synonymous point mutations in these domains. Oligonucleotide pools encoding these variants were synthesized and cloned into lentiviral expression vectors. Libraries were quality checked by next-generation sequencing (NGS) to ensure that all mutations were present at roughly equal representation. Two screens were used.
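As an illustration of what "all possible single-nucleotide substitutions" means, the sketch below enumerates every SNV in a coding sequence and labels it missense, nonsense, or synonymous using the standard genetic code. Each codon has exactly 9 single-nucleotide neighbors, so 11,520 substitutions corresponds to 1,280 codons. The function name `enumerate_snvs` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def enumerate_snvs(cds):
    """Yield (codon_number, ref_aa, alt_aa, effect) for every SNV in an in-frame CDS.

    codon_number is 1-based within the supplied sequence (hypothetical numbering,
    not UniProt residue numbering).
    """
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        ref_codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[ref_codon]
        for offset in range(3):
            for base in BASES:
                if base == ref_codon[offset]:
                    continue
                alt_codon = ref_codon[:offset] + base + ref_codon[offset + 1:]
                alt_aa = CODON_TABLE[alt_codon]
                if alt_aa == ref_aa:
                    effect = "synonymous"
                elif alt_aa == "*":
                    effect = "nonsense"
                else:
                    effect = "missense"
                yield start // 3 + 1, ref_aa, alt_aa, effect


# Toy example: two codons (Met, Trp) give 2 * 9 = 18 SNVs.
toy_snvs = list(enumerate_snvs("ATGTGG"))
effect_counts = Counter(effect for _, _, _, effect in toy_snvs)
```

Note that several SNVs can encode the same amino acid substitution (for example, three different changes at the third position of ATG all yield isoleucine), so the number of distinct amino acid substitutions is smaller than the number of missense SNVs.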

Cell-based Functional Selection Screens

A. Gain-of-Function (Activation) / Loss-of-Function (Inactivation) screen: 

To assess activation, lentivirus libraries were transduced into MCF10A, a growth factor-dependent human mammary epithelial cell line. Upon growth factor depletion, cells were positively selected for mutations that drove cell proliferation. After reaching confluency at 17-20 days, cells were harvested, and gDNA was isolated and sequenced. Reads were normalized based on counts in the library and the median of replicates. The enrichment of each variant, calculated as the frequency of the mutation in the library relative to the median of all synonymous mutations (which served as a control), was used to infer functional effect.
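One plausible reading of this normalization can be sketched as follows, for a single replicate. The exact pipeline used in the study is not specified here, so the sketch makes an assumption: each variant's frequency after selection is divided by its frequency in the input library, and the result is rescaled so that the median synonymous variant has an enrichment of 1. The function name and the variant identifiers are hypothetical.

```python
from statistics import median


def enrichment_scores(library_counts, selected_counts, synonymous):
    """Return enrichment per variant, relative to the median synonymous variant.

    library_counts / selected_counts: dicts mapping variant id -> read count
    before and after selection. synonymous: set of variant ids used as control.
    Assumed scheme, not necessarily the one used in the study.
    """
    library_total = sum(library_counts.values())
    selected_total = sum(selected_counts.values())
    ratios = {}
    for variant, library_n in library_counts.items():
        if library_n == 0:
            continue  # a variant absent from the library cannot be normalized
        library_freq = library_n / library_total
        selected_freq = selected_counts.get(variant, 0) / selected_total
        ratios[variant] = selected_freq / library_freq
    baseline = median(ratios[v] for v in synonymous if v in ratios)
    return {variant: ratio / baseline for variant, ratio in ratios.items()}


# Toy data: three synonymous controls and one enriched missense variant.
library = {"syn1": 100, "syn2": 100, "syn3": 100, "mis1": 100}
selected = {"syn1": 100, "syn2": 100, "syn3": 100, "mis1": 300}
scores = enrichment_scores(library, selected, {"syn1", "syn2", "syn3"})
```

In the toy data, the missense variant ends up threefold enriched relative to the synonymous controls, which sit at 1 by construction.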

Mutations with a >1.5-fold median increase and enrichment in at least 3 of 4 replicates were considered activating (potentially, weakly, or strongly activating). Conversely, inactivating mutations were identified by a >2.5-fold depletion (≤0.4-fold enrichment) in at least 3 of 4 replicates.
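These thresholds can be expressed as a small decision rule. The sketch below is a minimal illustration, not the study's code. It assumes that a replicate counts as "enriched" when its enrichment exceeds 1.0 (the synonymous baseline); that per-replicate cutoff is not stated above, so it is exposed as the parameter `replicate_up`. The cutoffs separating potentially, weakly, and strongly activating variants are not given and are therefore not modeled, and the label "unclassified" is a placeholder.

```python
from statistics import median


def classify_activation(replicates, median_up=1.5, replicate_up=1.0,
                        down=0.4, min_replicates=3):
    """Classify one variant from its per-replicate enrichment values.

    median_up, down, and min_replicates follow the thresholds in the text;
    replicate_up is an assumption about what "enrichment" in a replicate means.
    """
    n_enriched = sum(value > replicate_up for value in replicates)
    n_depleted = sum(value <= down for value in replicates)
    if median(replicates) > median_up and n_enriched >= min_replicates:
        return "activating"
    if n_depleted >= min_replicates:
        return "inactivating"
    return "unclassified"


examples = {
    "consistent gain": [2.0, 1.8, 1.6, 0.9],
    "consistent loss": [0.3, 0.2, 0.4, 1.0],
    "near baseline": [1.1, 0.9, 1.0, 1.2],
    "two replicates only": [3.0, 3.0, 0.5, 0.5],
}
calls = {name: classify_activation(values) for name, values in examples.items()}
```

The last example has a median above 1.5 but is enriched in only 2 of 4 replicates, so it is not called activating; this is the role of the replicate-consistency requirement.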

B. Drug Resistance screens:

To assess drug resistance, the lentivirus libraries were transduced into NCI-H1581 cells. After selection for stable expression, cells were exposed to clinically relevant concentrations of either pemigatinib or futibatinib and, after further incubation, were harvested. Resistant cells (harboring resistance-conferring mutations) outgrew sensitive cells during drug exposure. Enrichment, as determined by NGS, identified mutations causing resistance specifically to each drug. Mutations with a >1.5-fold median increase and enrichment in 2 of 3 replicates were considered resistant (potentially, weakly, or strongly resistant).

Why pemigatinib and futibatinib?

Pemigatinib and futibatinib are selective, clinically approved FGFR kinase inhibitors (Hoy, 2020; Syed, 2022). They were chosen because both are approved for tumors with FGFR rearrangements or fusions, but not generally for point mutations. This is largely because the functional impact of most point mutations was unknown. Systematically screening all kinase domain mutations for resistance to these drugs enables identification of activating mutations that are also “druggable” (i.e., sensitive to these inhibitors rather than resistant). Mapping the spectrum of resistance (some mutations confer resistance to one drug but not the other) directly informs clinical decision-making. That is, knowing whether a specific mutation is activating, resistant, or druggable helps assign targeted therapies or guide away from likely ineffective drugs.

Prediction challenge

Participants must provide predictions for 8,407 single-nucleotide-variant-induced amino acid substitutions (missense variants) in the kinase domains of FGFR1, FGFR2, FGFR3, and FGFR4 (reference protein sequences provided below). Predictions of activation/inactivation are required; predictions of pemigatinib and/or futibatinib drug resistance are optional. The primary assessment will be on gain-of-function activation.

Predictions should be expressed as scores representing the probability and confidence for activation, inactivation, and resistance to each inhibitor, together with a categorical classification and a standard deviation/confidence measure for each score.

Key considerations:

Submission format 

Predictions should be submitted for missense variants with supporting evidence or confidence; large standard deviations should be used to indicate low certainty. If the effect for a drug or phenotype cannot be predicted, indicate this with "*" and provide a comment if appropriate (e.g., "outside kinase domain", "noncanonical residue", etc.). Reference protein sequences for the kinase domains are given in UniProt as follows: FGFR1: P11362-1, FGFR2: P21802-1, FGFR3: P22607-1, and FGFR4: P22455-1.

Submissions should follow the template as provided below and make the predictions as follows:

In the template file, cells in columns 3-17 are marked with a "*". Submit your predictions by replacing the "*" with your value. No empty cells are allowed in the submission, though you may leave a "*". If you are not confident in a prediction for a variant, enter a large standard deviation for the prediction. Optionally, enter a brief comment on the basis of the prediction; otherwise, leave the "*" in these cells. Please make sure you follow the submission guidelines strictly.
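Before running the official validation script, a quick local check for the "no empty cells" rule might look like the sketch below. It is not a substitute for the provided validator: it only looks for empty cells and rows whose length differs from the header, and it makes no assumption about the template's actual column names (the header in the demo is hypothetical).

```python
import csv
import io


def find_empty_cells(handle):
    """Return a list of (row_number, column_number) for empty cells.

    Row and column numbers are 1-based; the header is row 1. A row whose
    length differs from the header is reported as (row_number, None).
    """
    reader = csv.reader(handle)
    header = next(reader)
    problems = []
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append((row_number, None))
            continue
        for column_number, cell in enumerate(row, start=1):
            if cell.strip() == "":
                problems.append((row_number, column_number))
    return problems


# Hypothetical miniature file; "*" is allowed, an empty cell is not.
demo = io.StringIO(
    "variant,score,sd,comment\n"
    "v1,0.9,0.1,*\n"
    "v2,,0.5,*\n"
    "v3,*,*,*\n"
)
demo_problems = find_empty_cells(demo)
```

For a real submission, the same function can be called on a file opened with `open(path, newline="")`.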

In addition, each submission must be accompanied by a detailed methods description file, outlining:

File naming

CAGI allows submission of up to six models per team, of which model 1 is considered primary. You can upload predictions for each model multiple times; the last submission before the deadline will be evaluated for each model. If you are submitting a single file with all predictions combined, please use the format below.

Use the following format for your submissions: <teamname>_model_(1|2|3|4|5|6).(csv|txt)

To include a description of your method, use the following filename: <teamname>_desc.*

Example: if your team’s name is “bestincagi” and you are submitting predictions for your model number 3, your filename should be bestincagi_model_3.txt.
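The naming rules above can be checked with a regular expression, as in the following sketch. The set of characters allowed in a team name is not specified, so the pattern assumes letters and digits only; adjust `TEAM` if your team name contains other characters.

```python
import re

TEAM = r"[A-Za-z0-9]+"  # assumption: team names are alphanumeric
PREDICTION_RE = re.compile(
    rf"^(?P<team>{TEAM})_model_(?P<model>[1-6])\.(?P<ext>csv|txt)$"
)
DESCRIPTION_RE = re.compile(rf"^(?P<team>{TEAM})_desc\..+$")


def check_filename(name):
    """Return 'prediction', 'description', or None for an invalid name."""
    if PREDICTION_RE.match(name):
        return "prediction"
    if DESCRIPTION_RE.match(name):
        return "description"
    return None


results = {
    name: check_filename(name)
    for name in [
        "bestincagi_model_3.txt",
        "bestincagi_model_7.txt",   # only models 1-6 are allowed
        "bestincagi_model_3.tsv",   # only .csv or .txt are allowed
        "bestincagi_desc.pdf",
    ]
}
```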

Assessment 

The assessment will be primarily carried out for the activating potential of missense variants (gain of function). The inactivating potential (loss of function) and drug resistance are secondary in this challenge. Because no example data are provided, assessors will conduct some assessments which rescale the predictions or do not depend upon exact values. The metrics will follow the tradition of previous CAGI challenges (The Critical Assessment of Genome Interpretation Consortium, 2024).
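To illustrate what an assessment that does "not depend upon exact values" can look like, the sketch below computes a rank-based area under the ROC curve, which depends only on the ordering of the scores. This is an example of the general idea, not a statement of the metrics the assessors will use; the function name and the toy scores and labels are hypothetical.

```python
import math


def rank_auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half. Only the ordering of scores matters, so any strictly
    increasing transformation of the scores leaves the result unchanged.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


toy_scores = [0.9, 0.8, 0.3, 0.1]          # hypothetical predicted scores
toy_labels = [True, False, True, False]    # hypothetical activating calls
auc_raw = rank_auc(toy_scores, toy_labels)
auc_rescaled = rank_auc([100 * s + 5 for s in toy_scores], toy_labels)
auc_logged = rank_auc([math.log(s) for s in toy_scores], toy_labels)
```

All three values are identical because linear rescaling and the logarithm both preserve the ranking of the predictions.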

Download data 

Download the submission template for 8,407 missense variants: cagi7fgfrsubmissiontemplate.csv (Synapse)

Download submission validation script: cagi7fgfrvalidation.py (Synapse) 

Dataset provided by

Sven Diederichs, University of Freiburg & German Cancer Consortium (DKTK)

References 

Babina IS, Turner NC. Advances and challenges in targeting FGFR signalling in cancer. Nat Rev Cancer (2017) 17(5):318-332. PubMed 

Dai S, et al. Fibroblast Growth Factor Receptors (FGFRs): structures and small molecule inhibitors. Cells (2019) 8(6):614. PubMed 

Dut A, et al. Drug-sensitive FGFR2 mutations in endometrial carcinoma. Proc Natl Acad Sci U S A (2008) 105(25):8713-8717. PubMed 

Greenman C, et al. Patterns of somatic mutation in human cancer genomes. Nature (2007) 446(7132):153-158. PubMed 

Helsten T, et al. The FGFR landscape in cancer: analysis of 4,853 tumors by next-generation sequencing. Clin Cancer Res (2016) 22(1):259-267. PubMed 

Hoy SM. Pemigatinib: first approval. Drugs (2020) 80(9):923-929. PubMed 

Lei H, Deng CX. Fibroblast growth factor receptor 2 signaling in breast cancer. Int J Biol Sci (2017) 13(9):1163-1171. PubMed 

McTigue MA, et al. Crystal structure of the kinase domain of human vascular endothelial growth factor receptor 2: a key enzyme in angiogenesis. Structure (1999) 7(3):319-330. PubMed 

Mohammadi M, et al. Structure of the FGF receptor tyrosine kinase domain reveals a novel autoinhibitory mechanism. Cell (1996) 86(4):577-587. PubMed 

Murugesan K, et al. Pan-tumor landscape of fibroblast growth factor receptor 1-4 genomic alterations. ESMO Open (2022) 7(6):100641. PubMed 

Patterson SE, et al. The clinical trial landscape in oncology and connectivity of somatic mutational profiles to targeted therapies. Hum Genomics (2016) 10:4. PubMed 

Rosty C, et al. Clinical and biological characteristics of cervical neoplasias with FGFR3 mutation. Mol Cancer (2005) 4(1):15. PubMed 

Silverman IM, et al. Clinicogenomic analysis of FGFR2-rearranged cholangiocarcinoma identifies correlates of response and mechanisms of resistance to pemigatinib. Cancer Discov (2021) 11(2):326-339. PubMed 

Suehnholz SP, et al. Quantifying the expanding landscape of clinical actionability for patients with cancer. Cancer Discov (2024) 14(1):49-65. PubMed 

Syed YY. Futibatinib: first approval. Drugs (2022) 82(18):1737-1743. PubMed 

Tangermann, et al. Saturation mutational scanning uncovers druggability of all FGFR kinase domain point mutations. Nat Genet (2025), accepted.

The Critical Assessment of Genome Interpretation Consortium. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol (2024) 25(1):53. PubMed 

Wang Z, Anderson KS. Therapeutic targeting of FGFR signaling in head and neck cancer. Cancer J (2022) 28(5):354-362. PubMed 

Wesche J, et al. Fibroblast growth factors and their receptors in cancer. Biochem J (2011) 437(2):199-213. PubMed 

Revision history 

10 August 2025: challenge preview posted

11 August 2025: challenge opened for submissions

12 August 2025: challenge description and template submission format/file updated