Database for Variant Impact Predictors (VIPdb)

Genetics is increasingly used to guide clinical decision-making. Accurately predicting the impact of genomic variants is a key step in converting genetic data into clinically useful information. About one hundred tools and databases have been developed specifically for this purpose over the last decade. Here we summarize these tools and their characteristics in a database of genetic Variant Impact Predictors (VIPdb). This database will help genomic scientists choose tools appropriate for their own purposes and will also contribute to the development of better tools in this field.

To give an intuitive view of tool usage in the field, we provide a word cloud ("wordle") of these tools. Each predictor's name is sized according to the logarithm of its number of 2-year citations. Tools with 0 citations were assigned 0.5 citations before the log transform so that their names would still appear. Colors are assigned randomly.
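The scaling rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the figure: the function names, the natural-log base, the linear mapping onto a font-size range, and the `min_pt`/`max_pt` values are all assumptions, and the tool names in the usage example are placeholders.

```python
import math


def scaled_citations(citations: int) -> float:
    """Log-transform a 2-year citation count, substituting 0.5 for zero.

    Natural log is an assumption; the source only says "logarithm".
    """
    if citations < 0:
        raise ValueError("citation counts cannot be negative")
    adjusted = 0.5 if citations == 0 else citations
    return math.log(adjusted)


def font_sizes(counts, min_pt=8.0, max_pt=72.0):
    """Map log-scaled counts linearly onto [min_pt, max_pt].

    The point range and the linear mapping are hypothetical choices
    made for this sketch.
    """
    if not counts:
        return {}
    logs = {name: scaled_citations(c) for name, c in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    if span == 0:
        # All tools have the same count: use one mid-range size.
        return {name: (min_pt + max_pt) / 2 for name in logs}
    return {
        name: min_pt + (value - lo) / span * (max_pt - min_pt)
        for name, value in logs.items()
    }


if __name__ == "__main__":
    # Placeholder tool names and counts, for illustration only.
    example = {"ToolA": 0, "ToolB": 1, "ToolC": 100}
    for name, size in font_sizes(example).items():
        print(f"{name}: {size:.1f} pt")
```

Replacing 0 with 0.5 keeps the logarithm defined and places uncited tools just below tools with a single citation, so their names remain visible at the smallest size.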

[Figure: CAGI Wordle]

A higher-resolution version of the image can be found here.
For a version of the figure with all-time citations instead of 2-year citations, see here.
You are welcome to use this image with appropriate citation.

If you know of any methods we did not include, please let us know. Preferably email or

Information on the variant annotation tools included can be found in the table below.

A table including comprehensive information can be found here.

To cite the figure, please cite: Hoskins RA, Repo S, Barsky D, Andreoletti G, Moult J, Brenner SE. 2017. Reports from CAGI: The Critical Assessment of Genome Interpretation. Hum Mutat 38:1072-1084. doi:10.1002/humu.23290

| Method | Title | DOI | Homepage |
|---|---|---|---|
| SignalP | SignalP 4.0: Discriminating signal peptides from transmembrane regions | | Homepage Link |
| PolyPhen 2 | A method and server for predicting damaging missense mutations | | Homepage Link |
| ANNOVAR | Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR | | Homepage Link |
| SIFT | SIFT web server: Predicting effects of amino acid substitutions on proteins | | Homepage Link |
| SnpEff | A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3 | | Homepage Link |
| CADD | A general framework for estimating the relative pathogenicity of human genetic variants | | Homepage Link |
| MutationTaster2 | Mutationtaster2: Mutation prediction for the deep-sequencing age | | Homepage Link |
| MutSigCV | Mutational heterogeneity in cancer and the search for new cancer-associated genes | | Homepage Link |
| Exome Variant Server (EVS) | A map of human genome variation from population-scale sequencing | | Homepage Link |
| PROVEAN | PROVEAN web server: A tool to predict the functional effect of amino acid substitutions and indels | | Homepage Link |
| VEP | The Ensembl Variant Effect Predictor | | Homepage Link |
| ClinVar | ClinVar: Public archive of interpretations of clinically relevant variants | | Homepage Link |
| Panther | PANTHER version 11: Expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements | | Homepage Link |
| Human Splicing Finder | Human Splicing Finder: An online bioinformatics tool to predict splicing signals | | Homepage Link |
| WoLF-PSORT | WoLF PSORT: Protein localization predictor | | Homepage Link |
| phastCons | Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes | | Homepage Link |
| FIS | Predicting the functional impact of protein mutations: Application to cancer genomics | | Homepage Link |
| MutationAssessor | Predicting the functional impact of protein mutations: Application to cancer genomics | | Homepage Link |
| phyloP_ | Detection of nonneutral substitution rates on mammalian phylogenies | | Homepage Link |
| FATHmm | Ranking non-synonymous single nucleotide polymorphisms based on disease concepts | | Homepage Link |
| TANGO | Protein aggregation and amyloidosis: Confusion of the kinds? | | Homepage Link |
| DeepSEA | Predicting effects of noncoding variants with deep learning-based sequence model | | Homepage Link |
| GERP++ | Identifying a high fraction of the human genome to be under selective constraint using GERP++ | | Homepage Link |
| MaxEnt | Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals | | Homepage Link |
| MutPred Splice | MutPred Splice: Machine learning-based prediction of exonic variants that disrupt splicing | | Homepage Link |
| MutPred-LOF | When loss-of-function is loss of function: Assessing mutational signatures and impact of loss-of-function genetic variants | | Homepage Link |
| MetaLR | Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies | | Homepage Link |
| MetaSVM | Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies | | Homepage Link |
| Align-GVGD | Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral | | Homepage Link |
| HGMD | The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies | | Homepage Link |
| MutPred2 | Automated inference of molecular mechanisms of disease from amino acid substitutions | | Homepage Link |
| PhD-SNP | Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information | | Homepage Link |
| Oncotator | Oncotator: Cancer variant annotation tool | | Homepage Link |
| SNPinfo/FuncPred | SNPinfo: Integrating GWAS and candidate gene information into functional SNP selection for genetic association studies | | Homepage Link |
| DANN | DANN: A deep learning approach for annotating the pathogenicity of genetic variants | | Homepage Link |
| AGGRESCAN | AGGRESCAN: A server for the prediction and evaluation of "hot spots" of aggregation in polypeptides | | Homepage Link |
| MuSiC | MuSiC: Identifying mutational significance in cancer genomes | | Homepage Link |
| REVEL | REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants | | Homepage Link |
| Condel | Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel | | Homepage Link |
| GWAVA | Functional annotation of noncoding sequence variants | | Homepage Link |
| DUET | DUET: A server for predicting effects of mutations on protein stability using an integrated computational approach | | Homepage Link |
| CHASM | CHASM and SNVBox: Toolkit for detecting biologically important single nucleotide mutations in cancer | | Homepage Link |
| fathmm-MKL | An integrative approach to predicting the functional effects of non-coding and coding sequence variation | | Homepage Link |
| PredictSNP2 | PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions. | | Homepage Link |
| Dmutant | Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction | | Homepage Link |
| Mupro | Prediction of protein stability changes for single-site mutations using support vector machines | | Homepage Link |
| INSIGHT | Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database | | Homepage Link |
| EIGEN | A spectral approach integrating functional genomic annotations for coding and noncoding variants | | Homepage Link |
| Waltz | Exploring the sequence determinants of amyloid structure using position-specific scoring matrices | | Homepage Link |
| Eris | Eris: An automated estimator of protein stability | | Homepage Link |
| LocTree3 | LocTree3 prediction of localization | | Homepage Link |
| EVmutation | Mutation effects predicted from sequence co-variation | | Homepage Link |
| SNPnexus | SNPnexus: Assessing the functional relevance of genetic variation to facilitate the promise of precision medicine | | Homepage Link |
| dbscSNV | In silico prediction of splice-altering single nucleotide variants in the human genome | | Homepage Link |
| SNAP2 | Better prediction of functional effects for sequence variants | | Homepage Link |
| SNPeffect | SNPeffect 4.0: On-line prediction of molecular and structural effects of protein-coding variants | | Homepage Link |
| wInterVar | InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines | | Homepage Link |
| FoldAmyloid | FoldAmyloid: A method of prediction of amyloidogenic regions from protein sequence | | Homepage Link |
| CUPSAT | Computational modeling of protein mutant stability: Analysis and optimization of statistical potentials and structural features reveal insights into prediction model development | | Homepage Link |
| OncodriveCLUST | OncodriveCLUST: Exploiting the positional clustering of somatic mutations to identify cancer genes | | Homepage Link |
| PopMuSic | PoPMuSiC 2.1: A web server for the estimation of protein stability changes upon mutation and sequence optimality | | Homepage Link |
| BeAtMuSiC | BeAtMuSiC: Prediction of changes in protein-protein binding affinity on mutations. | | Homepage Link |
| SNPs3D | Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines | | Homepage Link |
| PASTA2 | PASTA 2.0: An improved server for protein aggregation prediction | | Homepage Link |
| AGGRESCAN3D | AGGRESCAN3D (A3D): Server for prediction of aggregation properties of protein structures | | Homepage Link |
| Zyggregator | The Zyggregator method for predicting protein aggregation propensities | | Homepage Link |
| fitCons | A method for calculating probabilities of fitness consequences for point mutations across the human genome | | Homepage Link |
| Variant Tools | Reproducible simulations of realistic samples for next-generation sequencing studies using variant simulation tools | | Homepage Link |
| VAAST 2 | VAAST 2.0: Improved variant classification and disease-gene identification using a conservation-controlled amino acid substitution matrix | | Homepage Link |
| KGGSeq | Cepip: Context-dependent epigenomic weighting for prioritization of regulatory variants and disease-associated genes | | Homepage Link |
| KvSNP | Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity | | Homepage Link |
| MAPP | Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity | | Homepage Link |
| Exomiser | Next-generation diagnostics and disease-gene discovery with the Exomiser | | Homepage Link |
| PON_P2 | PON-P2: Prediction method for fast and reliable identification of harmful variants | | Homepage Link |
| CoMEt | CoMEt: A statistical approach to identify combinations of mutually exclusive alterations in cancer | | Homepage Link |
| Personal Genome Project | Privacy risks from genomic data-sharing beacons | | Homepage Link |
| SiPhy | Identifying novel constrained elements by exploiting biased substitution patterns | | Homepage Link |
| nsSNPAnalyzer | nsSNPAnalyzer: Identifying disease-associated nonsynonymous single nucleotide polymorphisms | | Homepage Link |
| DDIG_in | Investigating DNA-, RNA-, and protein-based features as a means to discriminate pathogenic synonymous variants | | Homepage Link |
| PMUT | PMut: A web-based tool for the annotation of pathological variants on proteins, 2017 update | | Homepage Link |
| Phen-Gen | Phen-gen: Combining phenotype and genotype to analyze rare disorders | | Homepage Link |
| DFIRE/DDNA2 | An all-atom knowledge-based energy function for protein-DNA threading, docking decoy discrimination, and prediction of transcription-factor binding profiles | | Homepage Link |
| GenoCanyon | A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data | | Homepage Link |
| PantherPSEP | PANTHER-PSEP: Predicting disease-causing genetic variants using position-specific evolutionary preservation | | Homepage Link |
| VarWalker | VarWalker: Personalized Mutation Network Analysis of Putative Cancer Genes from Next-Generation Sequencing Data | | Homepage Link |
| DIDA | DIDA: A curated and annotated digenic diseases database | | Homepage Link |
| INPS-3D | INPS-MD: A web server to predict stability of protein variants from sequence and structure | | Homepage Link |
| SIFT-indel | SIFT Indel: Predictions for the Functional Effects of Amino Acid Insertions/Deletions in Proteins | | Homepage Link |
| VEST-indel | Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST-Indel) | | Homepage Link |
| MuStab | Sequence feature-based prediction of protein stability changes upon amino acid substitutions | | Homepage Link |
| SDM | SDM: A server for predicting effects of mutations on protein stability | | Homepage Link |
| EGAD | Energy functions for protein design: Adjustment with protein-protein complex affinities, models for the unfolded state, and negative design of solubility and specificity | | Homepage Link |
| Skippy | Genomic features defining exonic variants that modulate splicing | | Homepage Link |
| NeEMO | NeEMO: A method using residue interaction networks to improve prediction of protein stability upon mutation | | Homepage Link |
| PAGE | Prediction of aggregation rate and aggregation-prone segments in polypeptide sequences | | Homepage Link |
| Scpred | SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences | | Homepage Link |
| TransComp | Automated prediction of protein association rate constants | | Homepage Link |
| CanDrA | CanDrA: Cancer-specific driver missense mutation annotation with optimized features | | Homepage Link |
| CAROL | A combined functional annotation score for non-synonymous variants | | Homepage Link |
| CoVEC | Predicting the functional consequences of non-synonymous DNA sequence variants - evaluation of bioinformatics tools and development of a consensus strategy | | Homepage Link |
| DBD-Hunter | DBD-Hunter: A knowledge-based method for the prediction of DNA-protein interactions | | Homepage Link |
| transFIC | Improving the prediction of the functional impact of cancer mutations by baseline tolerance transformation | | Homepage Link |
| VaRank | VaRank: A simple and powerful tool for ranking genetic variants | | Homepage Link |
| VariBench | VariBench: A Benchmark Database for Variations | | Homepage Link |
| AUTO-MUTE | AUTO-MUTE: Web-based tools for predicting stability changes in proteins due to single amino acid replacements | | Homepage Link |
| CanPredict | CanPredict: A computational tool for predicting cancer-associated missense mutations | | Homepage Link |
| is-rSNP | is-rSNP: A novel technique for in silico regulatory SNP detection | | Homepage Link |
| PON_mt_tRNA | PON-mt-tRNA: A multifactorial probability-based method for classification of mitochondrial tRNA variations | | Homepage Link |
| UMD-predictor | UMD-predictor, a new prediction tool for nucleotide substitution pathogenicity - Application to four genes: FBN1, FBN2, TGFBR1, and TGFBR2 | | Homepage Link |
| DbWGFP | DbWGFP: A database and web server of human whole-genome single nucleotide variants and their functional predictions | | Homepage Link |
| ExomeSuite | ExomeSuite: Whole exome sequence variant filtering tool for rapid identification of putative disease causing SNVs/indels | | Homepage Link |
| HANSA | Hansa: An automated method for discriminating disease and neutral human nsSNPs | | Homepage Link |
| WGSA | WGSA: An annotation pipeline for human genome sequencing studies | | Homepage Link |
| LS-SNP/PDB | LS-SNP/PDB: Annotated non-synonymous SNPs mapped to Protein Data Bank structures | | Homepage Link |
| M6AVar | M6AVar: A database of functional variants involved in m6A modification | | Homepage Link |
| SilVA | Identification of deleterious synonymous variants in human genomes | | Homepage Link |
| ASP/ASPex | The use of orthologous sequences to predict the impact of amino acid substitutions on protein function | | Homepage Link |
| FunSAV | FunSAV: Predicting the functional effect of single amino acid variants using a two-stage random forest model | | Homepage Link |
| VarCards | VarCards: An integrated genetic and clinical database for coding variants in the human genome | | Homepage Link |
| PROFcon | PROFcon: Novel prediction of long-range contacts | | Homepage Link |
| SPF_Cancer | A new disease-specific machine learning approach for the prediction of cancer-causing missense variants | | Homepage Link |
| VariSNP | VariSNP, A benchmark database for variations from dbSNP | | Homepage Link |
| CoDP | CoDP: Predicting the impact of unclassified genetic variants in MSH6 by the combination of different properties of the protein | | Homepage Link |
| IMHOTEP | IMHOTEP-a composite score integrating popular tools for predicting the functional consequences of non-synonymous sequence variants | | Homepage Link |
| MultiMutate | Four-body scoring function for mutagenesis | | Homepage Link |
| ProA | Identification of properties important to protein aggregation using feature selection | | Homepage Link |
| Scide | SCide: Identification of stabilization centers in proteins | | Homepage Link |
| ActiveDriverDB | ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins | | Homepage Link |
| AVIA | AVIA v2.0: Annotation, visualization and impact analysis of genomic variants and genes | | Homepage Link |
| ChroMoS | ChroMoS: An integrated web tool for SNP classification, prioritization and functional interpretation | | Homepage Link |
| Cscape | CScape: A tool for predicting oncogenic single-point mutations in the cancer genome | | Homepage Link |
| EFIN | EFIN: Predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome | | Homepage Link |
| PERCH | PERCH: A Unified Framework for Disease Gene Prioritization | | Homepage Link |
| QueryOR | QueryOR: A comprehensive web platform for genetic variant analysis and prioritization | | Homepage Link |
| SNPdbe | Snpdbe: Constructing an nsSnp functional impacts database | | Homepage Link |
| SPOT | SPOT: A web-based tool for using biological databases to prioritize SNPs after a genome-wide association study | | Homepage Link |
| VarMod | VarMod: Modelling the functional effects of non-synonymous variants | | Homepage Link |
| FOLD-RATE | FOLD-RATE: Prediction of protein folding rates from amino acid sequence | | Homepage Link |
| CoagVDb | CoagVDb: A comprehensive database for coagulation factors and their associated SAPs | | Homepage Link |
| HMMvar | Predicting the combined effect of multiple genetic variants | | Homepage Link |
| PON_Diso | Performance of Protein Disorder Prediction Programs on Amino Acid Substitutions | | Homepage Link |
| SAAPdap / SAAPred | The SAAPdb web resource: A large-scale structural analysis of mutant proteins | | Homepage Link |
| ALoFT | Using ALoFT to determine the impact of putative loss-of-function variants in protein-coding genes | | Homepage Link |
| ASSEDA | Automated splicing mutation analysis by information theory | | Homepage Link |
| KD4i | A comprehensive study of small non-frameshift insertions/deletions in proteins and prediction of their phenotypic effects by a machine learning method (KD4i) | | Homepage Link |
| SAPred | Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP) | | Homepage Link |
| MuX-48 | Robust prediction of mutation-induced protein stability change by property encoding of amino acids | | Homepage Link |
| MuX-S | Robust prediction of mutation-induced protein stability change by property encoding of amino acids | | Homepage Link |
| SInBaD | Exploring functional variant discovery in non-coding regions with SInBaD | | Homepage Link |
| ClinLabGeneticist | ClinLabGeneticist: A tool for clinical management of genetic variants from whole exome sequencing in clinical genetic laboratories | | Homepage Link |
| MAPPIN | MAPPIN: A method for annotating, predicting pathogenicity and mode of inheritance for nonsynonymous variants | | Homepage Link |
| NETdiseaseSNP | Prediction of Disease Causing Non-Synonymous SNPs by the Artificial Neural Network Predictor NetDiseaseSNP | | Homepage Link |
| Parepro | Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines | | Homepage Link |
| SDS | SDS, a structural disruption score for assessment of missense variant deleteriousness | | Homepage Link |
| VaProS | VaProS: a database-integration approach for protein/genome information retrieval | | Homepage Link |
| HOPE | HOPE: A homotopy optimization method for protein structure prediction | | Homepage Link |
| K-FOLD | K-Fold: A tool for the prediction of the protein folding kinetic order and rate | | Homepage Link |
| SuRFR | SuRFing the genomics wave: An R package for prioritising SNPs by functionality | | Homepage Link |
| ClinPred | ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants | | Homepage Link |
| MAESTRO | MAESTRO--multi agent stability prediction upon point mutations | | Homepage Link |
| MODICT | Convert your favorite protein modeling program into a mutation predictor: "MODICT" | | Homepage Link |
| MutPred2 | MutPred2: inferring the molecular and phenotypic impact of amino acid | | Homepage Link |
| NECTAR | NECTAR: A database of codon-centric missense variant annotations | | Homepage Link |
| SeqVItA | SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data | | Homepage Link |
| Syntool | Syntool: A novel region-based intolerance score to single nucleotide substitution for synonymous mutations predictions based on 123,136 individuals | | Homepage Link |
| Variant Ranker | Variant Ranker: A web-tool to rank genomic data according to functional significance | | Homepage Link |
| VEST | Identifying Mendelian disease genes with the Variant Effect Scoring Tool | | Homepage Link |
| WS-SNPs&GO | WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation | | Homepage Link |
| FOLDEF (core of FOLDX) | | | Homepage Link |
| PPT-DB | PPT-DB: The protein property prediction and testing database | | Homepage Link |