Variant Impact Predictor Database (VIPdb)

Genome sequencing identifies a vast number of genetic variants. Predicting these variants’ molecular and clinical effects is one of the preeminent challenges in genetics. Accurate prediction of the impact of genetic variants improves our understanding of how genetic information is conveyed to molecular and cellular functions, and is an essential step towards precision medicine. Over one hundred tools and resources have been developed specifically for this purpose. We summarize these tools, along with their characteristics, in the genetic Variant Impact Predictor Database (VIPdb). This database will help researchers and clinicians explore appropriate tools and inform the development of improved methods.

To give an intuitive view of tool usage in the field, we provide a wordle of these tools. The size of each predictor's name is scaled according to the logarithm of its number of 2-year citations. Colors are randomly assigned.
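As a rough illustration of the logarithmic scaling described above, the sketch below maps a citation count to a font size. It is not the code used to produce the VIPdb wordle: the function name, the `min_size` and `scale` constants, the use of `log(1 + n)` to handle zero citations, and the tool names and counts are all assumptions made for this example.

```python
import math


def font_size(citations, min_size=10.0, scale=8.0):
    """Map a citation count to a font size that grows with its logarithm.

    min_size and scale are hypothetical constants chosen for illustration.
    """
    if citations < 0:
        raise ValueError("citation count must be non-negative")
    # log1p(n) = log(1 + n), so a tool with zero citations gets min_size
    # instead of raising a math domain error.
    return min_size + scale * math.log1p(citations)


# Hypothetical tools and citation counts, not taken from VIPdb.
example_counts = {"ToolA": 0, "ToolB": 50, "ToolC": 5000}
for name, count in example_counts.items():
    print(f"{name}: {font_size(count):.1f} pt")
```

With a logarithmic scale, a tool cited a hundred times more often than another is drawn only moderately larger, so heavily cited predictors do not crowd the less cited ones out of the figure.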



CAGI Wordle

A higher-resolution version of the image can be found here [tiff][pdf].
For a version of the figure based on all-time citations, see here [tiff][pdf].
You are welcome to use this image with appropriate citation.

We accept submissions of new resources. If you find that a related method is not included in VIPdb, please let us know at vipdb@compbio.berkeley.edu.

Information on the variant annotation tools included can be found in the table below.

The full database can be downloaded here.

To use the wordle, please cite: Hu Z, Yu C, Furutsuki M, Andreoletti G, Ly M, Hoskins R, Adhikari AN, Brenner SE. 2019. VIPdb, a genetic variant impact predictor database. Hum Mutat. doi:10.1002/humu.23858

Method Title DOI
ActiveDriverDB ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins 10.1093/nar/gkx973
AGGRESCAN AGGRESCAN: A server for the prediction and evaluation of "hot spots" of aggregation in polypeptides 10.1186/1471-2105-8-65
AGGRESCAN3D AGGRESCAN3D (A3D): Server for prediction of aggregation properties of protein structures 10.1093/nar/gkv359
Align-GVGD Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral 10.1136/jmg.2005.033878
ALoFT Using ALoFT to determine the impact of putative loss-of-function variants in protein-coding genes 10.1038/s41467-017-00443-5
ANNOVAR Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR 10.1038/nprot.2015.105
ASP/ASPex The use of orthologous sequences to predict the impact of amino acid substitutions on protein function 10.1371/journal.pgen.1000968
ASSEDA Automated splicing mutation analysis by information theory 10.1002/humu.20151
AUTO-MUTE AUTO-MUTE: Web-based tools for predicting stability changes in proteins due to single amino acid replacements 10.1093/protein/gzq042
AVIA AVIA v2.0: Annotation, visualization and impact analysis of genomic variants and genes 10.1093/bioinformatics/btv200
BeAtMuSiC BeAtMuSiC: Prediction of changes in protein-protein binding affinity on mutations. 10.1093/nar/gkt450
CADD A general framework for estimating the relative pathogenicity of human genetic variants 10.1038/ng.2892
CanDrA CanDrA: Cancer-specific driver missense mutation annotation with optimized features 10.1371/journal.pone.0077945
CanPredict CanPredict: A computational tool for predicting cancer-associated missense mutations 10.1093/nar/gkm405
CAROL A combined functional annotation score for non-synonymous variants 10.1159/000334984
CHASM CHASM and SNVBox: Toolkit for detecting biologically important single nucleotide mutations in cancer 10.1093/bioinformatics/btr357
CHESS A fully-automated event-based variant prioritizing solution to the CAGI5 intellectual disability gene panel challenge 10.1002/humu.23781
ChroMoS ChroMoS: An integrated web tool for SNP classification, prioritization and functional interpretation 10.1093/bioinformatics/btt356
ClinPred ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants 10.1016/j.ajhg.2018.08.005
ClinVar ClinVar: Public archive of interpretations of clinically relevant variants 10.1093/nar/gkv1222
CoDP CoDP: Predicting the impact of unclassified genetic variants in MSH6 by the combination of different properties of the protein 10.1186/1423-0127-20-25
CoMEt CoMEt: A statistical approach to identify combinations of mutually exclusive alterations in cancer 10.1186/s13059-015-0700-7
Condel Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel 10.1016/j.ajhg.2011.03.004
COSMIC COSMIC: The Catalogue Of Somatic Mutations In Cancer 10.1093/nar/gky1015
CoVEC Predicting the functional consequences of non-synonymous DNA sequence variants - evaluation of bioinformatics tools and development of a consensus strategy 10.1016/j.ygeno.2013.06.005
CUPSAT Computational modeling of protein mutant stability: Analysis and optimization of statistical potentials and structural features reveal insights into prediction model development 10.1186/1472-6807-7-54
DANN DANN: A deep learning approach for annotating the pathogenicity of genetic variants 10.1093/bioinformatics/btu703
DBD-Hunter DBD-Hunter: A knowledge-based method for the prediction of DNA-protein interactions 10.1093/nar/gkn332
dbNSFP dbNSFP v3.0: A One-Stop Database of Functional Predictions and Annotations for Human Nonsynonymous and Splice-Site SNVs 10.1002/humu.22932
dbscSNV In silico prediction of splice-altering single nucleotide variants in the human genome 10.1093/nar/gku1206
dbSNP dbSNP: the NCBI database of genetic variation. 10.1093/nar/29.1.308
dbVar dbVar and DGVa: Public archives for genomic structural variation 10.1093/nar/gks1213
DDIG_in Investigating DNA-, RNA-, and protein-based features as a means to discriminate pathogenic synonymous variants 10.1002/humu.23283
DeepSEA Predicting effects of noncoding variants with deep learning-based sequence model 10.1038/nmeth.3547
DFIRE/DDNA2 An all-atom knowledge-based energy function for protein-DNA threading, docking decoy discrimination, and prediction of transcription-factor binding profiles 10.1002/prot.22384
Dmutant Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction 10.1110/ps.0217002
DUET DUET: A server for predicting effects of mutations on protein stability using an integrated computational approach 10.1093/nar/gku411
EFIN EFIN: Predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome 10.1186/1471-2164-15-455
EGAD Energy functions for protein design: Adjustment with protein-protein complex affinities, models for the unfolded state, and negative design of solubility and specificity 10.1016/j.jmb.2004.12.019
EIGEN A spectral approach integrating functional genomic annotations for coding and noncoding variants 10.1038/ng.3477
Eris Eris: An automated estimator of protein stability 10.1038/nmeth0607-466
EVmutation Mutation effects predicted from sequence co-variation 10.1038/nbt.3769
ExPecto Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. 10.1038/s41588-018-0160-6
Exome Variant Server (EVS) A map of human genome variation from population-scale sequencing 10.1038/nature09534
Exomiser Next-generation diagnostics and disease-gene discovery with the Exomiser 10.1038/nprot.2015.124
FATHMM Ranking non-synonymous single nucleotide polymorphisms based on disease concepts 10.1186/1479-7364-8-11
fathmm-MKL An integrative approach to predicting the functional effects of non-coding and coding sequence variation 10.1093/bioinformatics/btv009
FIS Predicting the functional impact of protein mutations: Application to cancer genomics 10.1093/nar/gkr407
fitCons A method for calculating probabilities of fitness consequences for point mutations across the human genome 10.1038/ng.3196
FOLD-RATE FOLD-RATE: Prediction of protein folding rates from amino acid sequence 10.1093/nar/gkl043
FoldAmyloid FoldAmyloid: A method of prediction of amyloidogenic regions from protein sequence 10.1093/bioinformatics/btp691
FOLDEF (core of FOLDX) The FoldX web server: an online force field 10.1093/nar/gki387
FunSAV FunSAV: Predicting the functional effect of single amino acid variants using a two-stage random forest model 10.1371/journal.pone.0043847
GeneTests GeneTests-GeneClinics: Genetic testing information for a growing audience 10.1002/humu.10069
GenoCanyon A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data 10.1038/srep10576
GERP++ Identifying a high fraction of the human genome to be under selective constraint using GERP++ 10.1371/journal.pcbi.1001025
GWAVA Functional annotation of noncoding sequence variants 10.1038/nmeth.2832
HANSA Hansa: An automated method for discriminating disease and neutral human nsSNPs 10.1002/humu.21642
HGMD The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies 10.1007/s00439-017-1779-6
HMMvar Predicting the combined effect of multiple genetic variants 10.1186/s40246-015-0040-4
HOPE HOPE: A homotopy optimization method for protein structure prediction 10.1089/cmb.2005.12.1275
Human Splicing Finder Human Splicing Finder: An online bioinformatics tool to predict splicing signals 10.1093/nar/gkp215
IMHOTEP IMHOTEP-a composite score integrating popular tools for predicting the functional consequences of non-synonymous sequence variants 10.1093/nar/gkw886
INPS-3D INPS-MD: A web server to predict stability of protein variants from sequence and structure 10.1093/bioinformatics/btw192
INSIGHT Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database 10.1038/ng.2854
is-rSNP is-rSNP: A novel technique for in silico regulatory SNP detection 10.1093/bioinformatics/btq378
K-FOLD K-Fold: A tool for the prediction of the protein folding kinetic order and rate 10.1093/bioinformatics/btl610
KD4i A comprehensive study of small non-frameshift insertions/deletions in proteins and prediction of their phenotypic effects by a machine learning method (KD4i) 10.1186/1471-2105-15-111
KGGSeq Cepip: Context-dependent epigenomic weighting for prioritization of regulatory variants and disease-associated genes 10.1186/s13059-017-1177-3
KvSNP Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity 10.1101/gr.3804205
LocTree LocTree3 prediction of localization 10.1093/nar/gku396
LOFTEE Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes 10.1101/531210
LOVD/LSDB LOVD v.2.0: The next generation in gene variant databases 10.1002/humu.21438
LS-SNP/PDB LS-SNP/PDB: Annotated non-synonymous SNPs mapped to Protein Data Bank structures 10.1093/bioinformatics/btp242
MAESTRO MAESTRO--multi agent stability prediction upon point mutations 10.1186/s12859-015-0548-6
MAPP Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity 10.1101/gr.3804205
MAPPIN MAPPIN: A method for annotating, predicting pathogenicity and mode of inheritance for nonsynonymous variants 10.1093/nar/gkx730
MaxEnt Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals 10.1089/1066527041410418
MetaLR Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies 10.1093/hmg/ddu733
MetaSVM Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies 10.1093/hmg/ddu733
MMSplice MMSplice: modular modeling improves the predictions of genetic variant effects on splicing 10.1186/s13059-019-1653-z
MultiMutate Four-body scoring function for mutagenesis 10.1093/bioinformatics/btm481
Mupro Prediction of protein stability changes for single-site mutations using support vector machines 10.1002/prot.20810
MuSiC MuSiC: Identifying mutational significance in cancer genomes 10.1101/gr.134635.111
MuStab Sequence feature-based prediction of protein stability changes upon amino acid substitutions 10.1186/1471-2164-11-s2-s5
MutationAssessor Predicting the functional impact of protein mutations: Application to cancer genomics 10.1093/nar/gkr407
MutationTaster MutationTaster2: Mutation prediction for the deep-sequencing age 10.1038/nmeth.2890
MutPred Splice MutPred Splice: Machine learning-based prediction of exonic variants that disrupt splicing 10.1186/gb-2014-15-1-r19
MutPred-LOF When loss-of-function is loss of function: Assessing mutational signatures and impact of loss-of-function genetic variants 10.1093/bioinformatics/btx272
MutPred2 Automated inference of molecular mechanisms of disease from amino acid substitutions 10.1093/bioinformatics/btp528
MutPred2 MutPred2: inferring the molecular and phenotypic impact of amino acid 10.1101/134981
MutSigCV Mutational heterogeneity in cancer and the search for new cancer-associated genes 10.1038/nature12213
MuX-48 Robust prediction of mutation-induced protein stability change by property encoding of amino acids 10.1093/protein/gzn063
MuX-S Robust prediction of mutation-induced protein stability change by property encoding of amino acids 10.1093/protein/gzn063
NeEMO NeEMO: A method using residue interaction networks to improve prediction of protein stability upon mutation 10.1186/1471-2164-15-s4-s7
NETdiseaseSNP Prediction of Disease Causing Non-Synonymous SNPs by the Artificial Neural Network Predictor NetDiseaseSNP 10.1371/journal.pone.0068370
nsSNPAnalyzer nsSNPAnalyzer: Identifying disease-associated nonsynonymous single nucleotide polymorphisms 10.1093/nar/gki372
OMIM Online Mendelian Inheritance in Man (OMIM) 10.1002/(SICI)1098-1004(200001)15
OncodriveCLUST OncodriveCLUST: Exploiting the positional clustering of somatic mutations to identify cancer genes 10.1093/bioinformatics/btt395
PAGE Prediction of aggregation rate and aggregation-prone segments in polypeptide sequences 10.1110/ps.051471205
Panther PANTHER version 11: Expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements 10.1093/nar/gkw1138
PantherPSEP PANTHER-PSEP: Predicting disease-causing genetic variants using position-specific evolutionary preservation 10.1093/bioinformatics/btw222
Parepro Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines 10.1186/1471-2105-8-450
PASTA2 PASTA 2.0: An improved server for protein aggregation prediction 10.1093/nar/gku399
PEPSI Using secondary structure to predict the effects of genetic variants on alternative splicing 10.1002/humu.23790
Personal Genome Project Privacy risks from genomic data-sharing beacons 10.1016/j.ajhg.2015.09.010
PharmGKB PharmGKB: The pharmacogenomics knowledge base 10.1007/978-1-62703-435-7_20
phastCons Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes 10.1101/gr.3715005
PhD-SNP Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information 10.1093/bioinformatics/btl423
Phen-Gen Phen-gen: Combining phenotype and genotype to analyze rare disorders 10.1038/nmeth.3046
phyloP Detection of nonneutral substitution rates on mammalian phylogenies 10.1101/gr.097857.109
PMUT PMut: A web-based tool for the annotation of pathological variants on proteins, 2017 update 10.1093/nar/gkx313
PolyPhen A method and server for predicting damaging missense mutations 10.1038/nmeth0410-248
PON_Diso Performance of Protein Disorder Prediction Programs on Amino Acid Substitutions 10.1002/humu.22564
PON_mt_tRNA PON-mt-tRNA: A multifactorial probability-based method for classification of mitochondrial tRNA variations 10.1093/nar/gkw046
PON-P2 PON-P2: Prediction method for fast and reliable identification of harmful variants 10.1371/journal.pone.0117380
PopMuSic PoPMuSiC 2.1: A web server for the estimation of protein stability changes upon mutation and sequence optimality 10.1186/1471-2105-12-151
PPT-DB PPT-DB: The protein property prediction and testing database 10.1093/nar/gkm800
PredictSNP2 PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions. 10.1371/journal.pcbi.1004962
ProA Identification of properties important to protein aggregation using feature selection 10.1186/1471-2105-14-314
PROFcon PROFcon: Novel prediction of long-range contacts 10.1093/bioinformatics/bti454
PROVEAN PROVEAN web server: A tool to predict the functional effect of amino acid substitutions and indels 10.1093/bioinformatics/btv195
QueryOR QueryOR: A comprehensive web platform for genetic variant analysis and prioritization 10.1186/s12859-017-1654-4
REMM A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease 10.1016/j.ajhg.2016.07.005
REVEL REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants 10.1016/j.ajhg.2016.08.016
SAAPdap/SAAPred The SAAPdb web resource: A large-scale structural analysis of mutant proteins 10.1002/humu.20898
SAPred Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP) 10.1093/bioinformatics/btm119
Scide SCide: Identification of stabilization centers in proteins 10.1093/bioinformatics/btg110
Scpred SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences 10.1186/1471-2105-9-226
SDM SDM: A server for predicting effects of mutations on protein stability 10.1093/nar/gkx439
SDS SDS, a structural disruption score for assessment of missense variant deleteriousness 10.3389/fgene.2014.00082
SeqVItA SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data 10.3389/fgene.2018.00537
SIFT SIFT web server: Predicting effects of amino acid substitutions on proteins 10.1093/nar/gks539
SIFT-indel SIFT Indel: Predictions for the Functional Effects of Amino Acid Insertions/Deletions in Proteins 10.1371/journal.pone.0077940
SignalP SignalP 4.0: Discriminating signal peptides from transmembrane regions 10.1038/nmeth.1701
SilVA Identification of deleterious synonymous variants in human genomes 10.1093/bioinformatics/btt308
SInBaD Exploring functional variant discovery in non-coding regions with SInBaD 10.1093/nar/gks800
SiPhy Identifying novel constrained elements by exploiting biased substitution patterns 10.1093/bioinformatics/btp190
Skippy Genomic features defining exonic variants that modulate splicing 10.1186/gb-2010-11-2-r20
SNAP2 Better prediction of functional effects for sequence variants 10.1186/1471-2164-16-s8-s1
SNPedia SNPedia: A wiki supporting personal genome annotation, interpretation and analysis 10.1093/nar/gkr798
SNPdbe SNPdbe: Constructing an nsSNP functional impacts database 10.1093/bioinformatics/btr705
SnpEff A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3 10.4161/fly.19695
SNPeffect SNPeffect 4.0: On-line prediction of molecular and structural effects of protein-coding variants 10.1093/nar/gkr996
SNPinfo/FuncPred SNPinfo: Integrating GWAS and candidate gene information into functional SNP selection for genetic association studies 10.1093/nar/gkp290
SNPnexus SNPnexus: Assessing the functional relevance of genetic variation to facilitate the promise of precision medicine 10.1093/nar/gky399
SNPs3D Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines 10.1186/1471-2105-8-450
SPANR/SPIDEX The human splicing code reveals new insights into the genetic determinants of disease. 10.1126/science.1254806
SPF_Cancer A new disease-specific machine learning approach for the prediction of cancer-causing missense variants 10.1016/j.ygeno.2011.06.010
SuRFR SuRFing the genomics wave: An R package for prioritising SNPs by functionality 10.1186/s13073-014-0079-1
SVScore SVScore: an impact prediction tool for structural variation 10.1093/bioinformatics/btw789
Syntool Syntool: A novel region-based intolerance score to single nucleotide substitution for synonymous mutations predictions based on 123,136 individuals 10.1155/2017/5096208
TANGO Protein aggregation and amyloidosis: Confusion of the kinds? 10.1016/j.sbi.2006.01.011
TransComp Automated prediction of protein association rate constants 10.1016/j.str.2011.10.015
transFIC Improving the prediction of the functional impact of cancer mutations by baseline tolerance transformation 10.1186/gm390
UMD-predictor UMD-predictor, a new prediction tool for nucleotide substitution pathogenicity - Application to four genes: FBN1, FBN2, TGFBR1, and TGFBR2 10.1002/humu.20970
VAAST VAAST 2.0: Improved variant classification and disease-gene identification using a conservation-controlled amino acid substitution matrix 10.1002/gepi.21743
Variant Tools Reproducible simulations of realistic samples for next-generation sequencing studies using variant simulation tools 10.1002/gepi.21867
VariBench VariBench: A Benchmark Database for Variations 10.1002/humu.22204
VariSNP VariSNP, A benchmark database for variations from dbSNP 10.1002/humu.22727
VarMod VarMod: Modelling the functional effects of non-synonymous variants 10.1093/nar/gku483
VarWalker VarWalker: Personalized Mutation Network Analysis of Putative Cancer Genes from Next-Generation Sequencing Data 10.1371/journal.pcbi.1003460
VEP The Ensembl Variant Effect Predictor 10.1186/s13059-016-0974-4
VEST Identifying Mendelian disease genes with the Variant Effect Scoring Tool 10.1186/1471-2164-14-s3-s3
VEST-indel Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST-Indel) 10.1002/humu.22911
Waltz Exploring the sequence determinants of amyloid structure using position-specific scoring matrices 10.1038/nmeth.1432
WGSA WGSA: An annotation pipeline for human genome sequencing studies 10.1136/jmedgenet-2015-103423
wInterVar InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines 10.1016/j.ajhg.2017.01.004
WoLF-PSORT WoLF PSORT: Protein localization predictor 10.1093/nar/gkm259
WS-SNPs&GO WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation 10.1186/1471-2164-14-s3-s6
Zyggregator The Zyggregator method for predicting protein aggregation propensities 10.1039/b706784b
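
Each row of the table above ends with a DOI, so the publication for any tool can be looked up through the public resolver at https://doi.org/. The sketch below shows one way to pull the trailing DOI off a row of text and build that link. The function names and the regular expression are assumptions made for this example and are not part of VIPdb; the method name and title are left together because the plain-text rows do not mark where one ends and the other begins.

```python
import re

# A DOI here is taken to be the last whitespace-delimited token of a row,
# starting with "10.", a numeric registrant code, and a slash.
DOI_AT_END = re.compile(r"\s(10\.\d{4,9}/\S+)$")


def split_row(row):
    """Split a table row into (method and title text, DOI or None)."""
    stripped = row.strip()
    match = DOI_AT_END.search(stripped)
    if match is None:
        # Header rows or malformed rows carry no DOI.
        return stripped, None
    return stripped[:match.start()].strip(), match.group(1)


def doi_url(doi):
    """Build a resolver link for a DOI."""
    return "https://doi.org/" + doi


row = ("CADD A general framework for estimating the relative "
       "pathogenicity of human genetic variants 10.1038/ng.2892")
text, doi = split_row(row)
print(text)
print(doi_url(doi))
```

Rows whose DOI is incomplete in the table will still parse, but the resulting link may not resolve.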