Variant impact prediction methods

Each name is scaled according to the logarithm of the number of citations. Colors are randomly assigned.
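The scaling rule described above can be sketched in a few lines of Python. This is only an illustration, not the script used to produce the figure: the page does not give the font-size range or the exact mapping, so the linear interpolation in log space, the `min_size`/`max_size` defaults, the function names, and the citation counts below are all assumptions.

```python
import math
import random


def font_sizes(citations, min_size=10.0, max_size=72.0):
    """Map citation counts to font sizes, linear in log(count).

    Assumes every count is >= 1 (log of zero is undefined).
    The size range is a hypothetical choice, not taken from the figure.
    """
    logs = {name: math.log(count) for name, count in citations.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for name, value in logs.items():
        # If all counts are equal, give every name the maximum size.
        frac = (value - lo) / span if span else 1.0
        sizes[name] = min_size + frac * (max_size - min_size)
    return sizes


def random_colors(names, seed=None):
    """Assign each name a random RGB color as a hex string."""
    rng = random.Random(seed)
    return {name: "#{:06x}".format(rng.randrange(0x1000000)) for name in names}


# Hypothetical citation counts, for illustration only.
citations = {"ToolA": 10, "ToolB": 100, "ToolC": 1000}
sizes = font_sizes(citations)
colors = random_colors(citations, seed=0)

for name in citations:
    print(name, round(sizes[name], 1), colors[name])
```

With logarithmic scaling, a tenfold increase in citations adds a fixed amount to the font size, so heavily cited methods do not crowd out the rest of the cloud.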

CAGI Wordle

Information on the variant annotation tools included can be found in the table below.

A higher-resolution version of the image can be found here.
You are welcome to use this image with appropriate citation.

If you know of any methods we did not include, please let us know.

Image created by: Melissa Kim Ly and Mabel Furutsuki

To cite the figure, please cite: Hoskins RA, Repo S, Barsky D, Andreoletti G, Moult J, Brenner SE. 2017. Reports from CAGI: The Critical Assessment of Genome Interpretation. Hum Mutat 38:1072-1084. doi:10.1002/humu.23290

Method Citation (MLA) DOI
AGGRESCAN Conchillo-Solé, Oscar, et al. "AGGRESCAN: a server for the prediction and evaluation of "hot spots" of aggregation in polypeptides." BMC bioinformatics 8.1 (2007): 1. 10.1186/1471-2105-8-65
Align-GVGD Tavtigian, Sean V., et al. "Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral." Journal of medical genetics 43.4 (2006): 295-305. 10.1136/jmg.2005.033878
ASP and ASPex Marini, Nicholas J., Paul D. Thomas, and Jasper Rine. "The use of orthologous sequences to predict the impact of amino acid substitutions on protein function." PLoS Genet 6.5 (2010): e1000968. 10.1371/journal.pgen.1000968
AutoMute Masso, Majid, and Iosif I. Vaisman. "AUTO-MUTE: web-based tools for predicting stability changes in proteins due to single amino acid replacements." Protein Engineering Design and Selection 23.8 (2010): 683-687. 10.1093/protein/gzq042
BeAtMuSiC Dehouck, Yves, et al. "BeAtMuSiC: prediction of changes in protein-protein binding affinity on mutations." Nucleic acids research 41.W1 (2013): W333-W339. 10.1093/nar/gkt450
CADD Kircher, Martin, et al. "A general framework for estimating the relative pathogenicity of human genetic variants." Nature genetics 46.3 (2014): 310. 10.1038/ng.2892
CanPredict Kaminker, Joshua S., et al. "CanPredict: a computational tool for predicting cancer-associated missense mutations." Nucleic acids research 35.suppl 2 (2007): W595-W598. 10.1093/nar/gkm405
CHASM Capriotti, Emidio, Remo Calabrese, and Rita Casadio. "Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information." Bioinformatics 22.22 (2006): 2729-2734. 10.1093/bioinformatics/btl423
CoDP Terui, Hiroko, et al. "CoDP: predicting the impact of unclassified genetic variants in MSH6 by the combination of different properties of the protein." Journal of biomedical science 20.1 (2013): 1. 10.1186/1423-0127-20-25
CoVEC Frousios, Kimon, et al. "Predicting the functional consequences of non-synonymous DNA sequence variants—evaluation of bioinformatics tools and development of a consensus strategy." Genomics 102.4 (2013): 223-228. 10.1016/j.ygeno.2013.06.005
CUPSAT Parthiban, Vijaya, M. Michael Gromiha, and Dietmar Schomburg. "CUPSAT: prediction of protein stability upon point mutations." Nucleic acids research 34.suppl 2 (2006): W239-W242. 10.1093/nar/gkl190
DBD-Hunter Gao, Mu, and Jeffrey Skolnick. "DBD-Hunter: a knowledge-based method for the prediction of DNA-protein interactions." Nucleic acids research 36.12 (2008): 3978-3992. 10.1093/nar/gkn332
DFIRE Zhang, Chi, et al. "A knowledge-based energy function for protein-ligand, protein-protein, and protein-DNA complexes." Journal of medicinal chemistry 48.7 (2005): 2325-2335. 10.1021/jm049314d
Dmutant Zhou, Hongyi, and Yaoqi Zhou. "Distance‐scaled, finite ideal‐gas reference state improves structure‐derived potentials of mean force for structure selection and stability prediction." Protein science 11.11 (2002): 2714-2726. 10.1110/ps.0217002
EGAD Pokala, Navin, and Tracy M. Handel. "Energy functions for protein design: adjustment with protein-protein complex affinities, models for the unfolded state, and negative design of solubility and specificity." Journal of molecular biology 347.1 (2005): 203-227. 10.1016/j.jmb.2004.12.019
EIGEN Ionita-Laza, Iuliana, et al. "A spectral approach integrating functional genomic annotations for coding and noncoding variants." Nature genetics 48.2 (2016): 214-220. 10.1038/ng.3477
Eris Yin, Shuangye, Feng Ding, and Nikolay V. Dokholyan. "Eris: an automated estimator of protein stability." Nature methods 4.6 (2007): 466-467. 10.1038/nmeth0607-466
Exomiser Smedley, Damian, et al. "Next-generation diagnostics and disease-gene discovery with the Exomiser." Nature protocols 10.12 (2015): 2004-2015. 10.1038/nprot.2015.124
FATHmm Shihab, Hashem A., et al. "Predicting the functional consequences of cancer-associated amino acid substitutions." Bioinformatics (2013): btt182. 10.1093/bioinformatics/btt182
FIS Reva, Boris, Yevgeniy Antipin, and Chris Sander. "Predicting the functional impact of protein mutations: application to cancer genomics." Nucleic acids research (2011): gkr407. 10.1093/nar/gkr407
FOLD-RATE Gromiha, M. Michael, A. Mary Thangakani, and Samuel Selvaraj. "FOLD-RATE: prediction of protein folding rates from amino acid sequence." Nucleic acids research 34.suppl 2 (2006): W70-W74. 10.1093/nar/gkl043
FoldAmyloid Garbuzynskiy, Sergiy O., Michail Yu Lobanov, and Oxana V. Galzitskaya. "FoldAmyloid: a method of prediction of amyloidogenic regions from protein sequence." Bioinformatics 26.3 (2010): 326-332. 10.1093/bioinformatics/btp691
FOLDEF Guerois, Raphael, Jens Erik Nielsen, and Luis Serrano. "Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations." Journal of molecular biology 320.2 (2002): 369-387. 10.1016/S0022-2836(02)00442-4
FoldX Schymkowitz, Joost, et al. "The FoldX web server: an online force field." Nucleic acids research 33.suppl 2 (2005): W382-W388. 10.1093/nar/gki387
FunSAV Wang, Mingjun, et al. "FunSAV: predicting the functional effect of single amino acid variants using a two-stage random forest model." PloS one 7.8 (2012): e43847. 10.1371/journal.pone.0043847
GWAVA Ritchie, Graham RS, et al. "Functional annotation of noncoding sequence variants." Nature methods 11.3 (2014): 294-296. 10.1038/nmeth.2832
HOPE Dunlavy, Daniel M., et al. "HOPE: A homotopy optimization method for protein structure prediction." Journal of Computational Biology 12.10 (2005): 1275-1288. 10.1089/cmb.2005.12.1275
K-Fold Capriotti, Emidio, and Rita Casadio. "K-Fold: a tool for the prediction of the protein folding kinetic order and rate." Bioinformatics 23.3 (2007): 385-386. 10.1093/bioinformatics/btl610
KGGSeq Li, Miao-Xin, et al. "A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases." Nucleic acids research (2012): gkr1257. 10.1093/nar/gkr1257
LocTree2 Goldberg, Tatyana, Tobias Hamp, and Burkhard Rost. "LocTree2 predicts localization for all domains of life." Bioinformatics 28.18 (2012): i458-i465. 10.1093/bioinformatics/bts390
LS-SNP/PDB Ryan, Michael, et al. "LS-SNP/PDB: annotated non-synonymous SNPs mapped to Protein Data Bank structures." Bioinformatics 25.11 (2009): 1431-1432. 10.1093/bioinformatics/btp242
MAPP Stone, Eric A., and Arend Sidow. "Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity." Genome research 15.7 (2005): 978-986. 10.1101/gr.3804205
MaxEnt Yeo, Gene, and Christopher B. Burge. "Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals." Journal of Computational Biology 11.2-3 (2004): 377-394. 10.1089/1066527041410418
MultiMutate Deutsch, Chris, and Bala Krishnamoorthy. "Four-body scoring function for mutagenesis." Bioinformatics 23.22 (2007): 3009-3015. 10.1093/bioinformatics/btm481
Mupro Cheng, Jianlin, Arlo Randall, and Pierre Baldi. "Prediction of protein stability changes for single‐site mutations using support vector machines." Proteins: Structure, Function, and Bioinformatics 62.4 (2006): 1125-1132. 10.1002/prot.20810
MutationTaster Schwarz, Jana Marie, et al. "MutationTaster evaluates disease-causing potential of sequence alterations." Nature methods 7.8 (2010): 575-576. 10.1038/nmeth0810-575
MutPred Li, Biao, et al. "Automated inference of molecular mechanisms of disease from amino acid substitutions." Bioinformatics 25.21 (2009): 2744-2750. 10.1093/bioinformatics/btp528
MutPred Splice Mort, Matthew, et al. "MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing." Genome biology 15.1 (2014): 1. 10.1186/gb-2014-15-1-r19
MuX-48 Kang, Shuli, Gang Chen, and Gengfu Xiao. "Robust prediction of mutation-induced protein stability change by property encoding of amino acids." Protein Engineering Design and Selection 22.2 (2009): 75-83. 10.1093/protein/gzn063
MuX-S Kang, Shuli, Gang Chen, and Gengfu Xiao. "Robust prediction of mutation-induced protein stability change by property encoding of amino acids." Protein Engineering Design and Selection 22.2 (2009): 75-83. 10.1093/protein/gzn063
NETdiseaseSNP Johansen, Morten Bo, et al. "Prediction of disease causing non-synonymous SNPs by the Artificial Neural Network Predictor NetDiseaseSNP." PloS one 8.7 (2013): e68370. 10.1371/journal.pone.0068370
nsSNPAnalyzer Bao, Lei, Mi Zhou, and Yan Cui. "nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms." Nucleic acids research 33.suppl 2 (2005): W480-W482. 10.1093/nar/gki372
PAGE Tartaglia, Gian Gaetano, et al. "Prediction of aggregation rate and aggregation‐prone segments in polypeptide sequences." Protein Science 14.10 (2005): 2723-2734. 10.1110/ps.051471205
PantherPSEC Thomas, Paul D., and Anish Kejariwal. "Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: evolutionary evidence for differences in molecular effects." Proceedings of the National Academy of Sciences of the United States of America 101.43 (2004): 15398-15403. 10.1073/pnas.0404380101
Parepro Tian, Jian, et al. "Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines." BMC bioinformatics 8.1 (2007): 1. 10.1186/1471-2105-8-450
PhD-SNP Capriotti, Emidio, et al. "WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation." BMC genomics 14.3 (2013): 1. 10.1186/1471-2164-14-S3-S6
Phen-Gen Javed, Asif, Saloni Agrawal, and Pauline C. Ng. "Phen-Gen: combining phenotype and genotype to analyze rare disorders." Nature methods 11.9 (2014): 935-937. 10.1038/nmeth.3046
PMUT Ferrer-Costa, Carles, et al. "PMUT: a web-based tool for the annotation of pathological mutations on proteins." Bioinformatics 21.14 (2005): 3176-3178. 10.1093/bioinformatics/bti486
PolyPhen Ramensky, Vasily, Peer Bork, and Shamil Sunyaev. "Human non‐synonymous SNPs: server and survey." Nucleic acids research 30.17 (2002): 3894-3900. 10.1093/nar/gkf493
PolyPhen2 Adzhubei, Ivan, Daniel M. Jordan, and Shamil R. Sunyaev. "Predicting functional effect of human missense mutations using PolyPhen‐2." Current protocols in human genetics (2013): 7-20. 10.1002/0471142905.hg0720s76
PON-P Olatubosun, Ayodeji, et al. "PON‐P: Integrated predictor for pathogenicity of missense variants." Human mutation 33.8 (2012): 1166-1174. 10.1002/humu.22102
PoPMuSiC Dehouck, Yves, et al. "Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0." Bioinformatics 25.19 (2009): 2537-2543. 10.1093/bioinformatics/btp445
Predicting Protein Mutant Stability Change (MuStab) Teng, Shaolei, Anand K. Srivastava, and Liangjiang Wang. "Sequence feature-based prediction of protein stability changes upon amino acid substitutions." BMC genomics 11.2 (2010): 1. 10.1186/1471-2164-11-S2-S5
Prediction of Amyloid Structure Aggregation (PASTA2) Walsh, Ian, et al. "PASTA 2.0: an improved server for protein aggregation prediction." Nucleic acids research 42.W1 (2014): W301-W307. 10.1093/nar/gku399
Prediction of long-range Contacts (PROFcon) Punta, Marco, and Burkhard Rost. "PROFcon: novel prediction of long-range contacts." Bioinformatics 21.13 (2005): 2960-2968. 10.1093/bioinformatics/bti454
Protein Aggregation Prediction Server (ProA) Fang, Yaping, et al. "Identification of properties important to protein aggregation using feature selection." BMC bioinformatics 14.1 (2013): 1. 10.1186/1471-2105-14-314
Protein Property Prediction and Testing Database (PPT-DB) Wishart, David S., et al. "PPT-DB: the protein property prediction and testing database." Nucleic acids research 36.suppl 1 (2008): D222-D229. 10.1093/nar/gkm800
PROVEAN Choi, Yongwook, et al. "Predicting the functional effect of amino acid substitutions and indels." PloS one 7.10 (2012): e46688. 10.1371/journal.pone.0046688
Re-ID Shringarpure, Suyash S., and Carlos D. Bustamante. "Privacy risks from genomic data-sharing beacons." The American Journal of Human Genetics 97.5 (2015): 631-646. 10.1016/j.ajhg.2015.09.010
SAAPdb Hurst, Jacob M., et al. "The SAAPdb web resource: A large‐scale structural analysis of mutant proteins." Human mutation 30.4 (2009): 616-624. 10.1002/humu.20898
SAPred Ye, Zhi-Qiang, et al. "Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP)." Bioinformatics 23.12 (2007): 1444-1450. 10.1093/bioinformatics/btm119
Saunders & Baker Saunders, Christopher T., and David Baker. "Recapitulation of protein family divergence using flexible backbone protein design." Journal of molecular biology 346.2 (2005): 631-644. 10.1016/j.jmb.2004.11.062
SCide Dosztanyi, Zsuzsanna, et al. "SCide: identification of stabilization centers in proteins." Bioinformatics 19.7 (2003): 899-900. 10.1093/bioinformatics/btg110
SCPRED Kurgan, Lukasz, Krzysztof Cios, and Ke Chen. "SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences." BMC bioinformatics 9.1 (2008): 1. 10.1186/1471-2105-9-226
SIFT Kumar, Prateek, Steven Henikoff, and Pauline C. Ng. "Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm." Nature protocols 4.7 (2009): 1073-1081. 10.1038/nprot.2009.86
SIFT-indel Hu, Jing, and Pauline C. Ng. "SIFT Indel: predictions for the functional effects of amino acid insertions/deletions in proteins." PLoS One 8.10 (2013): e77940. 10.1371/journal.pone.0077940
SignalP Nielsen, Henrik, et al. "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites." Protein engineering 10.1 (1997): 1-6. 10.1093/protein/10.1.1
SInBaD Lehmann, Kjong-Van, and Ting Chen. "Exploring functional variant discovery in non-coding regions with SInBaD." Nucleic acids research 41.1 (2013): e7-e7. 10.1093/nar/gks800
Site Directed Mutator (SDM) Topham, Christopher M., N. Srinivasan, and Tom L. Blundell. "Prediction of the stability of protein mutants based on structural environment-dependent amino acid substitution and propensity tables." Protein Engineering 10.1 (1997): 7-21. 10.1093/protein/10.1.7
Skippy Woolfe, Adam, James C. Mullikin, and Laura Elnitski. "Genomic features defining exonic variants that modulate splicing." Genome biology 11.2 (2010): 1. 10.1186/gb-2010-11-2-r20
SNAP and SNAP2 Bromberg, Yana, and Burkhard Rost. "SNAP: predict effect of non-synonymous polymorphisms on function." Nucleic acids research 35.11 (2007): 3823-3835. 10.1093/nar/gkm238
SNP Ranking by Function R package (SuRFR) Ryan, Niamh M., et al. "SuRFing the genomics wave: an R package for prioritising SNPs by functionality." Genome medicine 6.10 (2014): 1. 10.1186/s13073-014-0079-1
SNPdbe Schaefer, Christian, et al. "SNPdbe: constructing an nsSNP functional impacts database." Bioinformatics 28.4 (2012): 601-602. 10.1093/bioinformatics/btr705
SNPeffect Reumers, Joke, et al. "SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs." Nucleic acids research 33.suppl 1 (2005): D527-D532. 10.1093/nar/gki086
SNPinfo/FuncPred Xu, Zongli, and Jack A. Taylor. "SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies." Nucleic acids research 37.suppl 2 (2009): W600-W605. 10.1093/nar/gkp290
SNPs&GO Capriotti, Emidio, et al. "WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation." BMC genomics 14.3 (2013): 1. 10.1186/1471-2164-14-S3-S6
SNPs3D Yue, Peng, Eugene Melamud, and John Moult. "SNPs3D: candidate gene and SNP selection for association studies." BMC bioinformatics 7.1 (2006): 1. 10.1186/1471-2105-7-166
TANGO Rousseau, Frederic, Joost Schymkowitz, and Luis Serrano. "Protein aggregation and amyloidosis: confusion of the kinds?." Current opinion in structural biology 16.1 (2006): 118-126. 10.1016/
TargetP Emanuelsson, Olof, et al. "Locating proteins in the cell using TargetP, SignalP and related tools." Nature protocols 2.4 (2007): 953-971. 10.1038/nprot.2007.131
TransComp Qin, Sanbo, Xiaodong Pang, and Huan-Xiang Zhou. "Automated prediction of protein association rate constants." Structure 19.12 (2011): 1744-1751. 10.1016/j.str.2011.10.015
transFIC Gonzalez-Perez, Abel, Jordi Deu-Pons, and Nuria Lopez-Bigas. "Improving the prediction of the functional impact of cancer mutations by baseline tolerance transformation." Genome medicine 4.11 (2012): 1. 10.1186/gm390
UMD-predictor Frédéric, Mélissa Yana, et al. "UMD‐predictor, a new prediction tool for nucleotide substitution pathogenicity—application to four genes: FBN1, FBN2, TGFBR1, and TGFBR2." Human mutation 30.6 (2009): 952-959. 10.1002/humu.20970
VAAST Yandell, Mark, et al. "A probabilistic disease-gene finder for personal genomes." Genome research 21.9 (2011): 1529-1542. 10.1101/gr.123158.111
VAAST 2 Hu, Hao, et al. "VAAST 2.0: Improved variant classification and disease‐gene identification using a conservation‐controlled amino acid substitution matrix." Genetic epidemiology 37.6 (2013): 622-634. 10.1002/gepi.21743
VarMod Pappalardo, Morena, and Mark N. Wass. "VarMod: modelling the functional effects of non-synonymous variants." Nucleic acids research (2014): gku483. 10.1093/nar/gku483
VEST Carter, Hannah, et al. "Identifying Mendelian disease genes with the variant effect scoring tool." BMC genomics 14.3 (2013): 1. 10.1186/1471-2164-14-S3-S3
VEST-indel Douville, Christopher, et al. "Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST‐Indel)." Human mutation 37.1 (2016): 28-35. 10.1002/humu.22911
Waltz Maurer-Stroh, S., et al. "Exploring the sequence determinants of amyloid structure using position-specific scoring matrices." Nature methods 7.3 (2010): 237-242. 10.1038/NMETH.1432
WoLF-PSORT Horton, Paul, et al. "WoLF PSORT: protein localization predictor." Nucleic acids research 35.suppl 2 (2007): W585-W587. 10.1093/nar/gkm259
Zyggregator Tartaglia, Gian Gaetano, and Michele Vendruscolo. "The Zyggregator method for predicting protein aggregation propensities." Chemical Society Reviews 37.7 (2008): 1395-1401. 10.1039/b706784b