Identify the splicing impact of variation

CHALLENGE WITHDRAWN
This challenge has been withdrawn because the results were unexpectedly published. If you know of another splicing dataset possibly suitable for CAGI, please contact the organizers at cagi@genomeinterpretation.org.

Dataset description: public
Exome dataset: public

Background: Accurate precursor mRNA (pre-mRNA) splicing is required for the expression of protein-coding genes from the human genome. In this process, intervening sequences (introns) are removed from pre-mRNA and coding/regulatory sequences (exons) are ligated together, generating a mature mRNA. A large ribonucleoprotein machine called the spliceosome assembles de novo upon every nascent intron and catalyzes the chemical steps of splicing. Numerous auxiliary cis-acting elements guide the spliceosome to correct pairs of splice sites. Exonic sequences are densely packed with regulatory elements such as exonic splicing enhancers and exonic splicing silencers (ESE and ESS, respectively). Thus, many exon sequences are multifunctional and contain overlapping information required to specify accurate pre-mRNA splicing and to dictate the primary structure of polypeptides. To better understand genotype-phenotype relationships, it is critical to determine whether polymorphisms influence the function of exon sequences in pre-mRNA splicing, mRNA translation, or potentially both steps.

For the past several years the group of Jeremy Sanford at the University of California, Santa Cruz has been working to identify splicing-sensitive disease mutations using the Human Gene Mutation Database. In an initial study his group identified thousands of putative splicing-sensitive disease mutations and validated a handful of aberrant splicing events.

Prediction Challenge: Predictors are asked to compare exons from wild-type and disease-associated alleles of four different disease genes and then predict which exons will exhibit aberrant pre-mRNA splicing. The submitted prediction should be the change in percentage inclusion of the exon, expressed as delta "percent spliced in" (ΔPSI) relative to the wild type. In addition, we ask predictors to describe the mechanism by which splicing is affected. The predictions will be compared to experimental results.
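As a minimal sketch of the quantity being requested, ΔPSI can be computed as the mutant exon's percent spliced in minus the wild type's. The challenge text does not say how inclusion is measured, so the inputs here are assumed to be counts of exon-included and exon-skipped transcripts; the function names psi and delta_psi are hypothetical and chosen only for illustration.

```python
def psi(inclusion: float, exclusion: float) -> float:
    """Percent spliced in: share of transcripts that include the exon (0-100).

    Assumption: 'inclusion' and 'exclusion' are counts of exon-included and
    exon-skipped transcripts; the source does not specify the measurement.
    """
    total = inclusion + exclusion
    if total <= 0:
        raise ValueError("need at least one observed transcript")
    return 100.0 * inclusion / total


def delta_psi(wt_incl: float, wt_excl: float,
              mut_incl: float, mut_excl: float) -> float:
    """Mutant PSI minus wild-type PSI, in percentage points."""
    return psi(mut_incl, mut_excl) - psi(wt_incl, wt_excl)
```

Under this sign convention a negative ΔPSI means the mutation reduces exon inclusion (more skipping), and a value near zero means splicing is essentially unchanged.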

Dataset: The dataset is composed of 4 pairs of exons from 4 different genes. Each pair contains a wild-type sequence and a mutant sequence differing by only a single nucleotide. Each pair of exons was assayed experimentally for splicing efficiency.

  1. Disease: Optic Neuron Atrophy
    Gene: OPA1
    Chr 3 +strand 194843856-194843927
    Wild Type Exon Sequence:
    ACCATATCCTTAAATGTAAAAGGCCCTGGACTACAGAGGATGGTGCTTGTTG
    ACTTACCAGGTGTGATTAAT

    Mutant Exon Sequence:

    ACCATATCCTTAAATGTAAAAGGCCCTGGACTACAGAGGATGGTGCTTGTTG
    ACTTACTAGGTGTGATTAAT
  2. Disease: Hemochromatosis
    Gene: TFR2
    Chr 7 -strand 100068560-100068682
    Wild Type Exon Sequence:
    GGAGAGCTGGTGTACGCCCACTACGGGCGGCCCGAAGACCTGCAGGACCT
    GCGGGCCAGGGGCGTGGATCCAGTGGGCCGCCTGCTGCTGGTGCGCGTGG
    GGGTGATCAGCTTCGCCCAGAAG

    Mutant Exon Sequence:

    GGAGAGCTGGTGTACGCCCACTAGGGGCGGCCCGAAGACCTGCAGGACCT
    GCGGGCCAGGGGCGTGGATCCAGTGGGCCGCCTGCTGCTGGTGCGCGTGG
    GGGTGATCAGCTTCGCCCAGAAG
  3. Disease: McArdle Disease
    Gene: PYGM
    Chr 11 -strand 64278301-64278393
    Wild Type Exon Sequence:
    GTGGCCATCCAGCTCAATGACACCCACCCCTCCCTGGCCATCCCCGAGCT
    GATGAGGATCCTGGTGGACCTGGAACGGATGGACTGGGACAAG

    Mutant Exon Sequence:

    GTGGCCATCCAGCTCAATGACACCCACCCCTCCCTGGCCATCCCCGAGCT
    GATGAGGATCCTGGTGGACCTGGAACGGATGGACTAGGACAAG
  4. Disease: Cardiomyopathy
    Gene: MYH7
    Chr 14 -strand 22968004-22968153
    Wild Type Exon Sequence:
    GTGATATATGCCACTGGGGCACTGGCCAAGGCAGTGTATGAGAGGATGTT
    CAACTGGATGGTGACGCGCATCAATGCCACCCTGGAGACCAAGCAGCCAC
    GCCAGTACTTCATAGGAGTCCTGGACATCGCTGGCTTCGAGATCTTCGAT

    Mutant Exon Sequence:

    GTGATATATGCCACTAGGGCACTGGCCAAGGCAGTGTATGAGAGGATGTT
    CAACTGGATGGTGACGCGCATCAATGCCACCCTGGAGACCAAGCAGCCAC
    GCCAGTACTTCATAGGAGTCCTGGACATCGCTGGCTTCGAGATCTTCGAT

Dataset provided by: Tim Sterne-Weiler and Jeremy Sanford, University of California, Santa Cruz
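
Because each wild type/mutant pair listed above differs by a single nucleotide, a useful first step for a predictor is to locate that substitution within the exon. The following is a minimal sketch using the OPA1 pair copied from the dataset; the helper name find_substitution is hypothetical, and positions are reported 1-based from the first nucleotide of the exon sequence as written, not as genomic coordinates.

```python
def find_substitution(wild_type: str, mutant: str) -> tuple[int, str, str]:
    """Return (1-based position, wild-type base, mutant base) for a pair of
    equal-length sequences that differ at exactly one nucleotide."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    diffs = [
        (i + 1, w, m)
        for i, (w, m) in enumerate(zip(wild_type, mutant))
        if w != m
    ]
    if len(diffs) != 1:
        raise ValueError(f"expected exactly 1 difference, found {len(diffs)}")
    return diffs[0]


# OPA1 exon pair, joined from the two wrapped lines in the dataset listing.
OPA1_WT = (
    "ACCATATCCTTAAATGTAAAAGGCCCTGGACTACAGAGGATGGTGCTTGTTG"
    "ACTTACCAGGTGTGATTAAT"
)
OPA1_MUT = (
    "ACCATATCCTTAAATGTAAAAGGCCCTGGACTACAGAGGATGGTGCTTGTTG"
    "ACTTACTAGGTGTGATTAAT"
)

position, wt_base, mut_base = find_substitution(OPA1_WT, OPA1_MUT)
```

The same call can be applied to the TFR2, PYGM, and MYH7 pairs. Raising an error when the number of differences is not exactly one guards against copy errors when the wrapped sequence lines are joined.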