Exploring protein binding sites mutated in cancer


CanBind has been tested on Safari (Mac OS X Mountain Lion), Firefox (Linux Ubuntu 12.04 LTS), Google Chrome and Internet Explorer 9 (Windows 7). To display structures, you need a Java-enabled browser. If your browser cannot display Java applets properly, you can still use the web server, but protein structures will not be displayed.

Searching for a gene of interest

You can display information about a particular gene by clicking on the Browser cancer mutations tab. You can then search for a gene of interest by its HUGO name, either by picking it from the drop-down list or by typing it in. As you start typing, the list is populated with the genes that partially match your input (see Figure below). Please keep in mind that a gene is listed only if it has at least one reported mutation in the eight cancer types currently included, and at least one protein structure with 60% identity over at least 80% of the structure.

The eight cancer types are: breast cancer (BRCA), clear cell kidney cancer (KIRC), colon adenocarcinoma (COAD), endometrial cancer (UCEC), glioblastoma multiforme (GBM), lung squamous carcinoma (LUSC), ovarian cancer (OV), and rectal adenocarcinoma (READ).
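The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration, not the code behind CanBind: the names `StructureMatch` and `is_listed` are hypothetical, and the sketch assumes that both the 60% identity and the 80% coverage figures are minimum thresholds expressed as fractions.

```python
from dataclasses import dataclass

# The eight cancer types currently included, keyed by abbreviation.
CANCER_TYPES = {
    "BRCA": "breast cancer",
    "KIRC": "clear cell kidney cancer",
    "COAD": "colon adenocarcinoma",
    "UCEC": "endometrial cancer",
    "GBM": "glioblastoma multiforme",
    "LUSC": "lung squamous carcinoma",
    "OV": "ovarian cancer",
    "READ": "rectal adenocarcinoma",
}


@dataclass
class StructureMatch:
    """Hypothetical record for one structure matched to a gene product."""
    pdb_code: str
    identity: float  # identity as a fraction (0.0 to 1.0)
    coverage: float  # fraction of the structure covered (0.0 to 1.0)


def is_listed(mutation_counts, matches, min_identity=0.60, min_coverage=0.80):
    """Return True if a gene would meet both inclusion criteria.

    mutation_counts maps a cancer type abbreviation to the number of
    reported mutations; abbreviations outside CANCER_TYPES are ignored.
    """
    has_mutation = any(mutation_counts.get(c, 0) > 0 for c in CANCER_TYPES)
    has_structure = any(
        m.identity >= min_identity and m.coverage >= min_coverage
        for m in matches
    )
    return has_mutation and has_structure
```

For example, a gene with two BRCA mutations and a structure at 75% identity and 90% coverage passes the check, while the same gene with only a 50% identity structure does not.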

Figure: searching for a gene

Visualizing the matching structures

Once a gene has been selected, all the corresponding isoforms are shown in the left panel. Clicking on an isoform name displays the list of its matching structures. Each structure is listed as a 4-letter PDB code followed by the chain identifier and the binding site (BS) number. A star next to the binding site indicates the presence of a cancer mutation in that binding site.

Clicking on the toggle buttons will load the corresponding structure in the Jmol applet. Structures are displayed in ribbon form, with the ligand shown as CPK spheres. Residues are color-coded as red (binding and with a mutation), blue (non-binding and with a mutation), and orange (binding and with no mutation). Hovering over a residue will yield the residue type and number. All PDB residues have been renumbered to match the FASTA sequence of the protein.
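The color scheme described above amounts to a small decision rule on two properties of each residue. The following Python sketch is an illustration only; the function name `residue_color` is hypothetical, and the return value for residues that are neither binding nor mutated is an assumption, since the text does not say how they are colored.

```python
def residue_color(is_binding, has_mutation):
    """Map a residue's properties to the highlight color used in the viewer.

    Returns None for residues that are neither binding nor mutated;
    their display color is not specified in the documentation.
    """
    if is_binding and has_mutation:
        return "red"     # binding residue with a mutation
    if has_mutation:
        return "blue"    # non-binding residue with a mutation
    if is_binding:
        return "orange"  # binding residue with no mutation
    return None
```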

Figure: visualizing the structures

Displaying information about mutations

Clicking the "see the mutations" button opens a new tab with more details on the mutations. The chart at the top of the page (see Figure below) gives a quick summary of the frequency of mutations. The bar at the bottom of the chart represents the protein sequence, color-coded in gray for residues with structural information, orange for binding residues, and black for residues with no matching structure.
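As a rough illustration of how the sequence bar could be derived, the Python sketch below assigns one color per sequence position. The function name `bar_colors` and its inputs are hypothetical. It also assumes that orange takes precedence over gray, since a binding residue necessarily has structural information.

```python
def bar_colors(seq_length, structured, binding):
    """Return one color per sequence position (positions are 1-based).

    structured: set of positions covered by a matching structure
    binding:    set of positions annotated as binding residues
    """
    colors = []
    for pos in range(1, seq_length + 1):
        if pos in binding:
            colors.append("orange")  # binding residue
        elif pos in structured:
            colors.append("gray")    # structural information, not binding
        else:
            colors.append("black")   # no matching structure
    return colors
```

For a 5-residue sequence with positions 2 to 4 covered by a structure and position 3 binding, this yields black, gray, orange, gray, black.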

Clicking on the isoform name will open a new tab with the FASTA sequence.

Figure: mutation frequency

The table below the chart gives the number of mutations per position per cancer type. The full names of the cancer types appear when hovering over the abbreviations. Empirical p-values (for isoforms with at least one mutation in a binding residue) are given below each abbreviation (see the main paper for more details). Hovering over a cell shows the frequency of the individual amino acid substitutions at that position (see Figure below). If the protein sequence is very long, you can navigate to the position of interest by clicking on the corresponding point in the chart.
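The counts shown in the table and in the hover box can be thought of as two tallies over a list of mutation records. The Python sketch below shows one way to build them; the function name `tabulate_mutations`, the record layout, and the `"R>H"` substitution label are assumptions made for illustration, and the empirical p-values are not computed here.

```python
from collections import Counter, defaultdict


def tabulate_mutations(mutations):
    """Tally mutations per position per cancer type.

    mutations: iterable of (cancer_type, position, ref_aa, alt_aa) tuples.
    Returns two dictionaries:
      counts[position][cancer_type]        -> number of mutations (table cell)
      substitutions[(position, cancer_type)][label] -> count per substitution
    """
    counts = defaultdict(Counter)
    substitutions = defaultdict(Counter)
    for cancer_type, position, ref_aa, alt_aa in mutations:
        counts[position][cancer_type] += 1
        substitutions[(position, cancer_type)][f"{ref_aa}>{alt_aa}"] += 1
    return counts, substitutions


# Made-up records, for illustration only.
example = [
    ("BRCA", 175, "R", "H"),
    ("BRCA", 175, "R", "H"),
    ("BRCA", 175, "R", "C"),
    ("OV", 175, "R", "H"),
    ("GBM", 248, "R", "Q"),
]
counts, substitutions = tabulate_mutations(example)
```

With these made-up records, the cell for position 175 in BRCA holds 3 mutations, split into two R>H and one R>C substitutions.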

Figure: hovering over mutation frequency