The basic idea is that, given a set of points and a query point (Ro below), the magnitude of the sum of the vectors from the query point to the points of the set will approach zero when the query point is in the 'middle' of the set, and will approach the sum of the vector lengths when the query point is far outside the set:
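Circular variance (CV), used to characterize the spread in a set of n angles θi, is defined as
    CV = 1 - (1/n) sqrt[ (Σ cos θi)^2 + (Σ sin θi)^2 ]
This can be written in the more general form of
    CV = 1 - (1/n) | Σ ri/|ri| |
for a set of n vectors ri.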
This form is valid in three dimensions as well (in that case it is also referred to as spherical variance). Applying this form to the vectors from the query point to the points of the set gives a smooth 0-1 scale for the extent of burial of the query point in the set: values near 1 indicate a deeply buried point, values near 0 a point far outside. In the example below a 6 Å slice of bacteriorhodopsin simulated in water is shown, where each atom is color-coded by its circular variance with respect to the atoms of the protein.
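The burial measure described above can be sketched in a few lines of Python. This is only an illustration, assuming the unit-vector form of the circular variance (1 minus the length of the mean unit vector); the function name `circular_variance` and the cube test set are hypothetical and are not taken from Simulaid or MMC.

```python
import math

def circular_variance(query, points):
    """CV of `query` w.r.t. `points`: 1 - |sum of unit vectors| / n (assumed form)."""
    sx = sy = sz = 0.0
    n = 0
    for p in points:
        dx, dy, dz = p[0] - query[0], p[1] - query[1], p[2] - query[2]
        d = math.sqrt(dx * dx + dy * dy + dz * dz)
        if d == 0.0:
            continue  # a point coinciding with the query has no direction
        sx += dx / d
        sy += dy / d
        sz += dz / d
        n += 1
    if n == 0:
        return 0.0  # assumption: no neighbours is treated as fully exposed
    return 1.0 - math.sqrt(sx * sx + sy * sy + sz * sz) / n

# Hypothetical test set: the eight corners of a cube centred on the origin.
cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
inside = circular_variance((0, 0, 0), cube)     # query in the 'middle' of the set
outside = circular_variance((100, 0, 0), cube)  # query far outside the set
```

In this example `inside` comes out at essentially 1 (the unit vectors cancel) and `outside` is close to 0 (all unit vectors point nearly the same way).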
Detecting pocket regions (steps 1-2)
1. Overlay a grid on the macromolecule and remove the gridpoints that are covered by an atom of the macromolecule (see also Stahl et al., Prot. Engng., 13, 218 (2001)).
2. Find the connected clusters of the remaining gridpoints. One of these clusters will include all the gridpoints external to the macromolecule, while the rest will delineate various cavities.
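Steps 1-2 can be sketched as follows. This is a minimal illustration and not the MMC implementation: it assumes a single covering radius for all atoms, a cubic grid, and face-sharing (6-neighbour) connectivity, and the names `free_gridpoints` and `connected_clusters` are hypothetical.

```python
from collections import deque
from itertools import product

def free_gridpoints(atoms, radius, spacing, margin):
    """Step 1: grid indices not covered by any atom, plus grid origin and shape."""
    lo = [min(a[k] for a in atoms) - margin for k in range(3)]
    hi = [max(a[k] for a in atoms) + margin for k in range(3)]
    shape = [int((hi[k] - lo[k]) / spacing) + 1 for k in range(3)]
    r2 = radius * radius
    free = set()
    for idx in product(*(range(s) for s in shape)):
        p = [lo[k] + idx[k] * spacing for k in range(3)]
        # keep the gridpoint only if it lies outside every atom sphere
        if all(sum((p[k] - a[k]) ** 2 for k in range(3)) > r2 for a in atoms):
            free.add(idx)
    return free, lo, shape

def connected_clusters(free):
    """Step 2: group gridpoints into face-connected clusters, largest first."""
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    unvisited = set(free)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = [seed]
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed
            i, j, k = queue.popleft()
            for di, dj, dk in steps:
                nb = (i + di, j + dj, k + dk)
                if nb in unvisited:
                    unvisited.remove(nb)
                    cluster.append(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

# Hypothetical 'macromolecule': atoms on the surface of a 5x5x5 cube, hollow inside.
shell = [(x, y, z) for x in range(5) for y in range(5) for z in range(5)
         if 0 in (x, y, z) or 4 in (x, y, z)]
free, origin, shape = free_gridpoints(shell, radius=0.9, spacing=1.0, margin=2)
clusters = connected_clusters(free)
sizes = [len(c) for c in clusters]  # [604, 27]: exterior cluster, then the enclosed cavity
```

Here the largest cluster is the exterior one; in general the exterior cluster could instead be identified as the cluster that reaches the edge of the grid.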
Detecting pocket regions (steps 3-5)
3. Calculate the CV for the remaining gridpoints with respect to the macromolecule.
4. Eliminate all gridpoints where the value of CV is below a threshold value, CVmax.
5. Each connected cluster will represent one pocket. The more gridpoints a cluster contains, the larger the pocket.
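Steps 3-5 can be sketched in the same spirit. The block below is an illustration under stated assumptions, not the MMC code: it assumes the unit-vector form of the CV, face-sharing connectivity for the clusters, and an arbitrary threshold of 0.6 standing in for CVmax; the function names and the cup-shaped test structure are hypothetical.

```python
import math
from collections import deque

def circular_variance(query, points):
    """CV of `query` w.r.t. `points`: 1 - |sum of unit vectors| / n (assumed form)."""
    s = [0.0, 0.0, 0.0]
    n = 0
    for p in points:
        v = [p[k] - query[k] for k in range(3)]
        d = math.sqrt(sum(c * c for c in v))
        if d > 0.0:
            for k in range(3):
                s[k] += v[k] / d
            n += 1
    return 1.0 - math.sqrt(sum(c * c for c in s)) / n if n else 0.0

def pockets(free_idx, origin, spacing, atoms, cv_threshold):
    """Steps 3-5: keep gridpoints with CV >= threshold, then cluster them."""
    buried = set()
    for idx in free_idx:
        p = tuple(origin[k] + idx[k] * spacing for k in range(3))
        if circular_variance(p, atoms) >= cv_threshold:  # steps 3 and 4
            buried.add(idx)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    clusters = []
    while buried:  # step 5: flood fill, one cluster per pocket
        seed = buried.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for di, dj, dk in steps:
                nb = (i + di, j + dj, k + dk)
                if nb in buried:
                    buried.remove(nb)
                    cluster.append(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

# Hypothetical 'macromolecule': a cup (floor at z = 0 plus four walls, open at the top).
rng = range(5)
atoms = [(x, y, z) for x in rng for y in rng for z in rng
         if x in (0, 4) or y in (0, 4) or z == 0]
atom_set = set(atoms)
# Gridpoints at integer coordinates -2..6 that do not coincide with an atom.
free_idx = [(i, j, k) for i in range(9) for j in range(9) for k in range(9)
            if (i - 2, j - 2, k - 2) not in atom_set]
found = pockets(free_idx, (-2.0, -2.0, -2.0), 1.0, atoms, cv_threshold=0.6)
kept = {idx for cluster in found for idx in cluster}
```

In this example the gridpoint near the bottom of the cup, index (4, 4, 3), survives the threshold, while the far corner of the grid, index (0, 0, 0), is eliminated. The appropriate threshold for a real macromolecule is not given here and would have to be chosen by the user.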
Example: pockets of a bromodomain. One of the pockets is the binding site.
Detecting domain-separation
Domain separation can be detected by examining the circular variance map, determined by the representative atoms of each residue (e.g., the alpha carbons for proteins):
where rij is the vector from the representative atom of residue i to that of residue j. Since a domain-separating segment is outside both domains it separates, a low-CV swath of this CV map is diagnostic of domain-separating regions.
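A residue-level CV map can be sketched as below. Since the defining equation is not reproduced here, this sketch makes a loud assumption: the second dimension of the map is taken to be a cutoff distance, with the CV of residue i computed over the residues j whose representative atoms lie within that cutoff. The function name `cv_map`, the cutoff values, and the toy coordinates are all hypothetical and may differ from what Simulaid does.

```python
import math
from itertools import product

def cv_map(coords, cutoffs):
    """rows[c][i] = CV of residue i over residues j with |rij| <= cutoffs[c] (assumed)."""
    n_res = len(coords)
    rows = []
    for rc in cutoffs:
        row = []
        for i in range(n_res):
            sx = sy = sz = 0.0
            n = 0
            for j in range(n_res):
                if j == i:
                    continue
                dx = coords[j][0] - coords[i][0]
                dy = coords[j][1] - coords[i][1]
                dz = coords[j][2] - coords[i][2]
                d = math.sqrt(dx * dx + dy * dy + dz * dz)
                if d == 0.0 or d > rc:
                    continue  # outside the cutoff (or coincident): not counted
                sx += dx / d
                sy += dy / d
                sz += dz / d
                n += 1
            # assumption: a residue with no neighbours is reported as CV = 0
            row.append(1.0 - math.sqrt(sx * sx + sy * sy + sz * sz) / n if n else 0.0)
        rows.append(row)
    return rows

# Toy chain: domain A (centre + 8 corners), one linker residue, domain B (centre + 8 corners).
corners = list(product((-1, 1), repeat=3))
domain_a = [(0, 0, 0)] + corners
linker = [(3, 8, 0)]  # loops out, outside both domains
domain_b = [(6, 0, 0)] + [(6 + x, y, z) for x, y, z in corners]
coords = domain_a + linker + domain_b
cmap = cv_map(coords, cutoffs=[2.0, 100.0])
linker_cv = cmap[1][9]   # low: the linker sees both domains on one side
centre_cv = cmap[1][0]   # higher: the centre of domain A is surrounded
```

With the large cutoff the linker residue (index 9) has a markedly lower CV than the centre of domain A (index 0), which is the kind of low-CV signal the text describes for domain-separating segments.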
Example: CV map of bacteriorhodopsin. Loops connecting the transmembrane helices are domain-separators. Black bars below the map show the positions of the (transmembrane) helices. The loop between helices 3 and 4 is buried.
The calculations were performed with the programs Simulaid (insideness labeling and domain-separation map) and MMC (pocket regions), available at this website.
Last modified: 11/15/2004 (MM)