

PREV CLASS NEXT CLASS  FRAMES NO FRAMES  
SUMMARY: NESTED  FIELD  CONSTR  METHOD  DETAIL: FIELD  CONSTR  METHOD 
java.lang.Object org.knowceans.mcl.MarkovClustering
public class MarkovClustering
MarkovClustering implements the Markov clustering (MCL) algorithm for graphs, using a HashMapbased sparse representation of a Markov matrix, i.e., an adjacency matrix m that is normalised to one. Elements in a column / node can be interpreted as decision probabilities of a random walker being at that node. Note: whereas we explain the algorithms with columns, the actual implementation works with rows because the sparse matrix used has rowmajor format.
The basic idea underlying the MCL algorithm is that dense regions in sparse graphs correspond with regions in which the number of klength paths is relatively large, for small k in N, which corresponds to multiplying probabilities in the matrix appropriately. Random walks of length k have higher probability (product) for paths with beginning and ending in the same dense region than for other paths.
The algorithm starts by creating a Markov matrix from the graph, for which first the adjacency matrix is added diagonal elements to include selfloops for all nodes, i.e., probabilities that the random walker stays at a particular node. After this initialisation, the algorithm works by alternating two operations, expansion and inflation, iteratively recomputing the set of transition probabilities. The expansion step corresponds to matrix multiplication (on stochastic matrices), the inflation step corresponds with a parametrized inflation operator Gamma_r, which acts columnwise on (column) stochastic matrices (here, we use rowwise operation, which is analogous).
The inflation operator transforms a stochastic matrix into another one by raising each element to a positive power p and renormalising columns to keep the matrix stochastic. The effect is that larger probabilities in each column are emphasised and smaller ones deemphasised. On the other side, the matrix multiplication in the expansion step creates new nonzero elements, i.e., edges. The algorithm converges very fast, and the result is an idempotent Markov matrix, M = M * M, which represents a hard clustering of the graph into components.
Expansion and inflation have two opposing effects: While expansion flattens the stochastic distributions in the columns and thus causes paths of a random walker to become more evenly spread, inflation contracts them to favoured paths.
Description is based on the introduction of Stijn van Dongen's thesis Graph Clustering by Flow Simulation (2000); for a mathematical treatment of the algorithm and the associated MCL process, see there.
Constructor Summary  

MarkovClustering()

Method Summary  

private void 
addLoops(SparseMatrix a,
double loopGain)
add loops with specific energy, which corresponds to adding loopGain to the diagonal elements. 
SparseMatrix 
expand(SparseMatrix m)
expand stochastic quadratic matrix by sqaring it with itself: result = m * m. 
double 
inflate(SparseMatrix m,
double p,
double zeromax)
inflate stochastic matrix by Hadamard (elementwise) exponentiation, pruning and normalisation : result = Gamma ( m, p ) = normalise ( prune ( m .^ p ) ). 
static void 
main(java.lang.String[] args)
test the MCL algorithm with the matrix loaded from the file in the argument. 
private static void 
print(SparseMatrix a,
java.lang.String label)

SparseMatrix 
run(SparseMatrix a,
double maxResidual,
double pGamma,
double loopGain,
double maxZero)
run the MCL process. 
Methods inherited from class java.lang.Object 

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait 
Constructor Detail 

public MarkovClustering()
Method Detail 

public static void main(java.lang.String[] args)
args
 public SparseMatrix run(SparseMatrix a, double maxResidual, double pGamma, double loopGain, double maxZero)
a
 matrixmaxResidual
 maximum difference between row elements and row square
sum (measure of idempotence)pGamma
 inflation exponent for Gamma operatorloopGain
 values for cyclesmaxZero
 maximum value considered zero for pruning operations
private static void print(SparseMatrix a, java.lang.String label)
public double inflate(SparseMatrix m, double p, double zeromax)
result = Gamma ( m, p ) = normalise ( prune ( m .^ p ) ).
By convention, normalisation is done along rows (SparseMatrix has rowmajor representation)
m
 matrix (mutable)p
 exponent as a doublezeromax
 below which elements are pruned from the sparse matrix
public SparseMatrix expand(SparseMatrix m)
matrix

private void addLoops(SparseMatrix a, double loopGain)
a
 loopGain



PREV CLASS NEXT CLASS  FRAMES NO FRAMES  
SUMMARY: NESTED  FIELD  CONSTR  METHOD  DETAIL: FIELD  CONSTR  METHOD 