|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Object org.knowceans.mcl.MarkovClustering
public class MarkovClustering
MarkovClustering implements the Markov clustering (MCL) algorithm for graphs, using a HashMap-based sparse representation of a Markov matrix, i.e., an adjacency matrix m that is normalised to one. Elements in a column / node can be interpreted as decision probabilities of a random walker being at that node. Note: whereas we explain the algorithms with columns, the actual implementation works with rows because the sparse matrix used has row-major format.
The basic idea underlying the MCL algorithm is that dense regions in sparse graphs correspond with regions in which the number of k-length paths is relatively large, for small k in N, which corresponds to multiplying probabilities in the matrix appropriately. Random walks of length k have higher probability (product) for paths with beginning and ending in the same dense region than for other paths.
The algorithm starts by creating a Markov matrix from the graph, for which first the adjacency matrix is added diagonal elements to include self-loops for all nodes, i.e., probabilities that the random walker stays at a particular node. After this initialisation, the algorithm works by alternating two operations, expansion and inflation, iteratively recomputing the set of transition probabilities. The expansion step corresponds to matrix multiplication (on stochastic matrices), the inflation step corresponds with a parametrized inflation operator Gamma_r, which acts column-wise on (column) stochastic matrices (here, we use row-wise operation, which is analogous).
The inflation operator transforms a stochastic matrix into another one by raising each element to a positive power p and re-normalising columns to keep the matrix stochastic. The effect is that larger probabilities in each column are emphasised and smaller ones deemphasised. On the other side, the matrix multiplication in the expansion step creates new non-zero elements, i.e., edges. The algorithm converges very fast, and the result is an idempotent Markov matrix, M = M * M, which represents a hard clustering of the graph into components.
Expansion and inflation have two opposing effects: While expansion flattens the stochastic distributions in the columns and thus causes paths of a random walker to become more evenly spread, inflation contracts them to favoured paths.
Description is based on the introduction of Stijn van Dongen's thesis Graph Clustering by Flow Simulation (2000); for a mathematical treatment of the algorithm and the associated MCL process, see there.
Constructor Summary | |
---|---|
MarkovClustering()
|
Method Summary | |
---|---|
private void |
addLoops(SparseMatrix a,
double loopGain)
add loops with specific energy, which corresponds to adding loopGain to the diagonal elements. |
SparseMatrix |
expand(SparseMatrix m)
expand stochastic quadratic matrix by sqaring it with itself: result = m * m. |
double |
inflate(SparseMatrix m,
double p,
double zeromax)
inflate stochastic matrix by Hadamard (elementwise) exponentiation, pruning and normalisation : result = Gamma ( m, p ) = normalise ( prune ( m .^ p ) ). |
static void |
main(java.lang.String[] args)
test the MCL algorithm with the matrix loaded from the file in the argument. |
private static void |
print(SparseMatrix a,
java.lang.String label)
|
SparseMatrix |
run(SparseMatrix a,
double maxResidual,
double pGamma,
double loopGain,
double maxZero)
run the MCL process. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MarkovClustering()
Method Detail |
---|
public static void main(java.lang.String[] args)
args
- public SparseMatrix run(SparseMatrix a, double maxResidual, double pGamma, double loopGain, double maxZero)
a
- matrixmaxResidual
- maximum difference between row elements and row square
sum (measure of idempotence)pGamma
- inflation exponent for Gamma operatorloopGain
- values for cyclesmaxZero
- maximum value considered zero for pruning operations
private static void print(SparseMatrix a, java.lang.String label)
public double inflate(SparseMatrix m, double p, double zeromax)
result = Gamma ( m, p ) = normalise ( prune ( m .^ p ) ).
By convention, normalisation is done along rows (SparseMatrix has row-major representation)
m
- matrix (mutable)p
- exponent as a doublezeromax
- below which elements are pruned from the sparse matrix
public SparseMatrix expand(SparseMatrix m)
matrix
-
private void addLoops(SparseMatrix a, double loopGain)
a
- loopGain
-
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |