org.knowceans.topics.cgen
Class MixNetParser

java.lang.Object
  extended by org.knowceans.topics.cgen.MixNetParser

public class MixNetParser
extends java.lang.Object

parse a mixnet script. Example:

 mixnet = LDA
 description:
        test for the mixnet model parser and generator
        
 preferences:
 # produce serial acceleration code (bounds-based sampling)
 fastSerial = false
 # produce parallel acceleration code (shared memory)
 fastParallel = false
 # produce independent samplers for each edge
 indepSamplers = false
 
 # declare nodes (parameters)
 nodes:
 # descriptions and "=" optional
 doc-topic =    theta : M, K | alpha : 1
 # observed nodes, "in" operator means sparse indexing
 doc-author =   am : M, am[m].length in A
 # single label per document
 doc-label =    cm : M, 1 in C
 # option: hyperparameter not estimated
 topic-word =   phi : K, V | beta : 1 : fixed
 topic-word2 =  psi : K, V | gamma : 1
 # constant value C
 topic-char =   xi : K, C = 3 | delta : 1
 # coupled parameters (all statistics joined, hyperparam of first node used)
 connect nodes = xi == phi
 
 # declare sequences with data variable, indexes, range and count
 # sequences. Each edge has a sequence
 words =                w : m, n : M, w[m].length : W
 # labels may have their own sequence (if used as output)
 labels                 c : m, j : M, c[m].length : Wc
 # defines nesting between sequences
 # eg paragraphs nesting words
 pos =          q : m, t in n : M, w[m].length, q[m][n].length : Wq
 
 # declare edges (variables) for sequences
 edges:
 words = w ::
                document =              m : M
                author =                x : A
                label =                 c : C
                topic =                 z : K
                word =                  w : V
 paragraphs      = s ::
        stopic                  y : Y 
 
 # declare network (dependencies)
 # prior alpha and selectors [] optional
 # selectors default to [x,y] of input edges 
 network:
        m >> theta >> z,x
        m,z >> phi[m,z] | alpha[j] >> w
        x >> psi[k] >> w2
  k : { 
        // some Java code, with indices [m,x] and IDLE
        if (x == 0) { k = IDLE; } else { k = [m,x]; } 
  }.
  # k : { ... }. may be on one line, as well.
 
 # special expressions: Java expressions, with { k = [x,y]; } being
 # parent values, linearly indexed. Switching is possible via 
 # { k = IDLE; }, so node is not used for this sample (IDLE = -1). 
 # All "common" Java expressions are permitted (doc for details).
 

Author:
gregor

Field Summary
 java.lang.String comment
          comments start with # on an otherwise empty line
 java.lang.String edge
          edges start with a name and have a variable and a dimension; they belong to the last sequence declared
(package private)  boolean fastSerial
           
 java.lang.String file
           
(package private)  boolean formatCode
           
 java.lang.String network
          a network line links one or more incoming and outgoing edges with a node; a node has an optional alpha and optional selectors: theta[kSel] | alpha[jSel]
 java.lang.String node
          nodes have an optional name and dimensions: (name =) theta : dim | alpha : dim
 java.lang.String nodecouple
          C5 parameter coupling structures are specified by theta1 == theta2
 java.lang.String preference
          preferences are key = value, also used for the title
 java.util.HashMap<java.lang.String,java.lang.String> preferences
           
 java.lang.String section
          sections start with a section title and a colon
 java.lang.String selectend
          finish a selector
 java.lang.String selector
          Selectors may be set with a variable in kSel, which is defined in a free Java expression starting with a variable and ":$sp{" and ending with "}.", at the end of the first line or on an empty line.
 java.lang.String seqlabel
          sequence label
 java.lang.String sequence
          sequences start with a \@name = variable : indexes : dimensions : number of tokens
 java.lang.String[] terminals
          terminals
 java.lang.String title
          title follows the word mixnet =
static java.lang.Object TRUESTRING
           
(package private)  boolean verbose
           
 
Constructor Summary
MixNetParser(java.lang.String file)
          create the mixnet parser from the file
 
Method Summary
 boolean getBoolean(java.lang.String key, boolean standard)
           
 double getNumber(java.lang.String key, double standard)
           
 java.util.HashMap<java.lang.String,java.lang.String> getPreferences()
           
 java.lang.String getString(java.lang.String key, java.lang.String standard)
           
static void main(java.lang.String[] args)
           
 MixNet parse()
          parse the file and return the mixture network
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

file

public java.lang.String file

preferences

public java.util.HashMap<java.lang.String,java.lang.String> preferences

verbose

boolean verbose

fastSerial

boolean fastSerial

formatCode

boolean formatCode

TRUESTRING

public static final java.lang.Object TRUESTRING

terminals

public final java.lang.String[] terminals
terminals


title

public final java.lang.String title
title follows the word mixnet =

See Also:
Constant Field Values

comment

public final java.lang.String comment
comments start with # on an otherwise empty line

See Also:
Constant Field Values

section

public final java.lang.String section
sections start with a section title and a colon

See Also:
Constant Field Values

preference

public final java.lang.String preference
preferences are key = value, also used for the title

See Also:
Constant Field Values
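
As an illustration of the key = value convention, the following minimal sketch collects preference lines into a map while skipping comment lines. It is not the parser's actual implementation: the class name PreferenceLineSketch, the method readPreferences and both regular expressions are assumptions made for this example. Only the line syntax is taken from the script example in the class description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.knowceans.topics.cgen.
final class PreferenceLineSketch {

    // Assumed pattern: a key, "=", and a value on one line.
    private static final Pattern PREFERENCE =
        Pattern.compile("^\\s*([\\w-]+)\\s*=\\s*(.*?)\\s*$");

    // Assumed pattern: "#" on an otherwise empty line starts a comment.
    private static final Pattern COMMENT = Pattern.compile("^\\s*#.*$");

    // Collects key = value pairs, skipping comments and blank lines.
    static Map<String, String> readPreferences(String[] lines) {
        Map<String, String> prefs = new HashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || COMMENT.matcher(line).matches()) {
                continue;
            }
            Matcher m = PREFERENCE.matcher(line);
            if (m.matches()) {
                prefs.put(m.group(1), m.group(2));
            }
        }
        return prefs;
    }

    public static void main(String[] args) {
        String[] lines = {
            " mixnet = LDA",
            " # produce serial acceleration code (bounds-based sampling)",
            " fastSerial = false"
        };
        System.out.println(readPreferences(lines));
    }
}
```

In this sketch the title line (mixnet = LDA) ends up in the same map as the preferences, which mirrors the remark that the pattern is also used for the title.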

node

public final java.lang.String node
nodes have an optional name and dimensions: (name =) theta : dim | alpha : dim

See Also:
Constant Field Values
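
The node syntax can be illustrated with a small regular-expression sketch that splits a node line into its parts. This is a hedged example, not the pattern stored in the node field: the class NodeLineSketch, the method parse and the expression itself are hypothetical, and the sketch assumes that a name, when given, is followed by "=".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.knowceans.topics.cgen.
final class NodeLineSketch {

    // Assumed shape: (name =) variable : dims (| hyperparameter : rest)
    private static final Pattern NODE = Pattern.compile(
        "^\\s*(?:([\\w-]+)\\s*=\\s*)?(\\w+)\\s*:\\s*([^|]*?)\\s*"
        + "(?:\\|\\s*(\\w+)\\s*:\\s*(.*?))?\\s*$");

    // Returns {name, variable, dims, hyperparameter, hyperparameter spec};
    // optional parts that are absent are returned as null.
    static String[] parse(String line) {
        Matcher m = NODE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a node line: " + line);
        }
        return new String[] {
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)
        };
    }

    public static void main(String[] args) {
        String[] parts = parse(" doc-topic =    theta : M, K | alpha : 1");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

The dimension part is kept as raw text here, so entries such as "am[m].length in A" or "C = 3" from the class example pass through unchanged and would need a second parsing step.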

nodecouple

public final java.lang.String nodecouple
C5 parameter coupling structures are specified by theta1 == theta2

See Also:
Constant Field Values

sequence

public final java.lang.String sequence
sequences start with a \@name = variable : indexes : dimensions : number of tokens

See Also:
Constant Field Values

seqlabel

public final java.lang.String seqlabel
sequence label

See Also:
Constant Field Values

edge

public final java.lang.String edge
edges start with a name and have a variable and a dimension; they belong to the last sequence declared

See Also:
Constant Field Values

network

public final java.lang.String network
a network line links one or more incoming and outgoing edges with a node; a node has an optional alpha and optional selectors: theta[kSel] | alpha[jSel]

See Also:
Constant Field Values

selector

public final java.lang.String selector
Selectors may be set with a variable in kSel, which is defined in a free Java expression starting with a variable and ":$sp{" and ending with "}.", at the end of the first line or on an empty line. The expression must evaluate to the variable name in the node specification. More allowed expressions:

Switch: k : {if (j==1) {k = u;} else if (j==2) {k = v;} else {k = IDLE;}}. This selects the value of u or v, or does not use the input at all (no counts iterated).

Aggregation: nzz : { nzz = new int[w[m].length]; for (int i = 0; i < w[m].length; i++) { nzz[z[i]]++; }}.

Any other Java expression that evaluates to a value of k or the name of the index variable for the node given in theta[var]. Check for naming collisions. Use in braces with period at the end. In assignments, comma-separated indexes may be used, like k = u,v; as long as u and v are parent edges. k is not an edge but a placeholder for the joint index. It is resolved to a linear component index.

NB: merging is default for an input edge that has several parent nodes.

See Also:
Constant Field Values
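
To make the idea of a joint index concrete, the sketch below resolves a comma-separated assignment such as k = u,v to a single linear component index and reproduces the IDLE switch from the class example. It is an illustration under assumptions: the row-major ordering, the class SelectorIndexSketch and its method names are hypothetical and are not taken from the generated code. Only IDLE = -1 is stated in the class description.

```java
// Hypothetical helper, not part of org.knowceans.topics.cgen.
final class SelectorIndexSketch {

    // Marker for "node not used for this sample" (IDLE = -1 per the class description).
    static final int IDLE = -1;

    // Row-major linearization (assumed ordering):
    // for k = u,v with dimensions U,V this yields u * V + v.
    static int linearIndex(int[] values, int[] dims) {
        if (values.length != dims.length) {
            throw new IllegalArgumentException("values and dims differ in length");
        }
        int k = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0 || values[i] >= dims[i]) {
                throw new IllegalArgumentException("index out of range at " + i);
            }
            k = k * dims[i] + values[i];
        }
        return k;
    }

    // Mirrors: if (x == 0) { k = IDLE; } else { k = [m,x]; }
    // where m ranges over M documents and x over A authors.
    static int select(int m, int x, int M, int A) {
        return x == 0 ? IDLE : linearIndex(new int[] {m, x}, new int[] {M, A});
    }

    public static void main(String[] args) {
        System.out.println(select(2, 0, 3, 4)); // x == 0, so the node is idle
        System.out.println(select(2, 1, 3, 4)); // joint index of m = 2, x = 1
    }
}
```

A caller would skip the count update whenever the returned index equals IDLE, which corresponds to the "no counts iterated" case of the switch.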

selectend

public final java.lang.String selectend
finish a selector

See Also:
Constant Field Values

Constructor Detail

MixNetParser

public MixNetParser(java.lang.String file)
create the mixnet parser from the file

Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

getPreferences

public java.util.HashMap<java.lang.String,java.lang.String> getPreferences()

getString

public java.lang.String getString(java.lang.String key,
                                  java.lang.String standard)

getNumber

public double getNumber(java.lang.String key,
                        double standard)

getBoolean

public boolean getBoolean(java.lang.String key,
                          boolean standard)
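
The three getters above are undocumented. One plausible reading, sketched below, is that they look up a key in the preferences map and fall back to the second argument when the key is missing. This is an assumption, not confirmed behaviour: PreferenceLookupSketch is a hypothetical stand-in, and the fallback on unparsable numbers is a choice made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of org.knowceans.topics.cgen.
final class PreferenceLookupSketch {

    private final Map<String, String> preferences = new HashMap<>();

    void put(String key, String value) {
        preferences.put(key, value);
    }

    // Returns the stored string, or the given standard value if the key is absent.
    String getString(String key, String standard) {
        String value = preferences.get(key);
        return value != null ? value : standard;
    }

    // Returns the stored value as a double; falls back to the standard value
    // if the key is absent or the text is not a number (assumed behaviour).
    double getNumber(String key, double standard) {
        String value = preferences.get(key);
        if (value == null) {
            return standard;
        }
        try {
            return Double.parseDouble(value.trim());
        } catch (NumberFormatException e) {
            return standard;
        }
    }

    // Returns true only for the text "true" (case-insensitive);
    // falls back to the standard value if the key is absent.
    boolean getBoolean(String key, boolean standard) {
        String value = preferences.get(key);
        return value != null ? Boolean.parseBoolean(value.trim()) : standard;
    }

    public static void main(String[] args) {
        PreferenceLookupSketch prefs = new PreferenceLookupSketch();
        prefs.put("fastSerial", "false");
        System.out.println(prefs.getBoolean("fastSerial", true));
        System.out.println(prefs.getBoolean("fastParallel", true));
    }
}
```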

parse

public MixNet parse()
             throws java.io.IOException
parse the file and return the mixture network

Returns:
the mixture network parsed from the file
Throws:
java.io.IOException