Monday, June 1, 2009

opennlp.ccg.grammar.Grammar

This class encodes the notion of a CCG grammar: essentially a lexicon, a set of rules and a type hierarchy. Its constructor is called by the opennlp.ccg.TextCCG class, as discussed here. Here are the essential fields (ignoring the XSLT files for transforming to and from LFs, pitch accents, special tokenisers, supertags etc.):

public final Types types;
public final Lexicon lexicon;
public final RuleGroup rules;
private String grammarName = null;
public static Grammar theGrammar; //nasty hack!
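
The constructor assigns the instance under construction to theGrammar, so the static field always points at whichever Grammar was built last. Here is a minimal sketch of that behaviour, which is presumably why the field is flagged as a hack. The nested Grammar class, its name field and currentName() are hypothetical stand-ins, not the OpenCCG code:

```java
public class StaticHandleSketch {

    // Hypothetical stand-in for opennlp.ccg.grammar.Grammar, reduced to the static handle.
    static class Grammar {
        static Grammar theGrammar;   // points at the most recently constructed instance
        final String name;           // illustrative field, not part of the original excerpt

        Grammar(String name) {
            this.name = name;
            theGrammar = this;       // same assignment the real constructor makes
        }
    }

    // Any code that reads the static handle sees only the latest grammar.
    static String currentName() {
        return Grammar.theGrammar.name;
    }

    public static void main(String[] args) {
        new Grammar("first");
        new Grammar("second");       // silently replaces the handle to "first"
        System.out.println(currentName());
    }
}
```

With two grammars loaded in the same JVM, anything that goes through the static field would see only the second one.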

Here are the basics of the constructor, which takes a single parameter, url, of type java.net.URL:

theGrammar = this;
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(url);
Element root = doc.getRootElement();
grammarName = root.getAttributeValue("name");
...
Element typesElt = root.getChild("types");
URL typesUrl;
if (typesElt!=null)
    typesUrl = new URL(url,typesElt.getAttributeValue("file"));
else typesUrl = null;
Element lexiconElt = root.getChild("lexicon");
URL lexiconUrl = new URL(url,lexiconElt.getAttributeValue("file"));
Element morphElt = root.getChild("morphology");
URL morphUrl = new URL(url,morphElt.getAttributeValue("file"));
Element rulesElt = root.getChild("rules");
URL rulesUrl = new URL(url,rulesElt.getAttributeValue("file"));
...
if (typesUrl!=null) types = new Types(typesUrl,this);
else types = new Types(this);
lexicon = new Lexicon(this);
lexicon.init(lexiconUrl,morphUrl); // *****
rules = new RuleGroup(rulesUrl,this);
...
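
To make the file handling above concrete, here is a rough, self-contained sketch of the same idea: read the file attribute of a child element and resolve it relative to the location of the grammar file with new URL(context, spec). It uses the JDK's DOM parser rather than JDOM so that it runs without extra libraries. The root element name grammar, the file names and the fileUrl helper are all assumptions made for illustration; only the child element names and the name and file attributes come from the constructor.

```java
import java.io.ByteArrayInputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GrammarFileSketch {

    // Hypothetical grammar file: the root name and the file names are assumptions.
    // There is deliberately no <types> element, since the constructor treats it as optional.
    static final String SAMPLE =
          "<grammar name=\"tiny\">"
        + "<lexicon file=\"lexicon.xml\"/>"
        + "<morphology file=\"morph.xml\"/>"
        + "<rules file=\"rules.xml\"/>"
        + "</grammar>";

    /**
     * Returns the resolved URL (as a string) for the named element's file
     * attribute, or null if no such element exists.
     */
    static String fileUrl(String xml, String base, String childName) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            // Note: this matches any descendant, whereas JDOM's getChild
            // looks at direct children only; the sample is flat, so it is equivalent here.
            NodeList matches = root.getElementsByTagName(childName);
            if (matches.getLength() == 0) {
                return null;
            }
            String file = ((Element) matches.item(0)).getAttribute("file");
            // Resolve the (possibly relative) file name against the grammar file's own URL.
            return new URL(new URL(base), file).toString();
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String base = "file:/grammars/tiny/grammar.xml";
        System.out.println(fileUrl(SAMPLE, base, "lexicon"));
        System.out.println(fileUrl(SAMPLE, base, "types"));
    }
}
```

Because the file names are resolved against the grammar file's URL, the lexicon, morphology and rules files can sit next to it and be referred to by bare file names.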
