############################
conll-2009-dev-shift-pop.jar
############################

The model conll-2009-dev-shift-pop.jar was provided by Jinho Choi and is trained on the development set of the CoNLL 2009 English data. More information about this data can be found here:

http://ufal.mff.cuni.cz/conll2009-st/train-dev-data.html

The CoNLL 2009 shared task is described in the following paper:

http://aclweb.org/anthology-new/W/W09/W09-1201.pdf

This model uses the "shift pop" algorithm. When using this model, specify the value AbstractDepParser.ALG_SHIFT_POP for the configuration parameter org.cleartk.syntax.dependency.clear.ClearParser.parserAlgorithmName as annotated in ClearParser.

Also, the default memory setting of your JVM may not suffice to load this model. It can be loaded with the JVM argument "-Xmx1g".

On 29 Sep 2011, the model was modified by adding a "1" before the "17" near the end of the "lexica" file, to accommodate a change in model format for the 0.4.0-SNAPSHOT release.

#################
additional models
#################

An additional model, conll-2009-training-dev-shift-pop.jar, is available from the ClearTK downloads page:

http://code.google.com/p/cleartk/downloads/list

This model is built from both the training data and the development data of the CoNLL 2009 shared task (see links above). It was provided by Jinho Choi as a file named conll-trndev-sp.mod.3. This model is very large and expands considerably in memory. You will need at least 8GB (gigabytes) of memory to load and run it.

#####
Notes
#####

- Both models were trained using Penn Treebank part-of-speech tags, so your input part-of-speech tags should match the tags used for training these models. The part-of-speech tags for punctuation symbols such as "." "," ":" ";" "(" ")" etc. are the symbols themselves (i.e. the part-of-speech tag for the token ")" should be ")"). This may be inconsistent with other PTB-derived tagging schemes that use tags such as "COLON" or "RRB". Modify your part-of-speech tags to be consistent with the tagging scheme used here.

- The dependency labeling scheme produced by the models provided here differs from that of the Malt parser models used by ClearTK's wrapper of the Malt parser. Do not assume that the wrapper for the Clear Parser can be used interchangeably with the wrapper for the Malt Parser.
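
The part-of-speech note above can be illustrated with a small sketch. The class PunctuationTagNormalizer and its method normalize are hypothetical and are not part of ClearTK or the Clear Parser. The punctuation set is an assumption taken from the symbols listed in the note, which ends with "etc.", so it may need extending for your data. The idea is only to replace tags such as "COLON" or "RRB" with the symbol itself before tokens reach the parser.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of ClearTK.
final class PunctuationTagNormalizer {

    // Assumed punctuation set: exactly the symbols listed in the note above.
    // The note says "etc.", so extend this set to match your own data.
    private static final Set<String> PUNCTUATION =
            new HashSet<>(Arrays.asList(".", ",", ":", ";", "(", ")"));

    private PunctuationTagNormalizer() {
    }

    // Returns the token text as the tag when the token is a listed
    // punctuation symbol; otherwise returns the original tag unchanged.
    static String normalize(String tokenText, String posTag) {
        return PUNCTUATION.contains(tokenText) ? tokenText : posTag;
    }

    public static void main(String[] args) {
        System.out.println(normalize(")", "RRB"));   // prints )
        System.out.println(normalize(":", "COLON")); // prints :
        System.out.println(normalize("dog", "NN"));  // prints NN
    }
}
```

In a pipeline, a step like this would run after part-of-speech tagging and before parsing, applied to each token's text and tag. How the tags are read from and written back to your token annotations depends on your type system and is not shown here.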