// Define some global attributes
include::_globattr.adoc[]

LVG
~~~
This annotator wraps the National Library of Medicine (NLM) link:{lvg-home}[SPECIALIST lexical tools]. It generates a canonical form for each word, along with a list of lemma entries tagged with Penn Treebank tags. These tags could be useful for a part-of-speech (POS) tagger. However, the OpenNLP POS tagger uses a tag dictionary rather than lemma information. See the <> for the POS tagger annotator.

Analysis engines (annotator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- *LvgAnnotator.xml*
+
*Parameters*::
UseSegments;; controls whether only certain sections are annotated by this annotator
SegmentsToSkip;; list of sections that this annotator does not process
UseCmdCache;; controls whether to look up information in a cache before using norm
CmdCacheFileLocation;; location of the norm cache file
CmdCacheFrequencyCutoff;; (cutoff value)
ExclusionSet;; words for which canonicalForm is never set and Lemma entries are never posted
XeroxTreebankMap;; mapping from the part-of-speech tags used by the LVG lexical tools to Penn Treebank tags
PostLemmas;; controls whether any lemma entries are posted to the CAS
UseLemmaCache;; controls whether to look up lemma information in a cache before using lvg
LemmaCacheFileLocation;; location of the lemma cache file
LemmaCacheFileFrequencyCutoff;; (cutoff value)

////////////////////////
NOTE: As distributed, PostLemmas is set to false. This is done to reduce the size of the CAS. Set PostLemmas to true to have edu.mayo.bmi.uima.core.type.Lemma annotations added to the CAS.
////////////////////////

Resources
^^^^^^^^^
- *lvg.properties*
+
The LVG config file +resources/lvg/data/config/lvg.properties+ defines the location and attributes of the LVG database and the JDBC driver used.

- *LVG database*
+
The database engine used is HSQLDB. The database file included is a sample. See <> for details on how to replace the sample.
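
To check which database location and JDBC driver an installation is configured with, it can help to list the entries in +lvg.properties+. The following is a minimal sketch, not part of the distribution: the class name +LvgConfigPeek+ and its methods are hypothetical, and it assumes the file uses the standard +java.util.Properties+ key/value syntax. It prints whatever keys it finds and does not assume any particular key names.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper (not shipped with the annotator): lists the settings in lvg.properties.
public class LvgConfigPeek {

    // Parse java.util.Properties-style "key=value" text from any reader.
    // Assumption: lvg.properties follows this syntax.
    public static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    // Render the entries sorted by key, one "key = value" per line.
    public static String describe(Properties props) {
        StringBuilder sb = new StringBuilder();
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            sb.append(key).append(" = ").append(props.getProperty(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one named in this document; pass another path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "resources/lvg/data/config/lvg.properties");
        try (Reader reader = Files.newBufferedReader(path, StandardCharsets.ISO_8859_1)) {
            System.out.print(describe(load(reader)));
        }
    }
}
```

Run it from the directory that contains +resources/+, or pass the path to the config file explicitly. Sorting the keys only makes the output easier to compare between installations; it has no effect on how LVG reads the file.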