org.apache.ctakes.postagger
Class TagDictionaryCreator

java.lang.Object
  extended by org.apache.ctakes.postagger.TagDictionaryCreator

public class TagDictionaryCreator
extends java.lang.Object

Creates a tag dictionary from a part-of-speech (POS) corpus in OpenNLP format
and writes the dictionary to the specified file.

Author:
Mayo Clinic

Constructor Summary
TagDictionaryCreator()
           
 
Method Summary
static java.util.HashMap<java.lang.String,java.util.Set<java.lang.String>> createTagDictionary(java.io.BufferedReader br, boolean caseSensitive)
          Create a tag dictionary from a POS corpus in OpenNLP format.
static void main(java.lang.String[] args)
          Reads a file containing POS-tagged tokens in OpenNLP format and writes a tag dictionary, also in OpenNLP format.
Example input:
winning_JJ body_NN
winning_VBG
Example output:
body NN
winning JJ VBG
static void printUsage()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TagDictionaryCreator

public TagDictionaryCreator()
Method Detail

createTagDictionary

public static java.util.HashMap<java.lang.String,java.util.Set<java.lang.String>> createTagDictionary(java.io.BufferedReader br,
                                                                                                      boolean caseSensitive)
                                                                                               throws java.io.IOException
Create a tag dictionary from a POS corpus in OpenNLP format.

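Parameters:
br - reader over the POS-tagged corpus in OpenNLP format (lines of word_TAG tokens such as winning_JJ body_NN)
caseSensitive - whether words that differ only in letter case are kept as separate dictionary entries
Returns:
a map from each word to the set of POS tags it carries in the corpus
Throws:
java.io.IOException - if the corpus cannot be read from br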
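
The following is a minimal sketch of how a word-to-tags map of this shape could be built; it is not the cTAKES implementation. The class TagDictionarySketch and its build method are hypothetical names, and two behaviours are assumptions: the word and tag are split at the last underscore, and a false caseSensitive flag folds words to lower case.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; not part of cTAKES.
class TagDictionarySketch {

    // Reads lines of whitespace-separated word_TAG tokens and collects,
    // for every word, the set of tags seen with it.
    static Map<String, Set<String>> build(BufferedReader br, boolean caseSensitive)
            throws IOException {
        Map<String, Set<String>> dictionary = new HashMap<>();
        String line;
        while ((line = br.readLine()) != null) {
            for (String token : line.trim().split("\\s+")) {
                // Assumption: the tag follows the LAST underscore in the token.
                int sep = token.lastIndexOf('_');
                if (sep <= 0 || sep == token.length() - 1) {
                    continue; // skip empty tokens and tokens without a word_TAG shape
                }
                String word = token.substring(0, sep);
                String tag = token.substring(sep + 1);
                if (!caseSensitive) {
                    // Assumption: case-insensitive mode folds words to lower case.
                    word = word.toLowerCase(Locale.ROOT);
                }
                // TreeSet keeps the tags of each word in a stable, sorted order.
                dictionary.computeIfAbsent(word, k -> new TreeSet<>()).add(tag);
            }
        }
        return dictionary;
    }

    public static void main(String[] args) throws IOException {
        String corpus = "winning_JJ body_NN\nwinning_VBG\n";
        Map<String, Set<String>> dict =
                build(new BufferedReader(new StringReader(corpus)), true);
        System.out.println(dict.get("winning")); // prints [JJ, VBG]
        System.out.println(dict.get("body"));    // prints [NN]
    }
}
```

Under these assumptions, passing false for caseSensitive would merge entries such as Winning_VBG and winning_JJ under the single key winning.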

main

public static void main(java.lang.String[] args)
Reads a file containing POS-tagged tokens in OpenNLP format and writes a tag dictionary, also in OpenNLP format.
Example input:
winning_JJ body_NN
winning_VBG
Example output:
body NN
winning JJ VBG
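
The output layout in the example (one word per line, followed by its tags separated by spaces) can be sketched as below. This is an illustration rather than the code main actually runs: TagDictionaryFormatSketch and toLines are hypothetical names, and sorting the words and tags alphabetically is an assumption made only so the result matches the example ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of cTAKES.
class TagDictionaryFormatSketch {

    // Renders each dictionary entry as "word TAG1 TAG2 ...".
    static List<String> toLines(Map<String, Set<String>> dictionary) {
        List<String> lines = new ArrayList<>();
        // Assumption: words and tags are emitted in alphabetical order.
        for (Map.Entry<String, Set<String>> entry : new TreeMap<>(dictionary).entrySet()) {
            String tags = String.join(" ", new TreeSet<>(entry.getValue()));
            lines.add(entry.getKey() + " " + tags);
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> dictionary = new HashMap<>();
        dictionary.put("winning", new HashSet<>(Arrays.asList("VBG", "JJ")));
        dictionary.put("body", new HashSet<>(Arrays.asList("NN")));
        for (String line : toLines(dictionary)) {
            System.out.println(line); // prints "body NN" then "winning JJ VBG"
        }
    }
}
```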

See Also:
printUsage()

printUsage

public static void printUsage()