org.apache.ctakes.relationextractor.cr
Class GoldEntityAndRelationReader

java.lang.Object
  extended by org.apache.uima.analysis_component.AnalysisComponent_ImplBase
      extended by org.apache.uima.analysis_component.Annotator_ImplBase
          extended by org.apache.uima.analysis_component.JCasAnnotator_ImplBase
              extended by org.uimafit.component.JCasAnnotator_ImplBase
                  extended by org.apache.ctakes.relationextractor.cr.GoldEntityAndRelationReader
All Implemented Interfaces:
org.apache.uima.analysis_component.AnalysisComponent

public class GoldEntityAndRelationReader
extends org.uimafit.component.JCasAnnotator_ImplBase

Reads named entity annotations and the relations between them from Knowtator XML files into the CAS.

Assumptions:
- A pair of entities can have only a single relation between them.
- An entity can have only a single semantic type.

For each relation instance in the gold standard, this reader will:
- Check whether the arguments of this relation instance can be extracted by cTAKES automatically. If one of them cannot, the relation instance and its entities are skipped.
- Check whether another relation between a pair of entities with the same Knowtator mention ids has already been added to the CAS. If it has, the reader will not add a new relation between these entities.

This reader also makes sure each entity is added to the CAS only once. For example, the CAS may already contain an entity because it participates in another relation that has already been added, or because of an error in the gold standard (i.e. it was annotated twice; this does happen).

TODO: Currently this reader does not normalize the roles of the arguments across different corpora. It simply adds to the CAS whatever is in the data. However, the roles were not consistently annotated across corpora (e.g. Sharp and Share assign different roles to the modifiers and entity mentions that participate in the degree_of relation). This issue needs to be addressed so that models can be trained on data coming from different sources.
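The filtering rules described above can be sketched in plain Java, without any UIMA or Knowtator dependency. This is only an illustration of the described behaviour, not the reader's actual implementation: the class `GoldRelationDedupSketch`, the `Relation` and `Result` types, the `filter` method, and the `extractableIds` set are all hypothetical names introduced here. The sketch also assumes that a pair is identified by its ordered mention ids (argument 1, argument 2); whether the real reader treats a reversed pair as the same pair is not stated in this documentation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GoldRelationDedupSketch {

    /** Hypothetical stand-in for one gold relation between two mention ids. */
    public static final class Relation {
        public final String type;
        public final String arg1Id;
        public final String arg2Id;

        public Relation(String type, String arg1Id, String arg2Id) {
            this.type = type;
            this.arg1Id = arg1Id;
            this.arg2Id = arg2Id;
        }
    }

    /** Entities and relations that would end up in the CAS. */
    public static final class Result {
        public final Set<String> entityIds = new LinkedHashSet<String>();
        public final List<Relation> relations = new ArrayList<Relation>();
    }

    /**
     * Applies the three rules from the class description:
     * skip relations with an argument that cannot be extracted,
     * keep at most one relation per pair of mention ids,
     * and record each entity only once.
     */
    public static Result filter(List<Relation> gold, Set<String> extractableIds) {
        Result result = new Result();
        Set<String> seenPairs = new HashSet<String>();
        for (Relation r : gold) {
            // Rule 1: both arguments must be extractable, otherwise skip.
            if (!extractableIds.contains(r.arg1Id) || !extractableIds.contains(r.arg2Id)) {
                continue;
            }
            // Rule 2: only the first relation for an (arg1, arg2) pair is kept.
            // Assumption: the pair key is ordered.
            String pairKey = r.arg1Id + '\u0000' + r.arg2Id;
            if (!seenPairs.add(pairKey)) {
                continue;
            }
            // Rule 3: the set guarantees each entity id is stored once.
            result.entityIds.add(r.arg1Id);
            result.entityIds.add(r.arg2Id);
            result.relations.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Relation> gold = new ArrayList<Relation>();
        gold.add(new Relation("location_of", "e1", "e2"));
        gold.add(new Relation("degree_of", "e1", "e2"));   // same pair, dropped
        gold.add(new Relation("location_of", "e1", "e3")); // e3 not extractable, dropped
        gold.add(new Relation("location_of", "e2", "e4"));

        Set<String> extractable = new HashSet<String>();
        extractable.add("e1");
        extractable.add("e2");
        extractable.add("e4");

        Result result = filter(gold, extractable);
        System.out.println("relations kept: " + result.relations.size());
        System.out.println("entities kept: " + result.entityIds);
    }
}
```

Keeping the pair key and the entity ids in sets makes both "already added" checks constant-time lookups, which mirrors the description's guarantee that duplicates in the gold standard do not produce duplicate annotations.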

Author:
Dmitriy Dligach

Field Summary
 int identifiedAnnotationId
           
static java.io.File inputDirectory
           
static java.lang.String PARAM_INPUTDIR
           
 int relationArgumentId
           
 int relationId
           
 
Constructor Summary
GoldEntityAndRelationReader()
           
 
Method Summary
 void initialize(org.apache.uima.UimaContext aContext)
           
 void process(org.apache.uima.jcas.JCas jCas)
           
 
Methods inherited from class org.uimafit.component.JCasAnnotator_ImplBase
getLogger
 
Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
getRequiredCasInterface, process
 
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase
getCasInstancesRequired, hasNext, next
 
Methods inherited from class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
batchProcessComplete, collectionProcessComplete, destroy, getContext, getResultSpecification, reconfigure, setResultSpecification
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARAM_INPUTDIR

public static final java.lang.String PARAM_INPUTDIR
See Also:
Constant Field Values

inputDirectory

public static java.io.File inputDirectory

identifiedAnnotationId

public int identifiedAnnotationId

relationId

public int relationId

relationArgumentId

public int relationArgumentId

Constructor Detail

GoldEntityAndRelationReader

public GoldEntityAndRelationReader()

Method Detail

initialize

public void initialize(org.apache.uima.UimaContext aContext)
                throws org.apache.uima.resource.ResourceInitializationException
Specified by:
initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides:
initialize in class org.uimafit.component.JCasAnnotator_ImplBase
Throws:
org.apache.uima.resource.ResourceInitializationException

process

public void process(org.apache.uima.jcas.JCas jCas)
             throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by:
process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws:
org.apache.uima.analysis_engine.AnalysisEngineProcessException