Title: Execution Metadata

The execution metadata hold detailed information about an ongoing or completed enhancement process. They describe how the [ExecutionPlan](chains/executionplan.html) provided by the [Chain](chains) was executed by the [EnhancementJobManager](enhancementjobmanager.html). Both the ExecutionMetadata and the ExecutionPlan are provided with the ContentItem as a separate content part of the type MGraph with the URI "http://stanbol.apache.org/ontology/enhancer/executionmetadata#ChainExecution". For users of the Stanbol Enhancer the Execution Metadata are useful to:

* Check the progress of asynchronously started enhancement processes: Metadata for all planned engine executions are created as soon as a ContentItem is passed to the EnhancementJobManager and are updated whenever the execution of an engine starts, completes or fails.
* Monitor the performance of different EnhancementEngines: The Execution Metadata provide the start and completion time points of every engine execution.
* Inspect the enhancement process: Check whether optional EnhancementEngines were successfully executed, skipped or failed; validate the configured EnhancementChain by checking the actual execution order of the EnhancementEngines.


## Execution Metadata Ontology

The RDFS schema used for the execution metadata is defined as follows:

![Execution Metadata](executionmetadata.png "Overview of the Execution Metadata Ontology")

 * Namespace: em : http://stanbol.apache.org/ontology/enhancer/executionmetadata#
 * __em:Execution__ : Super class for all Executions
     * __em:executionPart__ (domain: em:Execution; range: em:ChainExecution): Defines that this execution was part of the execution of a chain
     * __em:status__ (domain: em:Execution; range: em:ExecutionStatus): The status of an execution (used for both em:EngineExecution and em:ChainExecution)
     * __em:started__ (domain: em:Execution; range: xsd:dateTime): Marks the start of the execution
     * __em:completed__ (domain: em:Execution; range: xsd:dateTime): Marks the completion of the execution
     * __em:statusMessage__ (domain: em:Execution; range: xsd:string): A natural language description providing further information about the status of this execution. Typically used to pass on error messages if the execution fails (em:status is set to em:StatusFailed).
 * __em:ChainExecution__ : Class used to describe the execution of an enhancement chain.
     * __em:defaultChain__ (domain: em:ChainExecution; range: xsd:boolean): Whether the executed chain is currently the default Chain of the Stanbol Enhancer.
     * __em:executionPlan__ (domain: em:ChainExecution; range: ep:ExecutionPlan): Links to the execution plan as provided by the chain.
     * __em:enhances__ (domain: em:ChainExecution; range: rdf:Resource): Links the em:ChainExecution with the URI of the processed content item. The range needs to be updated as soon as the Stanbol Enhancement Structure is defined.
     * __em:enhancedBy__ (domain: rdf:Resource; range: em:ChainExecution): Links the URI of the content item with the metadata about the enhancement process. The domain needs to be updated as soon as the Stanbol Enhancement Structure is defined.
 * __em:EngineExecution__ : Class used to describe the execution of an EnhancementEngine.
     * __em:executionNode__ (domain: em:EngineExecution; range: ep:ExecutionNode): The node within the ExecutionPlan
 * __em:ExecutionStatus__ : Class describing the status of an EngineExecution
     * __em:StatusScheduled__ : ExecutionStatus instance describing that an execution is scheduled but has not yet started
     * __em:StatusInProgress__ : ExecutionStatus instance describing that the execution of the linked EngineExecution is in progress
     * __em:StatusCompleted__ : ExecutionStatus instance describing that the execution has already completed successfully
     * __em:StatusFailed__ : ExecutionStatus indicating that the execution has failed. Typically an em:statusMessage describing the reason for the failed execution is provided for em:Executions with this state.
     * __em:StatusSkipped__ : ExecutionStatus indicating that the execution of an ep:ExecutionNode was skipped. This is only allowed for execution nodes that are marked as optional. Typically also an em:statusMessage with the reason should be provided.
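
Code that inspects execution metadata often needs to map status URIs to something easier to switch on. The following is a minimal sketch in plain Java of how the status instances listed above could be mirrored as an enum. The class `EmStatusSketch` and its methods are illustrative assumptions and not part of the Stanbol API; only the namespace and the local names of the status instances are taken from the ontology.

```java
import java.util.Optional;

/** Hypothetical helper (not Stanbol API) mirroring the em:ExecutionStatus instances. */
final class EmStatusSketch {

    /** Namespace of the Execution Metadata ontology as listed above. */
    static final String EM_NS =
            "http://stanbol.apache.org/ontology/enhancer/executionmetadata#";

    enum Status {
        SCHEDULED("StatusScheduled"),
        IN_PROGRESS("StatusInProgress"),
        COMPLETED("StatusCompleted"),
        FAILED("StatusFailed"),
        SKIPPED("StatusSkipped");

        private final String localName;

        Status(String localName) {
            this.localName = localName;
        }

        /** Full URI of the status instance (namespace + local name). */
        String uri() {
            return EM_NS + localName;
        }

        /** True if the execution has ended (completed, failed or skipped). */
        boolean isFinished() {
            return this == COMPLETED || this == FAILED || this == SKIPPED;
        }
    }

    /** Resolves a status URI to the enum constant; empty for unknown URIs. */
    static Optional<Status> fromUri(String uri) {
        for (Status s : Status.values()) {
            if (s.uri().equals(uri)) {
                return Optional.of(s);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Status failed = fromUri(EM_NS + "StatusFailed").orElseThrow();
        System.out.println(failed + " finished=" + failed.isFinished());
    }
}
```

Returning an `Optional` for unknown URIs keeps the sketch tolerant of status values that might be added to the ontology later.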


### Example

The following example uses the same properties as used within the [ExecutionPlan](chains/executionplan.html) section. To make it easier to see the relations between the execution metadata and the execution plan, the triples of the execution plan are included at the end of this example.

This example describes the following situation:

* the execution of the content item with the URI 'urn:contentItem1' with the default chain
* the default chain is a chain with the name "demoChain"; its ExecutionPlan has the URI 'urn:execPlan'
* the successful execution of the 'langid' engine (execution: 'urn:exec1', node: 'urn:node1')
* the failed execution of the 'ner' engine (execution: 'urn:exec2', node: 'urn:node2'): As reason for the failure a message is provided that the NER model for the language 'de' is not available
* the successful execution of the 'zemanta' engine (execution: 'urn:exec3', node: 'urn:node5'): This engine was started in parallel to the 'ner' engine - therefore before the chain failed.
* There is no execution of the dbpediaLinking (node: '') and geonamesLinking (node: '') engines because the chain failed before these engines were scheduled. This assumes that the EnhancementJobManager only adds em:EngineExecution resources when it starts processing an ep:ExecutionNode defined in the execution plan. However, the EnhancementJobManager may also create em:EngineExecution resources for all execution nodes up front. In that case there would also be em:EngineExecution resources for the dbpediaLinking and geonamesLinking engines with the em:status set to 'em:StatusScheduled'.

The RDF graph with the Execution Metadata:

    :::text
    urn:exec
        rdf:type em:ChainExecution
        em:executionPlan urn:execPlan
        em:enhances urn:contentItem1
        em:defaultChain "true"
        em:started 2012-01-11T12.13.14.156
        em:completed 2012-01-11T12.13.15.157
        em:status em:StatusFailed
        em:statusMessage "Unable to execute EnhancementEngine 'ner' \
            (Message: No NER model for language 'de' is available)."
        em:executionPart urn:exec1, urn:exec2, urn:exec3, urn:exec4, urn:exec5

    urn:exec1
        rdf:type em:EngineExecution
        em:executionPart urn:exec
        em:executionNode urn:node1
        em:status em:StatusCompleted
        em:started 2012-01-11T12.13.14.160
        em:completed 2012-01-11T12.13.14.250

    urn:exec2
        rdf:type em:EngineExecution
        em:executionPart urn:exec
        em:executionNode urn:node2
        em:status em:StatusFailed
        em:statusMessage "No NER model for language 'de' is available"
        em:started 2012-01-11T12.13.14.253
        em:completed 2012-01-11T12.13.14.289

    urn:exec3
        rdf:type em:EngineExecution
        em:executionPart urn:exec
        em:executionNode urn:node5
        em:status em:StatusCompleted
        em:started 2012-01-11T12.13.14.253
        em:completed 2012-01-11T12.13.15.150

The Execution Plan: (copy from the example provided in the ExecutionPlan section)
    
    :::text
    urn:execPlan
        rdf:type ep:ExecutionPlan
        ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4, urn:node5
        ep:chain "demoChain"

    urn:node1
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:engine langid

    urn:node2
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node1
        ep:engine ner

    urn:node3
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node1
        ep:engine dbpediaLinking

    urn:node4
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:dependsOn urn:node1
        ep:engine geonamesLinking

    urn:node5
        rdf:type ep:ExecutionNode
        ep:inExecutionPlan urn:execPlan
        ep:engine zemanta
        ep:optional "true"^^xsd:boolean
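
The em:started and em:completed values in the execution metadata example can be used to measure how long each execution took. The sketch below computes these durations for the values shown in the example. It is plain Java with a hypothetical class name, and it assumes the dot-separated timestamp notation used in this example; real metadata carry xsd:dateTime literals, which would need a different parser.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: durations for the em:started / em:completed values of the example. */
final class ExecutionDurationSketch {

    // Assumption: matches the simplified notation of this example, not xsd:dateTime.
    private static final DateTimeFormatter EXAMPLE_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH.mm.ss.SSS");

    /** Milliseconds between the started and the completed time point. */
    static long durationMillis(String started, String completed) {
        LocalDateTime start = LocalDateTime.parse(started, EXAMPLE_FORMAT);
        LocalDateTime end = LocalDateTime.parse(completed, EXAMPLE_FORMAT);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        // Values copied from the example graph above.
        System.out.println("urn:exec  (chain):   "
                + durationMillis("2012-01-11T12.13.14.156", "2012-01-11T12.13.15.157") + " ms");
        System.out.println("urn:exec1 (langid):  "
                + durationMillis("2012-01-11T12.13.14.160", "2012-01-11T12.13.14.250") + " ms");
        System.out.println("urn:exec2 (ner):     "
                + durationMillis("2012-01-11T12.13.14.253", "2012-01-11T12.13.14.289") + " ms");
        System.out.println("urn:exec3 (zemanta): "
                + durationMillis("2012-01-11T12.13.14.253", "2012-01-11T12.13.15.150") + " ms");
    }
}
```

For the example this yields 1001 ms for the chain and 90 ms, 36 ms and 897 ms for the three engine executions, which shows that the 'zemanta' engine accounts for most of the processing time.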


## Creation/Management of Execution Metadata

This section is primarily intended for implementors of an EnhancementJobManager. However, it may also be useful for users who need to monitor the state of enhancement processes, as it describes which information is added to the Execution Metadata and when.

When the [EnhancementJobManager](enhancementjobmanager.html) starts the enhancement of a ContentItem, it needs to check if the [ContentItem](contentitem.html) already contains ExecutionMetadata in the ContentPart with the URI "http://stanbol.apache.org/ontology/enhancer/executionmetadata#ChainExecution". If this is the case, it needs to initialize itself based on the pre-existing information. If no ExecutionMetadata are present, a new EnhancementProcess needs to be created based on the passed Chain. The differences between these two cases are explained in the following two subsections.

### Initialization

If no ExecutionMetadata are present within the passed ContentItem, a new EnhancementProcess needs to be set up. This includes the following steps:

1. Get the [ExecutionPlan](chains/executionplan.html) for the passed enhancement [Chain](chains). If no chain is passed, the default chain needs to be acquired by using the [ChainManager](chains/chainmanager.html).
2. Create the content part for the ExecutionMetadata within the [ContentItem](contentitem.html) and add the information of the [ExecutionPlan](chains/executionplan.html) to it.
3. Create the initial ExecutionMetadata. This includes the 'em:ChainExecution' instance for the 'ep:ExecutionPlan' as well as 'em:EngineExecution' instances for all 'ep:ExecutionNode's defined by the execution plan. All such 'em:Execution' instances MUST be created with the 'em:ExecutionStatus' 'em:StatusScheduled'.

The ExecutionMetadataHelper utility of the "org.apache.stanbol.enhancer.servicesapi" module contains utility methods for initializing execution metadata.
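
To illustrate step 3, the following sketch builds the initial state in a plain in-memory model: one chain execution plus one engine execution per execution node, all of them scheduled. The class `InitMetadataSketch` and its members are hypothetical and only model the idea; a real EnhancementJobManager would write the corresponding triples to the MGraph, for example via the ExecutionMetadataHelper.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the initialization; not the Stanbol helper API. */
final class InitMetadataSketch {

    enum Status { SCHEDULED, IN_PROGRESS, COMPLETED, FAILED, SKIPPED }

    /** Stand-in for an em:ChainExecution and its em:EngineExecution instances. */
    static final class ChainExecution {
        final String enhances;       // em:enhances (content item URI)
        final String executionPlan;  // em:executionPlan (plan URI)
        Status status = Status.SCHEDULED;
        // Status of the engine execution created for each ep:ExecutionNode URI.
        final Map<String, Status> engineStatus = new LinkedHashMap<>();

        ChainExecution(String enhances, String executionPlan) {
            this.enhances = enhances;
            this.executionPlan = executionPlan;
        }
    }

    /** Creates the chain execution and a scheduled engine execution for every node. */
    static ChainExecution init(String contentItem, String executionPlan, List<String> nodes) {
        ChainExecution ce = new ChainExecution(contentItem, executionPlan);
        for (String node : nodes) {
            ce.engineStatus.put(node, Status.SCHEDULED);
        }
        return ce;
    }

    public static void main(String[] args) {
        ChainExecution ce = init("urn:contentItem1", "urn:execPlan",
                List.of("urn:node1", "urn:node2", "urn:node3", "urn:node4", "urn:node5"));
        System.out.println("chain -> " + ce.status);
        ce.engineStatus.forEach((node, status) -> System.out.println(node + " -> " + status));
    }
}
```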

### Continuation

If the passed ContentItem already contains ExecutionMetadata in the content part with the URI "http://stanbol.apache.org/ontology/enhancer/executionmetadata#ChainExecution", the EnhancementJobManager MUST perform the following steps to continue an EnhancementProcess.

1. Check if the contained ExecutionMetadata are valid. This is the case if
    * an 'em:ChainExecution' node is present that 'em:enhances' the passed ContentItem, and
    * the ExecutionPlan is included and the value of the 'ep:chain' property of the 'ep:ExecutionPlan' resource corresponds to the name of the Chain passed in the request.
2. Check the status of all 'em:Execution' instances
    * reset the status of 'em:Execution's that are in-progress to scheduled.
    * TODO: here we could also retry the execution of failed 'em:Execution's

Note that with a continuation the ExecutionPlan MUST NOT be updated. It also MUST NOT be checked whether a Chain with the name stored in the ExecutionMetadata is still present. Note also that configuration changes of EnhancementEngines will affect the continuation of the enhancement process.

The ExecutionMetadataHelper utility of the "org.apache.stanbol.enhancer.servicesapi" module contains utility methods for reading and validating pre-existing execution metadata.
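
The two continuation steps can be sketched as follows. This is a simplified, hypothetical model in plain Java (the class `ContinuationSketch` and its method names are assumptions); it takes the already extracted values as strings instead of reading them from the RDF graph.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the continuation checks; not the Stanbol helper API. */
final class ContinuationSketch {

    enum Status { SCHEDULED, IN_PROGRESS, COMPLETED, FAILED, SKIPPED }

    /**
     * Step 1: the metadata are valid if the chain execution enhances the passed
     * content item and the stored ep:chain value equals the requested chain name.
     */
    static boolean isValid(String enhances, String planChainName,
                           String contentItemUri, String requestedChain) {
        return enhances != null && enhances.equals(contentItemUri)
                && planChainName != null && planChainName.equals(requestedChain);
    }

    /** Step 2: resets in-progress executions to scheduled; returns how many were reset. */
    static int resetInProgress(Map<String, Status> statusByExecution) {
        int reset = 0;
        for (Map.Entry<String, Status> entry : statusByExecution.entrySet()) {
            if (entry.getValue() == Status.IN_PROGRESS) {
                entry.setValue(Status.SCHEDULED);
                reset++;
            }
        }
        return reset;
    }

    public static void main(String[] args) {
        Map<String, Status> statuses = new LinkedHashMap<>();
        statuses.put("urn:exec1", Status.COMPLETED);
        statuses.put("urn:exec2", Status.IN_PROGRESS);
        if (isValid("urn:contentItem1", "demoChain", "urn:contentItem1", "demoChain")) {
            int reset = resetInProgress(statuses);
            System.out.println("reset " + reset + " execution(s): " + statuses);
        }
    }
}
```

Completed, failed and skipped executions are left untouched, which matches the steps above: only interrupted executions are scheduled again.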

### Execution State Management

The EnhancementJobManager needs to update the Execution Metadata as follows when one of these events occurs:

* Enhancement process starts
    * set the 'em:status' of the 'em:ChainExecution' to 'em:StatusInProgress'
    * set the 'em:started' to the current date time
* EnhancementEngine execution starts:
    * set the 'em:status' of the 'em:EngineExecution' to 'em:StatusInProgress'
    * set the 'em:started' to the current date time
* EnhancementEngine completes
    * set the 'em:status' of the 'em:EngineExecution' to 'em:StatusCompleted'
    * set the 'em:completed' to the current date time
* Optional EnhancementEngine not available
    * set the 'em:status' of the 'em:EngineExecution' to 'em:StatusSkipped'
    * set both 'em:started' and 'em:completed' to the current date time
* Optional EnhancementEngine failed
    * set the 'em:status' of the 'em:EngineExecution' to 'em:StatusFailed'
    * set the 'em:completed' to the current date time
* Required EnhancementEngine failed or not available
    * set the 'em:status' of the 'em:EngineExecution' to 'em:StatusFailed'
    * set the 'em:status' of the 'em:ChainExecution' to 'em:StatusFailed'
    * set the 'em:completed' of both the engine and the chain execution to the current date time
* Enhancement process completes
    * set the 'em:status' of the 'em:ChainExecution' to 'em:StatusCompleted'
    * set the 'em:completed' to the current date time
* Internal error in the EnhancementJobManager implementation
    * set the 'em:status' of the 'em:ChainExecution' to 'em:StatusFailed'
    * do not set any 'em:EngineExecution' to failed.
    * set the 'em:completed' value of the 'em:ChainExecution' to the current date time

The ExecutionMetadataHelper utility of the "org.apache.stanbol.enhancer.servicesapi" module contains utility methods to perform state transitions on 'em:Execution' instances.
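
The transitions listed above can be summarized in a small in-memory sketch. The class `ExecutionStateSketch` and its methods are hypothetical and do not reflect the signatures of the ExecutionMetadataHelper; they only show which fields change for which event, including the rule that a failing required engine also fails the chain.

```java
import java.time.Instant;

/** Hypothetical in-memory sketch of the state transitions; not Stanbol API. */
final class ExecutionStateSketch {

    enum Status { SCHEDULED, IN_PROGRESS, COMPLETED, FAILED, SKIPPED }

    /** Minimal stand-in for an em:Execution resource (chain or engine). */
    static final class Execution {
        Status status = Status.SCHEDULED;
        Instant started;       // em:started
        Instant completed;     // em:completed
        String statusMessage;  // em:statusMessage
    }

    /** Enhancement process or engine execution starts. */
    static void start(Execution ex, Instant now) {
        ex.status = Status.IN_PROGRESS;
        ex.started = now;
    }

    /** Enhancement process or engine execution completes successfully. */
    static void complete(Execution ex, Instant now) {
        ex.status = Status.COMPLETED;
        ex.completed = now;
    }

    /** Optional engine not available: started and completed get the same time. */
    static void skip(Execution engine, Instant now, String reason) {
        engine.status = Status.SKIPPED;
        engine.started = now;
        engine.completed = now;
        engine.statusMessage = reason;
    }

    /** Fails one execution; applied to the chain alone for internal JobManager errors. */
    static void fail(Execution ex, Instant now, String message) {
        ex.status = Status.FAILED;
        ex.completed = now;
        ex.statusMessage = message;
    }

    /** Engine failure: a required engine also fails the chain, an optional one does not. */
    static void engineFailed(Execution chain, Execution engine, boolean optional,
                             Instant now, String message) {
        fail(engine, now, message);
        if (!optional) {
            fail(chain, now, message);
        }
    }

    public static void main(String[] args) {
        // Replays the situation of the example: langid completes, the required ner engine fails.
        Execution chain = new Execution();
        Execution langid = new Execution();
        Execution ner = new Execution();
        Instant t0 = Instant.now();
        start(chain, t0);
        start(langid, t0);
        complete(langid, t0.plusMillis(90));
        start(ner, t0.plusMillis(93));
        engineFailed(chain, ner, false, t0.plusMillis(129),
                "No NER model for language 'de' is available");
        System.out.println("langid=" + langid.status + " ner=" + ner.status
                + " chain=" + chain.status);
    }
}
```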

## Using ExecutionMetadata

This section provides some examples on how to access and retrieve information from the ExecutionMetadata.

### Accessing ExecutionMetadata

The ExecutionMetadata and the [ExecutionPlan](chains/executionplan.html) are stored in a content part of the [ContentItem](contentitem.html) with the URI "http://stanbol.apache.org/ontology/enhancer/executionmetadata#ChainExecution". The following code segment can be used to retrieve the RDF graph with the ExecutionMetadata:

    :::java
    ContentItem ci; //the ContentItem
    //the URI is available as constant of the ExecutionMetadata class
    UriRef contentPartURI = ExecutionMetadata.CHAIN_EXECUTION;
    
    MGraph executionMetadata = ci.getPart(contentPartURI,MGraph.class);

The ExecutionMetadata are stored as a readable and writable RDF graph. To pass a read-only version to other components one can use the "getGraph()" method defined by MGraph.

### Getting details about the em:ChainExecution 

The following code segments show how to access information about the execution of the enhancement process for a [ContentItem](contentitem.html). All directly accessed methods in the examples below are static imports from one of the following utility classes, which are part of the "org.apache.stanbol.enhancer.servicesapi" module.

* ExecutionPlanHelper: Utility class that provides methods for reading and creating [ExecutionPlan](chains/executionplan.html).
* ExecutionMetadataHelper: Utility class for reading and manipulating the ExecutionMetadata
* EnhancementEngineHelper: Utility that contains general purpose RDF utilities.

This code example first gets the ChainExecution, ExecutionPlan and Chain name for the enhanced content item. In a second step metadata of all executed EnhancementEngines are retrieved.

    :::java
    ContentItem ci; //the ContentItem
    MGraph em; //the ExecutionMetadata
    
    //get the ChainExecution, ExecutionPlan and the name of the Chain
    NonLiteral ce = getChainExecution(em,ci.getUri());
    if(ce != null){
        NonLiteral ep = getExecutionPlan(em,ce);
        String chainName = getString(em,ep,ExecutionPlan.CHAIN);
        //get the EngineExecutions and the name of the Engines
        Set<NonLiteral> executions = getExecutions(em,ce);
        for(NonLiteral ex : executions){
            NonLiteral en = getExecutionNode(em,ex);
            if(en != null){
                String engineName = getEngine(em,en);
                boolean optional = isOptional(em,en);
            } else { //maybe a sub-chain execution
                //currently not supported, but might be
                //added in future versions
            }
            UriRef status = getStatus(em,ex);
            Date started = getStarted(em,ex);
            Date completed = getCompleted(em,ex);
        }
    } else {
        //without a ChainExecution there are no EngineExecutions to read
        log.warn("ExecutionMetadata do not contain information for "
            + "ContentItem {}!",ci.getUri());
    }