Jena OWL Syntax Checker HowTo

This file does not yet document how to:

Use the command-line checker with an OntDocumentManager.
Use an OntDocumentManager with the API.
Check N3 or N-TRIPLE input from the command line.

Use the --help option for an up-to-date synopsis of the command-line options.

Introduction
Command Line Tool
Common Java interface
Using with Jena Models, OntModels, and Graphs
- Sample code
Streaming Checker
Future Plans

Introduction

An OWL Syntax Checker takes a file, Model, or Graph and determines whether it conforms to OWL Lite syntax, OWL DL syntax, or neither.

The basic usage returns a single word: one of "Lite", "DL", or "Full".

Error messages can indicate why the input is not in a lower level (for example, why a document is in DL rather than Lite).
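
The three result words form an ordering: every OWL Lite document is also OWL DL, and every OWL DL document is also OWL Full. As a minimal sketch of how calling code might compare the reported word against the highest level it is willing to accept, the following enum can be used. The names SubLanguage, parse, and isWithin are hypothetical helpers for illustration and are not part of Jena.

```java
// Hypothetical helper, not part of Jena: models the ordering Lite < DL < Full.
enum SubLanguage {
    LITE("Lite"), DL("DL"), FULL("Full");

    private final String word;

    SubLanguage(String word) { this.word = word; }

    /** Parses the one-word result reported by a syntax check. */
    static SubLanguage parse(String word) {
        for (SubLanguage s : values()) {
            if (s.word.equals(word)) return s;
        }
        throw new IllegalArgumentException("Unknown sublanguage: " + word);
    }

    /** True if this level is no higher than the highest level allowed. */
    boolean isWithin(SubLanguage highestAllowed) {
        return compareTo(highestAllowed) <= 0;
    }
}

class SubLanguageDemo {
    public static void main(String[] args) {
        SubLanguage result = SubLanguage.parse("DL");
        System.out.println(result.isWithin(SubLanguage.LITE)); // false
        System.out.println(result.isWithin(SubLanguage.FULL)); // true
    }
}
```

When isWithin returns false, the checker's error messages are the place to look for the constructs that pushed the document into the higher level.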

The Jena OWL syntax checker implements the OWL Syntax Checker defined by the OWL Test Cases Recommendation.

Tests have shown that it is much faster and smaller than other syntax checkers written in Java. It is believed to be the fastest OWL Syntax Checker. However, it does not produce an OWL abstract syntax tree, and the error messages are not yet adequately clear.

There are three different ways of using the Jena OWL Syntax Checker:

From the command line, using the class jena.owlsyntax.
Checking Jena Models and Graphs which have already been read in, using com.hp.hpl.jena.ontology.tidy.Checker.
As a standalone Java library, for non-Jena applications requiring optimized memory usage, using com.hp.hpl.jena.ontology.tidy.StreamingChecker.

Most users should start with the first option, and possibly migrate to the second. The third option is only useful for specialized applications that need to make the best possible use of memory. (Some tests indicate that the Checker class is slightly quicker than the StreamingChecker class.)

The Jena Ontology Models and the Jena reasoners do not require any specific syntactic conformance from the documents they work with, and do not require the use of the syntax checker. However, restricting your ontology files to OWL Lite or OWL DL may catch many typos and silly mistakes, and may improve the style of your ontologies. Moreover, greater interoperability with other tools is likely if you restrict yourself to OWL Lite or OWL DL.

Command Line Tool

Invoke Java with the normal Jena classpath, the jena.owlsyntax class, and the following arguments:

    jena.owlsyntax [--lite|--quiet|--dl] [--big|--lang [N-TRIPLE|N3]] [file1] [file2]
    jena.owlsyntax --help
    jena.owlsyntax [--textui] --test [ManifestURL]

The first form reports "Lite", "DL", or "Full". If two files are specified, both are checked, and the vocabulary used by the two files taken together must be separated. If no files are specified, standard input is used; in this case, relative URIs and rdf:ID values are resolved against <urn:x-jena:syntaxchecker>.

-l, --lite: Give error messages for OWL DL or OWL Full constructions.
-d, --dl: (default) Give error messages for OWL Full constructions.
-q, --quiet: No error messages.
-s, --short: Give short error messages. (The default is long messages for OWL Full constructs only.)
-b, --big: Input file is big: optimize memory usage. The quality of long error messages suffers.
-L, --lang: Specify N3 or N-TRIPLE input. Note that the memory optimizations (--big) are not implemented for this input.
-m, --manager: Specify OWL Document Manager.
--test: Run a test suite. The default is the latest OWL Test publication. A URL of file:testing/wg/OWLManifest.rdf uses a local copy.
--textui: Use the junit.textui instead of the swingui.
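
As a rough sketch of how the first form of the command might be assembled from another Java program, the helper below builds the argument list for a child JVM. The class name OwlSyntaxCommand, the classpath "lib/*", and the file name "ontology.owl" are placeholders chosen for illustration; only the jena.owlsyntax class name and the --lite flag come from the synopsis above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles "java -cp <classpath> jena.owlsyntax <flag> <files...>".
class OwlSyntaxCommand {

    static List<String> build(String classpath, String levelFlag, String... files) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classpath);          // must contain the normal Jena classpath
        cmd.add("jena.owlsyntax");   // the command-line checker class
        cmd.add(levelFlag);          // e.g. --lite, --dl, or --quiet
        cmd.addAll(Arrays.asList(files));
        return cmd;
    }

    public static void main(String[] args) {
        // "lib/*" and "ontology.owl" are placeholder values.
        List<String> cmd = build("lib/*", "--lite", "ontology.owl");
        System.out.println(String.join(" ", cmd));
        // To actually run the checker (requires Jena on the classpath):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Run as written, the sketch only prints the command line it would execute; the ProcessBuilder call is left commented out because it depends on a local Jena installation.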

Common Java interface

The API contains two implementation classes, Checker and StreamingChecker. Both implement the same interface, CheckerResults, for reporting the results of a syntax check. This interface provides methods to get the one-word result and any error messages. Some errors are characterised by a small subgraph that exhibits the error. This subgraph can be accessed using the SyntaxProblem.problemSubGraph() method.

Using with Jena Models, OntModels, and Graphs

Most Jena users wishing to use the syntax checker should use the Checker class. Refer to the Javadoc for usage details.

One issue to understand is the processing of owl:imports. Some Jena Models and Graphs have already processed owl:imports (e.g. a default OntModel); others have not (e.g. a default Model). For ease of use, the principal methods in the Checker class, add(Model) and add(Graph), inspect their argument to decide whether it has already had its imports processed. This inspection is heuristic, but should work with Models created using the ModelFactory. However, for production code with custom Models or Graphs you should use either addGraphAndImports or addRaw, depending on whether the custom graph is already imports closed.

If the Checker does imports processing it uses a private OntDocumentManager constructed from the default profile (as shipped, this does not redirect any URLs). If you wish to use a custom OntDocumentManager to redirect some URLs, then construct an OntModel, and then add it to the Checker.

Sample code

    import java.io.Writer;
    import java.util.Iterator;

    import com.hp.hpl.jena.ontology.tidy.Checker;
    import com.hp.hpl.jena.ontology.tidy.SyntaxProblem;
    import com.hp.hpl.jena.rdf.model.Model;

    // Set this boolean to true if error messages should
    // indicate why the graph is not in Lite.
    boolean expectingLite = false;

    // m can be an OntModel.
    Model m = ...;
    Writer w; // for error messages.

    // Get a syntax checker.
    Checker chk = new Checker(expectingLite);

    // Add one or more models or graphs.
    chk.add(m);
    ...

    // Get the result.
    String subLang = chk.getSubLanguage();
    ...

    // If we do not like the answer we can
    // get error messages.
    if (!subLang.equals("Lite")) {
      // No explanation is offered for why something
      // is in a lower level than expected, only why it
      // is in a higher level than expected.
      Iterator it = chk.getProblems();
      while (it.hasNext()) {
        SyntaxProblem sp = (SyntaxProblem) it.next();
        String s = sp.longDescription();

        w.write(s);
        w.write("\n");
      }
    }

Streaming Checker

The StreamingChecker class is an alternative for users wishing to only use the OWL Syntax Checker, without using other Jena functionality. It is optimised for memory usage, and is very slightly slower than the Checker class. It is a streaming mode parser in that it does not remember all the triples from start to finish but intelligently discards those which have been fully analysed.
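
To make the idea of discarding fully analysed triples concrete, here is a toy sketch. It is not how StreamingChecker is implemented; the class StreamingSketch and the rule it enforces (every subject must eventually receive an rdf:type triple) are assumptions invented purely for illustration. It shows only the general pattern: retain a triple while a question about it is still open, and drop it as soon as that question is settled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy illustration only; this is NOT how StreamingChecker is implemented.
class StreamingSketch {
    // Subjects for which an rdf:type triple has already been seen.
    private final Set<String> typed = new HashSet<>();
    // Triples retained because their subject has no rdf:type yet.
    private final Map<String, List<String>> pending = new TreeMap<>();

    void accept(String subject, String predicate, String object) {
        if (predicate.equals("rdf:type")) {
            typed.add(subject);
            pending.remove(subject);   // fully analysed: discard retained triples
        } else if (!typed.contains(subject)) {
            pending.computeIfAbsent(subject, k -> new ArrayList<>())
                   .add(subject + " " + predicate + " " + object);
        }
        // Triples about an already typed subject are never stored.
    }

    /** Number of triples currently held in memory. */
    int retained() {
        int n = 0;
        for (List<String> triples : pending.values()) n += triples.size();
        return n;
    }

    /** Subjects still waiting for an rdf:type triple. */
    Set<String> unresolvedSubjects() { return pending.keySet(); }

    public static void main(String[] args) {
        StreamingSketch s = new StreamingSketch();
        s.accept("ex:a", "ex:p", "ex:b");
        System.out.println(s.retained());            // 1
        s.accept("ex:a", "rdf:type", "owl:Thing");
        System.out.println(s.retained());            // 0
        s.accept("ex:c", "ex:q", "ex:a");
        System.out.println(s.unresolvedSubjects());  // [ex:c]
    }
}
```

In this sketch the memory held at any moment depends on how many triples are still unresolved rather than on the total size of the input, which is the property the paragraph above describes.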

If there is a need to redirect the GET requests on any URLs being processed as a result of owl:imports, then the StreamingChecker can be given an appropriate OntDocumentManager when it is constructed. By default, it uses a private document manager, constructed from the default profile.

Future Plans

We have a design for "OWL Tidy" based on the syntax checker. This would take an OWL Full document and produce a similar OWL DL document. The user would have fine-grained control over whether some information may be lost (incomplete) or some information may be added (unsound).

Implementing this would also involve significant improvements in the error messages in the syntax checker.

We also have a design for implementing a syntax checker that actually builds abstract syntax trees.

Our current (May 2004) assessment is that it is not a good use of our time to code up such designs. Building abstract syntax trees is adequately addressed by Bechhofer's syntax checker. OWL Tidy would be useful if Jena users wanted to be able to interoperate with OWL DL systems. We see no such demand at the moment; this may be partly explained by the lack of widely used OWL DL systems. We are aware of some standards developers and similar groups who would like OWL Tidy functionality; however, as far as we can tell, they would require the modifications to be small modifications to the source files, not small modifications to the RDF graph. Since this more than doubles the amount of work required for OWL Tidy, it makes the cost/benefit analysis unfavourable.

Users who would like the Jena team to spend more effort in this area should say so on jena-dev. The most effective messages are likely to be those that express unhappiness with the current syntax checker's error messages, ask why some file is in OWL Full rather than OWL DL and what can be done about it, or ask how to interoperate with some named OWL DL system.