Eyeball currently can check for:
In the Eyeball distribution directory, run the Eyeball tests:
ant test
If these tests fail, something is wrong. Sometimes it's no more than a classpath problem, which you can fix. If not, use the jena-dev mailing list to request assistance. Note that any support is provided on a voluntary basis, as and when the effort is available.
If the tests have passed, you can copy lib/*.jar to whatever place you find convenient. You can then use it from the command line or from within Jena code. You will also need to copy the directories mirror and etc to somewhere convenient where the Jena FileManager class can see them.
jena.jar
and may not work
with your usual installation.)
Run the command:
java [java options eg classpath and proxy] jena.eyeball [-assume Reference*] -check specialURL+ [-config fileOrURL*] [-root rootURL] [-render Name] [-include shortName*] [-exclude shortName*]
The -whatever sections can come in any order and may be repeated, in which case the new arguments are appended to the existing ones.
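The appending rule for repeated sections can be sketched in plain Java. This is a minimal illustration only, not Eyeball's own argument parser; the class and method names (SectionArgs, parse) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: groups command-line arguments by their "-section" flag.
// It is not Eyeball's parser; it only illustrates the appending rule.
class SectionArgs {
    static Map<String, List<String>> parse(String[] args) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String arg : args) {
            if (arg.startsWith("-")) {
                // A repeated section reuses its existing list, so later values are appended.
                current = sections.computeIfAbsent(arg.substring(1), k -> new ArrayList<>());
            } else if (current != null) {
                current.add(arg);
            } else {
                throw new IllegalArgumentException("argument before any -section: " + arg);
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String[] example = { "-check", "a.rdf", "-assume", "owl", "-check", "b.rdf" };
        // prints {check=[a.rdf, b.rdf], assume=[owl]}
        System.out.println(parse(example));
    }
}
```

Both -check sections end up in one list, in the order they were written.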
The -config fileOrURL options specify the Eyeball assembler file to load. A single configuration model is constructed as the union of the contents of those files. If this option is omitted, the config file etc/eyeball-config.n3 is loaded. See loadConfigFiles for details.
The -assume References identify any assumed schemas used to identify the predicates and classes of the data model. The reference may be a file name or a URL; it is loaded by a default FileManager (and hence respects any FileManager renamings).
Eyeball automatically assumes the RDF and RDFS schemas, and the built-in XSD datatype classes. The short name owl can be used to refer to the OWL schema, dc to the Dublin Core schema, dcterms to the Dublin Core terms schema, and dc-all to both.
The specialURLs name the files or URL references containing the data to be eyeballed. If several names are given, each is checked individually.
If the URL is of the form ont:NAME:base, then the checked model is the model base treated as an OntModel with the specification OntModelSpec.NAME. If the URL (or the base) is of the form jdbc:DB:head:model, then the checked model is the one called model in the database with connection jdbc:DB:head. (The database user and password must be specified independently using the jena.db.user and jena.db.password system properties.)
If any of the data or schemas are identified by an http: URL, and you are behind a firewall, you will need to specify the proxy to Java using system properties; one way to do this is by using the Java command line options:
-DproxySet=true -DproxyHost=theProxyHostName -DproxyPort=theProxyPortNumber
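When Eyeball is driven from Java code rather than the command line, the same properties can be set programmatically before any data is fetched. A minimal sketch, assuming the property names given above; the class name ProxySetup is hypothetical and the host and port are placeholders.

```java
// Hypothetical helper: sets the proxy system properties named above from code
// instead of via -D options. Host name and port are placeholders.
class ProxySetup {
    static void useProxy(String host, int port) {
        System.setProperty("proxySet", "true");
        System.setProperty("proxyHost", host);
        System.setProperty("proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        useProxy("theProxyHostName", 8080);
        System.out.println(System.getProperty("proxyHost") + ":" + System.getProperty("proxyPort"));
    }
}
```

Java's networking documentation names the HTTP proxy properties http.proxyHost and http.proxyPort, so it is worth checking which set your JVM honours.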
The include shortNames are strings which are the eye:shortName value of some inspector cluster in the Eyeball config file; see the config file description for details. If omitted, it is as if -include defaultInspectors had been written. The -exclude option allows the shortnames of inspectors to be excluded from the checks. (For example, the type inspector currently slows things down quite a lot and might well be excluded from an initial eyeballing.)
The eyeball reports are written to the standard output; by default, the reports appear as text (RDF rendered by omitting the subjects - which are all blank nodes - and lightly prettifying the predicate and object). To change the rendering style, supply the -render option with the name of the renderer as its value. Eyeball comes with N3, XML, and text renderers; the Eyeball config file associates renderer names with their classes.
java jena.eyeball -check myDataFile.rdf
java jena.eyeball -assume dc -check http://example.com/nosuch.n3
java jena.eyeball -assume mySchema.rdf -check myData.rdf -render xml
java jena.eyeball -check myData.rdf -include defaultInspectors
Eyeball can be used from within Java code; the command line merely provides a convenient external interface.
An Eyeball object has three subcomponents: the assumptions against which the model is to be checked, the inspectors which do the checking, and the renderer used to display the reports.
The assumptions are bundled into a single OntModel. Multiple assumptions can be supplied either by adding them as sub-models or by loading their content directly into the OntModel.
The inspectors are supplied as a single Inspector object.
The method Inspector.Operations.create(List) creates a single Inspector from a list of Inspectors; this inspector delegates all its inspection methods to all of its sub-inspectors.
The renderer can be anything that implements the (simple) renderer interface.
To create an Eyeball:
Eyeball eyeball = new Eyeball( inspector, assumptions, renderer );
Models to be inspected are provided as OntModels. The problems are delivered to a Report object, where they are represented as an RDF model.
eyeball.inspect( report, ontModelToBeInspected );
The result is that same report object. The Report::model() method delivers an RDF model which describes the problems found by the inspection. The inspections supplied in the distribution use the EYE vocabulary, and are used in the standard reports:
unknown predicate | eye:unknownPredicate URI | the URI of the unknown predicate |
bad URI | eye:badURI String | the spelling of the bad URI |
illegal language code | eye:badLanguage String | the bad language code |
| eye:onLiteral String | a plain literal with the same lexical form |
bad datatype URI | eye:forReason URI | the reason URI |
| eye:onLiteral String | a plain literal with the same lexical form |
bad namespace URI | eye:onPrefix String | the prefix name with the bad namespace |
| eye:forReason URI | the reason URI |
| eye:badNamespaceURI String | the spelling of the bad URI |
Jena prefix found | eye:jenaPrefixFound String | the name of the Jena prefix |
| eye:forNamespace | the namespace the prefix is bound to |
implausible vocabulary item | eye:onResource URI | the URI of the implausible resource |
| eye:notFromSchema URI | the URI of the schema |
an undeclared class | eye:unknownClass | the resource that was presumed to be a Class |
an untyped Resource | eye:hasNoType Resource | the resource that has no rdf:type property |
a repeated prefix namespace | eye:multiplePrefixesForNamespace | the namespace resource that has multiple prefixes |
| eye:onPrefix | the prefixes that were bound to the namespace |
inconsistent types for resource | eye:noConsistentTypeFor URI | the URI of the inconsistent resource |
| eye:hasAttachedType URI | one of the given types that have no intersection |
"wrong" number of property values for some subject | eye:cardinalityFailure | the subject for which the failure was detected |
| eye:onProperty | the property P that has the wrong number of values |
| eye:onType | the cardinality-constrained type |
| eye:cardinality | a blank node [eye:min min; eye:max max] |
| eye:numValues | the number of values of P found |
| eye:values | a blank node of rdf:type eye:Set with an rdfs:member value for each of the values of P. |
ill-formed list | eye:illFormedList | the URI of the root of the list |
| eye:because | [eye:element index; has no/has multiple rdf:first/rest properties] |
a likely miswritten typed list idiom has been detected | eye:suspectListIdiom | the list type that is suspect |
suspicious restriction, ie doesn't have exactly one owl:onProperty statement and exactly one constraint. | eye:suspiciousRestriction | a blank node with the restriction properties |
| [optional, multiple] eye:forReason URI | eye:missingOnProperty -- there is no owl:onProperty property in this suspicious restriction. |
| | eye:multipleOnProperty -- there is more than one owl:onProperty in this suspicious restriction. |
| | eye:missingConstraint -- there is no owl:hasValue, owl:allValuesFrom, owl:someValuesFrom, or owl:[minC|maxC|c]ardinality property in this suspicious restriction. |
| | eye:multipleConstraint -- there are multiple constraints (as above) in this suspicious restriction. |
| [optional, multiple] eye:subClassOf | an immediate named superclass of this suspicious restriction, to help identify it. |
| [optional, multiple] eye:equivalentClass | an immediate named owl:equivalentClass of this suspicious restriction, to help identify it. |
A SPARQL query that was required to succeed did not, or a SPARQL query that was required to fail did not. | eye:sparqlRequireFailed query | the query that failed, or a designated alternative message. |
| eye:sparqlProhibitFailed query | the query that should not have succeeded, or a designated alternative message. |
Every report item in the model is a blank node with rdf:type eye:Item.
The labels for the Eyeball predicates and reason messages are defined in the Eyeball schema file etc/eyeball-schema.n3 (and are used by the text renderer):
eye:uriContainsSpaces | the URI contains unencoded spaces, probably as a result of sloppy use of file: URLs. |
eye:uriFileInappropriate | a URI used as a namespace is a file: URI, which is inappropriate as a global identifier. |
eye:uriHasNoScheme | a URI has no scheme field, probably a misused relative URI. |
eye:schemeShouldBeLowercase | the scheme part of a URI is not lower-case; while technically correct, this is not usual practice. |
eye:uriFailsPattern | a URI fails the pattern appropriate to its scheme (as defined in the configuration for this eyeball). |
eye:unrecognisedScheme | the URI scheme is unknown, perhaps a misplaced QName. |
eye:uriNoHttpAuthority | an http: URI has no authority (domain name/port) component. |
eye:uriSyntaxFailure | the URI can't be parsed using the general URI syntax, even with any spaces removed. |
eye:namespaceEndsWithNameCharacter | a namespace URI ends in a character that can appear in a name, leading to possible ambiguities. |
eye:uriHasNoLocalname | a URI has no local name according to the XML name-splitting rules. (For example, the URI http://x.com/foo#12345 has no local name because a local name cannot start with a digit.) |
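The eye:uriHasNoLocalname example can be reproduced with a simplified name splitter. The sketch below only approximates the XML name-splitting rules (it treats letters and underscore as name-start characters, and adds digits, dots and hyphens as name characters); it is not the code Eyeball or Jena uses, and the class name LocalNameCheck is hypothetical.

```java
// Hypothetical sketch of XML-style name splitting, using a simplified
// approximation of the XML name rules rather than the full specification.
class LocalNameCheck {
    private static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    private static boolean isNameChar(char c) {
        return isNameStart(c) || Character.isDigit(c) || c == '.' || c == '-';
    }

    // Returns the local name of the URI, or "" if it has none.
    static String localName(String uri) {
        int start = uri.length();
        // Walk back over the trailing run of name characters.
        while (start > 0 && isNameChar(uri.charAt(start - 1))) start--;
        // A local name must begin with a name-start character, so skip
        // any leading digits, dots or hyphens in that run.
        while (start < uri.length() && !isNameStart(uri.charAt(start))) start++;
        return uri.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(localName("http://x.com/foo#bar"));             // prints bar
        System.out.println(localName("http://x.com/foo#12345").isEmpty()); // prints true
    }
}
```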
The configuration file is a Jena assembler description with added Eyeball vocabulary.
Eyeball is also configured by the location-mapping file etc/location-mapping.n3. The Eyeball jar contains copies of both the default config and the location mapper; these are used by default. You can provide your own etc/eyeball-config.n3 file earlier on your classpath or in your current directory; this config replaces the default. You may provide additional location-mapping files earlier on your classpath or in your current directory.
[] eye:shortName shortNameLiteral ; eye:schema fullSchemaURL ... .
A shortname can name several schemas. The Eyeball delivery has the short names rdf, rdfs, owl, and dc for the corresponding schemas (and mirror versions of those schemas so that they don't need to be downloaded each time Eyeball is run).
eye:shortNames (supplied on the command line). Each such property value must be a plain string literal whose value is the full name of the Inspector class to load and run; see the Javadoc of Inspector for details.
An inspector resource may refer to other inspector resources to include their inspectors, using either of the two properties eye:include or eye:includeByName. The value of an include property should be another inspector resource; the value of an includeByName property should be the shortName of an inspector resource.
[Two inspector resources may refer to each other, in which case they are equivalent.]
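One way to see why mutually referring inspector resources are equivalent is to compute the set of resources reachable through include links. The sketch below models resources as plain strings and the include links as a map; it is an illustration under those assumptions, not Eyeball's configuration loader, and the names (InspectorIncludes, closure) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: follows include links between inspector resources,
// visiting each resource once so that cycles terminate.
class InspectorIncludes {
    static Set<String> closure(String root, Map<String, List<String>> includes) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String next = pending.pop();
            if (seen.add(next)) { // false if already visited, which breaks cycles
                for (String included : includes.getOrDefault(next, List.of())) {
                    pending.push(included);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> includes = Map.of(
            "a", List.of("b"),
            "b", List.of("a", "c"));
        // a and b include each other, so both reach the same set: prints true
        System.out.println(closure("a", includes).equals(closure("b", includes)));
    }
}
```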
The inspectors provided in the Eyeball distribution are:
class leafname | eye:shortName | description |
LiteralInspector | literal | Checks literals for syntactically correct language codes, syntactically correct datatype URIs, and conformance of the lexical form of typed literals to their datatype |
PropertyInspector | property | Checks that every property used is "declared" in some provided schema. |
PrefixInspector | prefix | Checks that prefix namespaces are well-formed and that well-known prefixes have their well-known URIs. Also reports Jena automatically generated (j.Number) prefixes. |
URIInspector | URI | Checks that every URI in the model is well-formed. Uses the new Jena IRI code. |
VocabularyInspector | vocabulary | Checks that every URI in the model whose namespace is mentioned in some schema is one of the URIs declared for that namespace. If the inspector has any eye:openNamespace properties, their values are resources whose URIs are "open" namespaces that the inspector will not report. |
AllTypedInspector | all-typed | Checks that all URI and bnode resources in the model have an rdf:type property in the model or the schema(s). If there is a statement in the configuration with property eye:checkLiteralTypes and value eye:true, also checks that every literal has a type or a language. Not in the default set of inspectors. |
ConsistentTypeInspector | consistent-type | Checks that every subject in the model can be given a type which is the intersection of the subclasses of all its "attached" types. See below. |
ClassInspector | presumed-class | Checks that every resource in the model that appears as the object of an rdf:type, rdfs:domain, or rdfs:range statement, or as the subject or object of an rdfs:subClassOf statement, has been declared as a Class in the schemas or the model under test. |
CardinalityInspector | cardinality | Looks for classes C that are subclasses of cardinality restrictions on some property P with cardinality range min to max. For any X of rdf:type C, it checks that the number of values of P is in the range min..max and generates a report if it isn't. (Doesn't account for owl:sameAs in the 1.2 release.) |
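ListInspector | list | Checks that lists are well-formed (no element has missing or multiple rdf:first or rdf:rest properties) and that typed-list classes follow the full typed-list idiom. |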
OwlSyntaxInspector | owl | Looks for "suspicious restrictions" which have some of the OWL restriction properties but not exactly one owl:onProperty and exactly one constraint (owl:allValuesFrom, etc). |
SparqlDrivenInspector | sparql | Checks that given SPARQL queries succeed (if required) or fail (if prohibited) when applied to the model. |
my:EList a owl:Class
    ; rdfs:subClassOf rdf:List
    ; rdfs:subClassOf [owl:onProperty rdf:first; owl:allValuesFrom my:Element]
    ; rdfs:subClassOf [owl:onProperty rdf:rest; owl:allValuesFrom my:EList]
    .
The type my:Element is the element type of the list, and the type my:EList is the resulting typed list. The list inspector checks that every subclass of rdf:List that is also a subclass of a bnode having some property whose object is rdf:first or rdf:rest is defined by the full idiom above; if not, it reports it as a suspectListIdiom.
For example, if the model contains three types Top, Left, and Right, with Left and Right both being subtypes of Top and with no other subclass statements, then some S with rdf:types Left and Right would generate this warning.
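The Top, Left, Right example can be worked through with a small sketch that intersects the subclass sets of a subject's attached types. It follows only direct subclass statements given as a map; it does not perform the RDFS or rule-based inference that Eyeball itself uses, and the names (ConsistentTypes, hasConsistentType) are hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a subject is consistent if some class is a subclass
// (reflexively) of every one of its attached types.
class ConsistentTypes {
    // 'superclasses' maps each class to its directly declared superclasses.
    static Set<String> subclassesOf(String type, Map<String, Set<String>> superclasses) {
        Set<String> result = new HashSet<>();
        result.add(type); // every class counts as a subclass of itself
        boolean changed = true;
        while (changed) { // repeat until no new subclass is discovered
            changed = false;
            for (Map.Entry<String, Set<String>> e : superclasses.entrySet()) {
                if (!result.contains(e.getKey()) && !Collections.disjoint(e.getValue(), result)) {
                    result.add(e.getKey());
                    changed = true;
                }
            }
        }
        return result;
    }

    static boolean hasConsistentType(Collection<String> attached, Map<String, Set<String>> superclasses) {
        Set<String> common = null;
        for (String type : attached) {
            Set<String> subs = subclassesOf(type, superclasses);
            if (common == null) common = subs; else common.retainAll(subs);
        }
        // A subject with no attached types is quietly accepted.
        return common == null || !common.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> superclasses = Map.of("Left", Set.of("Top"), "Right", Set.of("Top"));
        System.out.println(hasConsistentType(List.of("Left", "Right"), superclasses)); // prints false
        System.out.println(hasConsistentType(List.of("Top", "Left"), superclasses));   // prints true
    }
}
```

Left and Right share no subclass, so the first subject would be reported; Top and Left share Left, so the second would not.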
The ConsistentTypeInspector must do at least some type inference. This release of Eyeball compromises by doing RDFS inference augmented by (very) limited union and intersection reasoning, as described in the Jena rules in etc/owl-like.rules. Even so, doing type inference over a large model is costly; you may wish to suppress it with -exclude until any other warnings are dealt with.
While, technically, a resource with no attached types at all is automatically inconsistent, Eyeball quietly ignores such resources, since they turn up quite often in simple RDF models.
Implementation note: The ConsistentTypeInspector's inferencing is done entirely by forward rules, triggered on the first subject to inspect. Once the rules have run to completion, further subjects are cheap. Using backward rules, the initial closure of the model was somewhat cheaper, but each new subject in a biggish model took a long time - a second or so - to process.
eye:schemePattern; their objects must be strings which describe a legal (Java regex) pattern for a URI. The scheme parts of those patterns form the set of known URI schemes: a URI that has one of those schemes, but does not match any of the patterns for that scheme, generates an eye:uriFailsPattern report.
Eyeball forms a single |-separated Java regular expression from all the patterns sharing the same scheme part.
The currently shipped config file restricts the type-id part of a URN to contain only letters, digits, and hyphens, and to start with a letter.
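The combining of patterns that share a scheme can be sketched with java.util.regex. This is an illustration only: the class name SchemePatterns is hypothetical, the scheme is assumed to be the literal text before the first colon of each pattern, and the example patterns are invented for the sketch (the URN one is modelled on the rule just described, not copied from the shipped config file).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: joins the patterns for each scheme with "|" and
// tests URIs against the combined expression.
class SchemePatterns {
    private final Map<String, String> alternatives = new LinkedHashMap<>();

    void add(String pattern) {
        // Assumption: the scheme is the literal text before the first ':'.
        String scheme = pattern.substring(0, pattern.indexOf(':'));
        alternatives.merge(scheme, pattern, (a, b) -> a + "|" + b);
    }

    // True if the URI's scheme is known but the URI matches none of its patterns.
    boolean failsPattern(String uri) {
        int colon = uri.indexOf(':');
        if (colon < 0) return false; // no scheme at all: a different check's job
        String combined = alternatives.get(uri.substring(0, colon));
        return combined != null && !Pattern.matches(combined, uri);
    }

    public static void main(String[] args) {
        SchemePatterns patterns = new SchemePatterns();
        patterns.add("urn:[a-zA-Z][-a-zA-Z0-9]*:.+"); // invented for this sketch
        patterns.add("file:///.*");                   // invented for this sketch
        patterns.add("file:/[^/].*");                 // invented for this sketch
        System.out.println(patterns.failsPattern("urn:isbn:0451450523")); // prints false
        System.out.println(patterns.failsPattern("urn:9lives:x"));        // prints true
    }
}
```

A URI whose scheme has no patterns at all is not reported here; in the table of reasons above that case corresponds to eye:unrecognisedScheme rather than eye:uriFailsPattern.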
[] eye:renderer FullClassName ; eye:shortName ShortClassHandle
The FullClassName is a string literal giving the full class name of the rendering class. That class must implement the Renderer interface and have a constructor that takes a single Model (the configuring model) as an argument.
The ShortClassHandle is a string literal giving the short name used to refer to the class. The default short name used is default. There should be no more than one eye:shortName statement with the same ShortClassHandle in the configuration file, but the same class can have many different short names.
The TextRenderer supports an additional property eye:labels to allow the appropriate labels for an ontology to be supplied to the renderer. Each object of a eye:labels statement names a model; all the rdfs:label statements in that model are used to supply strings which are used to render resources. The model names are strings which are interpreted by Jena's FileManager, so they may be redirected using Jena's file mappings.