Title: Using ARP Without Jena
ARP can be used either as a Jena subsystem or as a standalone
RDF/XML parser. This document gives a quick guide to using ARP
standalone.
## Contents
- [Overview](#overview)
- [Sample Code](#sample)
- [ARP Event Handling](#handlers)
- [Configuring ARP](#config)
- [Interrupting ARP](#interrupt)
- [Using Other SAX Sources](#sax2rdf)
- [Memory usage](#memory)
## Overview
To load an RDF file:
1. Create an
   [ARP](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARP.html#ARP()) instance.
2. Set parse options, particularly error detection control, using
   [getOptions](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#getOptions())
   or
   [setOptionsWith](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#setOptionsWith(org.apache.jena.rdf.arp.ARPOptions)).
3. Set its handlers by calling the
   [getHandlers](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#getHandlers())
   or
   [setHandlersWith](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#setHandlersWith(org.apache.jena.rdf.arp.ARPHandlers))
   method, and then:
    - Set the
      [statement handler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPHandlers.html#setStatementHandler(org.apache.jena.rdf.arp.StatementHandler)).
    - Optionally, set the other handlers.
4. Call a
   [load](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARP.html#load(java.io.InputStream,%20java.lang.String))
   method.

Xerces is used for parsing the XML. The SAX events generated by
Xerces are then analysed as RDF by ARP. It is possible to use a
different source of SAX events.
Errors may occur in either the XML or the RDF part.
## Sample Code
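    import java.io.IOException;
    import java.io.StringReader;

    import org.apache.jena.rdf.arp.ALiteral;
    import org.apache.jena.rdf.arp.ARP;
    import org.apache.jena.rdf.arp.AResource;
    import org.apache.jena.rdf.arp.StatementHandler;
    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    ARP arp = new ARP();

    // initialisation - uses ARPConfig interface only.
    arp.getOptions().setLaxErrorMode();

    arp.getHandlers().setErrorHandler(new ErrorHandler() {
        public void fatalError(SAXParseException e) {
            // TODO code
        }
        public void error(SAXParseException e) {
            // TODO code
        }
        public void warning(SAXParseException e) {
            // TODO code
        }
    });

    arp.getHandlers().setStatementHandler(new StatementHandler() {
        // Called for triples whose object is a literal.
        public void statement(AResource subj, AResource pred, ALiteral lit) {
            // TODO code
        }
        // Called for triples whose object is a resource.
        public void statement(AResource subj, AResource pred, AResource obj) {
            // TODO code
        }
    });

    // parsing.
    try {
        // Loading fixed input ...
        arp.load(new StringReader(
            "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>\n"
            + "<rdf:Description>"
            + "<rdf:value>hello</rdf:value>\n"
            + "</rdf:Description></rdf:RDF>"
        ));
    }
    catch (IOException ioe) {
        // something unexpected went wrong
    }
    catch (SAXParseException s) {
        // This error will have been reported
    }
    catch (SAXException ss) {
        // This error will not have been reported.
    }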
## ARP Event Handling
ARP reports events concerning:
- Triples found in the input.
- Errors in the input.
- Namespace declarations.
- Scope of blank nodes.
To respond to any of these events, user code implements the
relevant interface:
[StatementHandler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/StatementHandler.html),
org.xml.sax.ErrorHandler,
[NamespaceHandler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/NamespaceHandler.html),
and
[ExtendedHandler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ExtendedHandler.html).
To set an individual handler, call the
[getHandlers](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#getHandlers())
method on the ARP instance. This returns an object encapsulating all
the handlers in use. Then call the appropriate `set...Handler` method
on that object, e.g.
[setStatementHandler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPHandlers.html#setStatementHandler(org.apache.jena.rdf.arp.StatementHandler)).
All the handlers can be copied from one ARP instance to another by
using the
[setHandlersWith](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#setHandlersWith(org.apache.jena.rdf.arp.ARPHandlers))
method:

    ARP from, to;
    // initialize from and to
    // ...
    to.setHandlersWith(from.getHandlers());

The error handler reports both XML and RDF errors, the former
detected by Xerces. See
[ARPHandlers.setErrorHandler](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPHandlers.html#setErrorHandler(org.xml.sax.ErrorHandler))
for details of how to distinguish between them.
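
As an illustration of the error-handling interface, here is a minimal sketch of an `org.xml.sax.ErrorHandler` that records every reported problem with its severity and line number. The class names `CollectingErrorHandler` and `ErrorHandlerDemo` are hypothetical, and the JDK's built-in SAX parser stands in for ARP so that the example runs without Jena. With ARP, the same handler object would be passed to `arp.getHandlers().setErrorHandler(...)`. The sketch does not attempt to distinguish XML errors from RDF errors; see the linked Javadoc for that.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

/** Hypothetical handler: records each problem reported through the SAX ErrorHandler interface. */
class CollectingErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) { record("warning", e); }

    @Override
    public void error(SAXParseException e) { record("error", e); }

    @Override
    public void fatalError(SAXParseException e) { record("fatal", e); }

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }
}

class ErrorHandlerDemo {
    /** Parses xml with the JDK's SAX parser (a stand-in for ARP) and returns the recorded problems. */
    static List<String> problemsIn(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Typically raised after a fatal error, which the handler has already recorded.
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        // Mismatched tags make the input ill-formed, so a fatal error is reported.
        problemsIn("<a><b></a>").forEach(System.out::println);
    }
}
```

Collecting the messages, rather than throwing from the handler, lets the caller decide after the parse whether the reported problems matter.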
## Configuring ARP
ARP can be configured to treat most error conditions as warnings or
to ignore them, and to treat some non-error conditions as warnings
or errors.
In addition, the behaviour in response to input that does not have
an `<rdf:RDF>` root element is configurable: either to treat the
whole file as RDF anyway, or to scan the file looking for embedded
`<rdf:RDF>` elements.
As with the handlers, there is an options object that encapsulates
these settings. It can be accessed using
[`getOptions`](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPConfig.html#getOptions()),
and then individual settings can be made using the methods in
[`ARPOptions`](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ARPOptions.html).
It is also possible to copy all the option settings from one ARP
instance to another:

    ARP from, to;
    // initialize from and to ...
    to.setOptionsWith(from.getOptions());

The [I/O how-to](iohowto.html#arp_properties) gives some more
detail about the options settings, although it assumes the use of
the Jena `RDFReader` interface.
## Interrupting ARP
It is possible to interrupt an ARP thread. See the
[I/O how-to](iohowto.html#interrupting_arp) for details.
## Using Other SAX Sources
It is possible to use ARP with other SAX input sources, e.g. from a
non-Xerces parser, or from an in-memory XML source, such as a DOM
tree.
Instead of an ARP instance, you create an instance of
[SAX2RDF](/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html) using
the [newInstance](/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#newInstance(java.lang.String))
method. This can be configured just like an ARP instance, following
the initialization section of the [sample code](#sample).
The SAX2RDF instance is then used in the same way as a SAX2Model
instance, as [described elsewhere](sax.html).
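
To show what "SAX events from an in-memory DOM tree" can look like, here is a minimal sketch that uses only the JDK: it parses a small RDF/XML string into a DOM tree and then replays that tree as SAX events through an identity transform. The class `DomToSaxDemo` and its methods are hypothetical, and the recording `DefaultHandler` is only a stand-in for the handler that would receive the events in real use. How a SAX2RDF instance is attached to an event source is covered in the linked page, not here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomToSaxDemo {
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>"
        + "<rdf:Description><rdf:value>hello</rdf:value></rdf:Description>"
        + "</rdf:RDF>";

    /** Builds a namespace-aware in-memory DOM tree from an XML string. */
    static Document parseToDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Replays the DOM tree as SAX events into the given handler. */
    static void replay(Document doc, ContentHandler handler) throws Exception {
        // An identity transform walks the tree and fires the SAX callbacks.
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new SAXResult(handler));
    }

    /** Returns the qualified names of the elements seen, in document order. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        // Stand-in handler: it only records element names.
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            replay(parseToDom(xml), recorder);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(SAMPLE));
    }
}
```

The point of the sketch is that no XML text is re-parsed in `replay`: the events come straight from the tree, which is the situation the non-Xerces route is meant for.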
## Memory usage
For very large files, ARP does not use any additional memory, except
when
[ExtendedHandler.discardNodesWithNodeID](/documentation/javadoc/jena/org/apache/jena/rdf/arp/ExtendedHandler.html#discardNodesWithNodeID())
returns false or when the
[AResource.setUserData](/documentation/javadoc/jena/org/apache/jena/rdf/arp/AResource.html#setUserData(java.lang.Object))
method has been used. In these cases ARP needs to remember the
`rdf:nodeID` usage for the whole of the file.