Jena JDBC comes with three built in drivers by default with the option of building custom drivers if desired. This page covers the differences between the provided drivers and the connection URL options for each.
Connection URLs for Jena JDBC drivers have a common format, they all start with the following:
jdbc:jena:foo:
Where foo
is a driver specific prefix that indicates which specific driver implementation
is being used.
After the prefix the connection URL consists of a sequence of key
value pairs, the characters ampersand (&
), semicolon (;
) and
pipe (|
) are considered to be separators between pairs, the
separators are reserved characters and may not be used in values. The key is
separated from the value by a equals sign (=
) though unlike the
separators this is not a reserved character in values.
There is no notion of character escaping in connection parameters so if you
need to use any of the reserved characters in your values then you should
pass these to the connect(String, Properties)
method directly in the
Properties
object.
There are some common parameter understood by all Jena JDBC drivers and which apply regardless of driver implementation.
As discussed in the overview the drivers have a notion
of JDBC compatibility which is configurable. The jdbc-compatibility
parameter is used
in connection URLs. To avoid typos when creating URLs programmatically a constant
(JenaDriver.PARAM_JDBC_COMPATIBILITY
) is provided which contains the parameter
name exactly as the code expects it. This parameter provides an integer value
in the range 1-9 which denotes how compatible the driver should attempt to
be. See the aforementioned overview documentation for more information on the interpretation
of this parameter.
When not set the default compatibility level is
used, note that JenaConnection
objects support changing this after
the connection has been established.
The second of the common parameters is the pre-processor
parameter which is used to
specify one/more CommandPreProcessor
implementations to use. The
parameter should be specified once for each pre-processor you wish to you and
you should supply a fully qualified class name to ensure the pre-processor
can be loaded and registered on your connections. The driver will report an
error if you specify a class that cannot be appropriately loaded and
registered.
Pre-processors are registered in the order that they are specified so if you
use multiple pre-processors and they have ordering dependencies please ensure
that you specify them in the desired order. Note that JenaConnection
objects support changing registered pre-processors after the connection has
been established.
There is also a post-processor
parameter which is used to specify
one/more ResultsPostProcessor
implementations to use. The parameter
should be specified once for each post-processor you wish to use and you
should supply a fully qualified class name to ensure the post-processor can
be loaded and registered on your connections. The driver will report an error
is you specify a class that cannot be appropriately loaded and registered.
Post-processors are registered in the order that they are specified so if you
use multiple post-processors and they have ordering dependencies please
ensure that you specify them in the desired order. Note that
JenaConnection
objects support changing registered post-processors
after the connection has been established.
Each driver is available as a separate maven artifact, see the artifacts page for more information.
The in-memory driver provides access to a non-persistent non-transactional in-memory dataset. This dataset may either be initially empty or may be initialized from an input file. Remember that this is non-persistent so even if the latter option is chosen changes are not persisted to the input file. This driver is primarily intended for testing and demonstration purposes.
Beyond the common parameters it has two possible connection parameters. The first of these
is the dataset
parameter and is used to indicate an input file that the driver will
initialize the in-memory dataset with e.g.
jdbc:jena:mem:dataset=file.nq
If you prefer to start with an empty dataset you should use the empty
parameter instead e.g.
jdbc:jena:mem:empty=true
If both are specified then the dataset
parameter has precedence.
The TDB driver provides access to a persistent Jena TDB dataset. This
means that the dataset is both persistent and can be used transactionally. For correct
transactional behavior it is typically necessary to set the holdability for connections and
statements to ResultSet.HOLD_CURSORS_OVER_COMMIT
as otherwise closing a result set or making
an update will cause all other results to be closed.
Beyond the common parameters the driver requires a single location
parameter that provides
the path to a location for a TDB dataset e.g.
jdbc:jena:tdb:location=/path/to/data
By default a TDB dataset will be created in that location if one does not exist, if you would
prefer not to do this i.e. ensure you only access existing TDB datasets then you can add the
must-exist
parameter e.g.
jdbc:jena:tdb:location=/path/to/data&must-exist=true
With this parameter set the connection will fail if the location does not exist as a directory, note that this does not validate that the location is a TDB dataset so it is still possible to pass in invalid paths even with this set.
The Remote Endpoint driver provides access to any SPARQL Protocol compliant store that exposes SPARQL query and/or SPARQL update endpoints. This driver can be explicitly configured to be in read-only or write-only mode by providing only one of the required endpoints.
The query
parameter sets the query endpoint whilst the update
parameter sets the update endpoint e.g.
jdbc:jena:remote:query=http://localhost:3030/ds/query&update=http://localhost:3030/ds/update
At least one of these parameters is required, if only one is provided you will get a read-only or write-only connection as appropriate.
This driver also provides a whole variety of parameters that may be used to customize its behavior further. Firstly there are a set of parameters which control the dataset description provided via the SPARQL protocol:
default-graph-uri
- Sets a default graph for queriesnamed-graph-uri
- Sets a named graph for queriesusing-graph-uri
- Sets a default graph for updatesusing-named-graph-uri
- Sets a named graph for updatesAll of these may be specified multiple times to specify multiple graph URIs for each.
Then you have the select-results-type
and model-results-type
which are used to set the MIME
type you'd prefer to have the driver retrieve SPARQL results from the remote endpoints in. If used
you must set them to formats that ARQ supports, the ARQ WebContent
class has constants for the various supported formats.
There is also comprehensive support for authentication using this driver, the standard JDBC user
and password
parameters are used for credentials and then a selection of driver specific
parameters are used to configure how you wish the driver to authenticate.
Under the hood authentication uses the new HttpAuthenticator
framework introduced in the same
release as Jena JDBC, see HTTP Authentication in ARQ. This means
that it can support standard HTTP auth methods (Basic, Digest etc) or can use more complex schemes
such as forms based auth with session cookies.
To set up standard HTTP authentication it is sufficient to specify the user
and password
fields. As
with any JDBC application we strongly recommend that you do not place these in the connection URL
directly but rather use the Properties
object to pass these in. One option you may wish to include
if your endpoints use HTTP Basic authentication is the preemptive-auth
parameter which when set to
true will enable preemptive authentication. While this is less secure it can be more performant if
you are making lots of queries.
Setting up form based authentication is somewhat more complex, at a minimum you need to provide the
form-url
parameter with a value for the URL that user credentials should be POSTed to in order to
login. You may need to specify the form-user-field
and form-password-field
parameters to provide
the name of the fields for the login request, by default these assume you are using an Apache mod_auth_form
protected server and use the appropriate default values.
The final option for authenticator is to use the authenticator
parameter via the Properties
object
to pass in an actual instance of a HttpAuthenticator
that you wish to use. This method is the most
powerful in that it allows you to use any authentication method that you need.