Jena2 Database Interface - Release Notes

28 August, 2003

The jena/db module provides an implementation of the Jena model interface that stores and retrieves RDF statements using a database. It currently supports MySQL, Oracle and PostgreSQL for persistent storage and runs under Linux and Windows XP. Jena1 databases cannot be accessed by Jena2 and must be reloaded. Little performance tuning has been done.

Contents

Features
Remaining Work
Database Engine Support
Compatibility with Jena1 ModelRDB and Databases
Migrating Jena1 Applications and Databases to Jena2

Features

The Jena2 persistence subsystem implements an extension to the Jena Model class that provides persistence for models through use of a back-end database engine. Jena2 is largely backwards compatible with Jena1 applications, with the exception of some database configuration options.

The default Jena2 database layout uses a denormalized schema in which literals and resource URIs are stored directly in the statement tables. This differs from Jena1, in which all literals and resources were stored in two separate tables that were referenced by the statements table. The Jena2 layout enables faster insertion and retrieval but uses more storage than Jena1. However, configuration options are available that give Jena2 users some control over the degree of denormalization in order to reduce storage consumption.

The persistence subsystem supports a Fastpath capability for RDQL queries that dynamically generates SQL queries to perform as much of the RDQL query as possible within the database engine.

For an overview of creating and accessing persistent Jena models, see the HowTo. A summary of the available configuration options is given in Options. Known issues and open problems are summarized in Issues. Some performance notes and observations are given in PerfNotes.
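The storage trade-off between the two layouts described in this section can be illustrated with a toy calculation. The sketch below is not Jena code and does not reflect the actual table definitions: the class LayoutSketch, its methods and the sample URIs are all hypothetical, and a single lookup set stands in for the two Jena1 tables. It only counts how many characters each style of layout would have to store for the same statements.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: compares characters stored by an inline (denormalized)
// layout with a lookup-table (normalized) layout. Not the real Jena schema.
class LayoutSketch {

    /** Characters stored when every statement row holds its three strings inline. */
    static int denormalizedChars(String[][] statements) {
        int total = 0;
        for (String[] statement : statements) {
            for (String term : statement) {
                total += term.length();
            }
        }
        return total;
    }

    /** Characters stored when each distinct string is kept once in a lookup table. */
    static int normalizedChars(String[][] statements) {
        Set<String> lookup = new LinkedHashSet<>();
        for (String[] statement : statements) {
            for (String term : statement) {
                lookup.add(term);
            }
        }
        int total = 0;
        for (String term : lookup) {
            total += term.length();
        }
        return total;
    }

    /** Id cells the statement table needs under the normalized layout (three per row). */
    static int normalizedIdCells(String[][] statements) {
        return statements.length * 3;
    }

    /** Three sample statements (subject, predicate, object) that share URIs. */
    static String[][] sample() {
        return new String[][] {
            {"http://example.org/a", "http://example.org/p", "x"},
            {"http://example.org/a", "http://example.org/p", "y"},
            {"http://example.org/b", "http://example.org/p", "x"},
        };
    }
}
```

For the three sample statements, the inline layout stores 123 characters, while the lookup layout stores 62 characters plus nine id cells. The inline rows can be written and read without consulting a second table, which is one plausible reading of why the denormalized layout trades space for speed.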

Remaining Work

In the future, we would like to address the following issues.

Database Engine Support

The following table lists the platforms, database engines and JDBC drivers currently supported for Jena2 persistence. Older and newer versions may work but have not been tested. See the database engine-specific howto documents for more details (MySQL, Oracle, PostgreSQL).

   Platform             Database Engine     JDBC Driver
   -------------------  ------------------  ------------------------------------------
   Windows XP           MySQL 4.0.12        mysql-connector-java-3.0.7-stable.jar
   Linux (RedHat 7.2)   MySQL 4.0.12        mysql-connector-java-3.0.7-stable.jar
   Windows XP           PostgreSQL 7.3.3    pg73jdbc2.jar (JDBC2 for jdk1.3, 1.2, 1.4)
   Linux (RedHat 7.2)   PostgreSQL 7.3.3    pg73jdbc2.jar (JDBC2 for jdk1.3, 1.2, 1.4)
   Windows XP           Oracle 9.2.0.1.0    Oracle oci8 JDBC driver (thick client)
   Linux (RedHat AS)    Oracle 9.2.0.1.0    Oracle oci JDBC driver (thick client)

Compatibility with Jena1 ModelRDB and Databases

In general, Jena2 is backwards compatible with Jena1 applications that use the ModelRDB class. However, the Jena1 databases themselves are not compatible and cannot be read directly by Jena2. Instructions on migrating Jena1 databases to Jena2 are given below.

There are some changes to the API. The ModelRDB constructors are deprecated and applications should consider migrating to new factory methods for creating and opening persistent models (see the HowTo). The ModelRDB package name has changed. Jena1 applications that directly reference the package name jena.rdb must be modified to reference the package name jena.db. Jena2 does not support the StoreRDB class nor any of the Jena1 customization parameters (setProperty, getProperty). Jena2 uses a different technique for database configuration.

Jena2 does not support the hash layouts and proc layouts of Jena1. Applications that request these layouts under Jena2 will be given a generic layout. Jena2 supports MySQL, Oracle and PostgreSQL. Applications that require Interbase or Berkeley DB will not work. The driver configuration files (e.g., Mysql.config) are no longer used. Instead, get/set methods on the driver and ModelRDB classes are used for configuration parameters (see Options).

In Jena2, all databases are multi-model. By default, models are stored separately, and each model uses one table for asserted statements and another table for reified statements. To share tables among models, see Migrating Jena1 Applications and Databases to Jena2, below.

Performance of Jena2 persistent models is no worse than that of Jena1 models and is often better. However, Jena2 persistent models may consume more database space. See the performance notes (PerfNotes).

Migrating Jena1 Applications and Databases to Jena2

As mentioned above, most Jena1 persistent applications should run with little or no modification under Jena2. However, the ModelRDB class constructors are deprecated. In Jena2, persistent models should be created using a factory method. In some situations, the ModelRDB class constructors are still required because the factory methods do not yet support all the ModelRDB functionality (see HowTo).

The Jena2 persistence architecture and layout are different from Jena1. However, these differences are largely transparent to applications and only affect code that creates new persistent models. In particular, the way in which database configuration options are specified has changed. In Jena2, configuration options are specified using get and set methods on models or drivers. See Options for details on using them.

No utility is provided to migrate databases, but users can easily create their own in two steps. First, a small Jena1 application writes the contents of the Jena1 ModelRDB to a text file using an RDF writer. Then, a small Jena2 application reads this file and stores the statements in a persistent model. For small databases, the PrettyWriter (RDF/XML-ABBREV) should be adequate. For medium or large databases, use the N-Triple writer for better performance, and consider connecting the two applications with a pipe rather than an intermediate file.
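The shape of such a migration can be sketched as follows. This is a minimal illustration using only the Java standard library, not a working migration tool: the class MigrationPipeSketch and its methods are hypothetical, the export body stands in for the Jena1 program that writes the model with the N-Triple writer, and the load body stands in for the Jena2 program that reads the stream into a persistent model. In practice the two sides would usually be separate programs joined by an operating-system pipe; the in-process pipe here is an assumption made so that the sketch runs on its own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: statements are plain strings, one N-Triples line each.
class MigrationPipeSketch {

    /** Export side (stand-in for the Jena1 program): one line per statement. */
    static void export(List<String> statements, OutputStream out) throws IOException {
        Writer writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        for (String statement : statements) {
            writer.write(statement);
            writer.write('\n');
        }
        writer.flush(); // the caller owns and closes the stream
    }

    /** Import side (stand-in for the Jena2 program): reads until end of stream. */
    static List<String> load(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        List<String> stored = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                stored.add(line); // a real loader would add the statement to the model
            }
        }
        return stored;
    }

    /** Streams statements from the export side to the import side without a file. */
    static List<String> migrate(List<String> statements) {
        try {
            PipedOutputStream pipeOut = new PipedOutputStream();
            PipedInputStream pipeIn = new PipedInputStream(pipeOut);
            Thread exporter = new Thread(() -> {
                // Closing the write end signals end of stream to the reader.
                try (OutputStream out = pipeOut) {
                    export(statements, out);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            exporter.start();
            List<String> stored = load(pipeIn);
            exporter.join();
            return stored;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Because the reader consumes lines as the writer produces them, neither side needs to hold the whole database in memory or on disk as an intermediate file, which is the motivation for preferring a line-oriented format and a pipe for larger databases.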