Concurrent access to Models

Applications need to be aware of the concurrency issues in accessing Jena models. API operations are not thread safe by default. Thread safety would simply ensure that the model data structures remained intact but would not give an application consistent access to the RDF graph. It would also limit the throughput of multithreaded applications on multiprocessor machines, where true concurrency can lead to a reduction in response time.

For example, suppose an application wishes to read the name and age of a person from a model. This takes two API calls. It is more convenient to be able to read that information in a consistent fashion, knowing that the second piece of information is not being read after some model change has occurred.
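The name-and-age case can be sketched with the standard java.util.concurrent locks. This is only an illustration of the idea, not Jena code: the PersonRecord class, its property map and its method names are all hypothetical stand-ins for a model.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a model: a property map guarded by a
// multiple-reader/single-writer lock. None of these names are Jena API.
class PersonRecord {
    private final Map<String, String> properties = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Change both properties as one unit while holding the write lock.
    void update(String name, int age) {
        lock.writeLock().lock();
        try {
            properties.put("name", name);
            properties.put("age", Integer.toString(age));
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read both properties under a single read lock, so that no update
    // can take place between the two lookups.
    String[] readNameAndAge() {
        lock.readLock().lock();
        try {
            String name = properties.get("name"); // first call
            String age = properties.get("age");   // second call
            return new String[] { name, age };
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        PersonRecord person = new PersonRecord();
        person.update("Alice", 30);
        String[] view = person.readNameAndAge();
        System.out.println(view[0] + " is " + view[1]);
    }
}
```

Because both lookups happen inside one read section, a writer calling update cannot change the age after the name has been read.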

Special care is needed with iterators. In general, Jena's iterators do not take a copy to enable safe use in the presence of concurrent update. A multithreaded application needs to be aware of these issues and correctly use the mechanisms that Jena provides (or manage concurrency itself). While not zero, the application burden is not high.

There are two main cases:

Multiple threads in the same JVM.
Multiple applications accessing the same persistent model (typically, a database).

Transactions are provided by database-backed models: see the database documentation and the Model interface to transactions.

This note describes the support for same-JVM, multithreaded applications.

ModelLocks

ModelLocks provide critical section support for managing the interactions of multiple threads in the same JVM. Jena provides multiple-reader/single-writer concurrency support (MRSW).

All Jena models provide a ModelLock to mediate concurrent access. The Model interface extends ModelLock, so the locking operations are called directly on the model.

The general pattern is:

Model model = . . . ;
model.enterCriticalSection(ModelLock.READ) ;  // or ModelLock.WRITE
try {
    // ... perform actions on the model ...
    // ... obey the contract: no update operations while holding a read lock ...
} finally {
    model.leaveCriticalSection() ;
}

Applications are expected to obey the lock contract, that is, they must not do update operations if they have a read lock as there can be other application threads reading the model concurrently.
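The multiple-reader/single-writer behaviour can be observed with java.util.concurrent.locks.ReentrantReadWriteLock from the Java standard library. This is an analogy under stated assumptions and says nothing about how ModelLock is implemented; the MrswDemo class and its method are hypothetical names for this sketch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustration of MRSW semantics using the JDK lock, not Jena's ModelLock.
class MrswDemo {
    // While the calling thread holds a read lock, ask a second thread to
    // try for a read lock and then for a write lock.
    // Returns { readerAdmitted, writerAdmitted }.
    static boolean[] probeWhileReading() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        ExecutorService other = Executors.newSingleThreadExecutor();
        lock.readLock().lock();
        try {
            boolean readerAdmitted = other.submit(() -> {
                boolean ok = lock.readLock().tryLock();
                if (ok) lock.readLock().unlock();
                return ok;
            }).get();
            boolean writerAdmitted = other.submit(() -> {
                boolean ok = lock.writeLock().tryLock();
                if (ok) lock.writeLock().unlock();
                return ok;
            }).get();
            return new boolean[] { readerAdmitted, writerAdmitted };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock();
            other.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] result = probeWhileReading();
        System.out.println("second reader admitted: " + result[0]);
        System.out.println("writer admitted: " + result[1]);
    }
}
```

A second reader is let in alongside the first, while the writer is kept out until all readers have left. This is why a thread holding only a read lock must not update: other readers may be inside the section at the same time.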

Iterators

Care must be taken with iterators: unless otherwise stated, all iterators must be assumed to be iterating over the data structures in the model or graph implementation itself. It is not possible to safely pass these out of critical sections.
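One way to respect this restriction is to drain the iterator into a private copy while the lock is still held and hand only the copy to code outside the critical section. The sketch below uses plain Java collections and the JDK read/write lock; GuardedStatements and its methods are hypothetical and stand in for a model and its statement iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a model whose iterators walk live data.
class GuardedStatements {
    private final List<String> statements = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(String statement) {
        lock.writeLock().lock();
        try {
            statements.add(statement);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // The live iterator is created, used and discarded inside the read
    // section. Only the copied list leaves the critical section.
    List<String> snapshot() {
        lock.readLock().lock();
        try {
            List<String> copy = new ArrayList<>();
            for (Iterator<String> it = statements.iterator(); it.hasNext(); ) {
                copy.add(it.next());
            }
            return copy;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedStatements model = new GuardedStatements();
        model.add("person1 name Alice");
        List<String> view = model.snapshot();
        model.add("person1 age 30"); // does not disturb the earlier copy
        System.out.println(view);
    }
}
```

The cost is the memory and time taken by the copy, so this suits result sets of modest size. The alternative is to do all processing of the iterator inside the critical section.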

RDQL Query

RDQL query results are iterators and are no different from other iterators in Jena for concurrency purposes. The default query engine does not give thread safety, and the normal requirements on an application to ensure MRSW access in the presence of iterators apply. Note that Jena's query mechanism is itself multithreaded, with the query executing on a separate thread from the calling application. If the application is single threaded, no extra work is necessary. If the application is multithreaded, queries should be executed with a read lock.

Outline:

Model model = . . . ;
Query query = new Query(queryString) ;
query.setSource(model);
QueryExecution qe = new QueryEngine(query) ;

model.enterCriticalSection(ModelLock.READ) ;
try {
    // Must do inside the critical section.
    QueryResults results = qe.exec() ;
    for ( Iterator iter = results ; iter.hasNext() ; )
    {
        ResultBinding res = (ResultBinding)iter.next() ;
        . . . process results . . .
    }
    results.close() ;
} finally {
    model.leaveCriticalSection() ;
}
}

Updates to the model should not be performed inside the read-only section. For database-backed models, the application can use a transaction. For in-memory models, the application should collect the changes together during query processing, then make all the changes while holding a write lock.
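The collect-then-apply approach for in-memory models might look like the following. It is a minimal sketch using the JDK read/write lock in place of ModelLock; DeferredUpdate, markMatches and the string "statements" are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an in-memory model.
class DeferredUpdate {
    private final List<String> statements = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(String statement) {
        lock.writeLock().lock();
        try {
            statements.add(statement);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return statements.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Phase 1: scan under a read lock and only record the intended changes.
    // Phase 2: leave the read section, then apply them under a write lock.
    int markMatches(String prefix) {
        List<String> pending = new ArrayList<>();
        lock.readLock().lock();
        try {
            for (String s : statements) {
                if (s.startsWith(prefix)) {
                    pending.add(s + " [seen]");
                }
            }
        } finally {
            lock.readLock().unlock(); // released before the write lock is taken
        }
        lock.writeLock().lock();
        try {
            statements.addAll(pending);
        } finally {
            lock.writeLock().unlock();
        }
        return pending.size();
    }

    public static void main(String[] args) {
        DeferredUpdate model = new DeferredUpdate();
        model.add("person1 name Alice");
        model.add("person2 name Bob");
        System.out.println(model.markMatches("person1") + " change(s) applied");
    }
}
```

Another thread may update the data between the two phases, so an application that depends on the scanned state should recheck it once it holds the write lock.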

Jena ModelLocks do not provide lock promotion: an application cannot start a "write" critical section while holding a "read" lock, because this can lead to deadlock.
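The JDK's ReentrantReadWriteLock has the same restriction, which makes it a convenient way to see the problem without risking a hang. The sketch below uses tryLock so that a refused promotion is reported rather than blocking forever. PromotionDemo and its methods are hypothetical names, and the behaviour shown is that of the JDK lock, offered only as an analogy to ModelLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy using the JDK lock, not Jena's ModelLock.
class PromotionDemo {
    // Attempt to take the write lock while this thread still holds a
    // read lock. Returns whether the write lock was granted.
    static boolean tryPromote() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            boolean promoted = lock.writeLock().tryLock();
            if (promoted) {
                lock.writeLock().unlock();
            }
            return promoted;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The safe ordering: leave the read section first, then ask for the
    // write lock. Returns whether the write lock was granted.
    static boolean releaseThenWrite() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        lock.readLock().unlock();
        boolean acquired = lock.writeLock().tryLock();
        if (acquired) {
            lock.writeLock().unlock();
        }
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println("promotion granted: " + tryPromote());
        System.out.println("write after release granted: " + releaseThenWrite());
    }
}
```

With a blocking call in place of tryLock, the promoting thread would wait for all readers to leave while itself being one of those readers. Two threads attempting promotion at once would each wait on the other, which is the deadlock described above.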