Introduction To Conversation Scoped Persistence

Orchestra's persistence support aims to simplify developers' lives when building web-based applications that make extensive use of ORM persistence, including JPA (Java Persistence API), Toplink or Hibernate.

Persistent objects often participate in a conversation (ie a sequence of requests). For example, an address-management module may need to read a person from a database on the first request of a conversation, then later requests may modify attributes and add child records for address, telephone, email, etc. Only at the end of the conversation is the data saved back to the database.

The Problem

Whoever has web-application development experience with an ORM layer has run into exceptions like the dreaded LazyInitializationException which occurs when following a relationship from one persistent object to another, or the NonUniqueObjectException that occurs if you try to merge detached objects back into the persistence session. Most of those problems, if not all, result from the fact that a persistence session is opened and closed for each HTTP request. In this case, the ORM tool has no chance to manage your entities for the duration of a business process - once you close a persistence layer session at the end of the HTTP request, all entities are detached from the persistence layer session.

The trick is to keep such a persistence session open for as long as required, but no longer, in contrast to the well-known OpenSessionInView or OpenSessionPerRequest patterns. In those patterns, the session is closed too early for completing a full business process. Keeping the session alive for exactly the necessary timespan is one of the cool features of Apache MyFaces Orchestra.

The demarcation of such a timespan is defined by a conversation in Orchestra. A conversation is completely detached from the pageflow; it doesn't matter whether the conversation spans multiple requests on a single page or multiple pages.

A conversation scope is (like the request or session scope) a place where you can store your beans. A conversation-scoped bean will have an attached entity manager which is in use for each database access until the application ends the conversation.

Quick review of the Java Persistence API (JPA)

JPA defines the concept of a "PersistenceContext", aka an EntityManager (or in Hibernate terminology, a "Session"). This represents a pool of persistent objects. First the context has to be created. Then data can be loaded from the database, causing objects to be created and placed in the context. Objects in the context can be modified, and new objects can be placed in the context. When entityManager.flush() is called, any modified data in the context is written back out to the database.

When an object is read from the database which has a reference to some other persistent object, what happens depends on whether that reference ("relation") is marked as "eager" or "lazy":

  • For eager relations, the associated data is read immediately from the database, and an object created to hold it (which is also added to the persistence context). The relation can then be followed (ie the reference used) even after the associated persistence context is no longer valid, as the object is already in memory.

  • For lazy relations, the reference in the original object points instead at a JPA-created proxy object, and only if a method is invoked on the proxy is a database operation triggered to load the actual object. Lazy relations are very useful, but mean that the relation can only be followed while the persistence context is still valid; if the proxy is triggered after the persistence context is no longer valid then a LazyInitializationException occurs.

A context can be closed (which automatically flushes it). When this happens, all objects in the context become "detached"; code that holds references to them can still access the objects. However, because the context no longer exists, they cannot be written back to the database. In addition, any attempt to fetch a related object causes a LazyInitializationException because there is no longer a context into which the related object can be fetched.
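The difference between a relation that was initialised while the context was open and one that was not can be illustrated with a small self-contained model. This is only a sketch: ToyContext, LazyRef and LazyLoadingDemo are hypothetical names invented for illustration, they are not part of JPA or Hibernate, and a plain IllegalStateException stands in for the ORM's own exception type.

```java
import java.util.function.Supplier;

/** Toy stand-in for a persistence context; NOT the JPA API. */
class ToyContext {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

/** Toy lazy reference: loads its target on first access, and only while the context is open. */
class LazyRef<T> {
    private final ToyContext context;
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(ToyContext context, Supplier<T> loader) {
        this.context = context;
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            if (!context.isOpen()) {
                // A real ORM would raise its own exception type at this point.
                throw new IllegalStateException("context closed; cannot load lazy relation");
            }
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

class LazyLoadingDemo {
    /** Closes the context, then follows the relation and reports what happened. */
    static String followAfterClose(boolean initializedWhileOpen) {
        ToyContext ctx = new ToyContext();
        LazyRef<String> address = new LazyRef<>(ctx, () -> "1 Main Street");
        if (initializedWhileOpen) {
            address.get(); // data is now in memory, much like an eager relation
        }
        ctx.close();
        try {
            return address.get();
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(followAfterClose(true));  // 1 Main Street
        System.out.println(followAfterClose(false)); // failed: context closed; cannot load lazy relation
    }
}
```

In this model the only thing that decides between success and failure is whether the relation was touched before close() was called, which is the situation a detached object is in.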

Contexts are closed when no longer used because they do take up a lot of memory. The art of persistence is to keep contexts around for as long as necessary but no longer. Note that in some ORM implementations it is possible to partially discard contexts, ie to remove from the context objects that are known to no longer be needed, but that is not a general-purpose solution; it is just too hard to manually track exactly what is in use and what is not.

An object which has been detached (ie whose context has been closed) can be reattached to a different context. This allows code to load an object, detach it, modify it and later (potentially days later) create a new context, reattach the object, then flush the new context, causing that object to be written to the database. Of course, if the corresponding data in the database has been modified in the meantime then the save may fail, for example when optimistic locking detects the conflict.
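The load, detach, modify and reattach cycle can be sketched with a toy model. The version counter used here to detect a concurrent change is an assumption made for illustration (it resembles optimistic locking, one common way an ORM detects stale data); DetachedPerson, ToyDatabase and ReattachDemo are hypothetical names, not JPA classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy entity that remembers the row version it was loaded with. */
class DetachedPerson {
    final int id;
    String name;
    int version;

    DetachedPerson(int id, String name, int version) {
        this.id = id;
        this.name = name;
        this.version = version;
    }
}

/** Toy table keyed by id; stands in for the real database. */
class ToyDatabase {
    private final Map<Integer, DetachedPerson> rows = new HashMap<>();

    void insert(int id, String name) {
        rows.put(id, new DetachedPerson(id, name, 0));
    }

    /** Returns a copy of the row; the caller holds it "detached". */
    DetachedPerson load(int id) {
        DetachedPerson row = rows.get(id);
        return new DetachedPerson(row.id, row.name, row.version);
    }

    /** Reattach and flush: succeeds only if the row is unchanged since it was loaded. */
    boolean save(DetachedPerson detached) {
        DetachedPerson row = rows.get(detached.id);
        if (row.version != detached.version) {
            return false; // somebody else saved first; the detached copy is stale
        }
        row.name = detached.name;
        row.version++;
        detached.version = row.version;
        return true;
    }
}

class ReattachDemo {
    /** Two users load the same row, both edit it, and both try to save. */
    static boolean[] run() {
        ToyDatabase db = new ToyDatabase();
        db.insert(1, "Ann");
        DetachedPerson first = db.load(1);
        DetachedPerson second = db.load(1);
        first.name = "Anna";
        second.name = "Anne";
        boolean firstSaved = db.save(first);   // row still at the loaded version
        boolean secondSaved = db.save(second); // row was changed by the first save
        return new boolean[] { firstSaved, secondSaved };
    }
}
```

Here the first save succeeds and the second is rejected, because the row changed between the second load and the second save.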

The above information about JPA also applies to Hibernate and Toplink, which work in a very similar manner.

Using EJBs (separated web tier and business logic)

When using the full Java EE framework, the web and logic tiers are strongly separated, and may be on separate physical machines. The logic tier (EJBs) is responsible for performing all database access; it uses declarative security and other mechanisms to ensure that external code (including a web tier, native GUI applications, etc) can only read data that it has rights to read, and can only modify data via the APIs provided by the EJBs.

In this case, the web tier has no database connection, and clearly can only navigate relations that are present in the serialized data returned by EJBs. In the old days (before JPA), fields representing such relationships would simply be null, and accessing them would trigger a NullPointerException or similar. If an EJB uses JPA to load objects, then returns those objects to a remote client, then any attempt to navigate a relationship that is not populated will instead result in a LazyInitializationException; the effect is the same but the difference exists because JPA uses proxy objects to implement lazy loading. These proxy objects get returned along with the "real" data, but as there is no longer a context that the real referenced objects can be loaded into (and no database connection to do it with!) they cannot execute.

In practice, this does mean that the EJB tier has to be very aware of the needs of its presentation tier (eg the web tier), but this does seem unavoidable, and is the price of the strong separation between business and presentation.

Because an application using this architecture provides no database connection to the web tier, Orchestra cannot provide any support for conversation-scoped persistence contexts. Orchestra's persistence support is only for use with applications that have their business logic and presentation logic in the same tier.

Note that JBoss Seam provides similar "conversation-scoped" persistence support, but this also only applies when the business logic and the presentation logic are in the same "tier". When remote EJBs are used to implement interactions with the database then there is simply nothing the web tier can do about this.

Single Tier Applications -- Old Style

Much code that does not use EJBs is nevertheless written in the stateless-session-bean style. When displaying a persistent object a backing bean method will open a persistence-context, read the relevant objects into it and then close the context. JSF components then render the object, but must be careful not to access any relations that were not initialised before the context was closed. A later action method which wants to save data will open a new context, reattach the object retrieved earlier, flush the context to the database, then close the context again.

An alternative is to allow the object context to exist for the entire http request, closing it only when the request is about to return (after render phase). This is referred to as the "Open Session In View" pattern. In this way, the context is cached (eg in a thread-local variable) for the duration of the request, and any JSF action method that wants to access persistent objects just retrieves that context and uses it. This is an extremely useful pattern, with no known drawbacks.
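The thread-local caching described above can be sketched as follows. This is a simplified model rather than a servlet filter or any framework's actual implementation: RequestContext, RequestContextHolder and OpenSessionInViewDemo are hypothetical names, and a Runnable stands in for the action and render phases of a request.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy persistence context that only records whether it is still open. */
class RequestContext {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

/** Caches one context per request thread, in the style of "Open Session In View". */
class RequestContextHolder {
    private static final ThreadLocal<RequestContext> CURRENT = new ThreadLocal<>();

    /** Any code running inside the request retrieves the cached context here. */
    static RequestContext current() {
        RequestContext ctx = CURRENT.get();
        if (ctx == null) {
            throw new IllegalStateException("no request in progress");
        }
        return ctx;
    }

    /** Wraps one request: open before the action runs, close after rendering. */
    static RequestContext handleRequest(Runnable actionAndRender) {
        RequestContext ctx = new RequestContext();
        CURRENT.set(ctx);
        try {
            actionAndRender.run();
        } finally {
            ctx.close();
            CURRENT.remove();
        }
        return ctx;
    }
}

class OpenSessionInViewDemo {
    /** Records whether the context is open during the action, during render, and afterwards. */
    static List<Boolean> run() {
        List<Boolean> seen = new ArrayList<>();
        RequestContext used = RequestContextHolder.handleRequest(() -> {
            seen.add(RequestContextHolder.current().isOpen()); // action method
            seen.add(RequestContextHolder.current().isOpen()); // render phase
        });
        seen.add(used.isOpen()); // after the request has returned
        return seen;
    }
}
```

The context is open throughout the request and closed as soon as the request returns, so nothing loaded through it stays attached for a later request.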

However with "Open Session In View" the context is still tied to just one request. This still exposes the application to potential LazyInitializationExceptions, as follows:

  • request #1 loads an object from the database and stores it in the http session. The page is nicely rendered, and no LazyInitializationException can occur. The context is closed at the end of the request.

  • request #2 is executed, which tries to display more data from the object that is cached in the session. That data is not currently loaded, however, and the object is no longer in a context. An exception therefore occurs.

Single Tier Applications -- With Conversation Scoped Persistence

The solution to the problems described above is simply to store the PersistenceContext object in the http session, and only close it after the conversation has ended, ie when it has been decided to write to the database or abandon any changes made.

Although the default behaviour of JPA is to close the persistence context when a transaction commits or rolls back, this is optional, and simply disabling it allows the persistence context to be cached over multiple requests. Instead, the database connection associated with the persistence context is removed from the context at the end of each request and returned to the connection pool. At the start of the next request a (possibly different) connection is retrieved from the pool and attached to the persistence context again.

Note that database transactions still occur; typically a transaction is started on the context's connection at the beginning of each request and committed at the end of each request. However as long as no flush() operation has been invoked on the context, modifications made to objects in the context do not get written to disk. Of course it is not possible for a real database transaction to be kept open across requests, as that would require the database to keep rows in the database locked for as long as the http session lasts (potentially hours), which is simply unacceptable.
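The lifecycle described in this section (context kept in the http session, connection borrowed per request, nothing written until flush) can be modelled in a few lines. This is a toy sketch under stated assumptions, not Orchestra's implementation: ConversationContext and ConversationDemo are hypothetical names, plain maps stand in for the http session and the database, and strings stand in for pooled connections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy conversation-scoped context: survives requests, holds a connection only during one. */
class ConversationContext {
    private final Map<String, String> pendingChanges = new HashMap<>();
    private String connection; // null between requests

    void attach(String conn) { connection = conn; }

    String detach() {
        String conn = connection;
        connection = null;
        return conn;
    }

    /** Changes stay in memory until flush is called. */
    void modify(String field, String value) { pendingChanges.put(field, value); }

    /** Writes all pending changes to the toy database. */
    void flush(Map<String, String> database) {
        if (connection == null) {
            throw new IllegalStateException("no connection attached");
        }
        database.putAll(pendingChanges);
        pendingChanges.clear();
    }
}

class ConversationDemo {
    /** Runs one request: borrow a connection, do the work, hand the connection back. */
    static void request(Map<String, Object> session, Deque<String> pool,
                        Consumer<ConversationContext> work) {
        ConversationContext ctx = (ConversationContext) session.computeIfAbsent(
                "conversationContext", key -> new ConversationContext());
        ctx.attach(pool.pop());
        try {
            work.accept(ctx);
        } finally {
            pool.push(ctx.detach());
        }
    }

    /** Returns the number of rows in the toy database after each of three requests. */
    static List<Integer> run() {
        Map<String, Object> session = new HashMap<>();
        Map<String, String> database = new HashMap<>();
        Deque<String> pool = new ArrayDeque<>();
        pool.push("conn-1");
        List<Integer> rowsAfterRequest = new ArrayList<>();

        request(session, pool, ctx -> ctx.modify("name", "Anna"));
        rowsAfterRequest.add(database.size());
        request(session, pool, ctx -> ctx.modify("email", "anna@example.org"));
        rowsAfterRequest.add(database.size());
        request(session, pool, ctx -> ctx.flush(database)); // end of conversation
        rowsAfterRequest.add(database.size());

        session.remove("conversationContext"); // the conversation has ended
        return rowsAfterRequest;
    }
}
```

The first two requests leave the toy database empty; only the third, which flushes, writes both changes at once.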

There are dangers to this approach; a persistence context can become large if many objects have been loaded into it. Care needs to be taken to control the amount of persistent data read or written during a conversation. Manually removing objects that are no longer needed from the context can be useful, or secondary persistence contexts can be created to perform operations on objects that need not be kept in the conversation scope, particularly reads of objects that will not be modified as part of the conversation.

There is also a potential problem when inserting new objects into the context which have a database key that is generated via a database write. In this case, the database write occurs in the request in which the object was added to the context, even though the object's data does not get written until the end of the conversation. If the conversation is "rolled back" then the key operation remains, as that transaction has long since been committed.

Handling Transactions

The scope of a persistence context and database transaction boundaries (begin/commit) are separate issues. Multiple database transactions *can* occur within the lifetime of a single persistence context. This is not the default: by default, a commit or rollback will close the associated context. However, it is possible to disable this default, in which case the persistence context can outlive individual transactions.

How to Use Orchestra Conversation Scoped Persistence

When configured appropriately, Spring will automatically scan the beans it loads for the standard persistence annotations, and inject a persistence context where requested by the bean. Orchestra ensures that the persistence context Spring injects is the appropriate one for the current conversation.

Your code is therefore straightforward:

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

public class ComponentDAO
{
  // Injected by Spring; Orchestra ensures it belongs to the current conversation.
  @PersistenceContext
  private EntityManager entityManager;
  ....
}

Spring's annotation support does require, however, that the class containing the annotation be declared as a Spring bean, and instantiated via Spring. This means that all code which uses an instance of the above class needs to have it injected, rather than using the new operator to create the instance. Existing code therefore may need to be restructured to take advantage of persistence annotations with Orchestra.

The persistence context object that Spring injects is actually a proxy which looks up the correct EntityManager object to use on each method call, so it is safe to use singleton scope (Spring's default scope). Of course, if the class has any other non-static members then the scope should be set to "prototype", to avoid conflicts between different instances of this class in different conversations.
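The idea of a shared proxy that resolves the right EntityManager on every call can be sketched with the JDK's dynamic proxies. This is a simplified model and not the code Spring or Orchestra actually use: ToyEntityManager is a hypothetical one-method interface standing in for EntityManager, and a thread-local in the hypothetical CurrentConversation class stands in for the lookup of the active conversation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Hypothetical minimal interface standing in for EntityManager. */
interface ToyEntityManager {
    String describe();
}

/** Assumed lookup point: the entity manager of the conversation the current thread serves. */
class CurrentConversation {
    static final ThreadLocal<ToyEntityManager> ACTIVE = new ThreadLocal<>();
}

class SharedProxyDemo {
    /** Builds a single proxy that could safely be injected into a singleton bean. */
    static ToyEntityManager createProxy() {
        InvocationHandler handler = (proxy, method, args) -> {
            // Resolved on every call, so the proxy itself holds no conversation state.
            ToyEntityManager target = CurrentConversation.ACTIVE.get();
            if (target == null) {
                throw new IllegalStateException("no active conversation");
            }
            return method.invoke(target, args);
        };
        return (ToyEntityManager) Proxy.newProxyInstance(
                ToyEntityManager.class.getClassLoader(),
                new Class<?>[] { ToyEntityManager.class },
                handler);
    }

    /** Simulates a call made while the named conversation is active. */
    static String callWithin(ToyEntityManager sharedProxy, String conversationName) {
        CurrentConversation.ACTIVE.set(() -> "entity manager of " + conversationName);
        try {
            return sharedProxy.describe();
        } finally {
            CurrentConversation.ACTIVE.remove();
        }
    }

    public static void main(String[] args) {
        ToyEntityManager shared = createProxy();
        System.out.println(callWithin(shared, "conversation A")); // entity manager of conversation A
        System.out.println(callWithin(shared, "conversation B")); // entity manager of conversation B
    }
}
```

Because the same proxy instance answers differently depending on the active conversation, a bean holding only this proxy carries no per-conversation state of its own. Any other instance field would not get this treatment, which is why such fields call for a narrower scope.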

Documentation Still TODO

TODO: document Orchestra's transaction support features

TODO: is the persistence-context serializable? Are all persistent objects in the context always serializable?