Distributable J2EE Web Applications
A Container Provider's View of the current Servlet Specification.

The 'Java(tm) Servlet Specification, Version 2.4' makes a number of references to 'distributable' web applications and HttpSession 'migration'. It states that compliant deployments "...can ensure scalability and quality of service features like load-balancing and failover..." (SRV.7.7.2). In today's demanding enterprise environments, such features are increasingly required. This paper sets out to distil the relevant contents of the specification, construct a model of the functionality that they appear to support, assess this functionality with regard to feasibility and popular requirements and, finally, suggest how a compliant implementation might be architected.

Prerequisites.

TODO - A good understanding of what an HttpSession is, what it is used for and how it behaves will be necessary for a full understanding of this content. A comprehensive grasp of the requirements driving architectures towards clustering and of common cluster components (such as load-balancers) will also be highly beneficial.

The Servlet Specification - distilled:

When a webapp declares itself <distributable/> it enters into a contract with its container. The Servlet Specification includes a dry bones description of this contract which we will distil from it and flesh out in this paper.

For a successful outcome the implementors of both Container and Containee need to be agreed on exactly what behaviour is expected of each other. For a really deep understanding of the contract they will need to know why it is as it is (TODO - This paper will provide such a view, from both sides).

The Specification mandates the following behaviour for distributable Servlets:

Non-Distributable Servlets

Only Servlets deployed within a webapp may be distributable. (TODO - Ed.: is there any other standard way to deploy a Servlet? Perhaps through the InvokerServlet?) (SRV.3.2) TODO - WHY?

Single Threaded Servlets

SingleThreadModel Servlets, whilst discouraged (since it is generally more efficient for the Servlet writer, who understands the problem domain, to deal with application synchronisation issues), are limited to a single instance pool per JVM (SRV.2.3.3.1).

Multi-Threaded Servlets

Multithreaded HttpServlets are restricted to one Servlet instance per JVM, thus delegating all application synchronisation issues to a single point where the Servlet's writer may resolve them with application-level knowledge (SRV.2.2).

Distributable State

The only state to be distributed will be the HttpSession. Thus all application state that requires distribution must be housed in an HttpSession or alternative distributed resource (e.g. EJB, DB, etc.). The contents of the ServletContext are NOT distributed. (SRV.3.2, SRV.3.4.1, SRV.14.2.8)

HttpSession Migration

Moving HttpSessions between process boundaries (i.e. from JVM to JVM, or JVM to store) is termed 'migration'. In order that the container should know how to migrate application-space Objects, stored in an HttpSession, they must be of mutually agreed type.

In a J2EE (Version 1.4) environment (e.g. in a web container embedded in an application server), the set of supported types for HttpSession attributes is as follows, although web containers are free to extend this set (J2EE.6.4): (Note that using an extended type would impact your webapp's portability).

Breaking this contract through use of an unagreed type will result in the container throwing an IllegalArgumentException upon its introduction to the HttpSession, since the container must maintain the migratability of this resource (SRV.7.7.2).
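The type contract can be illustrated with a minimal sketch. The class below is a hypothetical stand-in for a container's session implementation (it is not the real HttpSession API) and assumes, for simplicity, that java.io.Serializable is the only agreed type; a real J2EE container would also accept the further types listed in J2EE.6.4.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a distributable session; not the real HttpSession API.
class CheckedSessionSketch {
    private final Map<String, Object> attributes = new HashMap<>();

    // Reject, at the point of introduction, any value the container could not migrate.
    // Assumption: Serializable is the only mutually agreed type in this sketch.
    void setAttribute(String name, Object value) {
        if (value != null && !(value instanceof Serializable)) {
            throw new IllegalArgumentException(
                "Attribute '" + name + "' is not of a migratable type: "
                    + value.getClass().getName());
        }
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    // Returns true if the value was accepted, false if the contract check rejected it.
    static boolean accepts(Object value) {
        try {
            new CheckedSessionSketch().setAttribute("probe", value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With this sketch, a String attribute is accepted, whereas a plain java.lang.Object or a Thread is rejected as soon as the application tries to place it in the session, rather than later when a migration is attempted.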

Migration Implementation

How migration is actually implemented is undefined and left up to the container provider (SRV.7.7.2). The application is not even guaranteed that the container will use readObject() and writeObject() (TODO explain) methods if they are present on an attribute. The only guarantee given by the specification is that their "serializable closure" will be "preserved" (SRV.7.7.2). This is to allow the container provider maximum flexibility in this area.
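One mechanism a container provider might choose, although the specification does not require it, is standard Java serialisation. The following is a minimal sketch under that assumption; the class and method names (MigrationSketch, emigrate, immigrate) are hypothetical, and the attribute map stands in for a session's contents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;

// Hypothetical helper: one possible migration mechanism, not one mandated by the specification.
class MigrationSketch {

    // Leaving a node: flatten the attribute map, and everything reachable from it, into bytes.
    static byte[] emigrate(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new IllegalStateException("session could not be serialised", e);
        }
        return bytes.toByteArray();
    }

    // Arriving on another node (or returning from a store): rebuild the attribute map.
    @SuppressWarnings("unchecked")
    static HashMap<String, Serializable> immigrate(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<String, Serializable>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("session could not be deserialised", e);
        }
    }

    // Round-trips a small session: the copy is a distinct object graph with equal contents.
    static boolean closurePreserved() {
        ArrayList<String> basket = new ArrayList<>();
        basket.add("book");
        basket.add("pen");
        HashMap<String, Serializable> session = new HashMap<>();
        session.put("basket", basket);
        session.put("user", "alice");

        HashMap<String, Serializable> copy = immigrate(emigrate(session));
        return copy != session && copy.equals(session);
    }
}
```

Because the whole map is written through a single stream, everything reachable from it travels together, which is one reading of what preserving the "serializable closure" requires.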

HttpSessionActivationListener

The specification describes an HttpSessionActivationListener interface. Attributes requiring notification before or after migration can implement this. The container will call their sessionWillPassivate() method just before passivation, thus giving them the chance to e.g. release non-serialisable resources. Immediately after activation the container will call their sessionDidActivate() method, giving them the chance to e.g. reacquire such resources (SRV.7.7.2, SRV.10.2.1, SRV.15.1.7, SRV.15.1.8). Support for a number of other such listeners is required in a compliant implementation, but these are not directly related to session migration.
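The notification sequence around a migration can be sketched as follows. To keep the example self-contained, ActivationAwareSketch is a hypothetical stand-in for javax.servlet.http.HttpSessionActivationListener (whose real methods take an HttpSessionEvent argument), ReportCache is an invented attribute, and Java serialisation is assumed as the transport.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for HttpSessionActivationListener (real methods take an HttpSessionEvent).
interface ActivationAwareSketch {
    void sessionWillPassivate();
    void sessionDidActivate();
}

// Invented attribute that owns a resource it does not want serialised.
class ReportCache implements ActivationAwareSketch, Serializable {
    private static final long serialVersionUID = 1L;
    private final String reportName;
    private transient StringBuilder buffer = new StringBuilder(); // stands in for the resource

    ReportCache(String reportName) { this.reportName = reportName; }

    public void sessionWillPassivate() { buffer = null; }               // release
    public void sessionDidActivate() { buffer = new StringBuilder(); }  // reacquire

    boolean isUsable() { return buffer != null; }
    String reportName() { return reportName; }
}

class ActivationDemo {
    // Returns { usable before, usable while passivated, usable after arrival on the new node }.
    static boolean[] migrate() {
        ReportCache original = new ReportCache("sales");
        boolean before = original.isUsable();

        original.sessionWillPassivate();          // container call just before passivation
        boolean whilePassive = original.isUsable();

        ReportCache arrived;
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                arrived = (ReportCache) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("migration failed", e);
        }

        arrived.sessionDidActivate();             // container call immediately after activation
        return new boolean[] { before, whilePassive, arrived.isUsable() };
    }
}
```

Note that between the two calls the attribute is deliberately in a state that is valid for serialisation but not for normal use.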

HttpSession Affinity

Given that: we can see that any implementation must resolve these apparently contradictory issues satisfactorily.

The Servlet Specification states:

"All requests that are part of a session must be handled by one Java Virtual Machine (JVM) at a time." (SRV.7.7.2).

The intention of this statement is to resolve such concurrency issues. It prunes the tree of possible implementations substantially, insisting that all concurrent requests for a particular session are delivered to the same node.

Delivering requests for the same session to the same node is known variously as 'session affinity', 'sticky sessions', 'persistent sessions' etc., depending on your container's vendor. The specification is trading complexity in the web-container tier for complexity in the load-balancer tier. This added requirement will impact the latency of this tier, in that the load-balancer will generally need to parse the URI or headers of each HTTP request travelling through it (in a non-encrypted form) in order to extract the target session id. However, the reduction of potentially awkward concurrency issues/race conditions in the web-container tier is a gain considered worth this sacrifice.

It is worth noting that, since we have now introduced a requirement for the load-balancer tier to have knowledge of the location of HttpSessions within the web-container tier, the ability to 'migrate' these objects may require a certain amount of coordination between the two tiers.
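The work that affinity pushes onto the load-balancer tier can be sketched as below. The session id is looked for in the JSESSIONID cookie and then in the ';jsessionid=' path parameter used by URL rewriting. The routing rule is an assumption made for illustration: the id is taken to carry a '.<node>' suffix naming the node that owns the session, a convention similar to the one used by some load-balancer integrations. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical load-balancer helper; the "<id>.<node>" id format is an assumed convention.
class AffinityRouterSketch {
    // Session id carried by URL rewriting, e.g. /cart;jsessionid=abc.node2?item=1
    private static final Pattern URI_ID = Pattern.compile(";jsessionid=([^?#;/]+)");
    // Session id carried in the Cookie header, e.g. "theme=dark; JSESSIONID=abc.node2"
    private static final Pattern COOKIE_ID = Pattern.compile("(?:^|;\\s*)JSESSIONID=([^;]+)");

    // Extracts the session id from the Cookie header or, failing that, from the URI.
    static String sessionId(String uri, String cookieHeader) {
        if (cookieHeader != null) {
            Matcher m = COOKIE_ID.matcher(cookieHeader);
            if (m.find()) {
                return m.group(1);
            }
        }
        if (uri != null) {
            Matcher m = URI_ID.matcher(uri);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null; // no session yet: the balancer is free to pick any node
    }

    // Chooses the node named by the id's suffix, or the fallback when there is none.
    static String targetNode(String sessionId, String fallbackNode) {
        if (sessionId == null) {
            return fallbackNode;
        }
        int dot = sessionId.lastIndexOf('.');
        boolean hasSuffix = dot >= 0 && dot < sessionId.length() - 1;
        return hasSuffix ? sessionId.substring(dot + 1) : fallbackNode;
    }
}
```

Under this assumed scheme, migrating a session to another node means that the suffix seen by the load-balancer must change as well, which is one concrete form of the coordination between tiers mentioned above.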

Background Threads

The previous requirement reduces our problem from race conditions between distributed objects in different JVMs, to a situation where we simply have to manage coordination between multiple threads in the same JVM. The purpose of this coordination is to ensure that access to container managed resources that are available to multiple concurrent application space threads is properly synchronised.

Whilst the container has implicit knowledge about any thread, executing application code, for the lifecycle of which it is responsible (i.e. request threads), it has no control over any thread that is entirely managed by application code, i.e. a 'background thread'. Such threads might execute across request boundaries, accessing otherwise predictably dormant resources that might otherwise be passivated or migrated elsewhere.

Fortunately, the specification also recommends that references to container-managed objects should not be given to threads that have been created by an application (SRV.2.3.3.3, SRV.S.17) and whose lifecycle is not entirely bounded by that of a request thread. The container is encouraged to generate warnings if this should occur. Application developers should understand that recommendations such as this become all the more important when working in a distributed environment.

This concept of "container-managed objects" needs more careful discussion and we shall look at it more closely later.

HttpSession Events

Finally, given that HttpSessions are the only type to be distributed and that they should only ever be in one JVM at one time, it should come as no surprise that ServletContext and HttpSession events are not propagated outside the JVM in which they were raised (SRV.10.7) as this would result in container owned objects becoming active in a JVM through which no relevant request thread was passing.

Is this adequate?

Armed now with a deeper understanding of exactly what the specification says about distributable webapps, we can begin to speculate on what a compliant implementation might look like.

The specification has done a reasonably good job of outlining our area of interest. Before implementing a container, however, there are a number of issues that we still need to address.

Catastrophic failure

TODO - Looking at what this specification actually says about distributable webapps, it can be seen immediately that it seems to reliably outline a mechanism for the controlled shutdown of a node and the attendant migration of its sessions to [an]other node[s], or persistent storage.

The ability to migrate sessions on controlled shutdown is useful functionality (maintenance will be one of the main reasons behind the occurrence of session migration), but it does not go far enough for many enterprise-level users, who require a solution capable of transparent recovery, without data loss, even in the case of a node's catastrophic failure. If a node is simply switched off, thus having no chance to perform a shutdown sequence, then volatile state will simply be lost. It is too late to call HttpSessionActivationListener.sessionWillPassivate() where necessary and serialise all user state to a safe place! Container implementors must ask themselves the question - 'What, within the bounds of the current specification, can we do to mitigate this event?'.

Before moving into more detailed discussion about session migration we need to discuss the synchronisation of session attributes and to introduce the concepts of 'Reference vs. Value Based Semantics' and 'Object Identity'.

Session Attribute Synchronisation

We have shown that there are many times at which a container may wish to take a backup copy, via serialisation, of a session or session attribute. In a multi-threaded environment the container needs to be able to ensure a consistent view of the object that it is backing up, i.e. the object must remain unchanged throughout the process of serialisation, otherwise the backup copy cannot be guaranteed valid.

If we classify session attributes as "container-managed objects", then we can see that the specification 'recommends' their references not being given to any application thread running beyond the scope of a request. This means that, provided that no request threads for this session are running in the container, we can be assured of thread-safe access to its attributes and thus a consistent snapshot of the session's state.

If we classify sessions but not session attributes as "container-managed objects", then this assumption breaks down.

Even given this assumption, backing up of sessions when a relevant request or background thread is running (e.g. 'When' policies 'Immediate' and 'Request') becomes problematic. This is unfortunate, because inability to implement these policies impacts on the guarantees that the container can make and thus the quality of service that it can offer.
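The idea that a session may safely be copied only while no request thread is inside it can be sketched with a read/write lock: request threads share the lock for the duration of a request, and the backup needs it exclusively. This is a minimal sketch under that assumption, with hypothetical names throughout. It offers no protection against background threads that hold references to attributes, which is exactly the weakness discussed above, and the copy it takes is shallow where a real backup would serialise.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical container-side wrapper around one session's attributes.
class SnapshotGuardSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Object> attributes = new ConcurrentHashMap<>();

    // Called by the container as a request thread for this session enters and leaves.
    // The shared (read) lock lets many requests for the session run concurrently.
    void requestEnter() { lock.readLock().lock(); }
    void requestExit() { lock.readLock().unlock(); }

    void setAttribute(String name, Object value) { attributes.put(name, value); }

    // Waits until no request thread is inside the session, then copies its attributes.
    Map<String, Object> snapshot() {
        lock.writeLock().lock();
        try {
            return new HashMap<>(attributes); // shallow copy; a real backup would serialise
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Non-blocking variant: returns null if any request currently holds the session.
    Map<String, Object> trySnapshot() {
        if (!lock.writeLock().tryLock()) {
            return null;
        }
        try {
            return new HashMap<>(attributes);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The blocking variant illustrates why backing up while requests are in flight is awkward: either the backup waits for the session to fall idle, or it has to give up and try again later.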

These issues are not isolated to the management of HttpSessions; they are present throughout distributed software architectures. Aside from an explicit synchronisation protocol, a common and practical solution is to alter the semantics of object equality.

Because the design of HttpSessions did not originally encompass their distributability, the container provider is left to choose between:

  • an explicit session attribute synchronisation protocol between application and container code
  • a shift from reference to value based semantics

Object Identity is also an issue.

Reference vs Value Based Semantics - TODO - needs refactoring.

Given the following Servlet code snippet:
    Foo foo1=new Foo();
    session.setAttribute("foo", foo1);
    Foo foo2=(Foo)session.getAttribute("foo");
    
Which of these assertions (assuming that Foo.equals() is well implemented) would you expect to be true?

If you expect foo1==foo2 then you are expecting reference-based semantics.

If you are expecting reference-based semantics you might well write code such as this in order to avoid unnecessary de/rehashes:

    Point p=new Point(0,0);
    session.setAttribute("point", p);
    p.setX(100);
    p.setY(100);
    
and then might expect that :
    ((Point)session.getAttribute("point")).getX()==100;
    

Using value-based semantics, out of these three (TODO) assertions, only the second of the equality tests would succeed.

Every parameter passed to and from a value based API must be assumed to be copied from an original, since it may have come across the wire from another address space.

For this reason, when you start dealing with (possibly) remote objects in a distributed scenario, you generally shift your semantics from reference to value. (c.f. Remote EJB APIs)
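The difference between the two semantics can be demonstrated with a small sketch. Here a plain map stands in for a session with reference-based semantics, and ValueSessionSketch is a hypothetical session that copies every attribute, via serialisation, on the way in and on the way out. PointBean is an invented attribute type with a well implemented equals().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Invented attribute type with value equality.
class PointBean implements Serializable {
    private static final long serialVersionUID = 1L;
    private int x;
    private int y;
    PointBean(int x, int y) { this.x = x; this.y = y; }
    int getX() { return x; }
    void setX(int x) { this.x = x; }
    @Override public boolean equals(Object o) {
        return o instanceof PointBean && ((PointBean) o).x == x && ((PointBean) o).y == y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

// Hypothetical session with value-based semantics: attributes are held as serialised copies.
class ValueSessionSketch {
    private final Map<String, byte[]> attributes = new HashMap<>();

    void setAttribute(String name, Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        attributes.put(name, bytes.toByteArray());
    }

    Object getAttribute(String name) {
        byte[] data = attributes.get(name);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class SemanticsDemo {
    static String run() {
        PointBean p = new PointBean(0, 0);
        Map<String, Object> referenceSession = new HashMap<>(); // reference-based stand-in
        ValueSessionSketch valueSession = new ValueSessionSketch();
        referenceSession.put("point", p);
        valueSession.setAttribute("point", p);

        p.setX(100); // mutate after storing, without calling setAttribute() again

        PointBean byRef = (PointBean) referenceSession.get("point");
        PointBean byVal = (PointBean) valueSession.getAttribute("point");
        return "ref: same=" + (byRef == p) + " x=" + byRef.getX()
            + "; val: same=" + (byVal == p) + " x=" + byVal.getX();
    }
}
```

SemanticsDemo.run() returns "ref: same=true x=100; val: same=false x=0": under value-based semantics the later mutation is invisible to the session until the attribute is explicitly set again.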

Unfortunately, the Servlet Specification, whilst clearly mandating that every session attribute must be of a type that the container knows how to move from VM to VM, omits to mention that a possible impact of doing this is an important shift in semantics. This is exacerbated by the fact that, unlike EJBs, which have been designed specifically for distributed use, the HttpSession API does not change (c.f. Local/Remote) according to the semantic that is required; the choice is simply a single deployment option. This encourages developers to believe that they can make a webapp that has been written for Local use into a fully functional distributed component, simply by adding the relevant tag to the web.xml. All attendant problems are delegated, by spec and developer, to the unfortunate container provider.

Thus the container provider must make a choice here

Object Identity, Object Streams and Synchronisation

TODO - I guess Object Identity can only be preserved within a single Object tree ? so attribute-based distribution will not recognise the same object shared between different attributes
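For standard Java serialisation, this can be checked directly: an ObjectOutputStream remembers the objects it has already written and emits back-references to them, so identity is preserved within one stream but not across separate streams. The sketch below (hypothetical class and method names) writes the same list twice, first through one stream and then through two.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class IdentityDemo {
    // Writes all the given objects through ONE ObjectOutputStream.
    static byte[] write(Object... objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Reads 'count' objects back from one stream.
    static Object[] read(byte[] data, int count) {
        Object[] result = new Object[count];
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                result[i] = in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    // Two attributes sharing one object, streamed together (whole-session style).
    static boolean identityKeptInOneStream() {
        ArrayList<String> shared = new ArrayList<>();
        shared.add("x");
        Object[] back = read(write(shared, shared), 2);
        return back[0] == back[1];
    }

    // The same two attributes, each streamed on its own (attribute-by-attribute style).
    static boolean identityKeptAcrossStreams() {
        ArrayList<String> shared = new ArrayList<>();
        shared.add("x");
        Object first = read(write(shared), 1)[0];
        Object second = read(write(shared), 1)[0];
        return first == second;
    }
}
```

identityKeptInOneStream() returns true and identityKeptAcrossStreams() returns false, so a container that distributes attributes individually will hand back two independent copies of an object that the application originally shared between two attributes.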

How can we guarantee, unless we know that no other threads are running, the synchronisation of values as we stream them out of the container ?

Session Backup - When

The answer to the concern of lost data is to frequently ship backup copies off-node, so that in the case of its catastrophic failure, we have a fallback position. The freshness of our backup data depends directly on the frequency of this process. This frequency is bounded by resource concerns and the contract between container and containee, as discussed above.

Let us examine some of the possibilities:

TODO - NEEDS CONCLUDING

Session Backup - What ?

Once we have decided when to backup, we must think about what to backup. Candidates include the following:

HttpSessionActivationListener:

The Servlet Specification has one final curve ball to throw at the Container Provider here. We have already seen how HttpSessionActivationListeners are notified around passivation/activation. Assuming that they require this in order to prepare themselves for serialisation, or recover from deserialisation, it is likely that when the container calls their sessionWillPassivate() method they will move to a new state that, whilst valid for serialisation, is invalid for normal runtime operation. They might e.g. release a resource that would be too expensive or awkward to passivate, knowing that they can reacquire a replacement upon re-activation.

Imagine now that, rather than migrating a session from one node to another, we are simply taking a backup of it at the end of a request group, as a guard point against the node's catastrophic failure. If we simply call sessionWillPassivate() and then serialise a copy into our backup store, we will have the backup that we required, but will have left the attribute in a state which may mean it is invalid for normal operations.

The solution is to call sessionDidActivate() immediately after taking the copy, thus restoring the attribute to its previous valid state. In effect the backup procedure may be thought of as a mini-migration off a node and then straight back onto it again, leaving a spare copy off-node.
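The 'mini-migration' can be sketched as a passivate, serialise, activate sequence applied to the live attribute. PassivationAwareSketch is a hypothetical stand-in for HttpSessionActivationListener (the real methods take an HttpSessionEvent), ResourceHolderSketch is an invented attribute, and Java serialisation is assumed for the copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for HttpSessionActivationListener (real methods take an HttpSessionEvent).
interface PassivationAwareSketch {
    void sessionWillPassivate();
    void sessionDidActivate();
}

// Invented attribute that releases a resource for serialisation and reacquires it afterwards.
class ResourceHolderSketch implements PassivationAwareSketch, Serializable {
    private static final long serialVersionUID = 1L;
    private transient Object resource = new Object(); // stands in for e.g. a connection
    private int acquisitions = 1;                     // how often the resource was acquired

    public void sessionWillPassivate() { resource = null; }
    public void sessionDidActivate() { resource = new Object(); acquisitions++; }

    boolean isUsable() { return resource != null; }
    int acquisitions() { return acquisitions; }
}

class BackupSketch {
    // Takes an off-node copy of a live attribute and leaves the original usable.
    static byte[] backup(Serializable attribute) {
        PassivationAwareSketch aware = (attribute instanceof PassivationAwareSketch)
            ? (PassivationAwareSketch) attribute : null;
        if (aware != null) {
            aware.sessionWillPassivate();     // move into a serialisable state
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(attribute);
            }
            return bytes.toByteArray();       // the spare copy to ship off-node
        } catch (IOException e) {
            throw new IllegalStateException("backup failed", e);
        } finally {
            if (aware != null) {
                aware.sessionDidActivate();   // restore the live attribute, even on failure
            }
        }
    }
}
```

The acquisitions counter makes the cost visible: every backup of this attribute releases and reacquires its resource once, whether or not the attribute has changed.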

This has interesting ramifications for the whole 'Session' backup policy which may end up doing this to many attributes which have not actually been added or altered since the last backup was taken. If this involves an expensive release and reacquisition of resources, the impact may be substantial. The 'Delta' policy will not suffer from this inefficiency, since it will only concern itself with attributes that have changed.
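The contrast between the two policies can be sketched with simple dirty tracking. The class below is hypothetical and assumes that a change is only detected when the application calls setAttribute() or removeAttribute(); an attribute mutated in place, as reference-based semantics permit, would go unnoticed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical session that remembers which attributes changed since the last backup.
class DeltaTrackingSessionSketch {
    private final Map<String, Object> attributes = new HashMap<>();
    private final TreeSet<String> dirty = new TreeSet<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name);
    }

    void removeAttribute(String name) {
        attributes.remove(name);
        dirty.add(name); // a removal must be propagated to the backup as well
    }

    // 'Delta' policy: only what changed since the last backup. A name mapped to
    // null means "removed". Clears the dirty set once the delta has been taken.
    Map<String, Object> takeDelta() {
        Map<String, Object> delta = new HashMap<>();
        for (String name : dirty) {
            delta.put(name, attributes.get(name));
        }
        dirty.clear();
        return delta;
    }

    // Whole 'Session' policy: everything, changed or not.
    Map<String, Object> takeWhole() {
        dirty.clear();
        return new HashMap<>(attributes);
    }
}
```

Only the attributes returned by takeDelta() would need to go through the passivate, serialise, activate cycle, whereas takeWhole() subjects every attribute to it on every backup.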

Optimisations

TODO - These need to be discussed here so that we can draw upon them when discussing different impls.

Conclusions

In conclusion, we have been able to show that, whilst the spec does not explicitly cater for recovery from catastrophic failure, it does provide the Container Provider with enough structure to be able to implement various solutions to this problem. Unfortunately, because of its implicit reference-based semantics and failure to impose a mandatory protocol for the synchronisation of distributable session attributes, it does not go quite far enough to allow such implementations sufficient room to manoeuvre that they can deliver optimum results.

Given the current state of the specification, therefore, the solution space is an area of trade-off and compromise between not only accuracy and economy, as might well be expected, but also the semantics of reference, value and identity, which are really areas that should not be open to interpretation. Any application developer involving themselves in this would therefore be well advised to acquaint themselves fully with these concepts so as to be prepared for the unexpected side-effects that they are likely to cause.

No single set of the above policies is likely to implement the desired "silver bullet". However, with a solid understanding of the issues involved and the route that various implementations take through this maze, the application architect has a much improved chance of a successful outcome.

With this in mind, we may now survey existing implementations in the open source arena, with particular respect to the solutions that they have chosen to overcome the problems that we have identified.

Current Open Source Implementations

Jetty

The Jetty distribution contains a pluggable distributable session manager, written by the author, which relies on value-based semantics to implement an immediate, by default, delta-based replication strategy over JGroups (TODO - link), although other distribution policies, notably by CMP EJB and JBoss(tm) clustering (see below), are also available. With mod_jk and session affinity, via a pluggable session id generator, backups may be done asynchronously, or if deployment happens under a dumb load-balancer, backups may be taken synchronously to ensure the consistency of the session no matter where in the cluster a request lands.

Tomcat 4.x - Filip Hanik

Tomcat 5.x - ???

JBoss(tm)

JBoss(tm) contains a ClusteredHttpSession service (TODO - check name) which backs onto the JBoss clustering layer, which is implemented through replication using JGroups (TODO - link). Replication is done according to the whole 'Session' policy. The regularity of the backing up depends on the Web Container making use of the service.

The Jetty integration provides a pluggable 'Store' component which allows it to make use of this medium. Backups are taken immediately any change is made to a session. Jetty's other distribution policies may also be used.

The Tomcat integration relies on this service for its transport. (TODO - finish). CONFIRM.

Apache Geronimo (TODO - URL)

The Geronimo implementation is currently being undertaken by the author of this paper, and therefore takes into account all points raised herein.

Further Reading:

TODO - more readings needed.

Further Issues

Further Notes

TODO - Look into Geronimo impl... SRV.10.6 Listener Exceptions TODO - we can use a SecurityManager to prevent background threads being created. We can prevent access from such a thread to a container managed object, but we can't prevent such a reference being held by such a thread...

does anything else other than session need to be distributed ?

  • security info
  • application level data (as opposed to user level)
  • etc
  • TODO - replication is faster than shared-store because 'getAttribute' is not a remote call. Effectively, with replication, each replicant IS a shared store which processes requests locally.

    have we mentioned that migration is bad because cache hits go down ?

    Q for Niall - if the Object Identity table is scoped within a single ObjectOutputStream, then how come contention on this table is meant to be a VM-wide problem? Since the table only appears to scope the lifecycle of a single instance of this stream, how could it make sense for it to be a long-lived global construct?