Axis Architecture Guide

Under construction ....
Post-Alpha 3 Version
Feedback: axis-dev@xml.apache.org

Introduction
Architectural Overview
Subsystems
Message Flow Subsystem
    Handlers and Chains
    Message Contexts
    Engine
Administration Subsystem
Interaction Diagrams
    Client Side Processing
Open Issues

Introduction

This guide records some of the rationale of the architecture and design of Axis.

Architectural Overview

Axis consists of several subsystems working together, as we shall see later. In this section we'll give you an overview of how the core of Axis works.

Handlers and the Message Path in Axis

Put simply, Axis is all about processing Messages. When the central Axis processing logic runs, a series of Handlers are each invoked in order. The particular order is determined by two factors: the deployment configuration and whether the engine is a client or a server. The object which is passed to each Handler invocation is a MessageContext. A MessageContext is a structure which contains several important parts: 1) a "request" message, 2) a "response" message, and 3) a bag of properties. More on this in a bit.

There are two basic ways in which Axis is invoked:

As a server, a Transport Listener will create a MessageContext and invoke the Axis processing framework.
As a client, application code (usually aided by the client programming model of Axis) will generate a MessageContext and invoke the Axis processing framework.

In either case, the Axis framework's job is simply to pass the resulting MessageContext through a configurable set of Handlers, each of which has an opportunity to do whatever it is designed to do with the MessageContext.
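The idea can be sketched in a few lines of Java. This is a simplified model rather than the Axis source: the MessageContext and Handler types below are reduced stand-ins that borrow the Axis names, and HandlerPathSketch, process, and demo are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the Axis MessageContext: two messages plus a bag of properties.
class MessageContext {
    String requestMessage;
    String responseMessage;
    final Map<String, Object> properties = new HashMap<>();
}

// Simplified stand-in for the Axis Handler interface.
interface Handler {
    void invoke(MessageContext ctx);
}

class HandlerPathSketch {
    // Pass one MessageContext through every Handler, in the configured order.
    static void process(List<Handler> handlers, MessageContext ctx) {
        for (Handler handler : handlers) {
            handler.invoke(ctx);
        }
    }

    // Hypothetical configuration: one Handler sets a property, the next builds a response.
    static MessageContext demo() {
        List<Handler> handlers = new ArrayList<>();
        handlers.add(ctx -> ctx.properties.put("seen", Boolean.TRUE));
        handlers.add(ctx -> ctx.responseMessage = "echo: " + ctx.requestMessage);

        MessageContext ctx = new MessageContext();
        ctx.requestMessage = "ping";
        process(handlers, ctx);
        return ctx;
    }

    public static void main(String[] args) {
        System.out.println(demo().responseMessage);
    }
}
```

Each Handler sees the same MessageContext, so anything one Handler stores in the property bag is visible to the Handlers that run after it.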

Message Path on the Server

The server side message path is shown in the following diagram. The small cylinders represent Handlers and the larger, enclosing cylinders represent Chains.

A message arrives (in some protocol-specific manner) at a Transport Listener. In this case, let's assume the Listener is an HTTP servlet. It's the Listener's job to package the protocol-specific data into a Message object (org.apache.axis.Message), and put the Message into a MessageContext. The MessageContext is also loaded with various properties by the Listener; in this example the property "http.SOAPAction" would be set to the value of the SOAPAction HTTP header. The Transport Listener also sets the transportName String on the MessageContext, in this case to "http". Once the MessageContext is ready to go, the Listener hands it to the AxisEngine.

The AxisEngine's first job is to look up the transport by name. The transport is an object which contains a request Chain, a response Chain, or perhaps both. A Chain is a Handler consisting of a sequence of Handlers which are invoked in turn -- more on Chains later. If a transport request Chain exists, it will be invoked, passing the MessageContext into the invoke() method. This will result in calling all the Handlers specified in the request Chain configuration.

After the transport request Chain, the engine locates a global request Chain, if configured (in the <requestFlow> element of the WSDD <globalConfiguration>, as explained in the WSDD deployment section later in this document), and then invokes any Handlers specified therein.

At some point during the processing up until now, some Handler has hopefully set the serviceHandler field of the MessageContext (this is usually done in the HTTP transport by the "URLMapper" Handler, which maps a URL like "http://localhost/axis/services/AdminService" to the "AdminService" service). This field determines the Handler we'll invoke to execute service-specific functionality, such as making an RPC call on a back-end object. Services in Axis are typically instances of the "SOAPService" class (org.apache.axis.handlers.soap.SOAPService), which may contain request and response Chains (similar to what we saw at the transport and global levels), and must contain a provider, which is simply a Handler responsible for implementing the actual back end logic of the service.

In typical RPC examples, the provider is the org.apache.axis.providers.java.RPCProvider class. This is just another Handler that, when invoked, attempts to call a backend Java object whose class is determined by the "className" parameter specified at deployment time. It uses the SOAP RPC convention for determining the method to call, and makes sure the types of the incoming XML-encoded arguments match the types of the required parameters of the resulting method.
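The request side of the server path described above can be modelled roughly as follows. This is a sketch under stated assumptions, not the AxisEngine implementation: ServerPathSketch and its fields are hypothetical, the URL mapper simply takes the last path segment as the service name, and the provider is reduced to a Handler that records that it ran.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Reduced stand-in for the Axis MessageContext.
class MessageContext {
    String transportName;
    String requestUrl;
    Handler serviceHandler;                       // set by some Handler before the service stage
    final List<String> trace = new ArrayList<>(); // records the order of invocation
}

interface Handler {
    void invoke(MessageContext ctx);
}

// A Chain is itself a Handler that invokes its members in turn.
class Chain implements Handler {
    private final List<Handler> handlers;

    Chain(Handler... handlers) {
        this.handlers = Arrays.asList(handlers);
    }

    public void invoke(MessageContext ctx) {
        for (Handler handler : handlers) {
            handler.invoke(ctx);
        }
    }
}

// Hypothetical engine: transport request Chain, then global request Chain, then the service.
class ServerPathSketch {
    final Map<String, Handler> transportRequestChains = new HashMap<>();
    final Map<String, Handler> services = new HashMap<>();
    Handler globalRequestChain;

    void invoke(MessageContext ctx) {
        Handler transport = transportRequestChains.get(ctx.transportName);
        if (transport != null) transport.invoke(ctx);
        if (globalRequestChain != null) globalRequestChain.invoke(ctx);
        if (ctx.serviceHandler == null) throw new IllegalStateException("no service selected");
        ctx.serviceHandler.invoke(ctx);
    }

    static List<String> demo() {
        ServerPathSketch engine = new ServerPathSketch();
        engine.services.put("AdminService", ctx -> ctx.trace.add("provider:AdminService"));

        // Assumed mapping rule: the last path segment of the URL names the service.
        Handler urlMapper = ctx -> {
            String name = ctx.requestUrl.substring(ctx.requestUrl.lastIndexOf('/') + 1);
            ctx.serviceHandler = engine.services.get(name);
            ctx.trace.add("urlMapper:" + name);
        };
        engine.transportRequestChains.put("http", new Chain(urlMapper));
        engine.globalRequestChain = new Chain(ctx -> ctx.trace.add("global"));

        MessageContext ctx = new MessageContext();
        ctx.transportName = "http";
        ctx.requestUrl = "http://localhost/axis/services/AdminService";
        engine.invoke(ctx);
        return ctx.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch stops at the provider; response processing is left out because it depends on the service and transport configuration.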

The Message Path on the Client

The Message Path on the client side is similar to that on the server side, except the order of scoping is reversed, as shown below.

The service Handler, if any, is called first - on the client side, there is no "provider" since the service is being provided by a remote node, but there is still the possibility of request and response Chains. The service request and response Chains serve to do any service-specific processing of the request message on its way out of the system, and also of the response message on its way back to the caller.

After the service request Chain, the global request Chain, if any, is invoked, followed by the transport. The Transport Sender, a special Handler whose job it is to actually perform whatever protocol-specific operations are necessary to get the message to and from the target SOAP server, is invoked to send the message. The response (if any) is placed into the responseMessage field of the MessageContext, and the MessageContext then propagates through the response Chains - first the transport, then the global, and finally the service.
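The ordering on the client can be made concrete with a small trace. The following sketch assumes each Chain is collapsed to a single Handler that records its name, and that the Transport Sender fabricates a response locally instead of contacting a server; ClientPathSketch and step are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Reduced stand-in for the Axis MessageContext.
class MessageContext {
    String requestMessage;
    String responseMessage;
    final List<String> trace = new ArrayList<>();
}

interface Handler {
    void invoke(MessageContext ctx);
}

class ClientPathSketch {
    // A Handler that only records that it ran.
    static Handler step(String label) {
        return ctx -> ctx.trace.add(label);
    }

    static MessageContext demo() {
        // Stand-in for the Transport Sender: no network, just a canned response.
        Handler sender = ctx -> {
            ctx.trace.add("sender");
            ctx.responseMessage = "reply to " + ctx.requestMessage;
        };

        // Request side runs service, global, transport; response side runs the reverse.
        Handler[] path = {
            step("service request"), step("global request"), step("transport request"),
            sender,
            step("transport response"), step("global response"), step("service response")
        };

        MessageContext ctx = new MessageContext();
        ctx.requestMessage = "getQuote";
        for (Handler handler : path) {
            handler.invoke(ctx);
        }
        return ctx;
    }

    public static void main(String[] args) {
        System.out.println(demo().trace);
    }
}
```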

Subsystems

Axis comprises several subsystems working together with the aim of separating responsibilities cleanly and making Axis modular. Subsystems which are properly layered enable parts of a system to be used without having to use the whole of it (or hack the code).

The following diagram shows the layering of subsystems. The lower layers are independent of the higher layers. The 'stacked' boxes represent mutually independent, although not necessarily mutually exclusive, alternatives. For example, the HTTP and SMTP transports are independent of each other but may be used together.

Message Flow Subsystem

Handlers and Chains

Handlers are invoked in sequence to process messages. At some point in the sequence a Handler may send a request and receive a response or else process a request and produce a response. Such a Handler is known as the pivot point of the sequence. As described above, Handlers are either transport-specific, service-specific, or global. The Handlers of each of these three different kinds are combined together into Chains. So the overall sequence of Handlers comprises three Chains: transport, global, and service. The following diagram shows two sequences of handlers: the client-side sequence on the left and the server-side sequence on the right.

A web service does not necessarily send a response message to each request message, although many do. However, response Handlers are still useful in the message path even when there isn't a response message, e.g. to stop timers, clean up resources, etc.

A Chain is a composite Handler, i.e. it aggregates a collection of Handlers as well as implementing the Handler interface as shown in the following diagram:

Back to message processing -- a message is processed by passing through the appropriate Chains. A message context is used to pass the message and associated environment through the sequence of Handlers. The model is that Axis Chains are constructed offline by having Handlers added to them one at a time. Then they are made online and message contexts start to flow through the Chains. Multiple message contexts may flow through a single Chain concurrently. Handlers are never added to a Chain once it goes online. If a Handler needs to be added or removed, the Chain must be 'cloned', the modifications made to the clone, and then the clone made online and the old Chain retired when it is no longer in use. Message contexts that were using the old Chain continue to use it until they are finished. This means that Chains do not need to cope with the addition and removal of Handlers while the Chains are processing message contexts -- an important simplification.
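One way to picture the clone-and-replace model is an immutable Chain held behind a single reference. This is an illustrative sketch only: the use of AtomicReference, the withHandler method, and the ChainSwapSketch class are assumptions made for the example, not a description of how Axis stores its Chains.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class MessageContext {
    final List<String> trace = new ArrayList<>();
}

interface Handler {
    void invoke(MessageContext ctx);
}

// Once constructed, a Chain never changes, so concurrent contexts need no locking.
final class Chain implements Handler {
    private final List<Handler> handlers;

    Chain(List<Handler> handlers) {
        this.handlers = Collections.unmodifiableList(new ArrayList<>(handlers));
    }

    // Returns a modified clone; this Chain is left untouched.
    Chain withHandler(Handler extra) {
        List<Handler> copy = new ArrayList<>(handlers);
        copy.add(extra);
        return new Chain(copy);
    }

    public void invoke(MessageContext ctx) {
        for (Handler handler : handlers) {
            handler.invoke(ctx);
        }
    }
}

class ChainSwapSketch {
    static List<String> run(Chain chain) {
        MessageContext ctx = new MessageContext();
        chain.invoke(ctx);
        return ctx.trace;
    }

    static List<List<String>> demo() {
        Handler log = ctx -> ctx.trace.add("log");
        Handler audit = ctx -> ctx.trace.add("audit");

        AtomicReference<Chain> online = new AtomicReference<>(new Chain(Collections.singletonList(log)));
        Chain inFlight = online.get();            // a context that already picked up the old Chain
        online.set(inFlight.withHandler(audit));  // clone, modify the clone, put the clone online

        List<List<String>> results = new ArrayList<>();
        results.add(run(inFlight));               // the old Chain still behaves as before
        results.add(run(online.get()));           // new contexts see the extra Handler
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A context that obtained the old Chain before the swap keeps using it, and the old Chain can be dropped once no context refers to it.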

The deployment registry has factories for Handlers and Chains. Handlers and Chains can be defined to have 'per-access', 'per-request', or 'singleton' scope, although the registry currently only distinguishes between these by constructing non-singleton scope objects each time they are requested and by constructing 'singleton' scope objects once and holding on to them for use on subsequent creation requests.
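The scope behaviour described here amounts to caching only the singleton case. The sketch below is a hypothetical registry, not the Axis deployment registry: ScopedRegistrySketch, its Scope enum, and the use of Supplier as the factory type are all assumptions, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ScopedRegistrySketch {
    enum Scope { PER_ACCESS, PER_REQUEST, SINGLETON }

    private final Map<String, Supplier<Object>> factories = new HashMap<>();
    private final Map<String, Scope> scopes = new HashMap<>();
    private final Map<String, Object> singletons = new HashMap<>();

    void register(String name, Scope scope, Supplier<Object> factory) {
        factories.put(name, factory);
        scopes.put(name, scope);
    }

    // Singleton scope: construct once and reuse. Any other scope: construct on every request.
    Object create(String name) {
        Supplier<Object> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("unknown name: " + name);
        }
        if (scopes.get(name) == Scope.SINGLETON) {
            return singletons.computeIfAbsent(name, key -> factory.get());
        }
        return factory.get();
    }

    public static void main(String[] args) {
        ScopedRegistrySketch registry = new ScopedRegistrySketch();
        registry.register("shared", Scope.SINGLETON, Object::new);
        registry.register("fresh", Scope.PER_REQUEST, Object::new);
        System.out.println(registry.create("shared") == registry.create("shared"));
        System.out.println(registry.create("fresh") == registry.create("fresh"));
    }
}
```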

Targeted Chains

A Targeted Chain is a special kind of chain which may have any or all of: a request Handler, a pivot Handler, and a response Handler. The following class diagram shows how Targeted Chains relate to Chains. Note that a Targeted Chain is an aggregation of Handlers by virtue of extending the Chain interface which is an aggregation of Handlers.

A service is a special kind of Targeted Chain in which the pivot Handler is known as a "provider".
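A Targeted Chain can be sketched as a Handler with three optional parts invoked in a fixed order. The class below is a reduced model written for this guide; it is not the Axis TargetedChain interface, and representing an absent part as null is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

class MessageContext {
    final List<String> trace = new ArrayList<>();
}

interface Handler {
    void invoke(MessageContext ctx);
}

// Request, pivot, and response parts; any of them may be absent (null).
class TargetedChain implements Handler {
    private final Handler request;
    private final Handler pivot;
    private final Handler response;

    TargetedChain(Handler request, Handler pivot, Handler response) {
        this.request = request;
        this.pivot = pivot;
        this.response = response;
    }

    public void invoke(MessageContext ctx) {
        if (request != null) request.invoke(ctx);
        if (pivot != null) pivot.invoke(ctx);
        if (response != null) response.invoke(ctx);
    }
}

class TargetedChainSketch {
    static List<String> demo() {
        // In a service, the pivot is the provider.
        Handler provider = ctx -> ctx.trace.add("provider");
        Handler service = new TargetedChain(
                ctx -> ctx.trace.add("service request"),
                provider,
                ctx -> ctx.trace.add("service response"));

        MessageContext ctx = new MessageContext();
        service.invoke(ctx);
        return ctx.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```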

Fault Processing

Now let's consider what happens when a fault occurs. The Handlers prior to the Handler that raised the fault are driven, in reverse order, for onFault (previously misnamed 'undo'). The scope of this backwards scan is interesting: all Handlers previously invoked for the current Message Context are driven.
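The backwards scan can be illustrated with a small driver that remembers which Handlers have completed. This is a sketch of the behaviour as described above, not the engine's code: the Handler interface is reduced to two methods, the context is reduced to a trace list, and NamedHandler and FaultScanSketch are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

interface Handler {
    void invoke(List<String> trace) throws Exception;
    void onFault(List<String> trace);
}

// Hypothetical Handler that records its calls and optionally raises a fault.
class NamedHandler implements Handler {
    private final String name;
    private final boolean fails;

    NamedHandler(String name, boolean fails) {
        this.name = name;
        this.fails = fails;
    }

    public void invoke(List<String> trace) throws Exception {
        trace.add("invoke " + name);
        if (fails) throw new Exception("fault in " + name);
    }

    public void onFault(List<String> trace) {
        trace.add("onFault " + name);
    }
}

class FaultScanSketch {
    // On a fault, drive onFault on every previously invoked Handler, most recent first.
    static void process(List<Handler> handlers, List<String> trace) throws Exception {
        Deque<Handler> invoked = new ArrayDeque<>();
        for (Handler handler : handlers) {
            try {
                handler.invoke(trace);
            } catch (Exception fault) {
                while (!invoked.isEmpty()) {
                    invoked.pop().onFault(trace);
                }
                throw fault;
            }
            invoked.push(handler);
        }
    }

    static List<String> demo() {
        List<Handler> handlers = Arrays.<Handler>asList(
                new NamedHandler("A", false),
                new NamedHandler("B", false),
                new NamedHandler("C", true));
        List<String> trace = new ArrayList<>();
        try {
            process(handlers, trace);
        } catch (Exception fault) {
            trace.add("caller saw: " + fault.getMessage());
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the Handler that raised the fault is not itself driven for onFault; only the Handlers that ran before it are.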

Need to explain how "FaultableHandlers" and "WSDD Fault Flows" fit in.

Message Contexts

The current structure of a MessageContext is shown below. Each message context may be associated with a request Message and/or a response Message. Each Message has a SOAPPart and an Attachments object, both of which implement the Part interface.

The typing of Message Contexts needs to be carefully considered in relation to the Axis architecture. Since a Message Context appears on the Handler interface, it should not be tied to or biassed in favour of SOAP. The current implementation is marginally biassed towards SOAP in that the setServiceHandler method narrows the specified Handler to a SOAPService. But if we were to factor out a more abstract Message Context interface, we would then be faced with a problem. Some Handlers, most likely in the global layer, would need to accept a Message Context with a particular concrete type (e.g. for an HTTP transport) and produce a Message Context of a different concrete type (e.g. for a SOAP service) and so Handler.invoke would need a more complex signature.

Engine

Axis has an abstract AxisEngine class with two concrete subclasses: AxisClient drives the client side handler chains and AxisServer drives the server side handler chains. The relationships between these classes are fairly simple:

Engine Configuration

The EngineConfiguration interface is the means of configuring the Handler factories and global options of an engine instance. An instance of a concrete implementation of EngineConfiguration must be passed to the engine when it is created and the engine must be notified if the EngineConfiguration contents are modified. The engine keeps a reference to the EngineConfiguration and then uses it to obtain Handler factories and global options.

The EngineConfiguration interface belongs to the Message Flow subsystem which means that the Message Flow subsystem does not depend on the Administration subsystem.
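The direction of the dependency can be shown with a much reduced model. Everything below is hypothetical: EngineConfigSketch is not the real EngineConfiguration interface, Runnable stands in for a Handler, and the option name is invented for the example. The point is only that the engine is written against an interface it owns, while the implementation lives elsewhere.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Message Flow side: a hypothetical, much reduced configuration interface.
interface EngineConfigSketch {
    Supplier<Runnable> handlerFactory(String name); // null if the name is unknown
    Object globalOption(String key);
}

// Message Flow side: the engine keeps a reference and depends only on the interface.
class EngineSketch {
    private final EngineConfigSketch config;

    EngineSketch(EngineConfigSketch config) {
        this.config = config;
    }

    Runnable newHandler(String name) {
        Supplier<Runnable> factory = config.handlerFactory(name);
        if (factory == null) {
            throw new IllegalArgumentException("no factory for " + name);
        }
        return factory.get();
    }

    Object option(String key) {
        return config.globalOption(key);
    }
}

// Administration side: one possible implementation, populated programmatically.
class InMemoryConfig implements EngineConfigSketch {
    final Map<String, Supplier<Runnable>> factories = new HashMap<>();
    final Map<String, Object> options = new HashMap<>();

    public Supplier<Runnable> handlerFactory(String name) {
        return factories.get(name);
    }

    public Object globalOption(String key) {
        return options.get(key);
    }
}

class EngineConfigDemo {
    static EngineSketch demo() {
        InMemoryConfig config = new InMemoryConfig();
        Runnable noop = () -> { };
        config.factories.put("log", () -> noop);
        config.options.put("exampleOption", Boolean.TRUE); // invented option name
        return new EngineSketch(config);
    }

    public static void main(String[] args) {
        System.out.println(demo().option("exampleOption"));
    }
}
```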

Administration Subsystem

The Administration subsystem provides a way of configuring Axis engines. The configuration information an engine needs is a collection of factories for runtime artefacts such as Chains and SOAPServices and a set of global configuration options for the engine.

The Message Flow subsystem's EngineConfiguration interface is implemented by the Administration subsystem. FileProvider enables an engine to be configured statically from a file containing a deployment descriptor which is understood by the WSDDDeployment class. SimpleProvider, on the other hand, enables an engine to be configured dynamically.

WSDD-Based Administration

WSDD is an XML grammar for deployment descriptors which are used to statically configure Axis engines. Each Handler needs configuration in terms of the concrete class name of a factory for the Handler, a set of options for the handler, and a lifecycle scope value which determines the scope of sharing of instances of the Handler.

The structure of the WSDD grammar is mirrored by a class hierarchy of factories for runtime artefacts. The following diagram shows the classes and the types of runtime artefacts they produce (a dotted arrow means "instantiates").

Interaction Diagrams

Client Side Processing

The client side Axis processing constructs a Call object with associated Service, MessageContext, and request Message as shown below before invoking the AxisClient engine.

An instance of Service and its related AxisClient instance are created before the Call object. The Call object is then created by invoking the Service.createCall factory method. Call.setOperation creates a Transport instance, if a suitable one is not already associated with the Call instance. Then Call.invoke creates a MessageContext and associated request Message, drives AxisClient.invoke, and processes the resultant MessageContext. The significant method calls in this sequence are shown in the following interaction diagram.

Open Issues

The relationship between the Axis subsystems needs to be documented and somewhat cleaned up as there is leakage of responsibilities between some of the subsystems. For example, there is some SOAP and HTTP bias in the basic MessageContext type and associated classes.
What classes are included in the "encoding" and "message model" subsystems? Are these subsystems independent of the other subsystems which depend on "message flow"?
(Possibly related to the previous issue) How should we distribute the classes in the above diagram between the Axis subsystems taking into account SOAP-specific and HTTP-specific features?
The Axis Engine currently knows about three layers of handlers: transport, global, and service. However, architecturally, this is rather odd. What "law" of web services ensures that there will always and only ever be three layers? It would be more natural to use Targeted Chains with their more primitive notion of request, pivot, and response Handlers. We would then implement the Axis Engine as a Targeted Chain whose pivot Handler is itself a Targeted Chain with global request and response Handlers and a service pivot Handler (which is itself a Targeted Chain as we have just described). Such an Axis Engine architecture is shown in the diagram below.
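The nesting proposed in this issue can be sketched as follows. This is a hypothetical model of the proposal, not existing Axis code: the TargetedChain class is a reduced stand-in, and placing the transport Handlers on the outermost chain is an assumption read from the description.

```java
import java.util.ArrayList;
import java.util.List;

class MessageContext {
    final List<String> trace = new ArrayList<>();
}

interface Handler {
    void invoke(MessageContext ctx);
}

// Reduced stand-in: request, then pivot, then response.
class TargetedChain implements Handler {
    private final Handler request;
    private final Handler pivot;
    private final Handler response;

    TargetedChain(Handler request, Handler pivot, Handler response) {
        this.request = request;
        this.pivot = pivot;
        this.response = response;
    }

    public void invoke(MessageContext ctx) {
        request.invoke(ctx);
        pivot.invoke(ctx);
        response.invoke(ctx);
    }
}

class NestedEngineSketch {
    static Handler step(String label) {
        return ctx -> ctx.trace.add(label);
    }

    static List<String> demo() {
        // Each layer is the pivot of the layer that encloses it.
        Handler service = new TargetedChain(step("service request"), step("provider"), step("service response"));
        Handler global = new TargetedChain(step("global request"), service, step("global response"));
        Handler engine = new TargetedChain(step("transport request"), global, step("transport response"));

        MessageContext ctx = new MessageContext();
        engine.invoke(ctx);
        return ctx.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this shape the number of layers is a matter of how many chains are nested, rather than something the engine has to know about.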

WSDDService.faultFlows is initialised to an empty Vector and there is no way of adding a fault flow to it. Is this dead code or is something else missing?
If a fault occurs after the pivot Handler, should the backwards scan notify Handlers which were invoked prior to the pivot Handler? The current implementation does notify such Handlers. However, this is not consistent with the processing of faults raised in a downstream system and stored in the message context by the pivot Handler. These faults are passed through any response Handlers, but do not cause onFault to be driven in the local engine.

We need to consider what's going on here. If you take a sequence of Handlers and then introduce a distribution boundary into the sequence, what effect should that have on the semantics of the sequence in terms of its effects on message contexts? The following diagram shows a client-side Handler sequence invoking a server-side Handler sequence. We need to consider how the semantics of this combined sequence compare with those of the sequence formed by omitting the transport-related Handlers.