Axis Architecture Guide
Under construction
....
Post-Beta 2 Version
Feedback: axis-dev@xml.apache.org
Contents
Introduction
Architecture Overview
Subsystems
Message Flow Subsystem
Handlers and Chains
Message Contexts
Engine
Administration Subsystem
Message Model Subsystem
SOAP Message Model
Message Elements
Deserialization
Encoding Subsystem
WSDL Tools Subsystem
WSDL2Java
Interaction Diagrams
Client Side Processing
Open Issues
Introduction
This guide records some of the rationale of the architecture and design
of Axis.
Architectural Overview
Axis consists of several subsystems working together, as we shall see later.
In this section we'll give you an overview of how the core of Axis works.
Handlers and the Message Path in Axis
Put simply, Axis is all about processing Messages. When the central Axis
processing logic runs, a series of Handlers are each invoked in
order. The particular order is determined by two factors - deployment configuration
and whether the engine is a client or a server. The object which is passed
to each Handler invocation is a MessageContext. A MessageContext
is a structure which contains several important parts: 1) a "request" message,
2) a "response" message, and 3) a bag of properties. More on this in a
bit.
There are two basic ways in which Axis is invoked:
-
As a server, a Transport Listener will create a MessageContext
and invoke the Axis processing framework.
-
As a client, application code (usually aided by the client programming
model of Axis) will generate a MessageContext and invoke the Axis processing
framework.
In either case, the Axis framework's job is simply to pass the resulting
MessageContext through the configured set of Handlers, each of which has
an opportunity to do whatever it is designed to do with the MessageContext.
Message Path on the Server
The server side message path is shown in the following diagram. The small
cylinders represent Handlers and the larger, enclosing cylinders represent
Chains (ordered collections of Handlers which will be described
shortly).
A message arrives (in some protocol-specific manner) at a Transport
Listener. In this case, let's assume the Listener is a HTTP servlet. It's
the Listener's job to package the protocol-specific data into a Message
object (org.apache.axis.Message), and put the Message into a MessageContext.
The MessageContext is also loaded with various properties by the
Listener - in this example the property "http.SOAPAction" would be
set to the value of the SOAPAction HTTP header. The Transport Listener
also sets the transportName String on the MessageContext , in this
case to "http". Once the MessageContext is ready to go, the Listener hands
it to the AxisEngine.
The AxisEngine's first job is to look up the transport by name. The
transport is an object which contains a request Chain, a response
Chain, or perhaps both. A Chain is a Handler consisting of a sequence
of Handlers which are invoked in turn -- more on Chains later. If a transport
request Chain exists, it will be invoked, passing the MessageContext into
the invoke() method. This will result in calling all the Handlers specified
in the request Chain configuration.
After the transport request Handler, the engine locates a global request
Chain, if configured, and
then invokes any Handlers specified therein.
At some point during the processing up until now, some Handler has hopefully
set the serviceHandler field of the MessageContext (this is usually
done in the HTTP transport by the "URLMapper" Handler, which maps a URL
like "http://localhost/axis/services/AdminService" to the "AdminService"
service). This field determines the Handler we'll invoke to execute service-specific
functionality, such as making an RPC call on a back-end object. Services
in Axis are typically instances of the "SOAPService" class (org.apache.axis.handlers.soap.SOAPService),
which may contain request and response Chains (similar to
what we saw at the transport and global levels), and must contain a provider,
which is simply a Handler responsible for implementing the actual back
end logic of the service.
For RPC-style requests, the provider is the org.apache.axis.providers.java.RPCProvider
class. This is just another Handler that, when invoked, attempts to call
a backend Java object whose class is determined by the "className" parameter
specified at deployment time. It uses the SOAP RPC convention for determining
the method to call, and makes sure the types of the incoming XML-encoded
arguments match the types of the required parameters of the resulting method.
The Message Path on the Client
The Message Path on the client side is similar to that on the server side,
except the order of scoping is reversed, as shown below.
The service Handler, if any, is called first - on the client
side, there is no "provider" since the service is being provided by a remote
node, but there is still the possibility of request and response Chains.
The service request and response Chains perform any service-specific
processing of the request message on its way out of the system, and also
of the response message on its way back to the caller.
After the service request Chain, the global request Chain, if any, is
invoked, followed by the transport. The Transport Sender, a special
Handler whose job it is to actually perform whatever protocol-specific
operations are necessary to get the message to and from the target SOAP
server, is invoked to send the message. The response (if any) is placed
into the responseMessage field of the MessageContext, and the MessageContext
then propagates through the response Chains - first the transport, then
the global, and finally the service.
Subsystems
Axis comprises several subsystems working together with the aim of separating
responsibilities cleanly and making Axis modular. Subsystems which are
properly layered enable parts of a system to be used without having to
use the whole of it (or hack the code).
The following diagram shows the layering of subsystems. The lower layers
are independent of the higher layers. The 'stacked' boxes represent mutually
independent, although not necessary mutually exclusive, alternatives. For
example, the HTTP and SMTP transports are independent of each other but
may be used together.
In fact, the Axis source code is not as cleanly separated into subsystems
as the above diagram might imply. Some subsystems are spread over
several packages and some packages overlap more than one subsystem.
Proposals to improve the code structure and make it
conform more accurately to the notional Axis subsystems will be considered
after v1.0 of Axis is complete.
Message Flow Subsystem
Handlers and Chains
Handlers are invoked in sequence to process messages. At some point in
the sequence a Handler may send a request and receive a response or else
process a request and produce a response. Such a Handler is known as the
pivot
point of the sequence. As described above, Handlers are either transport-specific,
service-specific, or global. The Handlers of each of these three different
kinds are combined together into Chains. So the overall sequence of Handlers
comprises three Chains: transport, global, and service. The following diagram
shows two sequences of handlers: the client-side sequence on the left and
the server-side sequence on the right.
A web service does not necessarily send a response message to each request
message, although many do. However, response Handlers are still useful
in the message path even when there isn't a response message, e.g. to stop
timers, clean up resources, etc.
A Chain is a composite Handler, i.e. it aggregates a collection of Handlers
as well as implementing the Handler interface as shown in the following
UML diagram:
A Chain also has similarities to the Chain of Responsibility design pattern
in which a request flows along a sequence of Handlers until it is processed.
Although an Axis Chain may process a request in stages over a succession of
Handlers, it has the same advantages as Chain of Responsibility:
flexibility and the ease with which new function can be added.
Back to message processing -- a message is processed by passing through
the appropriate Chains. A message context is used to pass the message and
associated environment through the sequence of Handlers. The model is that
Axis Chains are constructed offline by having Handlers added to them one
at a time. Then they are turned online and message contexts start to flow
through the Chains. Multiple message contexts may flow through a single
Chain concurrently. Handlers are never added to a Chain once it goes online.
If a Handler needs to be added or removed, the Chain must be 'cloned',
the modifications made to the clone, and then the clone made online and
the old Chain retired when it is no longer in use. Message contexts that
were using the old Chain continue to use it until they are finished. This
means that Chains do not need to cope with the addition and removal of
Handlers while the Chains are processing message contexts -- an important
simplification.
The deployment registry has factories for Handlers and Chains. Handlers
and Chains can be defined to have 'per-access', 'per-request', or 'singleton'
scope although the registry currently only distinguishes between these
by constructing non-singleton scope objects when requested and constructing
singleton scope objects once and holding on to them for use on subsequent
creation requests.
Targeted Chains
A Targeted Chain is a special kind of chain which may have any or
all of: a request Handler, a pivot Handler, and a response Handler. The
following class diagram shows how Targeted Chains relate to Chains. Note
that a Targeted Chain is an aggregation of Handlers by virtue of extending
the Chain interface which is an aggregation of Handlers.
A service is a special kind of Targeted Chain in which the pivot Handler
is known as a "provider".
Fault Processing
Now let's consider what happens when a fault occurs. The Handlers prior
to the Handler that raised the fault are driven, in reverse order, for
onFault (previously misnamed 'undo'). The scope of this backwards scan
is interesting: all Handlers previously invoked for the current Message
Context are driven.
Need to explain how "FaultableHandlers" and "WSDD Fault Flows" fit
in.
Message Contexts
The current structure of a MessageContext is shown below. Each message
context may be associated with a request Message and/or a response Message.
Each Message has a SOAPPart and an Attachments object, both of which implement
the Part interface.
The typing of Message Contexts needs to be carefully considered in
relation to the Axis architecture. Since a Message Context appears on the
Handler interface, it should not be tied to or biassed in favour of
SOAP. The current implementation is marginally biassed towards SOAP in
that the setServiceHandler method narrows the specified Handler to a SOAPService.
Engine
Axis has an abstract AxisEngine class with two concrete subclasses: AxisClient
drives the client side handler chains and AxisServer drives the server
side handler chains. The relationships between these classes is fairly
simple:
Engine Configuration
The EngineConfiguration interface is the means of configuring the Handler
factories and global options of an engine instance. An instance of a
concrete implementation of EngineConfiguration must be passed to the engine
when it is created and the engine must be notified if the EngineConfiguration
contents are modified. The engine keeps a reference to the EngineConfiguration
and then uses it to obtain Handler factories and global options.
The EngineConfiguration interface belongs to the Message Flow subsystem
which means that the Message Flow subsystem does not depend on the
Administration subsystem.
Administration Subsystem
The Administration subsystem provides a way of configuring Axis engines.
The configuration information an engine needs is a collection of factories for
runtime artefacts such as Chains and SOAPServices and a set of global
configuration options for the engine.
The Message Flow subsystem's EngineConfiguration interface
is implemented by the Administration subsystem. FileProvider enables
an engine to be configured statically from a file containing a
deployment descriptor which is understood by the WSDDDeployment class.
SimpleProvider, on the other hand, enables an engine to be configured
dynamically.
WSDD-Based Administration
WSDD is an XML grammer for deployment descriptors which are used to
statically configure Axis engines.
Each Handler needs configuration in terms of the concrete class name
of a factory for the Handler, a set of options for the handler, and
a lifecycle scope value which determines the scope of sharing of
instances of the Handler.
The structure of the WSDD grammar is mirrored by a class hierarchy of factories
for runtime artefacts.
The following diagram shows the classes and the types of runtime
artefacts they produce (a dotted arrow means "instantiates").
Message Model Subsystem
SOAP Message Model
The XML syntax of a SOAP message is fairly simple.
A SOAP message consists of an envelope containing:
- an optional header containing zero or more header entries
(sometimes ambiguously referred to as headers),
- a body containing zero or more body entries, and
- zero or more additional, non-standard elements.
The only body entry defined by SOAP is a SOAP fault which is
used for reporting errors.
Some of the XML elements of a SOAP message define namespaces, each in terms
of a URI and a local name, and encoding styles, a standard one of which
is defined by SOAP.
Header entries may be tagged with the following optional SOAP attributes:
- actor which specifies the intended recipient of the header
entry in terms of a URI, and
- mustUnderstand which specifies whether or not the intended
recipient of the header entry is required to process the header
entry.
So the SOAP message model looks like this:
Message Elements
The classes which represent SOAP messages form a class hierarchy based on
the MessageElement class which takes care of namespaces and encodings.
The SOAPHeaderElement class looks after the actor and mustUnderstand
attributes.
During deserialization, a parse tree is constructed consisting of instances
of the above classes in parent-child relationships as shown below.
Deserialization
The class mainly responsible for XML parsing, i.e. deserialization, is
DeserializationContextImpl ('DCI') which inherits most of its methods from the
DeserializationContext interface.
DCI manages the construction of the parse tree and
maintains a stack of SAX handlers, a reference to the MessageElement that
is currently being deserialized, a stack of namespace mappings, a mapping from IDs to elements, a set of type mappings for deserialization (see
Encoding Subsystem) and a SAX event recorder.
Elements that we scan over, or ones for which we don't have a particular
deserializer, are recorded - in other words, the SAX events are placed into
a queue which may be 'played back' at a later time to any SAX ContentHandler.
Once a SOAPEnvelope has been built, either through a parse or manual
construction by the user, it may be output using a SerializationContext (also
see Encoding Subsystem).
MessageElements all have an output() method which lets them write out their
contents.
The SAX handlers form a class hierarchy:
and stack up as shown in the following diagram:
Initially, the SAX handler stack just contains an instance of
EnvelopeHandler which represents the fact that parsing of the SOAP envelope
has not yet started.
The EnvelopeHandler is constructed with a reference to an EnvelopeBuilder,
which is the SAX handler responsible for parsing the SOAP envelope.
During parsing, DCI receives the events from the
SAX parser and notifies either the SAX handler on the top of its handler
stack, the SAX event recorder, or both.
On the start of an element, DCI calls the SAX handler
on the top of its handler stack for onStartChild. This method returns a
SAX handler to be used to parse the child, which DCI
pushes on its SAX handler stack and calls for startElement.
startElement, amongst other things, typically creates a new MessageElement
of the appropriate class and calls DCI
for pushNewElement.
The latter action creates the parent-child relationships of the parse tree.
On the end of an element, DCI pops the top SAX handler
from its handler stack and calls it for endElement. It then drives SAX
handler which is now on the top of the handler stack for onEndChild.
Finally, it sets the MessageElement that is currently being deserialized to
the parent of the current one.
Elements which are not defined by SOAP are treated using a SOAPHandler
as a SAX event handler and a MessageElement as a node in the parse tree.
Encoding Subsystem
Encoding is most easily understood from the bottom up. The basic
requirement is to transform between values of programming language
datatypes and their XML representations. In Axis, this means
encoding (or 'serializing') Java objects and primitives into XML
and decoding (or 'deserializing') XML into Java objects and primitives.
The basic classes that implement these steps are serializers
and deserializers.
Particular serializers and deserializers are written to support
a specific XML processing mechanism such as DOM or SAX.
So serializer factories and deserializer factories
are introduced to construct serializers and deserializers
for a XML processing mechanism which is specified
as a parameter.
As is apparent from the above class diagrams, each pair of Java
type and XML data type which needs encoding
and decoding requires specific serializers and
deserializers (actually one of each per XML processing mechanism).
So we need to maintain a mapping from a pair of Java type and XML data
type, identified by a QName, to a serializer factory and a
deserializer factory.
Such a mapping is known as a type mapping.
The type mapping class hierarchy is shown below. Notice how
the default type mapping instantiates the various serializer and
deserialiser factories.
There is one final level of indirection. How do we know
which type mapping to use for a particular message?
This is determined by the encoding which is specified in
the message. A type mapping registry maintains
a map from encoding name (URI) to type mapping.
Note that the XML data type QNames are defined by the encoding.
So, in summary, to encode a Java object or primitive data value
to a XML datatype or to decode the latter to the former,
we need to know:
- the Java type we are dealing with,
- the QName of the XML data type we want to encode it as,
- the XML processing mechanism we are using, and
- the encoding name.
WSDL Tools Subsystem
The WSDL Tools subsystem contains WSDL2Java and Java2WSDL. The
Axis runtime does not depend on these tools -- they are just there
to make life easier for the user.
WSDL2Java
This tool takes a description of a web service written in WSDL
and emits Java artefacts used to access the web service.
There are three layers inside the tool:
- framework: SymbolTable, Emitter, WriterFactory
- WSDL2Java plugin to the framework: WSDL2Java (the main),
JavaWriterFactory, and all the WSDL-relative writers: JavaPortTypeWriter,
JavaBindingWriter, etc.
- The actual WSDL2Java emitters, one for each file generated:
JavaInterfaceWriter, JavaStubWriter, etc.
Java2WSDL
tbd.
Interaction Diagrams
Client Side Processing
The client side Axis processing constructs a Call object with associated
Service, MessageContext, and request Message as shown below before invoking
the AxisClient engine.
An instance of Service and its related AxisClient instance are
created before the Call object. The Call object is then created by invoking
the Service.createCall factory method. Call.setOperation creates
a Transport instance, if a suitable one is not already associated with
the Call instance. Then Call.invoke creates a MessageContext and associated
request Message, drives AxisClient.invoke, and processes the resultant
MessageContext. This significant method calls in this sequence are shown
in the following interaction diagram.
Open Issues
-
The relationship between the Axis subsystems needs to be documented
and somewhat cleaned up as there is leakage of responsibilities between
some of the subsystems. For example,
there is some SOAP and HTTP bias in the basic MessageContext
type and associated classes.
-
What classes are included in the "encoding" subsystem?
Are the encoding and message model subsystems independent of the other subsystems which depend on
"message flow"?
-
(Possibly related to the previous issue) How should we distribute the classes
in the above diagram between the Axis subsystems taking into account
SOAP-specific and HTTP-specific features?
-
The Axis Engine currently knows about three layers of handlers: transport,
global, and service. However, architecturally, this is rather odd. What
"law" of web services ensures that there will always and only ever be three
layers? It would be more natural to use Targeted Chains with their more
primitive notion of request, pivot, and response Handlers. We would then
implemented the Axis Engine as a Targeted Chain whose pivot Handler is
itself a Targeted Chain with global request and response Handlers and a
service pivot Handler (which is itself a Targeted Chain as we have just
described). Such an Axis Engine architecture is shown in the diagram below.
-
WSDDService.faultFlows is initialised to an empty Vector and there is no
way of adding a fault flow to it. Is this dead code or is something else
missing?
-
If a fault occurs after the pivot Handler, should the backwards scan notify
Handlers which were invoked prior to the pivot Handler? The current implementation
does notify such Handlers. However, this is not consistent with the processing
of faults raised in a downstream system and stored in the message context
by the pivot Handler. These faults are passed through any response Handlers,
but do not cause onFault to be driven in the local engine.
We need to consider what's going on here. If you take a sequence of
Handlers and then introduce a distribution boundary into the sequence,
what effect should that have on the semantics of the sequence in terms
of its effects on message contexts? The following diagram shows a client-side
Handler sequence invoking a server-side Handler sequence. We need to consider
how the semantics of this combined sequence compares with the sequence
formed by omitting the transport-related Handlers.