Apache Muse - WSDL2Java Tool

Introduction
Usage
Architecture Overview
Analyzer
Synthesizer
Projectizer
Iterative Development Usage Pattern
Code Reuse in Other Projects
Use Cases
FAQ

Introduction

Wsdl2java is a simple, lightweight command-line tool that generates server-side implementation skeletons as well as client-side proxies. It allows users to write their own WSDL, simplifies code generation for headless builds and, most importantly, acts as an isolation layer between changes in the UI models and the code generation (since generation is driven by WSDL).

Usage

Running wsdl2java with no arguments prints the following message to standard output:


    Usage: wsdl2java.[bat|sh] PLATFORM [CONTAINER] -wsdl FILE [OPTIONS]

The following arguments are required:
  -wsdl FILE            The WSDL definition file to analyze

  PLATFORM              Must be one of the following:
    -j2ee               Create a J2EE project
    -osgi               Create a OSGi project
    -proxy              Create a Proxy project

  CONTAINER             Specify one of the following for J2EE or OSGi:
    axis2               Create a Axis2 1.1 container
    mini                Create a Mini SOAP Engine container

The following arguments are optional:
  -output DIR           Specify an output directory
  -overwrite            Overwrite files that exist
  -help                 Display this message
  -helpmore             Display more advanced help message

-wsdl - the WSDL file that we should analyze. This should be a filesystem path.

You must specify a target platform. This is different from previous versions of wsdl2java, but after some confusion about what the default should be, the choice is now explicit.
- -j2ee - creates a J2EE project, which means the end result will be a Web ARchive file (WAR file).
- -osgi - creates an OSGi project, which means that the end result will be an OSGi bundle Java ARchive file (JAR file).
- -proxy - creates a Proxy project, which means that the end result will be a JAR file containing the proxy classes.

For OSGi and J2EE, you must specify a container. This is what handles the incoming requests and routes them to the Muse framework.
- axis2 - uses an Axis2 1.1 container to host the Muse environment. This is a robust container which has many features.
- mini - uses the Muse Mini Servlet Engine, which has a much smaller footprint than axis2.

-output - tells the program to write all files to a given directory. When this optional flag is not present, files are generated to the current directory.

-overwrite - tells the program to overwrite existing files

-help - prints the help message (same as running with no arguments)

-helpmore - prints advanced options (see below)

There are other options for more advanced usage scenarios. These are not printed by default since they would just confuse regular users. Running wsdl2java with the "-helpmore" argument prints the following message to standard output:


    Usage (one of the following):

wsdl2java.[bat|sh] PLATFORM [CONTAINER] -wsdl FILE [OPTIONS]

wsdl2java.[bat|sh] PLATFORM [CONTAINER] -descriptor FILE [OPTIONS]

One of the following arguments is required:
  -wsdl FILE            The WSDL definition file to analyze
  -descriptor FILE      The Muse descriptor to use

  PLATFORM              Must be one of the following:
    -j2ee               Create a J2EE project
    -osgi               Create a OSGi project
    -proxy              Create a Proxy project

  CONTAINER             Specify one of the following for J2EE or OSGi:
    axis2               Create a Axis2 1.1 container
    mini                Create a Mini SOAP Engine container

The following arguments are optional:
  -output DIR           Specify an output directory
  -overwrite            Overwrite files that exist
  -analyzer CLASS       The Analyzer component
  -synthesizer CLASS    The Synthesizer component
  -projectizer CLASS    The Projectizer component
  -dump FILE            Dump the built-in descriptor to a file
  -headers              Generate a custom headers parameter in operations
  -quiet                Turn off all messages
  -verbose              Turn on verbose output
  -version              Print out the version
  -help                 Display a simple help message
  -helpmore             Display this message

-wsdl - the WSDL file that we should analyze. This should be a filesystem path.

-descriptor - the Muse descriptor that we should analyze. This should be a filesystem path. You can specify only one of -descriptor or -wsdl; they are mutually exclusive. When generating from a single WSDL file, only one resource-type will be generated (in most circumstances). In some instances, a developer might want to build a more complex environment that incorporates advanced Muse features and multiple resource-types. In this case, the user would pass in a Muse descriptor that contains all of the information needed to generate the complex project.

You must specify a target platform. This is different from previous versions of wsdl2java, but after some confusion about what the default should be, the choice is now explicit.
- -j2ee - creates a J2EE project, which means the end result will be a Web ARchive file (WAR file).
- -osgi - creates an OSGi project, which means that the end result will be an OSGi bundle Java ARchive file (JAR file).
- -proxy - creates a Proxy project, which means that the end result will be a JAR file containing the proxy classes.

For OSGi and J2EE, you must specify a container. This is what handles the incoming requests and routes them to the Muse framework.
- axis2 - uses an Axis2 1.1 container to host the Muse environment. This is a robust container which has many features.
- mini - uses the Muse Mini Servlet Engine, which has a much smaller footprint than axis2.

-output - tells the program to write all files to a given directory. When this optional flag is not present, files are generated to the current directory.

-overwrite - tells the program to overwrite existing files

-analyzer - specifies a class to plug in as the Analyzer

-synthesizer - specifies a class to plug in as the Synthesizer

-projectizer - specifies a class to plug in as the Projectizer

-dump - tells the program to write the built-in skeleton descriptor to a file. This is useful as a starting point for writing your own custom descriptor, since it has reasonable default values.

-headers - Tells the program to generate an extra parameter to every operation that allows for custom headers to be passed in. This is only implemented for Proxy generation.

-quiet - turns off all output (including error and warning messages)

-verbose - turns up verbosity and also prints stack traces for errors.

-version - prints the version information and exits.

-help - prints the help message (same as running with no arguments)

-helpmore - prints this advanced help message
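
The argument rules described above (a platform is required, a container is required for -j2ee and -osgi, and exactly one of -wsdl or -descriptor must be given) can be sketched as a small check. This is only an illustration: the class ArgCheck and its validate method are hypothetical and are not the tool's real parser, and the sketch assumes that PLATFORM and CONTAINER come first, as in the usage line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the documented argument rules; not the tool's real parser.
class ArgCheck {
    static final List<String> PLATFORMS = List.of("-j2ee", "-osgi", "-proxy");
    static final List<String> CONTAINERS = List.of("axis2", "mini");

    /** Returns null when the arguments are acceptable, otherwise an error message. */
    static String validate(String... args) {
        List<String> a = Arrays.asList(args);
        // Assumption: the platform is the first argument, as in the usage line.
        if (a.isEmpty() || !PLATFORMS.contains(a.get(0))) {
            return "PLATFORM must be one of " + PLATFORMS;
        }
        // J2EE and OSGi need a container; Proxy projects do not.
        boolean needsContainer = !a.get(0).equals("-proxy");
        boolean hasContainer = a.size() > 1 && CONTAINERS.contains(a.get(1));
        if (needsContainer && !hasContainer) {
            return "CONTAINER must be one of " + CONTAINERS;
        }
        // -wsdl and -descriptor are mutually exclusive, and one of them is required.
        boolean wsdl = a.contains("-wsdl");
        boolean descriptor = a.contains("-descriptor");
        if (wsdl == descriptor) {
            return "Specify exactly one of -wsdl or -descriptor";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validate("-j2ee", "axis2", "-wsdl", "MyResource.wsdl"));
        System.out.println(validate("-osgi", "-wsdl", "MyResource.wsdl"));
    }
}
```

The file name MyResource.wsdl is a placeholder. The first call is accepted; the second is rejected because -osgi needs a container.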

Architecture Overview

Wsdl2java performs the following functions during code generation:

Capability Extraction:
1. Capabilities can be grouped by namespace. This allows the tool to figure out which properties (from the properties document) and which operations (based on the namespace of the input elements of the messages) belong to a common namespace (i.e. capability). See below for details of the algorithm.
2. For each capability, the tool generates separate classes that implement the necessary operations and have the necessary properties.

Deployment artifact generation:
1. Muse deployment artifacts have to come from the analyzed WSDL. A Muse endpoint needs a Muse deployment descriptor to function. However, the contents of this descriptor can be inferred from the contents of the WSDL file for many common cases, so users do not need to understand how to write a deployment descriptor in order to generate a project.

Platform-specific artifact generation:
1. Each platform has specific files and a specific directory structure. Wsdl2java aims to ease the burden of assembling these files by automatically creating the needed directory structure and useful default values for all of the artifacts needed for deployment.
2. The same WSDL can be used to generate a variety of projects that target different environments.

A Java project:
1. An automatically generated directory structure that has all of the generated classes and packages that are needed.
2. An Ant script at the top level that will by default recompile everything and bundle it up into a WAR file.
3. Any platform-specific files that are necessary to build a deployable unit.

To achieve these goals, we've split the tool into three distinct and pluggable pieces: the Analyzer, the Synthesizer and the Projectizer.

The next sections explain each piece in detail.

Analyzer

The Analyzer portion of the tool reads in WSDL and produces a Map of capability URIs to Capability objects for each capability discovered in the WSDL.

Capability Extraction

The capability extraction can be summarized as follows:

1. Read in the WSDL using wsdl4j.
2. Get the service (there should only be one), get the port (there should only be one) and then go through the portType for the given port.
3. Iterate over each operation in the portType. Use the namespace URI of the input message's part or element as the namespace of the operation. Locate the related capability in the URI-Capability map and add the operation data to the Capability.
4. Locate the ResourceProperties extension attribute on the portType. Locate the referenced element in the schemas referenced from the current WSDL. Iterate through the list of elements and add each as a property for the given Capability (based on the namespace URI of the element).

At this point there will be a map of URIs to Capability objects.
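
The grouping part of the extraction steps above can be sketched as follows. This is a minimal illustration, not the tool's code: the real Analyzer reads the WSDL with wsdl4j, while this sketch assumes the operation input elements and the resource property elements have already been collected as QName lists. The class CapabilityGrouping and its nested Capability type are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;

// Hypothetical in-memory model; the real tool obtains these names from the WSDL via wsdl4j.
class CapabilityGrouping {
    static class Capability {
        final String uri;
        final List<QName> operations = new ArrayList<>();
        final List<QName> properties = new ArrayList<>();
        Capability(String uri) { this.uri = uri; }
    }

    /** Groups operation input elements and property elements by namespace URI. */
    static Map<String, Capability> group(List<QName> operationInputs, List<QName> propertyElements) {
        Map<String, Capability> byUri = new LinkedHashMap<>();
        // A Capability is created lazily, the first time its namespace is seen.
        for (QName input : operationInputs) {
            byUri.computeIfAbsent(input.getNamespaceURI(), Capability::new).operations.add(input);
        }
        for (QName property : propertyElements) {
            byUri.computeIfAbsent(property.getNamespaceURI(), Capability::new).properties.add(property);
        }
        return byUri;
    }

    public static void main(String[] args) {
        Map<String, Capability> caps = group(
                List.of(new QName("urn:example:power", "Start"), new QName("urn:example:power", "Stop")),
                List.of(new QName("urn:example:status", "Uptime")));
        caps.forEach((uri, c) ->
                System.out.println(uri + ": " + c.operations.size() + " operation(s), "
                        + c.properties.size() + " property(ies)"));
    }
}
```

The urn:example namespaces are placeholders. Because entries are created only on first use, a namespace with no operations or properties never appears in the resulting map.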

Capability creation is lazy in the above algorithm to prevent unneeded Capabilities from being created. This means that a Capability will only be created if an operation or property is found for that capability. This precludes creating empty Capabilities (that is, those that have no operations or properties) from a WSDL file alone. However, if a Muse deployment descriptor is used during code generation and it defines a capability pair that is not built in and has not appeared in the WSDL, then an empty capability will be generated. Empty capabilities are useful for doing server-side tasks that don't fit neatly into other capabilities.

The code generation is also aware of the built-in implementations for the standard capabilities. This means that using the standard WSDL and schema descriptors for the built-in capabilities will not result in skeleton implementation classes being generated for Capabilities such as Identity or Manageability Characteristics. By default, if wsdl2java finds elements that belong to a certain built-in capability, the generated Muse descriptor will reference the built-in Muse class as the default implementation for the capability. In this way, a user can concentrate only on creating their own custom capability implementations and wsdl2java will take care of plugging in the needed web services plumbing.

WS-Notification Producer Support

In general, one input WSDL will create a server-side project that has one resource-type in its Muse Deployment descriptor. WS-Notification Producers need to also have a second resource-type that handles all of the subscriptions for notifications that come in. Instead of forcing users to create a deployment descriptor when creating a WS-NotificationProducer, wsdl2java will automatically add the required artifacts and augment the generated deployment descriptor to contain a resource-type that implements a Subscription Manager.

Synthesizer

The Map of capabilities from the analyzer can be passed to the synthesizer to create the necessary skeleton code. This code is then returned as a map of file names to code (as a string).

Server Stub Code Generation

The server code stubs consist of interfaces that extend Capability (or WsCapability) and implementing classes that implement the interface and subclass AbstractCapability (or AbstractWsCapability). The interfaces also define constants for the class, such as the URI of the capability. The choice of parent class depends on whether the capability has any associated properties.
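
As a rough illustration of the interface/implementation split, the sketch below builds the two source files for one capability as strings and returns them as a map of file names to code. It makes several assumptions that are not stated in this document: that the Ws variants are the ones chosen when properties are present, that the interface is named with an "I" prefix, and that the constant is called NAMESPACE_URI. The class StubSketch is hypothetical, and imports in the generated text are omitted for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of stub synthesis; naming conventions here are assumptions.
class StubSketch {
    /** Returns a map of file names to generated source text for one capability. */
    static Map<String, String> synthesize(String pkg, String name, String uri, boolean hasProperties) {
        // Assumption: capabilities with properties use the Ws* parent types.
        String parentInterface = hasProperties ? "WsCapability" : "Capability";
        String parentClass = hasProperties ? "AbstractWsCapability" : "AbstractCapability";

        String iface = "package " + pkg + ";\n\n"
                + "public interface I" + name + " extends " + parentInterface + " {\n"
                + "    String NAMESPACE_URI = \"" + uri + "\";\n"
                + "}\n";
        String impl = "package " + pkg + ";\n\n"
                + "public class " + name + " extends " + parentClass
                + " implements I" + name + " {\n"
                + "    // TODO: implement the operations declared in I" + name + "\n"
                + "}\n";

        String dir = pkg.replace('.', '/') + "/";
        Map<String, String> files = new LinkedHashMap<>();
        files.put(dir + "I" + name + ".java", iface);
        files.put(dir + name + ".java", impl);
        return files;
    }

    public static void main(String[] args) {
        synthesize("org.example", "Power", "urn:example:power", true)
                .forEach((file, code) -> System.out.println("== " + file + "\n" + code));
    }
}
```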

Client Proxy Code Generation

The client code generation creates a single class that exposes all of the properties and methods of the combined capabilities at the remote endpoint. This class converts local requests into remote calls.

If using a deployment descriptor for generation, a proxy will be created for each resource-type in the descriptor.

The -headers option adds an extra parameter to each custom operation found in the analyzed WSDL files. This parameter will be the last parameter in the parameter list and will be defined as: Element[] customHeaders.
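
The effect of -headers on a generated proxy method can be pictured with a small sketch that builds a method signature as text. Only the trailing Element[] customHeaders parameter comes from the description above; the class HeaderParamSketch, the operation name and the other parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: shows where the custom headers parameter lands in a signature.
class HeaderParamSketch {
    static String signature(String returnType, String name, List<String> params, boolean headers) {
        List<String> all = new ArrayList<>(params);
        if (headers) {
            // With -headers, the extra parameter is always appended last.
            all.add("Element[] customHeaders");
        }
        return "public " + returnType + " " + name + "(" + String.join(", ", all) + ")";
    }

    public static void main(String[] args) {
        System.out.println(signature("String", "getStatus", List.of("int id"), false));
        System.out.println(signature("String", "getStatus", List.of("int id"), true));
    }
}
```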

Projectizer

The Projectizer takes the Maps from the previous two stages and creates the necessary project layout for the target platform. This class handles writing the files to disk and laying out a suitable directory structure along with descriptors and optional build scripts. The projectizer has the unpleasant and tedious task of laying out the right directory structure for OSGi, J2EE and Proxy projects that would otherwise take many error-prone steps.

It has been our experience that projectizers grow in number while the synthesizers and analyzers stay relatively stable. This allows for reuse of this tooling in other environments and for other platforms without having to do any WSDL analysis.

J2EE Projectizer Family

The J2EE platform is the most complicated platform for projectizing. It contains many descriptors and has implied patterns for classloading locations. Currently, wsdl2java supports two engines for the J2EE platform: Axis2 and Muse Mini Soap Servlet.

See the example in the tutorial for more information.

OSGi Projectizer Family

Unlike the J2EE platform, the OSGi platform can be easily self-contained. The OSGi projectizer will create everything that is needed to deploy and run the endpoint while the J2EE projectizer would create only the deployable WAR that has to be deployed into a J2EE container like Tomcat. The OSGi projectizer uses the Equinox framework as its OSGi implementation. Currently, wsdl2java supports two engines for the OSGi platform: Axis2 and Muse Mini Soap Servlet. In essence, the engines serve as isolation layers that isolate Muse from its environment.

The OSGi projectizer creates the necessary descriptor files and copies over all of the bundles necessary to launch the generated project into an OSGi environment. See the example in the tutorial for more information.

Axis2 Engine

The Axis2 1.1 engine project will contain two directories (as well as a build.xml and .overwrite): WebContent and JavaSource. This class also creates the necessary descriptors (muse.xml, services.xml) that include information about the implementing classes. When ant is invoked in the directory where the code is generated, a WAR file is created that matches the name of the current directory. So if the current working directory is "c:\foo\bar\example" then the build will create "example.war" when run.
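
The WAR naming rule can be illustrated in a few lines. In the generated project the name is produced by the Ant build, not by Java code, so this is only a sketch of the rule itself; the class WarNameSketch is hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the naming rule: WAR name = name of the project directory.
class WarNameSketch {
    static String warName(Path projectDir) {
        Path name = projectDir.toAbsolutePath().normalize().getFileName();
        if (name == null) {
            // A filesystem root has no directory name to derive a WAR name from.
            throw new IllegalArgumentException("no directory name: " + projectDir);
        }
        return name + ".war";
    }

    public static void main(String[] args) {
        System.out.println(warName(Paths.get("/foo/bar/example")));
    }
}
```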

Muse Mini Servlet Engine

Although Axis2 provides a robust platform for hosting Muse endpoints, some users find that it is too complex for their needs. On platforms such as J2ME, Axis2 is not a viable option because of the limited subset of classes that are present.

Proxy Projectizer

A proxy exposes all of the properties and operations of an endpoint using a Java facade. This is a useful convenience for interacting with the endpoint programmatically. See the example in the tutorial for more information.

Overwriting

During iterative development, it is common to generate a project into a target directory, find some changes that need to be made to the WSDL and then regenerate into the same directory. In previous versions of Muse this posed a problem in deciding which files to overwrite. Initially the approach was extreme caution: do not overwrite anything that exists. This approach proved to be too restrictive for normal use. For example, it was not uncommon for users to modify the generated build.xml, and users were also required to modify the generated skeleton code for server-side capabilities that define their own operations. However, if the WSDL changed and a user regenerated, the changes would not appear in the generated code since the default behaviour was not to overwrite any existing files. If the user specified the -overwrite flag, there would be no checking of which files would be overwritten, creating a potentially dangerous situation wherein a user could overwrite files that contained hand-written server-side implementation code.

To overcome these problems, wsdl2java now generates a simple list of files that are considered overwritable in a file called .overwrite located in the root of the generated project directory. This file contains an include list of files that can be overwritten without worry by the tooling. The file contains one file name per line. Lines that begin with # are considered comments. If a user modifies a file that has been generated by the tooling and does not wish for the tooling to overwrite the file during the subsequent generation, the user can simply remove or comment out the file in the .overwrite manifest.
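
A minimal sketch of how such a manifest could be read and consulted is given here. It is not the tool's implementation: the class OverwriteManifestSketch is hypothetical, the handling of blank lines and surrounding whitespace is an assumption, and the file names in the example are placeholders.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of reading a .overwrite manifest and deciding whether to write a file.
class OverwriteManifestSketch {
    /** Parses manifest lines: one file name per line, lines beginning with '#' are comments. */
    static Set<String> parse(List<String> lines) {
        Set<String> files = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim(); // assumption: surrounding whitespace is ignored
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            files.add(trimmed);
        }
        return files;
    }

    /** A file may be written if it is new, if -overwrite was given, or if the manifest lists it. */
    static boolean mayWrite(String file, boolean exists, boolean overwriteFlag, Set<String> manifest) {
        return !exists || overwriteFlag || manifest.contains(file);
    }

    public static void main(String[] args) {
        Set<String> manifest = parse(List.of(
                "build.xml",
                "# JavaSource/org/example/Power.java",
                "JavaSource/org/example/IPower.java"));
        System.out.println(mayWrite("JavaSource/org/example/IPower.java", true, false, manifest));
        System.out.println(mayWrite("JavaSource/org/example/Power.java", true, false, manifest));
    }
}
```

In the example, the commented-out implementation class is protected from regeneration while the interface remains overwritable.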

This also solves the problem of changing WSDL files and regenerating. The server-side code is split into an implementation class and an interface. The implementation class is the place where users are expected to add custom code. By default, the implementation class is not in the .overwrite file, while the interface is added to the manifest. In this way, if the WSDL (that is, the interface description) changes, the generated interface will be overwritten while leaving the implementation file untouched. If the user fails to make the needed modifications to the implementation file to fulfill the requirements of the regenerated interface, a compilation error will occur, alerting the user to missing operations.

It is also possible for a user to add other files to the .overwrite manifest. This could be done if the implementation class does not have any custom operations or initialization code and can be safely regenerated during subsequent calls to wsdl2java. A special case of this behaviour is adding the .overwrite file to itself. This changes the behaviour to always overwrite every file, which is the equivalent of running wsdl2java with -overwrite every time. The only caveat here is that the .overwrite file itself is not overwritten, since it no longer matters. If you delete the .overwrite file, the next code generation will be done with no overwriting. To regenerate the default .overwrite on such a project, you must run wsdl2java with the -overwrite flag.

WS-ResourceMetadataDescriptor Support

If there is a WS-ResourceMetadataDescriptor (.rmd file) referenced in the WSDL that is passed to wsdl2java, that file will be carried through to the generated project. For more information, see Metadata Exchange.

Iterative Development Usage Pattern

We envision that this tool will be part of an iterative development procedure that looks like:

1. Create WSDL (plus optional Muse descriptor).
2. Feed WSDL into this tool, generating code.
3. Modify code or modify WSDL.
4. Go back to step 2.

The code that will be generated will include:

Interfaces for the various capabilities.
Implementation classes which will have empty methods that must be implemented by the developer.

Code Reuse in Other Projects

This tool has portions that are pluggable so that value-added options can be used when they are available. For example, in the Eclipse tooling it might make sense to use EMF-based code generation for conflict resolution and automatic round-tripping. This would be a value-add that would make changing the WSDL and regenerating a bit more informative than just breaking the abstract to implementation layer. For example, you would be able to do the code merge, extract the delta and populate the implementations with TODOs for what needs to be fixed.

Another useful feature from the WSDL analysis (described above) is that it will create simple in-memory structures that describe the capabilities found in the WSDL provided. This would drive WSDL importing in the tooling and should be exposed programmatically.

That said, the in-memory model could come from whatever tooling model the tooling is using that week. So the creation of the model could be pluggable: command-line uses a WSDL analyzer to build the model, tooling uses an EMF analyzer to build the model.

Use Cases

1. Simple WSDL-in-Muse-out generation

Input: WSDL

Output: Some generated classes, web descriptors, Muse descriptor

Description: In this scenario a user would pass in a WSDL file to the command line tool and the tool would examine the contents of the file, extract capability information (see above) and pass it to a proxy or server synthesizer. The synthesizer would create proxy classes for remote classes to use in calling out to the endpoint or the server skeleton classes for handling the requests. The server-side classes would implement the same interfaces as the client side proxies, which would essentially be the capabilities extracted from the WSDL. A Muse descriptor would be created to map the capability URIs to the generated classes.

2. Dump the current default configuration into a descriptor

Input: Name of the file to write to

Output: A base Muse descriptor to use with use case 3.

Description: Since creating a descriptor from scratch involves either understanding the descriptor schema or copying and pasting a descriptor from the documentation, a user can ask the tool to dump a default descriptor. This would be done with a single command line argument which is the location of the target file to be written (i.e. -dump mymusedesc.xml). This descriptor will have the default data about mappings of built-in capabilities and other deployment information. Users can then use this as a starting point for creating their own custom descriptor which can be passed to use case 3.

3. Code generation given a skeleton Muse descriptor

Input: WSDL, base Muse descriptor

Output: generated classes, web descriptors, augmented Muse descriptor

Description: Here, a user could create a bare-bones Muse descriptor where they could specify:

URI to class mapping for capability URIs.
Initialization parameters.

In this case, the proxy and server synthesizer would use this descriptor as a starting point for code generation. There are three cases:

1. The descriptor could force class naming or mapping explicitly. For example, if in the descriptor a mapping is given, but the class is not visible on the classpath of the synthesizer, then the synthesizer would create a class using the given name (instead of trying to guess a class based on the standard convention of converting a URI into a Java fully-qualified class name). If the class does exist on the classpath then no generation would take place.
2. The descriptor could overwrite internal mappings of built-in capabilities. If the user already has their own implementation of the built-in capabilities, they could use the descriptor to force the mapping onto the supplied class.
3. If no URI class mapping is found, the synthesizer would create the class. The synthesizer would append this URI class mapping to the descriptor.
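
The decision logic behind these cases can be sketched as follows. This is an illustration under stated assumptions, not the synthesizer's code: the class MappingSketch, the Action names and the defaultClassName convention (take the last path segment of the URI and place it in an org.example.generated package) are all hypothetical and deliberately naive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of how a descriptor mapping could steer code generation.
class MappingSketch {
    enum Action { USE_EXISTING, GENERATE_NAMED, GENERATE_DEFAULT }

    /** True if the named class is visible on the classpath (without initializing it). */
    static boolean onClasspath(String className) {
        try {
            Class.forName(className, false, MappingSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Placeholder for the URI-to-class-name convention; the real convention is not shown here. */
    static String defaultClassName(String uri) {
        String last = uri.substring(uri.lastIndexOf('/') + 1).replaceAll("[^A-Za-z0-9]", "");
        return "org.example.generated." + (last.isEmpty() ? "Capability" : last);
    }

    /** Decides what to do for one capability URI and records new mappings in the descriptor map. */
    static Action resolve(String uri, Map<String, String> mappings) {
        String mapped = mappings.get(uri);
        if (mapped == null) {
            // No mapping given: generate a class and append the mapping to the descriptor.
            mappings.put(uri, defaultClassName(uri));
            return Action.GENERATE_DEFAULT;
        }
        // Mapping given: generate only if the class does not already exist.
        return onClasspath(mapped) ? Action.USE_EXISTING : Action.GENERATE_NAMED;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("urn:example:existing", "java.lang.String");
        mappings.put("urn:example:named", "org.example.NotYetWritten");
        System.out.println(resolve("urn:example:existing", mappings));
        System.out.println(resolve("urn:example:named", mappings));
        System.out.println(resolve("http://example.org/caps/Power", mappings));
        System.out.println(mappings);
    }
}
```

Overriding a built-in capability falls under the first branch of the mapping check: the descriptor names the user's class, and that name wins over the internal default.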

This would also leave the option of including other pertinent data that is added to the Muse descriptor and the current code synthesizer could safely ignore it and simply build on top of the data.

4. Code generation given a synthesizer class

Input: WSDL, qualified name of a Java class to use as a synthesizer

Output: Generated classes, web descriptors, Muse descriptor

Description: A user might want to take a different approach to code generation because of existing frameworks such as EMF. In this case, the user would create a class that implements the Generator interface and supply this name on the command line. The exact API for this class is forthcoming, but the general idea follows. The class would take as input parameters a set of capabilities (captured in Capability objects) and an in-memory model of the descriptor (captured in DeploymentDescriptor from org.apache.muse.core.descriptor). The Capability objects would hold abstract representations of the operations (input/output/fault information) and properties (type information), with pointers into the original WSDL document. The main task of the command-line tool would be to analyze and aggregate the information contained within the WSDL into a form that is very easy to use by an implementer of the synthesizer class. The task of the synthesizer class would be to map the Capability information onto whatever service objects the implementer chooses to employ and to update the deployment descriptor. As an afterthought, it is conceivable that the synthesizer could simply return a map of URI to class values so as to insulate the Generator from a dependency on the DeploymentDescriptor class.
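
Since the exact API is not given here, the following is only a sketch of how a pluggable synthesizer named on the command line could be loaded and invoked. Everything in it is an assumption: the nested Synthesizer interface, its single method, the simplified capability map (URI to a short name) and the ListingSynthesizer example are hypothetical stand-ins, not Muse types.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of loading a synthesizer plug-in by class name.
class PluginSketch {
    /** Assumed plug-in contract: capability URI -> name in, file name -> content out. */
    interface Synthesizer {
        Map<String, String> synthesize(Map<String, String> capabilities);
    }

    /** Example plug-in that writes one small text file per capability. */
    static class ListingSynthesizer implements Synthesizer {
        public Map<String, String> synthesize(Map<String, String> capabilities) {
            Map<String, String> files = new TreeMap<>();
            capabilities.forEach((uri, name) -> files.put(name + ".txt", "capability " + uri + "\n"));
            return files;
        }
    }

    /** Instantiates the named class through its no-argument constructor. */
    static Synthesizer load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(Synthesizer.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load synthesizer " + className, e);
        }
    }

    public static void main(String[] args) {
        Synthesizer s = load(ListingSynthesizer.class.getName());
        System.out.println(s.synthesize(Map.of("urn:example:power", "Power")));
    }
}
```

Returning plain maps keeps the plug-in independent of any descriptor class, which is the insulation the last sentence of the description alludes to.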

Use cases 3 and 4 can be combined so that the user can provide the WSDL, synthesizer class and base descriptor.

FAQ

Q: Why not use JET (or EMF or another framework)?

A: After looking at JET and EMF, it seems that the problem they are solving is far more comprehensive than what we are doing here. Specifically, we don't need a template framework if we only have one template that is very unlikely to change much anyway. Also, for cleanliness, the tool would not call EMF methods directly because the code would become unreadable. So if there is already an abstraction that is suitable to the task (i.e. the WSDM abstraction and in-memory capability model) then there's no need for ultra-generic frameworks. To date, the existing Inspector class already does a large chunk of the work (i.e. generating the client code) without resorting to complicated frameworks. It also seems that if this code is intended to be modified by others, then we want to lower any learning curve to modifying it. I would posit that most Java developers don't know JET or EMF, so they would have to learn one or the other to contribute. Finally, the extension capability is there in the synthesizer piece. Any customization can be done there while maintaining the actual hard piece of the tooling (schema + WSDL analysis).

Q: Why not extend the existing wsdl2java?

A: This is a question of being overly generic. Currently, wsdl2java is only supposed to take a generic WSDL and make some Java classes out of it. It does its own serialization/deserialization class creation and looks at the WSDL in generic pieces: types, portTypes, services, etc. There is no holistic view or cross-cutting based on namespace URI and capability-awareness. Also, it is unclear that wsdl2java was written with the explicit componentization that we are planning in mind. This means that there would be an unnecessary dependency on WTP in whatever tooling we target. Finally, there is the ramp-up of having to go through the existing wsdl2java code to figure out how and why things were done, whether it is at all possible to extend it the way we want to, and how much work it would be.

Q: Why not have a configuration file?

A: This raises several issues. Given the motivation for this tool, the main idea is simplicity. A configuration file could keep data such as default URI to class mappings (for built-in types, for example), and other deployment data. The location of this configuration file would become tedious to maintain. Options would include:

Keeping it in the jar that contains this tool. This option is not appealing because any change to the configuration file would mean that the jar would have to be unzipped, modified and recompressed. This could be a problem in environments where jar modification is not possible (e.g. when jars are signed).
Keeping it in a specified directory. This would imply some kind of persistent installation/configuration for something that is supposed to be a very simple tool. So configuration would be in the form of a $HOME/.muserc file or an environment variable or as a parameter on the command line (-configuration /foo/bar/muse.conf). In addition, all of this data is easily captured in the deployment descriptor in use case 3.

Since none of these options is particularly appealing, there is no configuration file. However, because the default configuration can be dumped out using use case 2, there is a way to inspect the current configuration and modify it to suit your needs.

Q: Why is the WSDL modified as it goes through the tool?

A: This question comes from the belief that the WSDL should be a descriptor of the endpoint which should be treated as a catalyst in the transformation. The problems with this approach are manifest in the unfortunate realities of deploying webapps.

First, there is a convention among web service applications that appending "?wsdl" to the target URL will respond with the WSDL document. If the document contains links (via WSDL or schema imports/includes) then those links must be resolved by tools that wish to interact with the web service. However, on the axis2 platform, for example, paths to services are in the form of DIRECTORY/services/ENDPOINTNAME. The DIRECTORY corresponds to an actual directory on the webserver, but the services and ENDPOINTNAME portions can be virtual (handled by axis2). So, if the WSDL file returned by ?wsdl has a link to a file called "../mydir/test.wsdl", it is unclear how the webserver would resolve this path. A fully resolved WSDL, which contains all of its contents inlined, does not suffer from this problem.
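
The resolution problem can be made concrete with standard URI resolution. The host name, application directory and endpoint name in this sketch are placeholders, and the class WsdlLinkSketch is hypothetical; only the relative link "../mydir/test.wsdl" and the DIRECTORY/services/ENDPOINTNAME layout come from the text above.

```java
import java.net.URI;

// Hypothetical sketch: how a client would resolve a relative import found in a ?wsdl response.
class WsdlLinkSketch {
    static URI resolveImport(String wsdlUrl, String relativeLink) {
        // Standard relative-reference resolution against the URL the WSDL was fetched from.
        return URI.create(wsdlUrl).resolve(relativeLink);
    }

    public static void main(String[] args) {
        URI resolved = resolveImport(
                "http://example.org/app/services/MyEndpoint?wsdl", "../mydir/test.wsdl");
        System.out.println(resolved); // http://example.org/app/mydir/test.wsdl
    }
}
```

The client ends up requesting a path next to the virtual "services" segment, which need not correspond to any real file on the webserver.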

A closely related issue involves the directory structure of the original WSDL input. The two interesting cases are:

A WSDL file that lives in a parent (or ancestor) directory
Two or more WSDL files that have the same file name but live in different directories

In the first case, the directory structure cannot be maintained when copying into the /wsdl folder, since for the links to still be valid, the files must be in parent (or ancestor) directories, which may make no sense on the server. This can be resolved by making all of the links relative to the current folder and then copying all of the WSDL files over. Unfortunately, this can lead to the second problem.

It is not unthinkable that a developer would name two WSDL files the same name and keep them in separate folders. This could happen because of different versions of the same interface, for example. This creates a conflict when trying to flatten the folder structure.
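
Detecting such conflicts before flattening is straightforward, as the following sketch suggests. It is an illustration only: the class FlattenSketch is hypothetical, paths are assumed to use "/" as the separator, and the file names are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: find file names that would collide if all files were copied into one folder.
class FlattenSketch {
    static Map<String, List<String>> collisions(List<String> paths) {
        Map<String, List<String>> byName = new TreeMap<>();
        for (String path : paths) {
            // Assumption: '/' separates directories; the file name is the last segment.
            String name = path.substring(path.lastIndexOf('/') + 1);
            byName.computeIfAbsent(name, k -> new ArrayList<>()).add(path);
        }
        // Keep only names that more than one source path maps to.
        byName.values().removeIf(sources -> sources.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        System.out.println(collisions(List.of(
                "v1/Service.wsdl", "v2/Service.wsdl", "common/Types.xsd")));
    }
}
```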