Apache Muse - WSDL2Java Tool

Introduction

The goal of this program is to be a simple, lightweight, command-line tool for generating server-side implementation skeletons as well as client-side proxies. This will allow users to write their own WSDL, simplify code generation for headless builds and, most importantly, act as an isolation layer between changes in the UI models and the code generation (since it is driven by WSDL).

Note
The first time you run wsdl2java, it will download the latest version of Apache Axis2 1.1. This download is almost 10 MB, so you can expect a delay of a few seconds to almost one minute depending on your connection speed. Once Axis2 has been downloaded, wsdl2java will continue as normal and you won't have to download it again.

Usage

Running wsdl2java with no arguments prints the following message to standard output:

Usage: wsdl2java.[bat|sh] -wsdl FILE [OPTIONS]

The following arguments are required:
  -wsdl FILE            The WSDL definition file to analyze

The following arguments are optional:
  -overwrite            Overwrite files that exist
  -help                 Display this message
  -helpmore             Display more advanced help message
    

  • -wsdl - the WSDL file that we should analyze. This should be a filesystem path.

  • -overwrite - tells the program to overwrite existing files

  • -help - prints the help message (same as running with no arguments)

  • -helpmore - prints advanced options (see below)

Running wsdl2java with the "-helpmore" argument prints the following message to standard output:

Usage (one of the following):

wsdl2java.[bat|sh] -wsdl FILE [OPTIONS]

wsdl2java.[bat|sh] -descriptor FILE [OPTIONS]

One of the following arguments is required:
  -wsdl FILE            The WSDL definition file to analyze
  -descriptor FILE      The Muse descriptor to use

The following arguments are optional:
  -overwrite            Overwrite files that exist
  -analyzer CLASS       The Analyzer component
  -synthesizer CLASS    The Synthesizer component
  -projectizer CLASS    The Projectizer component
  -dump FILE            Dump the built-in descriptor to a file
  -osgi                 Generate an OSGi project
  -proxy                Generate a proxy project
  -quiet                Turn off all messages
  -verbose              Turn on verbose output
  -help                 Display a simple help message
  -helpmore             Display this message
    

  • -wsdl - the WSDL file that we should analyze. This should be a filesystem path.

  • -descriptor - the Muse descriptor that we should analyze. This should be a filesystem path. You can specify only one of -descriptor or -wsdl; they are mutually exclusive.

  • -overwrite - tells the program to overwrite existing files

  • -analyzer - specifies a class to plug in as the Analyzer

  • -synthesizer - specifies a class to plug in as the Synthesizer

  • -projectizer - specifies a class to plug in as the Projectizer

  • -dump - tells the program to write the built-in skeleton descriptor to a file. This is useful as a starting point for writing your own custom descriptor, because it has reasonable default values.

  • -osgi - tells the program to generate an OSGi project. This is really just a shortcut for specifying the OSGi projectizer.

  • -proxy - tells the program to generate proxy code. This is the same as specifying the Proxy synthesizer and Proxy projectizer.

  • -quiet - turns off all output (including error and warning messages)

  • -verbose - turns up verbosity and also prints stack traces for errors

  • -help - prints the help message (same as running with no arguments)

  • -helpmore - prints advanced options

Architecture Overview

The tool will have a simple command-line interface that takes as few parameters as possible. From these humble beginnings, the tool will analyze the WSDL and extract the following information:

  1. The capabilities:
    1. Capabilities can be grouped by namespace. This allows the tool to figure out which properties (from the properties document) and which operations (based on the namespace of the messages) belong to a common namespace (ie. capability). See below for details of the algorithm.
    2. For each capability the tool generates separate classes that implement the necessary operations and have the necessary properties.
  2. Deployment artifact generation:
    1. Muse deployment artifacts will have to come from the analyzed WSDL.
  3. A java project
    1. An automatically generated directory structure that has all of the generated classes and packages that we need.
    2. An ant script at the top level that will by default recompile everything and bundle it up into a WAR file.
    3. Any platform-specific files that are necessary to build a deployable unit.

To achieve these goals, we've split the tool into three distinct and pluggable pieces: the Analyzer, the Synthesizer and the Projectizer.

The next sections explain each piece in detail.

Analyzer

The Analyzer portion of the tool reads in WSDL and produces a Map of capability URIs to Capability objects for each capability discovered in the WSDL.

Capability Extraction

The capability extraction can be summarized as follows:

  1. Read in the WSDL using wsdl4j.
  2. Analyze the types section to find all schemas:
    1. For each schema, locate its targetNamespace attribute and create a capability object for it, saving this object in a map if it wasn't there already.
    2. Recurse into the schema to find other referenced schemas, repeating step 1 for each one.
  3. Repeat step 2 for each wsdl or schema file imported.
  4. At this point there will be a map of URIs to Capability objects.
  5. Get the service (there should only be one), get the port (there should only be one) and then go through the portType for the given port.
  6. Iterate over each operation in the portType. Use the namespace URI of the input message's part or element as the namespace of the operation. Locate the related capability in the URI-Capability map and add the operation data to the Capability.
  7. Locate the ResourceProperties extension attribute on the portType. Locate the referenced element in the schemas referenced from the current WSDL. Iterate through the list of elements and add each as a property for the given Capability (based on the namespace URI of the element).
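
The grouping performed in steps 4 through 7 can be sketched in plain Java. This is only an illustration under stated assumptions, not the Muse Analyzer: the CapabilitySketch class and its members are hypothetical stand-ins for the tool's Capability object, and the WSDL parsing (which the tool does with wsdl4j) is replaced by lists of qualified names assumed to have already been read from the WSDL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;

// Hypothetical stand-in for the tool's Capability object.
final class CapabilitySketch {
    final String uri;
    final List<String> operations = new ArrayList<String>();
    final List<String> properties = new ArrayList<String>();

    CapabilitySketch(String uri) {
        this.uri = uri;
    }

    // Groups operations and properties by the namespace URI of their
    // qualified names. operationElements holds the element name of each
    // operation's input message; propertyElements holds the elements
    // listed in the resource properties document.
    static Map<String, CapabilitySketch> group(List<QName> operationElements,
                                               List<QName> propertyElements) {
        Map<String, CapabilitySketch> byUri =
                new LinkedHashMap<String, CapabilitySketch>();
        for (QName op : operationElements) {
            lookup(byUri, op.getNamespaceURI()).operations.add(op.getLocalPart());
        }
        for (QName prop : propertyElements) {
            lookup(byUri, prop.getNamespaceURI()).properties.add(prop.getLocalPart());
        }
        return byUri;
    }

    // Returns the capability for a URI, creating it on first use.
    private static CapabilitySketch lookup(Map<String, CapabilitySketch> byUri,
                                           String uri) {
        CapabilitySketch capability = byUri.get(uri);
        if (capability == null) {
            capability = new CapabilitySketch(uri);
            byUri.put(uri, capability);
        }
        return capability;
    }
}
```

In this sketch a capability is created lazily the first time its namespace is seen, whereas the steps above populate the map from the schemas first; the resulting URI-to-capability map has the same shape either way.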

Synthesizer

The Map of capabilities from the Analyzer can be passed to the Synthesizer to create the necessary skeleton code. This code is then returned as a map of file names to code (as a string).

Server Stub Code Generation

The server code stubs consist of interfaces that extend Capability (or WsCapability) and implementing classes that implement the interface and subclass AbstractCapability (or AbstractWsCapability). The interfaces also define constants for the class, such as the URI of the capability. The choice of parent class depends on whether the capability has any associated properties.
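
The following sketch shows how a synthesizer might choose the parent types and return a map of file names to source text. It is an illustration only: the StubSketch class, the "I" prefix for interface names and the NAMESPACE_URI constant are assumptions made for this example, and package declarations, imports and method bodies are left out of the generated text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of stub generation; not the Muse Synthesizer.
final class StubSketch {
    // Returns a map of file names to generated source text for one capability.
    static Map<String, String> generate(String name, String uri,
                                        boolean hasProperties) {
        // Capabilities with properties use the Ws* parent types.
        String parentInterface = hasProperties ? "WsCapability" : "Capability";
        String parentClass = hasProperties
                ? "AbstractWsCapability" : "AbstractCapability";

        // The interface carries the capability URI as a constant.
        String iface = "public interface I" + name
                + " extends " + parentInterface + " {\n"
                + "    String NAMESPACE_URI = \"" + uri + "\";\n"
                + "}\n";
        // The implementing class is an empty skeleton for the developer.
        String impl = "public class " + name + " extends " + parentClass
                + " implements I" + name + " {\n"
                + "}\n";

        Map<String, String> files = new LinkedHashMap<String, String>();
        files.put("I" + name + ".java", iface);
        files.put(name + ".java", impl);
        return files;
    }
}
```

Calling StubSketch.generate("Demo", "urn:example:demo", true) returns two entries, one for the interface and one for the implementing class.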

Client Proxy Code Generation

The client code generation already exists and creates a single class that exposes all of the properties and methods of the combined capabilities at the remote endpoint. This class converts local requests into remote calls.

Projectizer

The Projectizer takes the Maps from the previous two stages and creates the necessary project layout for the target platform. This class handles writing the files to disk and laying out a suitable directory structure along with descriptors and optional build scripts.

Axis2 Projectizer

Currently, we include a Projectizer for the Axis2 platform. This projectizer creates two directories: WebContent and JavaSource, along with a build.xml file for building with ant. This class also creates the necessary descriptors (muse.xml, services.xml) that include information about the implementing classes. When ant is invoked in the directory where the code is generated, a WAR file is created that matches the name of the current directory. So if the current working directory is "c:\foo\bar\example" then the build will create "example.war" when run. See the example in the tutorial for more information.
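
The file-writing step can be sketched as follows. This is a minimal illustration, not the Axis2 Projectizer itself: the ProjectSketch class is hypothetical, only the JavaSource directory is handled, and skipping existing files when overwriting is disabled is an assumption about how the -overwrite option behaves.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical sketch of the file-writing step; not the Muse Projectizer.
final class ProjectSketch {
    // Writes each entry of files (relative path -> source text) under
    // baseDir/JavaSource. Existing files are skipped unless overwrite is
    // set (an assumption). Returns the number of files written.
    static int write(Path baseDir, Map<String, String> files, boolean overwrite)
            throws IOException {
        Path sourceDir = baseDir.resolve("JavaSource");
        int written = 0;
        for (Map.Entry<String, String> entry : files.entrySet()) {
            Path target = sourceDir.resolve(entry.getKey());
            if (Files.exists(target) && !overwrite) {
                continue;
            }
            // Create package directories as needed before writing.
            Files.createDirectories(target.getParent());
            Files.write(target, entry.getValue().getBytes(StandardCharsets.UTF_8));
            written++;
        }
        return written;
    }
}
```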

OSGi Projectizer

The OSGi projectizer creates the necessary descriptor files and copies over all of the bundles necessary to launch the generated project into an OSGi environment. See the example in the tutorial for more information.

Proxy Projectizer

A proxy exposes all of the properties and operations of an endpoint using a Java facade. This is a useful convenience for interacting with the endpoint programmatically. See the example in the tutorial for more information.

Iterative Development Usage Pattern

We envision that this tool will be part of an iterative development procedure that looks like:

  1. Create WSDL (plus optional Muse descriptor)
  2. Feed WSDL into this tool, generating code.
  3. Modify code or modify WSDL
  4. Go back to step 2.
The generated code will include:
  1. Interfaces for the various capabilities
  2. Implementation classes which will have empty methods that must be implemented by the developer.

Code Reuse in Other Projects

This tool should have portions that are pluggable when value-added options are available. For example, it might make sense that if we are in the Eclipse tooling we use EMF-based code generation for conflict resolution and automatic round-tripping. This would be a value-add that would make changing the WSDL and regenerating a bit more informative than just breaking the abstract to implementation layer. For example, you would be able to do the code merge, extract the delta and populate the implementations with TODO's of what needs to be fixed.

Another useful feature from the WSDL analysis (described above) is that it will create simple in-memory structures that describe the capabilities found in the WSDL provided. This would drive WSDL importing in the tooling and should be exposed programmatically.

That said, the in-memory model could come from whatever tooling model the tooling is using that week. So the creation of the model could be pluggable: command-line uses a WSDL analyzer to build the model, tooling uses an EMF analyzer to build the model.

Use Cases

1. Simple WSDL-in-Muse-out generation
Input: WSDL
Output: Some generated classes, web descriptors, Muse descriptor
Description: In this scenario a user would pass in a WSDL file to the command line tool and the tool would examine the contents of the file, extract capability information (see above) and pass it to a proxy and server synthesizer. The synthesizer would create proxy classes for remote classes to use in calling out to the endpoint and the server skeleton classes for handling the requests. The server-side classes would implement the same interfaces as the client side proxies, which would essentially be the capabilities extracted from the WSDL. A Muse descriptor would be created to map the capability URIs to the generated classes.
2. Dump the current default configuration into a descriptor
Input: Name of the file to write to
Output: A base Muse descriptor to use with use case 3.
Description: Since creating a descriptor from scratch involves either understanding the descriptor schema or copying and pasting a descriptor from the documentation, a user can ask the tool to dump a default descriptor. This would be done with a single command-line argument, which is the location of the target file to be written (ie. -dump mymusedesc.xml). This descriptor will have the default data about mappings of built-in capabilities and other deployment information. Users can then use this as a starting point for creating their own custom descriptor, which can be passed to use case 3.
3. Code generation given a skeleton Muse descriptor
Input: WSDL, base Muse descriptor
Output: generated classes, web descriptors, augmented Muse descriptor
Description: Here, a user could create a bare-bones Muse descriptor where they could specify:
  1. URI to class mapping for capability URIs.
  2. Initialization parameters.
In this case, the proxy and server synthesizer would use this descriptor as a starting point for code generation. There are three cases:
  1. The descriptor could force class naming or mapping explicitly. For example, if a mapping is given in the descriptor but the class is not visible on the classpath of the synthesizer, then the synthesizer would create a class using the given name (instead of trying to guess a class based on the standard convention of converting a URI into a fully qualified Java class name). If the class does exist on the classpath, then no generation would take place.
  2. The descriptor could overwrite internal mappings of built-in capabilities. If the user already has their own implementation of the built-in capabilities, they could use the descriptor to force the mapping onto the supplied class.
  3. If no URI class mapping is found, the synthesizer would create the class. The synthesizer would append this URI class mapping to the descriptor.
This would also leave the option of including other pertinent data that is added to the Muse descriptor and the current code synthesizer could safely ignore it and simply build on top of the data.
4. Code generation given a synthesizer class
Input: WSDL, qualified name of a java class to use as a synthesizer
Output: Generated classes, web descriptors, Muse descriptor
Description: A user might want to take a different approach to code generation because of existing frameworks such as EMF. In this case, the user would create a class that implements the Generator interface and supply its name on the command line. The exact API for this class is forthcoming, but the general idea follows. The class would take as input parameters a set of capabilities (captured in Capability objects) and an in-memory model of the descriptor (captured in DeploymentDescriptor from org.apache.muse.core.descriptor). The Capability objects would hold abstract representations of the operations (input/output/fault information) and properties (type information), with pointers into the original WSDL document. The main task of the command-line tool would be to analyze and aggregate the information contained within the WSDL into a form that is very easy for an implementer of the synthesizer class to use. The task of the synthesizer class would be to map the Capability information onto whatever service objects the implementer chooses to employ and to update the deployment descriptor. As an afterthought, it is conceivable that the synthesizer could simply return a map of URI to class values so as to insulate the Generator from a dependency on the DeploymentDescriptor class.

Use cases 3 and 4 can be combined so that the user can provide the WSDL, synthesizer class and base descriptor.
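
The three mapping cases listed under use case 3 can be sketched as a single decision. This is an illustration under stated assumptions: the MappingSketch class and its Action values are hypothetical, the descriptor is reduced to a plain URI-to-class map, and the default class name is passed in by the caller because the URI-to-class naming convention is not specified here.

```java
import java.util.Map;

// Hypothetical sketch of the mapping decision in use case 3.
final class MappingSketch {
    enum Action { USE_EXISTING, GENERATE_NAMED, GENERATE_DEFAULT }

    // Decides what to do for one capability URI. mappings is the
    // URI-to-class table read from the descriptor; it is updated when a
    // default name has to be used.
    static Action resolve(String uri, Map<String, String> mappings,
                          String defaultName) {
        String mapped = mappings.get(uri);
        if (mapped == null) {
            // Case 3: no mapping, so generate a class and record the mapping.
            mappings.put(uri, defaultName);
            return Action.GENERATE_DEFAULT;
        }
        // Cases 1 and 2: a mapping exists; generate only if the class is missing.
        return isOnClasspath(mapped) ? Action.USE_EXISTING : Action.GENERATE_NAMED;
    }

    // Checks whether a class can be found, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, MappingSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }
}
```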

FAQ

Q: Why not use JET (or EMF or another framework)?

A: After looking at JET and EMF, it seems that the problem they solve is far more comprehensive than what we are doing here. Specifically, we don't need a template framework if we only have one template that is very unlikely to change much anyway. Also, to keep the code clean, the tool would avoid calling EMF methods directly, because doing so would make the code unreadable. So if there is already an abstraction that is suitable to the task (ie. the WSDM abstraction and in-memory capability model) then there's no need for ultra-generic frameworks. To date, the existing Inspector class already does a large chunk of the work (ie. generating the client code) without resorting to complicated frameworks. It also seems that if this code is intended to be modified by others, then we want to lower the learning curve for modifying it. I would posit that most Java developers don't know JET or EMF, so they would have to learn one or the other to contribute. Finally, the extension capability is there in the synthesizer piece. Any custom work can be done there while maintaining the actual hard piece of the tooling (schema + WSDL analysis).

Q: Why not extend the existing wsdl2java?

A: This is a question of being overly generic. Currently, wsdl2java is only supposed to take a generic WSDL and make some Java classes out of it. It does its own serialization/deserialization class creation and looks at the WSDL in generic pieces: types, portTypes, services, etc. There is no holistic view or cross-cutting based on namespace URI and capability-awareness. Also, it is unclear that wsdl2java was written with the explicit componentization that we are planning in mind. This means that there would be an unnecessary dependency on WTP in whatever tooling we target. Finally, there is the ramp-up of having to go through the existing wsdl2java code to figure out how and why they did things, whether it is at all possible to extend it the way we want to, and how much work it would be.

Q: Why not have a configuration file?

A: This raises several issues. The main motivation for this tool is simplicity. A configuration file could keep data such as default URI to class mappings (for built-in types, for example), and other deployment data. The location of this configuration file would become tedious to maintain. Options would include:

  • Keeping it in the jar that contains this tool. This option is not appealing because any changes to the configuration file would mean that the jar would have to be unzipped, modified and recompressed. This could be a bad thing in environments where jar modification is not possible (ie. when jars are signed).
  • Keeping it in a specified directory. This would imply some kind of persistent installation/configuration for something that is supposed to be a very simple tool. So configuration would be in the form of a $HOME/.muserc file or an environment variable or as a parameter on the command line (-configuration /foo/bar/muse.conf). In addition, all of this data is easily captured in the deployment descriptor in use case 3.
Since none of these options is particularly appealing, there is no configuration file. However, because the default configuration can be dumped out using use case 2, there is a way to inspect the current configuration and modify it to suit your needs.



Q: Why is the WSDL modified as it goes through the tool?

A: This question comes from the belief that the WSDL should be a descriptor of the endpoint which should be treated as a catalyst in the transformation. The problems with this approach are manifest in the unfortunate realities of deploying webapps.

First, there is a convention among web service applications that appending "?wsdl" to the target URL will return the WSDL document. If the document contains links (via WSDL or schema imports/includes) then those links must be resolved by tools that wish to interact with the web service. However, on the Axis2 platform, for example, paths to services are in the form of DIRECTORY/services/ENDPOINTNAME. The DIRECTORY corresponds to an actual directory on the webserver, but the services and ENDPOINTNAME portions can be virtual (handled by Axis2). So, if the WSDL file returned by ?wsdl has a link to a file called "../mydir/test.wsdl", it is unclear how the webserver would resolve this path. A fully resolved WSDL, which contains all of its contents in-lined, will not suffer from this problem.

A closely related issue involves the directory structure of the original WSDL input. The two interesting cases are:

  1. A WSDL file that lives in a parent (or ancestor) directory
  2. Two or more WSDL files that have the same file name but live in different directories
In the first case, the directory structure cannot be maintained when copying into the /wsdl folder: for the links to remain valid, the files would have to live in parent (or ancestor) directories, which may make no sense on the server. This can be resolved by making all of the links relative to the current folder and then copying all of the WSDL files over. Unfortunately, this can lead to the second problem.

It is not unthinkable that a developer would give two WSDL files the same name and keep them in separate folders. This could happen because of different versions of the same interface, for example. This creates a conflict when trying to flatten the folder structure.
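
The conflict can be illustrated with a small check that finds file names that would clash once the folder structure is flattened. This is a hypothetical sketch (the FlattenSketch class is not part of the tool), and it assumes that paths use '/' as the separator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: find file names that would clash if flattened.
final class FlattenSketch {
    // Groups paths by their final segment and keeps only the names used
    // by more than one path. Paths are expected to use '/' separators.
    static Map<String, List<String>> collisions(List<String> paths) {
        Map<String, List<String>> byName =
                new LinkedHashMap<String, List<String>>();
        for (String path : paths) {
            String name = path.substring(path.lastIndexOf('/') + 1);
            List<String> sameName = byName.get(name);
            if (sameName == null) {
                sameName = new ArrayList<String>();
                byName.put(name, sameName);
            }
            sameName.add(path);
        }
        Map<String, List<String>> clashes =
                new LinkedHashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> entry : byName.entrySet()) {
            if (entry.getValue().size() > 1) {
                clashes.put(entry.getKey(), entry.getValue());
            }
        }
        return clashes;
    }
}
```

For the inputs "v1/test.wsdl", "v2/test.wsdl" and "main.wsdl", the only clash reported is "test.wsdl", which maps to the two versioned paths.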