The Jakarta Site - Mutant Design Notes

Apache Ant

Download

Jakarta

Get Involved

Mutant Design Notes

This is a brief, albeit rambling description of Mutant. Mutant has many experimental ideas which may or may not prove useful. I'll try to describe what is there and let anyone who is interested comment. Mutant is still immature. You'll notice that there is, at this time, just one task, a hacked version of the echo task, which I have been using to test out ideas. Most tasks would end up being pretty similar to their Ant 1.x version.

OK, let me start with some of the motivating requirements. There are of coure many Ant2 requirements but I want to focus on these two for now. Mutant does also address many of the other Ant2 requirements.

I'll use the terms Ant and mutant somewhat interchangeably - just habit, not an assumption of any sort.

One of the things which is pretty difficult in Ant 1.x is the management of classpaths and classloaders. For example, today the antlr task requires the antlr classes in the classpath used to start ant. I'm talking here about the classpath built up in the ant.bat/ant script launchers. At the same time, the checkstyle task which uses antlr won't run if the antlr classes are in the classpath because then those classes cannot "see" the classes in the taskdef's classpath.

Another requirement I have is extensibility. In Ant 1.x this is difficult because whenever a new type is created, each task which needs to support this type must be changed to provide the new addXXX method. The ejbjar task is on example of this problem with its concept of vendor specific tools. The zip/jar task, with its support for different types of fileset, is another. The addition of the classfileset to Ant requires a change to the zip task.

Mutant Initialization

Mutant defines a classloader hierarchy somewhat similar to that used in Tomcat 4. Tasks join into this hierarchy at a particular point to ensure they have visibility of the necessary interface classes and no visibility of the Ant core itself. There is nothing particularly novel about this approach, but tasks are able to request certain additional resources as we will see later.

Mutant starts with two jars. One is the start.jar which contains just one class, Main.java which establishes the initial configuration and then runs the appropriate front end command line class. If a different front end was desired, a different launch class, in its own jar, would be used. This would perhaps configure the classloader hierarchy somewhat differently and start the approriate GUI front end class.

The second jar, init.jar, provides a number of initialisation utilities. These are used by Main.java to setup Ant and would also be used by any other front end to configure Ant. The important class here is the InitConfig which communicates the state of Ant at startup into the the core of Ant when it starts up. Main determines the location of ANT_HOME based on the location of the start classes and then populates the InitConfig with both classloaders and information about the location of various jars and config files.

At the top of the classloader hierarchy are the bootstrap and system classloaders. I won't really distinguish between these in mutant. Combined they provide the JDK classes, plus the classes from the init and start jars. One objective is to keep the footprint of the init and start jars small so they do not require any external classes, which may then become visible lower in the hierarchy. Main does not explicitly create these loaders, of course, but just adds a reference to the init config as system class loader

The next jar is for the common area. This provides interface definitions and utility classes for use by both the core and by tasks/types etc. It is loaded from ANT_HOME/lib/common/*.jar. Typically this is just lib/common/common.jar but any other jars in here are loaded. This pattern is used in the construction of all of the classloaders.

Next up is the core loader. It includes the lib/antcore/antcore.jar plus any others including the XML parser jars. Mutant's core does not assume that the project model will come from an XML description but XML facilities are needed in the core for reading in Ant library defs and config files. The parser jar locations are also stored in the init config. This lets the jars be added to any Ant library that wants to use Ant's XML parser rather than providing its own. Similarly tools.jar's location is determined automatically and added to the config for use by tasks which request it. I'll go into more detail when discussing the antlib processing.

The final jar that is loaded is the jar for the frontend - cli.jar. This is not passed in init config since these classes are not visible to the core and are not needed by it. So the hierarchy is
 
jdk classes
    |
start/init
    |
 common
    |
 antcore
    |
   cli
Task classloaders generally will come in at common, hiding the core classes, front end and XML parser classes from tasks.

Once Main has setup the initConfig, it creates the front end commandline class and launches mutant proper, passing it the command line args and the init config.

A GUI would typically replace start.jar and the cli.jar with its own versions which manage model construction from GUI processes rather than from XML files. It may be possible to move some of Main.java's processing into init.jar if it is useful to other front ends. I haven't looked at that balance.

Mutant Frontend

The front end is responsible for coordinating execution of Ant. It manages command line arguments, builds a model of the Project to be evaluated and coordinates the execution services of the core. cli.jar contains not only the front-end code but also the XML parsing code for building a project model from an XML description. Other front ends may choose to build project models in different ways. Commandline is pretty similar to Ant 1.x's Main.java - it handles arguments, building loggers, listeners, defines, etc - actually I haven't fully implemented command line defines in mutant yet but it would be similar to Ant 1.x.

Commandline then moves to building a project model from the XML representation. I have just expanded the approach in Ant 1's ProjectHelper for XML parsing, moving away from a stack of inner classes. The classes in the front end XML parsing use some XML utility base classes from the core.

The XML parsing handles two elements at parse time. One is the <ref> element which is used for project references - that is relationships between project files. The referenced project is parsed as well. The second is the <include> element which includes either another complete project or a project <fragment> directly into the project. All the other elements are used to build a project model which is later processed in the core.

The project model itself is organized like this

A project contains

named references to other projects

targets

build elements (tasks, type instances)

A target contains

build elements (tasks, type instances)

A build element contains

build elements (nested elements)

So, for now the project model contains top level tasks and type instances. I'm still thinking about those and property scoping especially in the face of project refs and property overrides. Anyway, the running of these tasks is currently disabled.

Once the model is built, the commandline creates an execution manager instance, passing it the initConfig built by Main.jar. It adds build listeners and then starts the build using the services of the ExecutionManager.

Ant Libraries

Before we get into execution proper, I'll deal with the structure of an ant library and how it works. An antlibrary is a jar file with a library descriptor located in META-INF/antlib.xml. This defines what typedefs/taskdefs/converters the library makes available to Ant. The classes or at least some of the classes for the library will normally be available in the jar. The descriptor looks like this (I'll provide two examples here)
<antlib libid="ant.io" 
        home="http://jakarta.apache.org/ant"
        isolated="true">
  <typedef name="thread" classname="java.lang.Thread"/>
  <taskdef name="echo" classname="org.apache.ant.taskdef.io.Echo"/>

  <converter classname="org.apache.ant.taskdef.io.FileConverter"/>
</antlib>

<antlib libid="ant.file" 
        home="http://jakarta.apache.org/ant"
        reqxml="true" reqtools="true" extends="ant.io"
        isolated="true">
  <taskdef name="copy" classname="org.apache.ant.file.copy"/>
</antlib>        
the "libid" attribute is used to globally identify a library. It is used in Ant to pick which tasks you want to make available to a build file. As the number of tasks available goes up, this is used to prevent name collisions, etc. The name is constructed similarly to a Java package name - i.e Reverse DNS order.

The "home" attribute is a bit of fluff unused by mutant to allow tools to manage libraries and update them etc. More thought could go into this.

"reqxml" allows a library to say that it wants to use Ant's XML parser classes. Note that these will be coming from the library's classloader so they will not, in fact, be the same runtime classes as used in Ant's core, but it saves tasks packaging their own XML parsers.

"reqtools" allows a library to specify that it uses classes from Sun's tools.jar file. Again, if tools.jar is available it will be added to the list of classes in the library's classloader

"extends" allows for a single "inheritance" style relationship between libraries. I'm not sure how useful this may be yet but it seems important for accessing common custom types. It basically translates into the class loader for this library using the one identified in extends as its parent.

"isolate" specifies that each task created from this libary comes from its own classloader. This can be used with tasks derived from Java applications which have static initialisers. This used to be an issue with the Anakia task, for example. Similarly it could be used to ensure that tool.jar classes are unloaded to stop memory leaks. Again this is experimental so may not prove ultimately useful.

The <typedef> in the example creates a <thread> type. That is just a bit of fun which I'll use in an example later. It does show the typedefing of a type from outside the ant library however.

<taskdef> is pretty obvious. It identifies a taskname with a class from the library. The import task, which I have not yet implemented will allow this name to be aliased - something like

<import libid="ant.file" task="echo" alias="antecho"/>

Tasks are not made available automatically. The build file must state which tasks it wants to use using an <import> task. This is similar to Java's import statement. Similarly classes whose ids start with "ant." are fully imported at the start of execution.

Mutant Configuration

When mutant starts execution, it reads in a config file. Actually it attempts to read two files, one from $ANT_HOME/conf/antconfig.xml and another from $HOME/.ant/antconfig.xml. Others could be added even specified in the command line. These config files are used to provide two things - libpaths and task dirs.

Taskdirs are locations to search for additional ant libraries. As people bundle Ant tasks and types with their products, it will not be practical to bundle all this into ANT_HOME/lib. These additional dirs are scanned for ant libraries. All .zip/.jar/.tsk files which contain the META-INF/antlib.xml file will be processed.

Sometimes, of course, the tasks and the libraries upon which they depend are not produced by the same people. It is not feasible to go in and edit manifests to connect the ant library with its required support jars, so the libpath element in the config file is used to specify additional paths to be added to a library's classloader. An example config would be
<antconfig>
  <libpath libid="ant.file" path="fubar"/>
  <libpath libid="ant.file" url="http://fubar"/>
</antconfig>
Obviously other information can be added to the config - standard property values, compiler prefs, etc. I haven't done that yet. User level config override system level configs.

So, when a ant library creates a classloader, it will take a number of URLS. One is the task library itself, the XML parser classes if requested, the tools.jar if requested, and any additional libraries specified in the <antconfig>. The parent loader is the common loader from the initconfig. unless this library is an extending library.

Mutant Execution

Execution of a build is provided by the core through two key classes. One if the ExecutionManager and the other is the ExecutionFrame. An execution frame is created for each project in the project model hierarchy. It represents the execution state of the project - data values, imported tasks, typedefs, taskdefs, etc.

The ExecutionManager begins by reading configs, searching for ant libraries, configuring and appending any additional paths, etc. It then creates a root ExecutionFrame which represents the root project. when a build is commenced, the project model is validated and then passed to the ExecutionFrame.

the ExecutionFrame is the main execution class. When it is created it imports all ant libraries with ids that start with ant.*. All others are available but must be explicitly imported with <import> tasks. When the project is passed in, ExecutionFrames are created for any referenced projects. This builds an ExecutionFrame hierarchy which parallels the project hierarchy. Each <ref> uses a name to identify the referenced project. All property and target references use these reference names to identify the particular frame that hold the data. As an example, look at this build file
<project default="test" basedir=".." doc:Hello="true">

  <ref project="test.ant" name="reftest"/>

  <target name="test" depends="reftest:test2">
    <echo message="hello"/>
  </target>

</project>
Notice the depends reference to the test2 target in the test.ant project file. I am still using the ":" as a separator for refs. It doesn't collide with XML namespaces so that should be OK.

Execution proceeds by determining the targets in the various frames which need to be executed. The appropriate frame is requested to execute the target's tasks and type instances. The imports for the frame are consulted to determine what is the approrpiate library and class from that library. A classloader is fetched, the class is instantiated, introspected and then configured from the corresponding part of the project model. Ant 1.x's IntrospectionHelper has been split into two - the ClassIntrospector and the Reflector. When the task is being configured, the context classloader is set. Similarly it is set when the task is being executed. Types are handled similarly. When a type in instantiated or a task executed, and they support the appropriate interface, they will be passed a context through which they can access the services of the core. Currently the context is an interface although I have wondered if an abstract class may be better to handle expansion of the services available over time.

Introspection and Polymorphism

Introspection is not a lot different from Ant 1.x. After some thought I have dropped the createXXX method to allow for polymorphic type support, discussed below. setXXX methods, coupled with an approriate string to type converter are used for attributes. addXXX methods are used for nested elements. All of the value setting has been moved to a Reflector object. Object creation for addXXX methods is no longer provided in the reflector class, just the storage of the value. This allows support for add methods defined in terms of interfaces. For example, the hacked Echo task I am using has this definition
    /**
     * testing
     *
     * @param runnable testing
     */
    public void addRun(Runnable runnable) {
        log("Adding runnable of type "
             + runnable.getClass().getName(), MessageLevel.MSG_WARN);
    }
So when mutant encounteres a nested element it does the following checks

Is the value specified by reference?

<run ant:refid="test"/>

Is it specified by as a polymorphic type?

<run ant:type="thread"/>

or is it just a normal run o' the mill nested element, which is instantiated by a zero arg constructor.

Note the use of the ant namespace for the metadata. In essence the nested element name <run> identifies the add method to be used, while the refId or type elements specify the actual instance or type to be used. The ant:type identifies an Ant datatype to be instantiated. If neither is specified, the type that is expected by the identified method, addRun in this case, is used to create an instance. In this case that would fail.

Polymorphism, coupled with typedefs is one way, and a good way IMHO, of solving the extensibility of tasks such as ejbjar.

OK, that is about the size of it. Let me finish with two complete build files and the result of running mutant on them.

build.ant
<project default="test" basedir=".." doc:Hello="true">

  <ref project="test.ant" name="reftest"/> 

  <target name="test" depends="reftest:test2">
    <echo message="hello"/>
  </target>

</project>
test.ant
<project default="test" basedir="." doc:Hello="true">
  <target name="test2">
    <thread ant:id="testit"/>
    <echo message="hello2">
        <run ant:refid="testit">       
        </run>
    </echo>

    <echo message="hello3">
        <run ant:type="thread">       
        </run>
    </echo>
  </target>

</project>
If I run mutant via a simple script which has just one line

java -jar /home/conor/dev/mutant/dist/lib/start.jar $*

I get this
test2:
     [echo] Adding runnable of type java.lang.Thread
     [echo] hello2
     [echo] Adding runnable of type java.lang.Thread
     [echo] hello3

test:
     [echo] hello

BUILD SUCCESSFUL

Total time: 0 seconds
Lets change the <run> definition to

<run/> in test.ant and the result becomes
test2:
     [echo] Adding runnable of type java.lang.Thread
     [echo] hello2

BUILD FAILED

/home/conor/dev/mutant/test/test.ant:10: 
No element can be created for nested element <run>. 
Please provide a value by reference or specify the value type