Proposal for Using Catalina in Tomcat 4.0
Introduction
As many Tomcat developers are aware, development on a next generation
version of Apache JServ was in progress when Sun announced their contribution
of the JavaServer Web Development Kit (JSWDK) source code to the Apache
Software Foundation in June 1999. Among other things, this caused interest
in further development of Apache JServ to dwindle, on the (reasonable)
assumption that Tomcat -- the code eventually delivered in October 1999 --
would become a natural successor to Apache JServ, with support for similar
functionality but providing the key enhancements of support for the current
versions of the servlet and JSP specifications.
When the code was delivered in October, it was obvious that its heritage
and purpose were not in conformance with the stated goals of the Jakarta
project ("commercial quality server solutions" and "world class implementation
of the Java Servlet 2.2 and JSP 1.1 Specifications" -- both quotes from the
home page at <http://jakarta.apache.org>). Although there have been
improvements in functionality, quality, and performance since that time, these
improvements have been relatively small in scale compared to the fundamental
problems of the code base -- very poor internal organization and documentation,
"spaghetti code" implementation of much of the functionality, substantial
numbers of bugs in key features, and a general lack of maintainability.
Background
When it became obvious that "evolutionary" changes to the existing code base
would take more effort and time than a "revolutionary" change to a new code
base, considerable effort was (and continues to be) expended to create a new
code base for the servlet container portion of Tomcat, based in large part
on the design direction in which Apache JServ was already headed. In addition,
a set of goal statements was explicitly described. These goals (and the
Catalina architectural design that implements them) have been presented --
with very positive reception -- at several recent trade shows (ApacheCon in
March, O'Reilly Java Conference in March, JavaOne in June, and O'Reilly
Open Source Conference in July). Briefly, the goals cover the following
general areas:
- Create a servlet container useful in diverse deployment environments,
including (but not limited to):
  - Stand alone "application server"
  - Connected to an existing web server
  - Integrated in a full function application server (typically
    providing the full suite of J2EE capabilities)
  - Integrated in a development tool
  - Embedded in a hardware device or software application (such as
    providing the administrative user interface for a router or a
    firewall)
  - Providing simultaneous support for non-HTTP protocols on the
    same server infrastructure
- Support customizable functionality via plug-in components:
  - Internal architecture based on components, described by Java
    interfaces
  - Implementation classes configurable at run time (with
    appropriate defaults)
  - Component configuration based on JavaBeans property APIs
  - Optional component lifecycle support and events
    provided in a consistent manner
- Support extensible request processing:
  - At various levels (entire server, virtual host, web
    application/servlet context, or individual servlet)
  - Example applications: security (authentication and access control),
    customized logging, resource management and transaction support,
    and performance measurement
  - Configurable at run time
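As a rough illustration of the plug-in component goals above, the following
Java sketch shows one way a component described by an interface could be
instantiated from a class name supplied at run time and then configured
through JavaBeans property setters. It is not taken from the Catalina
sources. The names ComponentSketch, Logger, Lifecycle, PrefixLogger, and
create, and the prefix and verbosity properties, are all hypothetical and
exist only for this example.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: every name here is hypothetical, not Catalina's API. */
public class ComponentSketch {

    /** A component contract, described purely by a Java interface. */
    public interface Logger {
        void log(String message);
    }

    /** Optional lifecycle contract that a component may also implement. */
    public interface Lifecycle {
        void start();
        void stop();
    }

    /** One possible implementation, configured through JavaBeans properties. */
    public static class PrefixLogger implements Logger, Lifecycle {
        private String prefix = "";   // default used when not configured
        private int verbosity = 1;    // default used when not configured
        private boolean started;

        public String getPrefix() { return prefix; }
        public void setPrefix(String prefix) { this.prefix = prefix; }
        public int getVerbosity() { return verbosity; }
        public void setVerbosity(int verbosity) { this.verbosity = verbosity; }
        public boolean isStarted() { return started; }

        @Override public void start() { started = true; }
        @Override public void stop() { started = false; }
        @Override public void log(String message) {
            System.out.println(prefix + message);
        }
    }

    /**
     * Instantiates the named class, applies the string-valued properties
     * through matching setters, and starts the component if it has a lifecycle.
     */
    public static <T> T create(Class<T> type, String className,
                               Map<String, String> properties) {
        try {
            Object bean = Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            for (PropertyDescriptor pd :
                    Introspector.getBeanInfo(bean.getClass()).getPropertyDescriptors()) {
                String raw = properties.get(pd.getName());
                if (raw != null && pd.getWriteMethod() != null) {
                    pd.getWriteMethod().invoke(bean, convert(raw, pd.getPropertyType()));
                }
            }
            T component = type.cast(bean);
            if (component instanceof Lifecycle) {
                ((Lifecycle) component).start();
            }
            return component;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalArgumentException("Cannot configure " + className, e);
        }
    }

    /** Converts a configuration string to the few property types used here. */
    private static Object convert(String raw, Class<?> target) {
        if (target == int.class) return Integer.valueOf(raw);
        if (target == boolean.class) return Boolean.valueOf(raw);
        return raw;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("prefix", "[catalina] ");
        properties.put("verbosity", "3");
        // In a real server the class name would come from a configuration file.
        Logger logger = create(Logger.class, PrefixLogger.class.getName(), properties);
        logger.log("component configured");
    }
}
```

Because callers depend only on the interface, swapping in a different
implementation class is a configuration change rather than a code change,
and properties that are not configured keep the defaults set by the class.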
High level diagrams of the Catalina component architecture are included
in the session notes for the conference web sites mentioned above. In
particular, the JavaOne presentation can be downloaded from:
<http://jsp.java.sun.com/javaone/javaone2000/pdfs/TS-953.pdf>.
The existing implementation of these goals is present in the
proposals/catalina directory of the jakarta-tomcat CVS module. It passes
100% of the servlet tests (and 100% of the JSP tests in conjunction with the
JSP portions of Tomcat 3.x) from the jakarta-watchdog test suite that
validates conformance to the servlet 2.2 and JSP 1.1 specifications.
Comparison With Tomcat 3.X
This section will be fairly brief, because the entire source code for
both Tomcat 3.x and Catalina is available for anyone to peruse and compare
for themselves. However, several key points are worthy of mention
(in alphabetical order):
- Code Design - Many who have reviewed the source code
of Catalina and Tomcat 3.x have commented on Catalina's clean coding
style, overall consistency, and straightforward implementation of the
features of a servlet container. Such characteristics are not usually
evident in the code for Tomcat 3.x (although the current code base is
definitely an improvement over the original). Their absence makes code
maintenance, especially by someone not intimately familiar with Tomcat,
tougher than it needs to be.
- Documentation - Catalina exhibits substantially better
and more complete documentation than Tomcat in many important areas,
including: overall architecture diagrams, complete JavaDoc comments,
and a substantial start towards documentation of server configuration
options in conf/server.xml. Tomcat currently includes a
more complete User's Guide, although it is in need of better organization
and presentation no matter what code base we use.
- Functionality - Compared to Tomcat 3.x, Catalina already
supports significant additional functionality in many areas, including:
  - HTTP/1.1 support in stand-alone mode (Tomcat 3.x needs a front-end
    web server to provide this).
  - DIGEST-based authentication, as well as the BASIC and FORM
    authentication supported by Tomcat.
  - Request processing functionality (Valves in Catalina, versus
    RequestInterceptors in Tomcat) that is configurable on a per-server,
    per-virtual-host, or per-web-application level. The set of request
    interceptors in Tomcat 3.x is currently global to all web applications
    installed in a particular instance.
  - Valves support request and response filtering via wrappers, which
    enables a much richer universe of functionality than the request
    interceptors of Tomcat 3.x.
  - Support for non-HTTP protocols is possible in Catalina. Tomcat
    assumes that every request is an HttpServletRequest, and every
    response is an HttpServletResponse.
  - Abstraction of resource support (ServletContext.getResource() and
    friends) so that Catalina can support resources loaded from a file
    system, a JAR file, or (with the addition of customized components)
    any other source -- such as BLOB objects in a database. Tomcat 3.x
    is tied to loading resources only from a file system.
  - Configurable support for caching of recently accessed resources
    (such as HTML pages and images) that can dramatically improve
    performance of web applications which are predominantly dynamic
    (but have some static components), or where the application is
    loaded from a location where access is relatively slow (such as
    a JAR file on a remote server).
  - The file-serving servlet in Catalina optionally supports full
    WebDAV Level 2 functionality (see for more
    information), supporting web site management from clients like
    Microsoft Office 2000.
At present, Tomcat 3.x has one very significant functional feature that
is not yet completed in Catalina -- support for the web server connectors
to Apache, Microsoft IIS, Netscape/iPlanet, and AOLServer. Work is
already under way to port the existing Tomcat 3.x connectors to Catalina,
and it is anticipated that this work will be completed in a timely
manner.
- Goals - The Catalina architecture was designed from its
inception to meet an articulated set of high level goals (see above).
The original goal of the Tomcat 3.x code base (prior to turning the
code over to Apache) was to produce a reference implementation, and
the Tomcat 3.x code has suffered churn in its architecture in part
because no such goals have been stated to compare it against.
- Maturity - Ironically, substantial portions of the
Catalina code base have been in their current organizational design
for much longer than the corresponding Tomcat code base. This is
partly due to the incremental way in which changes have occurred in
the Tomcat code base over the last year or so (including multiple
different approaches to some functional issues), but primarily due to
starting from a stable set of design goals. Tomcat 3.x continues to
exhibit an embarrassing number of implementation bugs (for example, the
way that automatic reloading works when changes in Java class files are
detected is still fundamentally flawed in Tomcat 3.2).
Tomcat 3.x, of course, has been tested in "real life" by substantial
numbers of people, whereas Catalina has been tested primarily by
pioneers. It is to be expected that implementation bugs exist in
Catalina -- although experience in fixing Catalina bugs to date has
been that nearly all of them are relatively localized implementation
problems, not issues that require fundamental design changes.
- Performance - No formal benchmarks have been published
by the Tomcat development team describing the performance of Tomcat 3.x
or Catalina. (In addition, I would take any benchmark that purports to
predict how *your* application will perform on a particular server with a
very large grain of salt :-). However, simple tests with tools like
ApacheBench and Microsoft's stress test indicate that Catalina is
faster than Tomcat on serving static resources (substantially faster if
caching is enabled, basically on par with Apache itself), and within a few
percentage points on serving "Hello, World" type servlets and JSP pages.
Only modest efforts to tune Catalina have taken place to date (compared
to the substantial -- and successful -- tuning efforts expended between
Tomcat 3.1 and 3.2), so it can be assumed that there is still significant
room for improvement through the usual techniques of isolating hot spots
and optimizing them.
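The per-level request processing described under Functionality above
(Valves attached to a server, a virtual host, or a single web application,
with the option of wrapping the request before passing it on) can be
sketched roughly as follows. This is a simplified illustration and not
Catalina's actual interfaces: ValveSketch, Request, Response, Next, Valve,
Container, and buildDemo are hypothetical names, and the request type is
deliberately not tied to HTTP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: every name here is hypothetical, not Catalina's API. */
public class ValveSketch {

    /** Protocol-neutral request; a wrapper subclass can override accessors. */
    public static class Request {
        private final String path;
        public Request(String path) { this.path = path; }
        public String getPath() { return path; }
    }

    /** Collects output so that the demo can inspect it. */
    public static class Response {
        public final StringBuilder body = new StringBuilder();
    }

    /** Hands control to whatever comes next in the processing chain. */
    public interface Next {
        void invoke(Request request, Response response);
    }

    /** One unit of request processing; it may log, wrap, reject, or pass on. */
    public interface Valve {
        void invoke(Request request, Response response, Next next);
    }

    /** A server, virtual host, or web application with its own valves. */
    public static class Container {
        private final List<Valve> valves = new ArrayList<>();
        private Next basic;   // runs after this container's valves; must be set

        public void addValve(Valve valve) { valves.add(valve); }
        public void setChild(Container child) { basic = child::invoke; }
        public void setBasic(Next basic) { this.basic = basic; }

        public void invoke(Request request, Response response) {
            invokeFrom(0, request, response);
        }

        // Runs the valve at the given index, or the basic handler when none remain.
        private void invokeFrom(int index, Request request, Response response) {
            if (index < valves.size()) {
                valves.get(index).invoke(request, response,
                        (rq, rs) -> invokeFrom(index + 1, rq, rs));
            } else {
                basic.invoke(request, response);
            }
        }
    }

    /** Builds a server -> host -> application chain with valves at two levels. */
    public static Container buildDemo() {
        Container server = new Container();
        Container host = new Container();
        Container app = new Container();
        server.setChild(host);
        host.setChild(app);

        // Server-wide valve: sees every request for every host and application.
        server.addValve((rq, rs, next) -> {
            rs.body.append("[log ").append(rq.getPath()).append("]");
            next.invoke(rq, rs);
        });
        // Application-only valve: filters the request through a wrapper.
        app.addValve((rq, rs, next) -> next.invoke(new Request(rq.getPath()) {
            @Override public String getPath() {
                return super.getPath().toLowerCase(Locale.ROOT);
            }
        }, rs));
        // Stand-in for the servlet that finally handles the request.
        app.setBasic((rq, rs) -> rs.body.append("served ").append(rq.getPath()));
        return server;
    }

    public static void main(String[] args) {
        Response response = new Response();
        buildDemo().invoke(new Request("/Index.HTML"), response);
        System.out.println(response.body);  // [log /Index.HTML]served /index.html
    }
}
```

In this sketch the logging valve applies to everything under the server,
while the wrapping valve affects only the one application it is attached
to, which is the distinction drawn above against a single global set of
request interceptors.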
The Catalina Proposal
Given this background, and the fact that initial public drafts of the
Servlet 2.3 and JSP 1.2 specifications are imminent, it is timely for us to
make a decision to use Catalina as the code base for Tomcat 4.0 (supporting
the features of the new specifications), while continuing to support the
Tomcat 3.x code base for as long as is appropriate. The associated proposal
for organizing the source code repository for Tomcat 4.0 development shows
how this dual track development approach can be supported easily.