AxKit Provider HOWTO

Riccardo Cambiassi
brujah@infodrome.net

A fairly complete HOWTO on writing Providers for AxKit

Introduction

AxKit is an XML Application Server for Apache. It provides on-the-fly conversion from XML to any format, such as HTML, WAP or text, using either W3C standard techniques or flexible custom code. AxKit also uses a built-in Perl interpreter to provide some amazingly powerful techniques for XML transformation. For more information on AxKit see www.axkit.org.

One of the most interesting features of AxKit is its object architecture, which makes it easy (and quite fun) to extend to meet your own needs. From the AxKit manpage we learn that we can operate on three aspects of AxKit's behaviour:

AxConfigReader returns information about various configuration options.

AxProvider is the means by which AxKit gets its resources.

AxCacheModule is responsible for storing cache data for later retrieval.

The feature we'll exploit in this document is the extension of the Provider module.

Overview of AxKit Providers

All Providers descend from the Apache::AxKit::Provider module. From its manpage (and a bit of hacking) we learn that it relies upon the following main methods:

new() - creates a new Provider object; normally this method should be defined only in the parent class (Apache::AxKit::Provider). To handle custom initialization, look at init() below.

init() - initializes the provider; it is called by new(). Here the Provider module has the chance to carry out custom initialization procedures. The standard behaviour is to do nothing. The init() method should accept arguments as follows: the first is always the request object; then comes a list of key => value pairs containing either a 'uri' or a 'file' key for the desired resource id and, in the case of an external entity, 'rel' containing the Provider object for the main document. [is this correct ???]

process() * - answers the question "Shall we process this request?". It should return 1 if it can process the resource, or die if it cannot, possibly throwing an appropriate exception (see below).

exists() * - answers the question "Does the resource exist?". Returns 1 if it exists.

mtime() * - answers the question "How old is this resource?". Returns the modification time in days before the current time. It is used to test the validity of cached data.

get_fh() * - returns an open filehandle for the resource (or dies if that's not possible).

get_strref() * - returns a reference to a scalar containing the resource; note that at least one of get_fh() or get_strref() must work.

key() * - returns a unique identifier for this resource.

get_styles() - extracts stylesheets and external entities from the resource.

get_ext_ent_handler() - returns a reference to be used instead of XML::Parser's default external entity handler.

The methods marked with * are not defined in Apache::AxKit::Provider, so each real Provider has to implement its own.

How do Providers work?

Throughout the processing of a request, whenever AxKit needs to fetch a resource, it creates an Apache::AxKit::Provider object for the desired resource. This is a generic (high-level) object whose job is not limited to defining the standard methods (see above): it also checks which actual Provider is in charge and reconsecrate()s itself to it.
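The hand-over from the generic object to the actual Provider can be pictured as a plain Perl re-bless. What follows is only a sketch of that idea, not AxKit's own code: the package names My::Provider and My::Provider::File, and the 'provider' argument that stands in for the configured provider class, are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the generic parent class.
package My::Provider;

sub new {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;

    # Assumption: the concrete class name would normally come from the
    # server configuration; here it is passed in or defaulted.
    my $target = $args{provider} || 'My::Provider::File';
    bless $self, $target;    # re-bless into the concrete class

    $self->init(%args);      # now dispatches to the concrete init()
    return $self;
}

sub init { }                 # default behaviour: do nothing

# Hypothetical concrete provider.
package My::Provider::File;
our @ISA = ('My::Provider');

sub init {
    my ($self, %args) = @_;
    $self->{id} = $args{uri} || $args{file};
}

sub key { $_[0]{id} }

package main;

my $p = My::Provider->new(uri => '/sample.xml');
print ref($p), " ", $p->key, "\n";
```

The caller only ever asks the parent class for an object, yet every later method call lands in the concrete class, which is why a custom Provider usually overrides init() rather than new().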
To change the default Provider you can use the AxProvider directive, or simply set the variable with the PerlSetVar directive in your httpd.conf:

    # either
    AxProvider Apache::AxKit::Provider::File
    # or
    PerlSetVar AxProvider Apache::AxKit::Provider::File

A new Provider is created in order to get:

- the XML document
- every stylesheet
- every external entity

For example, let's suppose we have requested the following document at the URL http://localhost/sample.xml

    <?xml version="1.0" ?>
    <?xml-stylesheet href="/sample.xsl" type="text/xsl"?>
    <page>
      <para>Hello World!</para>
    </page>

From a fairly high-level point of view, AxKit will:

1. Create a Provider for this resource [my $provider = Apache::AxKit::Provider->new($r)]
2. Check whether to proceed with processing this resource [if ($provider->process())]
3. Get the resource [$provider->get_fh() or $provider->get_strref()]
4. Parse the XML to extract all stylesheets and external entities and, for each resource:
   a. Create a Provider for this resource, specifying the 'uri' in the case of a stylesheet, or either 'file' or 'uri' in the case of an external entity. In our example we have just one stylesheet ('/sample.xsl') and no external entities.
   b. Check whether we can process this resource
   c. Get the resource

Once AxKit has got all the resources, it uses Language Processors to apply the stylesheet to the document and then delivers the result to the browser.

Standard Providers

What follows is a list of the Providers that come with the standard distribution of AxKit.

File

This is the default. It gets input from files (surprise!) relative to Apache's DocumentRoot. It is also the most complete Provider, in that it defines all features, and IMO it is the best starting place for anyone who wants to develop a new Provider.

It defines: get_fh() get_strref() key() exists() process() mtime()

and redefines: init()

Scalar

This is a basic provider that gets input from a scalar variable. AxKit uses this Provider to handle error messages.

It defines all standard methods: process() exists() mtime() get_fh() get_strref() key()

and redefines: new() init() apache_request() get_styles()

Filter

The most exotic Provider and, quite surprisingly, the simplest: it works with Apache::Filter in order to get data from another PerlHandler. This requires the other handlers to be "Filter aware". At the time of this writing, this applies to:

- Apache::Registry
- Apache::SSI
- Apache::ASP
- HTML::Mason
- Apache::SimpleReplace

The Filter Provider is derived from the File Provider and redefines just: init() get_fh() get_strref() mtime()

Using Filter Provider

Here we'll discuss how to use the Filter Provider in order to exploit Apache::Filter.

Apache::Filter

To enable the Filter chain, you have to work on both the Apache configuration and the individual handlers (here called Filters). The following piece of code is borrowed from the Apache::Filter manpage:

    #### In httpd.conf:
    PerlModule Apache::Filter
    # That's it - this isn't a handler.

    <Files ~ "*\.blah">
      SetHandler perl-script
      PerlSetVar Filter On
      PerlHandler Filter1 Filter2 Filter3
    </Files>

    #### In Filter1, Filter2, and Filter3:
    $r = $r->filter_register();  # Required
    my $fh = $r->filter_input(); # Optional (you might not need the input FH)
    while (<$fh>) {
      s/ something / something else /;
      print;
    }

As noted before, the following public modules are currently Filter-aware:

- Apache::Registry (using Apache::RegistryFilter, included with Apache::Filter)
- Apache::SSI
- Apache::ASP
- HTML::Mason
- Apache::SimpleReplace

... with simple CGIs

How to use Apache::RegistryFilter and AxKit. This is pretty simple: just add the following to httpd.conf:

    PerlModule Apache::RegistryFilter
    PerlModule AxKit

    <Location /filter>
      SetHandler perl-script
      PerlSetVar Filter On
      AxProvider Apache::AxKit::Provider::Filter
      PerlHandler Apache::RegistryFilter AxKit
    </Location>

Then write a CGI that generates your XML document and put it in the $DocumentRoot/filter/ directory.
What follows is a minimalistic example:

    #!/usr/bin/perl
    # Print the XML prolog and the opening part of the document.
    print <<EOT;
    <?xml version="1.0"?>
    <?xml-stylesheet href="plain.xsl" type="text/xsl"?>
    <html>
      <head>
        <title>AxKit/Simple CGI filter test</title>
      </head>
      <body>
        <table bgcolor="#FFFFFF">
    EOT

    # One table row per environment variable.
    print map { "<tr><td>$_</td><td>$ENV{$_}</td></tr>" } keys %ENV;

    # Close the document.
    print <<EOT;
        </table>
      </body>
    </html>
    EOT

Finally, you will need a stylesheet to process the XML generated by the CGI. Here's an example:

    <?xml version="1.0" ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" >
      <xsl:output method="html" indent="yes" encoding="ISO-8859-1" />

      <xsl:template match="/">
        <xsl:apply-templates select="/html/*"/>
      </xsl:template>

      <xsl:template match="*">
        <xsl:copy>
          <xsl:copy-of select="./@*" />
          <xsl:apply-templates />
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

... with Apache::ASP

Apache::ASP provides an Active Server Pages port to the Apache Web Server with Perl as the host scripting language. For more information about Apache::ASP, see the Apache ASP homepage.

To enable it to work with the AxKit Filter Provider, just add the following to httpd.conf:

    PerlModule Apache::Filter
    PerlModule Apache::ASP
    PerlModule AxKit

    <Location /filter>
      SetHandler perl-script
      PerlSetVar Filter On
      AxProvider Apache::AxKit::Provider::Filter
      PerlHandler Apache::ASP AxKit
    </Location>

Then create a sample ASP page to generate the XML. Here is a minimalistic example:

    <?xml version="1.0"?>
    <?xml-stylesheet href="plain.xsl" type="text/xsl"?>
    <html>
      <head>
        <title>AxKit / Apache::ASP filter test</title>
      </head>
      <body bgcolor="#000000" text="#CCCCCC">
        <h3>Environment Variables:</h3>
        <table border="1" width="100%" cellspacing="0" cellpadding="0">
    <%
      my $env = $Request->ServerVariables;
      $Response->Write (
        map { "<tr><th>$_</th><td>$env->{$_}</td></tr>" } keys %$env
      );
    %>
        </table>
      </body>
    </html>

Provider Internals

Some advanced tips about AxKit.
Everything in this chapter should be considered just an overview of the subject; it is included only to help the reader better understand some of the main topics covered in this paper.

AxKit::Apache object

AxKit redefines the request object through the AxKit::Apache package. This (re)defines the following methods:

- content_type
- print
- no_cache
- send_http_header

The changes in these methods mostly have to do with cache handling and shouldn't be of much interest to you. Just note that with the no_cache method you can disable AxKit's own cache too.

External Entity Handler

As stated before, it is possible to define a custom External Entity Handler in the Provider module. This happens through the get_ext_ent_handler() routine. The default behaviour is to fetch remote http: entities with HTTP::GHTTP, and local ones (unknown or no scheme) with the current AxKit Provider.

Apache::AxKit::Exception

AxKit uses a subclass of Error to handle exceptions. This implements the try / catch / otherwise / finally primitives. You can use them to handle (or recover from) errors in a clean way. The Apache::AxKit::Exception package defines the following types of exception:

- Apache::AxKit::Exception
- Apache::AxKit::Exception::Declined
- Apache::AxKit::Exception::Error
- Apache::AxKit::Exception::OK
- Apache::AxKit::Exception::Retval
- Apache::AxKit::Exception::IO

Writing a simple provider: DBI

Here's a complete example of an AxKit Provider written from scratch. It is called Apache::AxKit::Provider::DBI and is designed to get its data from a database. In my example, the data source is a MySQL server installed on localhost, with the 'test' db and a 'guest' user with password 'guest'.
Here are the table definition and the default values:

    # Table definition
    CREATE TABLE blocks (
      id varchar(255) DEFAULT '' NOT NULL,
      block TEXT,
      PRIMARY KEY (id)
    );

    # Sample data
    INSERT INTO blocks VALUES ('/index.xml', '<?xml version="1.0"?>
    <page>
      <title>Hello world</title>
    </page>');

    INSERT INTO blocks VALUES ('/index.xsl', '<?xml version="1.0" ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" >
      <xsl:template match="/">
        <xsl:apply-templates />
      </xsl:template>
      <xsl:template match="title">
        title = <xsl:value-of select="."/>
      </xsl:template>
    </xsl:stylesheet>');

What follows is the code for the provider:

    #TODO:
    # - insert some decent comments
    # - write down a sample external entity handler
    # - explain better the mtime() thing.

    package Apache::AxKit::Provider::DBI;
    use strict;
    use vars qw/@ISA/;
    @ISA = ('Apache::AxKit::Provider');

    use Apache;
    use Apache::Log;
    use Apache::AxKit::Exception;
    use Apache::AxKit::Provider;
    use Apache::AxKit::Provider::File;
    use Apache::MimeXML;
    use Apache::Constants;
    use DBI;

    # sub: init
    # Here we do some initialization stuff: work out the resource id.
    sub init {
        my $self = shift;
        my (%p) = @_;

        if ($p{uri}) {
            # called from :
            #   process_request ($styleprovider = Apache::AxKit::Provider->new)
            #   check_resource_mtimes ($ent_provider = ... )
            #   [...]
            $self->{id} = $p{uri};
        }
        elsif ($p{file}) {
            $self->{id} = $p{file};
        }
        else {
            $self->{id} = $self->{apache}->filename();
        }
    }

    # sub: get_fh
    # We don't want to handle files, so we just throw an exception here.
    sub get_fh {
        throw Apache::AxKit::Exception::IO(
            -text => "Can't get fh for DBI filehandle"
        );
    }

    # sub: get_strref
    # Since we refused to work with file handles, we HAVE to define this.
    sub get_strref {
        my $self = shift;

        # Connect to the DB and query it. The id is passed as a bind
        # value rather than interpolated into the SQL string.
        my $dbh = DBI->connect("dbi:mysql:test", 'guest', 'guest')
            or throw Apache::AxKit::Exception::IO(
                -text => "Can't connect to the DB: $DBI::errstr"
            );
        my $sth = $dbh->prepare("SELECT block FROM blocks WHERE id = ?");
        $sth->execute($self->{id});

        # Now get the data and disconnect from the DB
        my ($res) = $sth->fetchrow_array;
        $sth->finish;
        $dbh->disconnect;

        return \$res;
    }

    # sub: mtime
    # This should return the modification time of the resource. For
    # simplicity, here we decrement it every time we are called, so
    # that resources are never considered cacheable.
    use vars qw/$mtime/;
    $mtime = 0;

    sub mtime {
        my $self = shift;
        return --$mtime; # borrowed from Scalar Provider
    }

    # sub: process
    sub process {
        my $self = shift;
        # For simplicity, let's assume our DB entry always exists
        return 1;
    }

    # sub: key
    # Should return a unique identifier for the resource.
    # Let's assume the id from the uri is a good one.
    sub key {
        my $self = shift;
        return $self->{id};
    }

    # sub: exists
    # Should return 1 only if the resource actually exists.
    # Let's cheat for now.
    sub exists {
        my $self = shift;
        return 1;
    }

    1;

To enable it, just modify your httpd.conf as follows:

    <Location />
      PerlHandler AxKit
      AxProvider Apache::AxKit::Provider::DBI
    </Location>
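The mtime() in the provider above deliberately changes on every call, so nothing is ever served from the cache. If the database recorded when each block was last changed, mtime() could report a real age instead. The sketch below shows only the unit conversion, following the description of mtime() given earlier ("days before the current time"). The helper name days_before_now and the idea of a last-modified epoch timestamp stored with each row are assumptions; the sample table has no such column.

```perl
use strict;
use warnings;

# Hypothetical helper: convert an epoch timestamp (in seconds) into
# "days before now", the unit described earlier for mtime().
sub days_before_now {
    my ($modified_epoch, $now) = @_;
    $now = time unless defined $now;    # default to the current time
    return ($now - $modified_epoch) / 86400;    # 86400 seconds per day
}

# Example: a resource modified exactly 12 hours before a fixed "now".
my $now = 1_000_000_000;
my $age = days_before_now($now - 12 * 3600, $now);
print "age in days: $age\n";    # prints "age in days: 0.5"
```

A provider's mtime() could then return days_before_now() applied to the timestamp fetched for $self->{id}, so that an unchanged row keeps a steadily growing age.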