Embperl - Building dynamic Websites with Perl
Copyright (c) 1997-2002 Gerald Richter / ecos gmbh   www.ecos.de

You may distribute under the terms of either the GNU General Public
License or the Artistic License, as specified in the Perl README file.

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

$Id$

### !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !!
###
### This is a BETA release of Embperl 2.0. Before installing,
### please read the README.v2. The documentation is not yet updated to
### reflect the changes in 2.0; everything that has changed is
### documented in README.v2.
### I use Embperl 2.0b in a production environment on my own.
### But be careful, this release may still contain bugs.
###
### The current stable release is Embperl 1.3.4
###
### !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !!


Hints for using Embperl 2.x
---------------------------

Embperl 2 is totally rewritten. Most of the Perl code has been moved into C
to speed up processing. The core is totally redesigned to give a lot of new
possibilities. This is still beta, so it may (and will) contain bugs.
Please report any weird behaviour to the Embperl mailing list, but be sure
to read this whole README to understand what doesn't work yet.

The Embperl core now works in a totally different way. The processing of
the source towards the output is done by providers. Every provider takes a
small step. Which providers are used is defined by a recipe. The standard
Embperl recipe contains the following providers:

  1 reading the source
  2 parsing
  3 compiling
  4 executing
  5 outputting

The providers work in a similar way to Unix shell programs, which process
a single source in a pipeline towards the output. In Embperl it is not only
a simple pipeline, but a tree structure, so multiple sources can be
incorporated into one result.
Rearranging the providers, or writing and using new ones, gives
flexibility and power. In addition to the standard Embperl providers,
Embperl ships with XML parser and XSLT processor providers.

The new execution scheme is also faster, because html tags and
metacommands are parsed only once (Perl code was already cached in 1.x and
still is). My first benchmarks show 50%-100% faster execution under
mod_perl compared to Embperl 1.x.

Another new feature is that the syntax of the Embperl parser is defined
within the module Embperl::Syntax and can be modified as necessary. Embperl
comes with a set of syntax definitions which can be modified by the user.
So far there are syntax definitions for SSI, Text only, Perl only, ASP,
POD, RTF and a Mail taglib. You can tell Embperl which syntax to use either
in the configuration via EMBPERL_SYNTAX, or with the syntax parameter of
Execute, or you can change the syntax dynamically inside the page via the
[$syntax $] command. You can also specify more than one syntax at the same
time, e.g.

    [$syntax Embperl SSI $]

to mix Embperl tags and SSI tags in the same page.

If you'd like to create your own syntax read:

    perldoc Embperl::Syntax

and look at the files under Embperl/Syntax/ for examples on how to do it.

Also new is the ability to cache (parts of) the output. See the new
configuration directives below.

Starting with 2.0b6 Embperl provides a set of new objects, which allow you
to access Embperl internals and manipulate the processing. Basically there
are three major objects:

  - Application
  - Request
  - Component

The application object is responsible for a set of pages that form an
application. It is used to configure things like session handling and
logging, which should be uniform across these pages. More importantly, it
can be overridden, and the overriding object can contain the application
logic, to create a proper separation of logic and presentation.

The request object holds everything which spans a whole (HTTP) request.
The component object is responsible for a single component inside the
desired output. It holds things like the source file etc.

All three objects have a subobject which holds the configuration and a
subobject for their current parameters. See below for a short list of
accessible members.


Debugging
---------

Starting with 2.0b2 Embperl files can be debugged via the interactive
debugger. The debugger shows the Embperl page source along with the correct
line numbers. You can do anything you can do inside a normal Perl program
via the debugger, e.g. show variables, modify variables, single step, set
breakpoints etc.

You can use the Perl interactive command line debugger via

    perl -d embpexec.pl file.epl

or, if you prefer a graphical debugger, try ddd
(http://www.gnu.org/software/ddd/). It's a great tool, also for debugging
any other Perl script:

    ddd --debugger 'perl -d embpexec.pl file.epl'

NOTE: embpexec.pl can be found in the Embperl source directory.

If you want to debug your pages while running under mod_perl, Apache::DB is
the right thing. Apache::DB is available from CPAN.


The following differences to Embperl 1.x apply:
------------------------------------------------------

- When running under mod_perl you _must_ load Embperl at server startup
  time, either with a

    PerlModule Embperl

  in your httpd.conf or a

    use Embperl ;

  inside of a startup script. You can now use the Embperl configuration
  directives directly (without PerlSetEnv/SetEnv). If you still want to
  use environment variables to configure Embperl, write

    Embperl_UseEnv on

- For every container in your httpd.conf (e.g. VirtualHost, Directory,
  Location) where you want to define any application level configuration
  directives (see below under tAppConfig for a list), you need to set a
  unique value for EMBPERL_APPNAME. This is for example necessary for all
  Embperl::Object parameters.
  Example:

    EMBPERL_APPNAME my_embperl_app
    EMBPERL_OBJECT_BASE base.epl

- The following options can currently only be set from httpd.conf:

    optKeepSpaces

- The option optRawInput is replaced by EMBPERL_INPUT_ESCMODE, which is
  off by default (same as when optRawInput was set in 1.x)

- The following options are currently not supported:

    optRedirectStdout
    optDisableHtmlScan, optDisableTableScan, optDisableInputScan,
    optDisableMetaScan

  optDisableHtmlScan can be replaced by switching the syntax, e.g.

    [$syntax EmbperlBlocks $]  # same as [- $optDisableHtmlScan = 1 -]

    (here goes your code - Embperl will not interpret any html tags here)

    [$syntax Embperl $]        # same as [- $optDisableHtmlScan = 0 -]

- Nesting must be done properly, i.e. you cannot put a <table> tag (for a
  dynamic table) inside an 'if' and the closing </table> tag
  inside another 'if'. (That still works for static tables)

- optUndefToEmptyValue is always set and cannot be disabled.

- [$ foreach $x (@x) $] now requires the parentheses around the array
  (like Perl)

- [+ +] blocks must now contain a valid Perl expression. Embperl 1.x
  allowed you to put multiple statements into such a block. For
  performance reasons this is not possible anymore. Also the expression
  must _not_ be terminated with a semicolon. To let old code work, just
  wrap it into a 'do', e.g.

    [+ do { my $a = $b + 5 ; $a } +]

- EMBPERL_INPUT_FUNC and EMBPERL_OUTPUT_FUNC are not supported anymore.
  You can get the same result and much more by writing a custom provider.

- Embperl doesn't change the current working directory to the directory
  of the source file anymore. This is done for performance reasons and
  because it won't work reliably with threads under mod_perl 2.0. You can
  use

    $req -> component -> cwd

  to get the directory of the source file (where $req is the Embperl
  request object, which is the first parameter passed to the page, i.e.
  $_[0])


The following things are not fully tested/working yet:
------------------------------------------------------

- [- exit -]
  exit does not work inside of [$ sub $]; outside it works.
  (It can now also exit the whole request, see below)

- safe namespaces


Embperl 1.x compatibility flag
------------------------------

The compatibility flag isn't available anymore in 2.0b6. Since Embperl 2.0
now lives in its own namespace, you can install Embperl 1.x and 2.x on the
same machine without conflicts.


Additional Config directives
----------------------------

Caching parameter
-----------------

execute parameter / httpd.conf environment variable / name inside page (must be set inside [! !])

cache_key / EMBPERL_CACHE_KEY / $CACHE_KEY

    literal string that is appended to the cache key

cache_key_options / EMBPERL_CACHE_KEY_OPTIONS / $CACHE_KEY_OPTIONS

    ckoptCarryOver     = 1   use result from CacheKeyFunc of previous step, if any
    ckoptPathInfo      = 2   include the PathInfo in the cache key
    ckoptQueryInfo     = 4   include the QueryInfo in the cache key
    ckoptDontCachePost = 8   don't cache POST requests (not yet implemented)

    Default: all options set

cache_key_func / EMBPERL_CACHE_KEY_FUNC / &CACHE_KEY

    function that is called when a cache key is built. The result is
    appended to the cache key.

expires_func / EMBPERL_EXPIRES_FUNC / &EXPIRES

    function that is called every time before data is taken from the
    cache. If this function returns true, the data from the cache isn't
    used anymore, but rebuilt.

    The function can be either a coderef (when passed to Execute), the
    name of a subroutine, or a string starting with "sub ", in which case
    it is compiled as an anonymous subroutine.

expires_in / EMBPERL_EXPIRES_IN / $EXPIRES

    Time in seconds that the output should be cached.
    (0 = never, -1 = forever)

expires_filename / EMBPERL_EXPIRES_FILENAME / $EXPIRES_FILENAME

    Expires when the given file has changed


Syntax switching
----------------

syntax / EMBPERL_SYNTAX / [$ syntax $]

Used to tell Embperl which syntax to use inside a page. Embperl comes with
the following syntaxes:

- EmbperlHTML    # all the HTML tags that Embperl recognizes by default
- EmbperlBlocks  # all the [ ] blocks that Embperl supports
- Embperl        # (default; contains EmbperlHTML and EmbperlBlocks)
- ASP            # <% %> and <%= %>, see perldoc Embperl::Syntax::ASP
- SSI            # Server Side Includes, see perldoc Embperl::Syntax::SSI
- Perl           # File contains pure Perl (similar to Apache::Registry),
                 # but can be used inside Embperl::Object
- Text           # File contains only text, no actions are taken on the text
- Mail           # Defines a tag for sending mail.
                 # This is an example of a taglib, which could be a base
                 # for writing your own taglib to extend the number of
                 # available tags
- POD            # translates pod files to XML, which can be converted to
                 # the desired output format by an XSLT transformation
- RTF            # Can be used to process word processing documents in
                 # RTF format

You can get a description of each syntax if you type

    perldoc Embperl::Syntax::xxx

where 'xxx' is the name of the syntax.

You can also specify multiple syntaxes, e.g.

    EMBPERL_SYNTAX "Embperl SSI"

    Execute ({inputfile => '*', syntax => 'Embperl ASP'}) ;

The 'syntax' metacommand allows you to switch the syntax or to add or
subtract syntaxes, e.g.

    [$ syntax + Mail $]

will add the Mail taglib, so its tag is available after this line.

    [$ syntax - Mail $]

Now the tag is unknown again.

    [$ syntax SSI $]

Now you can only use SSI commands inside your page.


EMBPERL_INPUT_ESCMODE
---------------------

    0   don't interpret input (default)
    1   unescape html escapes to their characters (i.e. &lt; becomes < )
        inside of Perl code
    2   unescape url escapes to their characters (i.e. %26 becomes & )
        inside of Perl code
    3   unescape html and url escapes, depending on the context

Add 4 to remove html tags inside of Perl code. This is helpful when an
html editor inserts html tags
inside your Perl code.

Set EMBPERL_INPUT_ESCMODE to 7 to get the old default of Embperl < 2.0b6.

Set EMBPERL_INPUT_ESCMODE to 0 to get the old behaviour when optRawInput
was set. This is the current default.


Error mailing
-------------

EMBPERL_MAIL_ERRORS_TO

    email address to mail any errors to

EMBPERL_MAIL_ERRORS_LIMIT

    do not mail more than this number of errors. Set to 0 for no limit.

EMBPERL_MAIL_ERRORS_RESET_TIME

    reset the error counter if no error has occurred for this number of
    seconds

EMBPERL_MAIL_ERRORS_RESEND_TIME

    after this number of seconds, mail errors regardless of the error
    counter

All error counting is done per child, so if you run a large site and have
100 children, you may get 100 * EMBPERL_MAIL_ERRORS_LIMIT mails before they
are limited.


Session handling
----------------

Session handling has changed from 1.3.3 to 1.3.4 and from 2.0b3 to 2.0b4.
You must either install Apache::SessionX or set

    PerlSetEnv EMBPERL_SESSION_HANDLER_CLASS "Embperl::Session"

to get the old behaviour.


Overview Embperl objects and their methods
------------------------------------------

* Application object

    thread curr_req config lfd user_session state_session app_session
    udat sdat mdat debug errors_count errors_last_time
    errors_last_send_time

* Application configuration

    app_name app_handler_class session_args session_classes
    session_config session_handler_class cookie_name cookie_domain
    cookie_path cookie_expires log debug mailhost mailhelo mailfrom
    maildebug mail_errors_to mail_errors_limit mail_errors_reset_time
    mail_errors_resend_time object_base object_app object_addpath
    object_stopdir object_fallback object_handler_class new

* Request object

    apache_req config param component app thread request_count
    request_time iotype session_mgnt session_id session_state_id
    session_user_id exit log_file_start_pos error errors errdat1 errdat2
    lastwarn cleanup_vars cleanup_packages initial_cwd messages
    default_messages startclock stsv_count

* Request configuration

    allow urimatch mult_field_sep path debug options session_mode

* Request parameter

    filename unparsed_uri uri path_info query_info language cookies

* Component object

    config param req_running sub_req inside_sub exit path_ndx cwd
    ep1_compat phase sourcefile buf end_pos curr_pos sourceline
    sourceline_pos line_no_curr_pos document curr_node curr_repeat_level
    curr_checkpoint curr_dom_tree source_dom_tree syntax ifd ifdobj
    append_to_main_req prev strict import_stash exports curr_package
    eval_package main_sub prog prog_run prog_def code

* Component configuration

    package debug options escmode input_escmode input_charset cache_key
    cache_key_options expires_func cache_key_func expires_in syntax
    recipe xsltstylesheet xsltproc compartment cleanup

* Component Parameter

    inputfile outputfile input output sub import firstline mtime param
    fdat ffld object isa errors xsltparam


Configuration directives summary
--------------------------------

/* tComponentConfig */

    PACKAGE DEBUG OPTIONS ESCMODE INPUT_ESCMODE INPUT_CHARSET CACHE_KEY
    CACHE_KEY_OPTIONS EXPIRES_FUNC CACHE_KEY_FUNC EXPIRES_IN SYNTAX RECIPE
    XSLTSTYLESHEET XSLTPROC COMPARTMENT

/* tReqConfig */

    ALLOW URIMATCH MULTFIELDSEP PATH DEBUG OPTIONS SESSION_MODE

/* tAppConfig */

    APPNAME APP_HANDLER_CLASS SESSION_HANDLER_CLASS SESSION_ARGS
    SESSION_CLASSES SESSION_CONFIG COOKIE_NAME COOKIE_DOMAIN COOKIE_PATH
    COOKIE_EXPIRES LOG DEBUG MAILDEBUG MAILHOST MAILHELO MAILFROM
    MAIL_ERRORS_TO MAIL_ERRORS_LIMIT MAIL_ERRORS_RESET_TIME
    MAIL_ERRORS_RESEND_TIME OBJECT_BASE OBJECT_APP OBJECT_ADDPATH
    OBJECT_STOPDIR OBJECT_FALLBACK OBJECT_HANDLER_CLASS

When running under mod_perl, you can use these directly as Apache
configuration directives. They are case insensitive. You don't need to use
environment variables for configuration anymore. For this to work you have
to add

    PerlModule Embperl
    AddModule embperl.c

before the first Embperl configuration directive. If you still want to use
environment variables, you must set

    Embperl_UseEnv on

For CGI mode, still use environment variables.

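The examples elsewhere in this README (EMBPERL_SYNTAX, EMBPERL_APPNAME,
EMBPERL_CACHE_KEY) suggest that the environment variable for a directive is
its summary name with an EMBPERL_ prefix. The following plain Perl sketch
builds such names for a CGI setup. It assumes the prefix rule holds for
every name in the summary, and both the helper embperl_env_name and the
values in %settings are hypothetical; it does not call Embperl itself.

```perl
use strict;
use warnings;

# Map a name from the summary (e.g. 'syntax') to the environment variable
# name used in this README (e.g. 'EMBPERL_SYNTAX').
# Assumption: the EMBPERL_ prefix applies to every summary name.
sub embperl_env_name {
    my ($name) = @_;
    return 'EMBPERL_' . uc $name;
}

# Hypothetical settings, for illustration only.
my %settings = (
    syntax        => 'Embperl SSI',
    input_escmode => 0,
    expires_in    => 60,
);

# Build the variables a CGI setup would have to export.
my %env = map { embperl_env_name($_) => $settings{$_} } keys %settings;

print "$_=$env{$_}\n" for sort keys %env;
```

How the variables actually reach the CGI process (web server configuration,
wrapper script) depends on your setup and is not shown here.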

exit
----

exit will override the normal Perl exit in every Embperl document. Calling
exit will immediately stop any further processing of that file and send the
already-done work to the output/browser.

NOTE: If you are inside of an Execute, Embperl will only exit this Execute,
but the file which called the file containing the exit with Execute will
continue. If you want to exit the whole request, call exit with an
argument, e.g.

    exit (200)

NOTE: If you write a module which should work with Embperl under mod_perl,
you must use Embperl::exit instead of the normal Perl exit. (In 1.3.x it
was Apache::exit)


Recipes
-------

Starting with 2.0b4 Embperl introduces the concept of recipes. A recipe
basically tells Embperl how a component should be built. While before 2.0b4
you could have only one processor that works on the request (the Embperl
processor; you're also able to define different syntaxes), now you can have
multiple of them arranged in a pipeline or even a tree. While you are able
to give the full recipe when calling Execute, this is not very convenient,
so normally you will only give the name of a recipe, either as parameter
'recipe' to Execute or as EMBPERL_RECIPE in your httpd.conf. Of course you
can have different recipes for different locations and/or files.

A recipe is constructed out of providers. A provider can either read from
some source or do some processing on a source. There is no restriction on
what sort of data a provider has as in- and output. You just have to make
sure that the output format of a provider matches the input format of the
next provider.

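The idea of chaining providers can be sketched in plain Perl. This is a
conceptual illustration only, not Embperl's implementation: the four
"providers" below are hypothetical stand-ins written as ordinary subs, and
run_pipeline is a made-up helper. What it shows is the one rule stated
above, that the output of each step must be in the format the next step
expects.

```perl
use strict;
use warnings;

# Each hypothetical provider takes the previous step's output and returns
# its own. Note how the data format changes along the way.
my @pipeline = (
    [ read    => sub { my ($src)    = @_; return $src } ],                      # string -> string
    [ parse   => sub { my ($text)   = @_; return [ split /\s+/, $text ] } ],    # string -> array ref
    [ execute => sub { my ($tokens) = @_; return [ map { uc $_ } @$tokens ] } ],# array ref -> array ref
    [ output  => sub { my ($tokens) = @_; return join ' ', @$tokens } ],        # array ref -> string
);

# Feed the input through all steps in order.
sub run_pipeline {
    my ($input, @steps) = @_;
    my $data = $input;
    for my $step (@steps) {
        my ($name, $code) = @$step;
        $data = $code->($data);    # output of one step is input of the next
    }
    return $data;
}

print run_pipeline('hello embperl world', @pipeline), "\n";
```

Swapping the order of 'parse' and 'execute' here would break the chain,
because 'execute' expects an array reference and would receive a string.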
In the current implementation Embperl comes with a set of built-in
providers:

  - file                  read file data
  - memory                get data from a scalar
  - epparse               parse file into an Embperl tree structure
  - epcompile             compile Embperl tree structure
  - eprun                 execute Embperl tree structure
  - eptostring            convert Embperl tree structure to string
  - libxslt-parse-xml     parse xml source for libxslt
  - libxslt-compile-xsl   parse and compile stylesheet for libxslt
  - libxslt               do an xsl transformation via libxslt
  - xalan-parse-xml       parse xml source for xalan
  - xalan-compile-xsl     parse and compile stylesheet for xalan
  - xalan                 do an xsl transformation via xalan

There is a C interface, so new custom providers can be written, but what
makes it really useful is that the next release of Embperl will contain a
Perl interface, so you can write your own providers in Perl.

The default recipe is named Embperl and contains the following providers:

    +-----------+
    +   file    +
    +-----------+
          |
          v
    +-----------+
    +  epparse  +
    +-----------+
          |
          v
    +-----------+
    + epcompile +
    +-----------+
          |
          v
    +-----------+
    +   eprun   +
    +-----------+

This causes Embperl to behave like it has done in the past, when no recipes
existed.

Each intermediate result can be cached. So, for example, you are able to
cache the already parsed XML or the compiled stylesheet in memory, without
the need to reparse/recompile it over and over again.

Another nice thing about recipes is that they are not static. A recipe is
defined by a recipe object. When a request comes in, Embperl calls the
get_recipe method of the application object, which by default calls the
get_recipe of the named recipe object, which should return an array that
describes what Embperl has to do. The get_recipe methods can of course
build the array dynamically, looking, for example, at request parameters
like filename, form values, mime type or whatever. For example, if you give
a scalar as input, the Embperl recipe replaces the file provider with a
memory provider.

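The file/memory switch described above can be sketched in plain Perl. The
function choose_providers is hypothetical, and the flat list of provider
names it returns is an assumption made for illustration; the real data
structure returned by get_recipe is described in perldoc Embperl::Recipe.
Only the provider names themselves are taken from the list above.

```perl
use strict;
use warnings;

# Hypothetical sketch: pick the first provider depending on the input.
# A scalar reference means the source is already in memory.
sub choose_providers {
    my ($input) = @_;
    my $first = ref($input) eq 'SCALAR' ? 'memory' : 'file';
    return [ $first, qw(epparse epcompile eprun) ];
}

my $page = '<p>[+ 1 + 1 +]</p>';

print join(' -> ', @{ choose_providers('index.epl') }), "\n";
print join(' -> ', @{ choose_providers(\$page) }), "\n";
```

A real get_recipe could look at more than the input type, for example at
the filename or the request parameters, as mentioned above.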
Additionally you can specify more than one recipe (separated by spaces).
Embperl will call all the new methods in turn until the first one that
does not return undef. This way you can create recipes that know what they
are responsible for. One possibility would be to check the file extension
and only return the recipe if it matches. Much more sophisticated things
are possible...

See perldoc Embperl::Recipe for how to create your own provider.


XML, XSLT
---------

As mentioned above, Embperl now contains a provider for doing XSLT
transformations. More XML will come in the next releases. The easiest
thing is to use the XSLT stuff through the predefined recipes:

  EmbperlLibXSLT      the result of Embperl will run through the GNOME
                      libxslt
  EmbperlXalanXSLT    the result of Embperl will run through Xalan-C
  EmbperlXSLT         the result of Embperl will run through the XSL
                      transformer given by xsltproc or EMBPERL_XSLTPROC
  LibXSLT             run source through the GNOME libxslt
  XalanXSLT           run source through Xalan-C
  XSLT                run source through the XSL transformer given by
                      xsltproc or EMBPERL_XSLTPROC

For example, including the result of an XSLT transformation into your html
page could look like this:

    <html><head><title>Include XML via XSLT</title></head>
    <body>

    Start xml

    [- Execute ({inputfile => 'foo.xml', recipe => 'EmbperlXalanXSLT', xsltstylesheet => 'foo.xsl'}) ; -]

    END

    </body></html>

As you already guessed, the xsltstylesheet parameter gives the name of the
xsl file. You can also use the EMBPERL_XSLTSTYLESHEET configuration
directive to set it from your configuration file.

By setting EMBPERL_ESCMODE (or $escmode) to 15 you get the correct escaping
for XML.


Internationalisation (I18N)
---------------------------

Starting with 2.0b6 Embperl has built-in support for multi-language
applications. There are two things to do.

First, inside your pages mark which parts are translatable by using [= =]
blocks. Inside the [= =] blocks you can either put ids, which are symbolic
names for the text, or put the text in your primary language. An example
could look like:

    [= heading =]

Now run the embpmsgid.pl utility, which extracts all the ids from your
page:

    perl embpmsgid.pl -l de -l en -d msg.pl foo.htm

This will create a file msg.pl which contains empty definitions for 'en'
and 'de' with all the ids found in the page. If the file msg.pl already
exists, the definitions are added. You can give more than one filename on
the command line. The msg.pl file is written with Data::Dumper, so it can
easily be read in via 'do' and postprocessed. As the next step, fill the
empty definitions with the correct translations.

The last thing to do is to tell Embperl which language set to use. You do
this inside the init method of the application object. Create an
application object which reads in the messages and, when the init method is
called, passes the correct ones to Embperl. There are two methods,
$r -> messages and $r -> default_messages. Both return an array ref onto
which you can push your message hashes. Embperl first consults the messages
array and, if the text is not found there, the default_messages array.
Because both are arrays you can push multiple message sets onto them. This
is handy when your application object calls its base class, which may also
define some messages.
Here is an example:

    package My::App ;

    @ISA = ('Embperl::App') ;

    # message sets, one hash per language
    %messages = (
        'de' =>
            {
            'heading' => 'Überschrift',
            'bar'     => 'Absenden',
            },
        'en' =>
            {
            'heading' => 'Heading',
            'bar'     => 'Submit',
            },
        ) ;

    sub init
        {
        my $self = shift ;
        my $r    = $self -> curr_req ;

        # language of the request, falling back to German
        my $lang = $r -> param -> language || 'de' ;

        # primary messages first, English as default for missing texts
        push @{$r -> messages}, $messages{$lang} ;
        push @{$r -> default_messages}, $messages{'en'} if ($lang ne 'en') ;
        }

    1 ;

Just load this package and set EMBPERL_APP_HANDLER_CLASS to My::App;
Embperl will then call the init method at the start of the request.

If you are using Embperl::Object, you may instead save it as a file in your
document hierarchy and make the filename known to Embperl::Object with the
EMBPERL_OBJECT_APP directive. Embperl::Object will then retrieve the
correct application file, just in the same way it retrieves other files.

NOTE: When using it with Embperl::Object, don't put a package declaration
at the top of your application object; Embperl::Object assigns its own
namespace to the application object.

In case you need to retrieve a text inside your Perl code, you can do this
with

    $r -> gettext('bar')

-------------------

Enjoy

Gerald