Embperl - Building dynamic Websites with Perl
Copyright (c) 1997-2002 Gerald Richter / ecos gmbh www.ecos.de
You may distribute under the terms of either the GNU General Public
License or the Artistic License, as specified in the Perl README file.
THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
$Id$
### !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !!
###
###
### This is a BETA release of Embperl 2.0. Before installing,
### please read the README.v2. The documentation is not yet updated to
### reflect the changes in 2.0; everything that has changed is
### documented in README.v2.
### I use Embperl 2.0b in my own production environment,
### but be careful: this release may still contain bugs.
###
### The current stable release is Embperl 1.3.4
###
### !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !! IMPORTANT !!
Hints for using Embperl 2.x
---------------------------
Embperl 2 is totally rewritten. Most of the Perl code has been moved
into C to speed up processing. The core is totally redesigned to
open up a lot of new possibilities.
This is still beta, so it may (and will) contain bugs.
Please report any weird behaviour to the embperl mailing list, but
be sure to read this whole README to understand what doesn't work yet.
The Embperl core now works in a totally different way. The processing
of the source towards the output is done by providers. Every provider
takes a small step. Which providers are used is defined by a recipe.
The standard Embperl recipe contains the following providers:
1 reading the source
2 parsing
3 compiling
4 executing
5 outputting
The providers work in a similar way to Unix shell programs that
process a single source in a pipeline towards the output. In
Embperl it is not only a simple pipeline, but a tree structure,
so multiple sources can be incorporated in one result.
Rearranging the providers, or writing and using new ones, gives
flexibility and power. In addition to the standard Embperl providers,
Embperl ships with XML parser and XSLT processor providers.
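The pipeline idea can be illustrated in plain Perl. The following is only a
sketch: the step names mirror the list above, but the step bodies and the
run_recipe helper are made up for illustration and are not Embperl's provider
interface.

```perl
use strict;
use warnings;

# Each step is a plain code reference that takes the previous step's
# output and returns its own. The bodies are placeholders, not
# Embperl internals.
my @recipe = (
    [ read    => sub { my ($name) = @_; return "hello from $name" } ],
    [ parse   => sub { my ($src)  = @_; return { source => $src } } ],
    [ compile => sub { my ($tree) = @_; return sub { return uc $tree->{source} } } ],
    [ execute => sub { my ($code) = @_; return $code->() } ],
    [ output  => sub { my ($text) = @_; return "$text\n" } ],
);

# Hypothetical driver: feed the input through every step in order.
sub run_recipe {
    my ($input, @steps) = @_;
    my $data = $input;
    for my $step (@steps) {
        my ($name, $code) = @$step;
        $data = $code->($data);
    }
    return $data;
}

print run_recipe('page.epl', @recipe);    # prints HELLO FROM PAGE.EPL
```

Because the recipe is just a list, steps can be reordered or replaced without
touching the driver, which is the flexibility described above.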
The new execution scheme is also faster, because html tags and metacommands
are parsed only once (Perl code was already cached in 1.x, and still is).
My first benchmarks show 50%-100% faster execution under mod_perl
compared to Embperl 1.x.
Another new feature is that the syntax of the Embperl parser is defined
within the module Embperl::Syntax and can be modified as necessary.
Embperl comes with a set of syntax definitions which can be modified by
the user. So far there are syntax definitions for SSI, Text only, Perl only,
ASP, POD, RTF and a Mail taglib. You can tell Embperl which syntax to use either in
the configuration via EMBPERL_SYNTAX, or with the syntax parameter of
Execute, or you can change the syntax dynamically inside the page via the
[$syntax $] command. You can also specify more than one syntax at the same
time, e.g. [$syntax Embperl SSI $] to mix Embperl tags and SSI tags in the same
page.
If you'd like to create your own syntax read:
perldoc Embperl::Syntax
and look at the files under Embperl/Syntax/ for examples on how to do it.
Also new is the ability to cache (parts of) the output. See
the new configuration directives below.
Starting with 2.0b6 Embperl provides a set of new objects, which allow
you to access Embperl internals and manipulate the processing. Basically there
are three major objects:
- Application
- Request
- Component
The application object is responsible for a set of pages that form an
application. It is used to configure things like session handling and
logging, which should be uniform across these pages. More importantly,
it can be overridden, and the overriding object can contain the application
logic, to create a proper separation of logic and presentation.
The request object holds everything which spans a whole (HTTP) request.
The component object is responsible for a single component inside the
desired output. It holds things like the source file etc.
All three objects have a subobject which holds the configuration and a
subobject for their current parameters.
See below for a short list of accessible members.
Debugging
---------
Starting with 2.0b2 Embperl files can be debugged via the interactive debugger.
The debugger shows the Embperl page source along with the correct line numbers.
You can do anything you can do inside a normal Perl program via the debugger,
e.g. show variables, modify variables, single step, set breakpoints etc.
You can use the Perl interactive command line debugger via
perl -d embpexec.pl file.epl
or if you prefer a graphical debugger, try ddd (http://www.gnu.org/software/ddd/).
It's a great tool, also for debugging any other Perl script:
ddd --debugger 'perl -d embpexec.pl file.epl'
NOTE: embpexec.pl can be found in the Embperl source directory.
If you want to debug your pages while running under mod_perl, Apache::DB is the
right thing. Apache::DB is available from CPAN.
The following differences to Embperl 1.x apply:
------------------------------------------------------
- When running under mod_perl you _must_ load Embperl
at server startup time. Either with a
PerlModule Embperl
in your httpd.conf or a
use Embperl ;
inside of a startup script.
You can now use the Embperl configuration directives
directly (without PerlSetEnv/SetEnv). If you still
want to use environment variables to configure Embperl, write
Embperl_UseEnv on
- For every container in your httpd.conf (e.g. VirtualHost,Directory,Location)
where you want to define any application level configuration directives
(see below under tAppConfig for a list), you need to set a unique
value for EMBPERL_APPNAME. This is for example necessary for all
Embperl::Object parameters. Example:
EMBPERL_APPNAME my_embperl_app
EMBPERL_OBJECT_BASE base.epl
- The following options can currently only be set from httpd.conf:
optKeepSpaces
- The option optRawInput is replaced by EMBPERL_INPUT_ESCMODE,
which is off by default (same as when optRawInput was set
in 1.x)
- The following options are currently not supported:
optRedirectStdout
optDisableHtmlScan, optDisableTableScan,
optDisableInputScan, optDisableMetaScan
optDisableHtmlScan can be replaced by switching the syntax, e.g.
[$syntax EmbperlBlocks $] # same as [- $optDisableHtmlScan = 1 -]
(here goes your code - Embperl will not interpret any html tags here)
[$syntax Embperl $] # same as [- $optDisableHtmlScan = 0 -]
- Nesting must be done properly. I.e. you cannot put a <table> tag
(for a dynamic table) inside an 'if' and the </table> tag inside
another 'if'. (That still works for static tables.)
- optUndefToEmptyValue is always set and cannot be disabled.
- [$ foreach $x (@x) $] now requires the parentheses around the
array (like Perl)
- [+ +] blocks must now contain a valid Perl expression. Embperl 1.x
allows you to put multiple statements into such a block. For performance
reasons this is not possible anymore. Also the expression must _not_ be
terminated with a semicolon. To let old code work, just wrap it into a 'do'
e.g. [+ do { my $a = $b + 5 ; $a } +]
- EMBPERL_INPUT_FUNC and EMBPERL_OUTPUT_FUNC are not supported anymore.
You can get the same result and much more by writing a custom provider.
- Embperl doesn't change the current working directory anymore to the
directory of the source file. This is done for performance reasons
and because it won't work reliably with threads under mod_perl 2.0.
You can use $req -> component -> cwd to get the directory of the
source file (where $req is the Embperl request object, which is the first
parameter passed to the page, i.e. $_[0]).
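The 'do' wrapper suggested above for [+ +] blocks relies on ordinary Perl
behaviour, which can be tried outside of Embperl. This is a minimal sketch
with made-up variable names:

```perl
use strict;
use warnings;

my $base = 10;

# A 'do' block is a single expression: it runs every statement inside
# and evaluates to the value of the last one. Note that there is no
# semicolon after the final statement's value is produced.
my $result = do { my $sum = $base + 5; $sum };

print "$result\n";    # prints 15
```

This is why several statements wrapped in 'do { ... }' still count as one
valid expression.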
The following things are not fully tested/working yet:
------------------------------------------------------
- [- exit -]
exit does not work inside of [$ sub $]; outside it works.
(It also can now exit the whole request, see below)
- safe namespaces
Embperl 1.x compatibility flag
------------------------------
The compatibility flag isn't available anymore in 2.0b6. Since
Embperl 2.0 now lives in its own namespace, you can install Embperl 1.x and
2.x on the same machine without conflicts.
Additional Config directives
---------------------------
Caching parameter
-----------------
execute parameter / httpd.conf environment variable / name inside page (must be set inside [! !])
cache_key / EMBPERL_CACHE_KEY / $CACHE_KEY
literal string that is appended to the cache key
cache_key_options / EMBPERL_CACHE_KEY_OPTIONS / $CACHE_KEY_OPTIONS
ckoptCarryOver = 1, use result from CacheKeyFunc of previous step if any
ckoptPathInfo = 2, include the PathInfo into CacheKey
ckoptQueryInfo = 4, include the QueryInfo into CacheKey
ckoptDontCachePost = 8, don't cache POST requests (not yet implemented)
Default: all options set
cache_key_func / EMBPERL_CACHE_KEY_FUNC / &CACHE_KEY
function that should be called when building a cache key. The result is
appended to the cache key.
expires_func / EMBPERL_EXPIRES_FUNC / &EXPIRES
function that is called every time before data is taken from the cache.
If this function returns true, the data from the cache isn't used anymore,
but rebuilt.
The function can be either a coderef (when passed to Execute), the name of a
subroutine, or a string starting with "sub ", in which case it is compiled
as an anonymous subroutine.
expires_in / EMBPERL_EXPIRES_IN / $EXPIRES
Time in seconds that the output should be cached. (0 = never, -1 = forever)
expires_filename / EMBPERL_EXPIRES_FILENAME / $EXPIRES_FILENAME
Expires when the given file has changed
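The cache_key_options values are bit flags, so they can be combined with a
bitwise OR. The sketch below uses the flag values listed above. The
sketch_cache_key helper and the way it joins the parts are assumptions made
for illustration and do not show how Embperl builds its keys internally.

```perl
use strict;
use warnings;

# Flag values as listed for cache_key_options.
use constant {
    ckoptCarryOver     => 1,
    ckoptPathInfo      => 2,
    ckoptQueryInfo     => 4,
    ckoptDontCachePost => 8,
};

# "all options set" is the bitwise OR of every flag.
my $all_options = ckoptCarryOver | ckoptPathInfo | ckoptQueryInfo | ckoptDontCachePost;

# Hypothetical helper: append only the parts selected by the flags,
# then the literal cache_key string if one was given.
sub sketch_cache_key {
    my ($options, %part) = @_;
    my $key = $part{source};
    $key .= $part{path_info}        if $options & ckoptPathInfo;
    $key .= '?' . $part{query_info} if $options & ckoptQueryInfo;
    $key .= $part{cache_key}        if defined $part{cache_key};
    return $key;
}

print "$all_options\n";    # prints 15
print sketch_cache_key(ckoptPathInfo,
    source => '/a.epl', path_info => '/x', query_info => 'q=1'), "\n";    # prints /a.epl/x
```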
Syntax switching
----------------
syntax / EMBPERL_SYNTAX / [$ syntax $]
Used to tell Embperl which syntax to use inside a page. Embperl comes with
the following syntaxes:
- EmbperlHTML # all the HTML tags that Embperl recognizes by default
- EmbperlBlocks # all the [ ] blocks that Embperl supports
- Embperl # (default; contains EmbperlHTML and EmbperlBlocks)
- ASP # <% %> and <%= %>, see perldoc Embperl::Syntax::ASP
- SSI # Server Side Includes, see perldoc Embperl::Syntax::SSI
- Perl # File contains pure Perl (similar to Apache::Registry), but
# can be used inside EmbperlObject
- Text # File contains only Text, no actions are taken on the Text
- Mail # Defines a tag for sending mail. This is an
# example for a taglib, which could be a base for writing
# your own taglib to extend the number of available tags
- POD # translates pod files to XML, which can be converted to
# the desired output format by an XSLT transformation
- RTF # Can be used to process word processing documents in RTF format
You can get a description for each syntax if you type
perldoc Embperl::Syntax::xxx
where 'xxx' is the name of the syntax.
You can also specify multiple syntaxes e.g.
EMBPERL_SYNTAX "Embperl SSI"
Execute ({inputfile => '*', syntax => 'Embperl ASP'}) ;
The 'syntax' metacommand allows you to switch the syntax or to
add or subtract syntaxes, e.g.
[$ syntax + Mail $]
will add the Mail taglib, so its tag is available after
this line.
[$ syntax - Mail $]
Now the Mail tag is unknown again.
[$ syntax SSI $]
now you can only use SSI commands inside your page.
EMBPERL_INPUT_ESCMODE
---------------------
0 don't interpret input (default)
1 unescape html escapes to their characters (i.e. &lt; becomes < )
inside of Perl code
2 unescape url escapes to their characters (i.e. %26 becomes & )
inside of Perl code
3 unescape html and url escapes, depending on the context
Add 4 to remove html tags inside of Perl code. This is helpful when
an html editor inserts html tags inside your Perl code.
Set EMBPERL_INPUT_ESCMODE to 7 to get the old default of Embperl < 2.0b6
Set EMBPERL_INPUT_ESCMODE to 0 to get the old behaviour when optRawInput was set.
This is the current default.
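The mode values combine as bits: 1 for html escapes, 2 for url escapes and 4
for tag removal, so 7 means all three. The following plain-Perl sketch shows
what each bit does to a piece of text. The sketch_input_escmode function and
its small entity table are made up for illustration. Unlike Embperl, it does
not look at the context for mode 3 but simply applies both kinds of
unescaping.

```perl
use strict;
use warnings;

# A deliberately small entity table; real html has many more.
my %html_entity = (lt => '<', gt => '>', amp => '&', quot => '"');

# Hypothetical helper: apply the transformations selected by $mode.
sub sketch_input_escmode {
    my ($text, $mode) = @_;
    # 4: remove html tags (done first, so unescaped '<' is not stripped)
    $text =~ s/<[^>]*>//g                            if $mode & 4;
    # 1: turn html escapes into their characters, in a single pass
    $text =~ s/&(lt|gt|amp|quot);/$html_entity{$1}/g if $mode & 1;
    # 2: turn url escapes (%XX) into their characters
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge     if $mode & 2;
    return $text;
}

print sketch_input_escmode('$a &lt; 5', 1), "\n";           # prints $a < 5
print sketch_input_escmode('a%26b', 2), "\n";               # prints a&b
print sketch_input_escmode('<b>$a &lt; 5</b>', 7), "\n";    # prints $a < 5
```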
Error mailing
-------------
EMBPERL_MAIL_ERRORS_TO
email address to mail any error to
EMBPERL_MAIL_ERRORS_LIMIT
do not mail more than this number of errors. Set to 0 for no limit.
EMBPERL_MAIL_ERRORS_RESET_TIME
reset the error counter if no error has occurred for this number of seconds
EMBPERL_MAIL_ERRORS_RESEND_TIME
mail errors again after this number of seconds, regardless of the error counter
All error counting is done per child, so if you run a large site and
have 100 children, you may get 100 * EMBPERL_MAIL_ERRORS_LIMIT mails
before they are limited.
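How the three settings could interact is sketched below in plain Perl. This
is one possible reading of the descriptions above, not Embperl's actual
code. The sketch_should_mail function, the hash keys and the example values
are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether the error at time $now gets mailed.
# $state holds the per-child counters, $cfg the three settings.
sub sketch_should_mail {
    my ($state, $cfg, $now) = @_;

    # Quiet period long enough: start counting from zero again.
    if ($cfg->{reset_time} && defined $state->{last_time}
        && $now - $state->{last_time} >= $cfg->{reset_time}) {
        $state->{count} = 0;
    }
    $state->{last_time} = $now;
    $state->{count}++;

    # A limit of 0 means "no limit".
    my $under_limit = !$cfg->{limit} || $state->{count} <= $cfg->{limit};
    # Assumed reading of the resend time: mail again once it has passed.
    my $resend_due  = $cfg->{resend_time}
        && defined $state->{last_send_time}
        && $now - $state->{last_send_time} >= $cfg->{resend_time};

    return 0 unless $under_limit || $resend_due;
    $state->{last_send_time} = $now;
    return 1;
}

my %cfg   = (limit => 2, reset_time => 60, resend_time => 300);
my %state = (count => 0);

# Errors at seconds 0, 1 and 2, then one more after a quiet period.
my @sent = map { sketch_should_mail(\%state, \%cfg, $_) } (0, 1, 2, 100);
print "@sent\n";    # prints 1 1 0 1

# Every child keeps its own counter, so the worst case multiplies.
my $children   = 100;
my $worst_case = $children * $cfg{limit};
print "$worst_case\n";    # prints 200
```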
Session handling
----------------
Session handling has changed from 1.3.3 to 1.3.4 and 2.0b3 to 2.0b4. You must either
install Apache::SessionX or set
PerlSetEnv EMBPERL_SESSION_HANDLER_CLASS "Embperl::Session"
to get the old behaviour.
Overview of Embperl objects and their methods
---------------------------------------------
* Application object
thread
curr_req
config
lfd
user_session
state_session
app_session
udat
sdat
mdat
debug
errors_count
errors_last_time
errors_last_send_time
* Application configuration
app_name
app_handler_class
session_args
session_classes
session_config
session_handler_class
cookie_name
cookie_domain
cookie_path
cookie_expires
log
debug
mailhost
mailhelo
mailfrom
maildebug
mail_errors_to
mail_errors_limit
mail_errors_reset_time
mail_errors_resend_time
object_base
object_app
object_addpath
object_stopdir
object_fallback
object_handler_class
new
* Request object
apache_req
config
param
component
app
thread
request_count
request_time
iotype
session_mgnt
session_id
session_state_id
session_user_id
exit
log_file_start_pos
error
errors
errdat1
errdat2
lastwarn
cleanup_vars
cleanup_packages
initial_cwd
messages
default_messages
startclock
stsv_count
* Request configuration
allow
urimatch
mult_field_sep
path
debug
options
session_mode
* Request parameter
filename
unparsed_uri
uri
path_info
query_info
language
cookies
* Component object
config
param
req_running
sub_req
inside_sub
exit
path_ndx
cwd
ep1_compat
phase
sourcefile
buf
end_pos
curr_pos
sourceline
sourceline_pos
line_no_curr_pos
document
curr_node
curr_repeat_level
curr_checkpoint
curr_dom_tree
source_dom_tree
syntax
ifd
ifdobj
append_to_main_req
prev
strict
import_stash
exports
curr_package
eval_package
main_sub
prog
prog_run
prog_def
code
* Component configuration
package
debug
options
escmode
input_escmode
input_charset
cache_key
cache_key_options
expires_func
cache_key_func
expires_in
syntax
recipe
xsltstylesheet
xsltproc
compartment
cleanup
* Component Parameter
inputfile
outputfile
input
output
sub
import
firstline
mtime
param
fdat
ffld
object
isa
errors
xsltparam
Configuration directives summary
--------------------------------
/* tComponentConfig */
PACKAGE
DEBUG
OPTIONS
ESCMODE
INPUT_ESCMODE
INPUT_CHARSET
CACHE_KEY
CACHE_KEY_OPTIONS
EXPIRES_FUNC
CACHE_KEY_FUNC
EXPIRES_IN
SYNTAX
RECIPE
XSLTSTYLESHEET
XSLTPROC
COMPARTMENT
/* tReqConfig */
ALLOW
URIMATCH
MULTFIELDSEP
PATH
DEBUG
OPTIONS
SESSION_MODE
/* tAppConfig */
APPNAME
APP_HANDLER_CLASS
SESSION_HANDLER_CLASS
SESSION_ARGS
SESSION_CLASSES
SESSION_CONFIG
COOKIE_NAME
COOKIE_DOMAIN
COOKIE_PATH
COOKIE_EXPIRES
LOG
DEBUG
MAILDEBUG
MAILHOST
MAILHELO
MAILFROM
MAIL_ERRORS_TO
MAIL_ERRORS_LIMIT
MAIL_ERRORS_RESET_TIME
MAIL_ERRORS_RESEND_TIME
OBJECT_BASE
OBJECT_APP
OBJECT_ADDPATH
OBJECT_STOPDIR
OBJECT_FALLBACK
OBJECT_HANDLER_CLASS
When running under mod_perl, you can use these directly as Apache configuration
directives. They are case insensitive. You don't need to use environment
variables for configuration anymore. For this to work you have to add a
PerlModule Embperl
AddModule embperl.c
before the first Embperl configuration directive. If you still like to
use environment variables, you must set
Embperl_UseEnv on
For CGI mode, still use environment variables.
exit
----
Embperl overrides the normal Perl exit in every Embperl document. Calling
exit will immediately stop any further processing of that file and send the
already-done work to the output/browser.
NOTE: If you are inside of an Execute, Embperl will only exit this Execute, but
the file which called the file containing the exit with Execute will continue. If
you want to exit the whole request, call exit with an argument, e.g. exit (200).
NOTE: If you write a module which should work with Embperl under mod_perl,
you must use Embperl::exit instead of the normal Perl exit. (In 1.3.x it was
Apache::Exit.)
Recipes
-------
Starting with 2.0b4 Embperl introduces the concept of recipes. A recipe basically
tells Embperl how a component should be built. While before 2.0b4 you could
have only one processor that works on the request (the Embperl processor,
for which you can also define different syntaxes), now you can have several of them
arranged in a pipeline or even a tree. While you are able to give the full
recipe when calling Execute, this is not very convenient, so normally you
will only give the name of a recipe, either as parameter 'recipe' to
Execute or as EMBPERL_RECIPE in your httpd.conf. Of course you can have
different recipes for different locations and/or files. A recipe is constructed
out of providers. A provider can either read from some source or do some
processing on a source. There is no restriction on what sort of data a provider
has as input and output; you just have to make sure that the output format of
a provider matches the input format of the next provider. In the current
implementation Embperl comes with a set of built-in providers:
- file read file data
- memory get data from a scalar
- epparse parse file into an Embperl tree structure
- epcompile compile Embperl tree structure
- eprun execute Embperl tree structure
- eptostring convert Embperl tree structure to string
- libxslt-parse-xml parse xml source for libxslt
- libxslt-compile-xsl parse and compile stylesheet for libxslt
- libxslt do an xsl transformation via libxslt
- xalan-parse-xml parse xml source for xalan
- xalan-compile-xsl parse and compile stylesheet for xalan
- xalan do an xsl transformation via xalan
There is a C interface, so new custom providers can be written, but what makes it
really useful is that the next release of Embperl will contain a
Perl interface, so you can write your own providers in Perl.
The default recipe is named Embperl and contains the following providers:
+-----------+
+ file +
+-----------+
|
v
+-----------+
+ epparse +
+-----------+
|
v
+-----------+
+ epcompile +
+-----------+
|
v
+-----------+
+ eprun +
+-----------+
This causes Embperl to behave as it did in the past, when no
recipes existed.
Each intermediate result can be cached. So for example you are able
to cache the already parsed XML or compiled stylesheet in memory,
without the need to reparse/recompile it over and over again.
Another nice thing about recipes is that they are not static. A recipe
is defined by a recipe object. When a request comes in, Embperl calls
the get_recipe method of the application object, which by default
calls the get_recipe of the named recipe object, which should return an array
that describes what Embperl has to do. The get_recipe methods can of course
build the array dynamically, looking, for example, at the request parameters
like filename, form values, mime type or whatever. For example, if you
give a scalar as input, the Embperl recipe replaces the file provider
with a memory provider. Additionally you can specify more than one
recipe (separated by spaces). Embperl will call all the new methods in
turn until the first one that returns undef. This way you can create recipes
that know what they are responsible for. One possibility would be
to check the file extension and only return the recipe if it matches.
Much more sophisticated things are possible...
See perldoc Embperl::Recipe for how to create your own provider.
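The idea of building a recipe dynamically can be sketched in plain Perl. The
sketch_recipe function below is hypothetical and returns only a list of
provider names. The array a real get_recipe method has to return is
described in perldoc Embperl::Recipe and is not reproduced here. The sketch
also assumes that in-memory input is passed as a reference to a scalar.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the first provider from the parameters,
# then continue with the steps of the default recipe shown above.
sub sketch_recipe {
    my (%param) = @_;
    # Assumption: in-memory input arrives as a scalar reference.
    my $source = ref($param{input}) eq 'SCALAR' ? 'memory' : 'file';
    return [ $source, qw(epparse epcompile eprun) ];
}

my $page = 'some page source';
print join(' -> ', @{ sketch_recipe(inputfile => 'foo.epl') }), "\n";
# prints file -> epparse -> epcompile -> eprun
print join(' -> ', @{ sketch_recipe(input => \$page) }), "\n";
# prints memory -> epparse -> epcompile -> eprun
```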
XML, XSLT
---------
As mentioned above, Embperl now contains a provider for doing XSLT transformations.
More XML will come in the next releases. The easiest way is to use the XSLT
support through the predefined recipes:
EmbperlLibXSLT the result of Embperl will run through the Gnome libxslt
EmbperlXalanXSLT the result of Embperl will run through Xalan-C
EmbperlXSLT the result of Embperl will run through the XSL transformer
given by xsltproc or EMBPERL_XSLTPROC
LibXSLT run source through the Gnome libxslt
XalanXSLT run source through Xalan-C
XSLT run source through the XSL transformer given by xsltproc or
EMBPERL_XSLTPROC
For example, an html page titled "Include XML via XSLT" can include the
result of an XSLT transformation by calling Execute with the recipe and
xsltstylesheet parameters.
As you already guessed, the xsltstylesheet parameter gives the name of the xsl
file. You can also use the EMBPERL_XSLTSTYLESHEET configuration directive
to set it from your configuration file.
By setting EMBPERL_ESCMODE (or $escmode) to 15 you get the correct escaping
for XML.
Internationalisation (I18N)
---------------------------
Starting with 2.0b6 Embperl has built-in support for multi-language applications.
There are two things to do. First, inside your pages, mark which parts are translatable
by using [= =] blocks. Inside the [= =] blocks you can either put ids, which are symbolic
names for the text, or put the text in your primary language.
An example could look like:
[= heading =]
Now you run the embpmsgid.pl utility, which extracts all the ids from your page:
perl embpmsgid.pl -l de -l en -d msg.pl foo.htm
This will create a file msg.pl which contains empty definitions for 'en' and 'de'
with all the ids found in the page. If the file msg.pl already exists, the definitions
are added. You can give more than one filename on the command line. The
msg.pl file is written with Data::Dumper, so it can easily be read in via 'do' and
postprocessed. As the next step, fill the empty definitions with the correct translations.
The last thing to do is to tell Embperl which language set to use. You do this inside
the init method of the application object. Create an application object which reads
in the messages and, when the init method is called, passes the correct ones to Embperl.
There are two methods, $r -> messages and $r -> default_messages. Both return an array
ref onto which you can push your message hashes. Embperl first consults the messages array
and, if the text is not found there, the default_messages array for the correct message.
Because both are arrays, you can push multiple message sets onto them. This is handy when
your application object calls its base class, which may also define some messages.
Here is an example:
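package My::App ;
@ISA = ('Embperl::App') ;
# message sets, keyed by language and then by message id
%messages =
    (
    'de' =>
        {
        'heading' => 'Überschrift',
        'bar' => 'Absenden',
        },
    'en' =>
        {
        'heading' => 'Heading',
        'bar' => 'Submit',
        },
    ) ;
sub init
    {
    my $self = shift ;
    my $r = $self -> curr_req ;
    # language of the current request, German if none is given
    my $lang = $r -> param -> language || 'de' ;
    # the selected language is consulted first
    push @{$r -> messages}, $messages{$lang} ;
    # English is the fallback for ids missing in the selected language
    push @{$r -> default_messages}, $messages{'en'} if ($lang ne 'en') ;
    }
1 ;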
Just load this package and set EMBPERL_APP_HANDLER_CLASS to My::App; then
Embperl will call the init method at the start of the request.
If you are using Embperl::Object, you may instead save it as a file in your
document hierarchy and make the filename known to Embperl::Object with the
EMBPERL_OBJECT_APP directive. Embperl::Object will then retrieve the correct
application file, in the same way it retrieves other files.
NOTE: When using Embperl::Object, don't make a package declaration at
the top of your application object; Embperl::Object assigns its own namespace
to the application object.
In case you need to retrieve a text inside your Perl code, you can do this
with $r -> gettext('bar')
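The lookup order described in this section (first the messages array, then
the default_messages array) can be sketched in plain Perl. The
sketch_gettext function is made up for illustration and is not Embperl's
gettext. Returning the id itself when no set contains it is an assumption of
this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: search the message sets in order and return the
# first translation found.
sub sketch_gettext {
    my ($id, $messages, $default_messages) = @_;
    for my $set (@$messages, @$default_messages) {
        return $set->{$id} if defined $set->{$id};
    }
    return $id;    # assumption: fall back to the id when nothing matches
}

# The German set lacks 'heading', so the English default is used for it.
my @messages         = ({ bar => 'Absenden' });
my @default_messages = ({ heading => 'Heading', bar => 'Submit' });

print sketch_gettext('bar',     \@messages, \@default_messages), "\n";    # prints Absenden
print sketch_gettext('heading', \@messages, \@default_messages), "\n";    # prints Heading
```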
-------------------
Enjoy
Gerald