Embperl - building dynamic websites with Perl


Encoding/UTF-8

Requires Embperl 2.1.0 and up.

Embperl tries to do the right thing with ISO-8859-1 and UTF-8 out of the box. Encoding comes into play in three places:

1. Posted form data
2. Output escaping
3. Source code

Embperl handles the first two itself; the third is currently left to Perl.

Perl keeps a flag on each string value that tells whether the string is UTF-8 or not. Embperl relies on this flag.

Posted form data is examined. If a string contains valid UTF-8 characters, Perl's internal UTF-8 flag is set on it. You can disable setting the flag by setting optFormDataNoUtf8 in EMBPERL_OPTIONS.
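
To illustrate what such a check amounts to at the Perl level, here is a minimal sketch using Perl's built-in utf8::decode and utf8::is_utf8. It is not Embperl's actual implementation; the helper name mark_utf8_if_valid and the sample strings are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Embperl): return a copy of the raw bytes,
# decoded to characters if they are well-formed UTF-8, otherwise unchanged.
sub mark_utf8_if_valid {
    my ($bytes) = @_;
    my $copy = $bytes;
    utf8::decode($copy);    # works in place; leaves $copy alone if the bytes are not valid UTF-8
    return $copy;
}

my $utf8_input   = "Gr\xC3\xBC\xC3\x9Fe";    # a German word encoded as UTF-8 bytes
my $latin1_input = "Gr\xFC\xDFe";            # the same word as ISO-8859-1 bytes

my $decoded   = mark_utf8_if_valid($utf8_input);
my $unchanged = mark_utf8_if_valid($latin1_input);

printf "UTF-8 input:  flag=%d length=%d\n",
    (utf8::is_utf8($decoded)   ? 1 : 0), length $decoded;
printf "latin1 input: flag=%d length=%d\n",
    (utf8::is_utf8($unchanged) ? 1 : 0), length $unchanged;
```

Only the first string ends up with the flag set; the ISO-8859-1 bytes are not valid UTF-8, so they are passed through untouched.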

Output escaping is based on the UTF-8 flag. If the UTF-8 flag is set, characters above 127 are not escaped. For the page to display correctly in the browser, you also have to specify the encoding as UTF-8 in your Content-Type HTTP header.

If the UTF-8 flag is not set, output escaping follows the setting of EMBPERL_OUTPUT_ESC_CHARSET, which defaults to ISO-8859-1 (latin1). ISO-8859-2 (latin2) is also selectable.
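
The flag-dependent behaviour can be pictured with a small stand-alone sketch. This is not Embperl's escaping code: escape_for_output is a hypothetical helper, it assumes ISO-8859-1 for unflagged strings, and it emits numeric character references as a simplification, which may differ from what Embperl actually writes.

```perl
use strict;
use warnings;

# Hypothetical helper (not Embperl's code): escape text for HTML output.
# Strings with the UTF-8 flag keep characters above 127 as they are;
# other strings are treated as ISO-8859-1 and get numeric character references.
sub escape_for_output {
    my ($text) = @_;
    my %markup = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$markup{$1}/g;
    unless (utf8::is_utf8($text)) {
        $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    }
    return $text;
}

my $flagged = "\xFCber <b>";
utf8::upgrade($flagged);        # same characters, UTF-8 flag now set
my $plain = "\xFCber <b>";      # ISO-8859-1 bytes, flag not set

my $escaped_flagged = escape_for_output($flagged);
my $escaped_plain   = escape_for_output($plain);

print "unflagged: $escaped_plain\n";
printf "flagged string still has its flag: %s\n",
    utf8::is_utf8($escaped_flagged) ? 'yes' : 'no';
```

In both cases the markup characters are escaped; only the treatment of the character above 127 differs.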

If you want to write your Perl source code in UTF-8, you have to add use utf8; at the top of each page.
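
The effect of the pragma can be seen in a plain Perl script. The sketch below assumes the file is saved as UTF-8; it uses lexical blocks only to show both behaviours side by side, which is not how you would normally write a page.

```perl
use strict;
use warnings;

my ($with, $without);
{
    use utf8;     # string literals in this block are read as UTF-8 characters
    $with = length "Grüße";
}
{
    no utf8;      # string literals in this block stay raw bytes
    $without = length "Grüße";
}

print "with use utf8: length $with\n";
print "without:       length $without\n";
```

With the pragma the literal is counted in characters, without it in bytes, so the second number is larger by one for each two-byte character in the literal.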

Please note that not all modules set Perl's internal UTF-8 flag correctly. At the time of this writing, for example, DBI and Net::LDAP do not set this flag. You have to correct it manually, for example by using Encode::_utf8_on.
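
A minimal sketch of such a manual correction follows. The variable $from_db is a stand-in for a value returned by a module that hands back UTF-8 bytes without the flag; no database connection is made. The Encode::decode variant is shown as a possible alternative that validates the input and is a suggestion of this example, not something the text above prescribes.

```perl
use strict;
use warnings;
use Encode ();

# Stand-in for a fetched value: UTF-8 bytes, UTF-8 flag not set.
my $from_db = "M\xC3\xBCnchen";

# Variant 1: flip the flag directly. No validation is done, so the
# bytes must already be well-formed UTF-8.
my $flagged = $from_db;
Encode::_utf8_on($flagged);

# Variant 2 (alternative): decode with validation. FB_CROAK makes decode
# die on malformed input; LEAVE_SRC keeps $from_db unmodified.
my $decoded = Encode::decode('UTF-8', $from_db,
    Encode::FB_CROAK() | Encode::LEAVE_SRC());

printf "raw bytes: %d, flagged: %d, decoded: %d\n",
    length $from_db, length $flagged, length $decoded;
```

Both variants yield the same character string here; they differ only in how malformed input is treated.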




© 1997-2023 Gerald Richter / actevy