Revision 1884427


Author: stsp
Date: Mon Dec 14 16:57:10 2020 UTC
Changed paths: 3
Log Message:
Make mailer.py work properly with Python 3, and drop Python 2 support.

Most of the changes deal with the handling of binary data vs Python strings.

I've made sure that mailer.py will work in a UTF-8 environment. In general,
UTF-8 is recommended for hook scripts. See the SVNUseUTF8 mod_dav_svn option.
Environments using other encodings may not work as expected, but those will
be problematic for hook scripts in general. SVN repositories store internal
data such as paths in UTF-8. Our Python3 bindings do not deal with encoding
or decoding of such data, and thus need to work with raw UTF-8 strings, not
Python strings.
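
The boundary between raw UTF-8 data and Python strings described above can be
sketched as follows. This is a minimal illustration, not code from mailer.py:
the helper names to_display and to_repos and the sample path are hypothetical.

```python
# Hypothetical helpers; not taken from mailer.py.

def to_display(raw_path: bytes) -> str:
    """Decode a UTF-8 path, as handed back by the bindings, into a Python string."""
    return raw_path.decode("utf-8")


def to_repos(path: str) -> bytes:
    """Encode a Python string back to UTF-8 before passing it to the bindings."""
    return path.encode("utf-8")


# Stand-in for a path as it would arrive from the repository layer.
raw = "trunk/docs/\u00fcbersicht.txt".encode("utf-8")
text = to_display(raw)

# The round trip is lossless because both directions use UTF-8.
assert to_repos(text) == raw
```

Decoding happens once at the edge, so the rest of the script can use ordinary
string operations such as regular expression matching.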

The encoding of file and property contents is not guaranteed to be UTF-8.
This was already a problem before this change. This hook script sends email
with a content type header specifying the UTF-8 encoding. Diffs which contain
non-UTF-8 text will most likely not render properly when viewed in an email
reader. At least this problem is now obvious in mailer.py's implementation,
since all unidiff text is now written out directly as binary data.
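
A small sketch of why diff text is passed through as binary data: content in
another encoding is often not valid UTF-8, so it cannot be decoded safely. The
sample line and the helper is_valid_utf8 are hypothetical and only illustrate
the problem.

```python
import io

# Stand-in for one unidiff line whose file content is ISO-8859-1, not UTF-8.
diff_line = "+caf\u00e9\n".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Writing the bytes unchanged avoids a decode error, but a mail reader that
# trusts the UTF-8 content type header may still display them incorrectly.
out = io.BytesIO()
out.write(diff_line)
```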

As an additional fix, iterate file groups in sorted order. This results in
stable output and makes test cases in our tests/ subdirectory reproducible.
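
Sorted iteration can be sketched like this. The mapping layout and the function
render_groups are hypothetical and do not reflect mailer.py's real data
structures.

```python
def render_groups(groups: dict) -> list:
    """Return one summary line per group, ordered by group name."""
    lines = []
    # sorted() makes the order independent of how the mapping was built.
    for name in sorted(groups):
        lines.append(f"Group: {name} ({len(groups[name])} paths)")
    return lines
```

With this approach, two runs over the same configuration produce identical
output, which is what a comparison against a stored expected-output file needs.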

Tested with Python 3.7.5, which is the version I currently use in my SVN
development setup. Tests with newer versions are welcome.

* tools/hook-scripts/mailer/mailer.py:
  Drop Python2-specific imports. Adjust imports as per 2to3.
  (main): Decode arguments from UTF-8 to string.
  (OutputBase:write): Encode string to UTF-8 and pass to write_binary().
   OutputBase implementations now need to provide a self.write_binary
   member which implements a write() method for binary data.
  (MailedOutput): The email.Header package is gone; use email.header instead.
   Likewise, replace email.Utils with email.utils.
  (SMTPOutput): Provide self.write_binary in terms of a BytesIO() object.
   We cannot use StringIO since diffs may contain data in arbitrary encodings.
  (StandardOutput): Provide self.write_binary in terms of stdout.buffer.
  (PipeOutput): Provide self.write_binary in terms of pipe.stdin.
  (Commit): Decode log message and paths from UTF-8 to string, and iterate
   path groups from mailer.conf in sorted order.
  (Lock): Decode directory entries from UTF-8 to string. Encode paths back
   to UTF-8 when we ask libsvn_fs for a lock on a path.
   Iterate path groups from mailer.conf in sorted order.
  (DiffGenerator): Decode repository paths from UTF-8 to string.
  (TextCommitRenderer): Decode author, log message, and path from UTF-8 to
   string. Write diff data via write_binary, bypassing the re-encoding step.
  (Config): Decode paths from UTF-8 to string before matching them against
   regular expressions. Also decode the repository directory path from UTF-8.

* tools/hook-scripts/mailer/tests/mailer-t1.output: Adjust expected output.
   File groups are now provided in stable sorted order. This should fix
   spurious test failures in the future.

* tools/hook-scripts/mailer/tests/mailer-tweak.py: Drop L suffix from long
   integers and pass binary data instead of strings into libsvn_fs.
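
The OutputBase change listed for mailer.py above can be sketched as follows.
This is a simplified illustration under stated assumptions, not the actual
implementation: it assumes write_binary is a callable that accepts bytes, and
the class BufferedOutput is a hypothetical stand-in for a subclass such as
SMTPOutput.

```python
import io


class OutputBase:
    """Sketch of a base class whose subclasses must provide self.write_binary."""

    def write(self, text: str) -> None:
        # Text produced by the script is encoded to UTF-8 before it is written.
        self.write_binary(text.encode("utf-8"))


class BufferedOutput(OutputBase):
    """Hypothetical subclass that collects output in memory."""

    def __init__(self) -> None:
        # BytesIO rather than StringIO: diff data may be in any encoding.
        self.buffer = io.BytesIO()
        self.write_binary = self.buffer.write


output = BufferedOutput()
output.write("Log: \u00fc\n")           # string path: encoded to UTF-8
output.write_binary(b"+caf\xe9\n")      # diff path: raw bytes, no re-encoding
```

Keeping a single binary sink means header text and diff bytes end up in the
same stream without the diff ever passing through a decode step.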


Changed paths

Path    Details
subversion/trunk/tools/hook-scripts/mailer/mailer.py    modified, text changed
subversion/trunk/tools/hook-scripts/mailer/tests/mailer-t1.output    modified, text changed
subversion/trunk/tools/hook-scripts/mailer/tests/mailer-tweak.py    modified, text changed
