Note for Users Upgrading to SpamAssassin 4.0.1
----------------------------------------------

- The Phishstats.info domain has expired, so the
  "phishing_phishstats_feed" and "phishing_phishstats_minscore"
  options have been removed from the
  Mail::SpamAssassin::Plugin::Phishing plugin.

Note for Users Upgrading to SpamAssassin 4.0.0
----------------------------------------------

Apache SpamAssassin 4.0.0 represents years of work by the project with
numerous improvements, new rule types, and internal native handling
of messages in international languages. We highly recommend looking
through this file and all of the .pre files to evaluate your
configuration thoroughly. Plugins have been added, removed, and
improved throughout.

- All rules, functions, command line options and modules that contain
  "whitelist" or "blacklist" have been renamed to contain more
  racially neutral "welcomelist" and "blocklist" terms. This allows
  acronyms like WL and BL to remain the same. Previous options will
  continue to work at least until version 4.1.0 is released. If you have
  local settings including scores or meta rules referring to old rule
  names, these should be changed and "enable_compat
  welcomelist_blocklist" added in init.pre. See:
  https://wiki.apache.org/spamassassin/WelcomelistBlocklist (Bug 7826)

- Meta rules no longer use priority values; they are evaluated
  dynamically when the rules they depend on are finished. (Bug 7735)

- API: New $pms->rule_ready() function. Any asynchronous eval-function
  must now return undef (instead of 0 or 1), if rule result is not
  ready when exiting the function. $pms->rule_ready($rulename) or
  $pms->got_hit(...) must be called when the result has arrived. If
  these are not used, evaluation of dependent meta rules can break.
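
As a rough illustration of the two items above, the following Python
simulation models dependency-driven meta evaluation. It is a sketch only,
not SpamAssassin code: the Scan class, its methods, and the rule names are
hypothetical, and None stands in for Perl's undef.

```python
# Illustrative simulation only; none of these names exist in SpamAssassin.

class Scan:
    def __init__(self, metas):
        # metas: {meta_name: (list of dependency names, function of results)}
        self.metas = metas
        self.results = {}  # rule name -> 0/1, present only once the rule is ready

    def rule_ready(self, name, hit):
        """Record a finished rule, then evaluate metas whose deps are all ready."""
        self.results[name] = hit
        progress = True
        while progress:
            progress = False
            for meta, (deps, func) in self.metas.items():
                if meta not in self.results and all(d in self.results for d in deps):
                    self.results[meta] = int(bool(func(self.results)))
                    progress = True

    def run_eval(self, name, func):
        """Run an eval function; None means 'result not ready yet'."""
        hit = func()
        if hit is not None:
            self.rule_ready(name, hit)


scan = Scan({
    "META_BOTH": (["SYNC_RULE", "ASYNC_RULE"],
                  lambda r: r["SYNC_RULE"] and r["ASYNC_RULE"]),
})
scan.run_eval("SYNC_RULE", lambda: 1)        # finishes immediately
scan.run_eval("ASYNC_RULE", lambda: None)    # lookup still pending
pending = "META_BOTH" not in scan.results    # the meta rule has to wait
scan.rule_ready("ASYNC_RULE", 1)             # the answer arrives later
```

If rule_ready() were never called for ASYNC_RULE in this model, META_BOTH
would never be evaluated, which mirrors the breakage described above.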

- Setting normalize_charset is now enabled by default. Note that rules
  should not expect specific non-UTF8 or UTF8 encoding in
  body. Matching is done against the raw data which may vary depending
  on normalize_charset setting and whether decoding to UTF8 was
  successful. See:
  https://wiki.apache.org/spamassassin/WritingRulesAdvanced

- DKIM plugin has added support for ARC signature verification

- The DecodeShortURLs plugin has been added and decodes URIs from URL
  shorteners that may be used to evade scanning

- Strings can now be captured from rules and later reused using the
  special %{TAGNAME} syntax

- The Bayes stopwords, or noise words, are now configurable in order
  to optimize Bayes usage for non-English languages. Stopwords for 16
  foreign languages have been included. See 60_bayes_stopwords.cf in
  the rules files. See Mail::SpamAssassin::Plugin::Bayes and the
  bayes_stopword_languages option if you wish to use a different
  stopword list. This is highly recommended if you are using Bayes and
  you are processing messages in languages other than English.

- The OLEVBMacro plugin has been improved to identify more macros
  while also extracting URIs from the attachments for automatic
  inclusion in RBL lookups

- Internationalized domain name (IDN) support has been added, along
  with a new Util::idn_to_ascii() function. It requires the
  Net::LibIDN2 or Net::LibIDN module. (Bug 7215)
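
To show what an IDN-to-ASCII conversion produces, here is a minimal Python
sketch. It uses Python's built-in "idna" codec (IDNA 2003), which is an
assumption made for illustration; it is not Util::idn_to_ascii() and may
differ from Net::LibIDN2 for some names. The helper name to_ascii is
hypothetical.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (punycode) form.

    Uses Python's stdlib "idna" codec; results for edge cases may differ
    from what SpamAssassin's IDN libraries return.
    """
    return domain.encode("idna").decode("ascii")


converted = to_ascii("münchen.example")
unchanged = to_ascii("example.com")   # plain ASCII names pass through
```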

- Improved internal header address (From/To/Cc) parser, now also
  handles multiple addresses and includes optional support for
  external Email::Address::XS parser, which can handle nested comments
  and other oddities.

- Header :addr :name modifiers now return all addresses. The :first
  and :last options select only the first (topmost) or last header to
  process when there are multiple headers with the same name. :addr
  and :name may still return multiple values from a single header.
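
The selection behavior described above can be pictured with a short Python
sketch. This is an analogy built on the standard email.utils parser, not
SpamAssassin's own address parser; header_addrs and its "select" argument
are hypothetical names.

```python
from email.utils import getaddresses


def header_addrs(headers, name, select=None):
    """Return all addresses from headers called `name`.

    headers: list of (name, value) pairs in message order, topmost first.
    select:  None (all headers), "first" (topmost only) or "last".
    """
    values = [v for n, v in headers if n.lower() == name.lower()]
    if select == "first":
        values = values[:1]
    elif select == "last":
        values = values[-1:]
    # A single header may still yield several addresses.
    return [addr for _display, addr in getaddresses(values) if addr]


headers = [
    ("To", "Alice <alice@example.com>, bob@example.org"),
    ("To", "carol@example.net"),
]
all_addrs = header_addrs(headers, "To")
first_addrs = header_addrs(headers, "To", "first")
last_addrs = header_addrs(headers, "To", "last")
```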

- API: $pms->get() can and should now be called in list
  context. Scalar context continues to return multiple values newline
  separated, but this should be considered deprecated.

- New ExtractText plugin that extracts text from documents or images
  to feed the data into SpamAssassin for standard processing with
  existing rules. URIs extracted from documents are included in
  normal RBL lookups.

- New "nolog" tflag added to hide info coming from rules in
  SpamAssassin reports

- All log output (stderr, file, syslog) is now escaped properly for \r
  \n \t \\, control chars, DEL, and UTF-8 sequences presented as
  \x{XX}.  Whitespace is not normalized anymore like in versions prior
  to 4.0.0.
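
The following Python sketch shows one way such escaping could look. It
assumes byte-wise escaping with two uppercase hex digits per byte; the
exact output format of the real logger is not confirmed here, and
escape_log is a hypothetical name.

```python
# Assumed mapping for the characters that get a short named escape.
_NAMED = {0x0D: "\\r", 0x0A: "\\n", 0x09: "\\t", 0x5C: "\\\\"}


def escape_log(data: bytes) -> str:
    """Escape raw log bytes so every line stays printable ASCII."""
    out = []
    for b in data:
        if b in _NAMED:
            out.append(_NAMED[b])
        elif b < 0x20 or b >= 0x7F:
            # Other control chars, DEL, and the bytes of UTF-8 sequences.
            out.append("\\x{%02X}" % b)
        else:
            out.append(chr(b))
    return "".join(out)


line = escape_log(b"a\tb\r\n")
utf8 = escape_log("é".encode("utf-8"))
```

Note that, in this sketch, runs of whitespace are left untouched rather
than collapsed, in line with the statement above that whitespace is no
longer normalized.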

- API: Logger::add() has new optional 'escape' parameter.  New
  Logger::escape_str() function.

- API: New $pms->add_uri_detail_list() function. Also new
  uri_detail_list types: unlinked, schemeless

- Util::split_domain, trim_domain, and is_domain_valid functions have
  a new optional argument ($is_ascii)

- Header names support new :host :domain :ip :revip modifiers

- AskDNS: tag HEADER(hdrname) supported to query any header content
  similarly to header rules

- The HashCash module and support has been removed completely, as it
  has been long since deprecated

- URILocalBL: uri_block_cc/uri_block_cont now support negation (Bug
  7528)

- URILocalBL: IPv6 lookups for hosts are now supported, if provided
  by your database

- DNS and other asynchronous lookups such as Pyzor and DCC are now
  only launched when priority -100 is reached. This allows short
  circuiting at a lower priority without sending unneeded DNS queries
  and starting process forks. (Bug 5930)
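
The effect of this ordering can be sketched with a small Python model. It
is a simplified illustration under stated assumptions, not the actual
scheduler: rules run in ascending priority order, network lookups are
launched once the first rule with priority -100 or higher is reached, and
run_scan plus all rule names are hypothetical.

```python
def run_scan(rules, shortcircuit_on=None):
    """Return the names of network lookups that were launched.

    rules: list of (priority, name, is_network, hits) tuples.
    shortcircuit_on: name of a rule that stops the scan when it hits.
    """
    launched = []
    lookups_started = False
    for priority, name, is_network, hits in sorted(rules, key=lambda r: r[0]):
        if priority >= -100 and not lookups_started:
            # Priority -100 reached: launch every network lookup now.
            launched = [n for _p, n, net, _h in rules if net]
            lookups_started = True
        if hits and name == shortcircuit_on:
            break  # short circuit: nothing after this rule runs
    return launched


rules = [
    (-1000, "LOCAL_WELCOMELIST", False, True),
    (-100, "DNSBL_LOOKUP", True, False),
    (0, "BODY_RULE", False, True),
]
with_shortcircuit = run_scan(rules, shortcircuit_on="LOCAL_WELCOMELIST")
without_shortcircuit = run_scan(rules)
```

In this model, a rule that short circuits at priority -1000 ends the scan
before any lookup is started, so no DNS query is sent.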

- API: New plugin callback method check_dnsbl added to launch network
  lookups at priority -100, and check_post_dnsbl to harvest the
  plugin's own network lookups

- API: New plugin callback method check_cleanup for cleaning up
  things...

- FreeMail: new options freemail_import_welcomelist_auth and
  freemail_import_def_welcomelist_auth added (Bug 6451)

- New internal Mail::SpamAssassin::GeoDB module that provides a
  unified interface to modules MaxMind::DB::Reader (GeoIP2), Geo::IP,
  IP::Country::DB_File, and IP::Country::Fast.

  This is utilized by RelayCountry and URILocalBL with settings
  geodb_module, geodb_options, and geodb_search_path.

  Deprecated settings such as country_db_type, country_db_path,
  uri_country_db_path, and uri_country_db_isp_path still work but
  print a warning to migrate to geodb_module/options.

- Razor2 razor_fork option added to create separate Razor2 processes
  and read in the results later asynchronously, increasing throughput,
  and automatically adjusting rule priorities to -100.

- DCC checks are now done asynchronously if using dccifd, improving
  throughput.  With dccifd, rule priorities are automatically adjusted
  to -100.  Commercial reputation rules can be ignored with the option
  "use_dcc_rep 0" to save a few CPU cycles.

- Pyzor pyzor_fork option added to create separate Pyzor processes and
  read in the results later asynchronously, increasing throughput, and
  automatically adjusting rule priorities to -100. Renamed pyzor_max
  setting to pyzor_count_min. Added pyzor_welcomelist_min and
  pyzor_welcomelist_factor settings. Also tries to reduce false
  positives by ignoring "empty body" messages.

- API: deprecated $pms->register_async_rule_start() and
  $pms->register_async_rule_finish() calls though left in for
  backwards compatibility. Plugins should only use
  $pms->bgsend_and_start_lookup(), which handles required things
  automatically. Direct calls to bgsend or start_lookup should not be
  used.  $pms->bgsend_and_start_lookup() should always contain
  $ent->{rulename} for correct meta dependency handling. Deprecated
  start_lookup, get_lookup, lookup_ns, harvest_until_rule_completes,
  and is_rule_complete.

- SPF: Mail::SPF is now the only supported perl module and
  Mail::SPF::Query is deprecated along with the settings
  do_not_use_mail_spf and do_not_use_mail_spf_query. SPF lookups are
  not done asynchronously, so consider using an MTA filter such as
  pypolicyd-spf or spf-engine to generate Received-SPF headers for
  SpamAssassin to parse.

- "ALL" pseudo-header now returns decoded headers, so its usage is
  consistent with single header matching. Using the :raw option mimics
  the previous behavior with undecoded and folded headers.

- New dns_block_rule option handles blocked DNSBLs (Bug 6728)

- ASN: Support GeoDB for ASN lookups (asn_use_geodb, asn_prefer_geodb,
  asn_use_dns).

- ASN: Default sa-update ruleset doesn't make ASN lookups or add
  headers anymore. Configure desired methods, asn_use_geodb or
  asn_use_dns, and add_header clauses manually as described in the
  plugin documentation. Usage of asn_use_geodb without DNS is
  recommended unless ASNCIDR is needed. Do not use rules that check
  the metadata X-ASN header! Only the new eval function check_asn()
  described in the plugin manual works reliably.

- sa-update: New --score-multiplier, --score-limit, and --forcemirror
  options added.
    #1 forcemirror: forces sa-update to use a specific mirror server
    #2 score-multiplier: adjusts all scores from the update channel by
     a given multiplier to quickly level set scores to match your
     preferred threshold
    #3 score-limit: adjusts all scores from the update channel over a
     specified limit to a new limit
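
The arithmetic behind the two score options can be sketched in Python as
follows. This is an illustration of the descriptions above, not sa-update's
code: the function name is hypothetical, and applying the multiplier before
the limit, as well as leaving scores below the limit untouched, are
assumptions.

```python
def adjust_score(score, multiplier=None, limit=None):
    """Apply an optional multiplier, then cap the result at an optional limit."""
    if multiplier is not None:
        score = score * multiplier
    if limit is not None and score > limit:
        score = limit  # scores over the limit are lowered to the limit
    return score


halved = adjust_score(4.0, multiplier=0.5)   # scaled toward a lower threshold
capped = adjust_score(7.5, limit=5.0)        # over the limit, so capped
untouched = adjust_score(3.0, limit=5.0)     # under the limit, unchanged
```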

- New dns_options "nov4" and "nov6" added.  IMPORTANT: You must set
  nov6 if your DNS resolver is filtering IPv6 AAAA replies.

- API: Added Message::get_pristine_body_digest(),
  Message::get_msgid(), and Message::generate_msgid()
  functions. Removed deprecated private Plugin::Bayes::get_msgid()
  function.

- Bayes and TxRep seen Message-ID tracking hashing method changed.  No
  actions are required. If re-learning some old messages, they might
  be learned twice but old IDs should expire automatically.

- report_charset now defaults to UTF-8.

- Meta rules inherit net tflag setting from dependencies (Bug 7735)

- BodyEval: Added plaintext_body_sig_ratio eval rules for the first
  text/plain MIME part's body and signature length ratio.

- API: Now supports multiple calls of $pms->test_log() for
  rules. Added $pms->check_cleanup() to finalize tags, reports,
  etc. Deprecated internal $pms->{test_log_msgs}, renamed to
  $pms->{test_logs}. Deprecated $pms->clear_test_state() as it is not
  needed anymore. $pms->test_log() now accepts $rulename as second
  argument.

- URIDNSBL: urirhsbl/urirhssub rules support "notrim" tflag to force
  querying the full hostname instead of just the domain. This works
  best if the specific uribl supports this mode. (Bug 7835)

- Removed deprecated --auth-ident and --ident-timeout options from
  spamd

- MIMEHeader: support matching ALL header, tflags range, and tflags
  concat

- Autolearn: add new tflags autolearn_header/autolearn_body. These can
  force a rule to count as header or body points accordingly. (Bug
  7907)

- SSL client certificate support for spamc/spamd is now easier. New
  spamc options --ssl-cert, --ssl-key, --ssl-ca-file, and
  --ssl-ca-path. New spamd options --ssl-verify, --ssl-ca-file, and
  --ssl-ca-path (Bug 7267)

- ArchiveIterator now automatically uncompresses all gzip, bzip2, xz,
  lz4, lzip, and lzo-compressed files (Bug 7598). This also applies to
  the spamassassin and sa-learn commands.

- New DMARC policy check plugin.

- New project-maintained DecodeShortURLs plugin which may not be
  directly compatible with rules from other third party plugins. See
  the plugin documentation for configuration and rule format.

- Installing module Net::CIDR::Lite allows the use of dash-separated
  IP range format (e.g. 192.168.1.1-192.168.255.255) for NetSet tables
  including internal_networks, trusted_networks, msa_networks, and
  uri_local_cidr.
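
To see which networks a dash-separated range covers, the range can be
expanded into CIDR blocks. The Python sketch below uses the standard
ipaddress module for this; it only illustrates the notation and is not how
Net::CIDR::Lite or SpamAssassin process the range. The helper name
range_to_cidrs is hypothetical.

```python
import ipaddress


def range_to_cidrs(spec):
    """Expand 'first-last' into the smallest list of covering CIDR blocks."""
    first, last = (ipaddress.ip_address(part.strip())
                   for part in spec.split("-", 1))
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


single_block = range_to_cidrs("192.168.1.0-192.168.1.255")
two_hosts = range_to_cidrs("10.0.0.1-10.0.0.2")
```

A range that does not start and end on a network boundary, such as the
example in the item above, expands into several blocks of different sizes,
which is where the dash-separated form saves typing.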

- The HashBL plugin in v342.pre is now enabled by default.

- HeaderEval check_for_unique_subject_id() function is deprecated.

(end of UPGRADE)