Note for Users Upgrading to SpamAssassin 4.0.1
----------------------------------------------

- The Phishstats.info domain has expired; the "phishing_phishstats_feed" and
  "phishing_phishstats_minscore" options have been removed from the
  Mail::SpamAssassin::Plugin::Phishing plugin.

Note for Users Upgrading to SpamAssassin 4.0.0
----------------------------------------------

Apache SpamAssassin 4.0.0 represents years of work by the project with
numerous improvements, new rule types, and internal native handling of
messages in international languages.

We highly recommend looking through this file and all of the .pre files to
evaluate your configuration thoroughly. Plugins have been added, removed,
and improved throughout.

- All rules, functions, command line options and modules that contain
  "whitelist" or "blacklist" have been renamed to contain more racially
  neutral "welcomelist" and "blocklist" terms. This allows acronyms like WL
  and BL to remain the same. Previous options will continue to work at least
  until version 4.1.0 is released. If you have local settings including
  scores or meta rules referring to old rule names, these should be changed
  and "enable_compat welcomelist_blocklist" added in init.pre.
  See: https://wiki.apache.org/spamassassin/WelcomelistBlocklist (Bug 7826)

- Meta rules no longer use priority values; they are evaluated dynamically
  when the rules they depend on are finished. (Bug 7735)

- API: New $pms->rule_ready() function. Any asynchronous eval-function must
  now return undef (instead of 0 or 1) if the rule result is not ready when
  exiting the function. $pms->rule_ready($rulename) or $pms->got_hit(...)
  must be called when the result has arrived. If these are not used,
  evaluation of dependent meta rules can break.

- Setting normalize_charset is now enabled by default. Note that rules
  should not expect specific non-UTF8 or UTF8 encoding in body.
  Matching is done against the raw data, which may vary depending on the
  normalize_charset setting and whether decoding to UTF8 was successful.
  See: https://wiki.apache.org/spamassassin/WritingRulesAdvanced

- The DKIM plugin has added support for ARC signature verification.

- The DecodeShortURLs plugin has been added and decodes URIs from URL
  shorteners that may be used to evade scanning.

- Strings can now be captured from rules and later reused using the special
  %{TAGNAME} syntax.

- The Bayes stopwords, or noise words, are now configurable in order to
  optimize Bayes usage for non-English languages. Stopwords for 16 foreign
  languages have been included. See 60_bayes_stopwords.cf in the rules
  files. See Mail::SpamAssassin::Plugin::Bayes and the
  bayes_stopword_languages option if you wish to use a different stopword
  list. This is highly recommended if you are using Bayes and you are
  processing messages in languages other than English.

- The OLEVBMacro plugin has been improved to identify more macros while also
  extracting URIs from the attachments for automatic inclusion in RBL
  lookups.

- Internationalized domain name (IDN) support has been added, with a new
  Util::idn_to_ascii() function. It requires the Net::LibIDN2 or
  Net::LibIDN module. (Bug 7215)

- Improved internal header address (From/To/Cc) parser, which now also
  handles multiple addresses and includes optional support for the external
  Email::Address::XS parser, which can handle nested comments and other
  oddities.

- Header :addr and :name modifiers now return all addresses. The :first and
  :last options select only the first (topmost) or last header to process
  when there are multiple headers with the same name. :addr and :name may
  still return multiple values from a single header.

- API: $pms->get() can and should now be called in list context. Scalar
  context continues to return multiple values newline separated, but this
  should be considered deprecated.

- New ExtractText plugin that extracts text from documents or images to feed
  the data into SpamAssassin for standard processing with existing rules.
  URIs extracted from documents will fall into normal RBL lookups.

- New "nolog" tflag added to hide info coming from rules in SpamAssassin
  reports.

- All log output (stderr, file, syslog) is now escaped properly for
  \r \n \t \\, control chars, DEL, and UTF-8 sequences presented as \x{XX}.
  Whitespace is no longer normalized as it was in versions prior to 4.0.0.

- API: Logger::add() has a new optional 'escape' parameter. New
  Logger::escape_str() function.

- API: New $pms->add_uri_detail_list() function. Also new uri_detail_list
  types: unlinked, schemeless.

- Util::split_domain, trim_domain, and is_domain_valid functions have a new
  optional argument ($is_ascii).

- Header names support new :host :domain :ip :revip modifiers.

- AskDNS: tag HEADER(hdrname) is supported to query any header content,
  similarly to header rules.

- The HashCash module and its support have been removed completely, as they
  have long been deprecated.

- URILocalBL: uri_block_cc/uri_block_cont now support negation (Bug 7528)

- URILocalBL: IPv6 lookups for hosts are now supported, if provided by your
  database.

- DNS and other asynchronous lookups such as Pyzor and DCC are now only
  launched when priority -100 is reached. This allows short circuiting at a
  lower priority without sending unneeded DNS queries and forking
  processes. (Bug 5930)

- API: New plugin callback method check_dnsbl added to launch network
  lookups at priority -100, and check_post_dnsbl to harvest own network
  lookups.

- API: New plugin callback method check_cleanup for cleaning up things...

- FreeMail: new options freemail_import_welcomelist_auth and
  freemail_import_def_welcomelist_auth added (Bug 6451)

- New internal Mail::SpamAssassin::GeoDB module that provides a unified
  interface to modules MaxMind::DB::Reader (GeoIP2), Geo::IP,
  IP::Country::DB_File, and IP::Country::Fast.
  This is utilized by RelayCountry and URILocalBL with the settings
  geodb_module, geodb_options, and geodb_search_path. Deprecated settings
  such as country_db_type, country_db_path, uri_country_db_path, and
  uri_country_db_isp_path still work but will print a warning to migrate to
  geodb_module/geodb_options.

- Razor2 razor_fork option added to create separate Razor2 processes and
  read in the results later asynchronously, increasing throughput and
  automatically adjusting rule priorities to -100.

- DCC checks are now done asynchronously if using dccifd, improving
  throughput. With dccifd, rule priorities are automatically adjusted to
  -100. Commercial reputation rules can be ignored with the option
  "use_dcc_rep 0" to save a few CPU cycles.

- Pyzor pyzor_fork option added to create separate Pyzor processes and read
  in the results later asynchronously, increasing throughput and
  automatically adjusting rule priorities to -100. Renamed the pyzor_max
  setting to pyzor_count_min. Added the pyzor_welcomelist_min and
  pyzor_welcomelist_factor settings. Pyzor also tries to reduce false
  positives by ignoring "empty body" messages.

- API: Deprecated the $pms->register_async_rule_start() and
  $pms->register_async_rule_finish() calls, though they are left in for
  backwards compatibility. Plugins should only use
  $pms->bgsend_and_start_lookup(), which handles the required things
  automatically. Direct calls to bgsend or start_lookup should not be used.
  $pms->bgsend_and_start_lookup() should always contain $ent->{rulename}
  for correct meta dependency handling. Deprecated start_lookup,
  get_lookup, lookup_ns, harvest_until_rule_completes, and
  is_rule_complete.

- SPF: Mail::SPF is now the only supported perl module, and
  Mail::SPF::Query is deprecated along with the settings
  do_not_use_mail_spf and do_not_use_mail_spf_query. SPF lookups are not
  done asynchronously; an MTA filter such as pypolicyd-spf or spf-engine
  can instead generate Received-SPF headers for SpamAssassin to parse.

- "ALL" pseudo-header now returns decoded headers, so its usage is
  consistent with single header matching. Using the :raw option mimics the
  previous behavior, with undecoded and folded headers.

- New dns_block_rule option handles blocked DNSBLs (Bug 6728)

- ASN: Support GeoDB for ASN lookups (asn_use_geodb, asn_prefer_geodb,
  asn_use_dns).

- ASN: The default sa-update ruleset doesn't make ASN lookups or add headers
  anymore. Configure the desired methods, asn_use_geodb or asn_use_dns, and
  add_header clauses manually as described in the plugin documentation.
  Usage of asn_use_geodb without DNS is recommended unless ASNCIDR is
  needed. Do not use rules that check the X-ASN metadata header! Only the
  new eval function check_asn() described in the plugin manual works
  reliably.

- sa-update: New --score-multiplier, --score-limit, and --forcemirror
  options added.
  #1 forcemirror: forces sa-update to use a specific mirror server
  #2 score-multiplier: adjusts all scores from the update channel by a
     given multiplier to quickly level set scores to match your preferred
     threshold
  #3 score-limit: adjusts all scores from the update channel over a
     specified limit to a new limit

- New dns_options "nov4" and "nov6" added.
  IMPORTANT: You must set nov6 if your DNS resolver is filtering IPv6 AAAA
  replies.

- API: Added Message::get_pristine_body_digest(), Message::get_msgid(), and
  Message::generate_msgid() functions. Removed the deprecated private
  Plugin::Bayes::get_msgid() function.

- The hashing method for Bayes and TxRep seen Message-ID tracking has
  changed. No action is required. If re-learning some old messages, they
  might be learned twice, but old IDs should expire automatically.

- report_charset now defaults to UTF-8.

- Meta rules inherit the net tflag setting from dependencies (Bug 7735)

- BodyEval: Added plaintext_body_sig_ratio eval rules for the first
  text/plain MIME part's body and signature length ratio.

- API: Now supports multiple calls of $pms->test_log() for rules.
  Added $pms->check_cleanup() to finalize tags, reports, etc. Deprecated
  internal $pms->{test_log_msgs}, renamed to $pms->{test_logs}. Deprecated
  $pms->clear_test_state() as it is not needed anymore. $pms->test_log()
  now accepts $rulename as second argument.

- URIDNSBL: urirhsbl/urirhssub rules support the "notrim" tflag to force
  querying the full hostname instead of just the domain. This works best if
  the specific uribl supports this mode. (Bug 7835)

- Removed the deprecated --auth-ident and --ident-timeout options from
  spamd.

- MIMEHeader: support matching ALL header, tflags range, and tflags concat.

- Autolearn: add new tflags autolearn_header/autolearn_body. These can force
  a rule to count as header or body points accordingly. (Bug 7907)

- SSL client certificate support for spamc/spamd is now easier. New spamc
  options --ssl-cert, --ssl-key, --ssl-ca-file, and --ssl-ca-path. New
  spamd options --ssl-verify, --ssl-ca-file, and --ssl-ca-path (Bug 7267)

- ArchiveIterator now automatically uncompresses all gzip, bzip2, xz, lz4,
  lzip, and lzo-compressed files (Bug 7598). This applies to the
  spamassassin and sa-learn commands also.

- New DMARC policy check plugin.

- New project maintained DecodeShortURLs plugin, which may not be directly
  compatible with rules from other third party plugins. See the plugin
  documentation for configuration and rule format.

- Installing the module Net::CIDR::Lite allows the use of dash-separated IP
  range format (e.g. 192.168.1.1-192.168.255.255) for NetSet tables
  including internal_networks, trusted_networks, msa_networks, and
  uri_local_cidr.

- The HashBL plugin in v342.pre is now enabled by default.

- HeaderEval check_for_unique_subject_id() function is deprecated.

(end of UPGRADE)