Phrase frequency analysis tools
-------------------------------

This is very similar to the method used by the mass-check code in the
'masses' dir.

The spam and nonspam corpora are analysed on the 3 machines (jm's laptop,
dogma, and sonic) where this data is kept. This is done using 3 shell
scripts: RUNME, RUNME.DOGMA and RUNME.SONIC. For your own setup, you will
have to write your own analyser scripts.

Then, 'join-and-settle-phrases' is used to rsync the output back to one
machine, where it is summarised into a 'spamwords.freqs' file suitable for
distribution as the spam_phrases SpamAssassin rules file.

Jan 15 2002 jm

Jan 27 2003 jm: this is now thoroughly obsolete...
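
For readers writing their own analyser scripts, the summarising step
described above (combining per-machine output into one frequency list)
might look roughly like the Python sketch below. This is not the code in
'join-and-settle-phrases'. The input line format
'<spam_hits> <nonspam_hits> <phrase>' and the function names
parse_counts, merge_counts and format_freqs are assumptions made for
illustration only; the real 'spamwords.freqs' layout may differ.

```python
from collections import defaultdict


def parse_counts(lines):
    """Parse lines of the ASSUMED form '<spam_hits> <nonspam_hits> <phrase>'."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        spam_hits, nonspam_hits, phrase = line.split(None, 2)
        counts[phrase] = (int(spam_hits), int(nonspam_hits))
    return counts


def merge_counts(per_machine):
    """Sum spam and nonspam hit counts for each phrase across machines."""
    totals = defaultdict(lambda: [0, 0])
    for counts in per_machine:
        for phrase, (spam_hits, nonspam_hits) in counts.items():
            totals[phrase][0] += spam_hits
            totals[phrase][1] += nonspam_hits
    return {phrase: tuple(pair) for phrase, pair in totals.items()}


def format_freqs(totals):
    """Render merged totals, most frequent spam phrases first (hypothetical layout)."""
    ordered = sorted(totals, key=lambda p: (-totals[p][0], p))
    return [f"{totals[p][0]} {totals[p][1]} {p}" for p in ordered]
```

Each machine's output would be passed through parse_counts after being
copied to one place, and the resulting dictionaries handed to
merge_counts as a list. Summing raw counts, rather than averaging
per-machine ratios, keeps a machine with a small corpus from skewing the
result.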