Integrating qmail with nixspam and spamassassin

The following instructions may not apply to a high-traffic mail server; I don't really have experience with handling more than a few thousand emails a day. High-traffic sites might want to integrate the solutions outlined below with the QMAILQUEUE patch or something similar.

Introduction

nixspam is a procmail-based spam filter developed by the German computer-science magazine "iX". Spamassassin should be a household name by now, at least with programmers and system administrators, but just so you know: it's a client/server-based Bayesian filtering software.

For a first line of defense against spam, I highly recommend installing qmail-1.03 with John Simpson's combined qmail patch and his jgreylist program and the related service-qmail-smtpd-run script. Together with qmail-conf, it makes setting up qmail a painless experience, gives you lots of integrated advanced features and a scalable greylisting solution. Make good use of the available RBLs (nixspam has a very good one at ix.dnsbl.manitu.net) and rblsmtpd and you've taken care of 60% of your spam already.
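In case you're wondering what an RBL lookup boils down to: the client's IPv4 address is reversed octet by octet, the list's zone is appended, and the resulting name is looked up in DNS. Here is a minimal sketch of that name construction. The function name dnsbl_name is my own invention for illustration and 192.0.2.1 is a documentation address, not a real spam source.

```shell
#!/bin/sh
# Build the DNS name that an RBL lookup queries for an IPv4 address:
# the octets are reversed and the list's zone is appended.
# usage: dnsbl_name IPV4_ADDRESS ZONE   (hypothetical helper)
dnsbl_name() {
    ip=$1
    zone=$2
    # split a.b.c.d into four fields and print them in reverse order
    echo "$ip" | awk -F. -v zone="$zone" \
        '{ print $4 "." $3 "." $2 "." $1 "." zone }'
}

# Example call with a documentation address
dnsbl_name 192.0.2.1 ix.dnsbl.manitu.net
```

If the name resolves, the address is listed. rblsmtpd performs this kind of lookup for you, so the sketch is only meant to show what happens under the hood.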

As promised, I'll stick to the interesting bits and will not explain how to install qmail with the above configuration. John Simpson's page contains a lot of good information on that, anyway.

I'm also assuming that you've set up spamassassin as a daemon. Under Debian Linux that is easy to do by installing the packages spamassassin and spamc. You might want to train spamassassin before using it. A neat trick is that if you're using Thunderbird, you can use your mail profile to train spamassassin, because Thunderbird stores its mail in mbox format. You just switch to your user account and feed your email archive to sa-learn --mbox [--ham|--spam]. Please note that spamassassin wants to create its Bayesian database in the home directory of the user it's running as, so you might want to fiddle with it for a bit if you're using vpopmail, because the resulting ~/.spamassassin directory has to belong to the user vpopmail and the group vchkpw.
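The training step can be sketched like this. It relies on sa-learn, the training tool that ships with spamassassin, and its --mbox, --ham and --spam options. The wrapper function train_mbox and the SA_LEARN override are my own additions for illustration, so treat them as assumptions rather than part of spamassassin.

```shell
#!/bin/sh
# Feed one or more mbox files to SpamAssassin's Bayes trainer.
# usage: train_mbox ham|spam FILE...   (hypothetical helper)
train_mbox() {
    kind=$1
    shift
    case $kind in
        ham|spam) ;;
        *) echo "usage: train_mbox ham|spam FILE..." >&2; return 2 ;;
    esac
    for mbox in "$@"; do
        # SA_LEARN can be set to another command (e.g. echo) for a dry run
        ${SA_LEARN:-sa-learn} --mbox "--$kind" "$mbox"
    done
}
```

You would call it with the mbox files from your Thunderbird profile, for example train_mbox ham followed by the path of your Inbox file. Where exactly the profile lives depends on your installation, so I won't guess a path here.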

Now, first of all: nixspam, unfortunately, is a procmail script, so you'll have to use procmail to deliver your mail. That step in itself is easy to accomplish and I'll show you how in a minute, but you can't easily set this up for a whole domain. You'll need to set up a .qmail file for each user that will use nixspam. For vpopmail and other multi-user systems you might be able to rig something using the EXT and HOST environment variables (see qmail-command(8) for more information), but that is beyond the scope of this post.

Installing procmail is easy, because your OS distribution is sure to include it as a package. Under Debian Linux just run apt-get install procmail and that's it.

Setting it up

Download the procmail script from the ix ftp server and put it in your mail-user's Maildir directory. You'll need to modify it and set up its configuration variables. The documentation in the script is in German, but there aren't a lot of configuration options anyway.

  1. Set MAILDIR to your Maildir's path.

  2. MY_MAILHOST is your email server's hostname. For me, that's simply the machine the script is running on: symbiont.maurus.net

  3. Set MY_FIRSTNAME, MY_LASTNAME and MY_ALIASES; these should be pretty self-explanatory.

  4. CHARSET is a list of character-sets that you can mark as spam because you can't read them anyway. For me, that's at least Chinese, Japanese, Russian and Korean character-sets.

  5. DOMAIN_OK and SUBJECT_OK are whitelists for mailing lists and domains.

  6. A few lines further down, you need to modify MY_MX_IP and MY_MX_NAME accordingly.

  7. You need to set NIXDIR to a directory where nixspam can record the fuzzy MD5 hashes it uses to identify spam. This directory can be shared by all nixspam users, but it needs to be writable from the account that manages email delivery. If you're using plain ol' qmail, then qmail-lspawn will run qmail-local for the delivery under the account that is configured in /var/qmail/users/assign, so you'll need to set up some group permissions so that every mail user can access nixspam's shared directory. For vpopmail you can most likely just assign rights for the vpopmail user and the group vchkpw.

  8. By the way, setting the BACKUP configuration parameter tells nixspam to back up spam mails in an mbox file called $NIXDIR/spam-mail-backup. I don't recommend this. I find it far better to deliver spam to a Maildir that the user can access over a webmail client like IMP. We'll change the default delivery instructions to do just that below.
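For the shared NIXDIR from step 7, something along these lines should do. It's a sketch under assumptions: the function name setup_nixdir is made up, and which directory and group you pass in (I use /var/lib/nixspam and the group mail on my machine) depends on your setup.

```shell
#!/bin/sh
# Create a directory that all mail users in one group can write to.
# usage: setup_nixdir DIR GROUP   (hypothetical helper)
setup_nixdir() {
    dir=$1
    group=$2
    mkdir -p "$dir" || return 1
    chgrp "$group" "$dir" || return 1
    # 2775: rwx for owner and group, r-x for others; the setgid bit
    # makes new files in the directory inherit the directory's group
    chmod 2775 "$dir"
}
```

Run as root, a call like setup_nixdir /var/lib/nixspam mail gives every member of the group write access. Whether the files nixspam creates there end up group-writable also depends on the umask in effect during delivery, so check the result after the first few mails.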

Delivering mail and sorting out spam

Now, the next step is to make sure that email is delivered through procmail. The default delivery instructions at the very end of the procmail script send mails that are very likely spam to /dev/null and reinject all other emails, sending them either to your "real" email account or to yourrealaddress+Spam@mailhost.tld. I prefer to have the spam filter operate on my email directly and then deliver my mail directly into my Maildir mailbox. So I replaced the following code:

# Dispose of spam (the cautious deliver spam to a separate folder);
# deleting it as done here should go hand in hand with "BACKUP=yes" (see above)!
:0 D
* RESULT ?? ^^SPAM^^
/dev/null

# Deliver less certain spam:
:0 D
* RESULT ?? ^^MAYBESPAM^^
!$LOGNAME+$MY_SPAMFOLDER@$MY_MAILHOST

# Insert a vacation recipe here if needed.

# All remaining mail ends up here:
:0
!$LOGNAME@$MY_MAILHOST

with direct delivery instructions:

# deliver spam to my spamfolder ~jonas/Maildir/.Spam
# (the trailing slash makes procmail deliver in Maildir format)
:0 D
* RESULT ?? ^^SPAM^^
$MAILDIR/$MY_SPAMFOLDER/

# deliver mails that are likely spam to the spamfolder
# ~jonas/Maildir/.Spam
:0 D
* RESULT ?? ^^MAYBESPAM^^
$MAILDIR/$MY_SPAMFOLDER/

# emails not identified as spam are piped to the spamassassin client
# that connects to the spamassassin daemon (spamd), but only
# if they're smaller than 250k. I also specify a .lock file to make
# sure a huge flow of emails doesn't spawn a few thousand spamc
# processes. The "f" flag tells procmail to use spamc as a filter.
# If you want to use spamassassin in client mode, i.e. without
# the daemon process, replace spamc with the spamassassin
# executable, but that's *really slow* if you have a big
# token database!
:0fw: spamassassin.lock
* < 256000
| /usr/bin/spamc

# if spamassassin added mail headers that tell us that the message
# is spam (a score of 15+ makes a probability of 99.5%) we deliver
# the message to ~jonas/Maildir/.Spam
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*[\*]*
* ^X-Spam-Flag: Yes
{
    LOG="SPAMASSASSIN_HIT
$LOG"
    :0
    $MAILDIR/$MY_SPAMFOLDER/
}

# all other mail goes to the default mailbox ~jonas/Maildir/
# (again, the trailing slash selects Maildir delivery)
:0
$MAILDIR/
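If you want to check by hand what the X-Spam-Level condition in the recipe would match, you can count the stars in a saved message yourself. The following is only a sketch: spam_level and is_spam are names I made up, and it looks at the X-Spam-Level header alone, whereas the recipe additionally requires X-Spam-Flag: Yes.

```shell
#!/bin/sh
# Print the number of stars in the first X-Spam-Level header of the
# message on stdin (0 if the header is missing).
spam_level() {
    awk '
        /^$/ { exit }                 # headers end at the first blank line
        /^X-Spam-Level:/ {
            n = gsub(/\*/, "*")       # gsub returns the number of matches
            print n
            found = 1
            exit
        }
        END { if (!found) print 0 }
    '
}

# Succeed when the level reaches the threshold (default 15, the same
# number of stars the recipe asks for).
is_spam() {
    [ "$(spam_level)" -ge "${1:-15}" ]
}
```

Something like is_spam < message.eml && echo "would go to .Spam" tells you whether a given mail clears the threshold, where message.eml stands for any message file that has already passed through spamc.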

Please note that my spam mailbox is a Maildir called ".Spam". This is a convention that allows IMAP servers like the excellent dovecot to recognize it as an IMAP folder. If you're using another IMAP server you should look up its folder namespace and choose the name and location of your spam mailbox accordingly. Modern procmail versions can deliver to all kinds of mailbox formats. Creating the spam mailbox is rather easy. I did it by running

# /var/qmail/bin/maildirmake ~jonas/Maildir/.Spam && \
chown -R jonas:mail ~jonas/Maildir/.Spam
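A quick way to make sure the result really is a Maildir is to check for its three subdirectories. This is just a convenience sketch and is_maildir is a name I picked for it.

```shell
#!/bin/sh
# Succeed if DIR contains the three subdirectories of a Maildir.
# usage: is_maildir DIR   (hypothetical helper)
is_maildir() {
    [ -d "$1/cur" ] && [ -d "$1/new" ] && [ -d "$1/tmp" ]
}
```

For example, is_maildir ~jonas/Maildir/.Spam || echo "not a Maildir" complains if one of cur, new or tmp is missing.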

Plugging it in

Now all that's left to do is to make sure that qmail delivers our mail through procmail, which in turn will call spamassassin. For that you need to create .qmail files with the correct delivery instructions. In most cases you need to set up a .qmail file to handle the main account and a .qmail-default file to handle email addresses with qmail-style extensions (prefix-ext@example.com). I'm using the following command in ~jonas/.qmail:

| /usr/bin/procmail -m -t ./Maildir/nixspam.procmailrc

The "-m" command-line parameter switches procmail into "generic mailfilter mode". Because procmail's documentation sucks, it says nowhere what that actually means. Go here to find out. Basically it's more secure, because it ignores /etc/procmailrc. That procmail doesn't switch user in this mode is irrelevant, because qmail-local will run under the recipient's uid anyway. "-t" makes procmail return a temporary error if the delivery fails. This is important, because procmail will otherwise return a permanent error, making the email message bounce on the first error. That's generally not what you want, because a transient problem such as a full disk would then cost you a legitimate mail, whereas after a temporary error qmail keeps the message in its queue and retries the delivery later.
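To make the difference between temporary and permanent errors a bit more tangible, here is how qmail interprets the exit code of a program called from a .qmail file, as far as I understand qmail-command(8). The function qmail_verdict is purely illustrative and not part of qmail. Exit code 75 is EX_TEMPFAIL from sysexits.h, which to my knowledge is what procmail uses for a soft failure.

```shell
#!/bin/sh
# Classify a delivery program's exit code the way qmail-command(8)
# describes it (as I read the man page).
# usage: qmail_verdict EXIT_CODE   (hypothetical helper)
qmail_verdict() {
    case $1 in
        0)   echo "success" ;;
        99)  echo "success, skip remaining .qmail lines" ;;
        100|64|65|70|76|77|78|112) echo "hard failure (bounce)" ;;
        *)   echo "soft failure (retry later)" ;;
    esac
}
```

So qmail_verdict 75 reports a soft failure, which means the message stays in the queue, while qmail_verdict 100 reports a hard failure and the message bounces.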

Maintenance

And that's it, basically. Quite a long post unfortunately, but well worth the effort. nixspam coupled with spamassassin brought my spam per email ratio waaay down. I used to receive about 400 spam mails per day. Now, the last step is to set up a cronjob that performs a couple of necessary maintenance procedures. I'm using the following entries in /etc/cron.d/nixspam:

PATH=/bin:/usr/bin:/usr/sbin
# note: cron has no line continuation, so every job must stay on one line
# process nixspam's files every 2nd day at 4:00am
0  4 */2 *  *     root  cd /var/lib/nixspam && { if [ -f blackmatches ]; then mv blackmatches blacklist; fi; rm -f md5cache; }

# we have 5000+ mails every 7 days or so, so this is the recommended
# setting
10 4 *   *    0     root  cd /var/lib/nixspam && { if [ -f cachematches ]; then mv cachematches spamcache; fi; wget -O cachematches -q http://www.heise.de/ix/nixspam/nixspam.cachematches; chown root:mail cachematches; chmod 664 cachematches; }

# every month replace whitelist with whitematches
15 4 1   *    *     root  cd /var/lib/nixspam && if [ -f whitematches ]; then mv whitematches whitelist; fi

Modify the 2nd entry according to your email traffic so that it runs roughly once every 5,000 to 10,000 emails.
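If you'd rather calculate the interval than guess it, the arithmetic is simple. In this sketch the function name refresh_interval_days and the default target of 7500 mails (the middle of the 5k to 10k range) are my own assumptions.

```shell
#!/bin/sh
# Days between two runs so that roughly TARGET mails pass in between.
# usage: refresh_interval_days MAILS_PER_DAY [TARGET]   (hypothetical helper)
refresh_interval_days() {
    per_day=$1
    target=${2:-7500}
    [ "$per_day" -gt 0 ] || return 2
    days=$(( target / per_day ))
    # never return less than one day
    if [ "$days" -lt 1 ]; then
        days=1
    fi
    echo "$days"
}
```

With about 750 mails a day, refresh_interval_days 750 prints 10, so you would schedule the second entry roughly every ten days instead of weekly.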

Last but not least, it makes sense to remove spam that is older than a certain limit. I chose 14 days and I do it like this in /etc/cron.daily/removespam:

#!/bin/sh

echo "removing spam older than 14 days"

cd ~jonas/Maildir/.Spam || exit 1

# -mtime +14 matches files older than 14 days; a plain "14" would
# only match files that are exactly 14 days old
NEWCOUNT=`find ./new -mtime +14 -type f | wc -l`
CURCOUNT=`find ./cur -mtime +14 -type f | wc -l`

REMCOUNT=$(($NEWCOUNT+$CURCOUNT))

if [ $REMCOUNT -gt 0 ]; then
  echo removing ca. $REMCOUNT messages...
  find ./new -mtime +14 -type f -exec rm \{\} \;
  find ./cur -mtime +14 -type f -exec rm \{\} \;
  echo done.
else
  echo "nothing to do"
fi
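If you have more than one spam folder to clean, the same idea can be wrapped in a small function that takes the Maildir and the age limit as parameters. Again, this is a sketch and purge_old_spam is a name I made up for it.

```shell
#!/bin/sh
# Delete messages older than DAYS days from a Maildir's new/ and cur/.
# usage: purge_old_spam MAILDIR DAYS   (hypothetical helper)
purge_old_spam() {
    dir=$1
    days=$2
    # refuse to run on anything that doesn't look like a Maildir
    [ -d "$dir/new" ] && [ -d "$dir/cur" ] || return 1
    # -mtime +N matches files modified more than N days ago
    find "$dir/new" "$dir/cur" -type f -mtime +"$days" -exec rm -f {} +
}
```

Calling purge_old_spam ~jonas/Maildir/.Spam 14 has the same effect as the script above, minus the status messages.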

Conclusion

And that's how it's done. Hopefully it'll work as well for you as it did for me. Don't hesitate to write a comment or contact me if you have anything to add or discovered something new!