Tryst With Spam - I
Some years back, Dimakh and I went to a client's place to finalize an email server proposal. After the discussion was over and we were about to leave, the company's IT manager casually asked whether spam and virus protection would also be included. "Of course", I replied, equally casually.
Back in the car, Dimakh asked whether I was confident of handling spam and viruses. I replied in the affirmative. With my background in virus writing (from the days of assembly language on MS-DOS), virus protection was nothing new. As for spam, here I was with good experience in managing low-to-medium scale mail servers, good at shell scripting, conversant with the SMTP language, and able to talk in Perl and C; spam didn't seem to be much of an issue. Till then, spam had been just a minor hindrance in my work, never a major obstacle. Very soon I was to realize how wrong I was.
The server was installed and ran smoothly for a few months; qmail was running with SpamAssassin and ClamAV under qmail-scanner. One rainy evening, when I was returning from a session at Amdocs, one of the system administrators at Dimakh Consultants called to tell me that the server's load average had increased. He said "top" showed perl and clamd as the culprits. I told him to stop the SMTP service for a while and also stop SpamAssassin. Back home, I logged in and realized from the logs that most of the mails were addressed to non-existent addresses. OK, so I just had to patch qmail-smtpd to check the user name in the recipient field against the valid mailboxes before accepting the mail. I patched it using a patch found on the net, recompiled, and I was on track again.
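The idea behind that recipient check can be sketched in a few lines. This is only an illustration in Python, not the actual qmail-smtpd patch (which is C code); the function name `check_recipient`, the in-memory set of mailboxes, and the reply texts are all assumptions made for the example.

```python
def check_recipient(rcpt, valid_mailboxes):
    """Return an SMTP-style (code, text) reply for one RCPT TO address.

    valid_mailboxes is a hypothetical set of lower-case addresses; a real
    server would look these up in its own user database.
    """
    # Drop surrounding whitespace and angle brackets, compare case-insensitively.
    address = rcpt.strip().strip("<>").lower()
    if address in valid_mailboxes:
        return 250, "ok"
    # Refusing at RCPT time means the body is never received or scanned.
    return 550, "no mailbox here by that name"


if __name__ == "__main__":
    mailboxes = {"alice@example.com", "bob@example.com"}
    print(check_recipient("<Alice@example.com>", mailboxes))
    print(check_recipient("<nobody@example.com>", mailboxes))
```

The point of rejecting during the SMTP conversation is that mail for unknown users never reaches the scanner, so perl and clamd are not woken up for it.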
But the problems had just started. The users of the server started complaining that the server was slow and that they received a lot of spam. I did whatever I could think of, in the following order.
- tried implementing RBLs.
  The issue was that roaming users had problems sending mail through the POP-before-SMTP service, because their IPs were listed in the RBLs.
- wrote custom filters for SA.
  The results were good for some time, but writing rules and scoring them was a time-consuming process.
- trained Bayes in SA, feeding it thousands of spam/ham messages.
  Same problem as before: it worked well for a few days, but the process had to be continuous.
- started auto-learn for Bayes training.
  Somehow, with Bayes and especially with auto-learn, the load average rose high.
- wrote a filter at the delivery level, so that each mail above a certain score was delivered to the recipient's "spam folder".
  Now the inboxes were a bit freer, but the spam folders were flooded with spam, with some genuine mails lost in their depths.
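The delivery-level filter from the last step can be sketched roughly as below. It is a minimal Python illustration, not the filter I actually ran: it assumes the message carries an `X-Spam-Status` header in SpamAssassin's usual `score=N required=M` form, and the names `choose_folder` and `threshold` are made up for the example.

```python
import re
from email import message_from_string

# Matches the "score=7.2" part of a SpamAssassin-style X-Spam-Status header.
SCORE_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)")


def choose_folder(raw_message, threshold=5.0):
    """Return "spam" or "inbox" for one raw message (hypothetical helper)."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = SCORE_RE.search(status)
    # Mail without a parsable score is treated as ham and goes to the inbox.
    if match and float(match.group(1)) >= threshold:
        return "spam"
    return "inbox"


if __name__ == "__main__":
    sample = (
        "X-Spam-Status: Yes, score=7.2 required=5.0\n"
        "Subject: cheap watches\n"
        "\n"
        "Buy now.\n"
    )
    print(choose_folder(sample))
```

A filter like this only sorts the mail after it has been accepted and scored, which is why it tidied the inboxes without reducing the amount of spam arriving.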
But the spam just kept on coming.
Continued in Tryst With Spam - II