Debianizing the DSPAM Email Spam Filter
DSPAM is a scalable, statistical spam filter. Quoting Federico Sevilla III, who submitted the original RFP:
"DSPAM (as in De-Spam) is an open-source project to create a new kind of anti-spam mechanism, and is currently effective as both a server-side agent for UNIX email servers and a developer's library for mail clients, other anti-spam tools, and similar projects requiring drop-in spam filtering.
The DSPAM agent masquerades as the email server's local delivery agent and filters/learns spams using an advanced Bayesian statistical approach (based on Bayes' theorem of combined probabilities) which provides an administratively maintenance-free, easy-learning Anti-Spam service custom tailored to each individual user's behavior. Advanced because on top of standard Bayesian filtering is also incorporated the use of Chained Tokens, de-obfuscation, and other enhancements. DSPAM works great with Sendmail and Exim, and should work well with any other MTA that supports an external local delivery agent (postfix, qmail, etc.)"
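To make the idea of Bayesian filtering with chained tokens a little more concrete, here is a minimal Python sketch. It is not DSPAM's code and does not reproduce its algorithm: the names `tokenize`, `ToyFilter`, `learn`, `token_probability`, and `score` are hypothetical, the smoothing is a simple add-one assumption, and the combination step is the textbook naive Bayes formula. It only shows how per-token probabilities, including those of adjacent-word pairs, could be combined into one spam score.

```python
from collections import Counter
from math import prod


def tokenize(text):
    """Return single words plus 'chained' pairs of adjacent words."""
    words = text.lower().split()
    chains = [f"{a}+{b}" for a, b in zip(words, words[1:])]
    return words + chains


class ToyFilter:
    """Hypothetical toy filter; an illustration only, not DSPAM's implementation."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam messages containing it
        self.ham = Counter()   # token -> number of legitimate messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        """Train on one message; each token counts once per message."""
        tokens = set(tokenize(text))
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def token_probability(self, token):
        """Estimate P(spam | token) with add-one smoothing (an assumption)."""
        s = (self.spam[token] + 1) / (self.n_spam + 2)
        h = (self.ham[token] + 1) / (self.n_ham + 2)
        return s / (s + h)

    def score(self, text):
        """Combine token probabilities with the naive Bayes formula."""
        probs = [self.token_probability(t) for t in set(tokenize(text))]
        p = prod(probs)
        q = prod(1 - x for x in probs)
        return p / (p + q)


if __name__ == "__main__":
    f = ToyFilter()
    f.learn("buy cheap pills now", is_spam=True)
    f.learn("meeting notes for monday", is_spam=False)
    print(f.score("cheap pills"))
    print(f.score("meeting notes"))
```

In this sketch, a score above 0.5 leans toward spam and a score below 0.5 leans toward legitimate mail. Calling `learn` again with a corrected label plays the role of the per-user retraining that the quote describes. A real filter would also need to guard against numeric underflow on long messages, which this toy version ignores.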
Users correct false negatives by forwarding them from their mail client to their "spam" mailbox. False positives can be reviewed in a web browser (screenshot). DSPAM uses both actions to train itself.
There has been quite a bit of interest lately in packaging this application for Debian. Discussion to date has been carried out on #195948. The work accomplished so far has been assembled from the patches sent to #195948, tagged to reflect the submissions, and added to the Alioth pkg-dspam CVS (obsoleted).
We now have a maintainer team for dspam, and we would like to introduce ourselves:
We discuss our changes on pkg-dspam-misc. If you want to take part in the discussions, please join the mailing list. If you have questions about dspam or about the packaging of dspam in Debian, please ask them there.
We have moved to Subversion, and you can find the project here. Most of the changes from the CVS repository have been merged into SVN, and the CVS repository is now obsolete.
- The first diff was contributed by Roger Keays on 15 June 2004. It is tagged as KEAYS.
- The second diff was contributed by Tommi Virtanen on 20 August 2004. Virtanen is a Debian Developer. He updated and polished the Keays diff and is apparently using the resulting .deb in production. He used exim and db4. The CVS tag is r3_0_0_virtanen.
- The third diff was the culmination of work done by Tim Small between 12 October and 30 November 2004. Small switched to MySQL and worked on configuration issues (postinst and postrm). The CVS tag is
- The fourth diff is from Jesus Climent, 16 December 2004. He took advantage of the upstream change in release 3.2 from build-time configuration to run-time configuration. Climent split the package to support mysql, postgresql, sqlite, and db4, moving from a CDBS implementation to one based on a makefile. He also eliminated the .../cgi/*.cgi.in files, which were not accepted upstream, and removed the postinst/postrm files (they are in .../debian/Attic). The CVS tag is
File bugs?
Are you sure you found a bug? Well, mostly it's a feature, but if you consider it a bug...
Please file bugs in the BTS using the reportbug tool, and be as detailed as possible. See Filing bugs.
Good documentation can be found at: DSPAM