Unofficial empeg BBS

#130936 - 15/12/2002 12:20 Mozilla 1.3alpha's anti-spam features
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
I've only just started playing with this, but I already like it. The documentation is crap, but the general idea is that every message is in one of three states: unknown, junk, or non-junk. Every message starts out as unknown, and then you classify it by hand. Sadly, the GUI does not differentiate between unknown and non-junk; to find out why, you have to read the relevant bug on Bugzilla, which seems to indicate that tri-state vs. dual-state is a topic of internal debate.

But, once you get past all that, it just starts working. The places you can take this sort of technology are limitless. After all, why not have one classification per folder to which you refile messages, and have Mozilla figure out what the messages in these folders have in common with each other?
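The one-classification-per-folder idea can be sketched as a small multi-class naive Bayes classifier. This is only an illustration under stated assumptions: the `FolderClassifier` class, its method names, and the add-one smoothing are hypothetical choices, not anything Mozilla is known to implement.

```python
import math
from collections import Counter, defaultdict


class FolderClassifier:
    """Hypothetical sketch: learn which words are typical of each folder."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # folder -> token -> messages containing it
        self.message_counts = Counter()           # folder -> messages filed there

    def train(self, folder, text):
        # Count each distinct token once per message.
        self.token_counts[folder].update(set(text.lower().split()))
        self.message_counts[folder] += 1

    def suggest(self, text):
        """Return the folder with the highest naive Bayes score, or None if untrained."""
        tokens = set(text.lower().split())
        total = sum(self.message_counts.values())
        best, best_score = None, -math.inf
        for folder, n in self.message_counts.items():
            score = math.log(n / total)  # prior: how busy the folder is
            for tok in tokens:
                # Add-one smoothing so unseen words do not zero out a folder.
                score += math.log((self.token_counts[folder][tok] + 1) / (n + 2))
            if score > best_score:
                best, best_score = folder, score
        return best
```

After filing a handful of messages by hand with `train("empeg", ...)` and `train("work", ...)`, calling `suggest(new_message_text)` returns the folder whose past messages share the most vocabulary with the new one.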

Meanwhile, here's an amusing statistic: my 'training.dat' file, which I built using about two months' worth of my inbox, is currently 1.5MB. It's a binary file. If you read it anyway, you see a list of (text string, 32-bit number, 32-bit number) tuples -- no doubt the counts of how often each word occurs in junk and in non-junk messages. If I run 'strings | wc -w' on it, I get 51701 words. If you read it over, you see that they're not very bright yet about different forms of whitespace like tabs, and it seems that they're lowercasing all the words, which might lessen their chances of noticing MAKE MONEY FAST spams. Also, they're throwing away any context of where they found a given word (subject, from, to, body, etc.), which I'd normally think would be worth keeping around.
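To make the (word, count, count) idea concrete, here is a minimal sketch of a classifier built on per-token junk and non-junk counts. It is an assumption-laden illustration, not Mozilla's actual algorithm or file format: the `TokenCounts` class, the `lowercase` switch, and the add-one smoothing are all hypothetical.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"\S+")  # any run of non-whitespace, so tabs and newlines split too


def tokenize(text, lowercase=True):
    """Split on whitespace; lowercase=False would keep MAKE MONEY FAST distinct."""
    tokens = TOKEN_RE.findall(text)
    return [t.lower() for t in tokens] if lowercase else tokens


class TokenCounts:
    """Hypothetical in-memory analogue of the (word, junk count, good count) tuples."""

    def __init__(self):
        self.junk = Counter()  # token -> junk messages containing it
        self.good = Counter()  # token -> non-junk messages containing it
        self.n_junk = 0
        self.n_good = 0

    def train(self, text, is_junk):
        tokens = set(tokenize(text))
        if is_junk:
            self.junk.update(tokens)
            self.n_junk += 1
        else:
            self.good.update(tokens)
            self.n_good += 1

    def junk_probability(self, text):
        """Naive Bayes estimate in log-odds form, with add-one smoothing."""
        log_odds = math.log((self.n_junk + 1) / (self.n_good + 1))
        for tok in set(tokenize(text)):
            p_junk = (self.junk[tok] + 1) / (self.n_junk + 2)
            p_good = (self.good[tok] + 1) / (self.n_good + 2)
            log_odds += math.log(p_junk / p_good)
        log_odds = max(min(log_odds, 700.0), -700.0)  # keep exp() in range
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Messages classified by hand feed `train()`, and anything still "unknown" can be scored with `junk_probability()` and compared against a threshold of your choosing.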

In another week or two, I should be able to have some false-positive / false-negative rates. Right now, at least for the trickle of e-mail that showed up this afternoon, it's flawless.

#130937 - 15/12/2002 22:26 Re: Mozilla 1.3alpha's anti-spam features [Re: DWallach]
tman
carpal tunnel

Registered: 24/12/2001
Posts: 5528
If you're not using Mozilla then SpamAssassin works quite nicely if you're looking for something to semi-intelligently judge whether something is spam or not. Vipul's Razor is more reliable as it uses an online database of spam which it checks incoming messages against. If you've got Outlook then SpamNet does the same thing.
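The shared-database approach can be sketched roughly as follows. This is a toy stand-in, not Razor's or SpamNet's real protocol or signature scheme: the in-memory set plays the role of the online database, and `body_digest`, `SharedSpamDatabase`, and the use of a plain SHA-1 over whitespace-normalized text are all assumptions made for illustration.

```python
import hashlib


def body_digest(body):
    """Hash the body after collapsing whitespace and case (assumed normalization)."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class SharedSpamDatabase:
    """Hypothetical local stand-in for a database of reported spam digests."""

    def __init__(self):
        self._digests = set()

    def report(self, body):
        # A user flags a message as spam; only its digest is stored.
        self._digests.add(body_digest(body))

    def is_known_spam(self, body):
        # Incoming mail is checked against digests reported by others.
        return body_digest(body) in self._digests
```

An exact hash like this only catches copies that are identical after normalization, so a spammer who varies the text slightly would slip past it.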

- Trevor

#130938 - 16/12/2002 12:53 Re: Mozilla 1.3alpha's anti-spam features [Re: tman]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
SpamAssassin is also Outlook only... I'm currently using Mailwasher which is working fairly well and automated (its documentation is very loose, so to get it fully automated you have to dig for yourself in its settings). I'm using it with one of the better email programs available for Windows, TheBat.

Just found this last night: http://sourceforge.net/projects/popfile/ Seems interesting. A bit more to configure, but far more control than something as simple as Mailwasher. MW's cool features involve the simple setup and ability to bounce. TheBat actually includes all that's necessary to implement MW's functionality as well, but it's not currently set up to automatically filter remote mail.

Bruno
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#130939 - 16/12/2002 13:53 Re: Mozilla 1.3alpha's anti-spam features [Re: hybrid8]
tman
carpal tunnel

Registered: 24/12/2001
Posts: 5528
Actually, SpamAssassin runs under *NIX, but there is an Outlook version. I personally use SpamNet and it works quite nicely. Nearly all of the spam is caught.

SpamAssassin is an open-source product aimed at UNIX systems. However, Deersoft sell Exchange and Outlook versions of SpamAssassin, and there is also Spamnix, a commercial Eudora plug-in. Other interfaces for plugging SpamAssassin into your mail systems are listed on this page.

- Trevor

#130940 - 16/12/2002 16:11 Re: Mozilla 1.3alpha's anti-spam features [Re: DWallach]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
FYI, so far there have been zero false positives, and the only false negatives are from a Taiwanese spammer whose messages consist solely of a MIME-encoded GIF image with no text. This means that Mozilla has to latch onto a scarce few e-mail headers that look suspicious (e.g., "Received: from mydomain.com ..."). Assuming I get more from the GIF-only spammer, Mozilla should eventually learn some useful patterns...
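For an image-only message like that, the headers are the only text a filter has to work with. Below is a rough sketch of pulling tokens out of headers and tagging each with where it was found, using Python's standard `email` module. The `context_tokens` function, the `header:word` and `part:type` token formats, and the splitting regex are hypothetical choices, not Mozilla's tokenizer.

```python
import email
import re

# Split header values on whitespace and common punctuation (assumed rule).
WORD_RE = re.compile(r'[^\s;,<>()"]+')


def context_tokens(raw_message):
    """Return tokens prefixed with the header or part they came from."""
    msg = email.message_from_string(raw_message)
    tokens = []
    for name, value in msg.items():
        for word in WORD_RE.findall(value):
            tokens.append(f"{name.lower()}:{word.lower()}")
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            for word in payload.decode("latin-1").split():
                tokens.append(f"body:{word.lower()}")
        elif not part.is_multipart():
            # Non-text leaf (e.g. a GIF): record only its MIME type.
            tokens.append(f"part:{part.get_content_type()}")
    return tokens
```

A GIF-only message yields no `body:` tokens at all, only things like `received:mydomain.com` and `part:image/gif`, which is all a learner could latch onto.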

#130941 - 16/12/2002 17:28 Re: Mozilla 1.3alpha's anti-spam features [Re: tman]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5682
Loc: London, UK
Actually, SpamAssassin runs under *NIX, but there is an Outlook version.

Yeah, I run SpamAssassin via procmail on my Unix box. It works a treat. Very few false positives (just a few mailing list posts that I'm not too bothered about). There have been a few false negatives, but it's cut my spam by quite a lot.

I'm generally happy with it, but I might install TMDA as well.
_________________________
-- roger

#130942 - 17/12/2002 07:57 Re: Mozilla 1.3alpha's anti-spam features [Re: Roger]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Setting up procmail won't be so easy now that my domain host has disabled SSH... I could kill 99% of all my spam with procmail alone. I've been meaning to set it up but now shell access to the server is gone. I could also just completely toss any forwarding of my hybrid8.com domain and that would kill nearly 99% as well. So many choices... So much laziness.

I just noticed Mailwasher doesn't auto-process. ARRGH. That makes it pretty useless for automation unless you control it externally with another program/timer.

Bruno
