Unofficial empeg BBS

#128065 - 26/11/2002 09:02 Any effective way to filter spam?
Dylan
addict

Registered: 23/09/2000
Posts: 498
Loc: Virginia, USA
My spam problem is out of control. Email is quickly becoming useless to me. I used to not care because it wasn't much effort to delete a few emails a day. Then it became 10 a day... then 20 and now I get about 50 spams a day. I've been losing real messages amongst the trash.

I'm frustrated and it has to stop. I don't want to change my email address because I've had it for 7 years and have given it to countless people. Any suggestions for something that really works?

#128066 - 26/11/2002 09:28 Re: Any effective way to filter spam? [Re: Dylan]
matthew_k
pooh-bah

Registered: 12/02/2002
Posts: 2298
Loc: Berkeley, California
What I found worked for me were simple sorting rules in just about any semi-real email client. Create a spam folder, and then create a bunch of filters based on the spam you've already gotten. (from addresses, viagra, mortgage, "long distance") Then, filter out everything that doesn't have your email address in the To or CC fields. By this point, you'll have cut down tremendously on what makes it to your real inbox. Now, every time a piece of spam gets through, you spend a minute thinking of a rule that would have caught it.
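
The two rules described above (keyword matches, plus a check that your own address appears in To or Cc) can be sketched as a small shell function. This is only an illustration, not any particular mail client's rule engine: `MY_ADDRESS`, `SPAM_WORDS` and `sort_message` are made-up names, and folded (multi-line) headers are not handled.

```sh
#!/bin/sh
# Hypothetical values: replace with your own address and keyword list.
MY_ADDRESS="me@example.com"
SPAM_WORDS="viagra|mortgage|long distance"

# Read one message on stdin and print "spam" or "inbox".
sort_message() {
    msg=$(cat)
    # Headers are everything up to the first blank line.
    headers=$(printf '%s\n' "$msg" | sed '/^$/q')

    # Rule 1: a known spam keyword anywhere in the message.
    if printf '%s\n' "$msg" | grep -Eiq "$SPAM_WORDS"; then
        echo spam
        return
    fi

    # Rule 2: our address does not appear in a To: or Cc: header.
    if ! printf '%s\n' "$headers" | grep -Ei '^(To|Cc):' | grep -Fiq "$MY_ADDRESS"; then
        echo spam
        return
    fi

    echo inbox
}
```

Something like `sort_message < message.txt` would then print which folder the message belongs in. Each new rule is one more word in `SPAM_WORDS`, which is also why this approach needs ongoing upkeep.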

There are more complex systems, but this is what I've found works for me.

Matthew

#128067 - 26/11/2002 09:50 Re: Any effective way to filter spam? [Re: Dylan]
andy
carpal tunnel

Registered: 10/06/1999
Posts: 5914
Loc: Wivenhoe, Essex, UK
Yes. Try one of the Bayesian math based filters.

If you run your own Linux/Unix server then try bogofilter:

http://bogofilter.sourceforge.net/

If you are reliant on someone else's server then try POPFile, which is written in Perl and operates as a proxy between your client and your POP3 server:

http://sourceforge.net/projects/popfile/

I have not tried POPFile myself, but I am running bogofilter on my production server and it is working well.

All the Bayesian filters work in a similar way: they "learn" what your spam and non-spam "look" like and then analyse each incoming message to work out which it is. This means that when you first start using them you have to teach the filter what your spam looks like.
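
To make the "learning" idea concrete, here is a toy word-count classifier written with awk. It is a simplified naive Bayes sketch with add-one smoothing and equal priors, not bogofilter's or POPFile's actual algorithm. The function name `classify` and the two training files are assumptions made for the example.

```sh
#!/bin/sh
# classify SPAM_FILE HAM_FILE
# Reads one message on stdin and prints "spam" or "ham".
# SPAM_FILE and HAM_FILE are plain-text training samples (hypothetical names).
classify() {
    awk '
    {
        # Lower-case the line and split it into alphabetic words.
        n = split(tolower($0), w, "[^a-z]+")
        for (i = 1; i <= n; i++) {
            if (w[i] == "") continue
            if (FILENAME == ARGV[1])      { spam[w[i]]++; nspam++; vocab[w[i]] = 1 }
            else if (FILENAME == ARGV[2]) { ham[w[i]]++;  nham++;  vocab[w[i]] = 1 }
            else                          { msg[++m] = w[i] }
        }
    }
    END {
        v = 0
        for (t in vocab) v++
        score = 0
        for (i = 1; i <= m; i++) {
            t = msg[i]
            if (!(t in vocab)) continue   # never seen in training: no evidence
            # Log-likelihood ratio with add-one smoothing.
            score += log((spam[t] + 1) / (nspam + v)) - log((ham[t] + 1) / (nham + v))
        }
        print (score > 0 ? "spam" : "ham")
    }' "$1" "$2" -
}
```

Usage would be along the lines of `classify spam.txt ham.txt < message.txt`. Words that turn up more often in the spam sample push the score up and words from the non-spam sample pull it down, which is why the filter needs a good pile of both kinds of mail before it is any use.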

I did this by feeding bogofilter several thousand spam messages and then several thousand non-spam messages.

For the first couple of days of use you have to be very careful to correct any mistakes the filter makes.

With bogofilter I do the following. I have four folders on my IMAP server dedicated to dealing with spam:

"Spam" - stuff I know is really spam
"Maybe spam" - stuff that bogofilter has classified as spam
"Wrong - not spam" - stuff that bogofilter said was spam, but was not
"Wrong - is spam" - stuff that bogofilter said was not spam, but is

In my procmail filters I have a couple of filters that catch stuff I am sure is spam (like mail using funky Chinese/Japanese/Korean character sets), which I send straight to "Spam" and also send to bogofilter, telling it that it is definitely spam.

One of the filters looks like this:


:0
* 1^0 ^\/Subject:.*=\?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)\?
* 1^0 ^Content-Type:.*charset="?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)
{
    :0wc
    | bogofilter -s

    :0:
    Spam
}



I then have a load of procmail filters for various mailing lists. These lists see almost no spam, so as well as filtering them into a folder for each list I also feed them straight to bogofilter telling it that they are not spam.

They look like this:


:0
* ^From:.*jobserve.com
{
    :0wc
    | bogofilter -n

    :0:
    Jobs
}



After that I let bogofilter have a bash at spotting the spam. At the same time I also have bogofilter add the message to the appropriate database automatically, like this:


:0fwE
| bogofilter -u -e -p

# if bogofilter failed, return the mail to the queue, the MTA will
# retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h

:0e
{ EXITCODE=75 HOST }

:0:
* ^X-Bogosity: Yes, tests=bogofilter
"Maybe Spam"



Everything that isn't identified as spam at this point ends up in my inbox.

I check the "Maybe spam" folder once a day, just to make sure that there are no obvious non-spam messages. If there are, I move them into the "Wrong - not spam" folder. I then select all the remaining messages and dump them into "Spam". I have a script that runs once a day to archive the contents of "Spam".

If I find any spam in my inbox or in one of the mailing lists then I move the message into "Wrong - is spam". I then have a script that runs once an hour that looks in these two folders and tells bogofilter what it did wrong. The script looks like this:


#!/bin/bash
# Hourly job: hand the contents of the two correction folders back to bogofilter.

cd "$HOME" || exit 1

# Each folder has an empty ".empty" twin. The folder being newer than its
# twin means mail has been moved into it since the last run.
if [ "Wrong - not spam" -nt "Wrong - not spam.empty" ]; then
    echo "Updating Wrong - not spam"
    if lockfile -r 10 "Wrong - not spam.lock"; then
        # Swap in an empty folder while holding the lock, then work on the copy
        mv -f "Wrong - not spam" "Wrong - not spam.tmp"
        cp -f "Wrong - not spam.empty" "Wrong - not spam"
        touch "Wrong - not spam.empty"
        rm -f "Wrong - not spam.lock"
        # /usr/local/bin/mid is a local helper that is not shown in this post
        /usr/local/bin/mid 14 < "Wrong - not spam.tmp" | bogofilter -vv -N
        rm -f "Wrong - not spam.tmp"
    else
        echo "Failed to get lockfile for Wrong - not spam."
    fi
fi

if [ "Wrong - is spam" -nt "Wrong - is spam.empty" ]; then
    echo "Updating Wrong - is spam"
    if lockfile -r 10 "Wrong - is spam.lock"; then
        mv -f "Wrong - is spam" "Wrong - is spam.tmp"
        cp -f "Wrong - is spam.empty" "Wrong - is spam"
        touch "Wrong - is spam.empty"
        rm -f "Wrong - is spam.lock"
        /usr/local/bin/mid 14 < "Wrong - is spam.tmp" | bogofilter -vv -S
        rm -f "Wrong - is spam.tmp"
    else
        echo "Failed to get lockfile for Wrong - is spam."
    fi
fi



It only takes me a couple of minutes each day to check that there is nothing vital in "Maybe spam" and to be honest I am getting very few false positives now anyway.

Please forgive my crappy shell script...
_________________________
Remind me to change my signature to something more interesting someday

#128068 - 26/11/2002 10:17 Re: Any effective way to filter spam? [Re: andy]
Dylan
addict

Registered: 23/09/2000
Posts: 498
Loc: Virginia, USA
Thank you for the replies.

I've tried creating my own filters but I found it was too time consuming and not good enough. The main problem is that I use Outlook to check my POP mail from three different computers as well as my ISP's web mail interface. It becomes too much work to duplicate my filters everywhere.

The Bayesian filters sound really cool and I can believe that they are effective. POPFile looks like it would be awesome if I only checked mail from one location. But again, I'm using 3 computers plus web mail.

I think what I need is something that cleans up my POP account instead of operating only on what has been downloaded. The filtering service at http://spamcop.net looks interesting. Anyone used it? I'm a cheap bastard but $30/year is worth it if it's effective and doesn't require much effort on my part.

-Dylan

#128069 - 26/11/2002 10:33 Re: Any effective way to filter spam? [Re: Dylan]
lastdan
enthusiast

Registered: 31/05/2002
Posts: 352
Loc: santa cruz,ca
I just signed up with postini.com. They charge something like 2 bucks a month. Even after my own block list and filters I was getting 40+ spams a day; now I get under 10 a week.
But it does leave me wondering what doesn't get to me that should.

#128070 - 26/11/2002 11:54 Re: Any effective way to filter spam? [Re: lastdan]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5682
Loc: London, UK
CloudMark SpamNet.

http://www.cloudmark.com/products/spamnet/

...uses a hash of the email to see if anyone else has reported it as spam. If so, it moves it to a spam folder. It's based on Vipul's Razor:

http://razor.sourceforge.net/
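
As a rough sketch of the idea of checking a hash of the message against a shared list of reported spam: the snippet below uses a plain SHA-256 of the body as a stand-in for whatever signature scheme SpamNet and Razor really use, and a local text file as a stand-in for their shared database. `body_hash` and `is_reported` are invented names, and `sha256sum` is assumed to be installed (it is part of GNU coreutils).

```sh
#!/bin/sh
# Print a SHA-256 digest of the message body read from stdin.
# The body is taken to be everything after the first blank line.
body_hash() {
    sed '1,/^$/d' | sha256sum | cut -d ' ' -f 1
}

# is_reported HASH_FILE
# Reads a message on stdin; succeeds if its body hash is listed in HASH_FILE
# (one hex digest per line, a hypothetical stand-in for a shared database).
is_reported() {
    grep -Fxq "$(body_hash)" "$1"
}
```

Reporting a message would be `body_hash < message.txt >> reported.txt`, and checking one would be `is_reported reported.txt < message.txt`. An exact digest like this is thrown off by a single changed character in the body, so treat it as an illustration of the lookup rather than of a workable signature.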

It's cut my spam intake by about 95% at work.

At home, I use SpamAssassin driven from my procmailrc file. See http://www.spamassassin.org/
_________________________
-- roger

#128071 - 26/11/2002 14:01 Re: Any effective way to filter spam? [Re: Roger]
g_attrill
old hand

Registered: 14/04/2002
Posts: 1172
Loc: Hants, UK
I have an email address on a .com domain that I naively used unprotected on the newsgroups for about 12 months a few years back - that gets about 30 spam messages a day. My surname.co.uk address gets a couple a week - I am pretty careful what I do with it, and I think .co.uk addresses are not targeted by US-originated spam, which from my general observation accounts for 99% of all spam.

At work I check mail on a couple of domains we nabbed from a local failed .com company - those get about 300 spams a day and well over 800 every weekend - of all types!

Gareth

#128072 - 26/11/2002 14:24 Re: Any effective way to filter spam? [Re: Dylan]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Not that it helps you now, but I use sneakemail.com a lot. You can get new random email addresses with no limit that they'll forward to you. The idea is that you get a new address for each untrusted correspondence. If you start getting spam through that address, you can see what address it came from and yell at the person to whom you gave the address. You can also cancel it.

The best part is that you don't have to pay for it. If you feel like it, they have a fairly small amount you can pay that will increase the amount of traffic you can use and the size of individual emails.

It's really nice.
_________________________
Bitt Faulk

#128073 - 26/11/2002 14:25 Re: Any effective way to filter spam? [Re: Dylan]
drakino
carpal tunnel

Registered: 08/06/1999
Posts: 7868
I have been using SpamBouncer for a while now, and it seems to do a decent job as long as it's updated. Many of its procmail rules could be adapted for other programs, as it maintains quite a few separate files for the filter system. Also, it uses procmail scoring to decide if a message is minor spam or definitely spam. I haven't had too many messages land in the minor spam folder incorrectly, and almost none in the definite spam folder. It also asks for all of the addresses you get mail at, and will toss things not addressed directly to you into bulk.

#128074 - 26/11/2002 15:12 Re: Any effective way to filter spam? [Re: wfaulk]
drakino
carpal tunnel

Registered: 08/06/1999
Posts: 7868
but I use sneakemail.com a lot

I managed to convert a friend from this service to Postfix's ability to deliver mail to an address like [email protected]. This makes it much easier to hand out a working but blockable address of the kind sneakemail.com offers. All you do is enter a new address in the form that might lead to spam.

This is also useful for taking advantage of the free offers out there. Simply toss in a spammable temporary address, get the message you need, then toss the address into a killfile.

#128075 - 26/11/2002 15:22 Re: Any effective way to filter spam? [Re: drakino]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Yeah, but that still has the potential of wasting your mail server's bandwidth. If you delete a sneakemail account, then it's just wasting their bandwidth.
_________________________
Bitt Faulk

#128076 - 26/11/2002 20:12 Re: Any effective way to filter spam? [Re: Dylan]
Jerz
addict

Registered: 13/07/2002
Posts: 634
Loc: Jesusland
I use SpamKiller by McAfee and find it extremely effective...

http://www.mcaffee.com/myapps/msk/
