Unofficial empeg BBS

#128012 - 25/11/2002 22:25 Spam list & script request
leftyfb
enthusiast

Registered: 04/03/2002
Posts: 217
Loc: Lowell, MA
For anyone using sendmail's spam filtering, I've attached a list I've made up from all the spam I've gotten for about a year now. Just thought someone else might benefit from a few fewer spam emails in their inbox.

The request I have is for anyone good with Linux scripting. I made the list by manually grepping my "spam" mail folder on the server for "Received:" and outputting the results to a file, which I then edited down to one line per IP/host address. Then I searched for duplicates using Excel, appended the result to my current /etc/mail/access file, searched for dupes again in Excel, and, just to be picky, sorted the whole thing. Then I added REJECT at the end of each line, saved it back to /etc/mail/access, used the makemap command to create /etc/mail/access.db, and restarted sendmail.

Now, I know this could be automated with the proper scripting: using the right grep and sed with the right criteria and filters, appending to my current access file, filtering out dupes again, running makemap, and then restarting sendmail.

My ideal situation is to be able to throw any spam that I haven't yet blocked into the spam folder, set up a cron job to periodically check that folder and add the IP/email/host addresses to the spam list, restart sendmail, and even delete the contents of the spam folder. A step further would be to scan my other mail folders for IP/email/host addresses and run them against the spam list to make sure I haven't accidentally put a valid email into the spam folder and, if so, remove the entry.

I ask this here because I know there are some very talented people who are great with Linux and/or scripting and could throw this together in no time. I, on the other hand, am limited in my grep/sed/scripting abilities, and it would take me a year to figure all this out (which I might end up doing if nobody can help me).

Any help is appreciated.

P.S. Please don't suggest any of the ready-made "smart" filters out there. The reason I like my way is that I pick what is spam by putting it in the folder, and I just want the IP/email/host addresses of those emails to be blocked. I don't trust these "smart" filters, as there is SOME spam (ThinkGeek's newsletters) I would actually like, and I don't feel like having to manually add these to some exclude list like I had to do with the ones I tried.


Attachments
126832-spam_list.txt (926 downloads)

_________________________
Mk2a 30GB Blue. Serial 030102999

#128013 - 26/11/2002 14:05 Re: Spam list & script request [Re: leftyfb]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
The big problem here is trying to extract the appropriate information from the message headers. First, your grep may or may not have been enough to get all the relevant information. RFC822 headers can span multiple lines. A sed can get all the relevant lines. But then you have to parse that information for the ``from'' portion of the header, which can be in multiple formats itself.

Once you've compiled that list, it becomes much easier to deal with the rest. Just pipe it through sort and uniq.
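
As a rough sketch of that step (not a tested production script): assuming the extracted addresses sit one per line in a file, the merge with an existing access file could look like the following. The function name merge_access and both file arguments are made up for illustration. Existing entries win over new ones, and blank lines and comment lines in the access file are dropped from the output.

```shell
#!/bin/sh
# merge_access NEW_HOSTS ACCESS_FILE   (hypothetical helper, not a sendmail tool)
# Prints the merged access list on stdout:
#   - existing entries are kept with whatever action they already have
#   - each new host gets a REJECT action
#   - only the first entry for each key (first field) survives
merge_access() {
    new_hosts=$1
    access_file=$2
    {
        cat "$access_file"
        # Skip blank lines and append the REJECT action to each new host.
        awk 'NF { print $1 "\tREJECT" }' "$new_hosts"
    } |
        # Drop blank lines, comments, and any key that was already seen.
        awk 'NF && $1 !~ /^#/ && !seen[$1]++' |
        # Byte-order sort so the result is the same in every locale.
        LC_ALL=C sort
}
```

Write the output to a temporary file first rather than straight over /etc/mail/access, since the function is still reading that file. After moving it into place, the map is rebuilt with `makemap hash /etc/mail/access < /etc/mail/access`, followed by whatever your system uses to restart sendmail.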

Here's a sed script (that I'm not totally happy with) that will give you the contents of all the Received fields in an RFC822 message:
/^$/         b End

/^Received:/ {
s/^Received:[ \t]*//
n
b Cont
}
/./ d
:Cont
/^[ \t]/ {
s/^[ \t]*//
n
b Cont
}
/^Received:/ {
s/^Received:[ \t]*//
n
b Cont
}
d
:End
N
b End
(Replace the `\t's with actual tabs. I don't think that sed has any escape characters for tabs -- only newlines. Notice that there's a space in front of the `\t's as well. They should remain there when you replace the `\t's with tabs.)

You should be able to use that to also manage to extract only the ``from'' parts of the Received headers, too, but I'll leave that as an exercise for the reader.
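
One way to attempt that exercise, swapping the sed script for awk because awk needs less branching to join folded lines. This is a sketch under assumptions: one message on stdin with LF line endings, header names spelled exactly `Received:`, and the sending host's IPv4 address recorded in square brackets in the "from" part, which is how sendmail typically writes it. The function names received_fields and received_ips are hypothetical.

```shell
#!/bin/sh
# received_fields: read one RFC 822 message on stdin and print each
# Received field on a single line, with folded continuation lines joined.
received_fields() {
    awk '
        function emit() { if (in_field) print field; in_field = 0 }
        /^$/ { exit }                       # first blank line ends the header
        /^[ \t]/ {                          # continuation of the previous field
            if (in_field) {
                sub(/^[ \t]+/, "")
                field = field " " $0
            }
            next
        }
        { emit() }                          # a new field starts: flush the old one
        /^Received:/ {
            sub(/^Received:[ \t]*/, "")
            field = $0
            in_field = 1
        }
        END { emit() }
    '
}

# received_ips: print the first bracketed IPv4 address of every Received
# field that starts with "from". Fields without one are skipped silently.
received_ips() {
    received_fields | sed -n 's/^from [^[]*\[\([0-9.][0-9.]*\)].*/\1/p'
}
```

Running `received_ips < message` prints one address per matching Received field, newest hop first. IPv6 addresses and hosts recorded without brackets are ignored by this pattern.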

That should get you started, anyway. I've goofed off at work long enough as is.
_________________________
Bitt Faulk

#128014 - 26/11/2002 14:19 Re: Spam list & script request [Re: wfaulk]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4174
Loc: Cambridge, England
I'll leave that as an exercise for the reader.

But identifying spammers is much harder than is made out here. Only the final hop before the first "trusted" mailserver is likely to be a spammer (ones before that, of course, could be forged by the spammer) -- and even that one could be faked, though good MTAs insert a warning when an incoming session doesn't reverse-DNS to its HELO address. Email addresses themselves are very easily faked, especially by viruses -- all my web pages have my real email address on, and half the unwanted email I get these days is bounces from when a virus or spammer has sent mail to a broken or autoresponding address, faked to appear to originate from my address. You can usually tell by looking at all the headers which one is the spammer, but it'd be jolly hard to automate the process.
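
A sketch of the "final hop" idea, assuming the Received fields have already been unfolded to one per line (newest first, as they appear in the message) and that you know the name your own trusted server stamps in its "by" clause. The function name last_hop_ip and the host mx.example.org are placeholders. It ignores every Received field the trusted server did not write itself, and it does nothing about the case where even that hop is faked.

```shell
#!/bin/sh
# last_hop_ip TRUSTED_HOST   (hypothetical helper)
# Reads unfolded Received fields on stdin, one per line, newest first.
# Prints the first bracketed IPv4 address from the first field that was
# stamped "by TRUSTED_HOST", then stops; older fields are never trusted.
last_hop_ip() {
    awk -v host="$1" '
        index($0, " by " host " ") {
            if (match($0, /\[[0-9.]+\]/)) {
                # Strip the surrounding brackets from the match.
                print substr($0, RSTART + 1, RLENGTH - 2)
            }
            exit
        }
    '
}
```

For example, `last_hop_ip mx.example.org < fields.txt` would print only the address that connected to mx.example.org, however many earlier hops the message claims to have passed through.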

With the methods proposed here, I'd be on everyone's spammer list in an instant.

Peter

#128015 - 26/11/2002 14:30 Re: Spam list & script request [Re: peter]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Too true. I just answered the OP without considering the consequences of what he was doing.

<LENNY>
I like writing scripts. Duh...
</LENNY>
_________________________
Bitt Faulk

#128016 - 26/11/2002 14:56 Re: Spam list & script request [Re: wfaulk]
leftyfb
enthusiast

Registered: 04/03/2002
Posts: 217
Loc: Lowell, MA
OK, so should I just stick with only blocking IPs/hosts? Also, I periodically go through the list to make sure nothing has been added that looks like it might block valid email (eBay, Hotmail, AOL).
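
That periodic check could be partly scripted. A minimal sketch, assuming two plain files with one address per line: candidates pulled from the spam folder, and addresses seen in mail that should keep getting through. The name drop_known_good and both file arguments are made up for illustration.

```shell
#!/bin/sh
# drop_known_good SPAM_LIST GOOD_LIST   (hypothetical helper)
# Prints every line of SPAM_LIST that does not appear, as an exact
# whole-line match, in GOOD_LIST.
#   -F  treat the patterns as fixed strings, so dots are not wildcards
#   -x  match whole lines only, so "aol.com" does not hit "mail.aol.com"
#   -v  print the lines that do NOT match
#   -f  read the patterns from GOOD_LIST
drop_known_good() {
    grep -v -x -F -f "$2" "$1"
}
```

Note that grep exits with status 1 when every candidate is filtered out, which matters if the calling script runs under `set -e`.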
_________________________
Mk2a 30GB Blue. Serial 030102999

#128017 - 26/11/2002 15:04 Re: Spam list & script request [Re: wfaulk]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5682
Loc: London, UK
_________________________
-- roger

#128018 - 26/11/2002 15:18 Re: Spam list & script request [Re: Roger]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Well that just takes all the fun out of it.
_________________________
Bitt Faulk
