December 19, 2002

SpamAssassin vs. Harvard Berkman Center Newsletter

Donna Wentworth at Copyfight says:

"Hoping ... today's issue of The Filter will slip quietly under the wire."

Sadly, it looks like it's over the default line. Using SpamAssassin (2.31) with the defaults, I get:

SPAM: ... Start SpamAssassin results ...
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (5.8 hits, 5 required)
SPAM: NO_REAL_NAME (0.5 points) From: does not include a real name
SPAM: GAPPY_TEXT (0.4 points) BODY: Contains 'G.a.p.p.y-T.e.x.t'
SPAM: DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line
SPAM: CLICK_BELOW (1.5 points) BODY: Asks you to click below
SPAM: EXCUSE_1 (2.3 points) BODY: Gives a lame excuse about why you were sent this SPAM
SPAM:
SPAM: ... End of SpamAssassin results ...
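The arithmetic behind that "5.8 hits, 5 required" line can be checked with a minimal sketch. The rule names and points are copied from the report above; the `>=` comparison against the threshold is my assumption about how the cutoff is applied, and it makes no difference for this message either way.

```python
# Rule names and points as listed in the SpamAssassin report above.
hits = {
    "NO_REAL_NAME": 0.5,
    "GAPPY_TEXT": 0.4,
    "DOUBLE_CAPSWORD": 1.1,
    "CLICK_BELOW": 1.5,
    "EXCUSE_1": 2.3,
}
REQUIRED = 5.0  # default threshold shown in the report

# Round to one decimal to sidestep floating-point noise in the sum.
total = round(sum(hits.values()), 1)
is_spam = total >= REQUIRED  # assumption: threshold comparison is inclusive

# The same sum with the single largest rule left out.
without_excuse = round(total - hits["EXCUSE_1"], 1)

print(f"{total} hits, {REQUIRED} required, spam: {is_spam}")
print(f"without EXCUSE_1: {without_excuse} hits")
```

On these numbers, EXCUSE_1 alone is the difference between landing over the line and landing well under it.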

Well, let's take a look:

NO_REAL_NAME (0.5 points) From: does not include a real name

True.

GAPPY_TEXT (0.4 points) BODY: Contains 'G.a.p.p.y-T.e.x.t'

H a r v a r d  L a w  S c h o o l

DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line

Not sure about this, since the line-breaking is unclear, but I think it has to do with "DMCA" being repeated in a line, as in

"... US Copyright Office's DMCA Rulemaking, proposing an exemption to the DMCA's anticircumvention provisions ..."
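To illustrate that guess, here is a rough sketch of a "same all-caps word twice on one line" check. This is not SpamAssassin's actual rule definition, which I haven't reproduced here; the function name `repeated_caps_words` and the regex are hypothetical, and the sketch assumes both occurrences of "DMCA" end up on the same line after line-breaking.

```python
import re
from collections import Counter

def repeated_caps_words(line):
    """Return the all-caps words (2+ letters) that occur more than once in a line.

    Hypothetical stand-in for the DOUBLE_CAPSWORD idea, not the real rule.
    """
    words = re.findall(r"\b[A-Z]{2,}\b", line)
    return sorted(w for w, n in Counter(words).items() if n > 1)

# The passage quoted above, treated as a single line (an assumption).
line = ("... US Copyright Office's DMCA Rulemaking, proposing an exemption "
        "to the DMCA's anticircumvention provisions ...")

print(repeated_caps_words(line))
```

"US" appears only once, so under this sketch "DMCA" is the only candidate for the repeat.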

CLICK_BELOW (1.5 points) BODY: Asks you to click below

"The Rotisserie implements an innovative approach to online discussion that encourages measured, thoughtful discourse. Click on the link below to find out more or to download the software"

EXCUSE_1 (2.3 points) BODY: Gives a lame excuse about why you were sent this SPAM

"You are receiving this email because someone (perhaps you) requested that your name be added to our mailing list."

(frankly, that does sound spammish!)

The web version had some differences, and I had originally tested that. It doesn't have the line which makes it fall foul of the EXCUSE_1 test. A small update on the web turned out to be enough to fall into a porn test, which isn't triggered in the email version:

PORN_3 (0.5 points) Uses words and phrases which indicate porn

And the magic words are:

(?i-xsm:\baction) : (SDMI) were quashed by an RIAA letter threatening legal action under
(?i-xsm:\bhot) : for such hot-button terms as "Tibet" and "democracy."
(?i-xsm:\bstrip) : featuring music by the White Stripes and their creative cohorts, Red

That's "3 porn words in the whole message body", and adding an update about "legal action" put it over the threshold with "action" (lawyer jokes about old professions are coming to mind ... oops ...)
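A small sketch of how that plays out, using the three patterns and the three matching lines quoted above. The Perl-style `(?i-xsm:...)` wrappers are rewritten as plain case-insensitive Python regexes, and the "count distinct patterns, trigger at 3" logic is my reading of the test description, not the actual SpamAssassin code; `matched_patterns` and `THRESHOLD` are names made up for the example.

```python
import re

# The three patterns from the debug output above, made case-insensitive.
PATTERNS = [re.compile(p, re.IGNORECASE)
            for p in (r"\baction", r"\bhot", r"\bstrip")]
THRESHOLD = 3  # "3 porn words in the whole message body"

def matched_patterns(body):
    """Return the pattern strings that match somewhere in the body."""
    return [p.pattern for p in PATTERNS if p.search(body)]

# The two lines that were already matching before the update.
before_update = "\n".join([
    'for such hot-button terms as "Tibet" and "democracy."',
    "featuring music by the White Stripes and their creative cohorts, Red",
])

# The same text plus the line about "legal action".
after_update = before_update + "\n" + (
    "(SDMI) were quashed by an RIAA letter threatening legal action under"
)

for label, body in (("before", before_update), ("after", after_update)):
    found = matched_patterns(body)
    print(label, len(found), "triggered" if len(found) >= THRESHOLD else "not triggered")
```

Note that `\b` only anchors the start of the word, which is why "Stripes" counts as "strip" and "hot-button" counts as "hot".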

Spam-wars, spam-wars ...

By Seth Finkelstein | posted in spam | on December 19, 2002 07:02 PM (Infothought permalink) | Followups
