September 21, 2007

Debunking - Truthout IS NOT Being Censored by AOL and Hotmail

I've seen this story making the rounds. However, Truthout IS NOT Being Censored by AOL and Hotmail:

1. If two large ISPs independently begin blocking mail from a given domain/IP address/network block/etc., then it's usually a pretty good sign that there is an issue with the mail source.

This is another "I'm-Being-Censored" wolf-cry, where people's buttons are pushed and their paranoia is stoked. This part was particularly manipulative: "Further, the Microsoft-Hotmail administrators inform us that they are blocking our communications to Truthout subscribers on their systems due to what they describe as our "reputation.""

The word "reputation" there is a term of art meaning "spamminess", not anything political. Truthout was essentially told their messages looked like spam, but they then sent out an alert portraying it as if they were being told they were dangerous radicals.

And as usual, the accurate technical information is going to languish in obscure blog postings and expert mailing lists, while the demagogic attention-machine cranks on.

December 06, 2006

Spam, Spam, Spam ... The Ultimate "Long-Tail" Web 2.0 Market

I'm going to join the pack-blogging about the New York Times article Spam Doubles, Finding New Ways to Deliver Itself (mostly as a test to see if this gets into the news-collector Megite).

As Nick Carr put it:

"Given enough eyeballs, no scam is too shallow."

If niche markets for micro-content are supposedly The Revolution, why not niche markets for micro-scams? Indeed, abstractly, it would seem to be an even better fit. Product and monetary "exit" are combined into one!

Moreover, one can't even have simple faith in a technological solution to a social problem, because the inbuilt technological arms race can attract programmers on the side of evil - the "botnets" used in spam-attacks are arguably impressive feats of subversion and distributed coordination.

No easy answers. Maybe we can only hope that some spam-fads, like some disease epidemics, eventually burn-out:

There is another aspect the scammers forgot to realise: there may be millions of people willing to order Viagra from shady websites, but investors willing to make a habit of buying spammed stocks and losing money are certainly in short supply.

October 18, 2006

Spamhaus v. E360Insight, ICANN, and There Is No Domain Crisis In The Making

I'm going to spit into the wind, and anti-hype the story about "Spamhaus appeals possible shutdown ruling". Background: A spam-enabler in the US sued an anti-spam organization, SpamHaus, based in the UK. The UK organization deliberately didn't show in US court, so the US court awarded the case to the US spam-enabler by default. Now, the US spam-enabler filed a proposal with the court that the UK anti-spam organization be forced to give the US spam-enabler their domain name. This bit of nasty legal posturing is being treated as if it were earth-shattering, or at least cybernetintertarweb-shattering.

Here's a good analysis from a mailing list:

From: Jonathan Zittrain <zittrain[at-sign]>
Date: October 8, 2006 5:05:04 PM EDT
Subject: more on ICANN NOT ordered by Illinois court to suspend

Dave and IP,

I don't see cause for panic on the Spamhaus lawsuit.

1/ The subject line of this thread is puzzling, since the document at <> is merely a proposed order, no doubt put forward by the plaintiff. The plaintiff is welcome to file proposed paperwork with the judge, but that doesn't make it an order until the judge signs it.

2/ An alert judge would not sign this document. There are specific state practices (and often statutes) about how default judgments are handled, and about how any sort of judgment translates into anything that binds a party outside of the case. For example, banks can sometimes be ministerially ordered to attach wages or seize accounts of people who owe money in lawsuits, or land can be auctioned. But something like a domain name is a far cry from a bank account or a house, and the registrar would have plenty to say about what to do with what is more a contractual relationship than a sum of money or a piece of real property.

3/ If the judge isn't alert and just signs, the registrar would have plenty of interventions to make if it chose -- and indeed it may not even be under the jurisdiction of the court.

There's some chance this could turn out to be more than mildly interesting, but I don't see any reason to think it's some grave event for cyberspace. ...JZ

[Update Friday 10/20 6pm: And per AP report, the court turned down the proposal]

June 01, 2006

Erection Problems (with a spam filter)

A news story about how the word "Erection" caused email to be lost is proving popular:

Commercial lawyer Ray Kennedy sent three emails to Rochdale Council's planning department objecting to proposals to extend his next-door neighbour's home on Sunny Brow Road in Middleton.

It later emerged the first two failed to reach the department because software on the town hall's computer system - designed to filter out obscene material - intercepted them because they contained the word "erection".

Somehow a third email, which contained the same word, managed to reach a planning officer - but the plans had already been given the go-ahead.

I think what's going on here is basically true, but just slightly more complicated. It may not only be the word "Erection" all by itself. But that word, plus a few other minor words (What a cock-up?), is enough to trigger the spam threshold.
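To make the threshold idea concrete, here is a minimal sketch of an additive word-scoring filter. The word list, the weights, and the threshold are all invented for illustration; I have no knowledge of what the council's software actually used.

```python
# Hypothetical weights and threshold, made up purely for illustration.
WEIGHTS = {"erection": 3.0, "cock-up": 2.5, "free": 0.5}
THRESHOLD = 5.0

def spam_score(text, weights=WEIGHTS):
    """Add up the weight of each occurrence of a listed word in the text."""
    words = text.lower().split()
    # Strip common punctuation so "erection!" still matches "erection".
    return sum(weights.get(w.strip('.,!?"\''), 0.0) for w in words)

def is_blocked(text, threshold=THRESHOLD):
    """A message is blocked once its total score reaches the threshold."""
    return spam_score(text) >= threshold

print(is_blocked("I object to the erection of this extension."))
print(is_blocked("What a cock-up over this erection!"))
```

Under these assumed numbers, one flagged word alone stays under the threshold, while the same word plus one more minor word goes over it. That would also be one way a later message with the same word could get through.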

See the report I wrote a while back on a similar spam system:

UK Parliament Mail - The Ministry Of Silly Messages
Abstract: This report examines messages being rejected by a mail system in use [then] by the UK parliament.

April 27, 2006

Having Splogs (Spam Blogs) Boost Your Technorati Rank

When I wrote the Google logo chocolate poker chip post, I knew the keywords might attract spammers (I can tell which of my blog posts are popular in search engines, because they're the ones which become targets for spam). But a side-effect of attracting spammers seems to be attracting splogs (spam blogs). Roughly, these are blogs which exist to try to fool search engines, and often scrape other blogs for content. And Technorati, which ranks blogs by number of other blogs linked to them, can be fooled by these spam blogs.

So my post ended up being linked to, by some of these spam blogs. Which counted towards my Technorati "Authority". Which is another sad commentary on the concept (spammers are not notable for being great judges of worth).
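As a rough sketch of why this works, assume "Authority" is simply the number of distinct blogs linking to you (that is how I described it above; the real Technorati computation may well differ). The blog names below are hypothetical.

```python
def authority(links, target, exclude=frozenset()):
    """Count distinct source blogs linking to target.

    links is an iterable of (source_blog, linked_blog) pairs.
    exclude is an optional set of sources to ignore, e.g. known splogs.
    """
    sources = {src for src, dst in links if dst == target and src != target}
    return len(sources - set(exclude))

# Hypothetical link data, invented for illustration.
LINKS = [
    ("a-lister.example", "myblog.example"),
    ("splog1.example", "myblog.example"),
    ("splog2.example", "myblog.example"),
    ("splog1.example", "myblog.example"),  # repeat link, counted once
]

print(authority(LINKS, "myblog.example"))
print(authority(LINKS, "myblog.example",
                exclude={"splog1.example", "splog2.example"}))
```

A plain count cannot tell a splog from an A-lister. Only the second call, which needs somebody to have identified the splogs already, removes them.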

A non-spam blogger probably couldn't push this too much without being caught. But, for these limited purposes, it's an intriguing bit of judo that if you can't get A-listers to link to you, mindless spammers seem to work just as well.

April 25, 2005

CNN Blog spam Google conspiracy theory (Nancy Grace)

(sorry, old net.joke)

There's an accusation going around that CNN is engaging in a viral guerrilla marketing campaign, also involving lowering the Google rank of blogs which criticize CNN. See for example the coverage at MetaFilter and Wonkette.

Ouch. This is taking normal net strangeness, and turning it into a convoluted double-backed conspiracy theory that's straight out of a spy novel. The accusation is that CNN is spamming blogs. But then not only are they spamming blogs, they are engaging in a sophisticated Google attack designed to lower the rank of posts critical of CNN, by introducing spam into the comment stream. Oh, and the evidence for this involves in part that CNN sent press releases to well-read blogs during the attacks on CNN executive Eason Jordan.

Google, blogs, spam, CNN ... or was that Russia, KGB, terrorists, NYT? (maybe these days, Iraq, Al-Qaeda, WMD, Dan Rather).

After spending too much time looking through the evidence, it's pretty clearly one guy who has a slightly askew take on CNN's Nancy Grace. The keyword-stuffing technique that's supposed to spam-poison the comments is there because the spammer thinks it helps his spam. Not as a devious rank-lowering trojan-horse. The proof, to the expert eye, is that some spam keywords are structured the way an amateur would think would matter in search (plus sign preceding the word). But a professional search engine optimizer would never bother doing it (of course, it could be a professional cleverly faking being an amateur ...) But a journalist reporting on this wouldn't see the difference.

Of course, this post isn't going to be heard enough to do any good :-(

[Update 5/5, to more clearly explain my interpretation (thanks for the link, Dan): There's someone who has been spamming a few blogs which discuss CNN, with his message critical of Nancy Grace. As he does this, he gets the (to him) "bright idea" that if he adds a bunch of keywords to his comment, it will rank better in searches. So he adds (here is a key point) keywords. He creates these keywords working from various ways searches are done, which sometimes requires a plus-sign before a word in order to require that word to appear in the search. This would not be the thinking of an elaborate anti-optimization attack.

The net is filled with people who go around and spam blogs to get their message heard, with various degrees of skill at it. So by the saying "When you hear hoofbeats, think of horses before zebras", when you see weird spam, think marginal people before elaborate PR campaigns. It's a much better fit. ]

December 21, 2004

Fact-Checking And Journalists

Here's two notes on posts I'm not going to write, and why, to add to reality-based thinking about blogs.

EFF's recent spam paper Noncommercial Email Lists: Collateral Damage in the Fight Against Spam has the following parade of horribles:

For example, the technology journalist Declan McCullagh reports that SpamCop blacklisted his email list ... Rectifying the situation proved difficult, and McCullagh was incorrectly listed as a spammer with SpamCop two more times after that.

Oh boy, is there more to the story than appears in that paragraph! But what's the point of my taking it on? Spam politics is a war-zone, and I'm unarmored. I don't need the fight. I'll just note a question for all the people enamored of the supposed power of blogs in fact-checking journalists:

What happens when someone fact-checks a journalist, and the journalist can just reply: "Sod off"? (or, for that matter, "Are you high?")

Further on the topic of blogs, facts, and journalists, the official report concerning the CBS forged memos scandal is due soon. This will be the result of the network's own internal investigation. I've thought of trying to expand a post I did on Gatekeepers of the Media vs. Blog Triumphalism, which examines the huge institutional support in going after Dan Rather. But the prospect of stirring up a hornet's nest of raving wingnuts, is not appealing. I'm not a club-member of one of the political alliances, so either nobody will hear it, or I'll just get slammed.

So much for the ability to be heard ...

March 03, 2004

Free porn, Google, spam, Internet censorship, and the Supreme Court

[Yes, this post really seriously concerns *all* the topics listed, it's truly that _tour de force_]

The Supreme Court just heard arguments on another Internet censorship law, "COPA" (Ashcroft v. ACLU, 03-218). The Boston Globe reported:

Ordinarily, US Solicitor General Theodore B. Olson prepares for an appearance before the Supreme Court by acting out his argument before a pretend court. This time, for a case about the Internet, he added a new twist: searching online for free porn.

At his home last weekend, Olson told the justices yesterday, he typed in those two words in a search engine, and found that "there were 6,230,000 sites available."

The top lawyer who represents the Bush administration before the Supreme Court said the search's results illustrate how pornography on websites "is increasing enormously every day," a central point in his argument for saving an antipornography law that was enacted six years ago but has yet to go into effect.

Now, let's do something often unrewarded in this world - think. What search did he do exactly? It seems to be the following search in Google:

That gives me now "about 6,320,000" results, close enough, the total number returned often varies a bit.

Now, what that search means is roughly the number of pages containing the words "free" and "porn" anywhere in the entire page (or links with those words). This blog entry will qualify as one of those results as soon as it is indexed. I don't think this blog entry is proof of how pornography on websites "is increasing enormously every day", much less the need for an Internet censorship law.
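Here is a minimal sketch of what "pages containing both words anywhere" means as a test. It is my own toy model, not Google's code, and it ignores the link-text part; the sample pages are invented.

```python
import re

def matches_all(page_text, terms):
    """True if every term appears somewhere in the page as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    return all(term.lower() in words for term in terms)

def count_matching(pages, terms):
    """Count pages containing all of the terms, whatever the page is about."""
    return sum(1 for text in pages if matches_all(text, terms))

# Invented sample pages; the first is the title of this very post.
PAGES = [
    "Free porn, Google, spam, Internet censorship, and the Supreme Court",
    "Free shipping on all orders this week",
    "A court heard arguments about a porn law",
]

print(count_matching(PAGES, ["free", "porn"]))
```

The only page counted here is a post about a court case. The test looks at words, not at what the page is.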

I've written about the problems of Google and stupid journalism tricks before. But, sigh, nobody reads me, so this won't get reported. Anyway, the story gets even better.

I started digging down into the results to see if I could find some non-sex-site mentions before the Google 1000 results display limit (Yes, Mr. Olson, there are more than 1000 sites devoted to sex in the world, that's true). Google's display stopped in the high 800's! That is, displayed at the bottom, for:

In order to show you the most relevant results, we have omitted some entries very similar to the 876 already displayed.
If you like, you can repeat the search with the omitted results included.

The number varies, but it's been under 900.

Joke: Hear ye! Hear ye! Instead of "6,230,000 sites available", there's really uniquely less than 900! At least, according to Google.

Now, this is the Google display crash from bugs in the Google spam filtering. Google has cleaned-up their index so the crash is not happening on the first screen of results. But it's still in their results display code. Usually, people don't see the bug in practice, since the crash has now been pushed very far down in the sequence of results.

But here I had a reason to go looking out as far as I could, and ran into the crash in a bona-fide real-world situation. Not just a trivial query too, but one with profound implications for Censorship Of The Internet.

[Update 3/4: Michael Masnick brings to my attention that what I thought was the old Google spam crash is now reduced to duplicate-removal processing on the 1000 results display limit - the point is still that I can use fallacious superficial search "logic" to assert there's less than 900 sites, because Google "says" so. But the technical reason is not quite what I wrote originally]

Humor: If the evidence from a Google search was good enough to be used to justify censorship when it said "6.2 million", why isn't it good enough to justify no censorship if on further investigation it says less than 900? That is, if you thought it was valid before, with a big number, why isn't it valid now, with a small number? (garbage in, garbage out)

Look at me, I'm a journalist (or grandstanding lawyer) - Google says there's practically no porn on the net!

January 08, 2004

Anti-Spam Law Needs Spammers' Heads On Pikes

I wrote the following to Dave Farber about a comment he made:

From: Seth Finkelstein
Subject: Re: [IP] So much for anti-spam laws

On Wed, Jan 07, 2004 at 05:05:33PM -0500, Dave Farber wrote:
> In 24 hours I have gotten 686 spam messages . That is up from pre 2004.

There won't be any change until some spammers' heads (or other organs) are nailed to the wall (metaphorically, under this law, I mean, I think ...).

Same as with the RIAA lawsuits, actually. A few high-profile high-penalty prosecutions, which hurt people, might cause a change. But the law has to be shown to be a real threat for that to happen.

"Send a spam, go to jail" needs to be the slogan.

I often say spammers have to be considered as gangs of thieves. That sounds like hyperbole, but I mean it. Would people say "So much for anti-theft laws - we PASSED A LAW, and the theft didn't stop!". I predict there's going to be quite a few of these sorts of complaints in the next few weeks (no offense to Dave). And endless Libertarian posturing: "Spam didn't stop! CAN-SPAM is useless! See, government doesn't work!"

Think of organized crime. Organized crime doesn't stop because there's a law passed against them. They stop when there are enough prosecutions so the leaders worry they will be next in line for a jail cell. Spammers won't stop until they see some criminal prosecutions come down, and have a strong worry they will be next.

September 10, 2003

Anti-spam versus censorware, once again

Matthew Skala, one of the programmers of CyberPatrol lawsuit (in?:-))fame, had a recent blog entry playing off the just-granted Websense patent, but using that to discuss censorware versus anti-spam programs. This issue, of using computers against spam or for censorware, comes up often. I've been replying to it for many years.

If I thought anyone much would care, I'd write up a FAQ on the topic. But that entry got thousands of readers as part of a Slashdot article, and, sigh, I hate to sound whiny, but I don't expect Slashdot coverage. Anyway, I wrote Matthew, and he kindly noted my point in a follow-up blog entry.

When putting together something now, I found I'd written long entries on this before:

"porn, spam, "filtering", and magic"

"More on censorware, spam-killing, and "magic""

No point in rewriting them. There's also an extensive discussion of the differences in the Reply to Copyright Office DMCA 1201 Censorware Exemption Question, and some in my DMCA testimony.

In a sentence, fighting spam concerns something you don't want to read, the sender wants to force on you. Censorware is about something you want to read, and an authority wants to prohibit you. These are thus very different situations.

September 09, 2003

You Might Be An Anti-Spam Kook If...

Vernon Schryver just wrote a hilarious guide to when
You Might Be An Anti-Spam Kook If...

If you are one of the many people who has recently discovered the Final Ultimate Solution to the Spam Problem [FUSSP], beware of the warning signs of having gone off the deep end.

It's somewhat technical, but very true.

"... you are the first to think of the FUSSP."
"... you plan to make money by licensing the FUSSP."
"... the FUSSP requires a small number of central servers to handle certificates, act as "pull servers" for bulk mail, account for mail charges, or whatever, but that is not a problem."

All hail the FUSSP-ots!

August 22, 2003

Sobig.F virus and spam

[The context of this was a mailing list thread about an expected wave of Sobig.F virus attacks from certain sites in the virus data]

I ran the list of Sobig.F attack addresses through Google searches, both by address and by resolved name, to see if anything interesting could be found. The data and results confirmed what Rich Kulawiec had written about the connection to spamming systems. That is, there is a connection to spam systems.

At least eight of the sites appeared in various spam-denying log files from one place which makes such logs public.

Sites found:

Detailed data below or

[The last number is the number of hits of the site from that day, I think]

July 05, 2003

Jonathan Kamens and Osirusoft spam blacklist

I just saw a fascinating thread about "Jonathan Kamens and osirusoft".

It seems that Jonathan Kamens remains personally spam-blacklisted, on and This started from when he was working for an employer with bad e-mail practices (, but he no longer works for them. Yet the blacklisting continues.

Amazing commentary:

I used to maintain exactly the position that you're maintaining -- that the block-list maintainers are reasonable people and that entries in block-lists are usually removed quickly when they are shown to be in error or when the blocked parties show that they have taken steps to address the problems. When I heard people complain about being blocked inappropriately, I'd tell them to go make their case in and, if they were right, the problem would be solved. I no longer believe that.

June 30, 2003

Intel v. Hamidi

Intel v. Hamidi verdict is in, let the interesting times begin ...

Hamidi's an ex-Intel employee who has been sending messages critical of Intel to tens of thousands of Intel employees, unsolicited, at their Intel work addresses. Intel sought an injunction against him, won in lower court, and just lost by a narrow 4-3 ruling in the California Supreme Court.

I've always thought this was a "hard cases make bad law" situation. Most civil-libertarians I know, who aren't dedicated anti-spammers, analyze it in terms of a Little Guy being shut-up by A Big Corporation. And indeed, on the facts of it, I rather agree. On the other hand, in general, the law simply doesn't seem to have a "Little Guy vs. Big Corporation" exception in it. There's something to that effect in labor law, but Hamidi's cause itself didn't fall under it. I thought he was going to lose his case.

Well, hail California, they actually found for him, if narrowly. To me, it's very interesting how, in my view, the majority basically seemed to want to write "We hold that he's a Little Guy being shut-up by A Big Corporation, and we're going to let him slide here because he's just not doing any real damage":

Intel's e-mail system was equipment designed for speedy communication between employees and the outside world; Hamidi communicated with Intel employees over that system in a manner entirely consistent with its design; and Intel objected not because of an offense against the integrity or dignity of its computers, but because the communications themselves affected employee-recipients in a manner Intel found undesirable.

The problem is, that fits every spammer, in terms of "consistent with its design". Spammers use email systems to deliver email, that's the whole point. The majority then lets Hamidi off the hook, as Intel objected for content (and hence, in practice, making a "Little Guy" exception). There's long sections in their opinion where they seem to try to say "Real spammers, don't try to use this, we don't mean to help you, no, no, no". They're commercial, Hamidi's non-commercial, which is a fair point. But spammers aren't noted for their honesty and strict construction in legal matters.

I suspect California is going to have some very interesting spam cases soon. ("My company must tell your employees about this great new way to improve their marital happiness, because an employee happier at home is happier and more productive at work ...")

And it must be wonderful to have legal backing ...

June 07, 2003

Neil Schwartzman, "spam political correctness"

[I wrote this in reply to the message Neil Schwartzman on political correctness: blacklists vs. blocklists on Dave Farber's list, a little while ago, but it didn't make the cut]

> Neil Schwartzman [on "blacklist" vs. "blocklist"]
> I have a strong notion that this started at a company that publishes
> blacklists during a time when they were being sued into complacency ...
> and in an attempt to softsell what they were doing to a judge, they coined
> this horrid new term. However, this perversion of the English language is
> just sad, under any circumstances.

If this is referring to Media3 v. MAPS, just on factual terms, I don't see it. They've had their main product named the "Realtime Blackhole List" for a long time. And the judge called it by that term, which seems fair.

"Nonetheless, in the case at hand, MAPS has done more than merely setting up a Website with allegedly tortious content. It has acted purposefully and successfully to sell and distribute its product, the blackhole list, in Massachusetts. It has directed its staff to encourage Massachusetts companies, over the telephone and through email, to discontinue spam-neutral and spam-friendly websites. Accordingly, I conclude that the exercise of this court's jurisdiction over MAPS is reasonable, and is authorized by the Massachusetts long-arm statute and the United States Constitution."

I think, for spam, aversion to the term "blacklist" doesn't have anything to do with McCarthyism in specific. But rather, my sense is it's a product of some people wanting to "have it both ways". That is, the list is intended as a tool of "disapproval/suspicion/penalized". But there's also at times a contradictory impulse to disclaim the moral implications which flow from that intent. Which leads to some very strange writing about these lists on occasion, as if they just fell from the sky and were published as curiosities.

Besides, if anyone was attempting to softsell what they were doing, as a PR tactic, they'd call it a "filter list", as the censorware companies call their blacklists :-).

May 15, 2003

Media3 v. MAPS spam case referenced in press again

Sigh ... I just saw the following spam/blacklist article, I have to remind myself once more to stay out of the spam-wars:

Blacklists vs. Spam

"Haselton found out that his organization had been placed on the Mail Abuse Prevention System (MAPS) list because of complaints that his Internet service provider, Media3 Technologies, refused to cut off service to companies suspected of doing business with spammers."

Much better information is in the court documents for Media3 v MAPS:

"MAPS responds that its assertion that Media3 is "spam-friendly" is true because Media3 does, in fact, host companies that provide services exclusively to spammers."

"Media3 has not established a likelihood that it will prevail on the merits of its defamation claim because, on the present record, MAPS has made a strong showing that its characterization of Media3 as "spam-friendly," is true. Media3's actions may well be found to outweigh its "Acceptable Use Policy." As described above, Media3 hosts several websites which provide support services that are used either exclusively or predominantly by spammers. See Def.'s Exhibits 1-4. These services include the sale of hundreds of thousands and even millions of e-mail addresses which are sold without any indication whatsoever that they are sold with the permission of the e-mail user. As the record stands, there is a serious question whether MAPS's assertion that Media3 is "spam-friendly" is defamatory because the statement appears to be accurate."

May 08, 2003


The "Tripoli" anti-spam proposal looks like it'll be making the rounds. It's got an excellent pedigree (Lauren Weinstein and Peter G. Neumann). But I'm dubious.

First, anything which has the word "Empowered" in it ("An Empowered E-Mail Environment"), is a big red flag to me. Purely a Bayesian test :-). But in my experience, "empowered" is "a system that we're trying to sell as helpful to someone else, even though they never asked for it."

More technically, the problem seems to be right here:

A key aspect of the Tripoli environment is the concept of a third-party certified, encrypted authentication token that would be cryptographically linked with every e-mail message. Within the Tripoli architecture, this token is referred to by the acronym "PIT" (Payload Identity Token, henceforth referred to as "Pit") and is at the core of Tripoli.

[Note: "Tripoli PIT" tended to make me wonder if that was a subtle joke, akin to "Amontillado CASK" or "Telltale HEART". There's a Tripoli mineral which is pit-mined, but that didn't seem like the reference]

EVERY e-mail is going to be third-party certified? I can see the dollar signs in some people's eyes now (not the proposers, rather certain readers). A mint. An e-mint. There's already several I-am-not-a-spammer certification systems, for businesses which want to send legitimate (requested) commercial bulk email. I suppose there's a value in unifying an interface to these systems. But I'm not sure that's much empowerment. Maybe the certification will be useful for innocent senders caught-up in spam-blacklist wars.

I mistrust elaborate technical proposals to social problems. The gee-whiz always sounds nice in the White Paper, with fun buzzwords. But overall, people tend not to want it.

May 07, 2003

Spam-Wars vs. War-On-Terrorism

[A reply to a message on Report from FTC Spam Conference. (I probably should remind myself there's no benefit to me in the spam-wars)]

> ... I believe that the long history of law developed around
> governmental censorship can aid us in looking at where the current
> systems are going wrong and what they could do to make things better.

As an only slightly tongue-in-cheek idea, I wonder if the key analogy is not censorship law, but anti-terrorism law. To a first approximation, spammers seem to me to be much more akin to terrorists than traditional censorship-targets. That is, they're taking advantage of an open society (US/Internet) in order to "hijack" infrastructure so as to convert it for their own typically criminal actions (I don't mean to trivialize mass-murder versus tawdry scams, but a thief and fraud con man is still a criminal).

The Spam Wars have very much the flavor of the War-On-Terrorism. There's strikes at territories (Afghanistan or ISPs) which are "harboring" enemy forces, and woe unto any civilians who are caught in the middle. There's "If you are not with us, you're against us" attitudes of collective-guilt towards many other parties, and pariah powers using innocents as human shields in order to generate sympathy (cough what the "spam-friendly" ISP Media3 did with Peacefire cough).

Spam blacklists seem to be somewhat like "no-fly" lists, where there's a database which is checked before one can travel (whether passenger or email) via the facilities. And there's the whole issue of cloaking/hiding/anonymity, as the bad guys often try to get service under forged identities, to avoid detection. There's even the same sort of urgency to say "I agree those are bad guys, but ..." (there's some of this in censorship, but the intensity is much higher with terrorists than typical censored material, and the spam-wars look to have the intensity-level of terrorism, not censorship). Though there are some real bad guys indeed, well-financed and ruthless.
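For the "database which is checked" part, here is a minimal sketch of how a DNS-based blacklist is commonly consulted: reverse the octets of the connecting IPv4 address, append the list's zone, and see whether the name resolves. The zone name and the address below are placeholders, not a real list or a real listing.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Treat any successful resolution of the lookup name as 'listed'.

    This needs network access, and a failed lookup can also mean a DNS
    problem, so a real mail server would want to be more careful.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False

# "blacklist.example" is a hypothetical zone; 192.0.2.99 is a documentation address.
print(dnsbl_query_name("192.0.2.99", "blacklist.example"))
```

The check happens before the mail is accepted, which is what makes it feel like a no-fly list: the decision is made on who is connecting, not on what the message says.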

If we destroy the open Internet, the spammers will have won?

Seth Finkelstein Consulting Programmer sethf[at-sign]

May 06, 2003

Reaction to article on REDUCE Spam Act

[This was written for a mailing-list in reaction to a column which criticized Lessig's REDUCE Spam Act]

From: Seth Finkelstein
Subject: Isolating spammers is good (was proposal to end spam ...)

> For one thing, an increasing percentage of it comes from overseas, and
> you can be certain that offshore bulk mailers will gleefully thumb their
> noses at Congress. ...
> Everyone would start quarantining ADV-tagged mail as rigorously as Hong
> Kong is isolating suspected SARS patients.
> If Lofgren's bill is enacted, U.S.-based spam operations are likely to
> shift operations elsewhere, just as gambling sites set up shop in the
> Caribbean.

Dave, can I point out that, far from being a clever Unintended Consequence, this argument is in fact something of a Straw Man. Way before Lessig, some of the most technically knowledgeable anti-spammers have not regarded spammers-will-just-set-up-shop-elsewhere as being a killer issue. That's been an argument/counter-argument right from the start over the MAPS Realtime Blackhole List. One position is that driving spammers overseas is good, because it makes them more isolated, and helps reject their connections with minimal other damage: (emphasis mine)

"And if it becomes widely known that selling e-mail or web services only to have them advertised in the text of spam is a great way to lose connectivity, then spammers will not be able to hide behind legitimate service providers and we'll smoke them out into the open, which means into using their own (blackholable) links."

The difference between gambling and spamming, is that people want to gamble, but nobody wants to be spammed. If spammers set up in the Caribbean, that's an invitation to make the Caribbean its own INTRA-net.

Note, by the way, the article also contradicts itself. The same "overseas" issue would apply to "long-standing common law rights" too - an approval of Blackstone over Lessig is purely ideological, not technological.

A law which helps to burden spammers, to isolate them, to deny them operations in a country, can thus be part of a solution, even without any international treaty.

May 02, 2003

Spam, Viagra, and saying Jehovah

[This was for Dave Farber's list, to explain why an editorial on spam ended up regarded as spam. Didn't make the cut - or maybe it got rejected somewhere along the way as spam ...]

[For IP] Ah, another easy one. Just from my experience, the problem is with the sentence:

"One technique is to misspell words, like "Viagra" or "pornography," that set off the filters.

You can't say "Viagra" or "pornography". Because if you say "Viagra" or "pornography", even in the context of discussing why you can't say "Viagra" or "pornography", well, we see what happens :-)

I'm reminded of this scene from the movie Life Of Brian:

Look. I don't think it ought to be blasphemy, just saying 'Jehovah'.
Oooh! He said it again! Oooh!...
You're only making it worse for yourself!
Making it worse?! How could it be worse?! Jehovah! Jehovah! Jehovah!
I'm warning you. If you say 'Jehovah' once more--
[MRS. A. stones OFFICIAL]
Right. Who threw that?

February 24, 2003

Spam-killing and ISP non-liability

[This was something I wrote to send to Dave Farber's list]

Subject: Re: [IP] Have ISP's walked into their own trap?
On Mon, Feb 24, 2003 at 08:12:12AM -0500, a comment was made:
> As ISPs move to fight spam by installing content aware filtering, have
> they cracked open the historical defense that much like common carriers,
> ISPs are not responsible for content in Internet copyright, fraud,
> pornography, and terrorism cases?

Dave, can I point out many obvious rebuttals?

1) If that were so, then ISPs which explicitly market built-in censorware (Surfcontrol/N2H2/Websense/etc.) as a feature of their service would be doing far, far worse to this defense. I can't recall this ever being an issue.

2) The telephone company is allowed to offer blocking of various phone exchanges (i.e. phone-sex lines) without incurring content-based liability.

3) Similarly, the telephone company is allowed to monitor phone lines for various service-related problems without incurring such liability.

I think the deepest answer, though, is that anti-spam programs are not in fact directed at content _per se_, but the action of unwelcome solicitation. The repetitive content is used as a proxy for that solicitation. But I'd say it's important to keep in mind that the content itself is not the target. None of the programs have the goal of making, say, purchases of "rhymes-with-niagra" to be impossible (even though it sometimes seems that way from the practical effects ...).

A bit more prosaically, I doubt ISPs have any special hatred for merchants of oner-tay artridges-cay, in terms of the product. Rather it's the attempt to sell that product by trespass-to-chattel which is the problem.

Disclaimer: I am not a lawyer.

Seth Finkelstein Consulting Programmer sethf[at-sign]

February 23, 2003

Message on statistics for spam blacklists

This recent message about spam blacklists struck me as much food for thought (note the ratio):

February 22, 2003

A professional spammer is a con-man scammer lying thief

For some email I just wrote:

"Some days, I think about writing a book on civil-liberties activism. One chapter would be about getting fooled. My thesis there is that since civil-libertarians are fundamentally good-hearted, dedicated, people interested in fighting injustice, they are also particularly vulnerable to manipulative liars. And a professional spammer is almost by definition a con-man scammer lying thief (really - I do not mean this hyperbolically - spam is overwhelmingly made up of frauds - pump-and-dump stocks, sex lures, phony diets, get-rich-quick schemes, etc.)"

February 16, 2003

SpamAssassin vs. Crypto-Gram

Crypto-Gram newsletter is being marked as spam by SpamAssassin again. It's happened before, see my earlier analysis of SpamAssassin and Crypto-Gram. Here's a guess as to why it's happening now (SpamAssassin version 2.43).

WARNING - I used a mail header from the crypto-gram subscription confirmation in these tests, since I wasn't subscribed to the mailing-list. That may affect the results. It's very important to pay attention to the mail header, as tests on it are significant. Using the raw text of the newsletter - that is, no mail header - is not an accurate test!


SPAM: Content analysis details: (5.20 hits, 5 required)

So it's over the limit.

SPAM: NO_REAL_NAME (1.3 points) From: does not include a real name

Right. The "from" is just the mailing list (I assume).

SPAM: FORGED_RCVD_FOUND (0.8 points) Possibly-forged 'Received:' header found
SPAM: MSG_ID_ADDED_BY_MTA_2 (0.1 points) 'Message-Id' was added by a relay (2)

It doesn't like something about the way the mailing is done.

SPAM: OPT_IN (1.5 points) BODY: Talks about opting in

" ... use his own resources and take Opt-In requests from Intel employees ..."

SPAM: US_DOLLARS_4 (0.4 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
SPAM: US_DOLLARS_2 (0.1 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)

US_DOLLARS_4 : ... stole $1.5 million in jewels ...
US_DOLLARS_2 : Hot on the heels of our $20M funding, ...

SPAM: BALANCE_FOR_LONG_20K (-0.7 points) BODY: Message text is over 20K in size
SPAM: BALANCE_FOR_LONG_40K (-0.1 points) BODY: Message text is over 40K in size

"Good" points for being long.

SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL

Anyone can get their own .mil domain.

SPAM: SPAM_PHRASE_01_02 (0.5 points) BODY: Spam phrases score is 01 to 02 (low) [score: 1]

And a few misc phrases.
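For anyone who wants to check the arithmetic, here is a minimal Python sketch that pulls the rule names and points out of the report lines quoted above and totals them. The `tally` helper and its regex are illustrative assumptions, not part of SpamAssassin, and the sketch assumes a message is flagged once the total reaches the required number of hits.

```python
import re

# Report lines copied from the SpamAssassin 2.43 output quoted above.
REPORT = """\
SPAM: NO_REAL_NAME (1.3 points) From: does not include a real name
SPAM: FORGED_RCVD_FOUND (0.8 points) Possibly-forged 'Received:' header found
SPAM: MSG_ID_ADDED_BY_MTA_2 (0.1 points) 'Message-Id' was added by a relay (2)
SPAM: OPT_IN (1.5 points) BODY: Talks about opting in
SPAM: US_DOLLARS_4 (0.4 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
SPAM: US_DOLLARS_2 (0.1 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
SPAM: BALANCE_FOR_LONG_20K (-0.7 points) BODY: Message text is over 20K in size
SPAM: BALANCE_FOR_LONG_40K (-0.1 points) BODY: Message text is over 40K in size
SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL
SPAM: SPAM_PHRASE_01_02 (0.5 points) BODY: Spam phrases score is 01 to 02 (low) [score: 1]
"""

# Hypothetical helper regex: captures "RULE_NAME (N.N points)" on each report line.
RULE_LINE = re.compile(r"^SPAM: (\w+) \((-?\d+(?:\.\d+)?) points\)", re.MULTILINE)


def tally(report, required=5.0):
    """Return (scores by rule, rounded total, whether the total reaches `required`)."""
    scores = {name: float(points) for name, points in RULE_LINE.findall(report)}
    total = round(sum(scores.values()), 2)
    return scores, total, total >= required


scores, total, flagged = tally(REPORT)
print(total, flagged)
```

The negative "balance" rules are summed like any other, which is why a long message gets a little credit but not enough to offset the header and phrase hits.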

Sigh. Now to go try to see if anything can be fixed. Spam-wars, spam-wars ...

Update: Looks like the problem may be the "Razor" distributed message tests:

Date: Sun, 16 Feb 2003 12:10:49 -0500
Sender: Spam Prevention Discussion List <SPAM-L[at-sign]PEACH.EASE.LSOFT.COM>
From: Ed Allen Smith
Subject: Re: Media: Spamassassin blocks crypto-gram newsletter
By default - and by SA developer recommendation (I've been helping a bit with it and _I_ wouldn't recommend using it for blocking on most accounts, just for sorting mail into different inboxes... and I have some uncertainty on the latest scoresets; I've been working on the SA GA and have been seeing some problems with generalization), yes. From initial reports, at least part of the problem is that _Razor_ is hitting the February 15th CRYPTO-GRAM, so if SA is used with Razor going... I'll check the February 15th CRYPTO-GRAM vs SA 2.50-cvs, with and without Razor2 & DNSBLs. It may wind up that CRYPTO-GRAM has to be specifically whitelisted - SecurityFocus is, due to that SF mailing lists can have, say, malicious JavaScript legitimately being quoted in emails. We'll see.


Allen Smith
February 1, 2003 Space Shuttle Columbia
Ad Astra Per Aspera To The Stars Through Asperity

Posted by Seth Finkelstein at 11:59 PM | Followups

More on the "Ministry Of Silly Messages"

My report on UK Parliament Mail - The Ministry Of Silly Messages has been getting some coverage.

It's in The Inquirer, in an article Man solves un-parliamentary language conundrum

One critical insight I advocate in the spam-wars, is that the problem is going to be dealt with one way or another. Spam is not an "intellectual" problem, like sex-talk, hate-speech, etc. Spam is a "physical" problem, of cost-shifting and denial-of-service.

I suspect Parliament's troubles arise from conflating the "physical", the flood of solicitations, with the "intellectual", messages using words which were deemed an impropriety.

February 07, 2003

UK Parliament Mail - The Ministry Of Silly Messages

From: Seth Finkelstein
To: Seth Finkelstein's InfoThought list
Subject: IT: UK Parliament Mail - The Ministry Of Silly Messages
Date: Fri, 7 Feb 2003 13:45:06 -0500

New report:

UK Parliament Mail - The Ministry Of Silly Messages

Abstract: This report examines messages being rejected by a mail system in use by the UK parliament.

I've reverse-engineered why the system used by the UK parliament to scan mail for "inappropriate content" was bouncing messages ranging from Welsh newsletters to a Shakespeare quote. Censorware is not fond of pussy-cats and tit-willows.


E-mail vetting blocks MPs' sex debate

Software blocks MPs' Welsh e-mail

Plaid up in arms as Commons spam filter bans Welsh

UK Parliament Mail - The Ministry Of Silly Messages

NTK (Need-To-Know) coverage

Cyber-Rights & Cyber-Liberties (UK)

Seth Finkelstein
Anticensorware Investigations -
Seth Finkelstein's Infothought blog -
List sub/unsub:

February 04, 2003

Mail-blocking and UK Members Of Parliament

I spent some time today digging into the story of UK Members Of Parliament having their mail blocked by a spam-scanning program, i.e.:
BBC News: E-mail vetting blocks MPs' sex debate

I found some original-source information at the following UK Parliament site:

By the way, the bounces look like this:

Date: Tue, 4 Feb 2003 16:11:09 +0000 (GMT)
Subject: RE:What is the mail system in use in Parliament?

Message subject: What is the mail system in use in Parliament?

This is to advise you that your email has been blocked and will be deleted by the Houses of Parliament in due course since we believe it has inappropriate content. The intended recipient has not received the email.

In the event that you believe the email has been blocked incorrectly please contact the intended recipient directly to discuss it's release.

That's very poorly designed. It looks intimidating, as if you were being chastised. And having to try to get the recipient to discuss release sounds like bailing someone out of jail.

Just another day in the spam wars ...

January 17, 2003

Short observations based on the MIT spam conference

I caught the tail-end of the MIT spam conference. Pretty interesting. The fact that this became a media event itself is a milestone.

There's definitely a "spam-interest bubble", like there was with viruses some years back, or Artificial Intelligence before that. I recognize the signs, the exotic research, the projects seeking funding, and what might be called the hot-area media effect.

I'm even more convinced that something will break in the near future over spam, I'm just not sure what that'll be - heavy legislation? balkanizing which ISPs communicate with each other? email itself, in terms of practically being able to use it outside of whitelists?

I don't know. I'll just repeat to myself to stay out of the spam-wars.

January 10, 2003

Spam dictionary-attacks and Hotmail

Andreas Bovens pointed me to an interesting Wired article on spammers doing dictionary-attacks in order to get email addresses on the service Hotmail. For those who don't know, a dictionary-attack is a procedure where tests are tried, one after the other, from a list. The spammers are testing email address after email address, one after the other, constantly, in order to find which addresses are valid. Steve Linford of Spamhaus has apparently tracked one spam-gang's attack, over months.

What most impressed me about this was the sheer intensity, the great lengths, to which the spammer was willing to go, just to get some addresses to spam:

Linford figures that in the attack he's been tracking, the spammers have hit Hotmail's server more than 52 million times. Even assuming a pitifully low 1 percent rate of live addresses gleaned from those hits, it still amounts to a significant number of e-mail addresses being added to spam lists.

The mind boggles. Over and over, 52 million+ tries, just to get addresses to spam. And then of course, once those addresses are obtained, presumably spamming them.

(math check - never take a journo-reported number on faith:
5 months * 30 day/mn * 24 hr/day * 60 min/hr * 60 sec/min * 4 tests/sec =
51840000, so slightly more than 5 months of checks squares with "more than 52 million" - OK!)
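The same back-of-the-envelope check can be written out as a few lines of Python. This is just a sketch: the function name is made up for illustration, and the 5 months and 4 tests/sec figures are the ones used in the math check above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_probes(months, tests_per_second, days_per_month=30):
    """Number of address tests made at a constant rate over the given period."""
    return months * days_per_month * SECONDS_PER_DAY * tests_per_second


probes = total_probes(months=5, tests_per_second=4)

# The article's "pitifully low 1 percent" of live addresses, applied to that total.
live_at_one_percent = probes // 100

print(probes, live_at_one_percent)
```

Even at that 1 percent rate, the total works out to over half a million addresses.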

That's the intensity of effort which is going into professional spamming. It's awe-inspiring.

I suppose this answers my earlier remark about the resources of a large spam business, and bears out the claim that Spam Is 'A Thousand Times More Horrible Than You Can Imagine'.

January 09, 2003

Lessig, label-or-pay anti-spam, and "substantially reduce"

Further on Lessig's advocacy of a label-or-pay anti-spam proposal, I don't find the bet about "substantially reduce the level of spam" all that interesting.

Let's say that the "Label-Else-Spam-Stops-Immediately-Gimmick Law" kills 30% of the spam. This would be great. Phenomenal. A work of genius from simplicity.

Then what do we do about the remaining 70%?

Note, if it kills 30%, the fact that 30% is not 100%, isn't a reason to reject it. Anything helps.

I'm for it. Pragmatically, the labeling system seems more a legal-fiction way of having a de facto spam ban, rather than, in practice, an end in and of itself.

But we've still got a problem with spam, in what do we do NOW?

January 03, 2003

Lessig, label-or-pay anti-spam, and Kansas law

There's been some discussion regarding Lessig's advocacy of a label-or-pay anti-spam proposal.

I'll note that something related is already law in Kansas (check out the Spam Legislation list):

"Requires putting "ADV: " or "ADV: ADLT" as the first character in the subject line. The requirement for "ADV" is not necessary if the recipient has an established business relationship or has given the sender authorization. The sender claiming such exemption carries the burden of proof by a preponderance of the evidence"

"Establishes a private right of action for recipients to recover between $ 500 and $ 10,000 for each violation."

It's not quite what Lessig is popularizing, but certainly along those lines. Of course it's state law, requires intent regarding Kansas, etc. But it is interesting to see the above is already law.

January 02, 2003

Spam Is 'A Thousand Times More Horrible Than You Can Imagine'

Spam and DNSRBL's are in the news (using that term loosely), again due to the Slashdot boon granted to the missive on Moving Beyond RBLs.

I've been thinking more about a different article, discussing Barry Shein and ISP "The World":
Spam Is 'A Thousand Times More Horrible Than You Can Imagine'

At one point, The World was under attack by 200 servers simultaneously "spewing the same spam at us," Shein said. "Little guys with scripts don't break into 200-plus servers and use them to spew at you. It seems like it's beyond what spammers are likely to be making on this stuff." Sophisticated stealth techniques and coordinating multiple servers seem to Shein to be beyond the resources of small spam businesses.

Right. Beyond the resources of small spam businesses. But what about large spam businesses? That much? Terrifying.

December 31, 2002

MIT Spam Conference, and cheap irony

I just registered for the MIT spam conference.

Cheap irony: The registration page states:

(Don't use an address with over-aggressive spam filtering set up on it, because if the confirmation bounces, you won't be registered.)

Hmm ... it seems there's a lesson here somewhere ...

December 25, 2002

Spam Takes No Holiday

My email today still had a daily load of spam! Is spamming considered one of the vital occupations, such as firefighter or emergency medicine, which still has to be done on Christmas? I can understand that perhaps the Chinese and Korean spam doesn't consider Christmas a factor. But I still got a helping of good ol' American mortgage-rate (and other types of) spam.

Maybe those spammers felt there would be less competition today from other spammers?

December 19, 2002

SpamAssassin vs. Harvard Berkman Center Newsletter

Donna Wentworth at Copyfight says

Hoping's issue of The Filter will slip quietly under the wire.

Sadly, it looks like it's over the default line. Using SpamAssassin (2.31) with the defaults, I get:

SPAM: ... Start SpamAssassin results ...
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See for more details.
SPAM: Content analysis details: (5.8 hits, 5 required)
SPAM: NO_REAL_NAME (0.5 points) From: does not include a real name
SPAM: GAPPY_TEXT (0.4 points) BODY: Contains 'G.a.p.p.y-T.e.x.t'
SPAM: DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line
SPAM: CLICK_BELOW (1.5 points) BODY: Asks you to click below
SPAM: EXCUSE_1 (2.3 points) BODY: Gives a lame excuse about why you were sent this SPAM
SPAM: ... End of SpamAssassin results ...

Well, let's take a look:

NO_REAL_NAME (0.5 points) From: does not include a real name


GAPPY_TEXT (0.4 points) BODY: Contains 'G.a.p.p.y-T.e.x.t'

H a r v a r d  L a w  S c h o o l

DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line

Not sure about this, since line-breaking is unclear, but I think it has to do with "DMCA" being repeated in a line, as in

"... US Copyright Office's DMCA Rulemaking, proposing an exemption to the DMCA's anticircumvention provisions ..."

CLICK_BELOW (1.5 points) BODY: Asks you to click below

"The Rotisserie implements an innovative approach to online discussion that encourages measured, thoughtful discourse. Click on the link below to find out more or to download the software"

EXCUSE_1 (2.3 points) BODY: Gives a lame excuse about why you were sent this SPAM

"You are receiving this email because someone (perhaps you) requested that your name be added to our mailing list."

(frankly, that does sound spammish!)

The web version had some differences, and I had originally tested that. It doesn't have the line which makes it fall foul of the EXCUSE_1 test. A small update on the web turned out to be enough to fall into a porn test, which isn't triggered in the email-version:

PORN_3 (0.5 points) Uses words and phrases which indicate porn

And the magic words are:

(?i-xsm:\baction) : (SDMI) were quashed by an RIAA letter threatening legal action under
(?i-xsm:\bhot) : for such hot-button terms as "Tibet" and "democracy."
(?i-xsm:\bstrip) : featuring music by the White Stripes and their creative cohorts, Red

That's "3 porn words in the whole message body", and adding an update about "legal action" put it over the threshold with "action" (lawyer jokes about old professions are coming to mind ... oops ...)
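To see why those innocent lines match, here is a small Python sketch of the same three patterns. The Perl form `(?i-xsm:\bhot)` is a case-insensitive match anchored only at the start of a word, so it is translated here as `\bhot` with `re.IGNORECASE`. The `words_hit` helper and the "count of distinct words" logic are assumptions for illustration, not SpamAssassin's actual code.

```python
import re

# Case-insensitive, with a word boundary only at the START of the word,
# mirroring the (?i-xsm:\bword) patterns quoted above.
PATTERNS = {
    word: re.compile(r"\b" + word, re.IGNORECASE)
    for word in ("action", "hot", "strip")
}

# The three newsletter lines quoted above.
SNIPPETS = [
    "(SDMI) were quashed by an RIAA letter threatening legal action under",
    'for such hot-button terms as "Tibet" and "democracy."',
    "featuring music by the White Stripes and their creative cohorts, Red",
]


def words_hit(text):
    """Return the sorted list of pattern words found anywhere in the text."""
    return sorted(word for word, pattern in PATTERNS.items() if pattern.search(text))


with_update = words_hit("\n".join(SNIPPETS))
without_update = words_hit("\n".join(SNIPPETS[1:]))  # drop the "legal action" line

# A boundary on both sides would no longer match "Stripes".
strict_strip = re.compile(r"\bstrip\b", re.IGNORECASE)

print(with_update, without_update, strict_strip.search(SNIPPETS[2]))
```

Note that "Stripes" matches only because nothing anchors the end of the word, while "hot-button" would still match even with a trailing boundary, since the hyphen counts as a word break.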

Spam-wars, spam-wars ...

December 18, 2002

The mystery of "domain registration spam body"

Donna Wentworth at Copyfight asks why an issue of the newsletter The Filter triggered a "domain registration spam body" test in SpamAssassin

As I read the test, anything mentioning ".biz" or ".name" or ".info", not at the end of a line (technically, within whitespace) will trigger the "Domain registration spam body" test! Ouch.

All these lines matched:

So how goes the nascent .BIZ business? Berkman Center Faculty
Usage of the .BIZ TLD"--online. Among the findings: three quarters of
currently registered .BIZ domains provide no web content or provide
only error messages or placeholders; a quarter of .BIZ registrations
corresponding .COM; and many .BIZ names fail to comply with .BIZ
> What's In a .NAME?: .BIZ isn't the only new TLD in town--ICANN also
introduced .NAME. Follow the links below for Edelman's study of .NAME

It's a problem with the genetic algorithm.

Excuse me, I must repeat to myself: Stay out of the spam-wars!

December 02, 2002

A small dispatch from the spam-wars

Today I wrote a 'reply' to a message cross-posted to a half-dozen mailing lists, of most of which I'm not a member. I expected some bounces from that, but one of the bounces was:

5.3.0 Rejected your system is a spam source see

My system? Whaa? C'mon, give a person a fighting chance. What is "my system"? Sigh. Time to go check the blacklist form. Now, which of the possible IP addresses involved didn't it like? Granted, statistically, I'm running on the extreme edge of mail sophistication, with my own custom configuration. But against that, were I an ordinary person, I'd be stopped cold at this obscure message.

So, I go to the blacklist form and try the IP's which might be problematic. Note, at least I know the IP's - another thing most people would have a hard time doing. Finally, I get to the source of the problem:

I generally list cable modem, dsl and adsl networks where the provider does not publish the customer contact information for the sub-allocations via either ARIN or rwhois. Examples of such providers include but are not limited to AT&T and GTE.

Blech. I suppose if I were a journalist, I could kick and scream and cry bloody murder, and get away with forwarding legal threats. But I have absolutely no desire to do that, and it wouldn't work for me anyway. And it wasn't a very important message in the first place. Loyalty oath: The blacklister has a complete legal right, the mailing-list has a complete legal right, blah, blah, blah ...

Sigh. There's perhaps a few thousand scammers and thieves who are ruining email for everyone, literally the entire Internet.

I hate the spam-wars.

November 11, 2002

The Spam Magic Word - "Viagra"

Amusing - in SpamAssassin (version 2.31) the word "viagra" - just mentioning the word, anywhere at all in a message - is almost enough by itself to have a message marked as spam at default testing levels. The word 'casino' isn't as bad, but still problematic.

Now let's see if anything happens to this blog item ...

November 10, 2002

More SpamAssassin, example problem in Telecom digest

I've spent around a third of a day in various volunteer investigations of why some mail gets mistakenly flagged as spam by SpamAssassin. I've done this before, in entries such as SpamAssassin and Crypto-Gram, and SpamAssassin and Michael Moore's latest mailing. And I'm not finished today. In lieu of yet one more analysis, I'll point people to another case, of SpamAssassin and Telecom digest.

Now, SpamAssassin is not a bad tool. Its very openness is one reason someone at least has a fighting chance of figuring out what it is doing.

And I've also spent around another third of a day arguing with someone about spammers and free-speech (putting forth the viewpoint that spammers take too much advantage of civil-libertarians). Another drain. Spam is like locusts, it destroys the environment, via a plague.

November 04, 2002

Amy Wohl, blacklists, and spam-wars

I read about Amy Wohl's spam blacklisting troubles from Ed Felten's blog item, and, curiously, started digging into the problem (forget about Declan's spam troubles in that item, he can take care of himself). I started writing a letter about it, but soon realized that much of what I had to say in a technical sense had already been said in others' comments (see particularly Paul Verhelst's message).

It's amazingly hard for an ordinary person, or even a skilled person without specialization in spam, to figure out what's going on in these cases. Here, what happened is that the ISP which sends out Wohl's newsletter apparently is also suspected of sending spam mailings, and thus blacklisted. So while her newsletter isn't spam, the same mailing service is allegedly a spam source. So some receiving ISPs drop everything coming from that sending ISP (i.e. won't receive any mail from it, whether it's spam or not). It was not trivial to figure out this even was the issue, before even thinking about what one should do about it.
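The mechanics of this kind of collateral damage are simple to sketch in Python. Everything concrete here is hypothetical: the addresses come from the ranges reserved for documentation, and the `accepts` function stands in for whatever check a receiving ISP actually runs. The point is only that the decision looks at where the mail comes from, never at what it says.

```python
import ipaddress

# Hypothetical blacklist entry covering a whole sending ISP (documentation range).
BLACKLISTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def accepts(sender_ip):
    """Return True if mail from this address is accepted; content is never examined."""
    address = ipaddress.ip_address(sender_ip)
    return not any(address in network for network in BLACKLISTED_NETWORKS)


# Hypothetical senders: a legitimate newsletter and a spammer at the same ISP,
# plus an unrelated sender elsewhere.
newsletter_host = "192.0.2.10"
spammer_host = "192.0.2.77"
unrelated_host = "198.51.100.5"

for host in (newsletter_host, spammer_host, unrelated_host):
    print(host, "accepted" if accepts(host) else "rejected")
```

The newsletter and the spammer get exactly the same treatment, which is why the sender sees a rejection that seems to have nothing to do with her own mail.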

Worse, a volunteer service which just provides information on some spam blacklists (and more) got blamed by mistake for supposedly being the provider of the blacklists.

That's a half-page just slogging through the basics. I think I've got it all correct, but I'm not 100% sure.

Sigh. I keep telling myself: Stay out of the spam-wars. I did my service to the Net, got the scars for my medal, doing any more battles will likely not be good for me.

November 03, 2002

Dr. Dobbs Python-URL! newsletter and Spam Detection

It seems the October 28 Dr. Dobbs Python-URL! (a newsletter about the scripting language Python) is getting marked as spam, according to one netnews posting.

I haven't verified this myself, but as a guess, I can conjecture in part why (emphasis added):

Grab an interpreter (it's free), fire up an interactive Python shell, and start playing with the tutorial of your choice (also free). That's the fastest way to get answers to your questions you'll believe. You'll like the answer you discover-unless you're an idiot [wink]." - Tim Peters

I've seen similar examples earlier.

October 01, 2002

SpamAssassin and Michael Moore's latest mailing

It looks like Michael Moore's latest message is another false-positive for spam by SpamAssassin. This might be a repeat of the situation discussed earlier, in SpamAssassin and Crypto-Gram. The mailing reads:

"Michael Moore's Mailing List" ... 10/01/02 1:57PM

October 1, 2002

Dear Friends,

I was going to write you a letter about what a pathetic liar George W, Bush is -- but then I figured, hey, why waste your time telling you something you already know!

[body of message snipped]

If you wish to be be unsubscribed from this mailing list, please click the link below and follow the instructions.

Now, SpamAssassin (version 2.31) sees in part:

SPAM: DEAR_SOMEBODY (-0.7 points) BODY: Contains 'Dear Somebody'
SPAM: DEAR_FRIEND (3.1 points) BODY: How dear can you be if you don't know my name?
(net 3.1-0.7 = 2.4 points)
"Dear Friends," is the problem here.

SPAM: CLICK_BELOW (1.5 points) BODY: Asks you to click below
SPAM: UNSUB_PAGE (2.6 points) URI: URL of page called "unsubscribe"
Well, yes, it had unsubscribe instructions, that's generally considered good practice for a mailing list.

SPAM: DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line

There may be a few other adjustments; my copy of the Michael Moore message is from a website, so it isn't pure with regard to "subject" line and mail headers. Still, the above is basically enough to have the message marked as spam with the default setting (5 points). Now if it gets automatically reported to the distributed spam-killing systems, it'll again get killed from those systems too.

Once more, I hate the spam-wars.

September 24, 2002

More on censorware, spam-killing, and "magic"

Let me make another try at outlining what I was trying to express in my message "porn, spam, "filtering", and magic", where Edward Felten has nicely replied, and in part responded:

The point I was trying to make in my original post is that too often, the same people who ridicule magical thinking about porn blocking, adopt nearly the same magical "reasoning" when the topic changes to spam blocking.

But, no, that's not really the case, in my view. This is an appealing idea, a "cheap irony". However, I don't think it's an accurate description of the reasoning error. It's not viewed as the same problem overall. Because the topic isn't only blocking. It's the theories of why the blocking is being done, and who is doing it, to whom.

The basic idea, way back in the olden days, was that through the use of magic, err, I mean technology, each person could have their own Internet environment perfectly tuned as they wanted it, and with no "social" aspects necessary (here meaning g-guuhh-guh-government, a word one was supposed to gasp and spit when uttering). What was never supposed to be said then, was that for the case of censorware, it was in fact NOT a situation of a person having their own environment, but of a third-party imposing restrictions on another person, said person presumably actively trying to escape. There was a very weird doublethink going on, where the Internet was supposed to be at the same time 1) uncensorable and 2) very easy to control. With the answer depending on whether it was governments or parents doing the controlling.

But with spam, it really is a matter of a person controlling what they themselves want to see. So someone can believe censorware doesn't work because control magic (protection-from-sex) will fail when cast on a resisting third-party, but such control magic (ward-against-spammers) will succeed when being cast on oneself. And this set of beliefs is even more consistent with the old Net ethos, in fact it might be said to define it.

Moreover, it's important to understand that the blocking theory of censorware is different from spam-killing. In general, there's an idea that censorware is "filtering" out "harmful" material, where even one exposure can be profoundly harmful. Whereas with spam, the problem is nuisance. From this viewpoint, censorware must be far more effective than a spam-killer. A censorware program which was theoretically perfect, except for the flaw that the subject could find just a single unblocked sex site each day, would be near useless. Whereas a spam-killer which was theoretically perfect except for the flaw that each user had to deal with just a single spam slipping through each day, would be a great help.

Fundamentally, censorware is a content issue, while spam is an amount issue.

So it's not inconsistent for someone to think censorware can't work to the level needed, but spam-killing can do so, because of this content-vs-amount difference.

I believe the no-technical-solution-to-a-social-problem flaw is deeper. The "cheap irony" doesn't apply because people aren't necessarily reasoning inconsistently when they think censorware will fail because it's third-party control focused on "harmful" content, while spam-killing can work because it's first-party control focused on level of nuisance. When viewed this way, there's a world of difference.

However, the problem is that in spam, the spammer wants to escape the control of the program! That's where the social vs. technical fallacy lies. The attack is coming from the "other side" of the system.

I do believe the idea of a simple technical solution to spam is almost certainly wrong, though, just as in censorware. Because in both situations there are parties who want to break the technical system, from some of the strongest motivations of humanity (in the case of censorware - sex, while in the case of spam - money).

Posted by Seth Finkelstein at 12:04 PM | Followups

porn, spam, "filtering", and magic

Edward Felten kindly mentions my message SpamAssassin and Crypto-Gram and remarks in part

I'm amazed at the number of people who scoff at the feasibility of automated Web-porn filtering, while simultaneously putting their faith in automated spam filtering.

Uh-oh. Before I get too deeply into the spam-wars, I'd better say something about the word "filtering" here. I dislike the word "filtering", because it's used for several different situations, which are fundamentally distinct problems:

  • censorware : material which a person wants to read, but a third-party does not want that person to read. The third-party typically feels this material is harmful or toxic to the reader.
  • spam-killing : material which a person does not want, but a third-party wants them to read. The third-party is often deceptive and fraudulent, trying to trick the reader into viewing the material, knowing the readers would otherwise not want to see it.
  • personalization : material which a person wants to read, and the third-party is assisting them in terms of sorting out material from suppliers who are not trying to impose themselves on the reader.

The distinction between keeping people from something they want to read, and forcing on people something they don't want to read, makes the problems architecturally different. Stamping out wanted sexual material isn't quite the same problem as keeping a flood of unwanted ads out of one's face. Nobody thinks reading just one generic spam will cause them severe developmental harm. So the comparison isn't quite so simple.

I think the real divide is more general: some people believe there's a technical solution to a social problem, and others believe this can't be done. This holds whether the problem is content prohibited by a third party or material unwanted by the reader. I'm in the can't-be-done camp (by purely technical means), and I deride the other side as believers in magic.

Some people on that other side tend to get v-e-r-y upset if you write anything which implies that the magic doesn't work. In part, I think, that's because they've invested themselves in the idea that "Magic is the solution!". And if you say it isn't, well, maybe you're a crabby mundane person who is jealous of the happy magic-workers. Or perhaps you even want people to suffer, because, turning it around, you're invested in the idea that "Magic is NOT the solution!". And then there's the argument that if people want to believe in magic, who are you to tell them such a belief is wrong? It's their affair whether the spells they try to cast work or not.

Again, the spam-wars scare me.

Maybe I'm hypersensitive to these arguments about the word "filtering". But I still have the scars from the censorware wars.

September 22, 2002

SpamAssassin and Crypto-Gram

Seth David Schoen has noted:

Schneier's Crypto-Gram is getting flagged as spam by Razor. The reason is that some spam-detecting software will try to automatically detect spam and then automatically report it. So somebody's SpamAssassin mistakenly concludes that a copy of Crypto-Gram is spam and reports it to Razor, and this happens a few times over; now everyone who uses Razor will automatically be advised that Razor considers Crypto-Gram to be spam!

I've been looking at SpamAssassin, and indeed, it does flag the latest Crypto-Gram Newsletter as spam, given the default threshold. Here are the tests being triggered (information given by SpamAssassin) and why each one fires (information not given by SpamAssassin, but easy to find by simple investigation, since it's open source). This is from version 2.31:

SPAM: DOUBLE_CAPSWORD (1.1 points) BODY: A word in all caps repeated on the line

"Boolean functions of AES, which could possibly be used to break AES. But"
"called BES that treats each AES byte as an 8-byte vector. BES operates on"
"A new company, PGP Corp., has purchased PGP from Network Associates."

SPAM: PORN_10 (0.6 points) BODY: Uses words and phrases which indicate porn

"by pedophiles, child pornographers, cultists, occultists, drug pushers and"

SPAM: ONE_HUNDRED_PC_FREE (3.4 points) BODY: No such thing as a free lunch

"There's a new Twofish C library, written by Niels Ferguson. The main differences with existing code available is that this one is fully portable, easy to integrate, well documented, and contains extensive self-tests. And it's 100% free."

SPAM: PORN_3 (0.5 points) Uses words and phrases which indicate porn

(?i-xsm:\bporn) : "by pedophiles, child pornographers, cultists, occultists, drug pushers and"
(?i-xsm:\bsex+) : "with the sexual words you'd expect -- I won't print them because too many"
(?i-xsm:\blive) : "complex machinery. Their primary duty is to protect the lives and"
(?i-xsm:\baction) : "any criminal or civil action for disabling, interfering with,"
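
To see why such innocent lines trip the PORN_3 test, here is a minimal Python sketch that applies the four pattern fragments quoted above to the four Crypto-Gram lines. It is only an illustration: SpamAssassin itself is written in Perl, so I've dropped the Perl (?i-xsm:...) wrapper and let re.IGNORECASE stand in for the "i" flag, and the names FRAGMENTS, LINES, and matched_text are my own, not anything from SpamAssassin.

```python
import re

# Pattern fragments as quoted in the report above, without Perl's
# (?i-xsm:...) wrapper. Each has a word boundary only at the START,
# so it matches any word that merely begins with the fragment.
FRAGMENTS = [r"\bporn", r"\bsex+", r"\blive", r"\baction"]

# The Crypto-Gram lines paired with each fragment in the report.
LINES = [
    "by pedophiles, child pornographers, cultists, occultists, drug pushers and",
    "with the sexual words you'd expect -- I won't print them because too many",
    "complex machinery. Their primary duty is to protect the lives and",
    "any criminal or civil action for disabling, interfering with,",
]


def matched_text(fragment, line):
    """Return the text the fragment matched in the line, or None."""
    match = re.search(fragment, line, re.IGNORECASE)
    return match.group(0) if match else None


hits = [matched_text(f, l) for f, l in zip(FRAGMENTS, LINES)]
for fragment, hit in zip(FRAGMENTS, hits):
    print(fragment, "->", hit)
```

None of the fragments has a trailing word boundary, so "pornographers", "sexual", and "lives" all count as hits, and so does an entirely legal "civil action".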

So, more than 5 points ... SPAM (at default levels)
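
To make the arithmetic explicit, here is a minimal Python sketch that adds up the four scores reported above and compares the total against a threshold of 5 points, the default level referred to above. The names SCORES, THRESHOLD, and is_spam are mine for illustration, not SpamAssassin's, and treating a total exactly equal to the threshold as spam is an assumption.

```python
# Scores exactly as listed in the SpamAssassin 2.31 report above.
SCORES = {
    "DOUBLE_CAPSWORD": 1.1,
    "PORN_10": 0.6,
    "ONE_HUNDRED_PC_FREE": 3.4,
    "PORN_3": 0.5,
}

# Default threshold of 5 points, per the post. Counting a total that
# equals the threshold as spam is an assumption of this sketch.
THRESHOLD = 5.0

# Round to one decimal place to hide floating-point noise in the sum.
total = round(sum(SCORES.values()), 1)
is_spam = total >= THRESHOLD

print(f"total = {total}, threshold = {THRESHOLD}, spam = {is_spam}")
```

Note that the "100% free" hit alone supplies well over half of the total, so a single offhand phrase does most of the damage.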

This is not good.

September 08, 2002

The song "Spam" by "Weird Al"

The song "Spam" wasn't the origin of the term for the kind of email. But it's been running through my mind today.

"Spam in the place where I live (have some more) ...
Spam in the place where I work (you're obsessed) ...
Spam any place that you are (ham and pork) ..."

