SmartFilter - Slashdot Article

Original source: http://slashdot.org/yro/00/02/14/2316202.shtml

Editorial note: Some (but not all) of the links that are broken in the original have been fixed in the text below.

Censorware and Memetic Warfare

Posted by jamie on Tuesday February 15, @12:25PM
from the ceci-n'est-pas-une-meme dept.
I'm halfway through Susan Blackmore's book "The Meme Machine," and it's rekindled my interest in meme dispersal. In a memetic sense, the battle over filters in the Holland library is just one of implanting the right ideas in enough people's minds by the day of the vote. Here's a look at one of the more annoying memes the opposition is using: a lie about the results of my very own organization. Click for more.

Everyone's familiar with the term "meme" by now, so I don't have to explain that it's the unit of idea transmission. The struggle over Internet filters, or any other conflict where ideas, facts, opinions, and outlooks collide, is memetic in nature: it's memetic warfare.

All's fair in war, supposedly, but I'm someone who has been infected by the meme that we should all fight fair, even - especially - in the war of ideas.

Will the "fight fair" meme become popular in the long run? I hope so. But the way I see it, that will only happen if it is more successful at reproducing than its alternative: "fight dirty." In the long run, it doesn't matter what's right, or what's good, or what benefits us humans the most. The memes just spread because they're good at spreading.

In early 1999, my friend (now Slashdot writer) Michael Sims started a long process to obtain some Web logs from the state of Utah. Internet access for schools and libraries across the state was provided by a single network, and all their Web traffic went through proxies that had the same blocking software running. Their Web logs were a gold mine of data, showing both blocked and unblocked accesses. When users were blocked from something, the logs showed what category it was blocked in.

Our group, the Censorware Project, had been looking for a real-world test case of this software. Michael did a tremendous amount of work to file the papers, get permission to get the logs, have them delivered, gather them, and analyze them. He then wrote a brilliant report (the rest of us helped too).

What this let us do was see how blocking software's errors show up in the real world. We had known for years that the software has many mistakes in its blacklists, in every product we'd studied. But we had no data on how that affected users.

When all the data was crunched, two numbers surprised us. First, the amount of material blocked was quite small: about 0.6%. People were interested in things besides pornography on the internet. Who would have thought.

Second, just looking at the wrong blocks that we were able to find, the proportion was quite high: about one block in every 20 was Constitutionally protected material. That's a minimum - the minimum we were able to confirm. All in all, we identified over 5,000 occasions when people were blocked from reading protected material (totalling 300 unique Web sites).

Most measures of blocking software effectiveness focus on how much pornography it blocks. We weren't able to test that because we couldn't look through the 99.4% of unblocked material - over 53 million URLs. Just too much data. But we did learn that, in Utah, 5% of the time, when the software said "you can't look at that," it was just plain wrong.

Ninety-five percent accuracy might sound like a nice high figure to base a good meme around. Who could argue with a number like 95%? But consider what this means for the 300 Web sites in question: each of them was blocked from being read by a great many public institutions in the state of Utah.

And the First Amendment protects publishers, not readers: it's freedom of the press, not freedom to read the press. When you're blocked from reading your favorite author, you might be annoyed, but if the censor were taken to court, the injured party would be the author.

This is exactly what we fought against the Communications Decency Act for. Except, in many ways, censorware is worse. If your site is one of the 5% that's wrongly blocked, you won't know it. Our government will stop people from reading what you have to say even if your site is completely innocent (like the Candy Land website), and nobody will bother to notify you. You won't ever know.

At least with the CDA, you'd have gotten a letter from the prosecutor telling you your site was censored - and nobody, but nobody, would ever have been censored for publishing the Bible.

(Yes, the Bible was one of the banned books we found in Utah, along with the Declaration of Independence, the U.S. Constitution, etc. That kind of thing makes good memes.)

Michael put a lot of work into our report, and I even contributed a little, so I'm a little protective of that 5% meme. Which is why it was so jarring to open up a press kit distributed by the Family Research Council, last week, and find our work, cited in black and white, as support for the figure: "one in a million."

That's right, the exact same report which found one bad block in every twenty is now being cited as proving that Web sites are misblocked "one time in a million."

Now that's a good meme. "One in a million" sticks with you. It isn't backed up by any of the facts, but despite that handicap - or perhaps partly because of it - it has thrived.

It was first invented by a fellow named David Burt, who read our report not very carefully, and then decided he was going to do a little numerology of his own.

The first thing he did was ignore all the bad blocks we'd found that he thought were perfectly appropriate. For example, we'd found that the homepage of the band "The Offspring" was wrongly blocked - you may remember their songs from the fall of 1998. "I'm just a sucker with no self-esteem," and so on. (You're humming it now. Catchy meme.)

David Burt decided that The Offspring deserved to be blocked, and to illustrate why, quoted nine words from their Web site:

"These songs have ideas PLUS drugs, sex and ass-kicking"

He also decided it was OK to block BaywatchTV.com, BirthControl.com, the Starr Report, the Yahoo category "Society and Culture: Romance," and Glamour magazine. It was OK to block a page on the NASA Web site about a crackdown on hackers, because it "discusses hacking techniques." Both takedown.com and 2600.com should be blocked, he says, for the same reason. A fellow whose homepage includes a link to a PGP FAQ - no code or binaries - should be blocked for containing "cryptographic software."

Did I mention this man is a librarian?

After trimming out all the fat from our list, he got it down from over 300 sites to just 64. Of course, this was the list of unique sites. If he'd had all our numbers, he would have known that his changes affected our 5% figure by only about 0.1%, because the large majority of blocked sites are blocked only a few times.

There's some other nonsense he tried, like saying that we were deceitful to ignore blocked banner ads because they were surely all pornographic. In fact, four of the five top blocked ad sites were perfectly ordinary, and counting ads would have made our numbers more impressive, not less.

But his main meme was the number. Armed with his new figure "64", he performed a division by the largest number in our report, which was 54,000,000. Kind of like dividing apples by hydrogen. Of the 54,000,000 URLs, only 29% were page views; only 0.56% of those were blocked; and the previously-mentioned 5% of those were blocked incorrectly. From there he switched from blocks to unique blocks, cutting the actual figure of 5,000 down to his list of 64.

Then, dividing 64 by the original 54,000,000, he got 1.18 in a... well, for the meme's sake he got one in a million.
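
The mismatch is easier to see with the arithmetic written out. What follows is a rough sketch in Python that uses only the rounded figures quoted above (54,000,000 URLs, 29% page views, 0.56% blocked, about 5% wrong). The variable and function names are mine, and because the inputs are rounded, the result will not reproduce the report's exact counts.

```python
# Rounded figures quoted in this article; the exact counts are in the report.
TOTAL_URLS = 54_000_000       # every logged URL, not just page views
PAGE_VIEW_SHARE = 0.29        # fraction of URLs that were page views
BLOCKED_SHARE = 0.0056        # fraction of page views that were blocked
WRONG_BLOCK_SHARE = 0.05      # fraction of blocks confirmed wrong (a minimum)
BURT_UNIQUE_SITES = 64        # Burt's trimmed list of unique sites


def funnel():
    """Walk from all URLs down to wrongly blocked page views."""
    page_views = TOTAL_URLS * PAGE_VIEW_SHARE
    blocked = page_views * BLOCKED_SHARE
    wrong = blocked * WRONG_BLOCK_SHARE
    return page_views, blocked, wrong


def burt_rate_per_million():
    """Unique sites divided by all logged URLs: two unrelated units."""
    return BURT_UNIQUE_SITES / TOTAL_URLS * 1_000_000


if __name__ == "__main__":
    page_views, blocked, wrong = funnel()
    print(f"page views:         {page_views:,.0f}")
    print(f"blocked page views: {blocked:,.0f}")
    print(f"wrong blocks:       {wrong:,.0f}")
    print(f"wrong per million blocks:      {WRONG_BLOCK_SHARE * 1_000_000:,.0f}")
    print(f"Burt's 'rate' per million URLs: {burt_rate_per_million():.2f}")
```

With these rounded inputs the funnel lands in the low thousands of wrong blocks, the same ballpark as the roughly 5,000 we counted. The numerator counts unique sites, the denominator counts every logged URL, and that is how one wrong block in twenty becomes "one in a million."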

Publishing this in April of 1999, David Burt ignored our corrections. Despite our offering all the raw data on CD-ROM, for the cost of the media, he just accused us of lying.

You can't say anything to that, without getting into a yes-you-are no-we're-not. We'd put out two press releases about this already. We told him to order the CD-ROMs and check for himself. Then we moved on.

But his meme began to spread. In June, the company that made the blocking software pulled the same trick, reported the results to Sen. John McCain - and then issued a press release about it. Our group was now cited as supporting their software by proving its accuracy. Since the numbers were so big anyway, they just used the 300 figure and called it an "accuracy rate of 99.9994%."

A group I've never heard of, the American Decency Association, now points to our study and says: "Filters Work!" Their source is another group I've never heard of, the Michigan Decency Action Council. Word gets around.

So when I opened up the report "Internet Filtering and Blocking Technology," published by the Family Research Council and distributed at their Holland presentations, I was not surprised when I found the same meme on pages 9 and 14. (I was surprised to see them divide 64 into 54,000,000 and get 6 parts per million. But as long as they've blown the numbers so badly, a little botched division doesn't make any difference.)
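
For anyone who wants to check the division, here is a small Python sketch. It assumes the "99.9994%" figure came from dividing the 300 unique sites by the 54,000,000 URLs, as described above; the function name is only illustrative.

```python
def parts_per_million(count, total):
    """Express count/total as parts per million."""
    return count / total * 1_000_000


TOTAL_URLS = 54_000_000

ppm_300 = parts_per_million(300, TOTAL_URLS)  # the unique-site figure
ppm_64 = parts_per_million(64, TOTAL_URLS)    # Burt's trimmed figure

# One part per million is 0.0001 percent.
accuracy_300 = 100 - ppm_300 / 10_000

if __name__ == "__main__":
    print(f"300 / 54,000,000 = {ppm_300:.2f} parts per million")
    print(f" 64 / 54,000,000 = {ppm_64:.2f} parts per million")
    print(f"'accuracy' from the 300 figure: {accuracy_300:.4f}%")
```

On these inputs, 6 parts per million is what 300 gives you after rounding; 64 gives a little over 1.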

I talked to two of the FRC techies about this and tried to explain what was wrong with the numbers. I got some mild interest. Will the FRC correct and reprint this report? Of course not. Admitting that David Burt fudges numbers might be a bad tactical move. The concluding two sections of the report have 31 footnotes, 28 of which reference no one but Mr. Burt.

I choose to be an optimist about the marketplace of ideas. I believe that truthful memes will proliferate in the long run, because enough people's brains select for truth.

But in the meantime, it's frustrating when my team takes below-the-belt punches from the guys who don't care about what's true.

I don't expect everyone reading this to share my memeplex on this issue. I know from reading the comments that many Slashdot readers think censorware in libraries is a good thing, and that's fine. In fact, I'll bet many of you are grinding your teeth that I keep using the word "meme" so damn much. That's fine too.

All I ask is that, when your memes start arguing with my memes, you make them fight fair. It's only right.