Infothought: infothought Archives

July 01, 2013

Blog over. Infothought RIP 2002 - 2013

Executive Summary ("tl;dr"): It hasn't worked. Google changes were the last straw. Blog over. [sad face image]

[Disclaimer - this is NOT a disguised beg for links. It wouldn't solve any of the structural problems outlined below.]

It's been clear for a long time I've considered blogging to have been a failure, for me. I'll skip reciting again my delusion. In sum, while I treasure the occasional indication that someone has enjoyed something I've written, the practical matter is overall, the net effect on my life is that I have much more to lose than I have to gain. I'm reaching the same tiny audience over and over, and squeaking in a basement does nothing against those who shout from the rooftops. More importantly, protesting from below has been sadly useless when being trashed from the top.

What kept me from ultimately abandoning the blog before was that it'd likely be irrevocable. Once I made such an announcement, there would be no going back. The audience would be gone, never to return. Did I really need to do that? Was it precipitous? Instead, I decided to just limp along, posting every once in a while in order to keep active status in feed readers and similar.

But the readership numbers are now going to be decimated anyway, due to the Google Reader shutdown. While there are other feed readers trying to fill the void, it's well-known that such shifts almost always lead to a big drop. Further, recent Google algorithm changes seem to be unfavorable. That's a complicated topic involving details like "over-optimization" and "negative SEO" and "[codename] Penguin update", etc. However, the key aspect is that there are now many more ways for a small blog to run afoul of Google even by mistake or just as collateral damage in the ongoing web spam-war. I even wonder if Google would _de facto_ punish my site if I continued blogging, since the constant addition of pages which have no links/tweets/likes/plusones/[attention!] might be regarded as a lowering of "quality" (remember, for all the hype, Google is not good at making human-level distinctions between thoughtful material and ad-bait - the proof of that is evident in the results of many searches. And if it's relying on social signals such as the list above, I don't do well there).

And those are the last straws. Let me re-emphasize, it would be wrong to say Google killed my blog. It's more along the lines of, after a long, protracted, lingering decline, Google finally pushed it through death's door.

Note Twitter is no answer. While I've had a Twitter account for a while, if I were to spend much time on Twitter, it strikes me that I'd be making the same mistake as with blogging (anti-strawman - this is for my circumstances, which I do not claim apply to every person categorically). I keep thinking: Not again, not another rat-race on a hamster-wheel. I don't want to get on that treadmill, of endlessly trying to find interesting and entertaining items to convey, attempting to gain "followers". I can't win at that game, and I don't want to play. Worse, it's another "power law curve" environment that structurally favors bullying, as those "high up" can broadcast personal attacks against anyone "below" them, with no way even for the target to effectively reply. It's not for me.

I've pointed out the cruelty of blog-evangelism many times in the past, how it preys on people's desire to be heard. And I don't think I'm immune from that weakness, or the "sunk costs" cognitive fallacy. But there comes a time to recognize when a project has failed. And to stop.

Posted by Seth Finkelstein at 06:47 PM | Comments (8)

February 22, 2010

My full replies for Pew Research: The Future of the Internet IV (and Google vs Stupid)

Pew Research Center recently released their survey on The Future of the Internet IV:

Respondents to the fourth "Future of the Internet" survey, conducted by the Pew Internet & American Life Project and Elon University's Imagining the Internet Center, were asked to consider the future of the internet-connected world between now and 2020 and the likely innovation that will occur.

I was one of the survey participants. I ended up with one quote in the report, in the section about reading and writing. The marquee item was querying about Does Google Make Us Stupid?, and I suppose it's just as well that I didn't get quoted there. I remind myself that Google doesn't need me to defend it.

The full responses I wrote for all the Pew survey questions are below:

# Will Google make us stupid?

The article is one of a long line that presents technology as somehow destructive to the essence of humanity (i.e. "making us stupid"). Centuries ago, this was phrased as corruption of the soul. The modern way of expressing it is pseudo-neurology - "Thanks to our brain's plasticity, the adaptation occurs also at a biological level.". It is an exceptional specimen in that it itself references predecessors of this type, having similar objections to writing or the printing press. But the reason it's part of this survey is that it's tapping into the fears and anxieties of many people who find technological advancement frightening, for changing beliefs about what machines can and cannot do ("as we come to rely on computers to mediate our understanding of the world, it is our own intelligence that flattens into artificial intelligence.").

I don't want to sound blindly optimistic, or be too hard on the piece. There are important points about social values being made. But the cost of getting attention for those points is allying them with a framework which appeals to a very reactionary mindset.

# Will we live in the cloud or the desktop?

This is The Return Of The "Thin Client". Every few years, some company gets the bright idea that simple access to high-powered back-end processing is the wave of the future - and of course, the company is going to get rich by providing those clients and matching back-end processing. It's great in theory, not so great in practice. Maybe This Time It's Different, and it's finally going to happen. But network delays and outages have always killed this idea in practice.

# Will social relations get better?

I voted positive, but I really don't like the phrasing of the question. Consider this: "In 2020, when I look at the big picture and consider my personal net worth, savings, home value, and other wealth, I see that the [modern banking system] has mostly been a [positive|negative] force on my financial world. And this will only grow more true in the future.". There's much material glossed over by such a question.

Note the population surveyed might not be the best sample. In my hypothetical query above, asking it of investment bankers will give a different distribution than asking it of foreclosed homeowners.

It's a big topic. Just think of it as new ways to meet - AND EXPLOIT - human needs.

# Will the state of reading and writing be improved?

For heaven's sake, it's clear NOW that the Internet has enhanced and improved reading, writing, and the rendering of knowledge. You have to know how to read, it encourages writing, and people can exchange knowledge. Don't confuse this with the business models behind serious publishing, encyclopedias, and universities. The future of books is tied into whether there is a social/business model that supports writing for intellectual content rather than as marketing brochures or advertising-bait.

# Will those in GenY share as much information about themselves as they age?

It should be blatantly obvious that getting married and having kids reduces both the inclination and opportunities for "widespread information sharing".

"Not a soul down on the corner
That's a pretty certain sign
That wedding bells are breakin' up
That old gang of mine"

# Will our relationship to key institutions change?

"Popularity Data-Mining Businesses Are Not A Model For Civil Society"

There's a whole cottage industry now of hucksters trying to sell governments, businesses, non-profits, on supposed Internet magic pixie dust that makes citizens and consumers work for the organization for free, and inversely, peddling snake-oil to powerless people via a sales-pitch that it'll give them influence against powerful organizations. Fundamentally, these people are speaking nonsense, which should be evident to anyone who has ever heard volunteerism promoted as a solution to lack of funds.

# Will online anonymity still be prevalent?

At least in the Western world, there are very strong legal protections for the right to act anonymously, at least in terms of political speech. It would require an extreme social shift to remove them. It could happen, but that would mean a major upheaval with far-reaching implications.

# Will the Semantic Web have an impact?

The Semantic Web is like Artificial Intelligence. It's always just around the corner in theory, and disappointing in practice.

# Are the next takeoff technologies evident now?

It's very difficult to figure out what'll take off in the real world. Everything from technological details to market conditions to social trends has to come together, which means there are very few right paths among many wrong ones.

# Will the internet still be dominated by the end-to-end principle?

I can't explain this all in a comment box, but ... the Internet does not really work the way the writer of the question thinks it works. Trying to understand network management in the current political climate is worse than debating national health care systems (that is, there's extensive distorted, agenda-driven, misinformation).

Posted by Seth Finkelstein at 03:38 PM

June 15, 2008

Yahoo Search Engine Spiders Directories From File Paths

I just did an experiment and confirmed that the Yahoo spider will try to retrieve the containing directory of a file it finds. That is, if it sees a URL like http://example.com/stuff/jump.avi , in addition to retrieving that file, it'll try the URL http://example.com/stuff/ . Google won't do that, though (nor will Microsoft). It's easy to test this yourself if you have a website where you can see server logs. This practice has some significant implications for people who claim that trying truncated URLs is improper behavior and even possibly unauthorized access.
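For anyone who wants to repeat the test, here is a minimal Python sketch of the two pieces involved: deriving the truncated URL from a file URL, and spotting the pattern among the paths a single crawler requested. The function names are hypothetical, and it assumes you have already pulled one crawler's request paths out of your server log by whatever means your log format needs.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_directory_url(url):
    """Return the URL of the directory that contains the file `url` names."""
    parts = urlsplit(url)
    directory = posixpath.dirname(parts.path)
    if not directory.endswith("/"):
        directory += "/"
    # Query string and fragment are dropped; only the path is truncated.
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


def truncated_fetches(requested_paths):
    """Given the paths one crawler requested, return the directory paths
    that are parents of files the same crawler also requested."""
    requested = set(requested_paths)
    hits = set()
    for path in requested:
        if path.endswith("/"):
            continue  # already a directory request
        parent = posixpath.dirname(path)
        if not parent.endswith("/"):
            parent += "/"
        if parent in requested:
            hits.add(parent)
    return sorted(hits)


if __name__ == "__main__":
    print(parent_directory_url("http://example.com/stuff/jump.avi"))
    # Hypothetical request paths from a single crawler's log entries.
    print(truncated_fetches(["/stuff/jump.avi", "/stuff/", "/index.html"]))
```

A directory request that shows up alongside a file request is only suggestive, of course: the directory may also be linked from somewhere, so a clean test uses a directory that is linked nowhere.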

Posted by Seth Finkelstein at 03:50 PM

June 13, 2008

How "alex.kozinski.com" worked (Judge Alex Kozinski "Porn Site" Follow-up)

[Original research! Not an echo!!!]

Following up the "Porn Site" of Judge Alex Kozinski kerfuffle, and all the discussion of private vs. public norms, I've been trying to figure out exactly how the web site was configured. We know the controversial material was in a directory called "stuff", hence it was http://alex.kozinski.com/stuff/

I've found a key piece of evidence. In June 2004, Alex Kozinski sent a public letter in HTML, humorously nominating himself as part of a "Judicial Hottie contest":

Courthouse Forum: The Hot. Alex Kozinski

This letter contains various links, and one sentence in particular is:

* I bungee jump. [Ed. note: Click on the link to play this very fun little video clip--and make sure your sound is turned on!]

There, "bungee jump" is linked to: http://alex.kozinski.com/stuff/jump.avi

Again, that's the key directory.

This shows that Judge Kozinski knew the general public could retrieve specific material from that directory, and in at least one case, invited the public to do so.

I speculate that he did not know that his server was configured with a feature which lists all files in that directory when the directory name was given. That is, he may have thought that the only way to know what files were there, was if one was given filenames.

Moral: Security By Obscurity - Isn't.

Note regarding the search engine restriction file "robots.txt":

Yahoo had a cached copy of that directory (seems uncached now) with an entry at least as late as:

25.minutes.to.go.wmv 28-May-2008 12:18 6.3M movie

This strongly indicates there was no search engine prohibition for that directory. Further evidence is at the Internet Archive, which shows many versions, e.g.:

http://web.archive.org/web/20070629190035/http://alex.kozinski.com/robots.txt

having only entries:

User-agent: *
Disallow: /jurist-l/
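
To see what those two lines permit, here is a small sketch using Python's standard `urllib.robotparser` module. The robots.txt text is the archived one quoted above; the `allowed` helper and the crawler name "ExampleBot" are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The archived robots.txt entries quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jurist-l/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """True if a crawler honoring this robots.txt may fetch `path`.
    "ExampleBot" is a hypothetical crawler name."""
    return parser.can_fetch(agent, "http://alex.kozinski.com" + path)


if __name__ == "__main__":
    for path in ("/stuff/", "/stuff/jump.avi", "/jurist-l/"):
        print(path, allowed(path))
```

Only the /jurist-l/ tree is excluded. A well-behaved spider reading this file has no instruction to stay out of /stuff/.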

[Disclaimer: Do read the letter. Alex Kozinski is impressive and a very cool guy, and those who are trying to have him removed from his position because of this tempest-in-a-teapot should avail themselves of some of the acts portrayed in the files in that directory]

[Update - see my column "Don't blame the judge for falling through the web's open doors" ]

Posted by Seth Finkelstein at 07:18 PM | Comments (4)

April 06, 2005

Reporters Without Borders nominates freedom blogs - including Infothought!

"Vote for freedom of expression blog award-winners!"
http://www.rsf.org/article.php3?id_article=13098

International April 6, 2005

Reporters Without Borders is calling on Internet-users to vote online for award-winners from among 60 blogs defending freedom of expression. There are six categories : Africa and the Middle East, the Americas, Asia, Europe, Iran and International.

...

These awards will be in tribute to webloggers who defend free expression and sometimes pay heavily for it. ...

Now it is up to Internet-users to decide. They may only vote for one blog per geographical category (The International category is of blogs that have a general interest in freedom of expression on the Internet).

Voting closes on 1st June 2005 and the prize-winners will be announced two weeks later.

To register a vote, go to : http://www.globenet.org/rsf/voteblog.php?lang=en

[And in particular, to vote in the International category, where Infothought is nominated, go to:
http://www.globenet.org/rsf/voteblog.php?cat=5&lang=en ]

Posted by Seth Finkelstein at 08:41 AM

January 13, 2005

CBS Report file has been modified! Cut and Paste now prohibited!

Ernest Miller noticed that he could no longer cut-and-paste from the CBS report, and asked me to investigate. He's right. The report PDF file has been modified since its release. This can be verified by any tool which will display the internal information of a PDF file.

http://wwwimage.cbsnews.com/htdocs/pdf/complete_report/CBS_Report.pdf

HTTP information (emphasis added below):

HTTP/1.0 200 OK
Server: Apache
Last-Modified: Wed, 12 Jan 2005 21:24:24 GMT
ETag: "b8626f-abd1c-6cea7200"
Accept-Ranges: bytes
Content-Length: 703772
Content-Type: application/pdf
Date: Thu, 13 Jan 2005 19:25:27 GMT

Current CBS Report file, PDF internal information (from the Linux tool "pdfinfo")

Title:          Microsoft Word - DC-685241-v10-Final_CBS_Report__sent_to_Lou_12_20_.DOC
Author:         demartpe
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Wed Jan 12 16:00:24 2005
Tagged:         no
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
Page size:      612 x 792 pts (letter)
File size:      703772 bytes
Optimized:      yes
PDF version:    1.4

Earlier CBS Report file, PDF internal information (from the Linux tool "pdfinfo")

Title:          Microsoft Word - DC-685241-v10-Final_CBS_Report__sent_to_Lou_12_20_.DOC
Author:         demartpe
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Fri Jan  7 19:17:44 2005
Tagged:         no
Pages:          234
Encrypted:      no
Page size:      612 x 792 pts (letter)
File size:      703330 bytes
Optimized:      yes
PDF version:    1.4

Note the difference in the "Encrypted:" field!

However, the text itself does not seem to have been altered.
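
The comparison between the two "pdfinfo" listings above can also be done mechanically. This is a minimal Python sketch, not the procedure used for this post: it parses "pdfinfo"-style "Key: value" text into dictionaries and reports the fields that differ. The helper names are hypothetical, and the sample text is an excerpt of the listings above rather than fresh tool output.

```python
def parse_pdfinfo(text):
    """Turn pdfinfo-style "Key:   value" lines into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Excerpts of the two listings above.
EARLIER = """\
ModDate:        Fri Jan  7 19:17:44 2005
Pages:          234
Encrypted:      no
File size:      703330 bytes
"""

LATER = """\
ModDate:        Wed Jan 12 16:00:24 2005
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
File size:      703772 bytes
"""

if __name__ == "__main__":
    diff = changed_fields(parse_pdfinfo(EARLIER), parse_pdfinfo(LATER))
    for field, (before, after) in diff.items():
        print(f"{field}: {before!r} -> {after!r}")
```

Identical fields (such as the page count) drop out, leaving the modification date, the file size, and the "Encrypted:" permissions as the visible changes.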

Update 4:15 pm EST: Ernest Miller sends word that the version of the report on the CBS law firm site has also been modified. I've confirmed this (though the text again does not seem to have been altered).

http://www.klng.com/downloads/CBS_Report.pdf

HTTP information (emphasis added below):

HTTP/1.1 200 OK
Content-Length: 690313
Content-Type: application/pdf
Last-Modified: Tue, 11 Jan 2005 20:16:48 GMT
Accept-Ranges: bytes
ETag: "8564b67a1af8c41:e1b"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Thu, 13 Jan 2005 21:19:25 GMT

PDF internal information (from the Linux tool "pdfinfo")

Title:
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Tue Jan 11 15:14:40 2005
Tagged:         no
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
Page size:      612 x 792 pts (letter)
File size:      690313 bytes
Optimized:      yes
PDF version:    1.5
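
One further cross-check is to line up the HTTP "Last-Modified" header with the PDF's internal "ModDate". The Python sketch below does that under a loud assumption: that the "pdfinfo" dates shown here are US Eastern Standard Time (UTC-5), which the listings themselves do not state. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# ASSUMPTION: the pdfinfo ModDate values are in EST (UTC-5).
# The listings themselves carry no time zone.
EST = timezone(timedelta(hours=-5))


def server_lag(last_modified_header, pdf_mod_date):
    """Time between the PDF's internal ModDate and the server's
    Last-Modified timestamp for the same file."""
    served = parsedate_to_datetime(last_modified_header)
    edited = datetime.strptime(pdf_mod_date, "%a %b %d %H:%M:%S %Y")
    return served - edited.replace(tzinfo=EST)


if __name__ == "__main__":
    # Values copied from the HTTP and pdfinfo listings in this post.
    print("cbsnews.com:", server_lag("Wed, 12 Jan 2005 21:24:24 GMT",
                                     "Wed Jan 12 16:00:24 2005"))
    print("klng.com:   ", server_lag("Tue, 11 Jan 2005 20:16:48 GMT",
                                     "Tue Jan 11 15:14:40 2005"))
```

If the time zone assumption holds, both files reached their servers within minutes of being re-saved. If it doesn't, the gaps shift by whole hours, so treat the result as a consistency check rather than proof.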

Update Fri Jan 14 14:45 EST 2005
Sisyphean Musings has CBS's explanation:

To allow copying of text to applications such as Word would allow anyone to create a modified or falsified report, which we cannot allow. The law firm hired by the Independent Panel insists that the report not be available in a format that can be altered, and we agree with that decision.

This speaks for itself.

Posted by Seth Finkelstein at 02:53 PM | Comments (9) | Followups

August 15, 2004

BSA Weasel == "Beagle Boys"!

The Business Software Alliance (BSA) has announced an "anti-piracy" site, with a kids' mascot ferret, and a contest to call it a name.

The BSA weasel creature reminded me of something I'd seen before. Something shady, disreputable, criminal. Finally, I remembered! The BSA weasel looks like he's a member of a criminal gang in Walt Disney Comics, the "Beagle Boys":


BSA Weasel	Beagle Boys

Look at the family resemblance. Same shirt. Same pants (gang colors?). Same squinty, hooded, eyes. Same toothy smirk. He's even wearing something on his chest, which, making allowances for updating to the modern age, might be a Beagle Boys identification patch (more evidence of gang affiliation!).

Traditionally, the Beagle Boys were after Scrooge McDuck's Money Bin. They must be diversifying. There's certainly a big money bin around the Business Software Alliance, one to rival Scrooge McDuck. So the gang has obviously gotten one of their younger members to convince the BSA executives to take him into the organization (using his weasel-skills - thus explaining what would otherwise be evident stupidity in having such a mascot). While everyone is distracted at the official contest ceremony, the rest of the gang will attempt to pull a heist. Classic plot.

It all fits ....

[Credit: Beagle Boys image from Kit's Silver Age Comic Books ]

Posted by Seth Finkelstein at 08:32 AM | Comments (5) | Followups

May 17, 2004

Nitke v. Ashcroft expert witness report of Seth Finkelstein

Nitke v. Ashcroft is an Internet censorship case challenging the obscenity provision of the Communications Decency Act (CDA). I'm serving as an expert witness on the topic of the Internet, anonymity, privacy, as it all relates to net censorship. My expert witness report is now available on-line:

Nitke v. Ashcroft : Seth Finkelstein expert witness report
http://sethf.com/nitke/ashcroft.php

As stated in the Nitke vs. Ashcroft Expert Witness press release:

The expert witness reports support the plaintiffs' contention that "local community standards" cannot be accurately applied to the Internet and, therefore, cannot be used to determine what is obscene. If the most restrictive communities can control what is placed on the Internet, then everyone will be restricted to that standard. The Internet is a world-wide phenomenon, therefore websites should not be held to standards specific to geo-location because community standards vary significantly from region to region and community to community.

Posted by Seth Finkelstein at 11:59 PM | Followups

May 09, 2004

Interviewed for "Blogging of a Thesis About Blogging"

Daniel Kreiss, who is doing "Blogging of a Thesis About Blogging", wrote an interview with me:

Seth Finkelstein interview

In the spirit of blogging, here's my participatory journalism regarding it:

Crashing Back Down to (a Realistic) Earth
Had a long chat with Seth Finkelstein last night. He has some fascinating insights/arguments into blogging, and why it's a myth that the journalistic gatekeepers are gone. ...

It's quite good, but I'm biased :-)

The discussion ranges over my ideas of gatekeepers of production being replaced with gatekeepers of audience, to power laws to the "complete and utter nonsense to say that blogging will herald a new era of "participatory democracy" or communication where everyone has a voice" (I did indeed say that).

In looking at the evidence, like the theory of power law, Finkelstein (who uses terms like "calculated" when discussing theoretical arguments; ...

Yup. That comes from my Math/Physics background. Many of these discussions strike me as very much like errors one can make in similar calculations. "What's the (electrical) power necessary to run this motor?" isn't too far from "What's the (political) power necessary to run this candidate?". Complete with the contingent that wants to assume a spherical cow.

Now, there's a part of the interview where I disagree or would comment:

I tend to agree that power law is a good description of how users are reading the web, but I also have a sense that this model does not adequately amount to a theory of digital communication. Communication also has a tendency to percolate back up (trickling perhaps, but it is happening none the less) to the gatekeepers of audience, or beyond that into other social relationships.

This is where I'd start thinking/asking, "What do you mean by "has a tendency", that is, how much"? Even in the most totalitarian dictatorship, there's some sort of "communication" between the elites and the population at large. Any smart ruler knows you have to listen to the masses to some extent, if only to keep track of who is a potential threat to imprison or kill. Getting too out-of-touch that way is a recipe for overthrow. But the elites and the dissidents sure aren't equal in communication.

For instance, my own newbie gestures at blogging at the time of this post have resulted in a grand total of two citations! Does that mean I am not heard, that I do not have a voice?

Yes. It means you don't have a voice if, say, you're concerned that a "Slashdot editor" with access to 250,000 readers may domain-hijack your website, for example. You couldn't fight back (unless those two readers happen to be very powerful themselves, what I call "The President And The Pope" argument).

Perhaps. But this might not be the end all measure of communication. This is not meant as a grand gesture here, but perhaps my ideas or reporting influenced someone's thinking, which then got passed onto their own blog, with or without the citation, and then around from there both off and online in their dealings with other people. My communication would then implicitly have an audience and power to it, even though I might have no idea or concept of the boundaries of that audience.

Audience (and used here as a proxy for power) is a variable. It can be measured and compared.

First person: "I'm heard by 250,000 people".
Second person: "Well, I'm heard by 250 people, does that mean I have no voice?"

Basic mathematics is that, all other things being equal, as a first approximation, the second person has 1/1000, one one-thousandth, of the voice of the first person, that the first person has ONE THOUSAND times the power of the second.

The amount of noise devoted to denying and obscuring the implications of this very simple little fact is amazing. On and on: Maybe audience isn't everything (right, it isn't, but it's not nothing either), maybe the first approximation isn't accurate (sometimes, but it's still useful overall), maybe the writer is happy to just stand on a streetcorner and rant to whomever passes by (which wasn't the point).

But the vast inequality in power this implies, replicated in Big Bloggerdom as much as other Big Media, is very ideologically unpalatable.

So regardless of the gatekeepers of audience, all communication has the potential to be implicitly powerful in how it is spread; and we do not have a good means for tracking this.

What is "implicitly powerful"? This sounds a lot to me like saying every lottery ticket has the "implicit power" to be a winning ticket. It does. But we also know that the probability is quite measurable.

True, some people are the social entrepreneurs in network theory, but there is always a dialectic at the micro level of communication (and this also does not account for the mere fact that people writing consistently, about anything, has implications in and of itself.)

"True, some people are super-rich, but even poor people have some money, and this does not account for the fact that having some money at all has implications in and of itself". See the problem? That is, saying almost all people have at least a little money, is typically not very useful to examining the divide between wealth and poverty.

There is a danger however, and Finkelstein is right to forcibly point this out. When people blow bubbles there is a distortion that occurs inside the bubble and whether that is traced through the stock market, the Dean campaign, or by ignoring the very real sites of social, economic, and political power, the promise of technology needs to be realistically combined with the cold hard historical reasoning that tells us there will never be a purely technological fix for what ails us.
Thus, we should advocate, and as strongly as ever, for the structural changes (like public subsidies for media outlets) that will create a more responsive, and responsible, media in this country.

I completely agree with the above. The problem, however, is that too many of the bubble-blowers think blogging in itself is that structural change. And I believe in this regard, they are: 1) deluding themselves 2) being cruel to the have-nots 3) aiding to ensconce the exact same gatekeeper hierarchy, by refusing to grapple with its emergent existence.

Posted by Seth Finkelstein at 06:13 PM | Comments (1) | Followups

March 03, 2004

Free porn, Google, spam, Internet censorship, and the Supreme Court

[Yes, this post really seriously concerns *all* the topics listed, it's truly that _tour de force_]

The Supreme Court just heard arguments on another Internet censorship law, "COPA", ( Ashcroft v. ACLU, 03-218). The Boston Globe reported:

Ordinarily, US Solicitor General Theodore B. Olson prepares for an appearance before the Supreme Court by acting out his argument before a pretend court. This time, for a case about the Internet, he added a new twist: searching online for free porn.
At his home last weekend, Olson told the justices yesterday, he typed in those two words in a search engine, and found that "there were 6,230,000 sites available."
The top lawyer who represents the Bush administration before the Supreme Court said the search's results illustrate how pornography on websites "is increasing enormously every day," a central point in his argument for saving an antipornography law that was enacted six years ago but has yet to go into effect.

Now, let's do something often unrewarded in this world - think. What search did he do exactly? It seems to be the following search in Google:

http://www.google.com/search?q=free+porn

That gives me now "about 6,320,000" results, close enough, the total number returned often varies a bit.

Now, what that search means is roughly the number of pages containing the words "free" and "porn" anywhere in the entire page (or links with those words). This blog entry will qualify as one of those results as soon as it is indexed. I don't think this blog entry is proof of how pornography on websites "is increasing enormously every day,", much less the need for an Internet censorship law.

I've written about the problems of Google and stupid journalism tricks before. But, sigh, nobody reads me, so this won't get reported. Anyway, the story gets even better.

I started digging down into the results to see if I could find some non-sex-site mentions before the Google 1000 results display limit (Yes, Mr. Olson, there are more than 1000 sites devoted to sex in the world, that's true). Google's display ~~crashed~~ stopped in the high 800's! That is, displayed at the bottom, for:

http://www.google.com/search?q=free+porn&num=100&start=900

In order to show you the most relevant results, we have omitted some entries very similar to the 876 already displayed.
If you like, you can repeat the search with the omitted results included.

The number varies, but it's been under 900.
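
For reference, the deep-results URL above is just the basic search with two extra query parameters. Here is a minimal Python sketch that builds it; the `q`, `num`, and `start` parameter names are taken from that URL, the function name is hypothetical, and nothing here assumes Google still honors those parameters.

```python
from urllib.parse import urlencode


def results_page_url(query, start, per_page=100):
    """Build a search URL asking for `per_page` results beginning at
    result number `start` (parameter names copied from the URL above)."""
    params = {"q": query, "num": per_page, "start": start}
    return "http://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    # Walk the result pages out to the 1000-result display limit.
    for start in range(0, 1000, 100):
        print(results_page_url("free porn", start))
```

The point of stepping through the pages is to see where the display actually stops, rather than trusting the headline count on the first page.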

Joke: Hear ye! Hear ye! Instead of "6,230,000 sites available", there's really uniquely less than 900! At least, according to Google.

Now, this is the Google display crash from bugs in the Google spam filtering. Google has cleaned-up their index so the crash is not happening on the first screen of results. But it's still in their results display code. Usually, people don't see the bug in practice, since the crash has now been pushed very far down in the sequence of results.

But here I had a reason to go looking out as far as I could, and ran into the crash in a bona-fide real-world situation. Not just a trivial query too, but one with profound implications for Censorship Of The Internet.

[Update 3/4: Michael Masnick brings to my attention that what I thought was the old Google spam crash is now reduced to duplicate-removal processing on the 1000 results display limit - the point is still that I can use fallacious superficial search "logic" to assert there's less than 900 sites, because Google "says" so. But the technical reason is not quite what I wrote originally]

Humor: If the evidence from a Google search was good enough to be used to justify censorship when it said "6.2 million", why isn't it good enough to justify no censorship if on further investigation it says less than 900? That is, if you thought it was valid before, with a big number, why isn't it valid now, with a small number? (garbage in, garbage out)

Look at me, I'm a journalist (or grandstanding lawyer) - Google says there's practically no porn on the net!

Posted by Seth Finkelstein at 09:52 AM | Comments (9) | Followups

February 16, 2004

Howard Dean Domains

Inspired by Joe Trippi's blog domain, I went digging through the domain database to see if there was anything "interesting" to be found there. No scandal, but some amusing material associated with the Dean campaign. Most of the list was just DeanForInsertstatehere.com or BlahForDean.com. But amusingly, the domains:

DEANFLIPFLOPS.COM
HOWARDFLIPFLOPS.COM
WATCHDEANFLIPFLOP.COM

were all registered by "Dean For America" on "31-Oct-03".

And on "16-SEP-03", the Dean webmaster had registered

FLIPFLOPFORAMERICA.COM and FLIP-FLOPFORAMERICA.COM

I wonder what the story is there, just for the humor value.

Other funny domains:

DEANDEANDEANDEAN.COM
DEANDEANDEANDEANDEAN.COM
DEANDEPRESSION.COM
DEANISWRONG.COM

And interestingly:

DONTRALPH.COM
OPENSOURCEPOLICY.COM
POLICYFORAMERICA.COM

None of these seem to be in use.

If anyone wants it, I've made the list available (not meant to be exhaustive) at:

http://sethf.com/domains/dean/

Again, not exactly hot material, but it has its moments.

Posted by Seth Finkelstein at 11:59 PM | Followups

January 11, 2004

Outsourcing != Democracy

Whenever someone preaches that an industrial change is going to lead to a major revolution, I find that it's useful to consider whether there will be a revolution, but in the opposite direction entirely. So it is with many democracy-of-the-media discussions I've seen recently. All of these seem to have the same path:

The Media Revolution Is At Hand:

Production is much cheaper. Employees are easier to replace. The occupation is becoming less the province of skilled workers, and more of amateur labor which works nearly free. Advances in mechanization, err, communication, allow for cutting staff drastically, and one laborer can now do what previously required several people. If you don't take advantage of these trends, your competitors will, so get with it.

This is democracy?

No.

This is OUTSOURCING!

No wonder so many of the media pundits are so rude about blogs - they're defending their conception of themselves as hard-to-replace highly skilled labor.

But on the other hand, why am I supposed to be so excited that many skilled jobs are turning into unskilled jobs or cheap-labor jobs? Well, there is of course the populist joy in seeing an arrogant profession brought low. But putting aside heart-warming Schadenfreude at their humbling, the end result here seems to be the exact opposite of what's preached. That is, overall, there will be more power for management, not labor.

Posted by Seth Finkelstein at 11:26 PM | Comments (7) | Followups

December 16, 2003

Seth Finkelstein GrepLaw Interview (Censorware, Copyright, and Blogs)

GrepLaw has an interview with me today:

Seth Finkelstein on Censorware, Copyright, and Blogs
http://grep.law.harvard.edu/article.pl?sid=03/12/16/0526234&mode=nocomment

It's over 6,000 words long. Of course, I think it's well worth reading. But I'm biased there.

The topics range over censorware, copyright, DMCA, free-speech activism, committing bloggery, and more. I went on at length. So even if you've heard me say it all before, it might be worth a look just for the collected edition.

[Update: URL now goes to archived text on my site]

Posted by Seth Finkelstein at 07:24 AM | Comments (3) | Followups

November 26, 2003

Google Bayesian Spam Filtering Problem?

New Google report from Seth Finkelstein:

Google Bayesian Spam Filtering Problem?
http://sethf.com/anticensorware/google/bayesian-spam.php

Abstract: This report describes a possible explanation for recent changes in Google search results, where long-time high-ranking sites have disappeared. It is hypothesized that the changes are a result of the implementation of a "Bayesian spam filtering" algorithm, which is producing unintended consequences.

Posted by Seth Finkelstein at 09:07 AM | Followups

November 15, 2003

Google Deskbar

Google Deskbar is the latest little tool from Google. It's a self-contained searching program, which is very lightweight and fits snugly in a desktop screen (PR: "Google Deskbar enables you to search with Google from any application without lifting your fingers from the keyboard. Installs easily in your Windows taskbar.")

I was poking around at its innards in order to see if there was anything interesting inside. Internally, it seems to be a "microbrowser". That is, I think it hooks into Windows/Internet Explorer services in order to do a search, exactly as if you had typed it into the Internet Explorer browser. And then uses the Windows Operating System display routines to present the results.

On the one hand, that makes it heavily operating-system dependent in terms of code. On the other hand, it's extremely cheap in terms of development, a neat little hack.

The most socially interesting thing about it is that, given it's tying into Windows/Internet Explorer services, it appears to share the Google cookie with Internet Explorer, and to use the Google cookie itself in all searching. That's not obvious, though it makes sense in retrospect.

It's actually a little strange, in terms of coming full circle with applications, to realize it's a microbrowser. That is, the original web browsers were simple programs devoted to rendering simple code. Then the inevitable "creeping-featurism" took over ("2. More generally, the tendency for anything complicated to become even more complicated because people keep saying "Gee, it would be even better if it had this feature too"."). So the browser became a behemoth, of often not-quite-working plugins, handling sound and video and cascades of style bleats. It's now so bloated that writing a small and fast program to do one common operation and display the results quickly, is some sort of innovation. Somewhere there's a lesson in that.

Update: I should have mentioned Dave's Quick Search Taskbar Toolbar Deskbar, thanks to LISnews

Posted by Seth Finkelstein at 11:56 PM | Followups

October 27, 2003

Whitehouse.gov iraq robots.txt directories - an explanation?!

Update 10/28: The White House says it's merely a design issue, from

http://www.2600.com/news/view/article/1803

Per: http://www.bway.net/~keith/whrobots/whresp.html

[(10/27) Just sent this to Dave Farber's list, about the whitehouse iraq robots.txt directories (update: note for more background, see http://www.bway.net/~keith/whrobots/ )]

Archived at

The White House And Iraq Directories
http://sethf.com/domains/whitehouse-iraq/

Dave, I've been analyzing the robots.txt file, exactly because the directories are so strange. I have a theory on what's happened. But it's so jaw-dropping that I'm hesitant to rush it into a formal report/release. In short:

There's no conspiracy.

There's a real-life instance of the joke genre which runs "I thought you said ..."

For example, here's one of the jokes: "After a California earthquake, Dan Quayle is sent to visit the most damaged site. But he never arrives there. Finally, he's found in Florida. He says, shocked, 'Go to the EPIcenter? I thought you said ...'" [EPCOT Center]

The joke here? Someone said:

"Don't have the search engines looking at the Iraq documents index"

And that was heard as:

"Don't have the search engines looking at every "index" with Iraq"

Really!

The evidence for this is that the robots.txt file has lines for

Disallow: /disk2/www/htdocs/infocus/iraq
Disallow: /disk2/www/htdocs/infocus/iraq/news/infocus/iraq

These are the only lines where there's never any matching pattern of "iraq" and "text" at all. They're obviously special in some way. And they look like they're a searchable index.
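For anyone who hasn't looked at how a crawler reads such lines, here is a minimal sketch using Python's standard `urllib.robotparser`. Only the two `Disallow` paths come from the file discussed above; the `User-agent: *` line, the host name, and the test URLs are assumptions made up for illustration. All it shows is that a `Disallow` entry is treated as a path prefix, nothing about how the White House server was actually configured.

```python
from urllib.robotparser import RobotFileParser

# Only the two Disallow paths are taken from the robots.txt discussed above.
# The "User-agent: *" line is an assumption added so the rules apply to any crawler.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /disk2/www/htdocs/infocus/iraq",
    "Disallow: /disk2/www/htdocs/infocus/iraq/news/infocus/iraq",
]


def is_allowed(url, lines=ROBOTS_LINES, agent="*"):
    """Return True if a crawler honoring these rules may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.example.gov"  # hypothetical host, not the real site
    # A URL whose path starts with a disallowed prefix is blocked.
    print(is_allowed(base + "/disk2/www/htdocs/infocus/iraq/index.html"))
    # A URL outside every disallowed prefix is permitted.
    print(is_allowed(base + "/some/other/page.html"))
```

The first call reports the URL as blocked and the second as allowed, because matching is done purely on the leading part of the path.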

Then there's the fact that some people are confused between directories, the function of the file "index.html", and that a bare directory will display as "Index of <directory name>" in some servers.

So ... "Iraq index" ... "Index of <directory name>" ... Oooops!

Never attribute to malice that which can be explained by stupidity.

This is hard to believe. But it fits!

Update - the robots.txt file has been changed. Grab it from

http://sethf.com/domains/whitehouse-iraq/wh-robots.txt

Or while it lasts, the Google cache:

http://216.239.41.104/search?q=cache:tCfemw3M-aUJ:www.whitehouse.gov/robots.txt

Posted by Seth Finkelstein at 09:46 PM | Comments (4) | Followups

October 21, 2003

"Cites & Insights" November 2003, and math of six degrees of separation

Walt Crawford just published the November 2003 edition of his library 'zine (not blog) "Cites & Insights". It's excellent reading over many topics. More excellent, to me :-), is that I'm mentioned in three different places, in discussions of censorware, copyright, and perspectives on legal risks. I sent a few clarifications, though I don't think it's worth the space of going through the items for a post.

Rather, to do a change of pace, the discussion of the "Six Degrees Of Separation" idea caught my eye:

Once you leave a field, you need to look for other communities--and lots of us don't belong to that many communities. I'd be astonished if "six degrees of separation" for the world as a whole, or even for the United States, worked out in practice. It's a community thing.

The result is right. Formally, it's a graph-theory mathematical result. Given a graph of 6 billion nodes, and each node connected to (a few hundred? a thousand?) or so total other nodes, what's the average length of the smallest path between two nodes? I don't have a reference to the exact answer, but it's low.
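As a back-of-the-envelope check, and not the exact answer, one common rule of thumb for a purely random graph is that the typical shortest path is about ln(N) / ln(k), for N nodes with k contacts each. The sketch below assumes that approximation; it ignores community clustering, so treat the numbers as a rough lower-end estimate. The function name is mine.

```python
import math


def estimated_hops(population, contacts):
    """Rule-of-thumb path length in a random graph: ln(N) / ln(k).

    This is an approximation for unclustered random graphs, used here only
    as a sanity check, not an exact result for real social networks.
    """
    return math.log(population) / math.log(contacts)


if __name__ == "__main__":
    world = 6_000_000_000
    for contacts in (100, 300, 1000):
        print(contacts, round(estimated_hops(world, contacts), 2))
```

For a few hundred to a thousand contacts per person, the estimate comes out at roughly three to five hops, which is why "low" is the right intuition.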

The interesting experimental result of these studies is that estimating a good path in the real-world is actually practical. The key is that, while there's community clustering, people can figure out how to "route" a message across communities, if they want. The critical factor is figuring out the maximal jump per each link. As the results show, it's do-able.

Note that asking "What's the number of hops for a connection?" is very different from asking "How many connections are made, versus die of disinterest?". That's akin to the issue of average life expectancy, where historically, there's a big difference between "Average everyone's lifespans, from 0 to 100", versus "If you survive childhood, how much longer do you live?" - because many people used to die around "0". And many message chains die around "0" too.

That is, overall, very few people may be interested in being routers (there's a lot of dropped packets). So if a path completes (every person is being a router), it has only a few hops necessary. But don't expect many paths to complete. Two different ideas.

Posted by Seth Finkelstein at 11:59 PM | Comments (1) | Followups

October 07, 2003

Google Spam Filtering Gone Bad

I believe I've uncovered the cause of the "Google NACK", a problem where Google is returning no or very few results for certain combinations of search terms. I conjecture it is a consequence of trying to eliminate spam search results, but instead wrongly eliminating all subsequent results. Read:

Google Spam Filtering Gone Bad
http://sethf.com/anticensorware/general/google-spam.php

Abstract: This report describes a problem which caused Google to return very few, or no, results for particular combinations of search terms. It is almost certain this is a consequence of search results being post-processed by spam-defense which has gone awry.

Feel free to verify my methodology. Google has an incentive to rapidly patch any publicized examples.

[Hmm, maybe I should go into "Google studies", Google doesn't sue people!]

Posted by Seth Finkelstein at 03:37 PM | Followups

October 05, 2003

Blogs, Journalism, and Mathematics

The fallacy of "blogging == journalism revolution" has been on my mind today, from BloggerCon. I've figured out the key reasoning error:

People assume production is the same as audience

This is wrong. This is false. This is an unwarranted leap of logic ("then a miracle occurs") that has very little to recommend it, and much to argue against it.

A recent blog survey, "The Blogging Iceberg", has a good paragraph on this:

Nanoaudiences are the logical outcome of continued growth in blogs. Assume for a moment that one day 100 million people regularly read blogs and that they each read 50 other peoples' blogs. That translates into 5 billion subscriptions (50 * 100 million). Now assume on that same day there are 20 million active bloggers. That translates into 250 readers per blog (5 billion / 20 million) - far smaller audiences than any traditional one-to-many communication method. And this is just an average; in practice many blogs have no more than two dozen readers.

Everyone can't have an audience of millions. That's a simple mathematical fact.
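The arithmetic is easy to check. Here's a small sketch using the survey's own figures (100 million readers, 50 blogs each, 20 million active bloggers). The function names are mine, and the "audience of one million" target in the second calculation is a number I picked purely for illustration.

```python
def average_readers_per_blog(readers, blogs_read_each, active_bloggers):
    """Total subscriptions spread evenly over all active blogs."""
    subscriptions = readers * blogs_read_each
    return subscriptions / active_bloggers


def blogs_each_reader_must_follow(target_audience, readers, active_bloggers):
    """How many blogs every reader would need to follow for the
    average blog to reach target_audience readers."""
    return target_audience * active_bloggers / readers


if __name__ == "__main__":
    # Figures quoted from "The Blogging Iceberg" paragraph above.
    print(average_readers_per_blog(100_000_000, 50, 20_000_000))
    # Illustrative target: an average audience of one million per blog.
    print(blogs_each_reader_must_follow(1_000_000, 100_000_000, 20_000_000))
```

The first figure is the survey's 250 readers per blog. The second says that, with the same populations, every reader would have to follow 200,000 blogs for the average blog to reach a million readers.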

So, what's the result of traditional media + blogs? Are the media which do have an audience of millions going to just go away? Why would that happen?

There's a reasoning disconnect, from a very idealist dream, of everyone reading and writing to each other (on an assumed equal or at least meritocracy basis), to the practical constraint that it can't happen in implementation. Because everything from economies of scale to clustering tendencies ("power laws") is going to produce relatively few large-audience outlets, and everything else is noise.

Posted by Seth Finkelstein at 11:58 PM | Followups

September 16, 2003

Verisign Typosquatting Explorer

I wrote a little perl program to examine what domain names were being suggested by Verisign from their current foray into typosquatting.

If anyone's interested, go to my page for Verisign Typosquatting Explorer.

I haven't had much time to look to see if there's much in the results.

Posted by Seth Finkelstein at 08:52 AM | Comments (1) | Followups

September 03, 2003

Eeyores vs. Tiggers

Derek Slater had an extensive post On Bunner, where he remarked in a passage:

Seth "Eeyore" Finkelstein (who's been posting a lot about Bunner) and I discussed this awhile back. ....

The reference caught my eye, in an amusing way. Hmm, I thought, wasn't Lessig also Eeyore?

That inspired me: Forget Liberals vs. Libertarians or Geeks vs. Suits. An unexamined divide is Eeyores vs. Tiggers.

Especially when I saw this quote from Eeyore, which sums up much:

'Sometimes he thought sadly to himself "Why?" and sometimes he thought "Wherefore?" and sometimes he thought "Inasmuch as which?" - and sometimes he didn't quite know what he was thinking about.'

Remember the Tigger is described as:

Their tops are made out of rubber. Their bottoms are made out of springs. They're bouncy, trouncy, flouncy, pouncy Fun, Fun, Fun, Fun, Fun!

Unfortunately, there is not just only one (link omitted out of self-preservation). Anyway, it's fun to be a Tigger. (fun, fun, fun, fun, fun!) You get to be bouncy, trouncy, flouncy, pouncy. To sing of "Emergent Pundocracy" and "Smart Snobs", go on about "The Second Soupy Powder". Who wouldn't want to live in "Cyber's Place", the new home of wunderkind?

By contrast, being an Eeyore is indeed pretty gloomy. It's no - fun - at - all. Copyblight and shrinking-wrap and trade-bleakness and De-'Em-See-Away. Lawsuits and lawyers and liability and losing.

However, the Eeyores tend to be right, while the Tiggers get to be popular. But to quote Eeyore,

'Pathetic. That's what it is. Pathetic.'

Posted by Seth Finkelstein at 11:09 PM | Followups

August 27, 2003

DVD-CCA v. Bunner, my punditry on What It Means

What follows are some thoughts I have about what the recent decision in the Bunner DVD trade-secret case actually means. Note I am not a lawyer, and the views below are my own, no warranty expressed or implied, free advice is worth what you pay for it, and so on.

In general, this is in the abstract, a formal, procedural, decision. It is not a factual ruling. It's a matter of law. However within those formal, procedural, matter-of-law constraints, I see things as being said, which are not good. But I see it as problematic in a much more complex fashion than the popular press is reporting it.

The popular reporting may be that this decision ruled the facts against Bunner. That's wrong. But I also think it's too abstract (though not strictly wrong), to infer nothing at all about how the facts are likely to be ruled on "remand" stemming from what's written in this decision.

My understanding is that the Appeals Court says:
(emphasis mine in all the below)
http://www.eff.org/IP/Video/DVDCCA_case/20011101_bunner_appellate_decision.html

"Preliminary injunctions are ordinarily reviewed under the deferential abuse-of-discretion standard. We consider only whether the trial court abused its discretion in evaluating two interrelated factors."

They would like to let Bunner off. But they have a problem. They will have a very hard time doing that under a "deferential" "abuse-of-discretion standard". So they make a big jump:

"However, not all restraining preliminary injunctions are entitled to such deferential review. ... Thus, in order to determine the appropriate standard of review, we must first decide whether the restraint imposed by the trial court's preliminary injunction implicated Bunner's First Amendment right to free expression. If so, we exercise independent review. "

This jump gets them out of the "deferential" state, and into the "independent review" state. And they are happy, because they then can write on about the importance of free speech, as a principle.

But this jump lands in the CA Supreme Court. The CA Supreme Court slams it, hard. Not valid, error, core dump, etc. They send it back to the Appeals Court.

We have now returned from the jump. Since no further ruling on facts has formally been made, we could abstractly be said to be no worse off than before. That would be the formal answer. However, informally, I think the key is in this part:

"If, after this examination, the court finds the injunction improper under California's trade secret law, then it should find that the trial court abused its discretion. (See ibid. [holding that, in determining whether the "issuance of a preliminary injunction constitutes an abuse of" discretion under the First Amendment, the reviewing court must independently review the factual findings subsumed in the constitutional determination]; ... [holding that preliminary injunctions are reviewed "under an abuse of discretion standard"].) Otherwise, it should uphold the injunction."

The Appeals Court didn't want to do that review under an "abuse of discretion" standard. So though the case is now being returned back to a favorably-inclined court, it's going back with extremely strong "guidance" to be decided in a way that the Appeals Court wanted to avoid - for the obvious reason that such a path strongly implied upholding the injunction, as a practical matter.

The Appeals Court is now locked back into the "abuse of discretion" box. Along with plenty of attitude conveyed, that the defendant is a bad guy and the plaintiff is a good guy. In theory, they could still have a favorable ruling. But I see them as being told here to uphold the injunction unless they can come up with an extremely good reason why not (again, "abuse of discretion").

Of course, I-Am-Not-A-Lawyer. But I'm trying not to be a defendant either 1/2 :-).

Update: A smart, top-flight, veteran, California lawyer tells me that I'm misreading that key standard of review aspect. The Appeals Court is in fact being told to exercise a fully independent review, not a deferential review. If so, I'll own up to misreading the above.

Again, IANAL

Posted by Seth Finkelstein at 12:57 AM | Comments (9) | Followups

June 25, 2003

"CIPA-compliant" library censorware

The idea of minimal, "open-source", library-specific censorware is being widely discussed (see, e.g. Edward Felten's comments - thanks Donna)

Here's the problem:

1) If any library wanted to play challenge-the-law, all they would need to do is sit back and say "Give us the specific, judicially-decided, URLs to be banned, and we'll ban them -- but not one URL more." And then wait for the compliance lawsuit to be brought. Very simple.

2) If they don't want to be challenging the law, why would they undertake what will certainly be a major PR hassle? That is, anyone can come up with harsh-but-not-illegal sites and say "Library X allows these PORNOGRAPHY sites to be viewed!". So do they get added to the blacklist or not? You mean the library is going to stand up to a constant barrage of bad PR like this? If they were willing to do that, we'd be in case #1.

Two words: Robert Mapplethorpe.
Blacklisted or not? Think through your answer in either case.

What happens when the "North American Man-Boy Love Association" asks to be whitelisted?

The idea of Open-Source Censorware (more accurately, an Open-Source Censorware Blacklist) is one which is very appealing from 10,000 feet. But it falls apart on any close examination.

OpenCensorware is far more work than may be apparent.

Here are the most well-known people who are trying it:

http://www.squidguard.org/
http://www.squidguard.org/blacklist/

Heard of them? No? Consider that there are reasons why.

By the way, the Australians tried this idea too:
http://zem.squidly.org/software/guilt.html

http://www.anu.edu.au/mail-archives/link/link0002/0275.html

"Announcing the GnU Internet Lust Terminator, an open-source censorware proxy that only filters ABA-supplied banned URLs.

The software is being developed by Zem for 2600 Australia and will be eventually submitted to the IIA for inclusion as an approved filter"

The Australian government didn't approve it.

[Update 6/26 - I've also suggested privoxy (http://www.privoxy.org/) ]

Before people write back, here's my challenge:

Don't tell me this is such a great idea. Find libraries who will use it who agree it's such a great idea!

[Disclaimer - I said to one proponent of this idea that I'd help make it happen, if he could find libraries which wanted it, and funding for it, as part of the challenge above]

Posted by Seth Finkelstein at 11:25 AM | Comments (7) | Followups

June 13, 2003

DMCA vs fair-use

DMCA/fair-use blog party!

Donna and Derek and Kerr and Balkin and Solum and Frank ...

Let me jam too.

I think I understand what Balkin is saying, and also what Kerr is saying.

Here's the deep question, which is being batted around:

Is fair-use a substantive limit, or a technical exception?

The side Kerr is arguing, what some call "affirmative defense", I call the "technical exception" view. That is, it conceives of fair use as having no overarching meaning, no deep significance. It's just a procedural reply in some particular sections of copyright law. The implication here, being that if one creates a new section of the copyright law - such as the DMCA - there's no carry-over, no principle to apply. The sections of the laws are partitioned, and never the twain shall meet.

The side Balkin is arguing, I call the "substantive limit" view. Fair use is an aspect of the First Amendment. It's intrinsic to any copyright-associated law by virtue of drawing power from the First Amendment's scope and reach, as a Constitutional provision. It's a bit like an all-pervasive Holy Spirit that way (the DMCA makes baby Jesus cry).

Now, Balkin is reading the Eldred decision as having a kind of genuflection to the pervasive spirit of fair use. How he does this, from perhaps the largest copyright-grab in history, is awesome to behold. The idea is that the court says the copyright-grab is OK in part since it didn't change fair use:

But when, as in this case, Congress has not altered the traditional contours of copyright protection, further First Amendment scrutiny is unnecessary.

So, goes the thought, this is a shining reaffirmation of the importance of fair use as substantive limit. And that strengthens the argument of those who argue that the DMCA is a restriction of this substantive limit. Follow the reasoning?

Frankly, this strikes me not as making lemonade out of lemons, but rather, wading through a pile of manure and trying to find a pony.

The cyanide in this lemonade is that it in fact doesn't help much against the "legal hack" that the DMCA doesn't affect fair use:

* (c) Other Rights, Etc., Not Affected. - (1) Nothing in this section shall affect rights, remedies, limitations, or defenses to copyright infringement, including fair use, under this title.

So the DMCA defenders are going to argue that in fact "[the DMCA] has not altered the traditional contours of copyright protection". Why? It says so right there, see? "Nothing in this section shall affect ...". But, respond the DMCA opponents, fair use is a substantive limit! No, say the DMCA defenders, fair use is a technical exception ...

Roundabout, here we come, right back where we started from ...

Posted by Seth Finkelstein at 05:30 PM | Followups

May 20, 2003

Googlewash, Nunberg, Orlowski

[Semi-name-dropping disclaimer - I like Andrew Orlowski's articles, and think they're asking good questions even if not immediately having the best answer to the question. I've even been quoted, willingly, in one Register Google piece. I've never talked to Nunberg, but I believe he's used some of my censorware investigations research in his CIPA expert testimony, so I also have incentive to favor him.]

I was puzzled recently when Edward W. Felten wrote:

Sunday's New York Times ran a piece by Geoffrey Nunberg complaining about (among other things) the relative absence of major-press articles from the top ranks of Google search results. ...

The real explanation is simpler: The Times forbids Google to index its site.

Huh? This took me aback. I couldn't even find that "complaining" in the piece at first. Some digging, via John Palfrey to Doc Searls finally let me figure it out. I believe what's fueling a certain reaction is this:

People think that the Nunberg/New York Times article is in part complaining about their Google PageRank - because that is what concerns net-writers!

No, folks. New York Times writers don't care about their PageRank. They don't need it! They're heard already. By people who read short briefing papers prepared by staff. The New York Times is at the top, and it's a very different world up there, from down here.

If anything, I read Nunberg as being ever so slightly critical of Orlowski, and quite accepting of the Google results. I think he was saying very roughly that Google returns what people were talking about, and more people were talking about a "blog" topic than a "major-press" topic here, so that's what you get. Then people viewed this as somehow being a "complaint". But I didn't see Nunberg as complaining, so much as stating that chatter may be popular, but it isn't authoritative, and shouldn't be expected to be so. The same sentiment I express as "Google is good, but not God."

Posted by Seth Finkelstein at 03:45 PM | Followups

March 10, 2003

"Michael Owen" is the UK injunction "mystery footballer"

The "mystery footballer" who cannot be named in the British press (due to an injunction), has apparently been named on a Norwegian newspaper's website.

Says a poster from New Zealand ("godzone_kiwi@xtra.co.nz"), in
http://groups.google.com/groups?selm=IAfaa.3250%248b.444863%40news02.tsnz.net

"http://www.vg.no/pub/vgart.hbs?artid=32734"

And, while I don't read Norwegian, the name "Michael Owen" is clear in the article (hey, I'm in the US. I can say it!)

And in fact, automatic translation gives something where the gist of the article can be derived:

http://www.tranexp.com:2000/Translate/index.shtml?from=nor&to=eng&type=url&url=http%3A%2F%2Fwww.vg.no%2Fpub%2Fvgart.hbs%3Fartid%3D32734

"allegedly document that Michael Owen (22) has been faithless against ... Louise Bonsall, as am pregnant in seventh month"

Isn't the Internet amazing?

Update: More coverage, in English now, from Singapore:
Owen's no Saint Michael
http://straitstimes.asia1.com.sg/football/story/0,1870,176015,00.html
Also available at http://groups.google.com/groups?selm=8ac92a26.0303121333.15fc3530%40posting.google.com

Posted by Seth Finkelstein at 10:23 PM | Followups

February 07, 2003

UK Parliament Mail - The Ministry Of Silly Messages

From: Seth Finkelstein
To: Seth Finkelstein's InfoThought list
Subject: IT: UK Parliament Mail - The Ministry Of Silly Messages
Date: Fri, 7 Feb 2003 13:45:06 -0500

New report:

UK Parliament Mail - The Ministry Of Silly Messages
http://sethf.com/anticensorware/general/uk.php

Abstract: This report examines messages being rejected by a mail system in use by the UK parliament.

I've reverse-engineered why the system used by the UK parliament to scan mail for "inappropriate content" was bouncing messages ranging from Welsh newsletters to a Shakespeare quote. Censorware is not fond of pussy-cats and tit-willows.

URLs:

E-mail vetting blocks MPs' sex debate
http://news.bbc.co.uk/1/hi/uk_politics/2723851.stm

Software blocks MPs' Welsh e-mail
http://news.bbc.co.uk/1/hi/wales/2727133.stm

Plaid up in arms as Commons spam filter bans Welsh
http://www.theregister.co.uk/content/6/29199.html

UK Parliament Mail - The Ministry Of Silly Messages
http://sethf.com/anticensorware/general/uk.php

NTK (Need-To-Know) coverage
http://www.ntk.net/2003/02/07/

Cyber-Rights & Cyber-Liberties (UK)
http://www.cyber-rights.org/

--
Seth Finkelstein
Anticensorware Investigations - http://sethf.com/anticensorware/
Seth Finkelstein's Infothought blog - http://sethf.com/infothought/blog/
List sub/unsub: http://sethf.com/mailman/listinfo.cgi/infothought

Posted by Seth Finkelstein at 01:54 PM | Followups

February 06, 2003

Program vs. data as wave vs. particle, dualities

Let me try this from another direction. In physics, for light, there's a phenomenon called "wave-particle duality". That is, in some ways a photon of light acts as if it's a tiny billiard ball (a particle) and in other ways it acts as if it's a ripple in material (a wave).

So asking "Is something program or data?" is a bit like asking "Is light a particle or wave?". As an intrinsic property, it's always both. But this doesn't mean everything stops there. Depending on extrinsic considerations, in different circumstances, one or the other aspect is the way it is taken in a particular situation.

In a legal analogy, I mentioned the same action being accident or murder depending on the state of mind. What I was attempting to express there, was less the specific idea that the distinction between accident and murder can be based on intent, and more the general idea that it's based on certain extrinsic rules on how to place the very same action. Did the person intend to do harm? How much did they intend? Even if they did intend, is that intent excusable? ("justifiable homicide"). However, the target is just as dead, regardless of the outcome of this rule-based determination procedure of what legal category should apply to the action.

I do think what might be called "program-data" (or "speech-code") duality has profound implications. But I also think discussion of those implications often gets derailed into an uninteresting side-path where people ask

"How can treating dual-thing as aspect-1 in situation-1, be reconciled with the fact that dual-thing is treated as aspect-2 in situation-2? Is dual-thing actually aspect-1 or aspect-2? Surely, since dual-thing can be both aspect-1 and aspect-2, then it must be treated also as aspect-2 in situation-1, and aspect-1 in situation-2. Ha-ha-gotcha!"

As a purely philosophical objection, I don't think this works. Legally, line-drawing is done all the time. The deep problem, as I see it, is if the objection works as a practical issue. As in the following part of the DeCSS decision:

FN275. During the trial, Professor Touretzky of Carnegie Mellon University, as noted above, convincingly demonstrated that computer source and object code convey the same ideas as various other modes of expression, including spoken language descriptions of the algorithm embodied in the code. Tr. (Touretzky) at 1068-69; Ex. BBE, CCO, CCP, CCQ. He drew from this the conclusion that the preliminary injunction irrationally distinguished between the code, which was enjoined, and other modes of expression that convey the same idea, which were not, id., although of course he had no reason to be aware that the injunction drew that line only because that was the limit of the relief plaintiffs sought. With commendable candor, he readily admitted that the implication of his view that the spoken language and computer code versions were substantially similar was not necessarily that the preliminary injunction was too broad; rather, the logic of his position was that it was either too broad or too narrow. Id. at 1070-71. Once again, the question of a substantially broader injunction need not be addressed here, as plaintiffs have not sought broader relief.

Posted by Seth Finkelstein at 12:57 PM | Followups

February 05, 2003

Programs vs. Data, a simple example

Edward Felten discusses Programs vs. Data, and trying to distinguish between them. Here's an example I've given to people before, for consideration:

The ROT13 algorithm explained ("Caesar Cipher")
1) The decryption algorithm for ROT13 is to take the range of letters from a-z, and for those twenty-six letters, replace the first thirteen of them with the range of letters from n-z and the second thirteen of them with the range of letters from a-m
2) To un-ROT13, do a tr/a-z/n-za-m/ over each character in the file
3) perl -pe 'tr/a-z/n-za-m/;' < infile > outfile
Where did I step over the line, from "speech" to "code"?

Or where did I make the transition between "data" and "program"?
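For comparison, here is a fourth rendering of the same idea, this time as a short Python sketch. It assumes the same lowercase-only mapping as the `tr/a-z/n-za-m/` above (uppercase letters and everything else pass through untouched); the names `ROT13_TABLE` and `rot13` are mine.

```python
import string

LOWER = string.ascii_lowercase

# Same mapping as tr/a-z/n-za-m/: a-m become n-z, and n-z become a-m.
# Only lowercase letters are translated; all other characters are left alone.
ROT13_TABLE = str.maketrans(LOWER, LOWER[13:] + LOWER[:13])


def rot13(text):
    """Apply ROT13 to the lowercase letters of text."""
    return text.translate(ROT13_TABLE)


if __name__ == "__main__":
    scrambled = rot13("hello, world")
    print(scrambled)
    # Applying the same function again undoes it, since 13 + 13 = 26.
    print(rot13(scrambled))
```

Because the alphabet has twenty-six letters, the same function both scrambles and unscrambles, which is why one description serves as the "encryption" and the "decryption" algorithm alike.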

Ed Felten says "it seems unsatisfactory to call something a program or not based on the state of mind of its author.". I submit that for legal purposes, something along those lines of "primary use" or "dominant purpose" is the only system which will work. It's a bit like the difference between accident/manslaughter/murder-second-degree/murder-first-degree. The same "data" (outcome) is treated differently depending on a legal "program" (ruling) regarding intent and effect.

Posted by Seth Finkelstein at 05:48 PM | Followups

February 02, 2003

Domains With Typographical Errors - A Google Search Strategy

I was inspired this weekend, and cross-connected the earlier domain searching to Google:

Domains With Typographical Errors - A Google Search Strategy
http://sethf.com/domains/typos-google/
by Seth Finkelstein

Abstract: This paper describes a strategy for searching for domain names with typographical differences by using Google, and compares the results to a previous search using approximate string matching.

This is in response to a report
Large-Scale Registration of Domains with Typographical Errors
http://cyber.law.harvard.edu/people/edelman/typo-domains/
by Benjamin Edelman.

He describes an extensive series of domain names with typographical errors which have been registered by a cybersquatter, and asks for help in identifying these targets. This creates what might be called an "inverse problem", of determining what the targets of the squatted typo'ed names are.

Posted by Seth Finkelstein at 11:57 PM | Followups

February 01, 2003

Space Shuttle Columbia

I remember the Challenger Accident.

Bad deja vu.

Posted by Seth Finkelstein at 08:46 PM | Followups

January 31, 2003

Domains with Typographical Errors - A Simple Search Strategy

Domains with Typographical Errors - A Simple Search Strategy
http://sethf.com/domains/typos/
by Seth Finkelstein

Abstract: This paper describes a simple strategy for searching for domain names with typographical differences, and the results of one such search.

This is in response to a report
Large-Scale Registration of Domains with Typographical Errors
http://cyber.law.harvard.edu/people/edelman/typo-domains/
by Benjamin Edelman.

He describes an extensive series of domain names with typographical errors which have been registered by a cybersquatter, and asks for help in identifying these targets. This creates what might be called an "inverse problem", of determining what the targets of the squatted typo'ed names are.

Note Donna Wentworth at Copyfight described my paper beautifully - "Seth F. gets agrep on the problem".
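To make the "inverse problem" concrete, here is a minimal sketch of approximate string matching in Python. It is not the paper's actual tooling: it uses plain Levenshtein edit distance, and the names `edit_distance`, `likely_targets` and the sample domain labels are all made up for illustration.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def likely_targets(typo: str, known_names: Iterable[str],
                   max_errors: int = 1) -> List[str]:
    """Return known names within max_errors edits of the typo'ed name."""
    return sorted(name for name in known_names
                  if 0 < edit_distance(typo, name) <= max_errors)


if __name__ == "__main__":
    # Hypothetical list of legitimate names to compare against.
    known = ["bookstore", "newsdaily", "searchhub"]
    print(likely_targets("bokstore", known))  # ['bookstore']
```

The hard part in practice is the list of candidate target names, which is exactly what the search strategy has to supply; the distance computation itself is the easy bit.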

Posted by Seth Finkelstein at 11:58 PM | Followups

January 23, 2003

Matt Blaze Master Key security paper, earlier attack descriptions

I read with great interest Matt Blaze's paper,
"Cryptology and Physical Security: Rights Amplification in Master-Keyed Mechanical Locks"

He wrote:

It is always difficult to be sure that an attack is completely novel in the sense of not having previously been discovered independently; the lack of a coherent and open body of literature on locks makes it especially so. In this case, several correspondents have suggested that similar approaches to master key reverse engineering have been discovered and used illicitly in the past. However, there do not appear to be references to this particular attack in the written literature of either the locksmith or underground communities.

I was able to supply him with two references to earlier descriptions of the attack, in one case 15 years ago.

Compare:

2.2.2 The Attack
For each pin position, p from 1 to P , prepare H - 1 test keys cut with the change key bitting at every position except position p. At position p, cut each of the H -1 keys with each of the possible bitting heights excluding the bitting of the change key at that position. Attempt to operate the lock with each of these test keys, and record which keys operate the lock.

With the following item (note the date, 1987):

http://yarchive.net/security/master_keys.html

From gwyn@brl-smoke.arpa (Doug Gwyn) 12-Nov-1987 17:36:05
Subj: [1137] Re: mastered systems
"Obtain one extra key blank per pin column (7 for the typical institutional Best lock); duplicate the operating key except for one column on the blanks, omitting a different column on each blank. Then, for each blank, try it with the omitted column cut to number 0 (high), then 1, then 2, ... and record which bittings open the lock. That tells you what the splits are in that column. The whole set of trials tells you what all the splits are in all columns."

And a similar one (note the date, 1994):

http://groups.google.com/groups?selm=2jcejp%24csc%40coyote.rain.org

From: jay@coyote.rain.org (Jay Hennigan)
Newsgroups: alt.locksmithing
Subject: Master key hacking Was:Re: Legality of picks...
Date: 9 Feb 1994 20:53:13 -0800
If you have a "change" (industry term for normal non-master) key and the lock that it fits, as a guest in a hotel would, as well as a number of blanks, you can do the following: Cut a key identical to your key, but with the first pin position uncut or a "0" cut. Try it in the lock. If it works go on to step 2. If not, take the first pin down one depth using a key gauge or micrometer (or a Le Gard or other code cutting machine). Try it again until the key works. When you hit the depth of the cut on your original key, it should obviously work, as the keys should be identical. If so, continue going deeper. You are likely to find a depth on the first pin _in addition_ to the one on your key that opens the lock. If not, then cut another blank with the first position identical to yours, and the second one at the top or "0" cut.
Step 2: Repeat as above with the next pin position.
The object is to find the cut at each pin position that is different from the single-lock key you have, but still opens the lock. This will be the master key bitting. Having two different keys (and locks) from two different areas of the masterkeyed system will make things a bit easier, as you'll have a way of cross checking, especially if there are more than two breaks in some pins. This exercise, if you're precise, and lucky, can take as few as 5 or 6 key blanks. At most, a dozen. No real skill in picking or impressioning is needed. [... rest of article snipped]
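All three descriptions share the same structure, which can be sketched as a toy simulation in Python. This is an abstract model only, under loud assumptions: a key is a list of integer cut heights, a column turns if the cut matches either the change cut or the master cut, and there are no extra splits. The names `make_lock` and `recover_master` and the sample bittings are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Key = Sequence[int]


def make_lock(change_key: Key, master_key: Key) -> Callable[[Key], bool]:
    """Toy lock: a column turns if the cut matches the change OR master cut."""
    def opens(key: Key) -> bool:
        return all(cut in (c, m)
                   for cut, c, m in zip(key, change_key, master_key))
    return opens


def recover_master(change_key: Key, heights: int,
                   opens: Callable[[Key], bool]) -> Tuple[List[int], int]:
    """Vary one column at a time and record which other height also opens."""
    master: List[int] = []
    trials = 0
    for position, own_cut in enumerate(change_key):
        found = own_cut  # if nothing else works, the master shares this cut
        for height in range(heights):
            if height == own_cut:
                continue  # the unmodified change key is already known to work
            test_key = list(change_key)
            test_key[position] = height
            trials += 1
            if opens(test_key):
                found = height
        master.append(found)
    return master, trials


if __name__ == "__main__":
    change = [3, 1, 4, 1, 5]
    secret_master = [2, 1, 6, 0, 5]
    lock = make_lock(change, secret_master)
    recovered, trials = recover_master(change, heights=8, opens=lock)
    print(recovered)  # [2, 1, 6, 0, 5]
    print(trials)     # 35, i.e. P * (H - 1) = 5 * 7
```

The trial count grows as P * (H - 1) rather than H to the power P, which is the whole point of all three write-ups: each column can be probed independently.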

Update: There's some interesting commentary on Dave Farber's IP list:

Donald Eastlake commentary:
http://www.interesting-people.org/archives/interesting-people/200301/msg00136.html

Bob McClure commentary:
http://www.interesting-people.org/archives/interesting-people/200301/msg00147.html

Matt Blaze reply message, "Keep it secret, stupid!":
http://www.interesting-people.org/archives/interesting-people/200301/msg00153.html

And a thread on the newsgroup alt.security.alarms
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=3e32f52d_2%40corp.newsgroups.com&rnum=1&prev=/groups%3Fq%3D%2522master%2Bkey%2522%2Bimpression%26hl%3Den%26lr%3D%26ie%3DUTF-8%26scoring%3Dd%26selm%3D3e32f52d_2%2540corp.newsgroups.com%26rnum%3D1

There's also discussion with postings from Matt Blaze himself, on alt.locksmithing
http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&threadm=1043353164.31996%40cswreg.cos.agilent.com&rnum=1&prev=/groups%3Fq%3Dg:thl2517357864d%26dq%3D%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D1043353164.31996%2540cswreg.cos.agilent.com

Posted by Seth Finkelstein at 11:57 AM | Followups

January 14, 2003

More on "Blog Politics of Form"

Copyfight's Donna Wentworth replied to my earlier message on Blog blather. Let me hasten to reassure that I'm not against thinking about Grand Ideas. I do it myself :-). And I never want to seem to be too hard on people for such thoughts, and regret if I come across that way. I sympathize with the feeling.

There are occasions when an innovation seems world-changing, revolutionary. And sometimes it is world-changing - but often the world doesn't change in the ways you expect. This problem is one reason I've resolved to stop arguing with people over You-Can't-Censor-The-Net. Because someone who is caught up in the latest iteration often takes it as raining on their parade when I tell them about earlier times where that battle-cry didn't work.

To be clear, I'm not describing Donna's post in particular, here. She was presenting an idea and asking a question or two--not engaging in a full-scale examination of an issue.

Instead, I'm describing a genre of discussion where I find it difficult to discern any meaning. The problem is that such discussions often don't consider the politics of form, but something more akin to a literary theory of form. That is, they don't take into account practical strategies of organizing, where there are winners and losers, big money, entrenched interests, and the outcome is often not what would be considered just. Rather, the focus is typically more on the experience of writing and reading, and then large leaps to Profound Thoughts.

And then there's the next fad, and the whole punditry process starts all over again.

Literary theory isn't wrong. But a little goes a long way.

I'm all for examining the impact of innovations on society. But I mean, really examining it, as in, also taking into account what doesn't change, and what counter-intuitive results occur.

Rule of thumb: Any examination which doesn't reveal at least one serious negative consideration isn't worthwhile, because it's just hype.

Posted by Seth Finkelstein at 03:31 AM | Followups

January 13, 2003

Reply to "Blog Politics of Form"

Donna Wentworth at Copyfight talks about the "politics of form". While I think that's an interesting topic, I also think it won't get discussed meaningfully. Because the meaningful material is likely to be more specialized and unsexy than befits Grand Ideas. No offense, but as I read over everything, I thought again:

AARRGGHH! More blather!

Some days, I think I would be vastly more popular if I took journo-blathering seriously. Let's see ... "Yes, the weblog is yet another pinnacle in the postmodern [neat word!] cyber-democratization [neat prefix!] of the infosphere [neat phrase!]. It is not the ``I Media'' of the top-down organizational form of the old regime, but as others have noted, the ``We Media'' [neat term!] of a spontaneously self-organized complex system [a sprinkling of pseudoscience jargon is always good!]. We must ask "What Does It All Mean"? [big broad questions are excellent filler!] And answer that the meaning is a unique new frontier in human expression [nothing is ever an old retread!] ..."

Sorry, but I get curmudgeonly over this stuff. I lived through the growth of mailing-lists, Usenet, the early Internet, and so on. I remember when there really was an aspect of egalitarianism and democratization with networked communications. But it was a fragile state, stemming from the fact that the community was small and insular then, and it didn't last.

I think the key insight is the following:

PUNDITRY ISN'T DEMOCRACY

More opportunities for punditry don't necessarily mean society becomes more egalitarian - this is the fundamental error of 95% of the noise on the topic. It connects to the idea of commentators being the watchdog of a well-functioning world. So then more comments equals a better world. But rather, it just means more people have a chance at becoming professional chatterers, and/or the existing chatterers have yet another outlet. Indeed, that's a change, certainly a change worth studying - but not a unique, unprecedented change. And the implications are likely to be much less than the hype over them.

Posted by Seth Finkelstein at 01:30 PM | Followups

January 09, 2003

Reply to "Replace Copyright with Watermarks, Taxes"

Donna Wentworth at Copyfight asks for thoughts on the following music proposal:

Fisher's first choice, he said, would be to recognize that copyright law is increasingly dysfunctional for handling music royalties and to (1) Authorize artists to insert simple watermarks in their creations, (2) Tax, at the multilateral or national level, things such as ISP access and various technologies upon which music is performed, (3) Count the frequency with which each digital product is consumed, (4) Distribute revenue from the taxes in the proportion in which the various products are accessed. Once the system is in place, he said, copyright law can be "lifted."

I think the general outlines are good, and many people (including myself :-)) have said vaguely similar things in the past. However, the devil is in the details. In particular, I've emphasized point #3 for a reason. HOW does he intend to "Count the frequency with which each digital product is consumed"? Super-spyware? Require every player to recognize the watermark? That would of course require non-watermark-responding players to be illegal, right ... (umm ... didn't we just go through this?)

Don't get me wrong, again, the overall idea, of some sort of mandatory license and statistical royalties seems to be the right thing. However, getting the details correct is the tough part. Arguably, this idea worked reasonably well in the "Audio Home Recording Act", with a tax on digital recording media. And maybe Fisher's riffing off of it.

But if so, it's a riff in a "visionary" manner, where the details are being neglected for the Grand Idea. It's one thing to tax digital tapes, where there's a discrete object, and the tax is small compared to the price. But what is "various technologies upon which music is performed"? The $10 (?) for the motherboard sound chips? The speakers? He's not planning to tax free-software Linux players as a "technology", I hope (I'm having a bad DeCSS flashback here, with code as technology!). The bandwidth? It seems like there's just not enough money there.

Maybe he can make it work. But the acid test for any proposal is to work with free (in both speech and beer) software, and come up with some in-the-ballpark numbers.

Posted by Seth Finkelstein at 03:28 AM | Followups

December 29, 2002

N2H2 (censorware co) - financially "dead company walking"

I've finally finished ploughing through N2H2's recent financial report and attempting to figure out just how near they are to death's door (approaching? threshold? already through?). I think it's a matter of "dead company walking".

I've finally made sense of their announced "cash flow positive" quarter. Remember, N2H2 loses around $1.7 million each quarter.

Now, look at the N2H2 Balance Sheet.

Note how "Cash and Cash Equivalents" takes a big jump up on Sep 30, 2002.

But projecting, the estimated numbers would be (in thousands):

"Net Tangible Assets":
+2,152	(down 1,400 to)	+752	(down 1,017 to)	-265	(down 1,507 to)	-1,772
"Cash and Cash Equivalents":
+6,000	(down 1,740 to)	+4,260	(down 1,485 to)	+2,775	(now project)	728? (est?)
calculate "Cash and Cash Equivalents" - "Net Tangible Assets":
+3,848		+3,508		+3,040	(now project)	+2,5? (est?)

That is, the next number in the series for "Cash and Cash Equivalents" should be down "1,something", giving less than 1,000 remaining: around 728, projecting from the drop in "Cash and Cash Equivalents" - "Net Tangible Assets".

Instead, they record a total "Cash and Cash Equivalents" UP to +4,684. That's an overage of (4,684 - 728? = 3,956?). Where are they getting that extra 3,956 or so?
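The arithmetic above can be checked with a few lines of Python. This is only a sketch of the projection, with one labeled assumption: the "+2,5?" estimate is read as 2,500, which is the value that reproduces the 728 figure. Variable names are illustrative.

```python
# Figures in thousands of dollars, copied from the series above.
net_tangible_assets = [2152, 752, -265, -1772]
cash = [6000, 4260, 2775]  # the fourth quarter is the one being projected

# Gap between cash and net tangible assets for the three known quarters.
gaps = [c - n for c, n in zip(cash, net_tangible_assets)]

# Assumption: the "+2,5?" estimate in the table is taken as 2,500.
projected_gap = 2500
projected_cash = projected_gap + net_tangible_assets[-1]

reported_cash = 4684
overage = reported_cash - projected_cash

print(gaps)            # [3848, 3508, 3040]
print(projected_cash)  # 728
print(overage)         # 3956
```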

Note nothing dramatic has changed in terms of income and expenses for all of N2H2 fiscal year 2002. So their recent layoffs can't be the cause of this dramatic change.

Look at the "Other Current Liabilities" line. They have a big change in going from 4,475 to 8,179, a difference of 3,704. That seems to be the jump.

The "Cash Flow Statement" agrees

Income hasn't changed much. The cash jump is from "Changes In Liabilities".

What's this 8,179 liability?

Searching the annual report, under
"LIABILITIES AND SHAREHOLDERS' EQUITY (DEFICIT)"
They have "Deferred revenue" of 8,179

What's this "Deferred revenue"? The only reference I can find is (my emphasis)

"Subscription agreements and most maintenance services are evidenced by signed contracts, which are generally 12, 24 or 36 months in duration. Subscription and maintenance revenues are recognized on a straight-line basis over the life of the contract. Contracts billed in advance of services provided are recorded as deferred revenue."

Hmm? What's going on here?

It appears they counted much of *expected Financial Year 2003* revenue, as "deferred revenue" for the last quarter of Financial Year 2002. And listed what they billed as part of "Cash and Cash Equivalents".

That is, the only reason they're "cash flow positive" is that they have gotten substantial billed money in advance of the services.
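A small sketch of how straight-line recognition produces that effect, in Python. The contract size and term below are hypothetical round numbers, not N2H2's, and `straight_line` is an illustrative helper name.

```python
from typing import Tuple


def straight_line(contract_total: float, months: int,
                  months_elapsed: int) -> Tuple[float, float]:
    """Return (recognized revenue, deferred revenue) after months_elapsed."""
    elapsed = min(max(months_elapsed, 0), months)
    recognized = contract_total * elapsed / months
    return recognized, contract_total - recognized


# Hypothetical contract: 3,600 billed and collected up front for 36 months.
cash_received = 3600
recognized, deferred = straight_line(3600, 36, 3)

print(cash_received)  # 3600 in the bank on day one
print(recognized)     # 300.0 counted as revenue after three months
print(deferred)       # 3300.0 still owed as services, carried as a liability
```

Cash goes up by the whole billed amount immediately, while the matching obligation sits on the liabilities side until the service is actually delivered.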

There's another obscure line where they list current "Working capital" as being -2,193, with a footnote of "(3) Includes current portion of deferred revenue."

In brief, it's as if someone were in debt, and took out a loan, and trumpeted having a "cash-flow-positive" event because the loan was money received now (that it'd have to be paid back later was irrelevant).

It's like the old saying about losing a little money on every sale, but making it up in volume.

Posted by Seth Finkelstein at 04:27 PM | Followups

December 13, 2002

Censorware, "filtering", and the imperatives of control

[I've sent this message around a few places in discussion about the Kaiser Family Foundation censorware study]

One censorware aspect the Kaiser report does not discuss, is that in order for control to be effective, sites such as language-translators, privacy sites, anonymity protections, the GOOGLE CACHE, the Wayback Internet archives, etc. tend to be banned. Otherwise, such sites act as a "LOOPHOLE" (to use N2H2's terminology) for the control of censorware. This is a structural, architectural, issue. Whether or not you consider this bad, good, or not a horribly high cost, it's factually a deep problem of censorware which is not going to go away through configuration. Take a look at my (sadly under-publicized) work, e.g.

BESS's Secret LOOPHOLE: (censorware vs. privacy & anonymity) - a secret category of BESS (N2H2), and more about why censorware must blacklist privacy, anonymity, and translators
http://sethf.com/anticensorware/bess/loophole.php

BESS vs The Google Search Engine (Cache, Groups, Images) - N2H2/BESS bans cached web pages, passes porn in groups, and considers all image searching to be pornography.
http://sethf.com/anticensorware/bess/google.php

The Pre-Slipped Slope - censorware vs the Wayback Machine web archive - The logic of censorware programs suppressing an enormous digital library.
http://sethf.com/anticensorware/general/slip.php

Very broadly, the Kaiser study found that the more blacklists that are used, the more inaccurate bans there are. Viewed basically in terms of what censorware is - a bunch of blacklists - this should be clear.

That is, fundamentally, a censorware program is a collection of blacklists. Each blacklist has some accurate entries, and some wildly inaccurate ridiculous entries. If you use several blacklists, you get the accurate entries, and then all the wildly inaccurate ridiculous entries contained in all those several blacklists. Simple.

From this point of view, it's not a surprise that several blacklists have, in combination, a much higher number of wildly inaccurate ridiculous entries than a few blacklists. Roughly, having more blacklists means more silliness, and fewer blacklists means less silliness. No special magic to "configuration" there. The less of the censorware you use, the less of the baleful effects you have.
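The point is just set union, which a toy Python sketch makes plain. Everything here is invented for illustration: the three lists, their entries (all `.example` names), and the split into "accurate" and "mistaken" are hypothetical, not data from any real product or from the Kaiser study.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical blacklists: each mixes accurate entries with mistaken ones.
BLACKLISTS: Dict[str, Dict[str, Set[str]]] = {
    "list_a": {"accurate": {"adult1.example", "adult2.example"},
               "mistaken": {"health-clinic.example"}},
    "list_b": {"accurate": {"adult2.example", "adult3.example"},
               "mistaken": {"translator.example", "poetry.example"}},
    "list_c": {"accurate": {"adult1.example", "adult3.example"},
               "mistaken": {"web-archive.example", "health-clinic.example"}},
}


def combine(names: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Union the enabled lists; mistakes accumulate along with the hits."""
    accurate: Set[str] = set()
    mistaken: Set[str] = set()
    for name in names:
        accurate |= BLACKLISTS[name]["accurate"]
        mistaken |= BLACKLISTS[name]["mistaken"]
    return accurate, mistaken


if __name__ == "__main__":
    for enabled in (["list_a"],
                    ["list_a", "list_b"],
                    ["list_a", "list_b", "list_c"]):
        accurate, mistaken = combine(enabled)
        print(len(enabled), len(accurate), len(mistaken))
    # prints "1 2 1", then "2 3 3", then "3 3 4"
```

Turning on another list can never remove a mistaken entry that an earlier list already contributed; it can only add more.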

And Kaiser didn't find that censorware bans all the porn sites either! At heart, it's not difficult to get a big list of porn sites. It's really not. But what "benefit", other than the political, is there in just making the outright porn-searchers work a little harder, while randomly denying some people the information they need, and denying everyone such tools as language-translators, google caches, etc?

I don't think this is a simplistic opposition to "filtering". But it is saying there is no magic - there's not going to be any configuration that makes all the naughty stuff go away, while having only nice remaining. Or even most of the way there. The best PR the censorware companies ever did, was to have the word "filtering" attached to their blacklists. Because that channels all the discussion into a focus on the supposedly worthless material, and far away from all the imperatives involved in controlling what people are forbidden to read.

Posted by Seth Finkelstein at 11:53 PM | Followups

December 05, 2002

.kids.us

With the passage into law of the ".kids.us" subdomain (Dot Kids Implementation and Efficiency Act of 2002), which I refer to as "dot-kidding", I'm collecting in one post my earlier comments on why it's such an ill-fated idea. While the concept is certainly very pleasant, that political appeal seems to have completely overridden any thought about what is in fact being proposed. This is not a "children's room". It's a government whitelist. Below are some of my explanations as to where a government whitelist has all sorts of implementation problems. Blather, blather:

To facilitate the creation of a new, second-level Internet domain within the United States country code domain that will be a haven for material that promotes positive experiences for children and families using the Internet, provides a safe online environment for children, and helps to prevent children from being exposed to harmful material on the Internet, and for other purposes.

The Basic Problem:

The .kids.us concept can be condensed down to one basic idea, that the US government will certify sites as OK-for-minors. There is no need to have this certification as a domain name. It could be done just as well with a simple list of US government certified OK-for-minors sites, and that would be vastly simpler to administer.

The dirty little secret of this boondoggle is as follows:

NOBODY WANTS IT

Almost nobody wants "whitelists". Whitelists have been around for years and years and years. I could write pages on the history of the idea. Just think about the basics. It's not like the concept just now occurred to people.

Is It OK To Be Happy and Gay?:

Here's why it's not a panacea. Consider the standard:

(5) SUITABLE FOR MINORS- The term `suitable for minors' means, with respect to material, that it--
`(A) is not psychologically or intellectually inappropriate for minors; and
`(B) serves--
`(i) the educational, informational, intellectual, or cognitive needs of minors; or
`(ii) the social, emotional, or entertainment needs of minors.'.

Now, the question: Does the book Heather Has Two Mommies meet this standard? Think about the implications.

Linking Lunacy:

Consider the requirement of no outside links:

"(11) Written agreements with registrars, which shall require registrars to enter into written agreements with registrants, to prohibit hyperlinks in the new domain that take new domain users outside of the new domain."

Besides being redundant (if one is already restricted to the sandbox, why prohibit hyperlinks?), there is a very deep problem here. Are they really saying that there is a profound difference between

"See the material at peacefire.org"

"See the material at peacefire.org (which is located at http://peacefire.org , as you have probably figured out, but http://peacefire.org is not a hyperlink, because if we made a hyperlink to http://peacefire.org we'd be violating our contract, so we can't make a hyperlink to http://peacefire.org)"

Either they end up meaning "no URLs", which is even sillier, or we have a profound problem of not understanding that hyperlinks are nothing more than convenient references. That is, if the exact same reference is acceptable as long as it is not a "hyperlink", that seems to defeat the purpose.

I suppose none of the sites will be able to run common mailing-list or groups/bboard software which tends to turn URLs into hyperlinks.
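For a sense of how mechanical that conversion is, here is a minimal Python sketch of the kind of auto-linking such software does. The regular expression and the function name `autolink` are simplified assumptions, not taken from any particular mailing-list or bboard package.

```python
import html
import re

# Deliberately crude: anything starting with http:// or https:// up to the
# next whitespace, angle bracket, or double quote is treated as a URL.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def autolink(text: str) -> str:
    """Escape text for HTML, then wrap anything URL-shaped in an <a> tag."""
    escaped = html.escape(text, quote=False)
    return URL_PATTERN.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', escaped)


if __name__ == "__main__":
    print(autolink("See the material at peacefire.org"))
    # See the material at peacefire.org
    print(autolink("See http://peacefire.org for details"))
    # See <a href="http://peacefire.org">http://peacefire.org</a> for details
```

The same reference is "just text" or a contract-violating hyperlink depending on nothing more than whether a pattern match fired.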

Maybe it'll be like curse words, e.g. "s*cks" (umm, how many asterisks are going to be needed to be OK?). We can have http://p**f*r*.*rg

Posted by Seth Finkelstein at 02:45 PM | Followups

November 25, 2002

How To Win (DMCA) Exemptions And Influence Policy

Date: Mon, 25 Nov 2002 08:07:11 -0800
From: Lee Tien
Subject: guide to DMCA "exemption" process -- 3 weeks left
To: Law & Policy of Computer Communications

EFF is pleased to present a guide to the DMCA "exemption" process.

http://www.eff.org/IP/DMCA/finkelstein_on_dmca.html

Under this process, the Copyright Office of the Library of Congress must make a triennial inquiry regarding adverse effects of the DMCA's prohibition on circumvention on "certain classes of works."

If adverse effects are shown, the office can "exempt certain classes of works from the prohibition against circumvention of technological measures that control access to copyrighted works." The exemptions only last 3 years.

The author, Seth Finkelstein, is one of the very few people who succeeded in arguing for an exemption (for the act of circumventing access/copy controls on censorware blacklists) in the last round (2000). [The Copyright Office received many comments and rejected the overwhelming majority of them; I think in the end only 2 or 3 exemptions were created.]

The upcoming round is the next one, for 2003. "Written comments are due by December 18, 2002."

This is about the only part of the DMCA that can mitigate its fell sway, so if you have any interest in the topic at all, it's well worth reading.

Lee
--
**********************************
Lee Tien
Senior Staff Attorney
Electronic Frontier Foundation

Posted by Seth Finkelstein at 11:41 AM | Comments (0) | Followups

November 22, 2002

Curmudgeonness on "Revenge of the Blog"

I started reading the commentary for the Revenge of the Blog Conference. Frankly, and no offense to the blog-star panel, I started to overdose very quickly. I had too much deja vu and bad flashbacks, from the days when the magical Internet was going to equalize us all.

The basics: If you're a professional talker, that is a journalist, some lawyers, some policy-makers, and a new punditry tool appears, this leads to more commentary. And some people are well-positioned to take advantage of this new ecological niche, and prosper in it. This leads to much ponderous pontification of What It All Means, which is of course - more punditry, on punditry, which is a favorite subject of punditry.

I saw this happen with mailing lists and Usenet. It came around again at the start of the World-Wide-Web. There was some of it for Internet-Relay-Chat. There was another iteration when "virtual communities" were all the rage. And now it's come around to blogs.

Let me say again, there's nothing wrong with a profound navel-gaze of The Meaning Of It All. I just couldn't bear to read much of it, since I'd read it all so many times before in the past decade.

Posted by Seth Finkelstein at 09:30 PM | Followups

November 04, 2002

Digital-Rights-Management / Newspeak article

I've adapted some earlier entries on the subject of Digital-Rights-Management matching Newspeak, into a small stand-alone essay on my site. This is now being run in this month's edition of the webzine Ethical Spectacle. Note Lawmeme also liked the idea.

Posted by Seth Finkelstein at 09:33 AM

October 30, 2002

Wishful Thinking, Leeway, and Copy Control

Ed Felten has two posts which I think make an unexpected point in contrast - Wishful Thinking, roughly regarding universal copy-control in hardware, and "leeway" about making laws function effectively.

I suggest that applying the "leeway" concept to the Wishful Thinking post yields an interesting result.

That is, in a way, the spokesman for Hollings, regarding controls, is more correct than is being granted:

Andy Davis, a spokesman for Mr. Hollings, said the technology-minded critics of the bill were "missing the thrust of the senator's argument," which is that there is need for more protection of copyright works if online content and broadband Internet access are to flourish.

This is a "politics" reply, which focuses on the short, snappy soundbite, i.e., what-about-the-children, it's-against-theft, motherhood-and-apple-pie, etc. But that shouldn't blind us to the existence of an argument underneath it all.

The idea of Felten's Fritz's Hit List, is mocking Hollings by applying the law as if there were NO "leeway" in it. That's fun.

But I think this is being mistaken for a killer argument that any mandatory copy control proposals must fail, because they must blindly be applied in the most extreme and literal sense. That's appealing to the technical mindset, because one can treat all general computers as equivalent at some abstract level. But it's a much weaker argument in practice.

What the spokesman doesn't want to say, because it would be horrible press-speak, is the following: "Look, this isn't about talking dog collars. It's about locking down what 99.9% of the population uses for business or entertainment. The hard problem is coming up with a solution that works for all of Hollywood and Intel and Microsoft. The practical difficulty is there, not in dog collars."

I don't think it's necessarily correct to believe that this is an unsolvable problem, because we postulate there can be no leeway in the required control for their purposes. Their goal is working the difference between theory and practice.

Posted by Seth Finkelstein at 07:03 AM | Followups

October 18, 2002

More on The Truism of the Restricted-Purpose Language

Ed Felten's written a reply to my item about "The Truism of the Restricted-Purpose Language", ending:

I believe that code is speech, and I believe that its status as speech is not just a legal technicality but a deep truth about the social value of code. What the code-regulators want is not so different from what the speech-regulators of 1984 wanted.

I agree with all of this!

But I'd say the comparison works well for exactly the opposite reasons as intended. Newspeak doesn't conjure up images of the idea that you can't make a language where certain concepts are unexpressible, therefore the Party was silly and stupid to even try. Rather, it conveys images that you can have an official system which is restrictive and oppressive and works to impoverish the vast majority of the population. That is, the comparison to Newspeak is not "it can't work", but "it can work, so beware".

Suppose we remove the literary flourishes from the description of Newspeak. That is, rather than proclaiming:

The purpose of Newspeak was ...to make all other modes of thought impossible. It was intended that when Newspeak had been adopted ... a heretical thought ... should be literally unthinkable ...

Let's have a more qualified, less hyperbolic:

The purpose of Newspeak was ...to make all other modes of thought cumbersome and onerous. It was intended that when Newspeak had been adopted ... a heretical thought ... should be difficult to articulate, easy to be derided and mocked, readily attacked when conveyed to others.

This lacks the punch and flourish of the stark statement of impossible. But it's a much more accurate description of what would likely be the case in practice.

And the idea of computer-language libraries in fact supports this point. What's one big problem with C and C++? The fact that there are so many different libraries which provide similar, but not quite identical, functionality. Merely having the ability to extend the language by new definitions is not adequate. There must also be a process to have those definitions accepted in "society" as common, otherwise the process of communication breaks down. Every time a program needs to be ported from one library to another, it's a proof that there's a big difference between having the ability to express something, and doing it in a fashion which can be effectively used by other people.

Let's also remember that the strictures of Newspeak weren't going to be enforced by its language merits. Rather, people who started creating unauthorized language-extensions were going to quickly become unpeople - rather like the idea of the DMCA, etc. that programmers or researchers who publish unauthorized expressions are going to be fined/jailed.

Posted by Seth Finkelstein at 03:28 PM | Followups

October 17, 2002

Mutterings about a pattern of geek argument

Just a general remark that's been on my mind recently: perhaps I'm stating the obvious, but Microsoft/Palladium/TCPA/"Trusted Computing", etc, etc, is not being developed because it's such a neat-o nifty-keen geeky concept to play around with, "Gee, whatever could we use this to do ...". It's not to add abstract capabilities to a computer system, as an experiment in the advancement of computation. It's for certain reasons, having to do with rights and permissions and control.

I think programmers tend to forget that practical aspect, in a rush to play with the concepts. But we should know better than anyone how wide a gulf there can be, between concept and implementation.

Posted by Seth Finkelstein at 11:55 PM

October 16, 2002

The Truism of the Restricted-Purpose Language

I have to disagree strongly with the idea that the best example for "The Fallacy of the Almost-General-Purpose Computer" is "The Fallacy of the Almost-General-Purpose Language". In fact, I'd say this example undercuts the point, and actually strongly argues the reverse.

I think we get too wrapped-up in the idea of "impossible", along the lines of the idea that Newspeak was to make it impossible to speak frankly about politics. Yes, right, nothing will ever make it "impossible". But my own experiences with Libertarianism thoroughly convince me that it's certainly common to have a political language that makes it very difficult to express certain thoughts. I can't remember how many times a Libertarian has told me that a concept is invalid, because the English sense of the word used to describe the concept doesn't have that sense in the specialized argot of Libertarianism. As in, for example "censorship means ...". The problem is that the word "censorship" has several different meanings in English, but only a single meaning in Liberspeak ("by the government"). Thus in so many conversations, it's a massive chore to convince the Libertarian that just because their definition is restrictive, doesn't make the concept invalid _per se_. And the Libertarian is likely to endlessly repeat some variant of the idea that because the word in Liberspeak has only a specific Libermeaning, other concepts are invalid. It's not utterly and completely beyond human achievement to explain the differences between Liberspeak and English. But wow, it's an amazingly difficult task, and requires a great deal of analytic and writing skill. It's the best example I've ever seen of how Newspeak would actually function in action.

There's a computer-language version of this too. After all, what's the whole point of the Software-As-Speech argument? Programming languages are designed to make it easy to express certain abstract concepts, where English or other languages don't work well. It's not impossible to express the concept in those same languages, but it is much harder and more error-prone. And then it follows that other concepts may be more difficult to express in the programming language. I remember a parody song, where the punchline was "We're a string-processing in FORTRAN shop". Why is that considered hilarious? Because FORTRAN, as a language, is so ill-suited for string-processing that doing it is typically difficult enough to be a joke. Now, it's not impossible to do string-processing in FORTRAN - but it is certainly cumbersome and hard.

So in the abstract, what Hollywood wants might be impossible. But I'm starting to think the focus on the impossibility is leading to ignoring a much more frightening practicality.

Posted by Seth Finkelstein at 06:29 PM | Followups

October 15, 2002

Agendas, and Information Wants To Be Paid-For

One of people's first reactions to the increase in communications from the growth of the Internet, has always been roughly "Oh my God - there's too much information available - we've got to find some way to control it, some means where people who shouldn't have certain information, can be prevented from being able to read it.".

This reaction was not, as sometimes imagined, exclusive to governments concerned with political subversion. In fact, it was a very standard reaction by many people, regarding many types of information (sex, racism, etc.).

It's entirely logical, even expected, that copyright-based businesses should have exactly this reaction too, when faced with exchanges of information which they feel are threatening - namely, that which has not been paid-for.

Posted by Seth Finkelstein at 11:55 PM | Followups

October 14, 2002

Explaining General Purpose vs. Special Purpose Computers

An item from Ed Felten asks how to give a "simple, non-technical explanation" for the truism:

Either you make a general-purpose computer that can do everything that every other computer can do; or you make a special-purpose device that can do only an infinitesimally small fraction of all the interesting computations one might want to do. There's no in-between.

Here's my try at such an explanation, geared to Washington concepts:

Suppose you want telephone calls answered, for an office. You can either hire a human and have that person be a receptionist, or buy an automated telephone answering machine. The human receptionist who has the task of answering telephone calls will also be able to answer letters or do any other clerical task. The automated telephone answering machine will never be able to do anything other than answer telephone calls. There is no in-between, where there's a machine which will do all general clerical work, but nothing else.

Moreover, to continue the analogy, the human receptionist, as a consequence of general-purpose ability, will also be able to tell unauthorized people who has been telephoning the office. And perhaps even what the telephone calls contain (copying!). An automated telephone answering machine will never be able to do that either (on its own).

This is simply two sides of the same coin of having general-purpose ability. Note this problem has been well-known since ancient times - where rulers would maim servants in various ways (e.g. cutting-out the tongues of slaves) in brutal attempts to prevent what might be called nowadays, unauthorized information transfer. Recent legislative proposals are perhaps the modern equivalent of those crippling practices.

Posted by Seth Finkelstein at 06:16 PM | Followups

September 28, 2002

"intellectual property" vs "granted monopoly"

Thought for consideration: We should change usage from "intellectual property" to "granted monopoly".

I'm coming to believe that the term intellectual property is more and more leading to an inability to think about the issue. Copying isn't theft. But what is it? In the case of copyright, it's a violation of the business model of a granted monopoly. This violation may be trivial, or may indeed threaten the business model. But talking of it in terms of property is threatening to crowd out anything else.

Posted by Seth Finkelstein at 11:55 PM

September 26, 2002

More on copyright, "limited times", and "legal hacks"

I received a nice reply (from Derek Slater, a person on the civil-liberties side) about my last entry, where he gently elucidated many key legal differences between copyright clause interpretation and DMCA interpretation. All good material. I didn't mean to give any impression that I was arguing the situations are legally identical in all respects. What I was trying to do earlier was to examine Valenti's copyright comment in terms of implications regarding practice versus formalism. If "'limited' is whatever Congress says it is.", then in practice, that's unlimited, through the method of making "limited" mean something along the lines of "finite (yet not necessarily reached)". A copyright which never expires in practice, is unlimited for business purposes, whether or not it qualifies as limited in a legal sense. Note I'm echoing the Eldred dissent here:

Second, and more importantly, the Court's construction of the Copyright Clause of the Constitution renders Congress's power under Art. I, s 8, cl. 8, limitless despite express limitations in the terms of that clause. ... Under the Court's decision herein, Congress may at or before the end of each such "limited period" enact a new extension, apparently without limitation. As the majority conceded, "[i]f the Congress were to make copyright protection permanent, then it surely would exceed the power conferred upon it by the Copyright Clause." Eldred, 239 F.3d at 377. The majority never explained how a precedent that would permit the perpetuation of protection in increments is somehow more constitutional than one which did it in one fell swoop.

But again, that's the dissent. What strikes me as interesting here, is the way what I call the "finite yet unbounded" interpretation, works around an apparent limit in "limited". A geek would call that a "hack". Valenti seems to argue that copyright could be made permanent in all but name (though admittedly the courts don't think we are at that point yet).

But compare the above dissent passage to what Judge Kaplan said about the DMCA "effectively controls access" argument, in the DeCSS case:

Finally, the interpretation of the phrase "effectively controls access" offered by defendants at trial--viz., that the use of the word "effectively" means that the statute protects only successful or efficacious technological means of controlling access--would gut the statute if it were adopted. If a technological means of access control is circumvented, it is, in common parlance, ineffective. Yet defendants' construction, if adopted, would limit the application of the statute to access control measures that thwart circumvention, but withhold protection for those measures that can be circumvented. In other words, defendants would have the Court construe the statute to offer protection where none is needed but to withhold protection precisely where protection is essential. The Court declines to do so.

Now, I'm NOT saying that these situations are equally valid, and have an identical legal basis behind them. But there did seem to me to be something of the same "hacking" (in the old-style meaning of the word) spirit in the two arguments. That is, nullifying something in practice, by using a definition which reduces the apparent meaning to one having virtually no real-world significance.

If "effectively" meant "successful", then the DMCA would have no power. And if "limited" means "finite yet unbounded", then "limited times" is no practical constraint.

I suppose my point is that what Valenti is doing still strikes me as a "legal hack", even if it's a better-premised "legal hack" than the one tried for DeCSS.

Posted by Seth Finkelstein at 10:04 AM

September 25, 2002

Copyright, "limited times", and "legal hacks"

I was thinking about this passage regarding copyright and "limited times", from copyfight:

Jack Valenti on the Constitution's Copyright Clause, quoted in Dan Gillmor's Valenti Presents Hollywood's Side of the Technology Story: "[Just] read Article I, Section 8 of the Constitution, which gives Congress the power to 'promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.' There's no ambiguity...'limited' is whatever Congress says it is."

There's certainly a logical problem here - if "limited" could be a million years, that's not limited in any but the most formal sense. I'm not claiming any special insight on that point, it's been said many a time. However, I was struck by the thinking going on here. It's a mirror of exactly the sort of geek-mindset that tries and fails to come up with a "legal hack". In discussion of the DMCA, I've seen so many programmers say something along these lines: the DMCA language talks about a measure which "effectively controls access", but if such a measure is broken, it must not have been "effective", gotcha, ha-ha. This was in fact addressed as a legal argument in the DeCSS case, and the court didn't buy it at all. But it seems the copyright interests are doing precisely the same sort of word-gaming - "limited times", sure, limited to expire 20 years from now, always, an unreachable limit, but still a "limit", gotcha, ha-ha. And so far, they have been prevailing with this argument, though with a shade of dissent.

There's a lesson (politics, or maybe "Critical Legal Studies") in here somewhere.

Posted by Seth Finkelstein at 09:54 AM

September 15, 2002

DRM and lesser vs. both evils

At the risk of repeating myself, I'd like to make one comment about something Ed Felten just said - "... and that what Lessig calls 'token based' DRM is a lesser evil than what he calls 'copy protection'".

Voting, for example, is an exclusive "or" - that is, one candidate winning, means that all the other candidates lose. But here, the control systems being discussed don't have the property that one being implemented means any others will not be also in force. Indeed, it's entirely possible for the ultimate result to be both evils. In fact, object-control plus network-control works together in a very natural belt-and-suspenders fashion.

And this makes a great deal of sense from a Congressional standpoint too. I don't think this discussion has intense politics behind it. But I'd worry if people seriously seemed to get caught up in the idea of actually advocating object-control as a way of supposedly warding-off network-control. I don't think that's being seriously advocated now, just speculated in an academic sense. But I still have the scars from the censorware wars. Beware seductive theory.

Posted by Seth Finkelstein at 11:57 PM

September 14, 2002

The "end-to-end"s versus the means of DRM

Regarding Felten's comments on what is an "end-to-end argument", I took Lessig's reference to "network design" not to be about re-engineering TCP/IP. Instead, I believe the idea was that IF the media industry was given object-control, THEN they'd be happy to go away and not bother about Napster or Aimster or similar, not be concerned about sharing systems. Because they would then feel secure (pun intended) that whatever those sharing systems exchanged, the object-control would prevent unauthorized use. I take this from where Lessig says: "if a technology could control who used what content, there would be little need to control how many copies of that content lived on the Internet"

But to point out the flaw in the above proposition another way, the statement seems to conflate "content" with "objects". That is, there might be official versions of a song which are controlled objects. But you can be sure, since bootlegs existed even before computers, there will be many, many, unapproved versions in circulation. The technology can control who uses what objects. But that's not the same as content.

There's no contradiction at all here in terms of "end-to-end argument". Felten: "If copy-protection is to have any hope at all of working, it must operate on the end hosts". Right. I think Lessig agrees, roughly. The argument is, put the control inside the machines (via an operating system or hardware which examines objects) AND THEN there will be no problem with the Napster-ilk or other network-based exchange innovations, since the content industry will be able to "trust" that the sharing of controlled content will be prevented (Lessig: "A different DRM would undermine that push").

But, per Felten: "It must try to keep Aimster ... from getting access to files containing copyrighted material". Right also. That's the flaw in the object-control argument. Because if "wild" objects can still be used and shared, then the network is just as much a threat as before, and still needs to be controlled too (as in Aimster is still a problem).

It's not so much about "end-to-end", but coming to a bad end.

Posted by Seth Finkelstein at 11:24 AM

September 13, 2002

DRM and object control vs network control

I've been reading Lessig's article on Digital-Rights-Management, Anti-trusting Microsoft, and various comments. I found the article very clear. Let me try to boil it down, in my prosaic paraphrase. I believe the key ideas are as follows:

1) Usage control can be either object-based or network-based.
2) IF control is object-based, THEN it doesn't have to be network-based.
3) Coming from Microsoft doesn't automatically make it a bad idea.

In some reactions, I'd say too much emphasis is being placed on aspect #3. Now, being suspicious of anything from Microsoft is formally an ad-hominem argument, though that suspicion is also prudent. This Microsoft element is generating much attention, since it's at the start of the article, expressed in a humorous way, and has the word "Microsoft" in it. It's generally great pundit-fodder, allowing asking how truly evil is Microsoft in the first place, whether it's thought to be more evil than it deserves versus an overwrought image of evil, and then whether such a stench of evil is clouding our perceptions.

However, this isn't the fundamental problem with the piece, as I see it. The difficulty is in aspect #2. That portion is an appealing thought. The argument runs IF, IF, IF, the desired usage control is put in objects THEN THEN THEN, the network control is unnecessary.

It's such a seductive proposition. I've seen the idea so many times in various contexts. Years ago, it was roughly the same scheme of argument I called censorware-is-our-saviour, during the time censorware was being promoted by some people as a "solution" to censorship laws. Implement control locally, it's thought, and the powers at issue will let the global net alone.

Every time I see one of these arguments, I have the same question:

Show me that the other side believes it.

Not that one would think the other side should accept it, based on the theory which has been elucidated. No, no, no, that is not my question, why they'll be happy. Don't repeat back to me the theory. I understood the theory. Rather, show me some evidence that the other side does in fact consider this enough. Because perhaps the theory is wrong. Here, perhaps they won't consider object-control to be sufficient, and will rather take it as precedent for network-control in addition.

And that's the subtle flaw in aspect #1. The argument is:

1) Usage control can be either object-based OR network-based.

I think the reality is best rendered:

1') Usage control is desired as object-based AND network-based.

The theory fails in the same way for all these types of arguments - they start out by setting up two things as opposites (object versus network), which the other side sees as complements (object plus network). In programming terms, it argues an exclusive "or", where the opponent believes in an inclusive "and".
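To make the programming analogy concrete, here is a minimal sketch in Python of the two framings as truth tables. The function names (`xor_model`, `and_model`, `disagreements`) are purely illustrative inventions, not anything from Lessig's article or anyone's actual system. The sketch assumes only that each kind of control is either in force or not.

```python
from itertools import product


def xor_model(object_control: bool, network_control: bool) -> bool:
    """The either/or framing: exactly one kind of control is in force."""
    return object_control != network_control


def and_model(object_control: bool, network_control: bool) -> bool:
    """The complement framing: satisfied only when both kinds are in force."""
    return object_control and network_control


def disagreements() -> list[tuple[bool, bool]]:
    """Return the (object, network) cases where the two framings differ."""
    return [
        (obj, net)
        for obj, net in product([False, True], repeat=2)
        if xor_model(obj, net) != and_model(obj, net)
    ]


if __name__ == "__main__":
    # Print the full truth table, one row per combination of controls.
    for obj, net in product([False, True], repeat=2):
        print(
            f"object={obj!s:5} network={net!s:5} "
            f"xor={xor_model(obj, net)!s:5} and={and_model(obj, net)!s:5}"
        )
```

The two models agree only when neither control exists. In particular, the case the theory is counting on (object-control alone) satisfies the XOR framing but not the AND framing.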

What I think will happen, is that if object-control is implemented, then lack of network-control will be viewed as a threat. Since, unless the machine is limited to using only those objects which are "domesticated", those which are "wild" will proliferate. That is, all the P2P music and video trading will still be a "problem", just using one-generation-down "wild" copies made from speakers or screens, or otherwise "cracked".

In fact, the fallacy is very clear from thinking of the days of copy-protected software packages (object control). That didn't stop all the illegal file-trading sites (uncontrolled network) - they tended to be full of "cracked" copies (uncontrolled objects). And sometimes the "cracked" copies were even preferred for legitimate users, since they were often less hassle overall, to back-up and re-install. I can hear Jack Valenti now, saying something along the lines of perhaps "the open network is like a diseased sewer which threatens the sterile environment of the industry".

Moreover, there is a terrible social cost attached to such an argument. If people pin their hopes on object-control as the answer against network-control, then the flaws in object-control - exactly those uncertified, unapproved, unMicrosoft materials - will be cast as threats to the "solution", as spoilers against the supposed means of defeating network-control.

I should stress my points here aren't particularly ideological. It's not about whether Microsoft can be trusted with power, or if open-source is good. Rather, the proposed architectural code has a subtle bug in it - it has an XOR (exclusive "or") early in its model, where the system will want an AND (i.e. "both"). We will not save the network by object sacrifice.

Posted by Seth Finkelstein at 10:21 AM