June 15, 2008

Yahoo Search Engine Spiders Directories From File Paths

I just did an experiment and confirmed that the Yahoo spider will try to derive a directory URL from a file URL. That is, if it sees a URL like http://example.com/stuff/jump.avi , in addition to retrieving that file, it'll try the URL http://example.com/stuff/ . Google won't do that, though (nor will Microsoft). It's easy to test this yourself if you have a website where you can see server logs. This practice has some significant implications for people who claim that trying truncated URLs is improper behavior and even possibly unauthorized access.
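To make the truncation concrete, here is a minimal sketch of the step from a file URL to its enclosing directory URL. The function name directory_url is my own for illustration; this is not Yahoo's code, just the URL arithmetic described above.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def directory_url(file_url):
    """Return the URL of the directory that contains file_url.

    Hypothetical helper: it only drops the last path component,
    which is the truncation the spider was observed to try.
    """
    parts = urlsplit(file_url)
    directory = posixpath.dirname(parts.path)
    # Directory URLs conventionally end with a slash.
    if not directory.endswith("/"):
        directory += "/"
    # Query string and fragment are discarded on purpose.
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


print(directory_url("http://example.com/stuff/jump.avi"))
# prints http://example.com/stuff/
```

To check your own site, link to a file in an otherwise unlinked directory and then look in the server logs for a request for the directory URL itself.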

Posted by Seth Finkelstein at 03:50 PM
June 13, 2008

How "alex.kozinski.com" worked (Judge Alex Kozinski "Porn Site" Follow-up)

[Original research! Not an echo!!!]

Following up the "Porn Site" of Judge Alex Kozinski kerfuffle, and all the discussion of private vs. public norms, I've been trying to figure out exactly how the web site was configured. We know the controversial material was in a directory called "stuff", hence it was http://alex.kozinski.com/stuff/

I've found a key piece of evidence. In June 2004, Alex Kozinski sent a public letter in HTML, humorously nominating himself as part of a "Judicial Hottie contest":

Courthouse Forum: The Hot. Alex Kozinski

This letter contains various links, and one sentence in particular is:

* I bungee jump. [Ed. note: Click on the link to play this very fun little video clip--and make sure your sound is turned on!]

There, "bungee jump" is linked to: http://alex.kozinski.com/stuff/jump.avi

Again, that's the key directory.

This shows that Judge Kozinski knew the general public could retrieve specific material from that directory, and in at least one case, invited the public to do so.

I speculate that he did not know that his server was configured with a feature which lists all files in that directory when the directory name is given. That is, he may have thought that the only way to know what files were there was to be given the filenames.

Moral: Security By Obscurity - Isn't.

Note regarding the search engine restriction file "robots.txt":

Yahoo had a cached copy of that directory (seems uncached now) with an entry at least as late as:

25.minutes.to.go.wmv 28-May-2008 12:18 6.3M movie

This strongly indicates there was no search-engine prohibition for that directory. Further evidence is at the Internet Archive, which shows many versions, e.g.:

http://web.archive.org/web/20070629190035/http://alex.kozinski.com/robots.txt

having only entries:

User-agent: *
Disallow: /jurist-l/
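As a quick check of what those two lines mean for a crawler, here is a small sketch using Python's standard urllib.robotparser. The helper name allowed is mine, and the robots.txt text is just the entries quoted above, not a fresh copy from the site.

```python
from urllib.robotparser import RobotFileParser

# The entries quoted from the archived robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jurist-l/
"""


def allowed(robots_text, url, user_agent="*"):
    """Return True if robots_text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "http://alex.kozinski.com/stuff/"))     # prints True
print(allowed(ROBOTS_TXT, "http://alex.kozinski.com/jurist-l/"))  # prints False
```

So under that file, a well-behaved spider was free to fetch the "stuff" directory.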

[Disclaimer: Do read the letter. Alex Kozinski is impressive and a very cool guy, and those who are trying to have him removed from his position because of this tempest-in-a-teapot should avail themselves of some of the acts portrayed in the files in that directory]

Posted by Seth Finkelstein at 07:18 PM | Comments (4)
April 06, 2005

Reporters Without Borders nominates freedom blogs - including Infothought!

"Vote for freedom of expression blog award-winners!"
http://www.rsf.org/article.php3?id_article=13098

International April 6, 2005

Reporters Without Borders is calling on Internet-users to vote online for award-winners from among 60 blogs defending freedom of expression. There are six categories : Africa and the Middle East, the Americas, Asia, Europe, Iran and International.

...

These awards will be in tribute to webloggers who defend free expression and sometimes pay heavily for it. ...

Now it is up to Internet-users to decide. They may only vote for one blog per geographical category (The International category is of blogs that have a general interest in freedom of expression on the Internet).

Voting closes on 1st June 2005 and the prize-winners will be announced two weeks later.

To register a vote, go to : http://www.globenet.org/rsf/voteblog.php?lang=en

[And in particular, to vote in the International category, where Infothought is nominated, go to:
http://www.globenet.org/rsf/voteblog.php?cat=5&lang=en ]

Posted by Seth Finkelstein at 08:41 AM
January 13, 2005

CBS Report file has been modified! Cut and Paste now prohibited!

Ernest Miller noticed that he could no longer cut-and-paste from the CBS report, and asked me to investigate. He's right. The report PDF file has been modified since its release. This can be verified by any tool which will display the internal information of a PDF file.

http://wwwimage.cbsnews.com/htdocs/pdf/complete_report/CBS_Report.pdf

HTTP information (emphasis added below):

HTTP/1.0 200 OK
Server: Apache
Last-Modified: Wed, 12 Jan 2005 21:24:24 GMT
ETag: "b8626f-abd1c-6cea7200"
Accept-Ranges: bytes
Content-Length: 703772
Content-Type: application/pdf
Date: Thu, 13 Jan 2005 19:25:27 GMT

Current CBS Report file, PDF internal information (from the Linux tool "pdfinfo")

Title:          Microsoft Word - DC-685241-v10-Final_CBS_Report__sent_to_Lou_12_20_.DOC
Author:         demartpe
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Wed Jan 12 16:00:24 2005
Tagged:         no
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
Page size:      612 x 792 pts (letter)
File size:      703772 bytes
Optimized:      yes
PDF version:    1.4

Earlier CBS Report file, PDF internal information (from the Linux tool "pdfinfo")

Title:          Microsoft Word - DC-685241-v10-Final_CBS_Report__sent_to_Lou_12_20_.DOC
Author:         demartpe
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Fri Jan  7 19:17:44 2005
Tagged:         no
Pages:          234
Encrypted:      no
Page size:      612 x 792 pts (letter)
File size:      703330 bytes
Optimized:      yes
PDF version:    1.4

Note the difference in the "Encrypted:" field!
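For anyone who wants to repeat the comparison mechanically, here is a rough sketch that parses two pdfinfo outputs and lists the fields that differ. The function names are mine, and the sample text is a subset of the two listings above; it assumes the simple "Key: value" layout shown there.

```python
def parse_pdfinfo(text):
    """Turn pdfinfo-style "Key: value" lines into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(old_text, new_text):
    """Map each differing field name to its (old, new) values."""
    old, new = parse_pdfinfo(old_text), parse_pdfinfo(new_text)
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


EARLIER = """\
ModDate:        Fri Jan  7 19:17:44 2005
Pages:          234
Encrypted:      no
File size:      703330 bytes
"""

CURRENT = """\
ModDate:        Wed Jan 12 16:00:24 2005
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
File size:      703772 bytes
"""

for name, (before, after) in changed_fields(EARLIER, CURRENT).items():
    print(f"{name}: {before!r} -> {after!r}")
```

On this subset it reports Encrypted, File size and ModDate as changed, and Pages as unchanged.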

However, the text itself does not seem to have been altered.


Update 4:15 pm EST: Ernest Miller reports that the version of the report on the CBS law firm site has also been modified, which I've confirmed (though the text again does not seem to have been altered).

http://www.klng.com/downloads/CBS_Report.pdf

HTTP information (emphasis added below):

HTTP/1.1 200 OK
Content-Length: 690313
Content-Type: application/pdf
Last-Modified: Tue, 11 Jan 2005 20:16:48 GMT
Accept-Ranges: bytes
ETag: "8564b67a1af8c41:e1b"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Thu, 13 Jan 2005 21:19:25 GMT

PDF internal information (from the Linux tool "pdfinfo")

Title:
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Jan  5 23:29:52 2005
ModDate:        Tue Jan 11 15:14:40 2005
Tagged:         no
Pages:          234
Encrypted:      yes (print:yes copy:no change:no addNotes:no)
Page size:      612 x 792 pts (letter)
File size:      690313 bytes
Optimized:      yes
PDF version:    1.5


Update Fri Jan 14 14:45 EST 2005
Sisyphean Musings has CBS's explanation:

To allow copying of text to applications such as Word would allow anyone to create a modified or falsified report, which we cannot allow. The law firm hired by the Independent Panel insists that the report not be available in a format that can be altered, and we agree with that decision.

This speaks for itself.

Posted by Seth Finkelstein at 02:53 PM | Comments (9) | Followups
August 15, 2004

BSA Weasel == "Beagle Boys"!

The Business Software Alliance (BSA) has announced an "anti-piracy" site, with a kids' mascot ferret, and a contest to call it a name.

The BSA weasel creature reminded me of something I'd seen before. Something shady, disreputable, criminal. Finally, I remembered! The BSA weasel looks like he's a member of a criminal gang in Walt Disney Comics, the "Beagle Boys":

BSA Weasel Beagle Boys

Look at the family resemblance. Same shirt. Same pants (gang colors?). Same squinty, hooded, eyes. Same toothy smirk. He's even wearing something on his chest, which, making allowances for updating to the modern age, might be a Beagle Boys identification patch (more evidence of gang affiliation!).

Traditionally, the Beagle Boys were after Scrooge McDuck's Money Bin. They must be diversifying. There's certainly a big money bin around the Business Software Alliance, one to rival Scrooge McDuck. So the gang has obviously gotten one of their younger members to convince the BSA executives to take him into the organization (using his weasel-skills - thus explaining what would otherwise be evident stupidity in having such a mascot). While everyone is distracted at the official contest ceremony, the rest of the gang will attempt to pull a heist. Classic plot.

It all fits ....

[Credit: Beagle Boys image from Kit's Silver Age Comic Books ]

Posted by Seth Finkelstein at 08:32 AM | Comments (5) | Followups
May 17, 2004

Nitke v. Ashcroft expert witness report of Seth Finkelstein

Nitke v. Ashcroft is an Internet censorship case challenging the obscenity provision of the Communications Decency Act (CDA). I'm serving as an expert witness on the topic of the Internet, anonymity, privacy, as it all relates to net censorship. My expert witness report is now available on-line:

Nitke v. Ashcroft : Seth Finkelstein expert witness report
http://sethf.com/nitke/ashcroft.php

As stated in the Nitke vs. Ashcroft Expert Witness press release:

The expert witness reports support the plaintiffs' contention that "local community standards" cannot be accurately applied to the Internet and, therefore, cannot be used to determine what is obscene. If the most restrictive communities can control what is placed on the Internet, then everyone will be restricted to that standard. The Internet is a world-wide phenomenon, therefore websites should not be held to standards specific to geo-location because community standards vary significantly from region to region and community to community.

Posted by Seth Finkelstein at 11:59 PM | Followups
May 09, 2004

Interviewed for "Blogging of a Thesis About Blogging"

Daniel Kreiss, who is doing "Blogging of a Thesis About Blogging", wrote an interview with me:

Seth Finkelstein interview

In the spirit of blogging, here's my participatory journalism regarding it:

Crashing Back Down to (a Realistic) Earth

Had a long chat with Seth Finkelstein last night. He has some fascinating insights/arguments into blogging, and why it's a myth that the journalistic gatekeepers are gone. ...

It's quite good, but I'm biased :-)

The discussion ranges over my ideas of gatekeepers of production being replaced with gatekeepers of audience, to power laws to the "complete and utter nonsense to say that blogging will herald a new era of "participatory democracy" or communication where everyone has a voice" (I did indeed say that).

In looking at the evidence, like the theory of power law, Finkelstein (who uses terms like "calculated" when discussing theoretical arguments; ...

Yup. That comes from my Math/Physics background. Many of these discussions strike me as very much like errors one can make in similar calculations. "What's the (electrical) power necessary to run this motor?" isn't too far from "What's the (political) power necessary to run this candidate?". Complete with the contingent that wants to assume a spherical cow.

Now, there's a part of the interview where I disagree or would comment:

I tend to agree that power law is a good description of how users are reading the web, but I also have a sense that this model does not adequately amount to a theory of digital communication. Communication also has a tendency to percolate back up (trickling perhaps, but it is happening none the less) to the gatekeepers of audience, or beyond that into other social relationships.

This is where I'd start thinking/asking, "What do you mean by "has a tendency", that is, how much"? Even in the most totalitarian dictatorship, there's some sort of "communication" between the elites and the population at large. Any smart ruler knows you have to listen to the masses to some extent, if only to keep track of who is a potential threat to imprison or kill. Getting too out-of-touch that way is a recipe for overthrow. But the elites and the dissidents sure aren't equal in communication.

For instance, my own newbie gestures at blogging at the time of this post have resulted in a grand total of two citations! Does that mean I am not heard, that I do not have a voice?

Yes. It means you don't have a voice if, say, you're concerned that a "Slashdot editor" with access to 250,000 readers may domain-hijack your website, for example. You couldn't fight back (unless those two readers happen to be very powerful themselves, what I call "The President And The Pope" argument).

Perhaps. But this might not be the end all measure of communication. This is not meant as a grand gesture here, but perhaps my ideas or reporting influenced someone's thinking, which then got passed onto their own blog, with or without the citation, and then around from there both off and online in their dealings with other people. My communication would then implicitly have an audience and power to it, even though I might have no idea or concept of the boundaries of that audience.

Audience (and used here as a proxy for power) is a variable. It can be measured and compared.

First person: "I'm heard by 250,000 people".
Second person: "Well, I'm heard by 250 people, does that mean I have no voice?"

Basic mathematics is that, all other things being equal, as a first approximation, the second person has 1/1000, one one-thousandth, of the voice of the first person, that the first person has ONE THOUSAND times the power of the second.

The amount of noise devoted to denying and obscuring the implications of this very simple little fact is amazing. On and on: Maybe audience isn't everything (right, it isn't, but it's not nothing either), maybe the first approximation isn't accurate (sometimes, but it's still useful overall), maybe the writer is happy to just stand on a streetcorner and rant to whomever passes by (which wasn't the point).

But the vast inequality in power this implies, replicated in Big Bloggerdom as much as other Big Media, is very ideologically unpalatable.

So regardless of the gatekeepers of audience, all communication has the potential to be implicitly powerful in how it is spread; and we do not have a good means for tracking this.

What is "implicitly powerful"? This sounds a lot to me like saying every lottery ticket has the "implicit power" to be a winning ticket. It does. But we also know that the probability is quite measurable.

True, some people are the social entrepreneurs in network theory, but there is always a dialectic at the micro level of communication (and this also does not account for the mere fact that people writing consistently, about anything, has implications in and of itself.)

"True, some people are super-rich, but even poor people have some money, and this does not account for the fact that having some money at all has implications in and of itself". See the problem? That is, saying almost all people have at least a little money, is typically not very useful to examining the divide between wealth and poverty.

There is a danger however, and Finkelstein is right to forcibly point this out. When people blow bubbles there is a distortion that occurs inside the bubble and whether that is traced through the stock market, the Dean campaign, or by ignoring the very real sites of social, economic, and political power, the promise of technology needs to be realistically combined with the cold hard historical reasoning that tells us there will never be a purely technological fix for what ails us.

Thus, we should advocate, and as strongly as ever, for the structural changes (like public subsidies for media outlets) that will create a more responsive, and responsible, media in this country.

I completely agree with the above. The problem, however, is that too many of the bubble-blowers think blogging in itself is that structural change. And I believe in this regard, they are: 1) deluding themselves 2) being cruel to the have-nots 3) aiding to ensconce the exact same gatekeeper hierarchy, by refusing to grapple with its emergent existence.

Posted by Seth Finkelstein at 06:13 PM | Comments (1) | Followups
March 03, 2004

Free porn, Google, spam, Internet censorship, and the Supreme Court

[Yes, this post really seriously concerns *all* the topics listed, it's truly that _tour de force_]

The Supreme Court just heard arguments on another Internet censorship law, "COPA", ( Ashcroft v. ACLU, 03-218). The Boston Globe reported:

Ordinarily, US Solicitor General Theodore B. Olson prepares for an appearance before the Supreme Court by acting out his argument before a pretend court. This time, for a case about the Internet, he added a new twist: searching online for free porn.

At his home last weekend, Olson told the justices yesterday, he typed in those two words in a search engine, and found that "there were 6,230,000 sites available."

The top lawyer who represents the Bush administration before the Supreme Court said the search's results illustrate how pornography on websites "is increasing enormously every day," a central point in his argument for saving an antipornography law that was enacted six years ago but has yet to go into effect.

Now, let's do something often unrewarded in this world - think. What search did he do exactly? It seems to be the following search in Google:

http://www.google.com/search?q=free+porn

That gives me now "about 6,320,000" results, close enough; the total number returned often varies a bit.

Now, what that search means is roughly the number of pages containing the words "free" and "porn" anywhere in the entire page (or links with those words). This blog entry will qualify as one of those results as soon as it is indexed. I don't think this blog entry is proof of how pornography on websites "is increasing enormously every day," much less the need for an Internet censorship law.

I've written about the problems of Google and stupid journalism tricks before. But, sigh, nobody reads me, so this won't get reported. Anyway, the story gets even better.

I started digging down into the results to see if I could find some non-sex-site mentions before the Google 1000 results display limit (Yes, Mr. Olson, there are more than 1000 sites devoted to sex in the world, that's true). Google's display stopped in the high 800's! That is, displayed at the bottom, for:

http://www.google.com/search?q=free+porn&num=100&start=900

In order to show you the most relevant results, we have omitted some entries very similar to the 876 already displayed.
If you like, you can repeat the search with the omitted results included.

The number varies, but it's been under 900.
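Here is a small sketch of how the paging URLs for that dig can be generated. The q, num and start parameters are simply the ones visible in the URLs in this post; the function name is mine, and I'm not claiming anything about how Google treats these parameters beyond what's shown here.

```python
from urllib.parse import urlencode


def results_page_url(query, per_page=100, start=0):
    """Build a search URL of the same shape as the ones in this post."""
    params = {"q": query, "num": per_page, "start": start}
    # urlencode turns the space in the query into "+".
    return "http://www.google.com/search?" + urlencode(params)


# Offsets 0, 100, ..., 900 cover the 1000-result display limit.
for offset in range(0, 1000, 100):
    print(results_page_url("free porn", start=offset))
```

The last URL printed is the start=900 one shown above.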

Joke: Hear ye! Hear ye! Instead of "6,230,000 sites available", there's really uniquely less than 900! At least, according to Google.

Now, this is the Google display crash from bugs in the Google spam filtering. Google has cleaned-up their index so the crash is not happening on the first screen of results. But it's still in their results display code. Usually, people don't see the bug in practice, since the crash has now been pushed very far down in the sequence of results.

But here I had a reason to go looking out as far as I could, and ran into the crash in a bona-fide real-world situation. Not just a trivial query too, but one with profound implications for Censorship Of The Internet.

[Update 3/4: Michael Masnick brings to my attention that what I thought was the old Google spam crash is now reduced to duplicate-removal processing on the 1000 results display limit - the point is still that I can use fallacious superficial search "logic" to assert there's less than 900 sites, because Google "says" so. But the technical reason is not quite what I wrote originally]

Humor: If the evidence from a Google search was good enough to be used to justify censorship when it said "6.2 million", why isn't it good enough to justify no censorship if on further investigation it says less than 900? That is, if you thought it was valid before, with a big number, why isn't it valid now, with a small number? (garbage in, garbage out)

Look at me, I'm a journalist (or grandstanding lawyer) - Google says there's practically no porn on the net!

Posted by Seth Finkelstein at 09:52 AM | Comments (9) | Followups
February 16, 2004

Howard Dean Domains

Inspired by Joe Trippi's blog domain, I went digging through the domain database to see if there was anything "interesting" to be found there. No scandal, but some amusing material associated with the Dean campaign. Most of the list was just DeanForInsertstatehere.com or BlahForDean.com. But amusingly, the domains:

DEANFLIPFLOPS.COM
HOWARDFLIPFLOPS.COM
WATCHDEANFLIPFLOP.COM

were all registered by "Dean For America" on "31-Oct-03".

And on "16-SEP-03", the Dean webmaster had registered

FLIPFLOPFORAMERICA.COM and FLIP-FLOPFORAMERICA.COM

I wonder what the story is there, just for the humor value.

Other funny domains:

DEANDEANDEANDEAN.COM
DEANDEANDEANDEANDEAN.COM
DEANDEPRESSION.COM
DEANISWRONG.COM

And interestingly:

DONTRALPH.COM
OPENSOURCEPOLICY.COM
POLICYFORAMERICA.COM

None of these seem to be in use.

If anyone wants it, I've made the list available (not meant to be exhaustive) at:

http://sethf.com/domains/dean/

Again, not exactly hot material, but it has its moments.

Posted by Seth Finkelstein at 11:59 PM | Followups
January 11, 2004

Outsourcing != Democracy

Whenever someone preaches that an industrial change is going to lead to a major revolution, I find that it's useful to consider whether there will be a revolution, but in the opposite direction entirely. So it is with many democracy-of-the-media discussions I've seen recently. All of these seem to have the same path:

The Media Revolution Is At Hand:

Production is much cheaper. Employees are easier to replace. The occupation is becoming less the province of skilled workers, and more of amateur labor which works nearly free. Advances in mechanization, err, communication, allow for cutting staff drastically, and one laborer can now do what previously required several people. If you don't take advantage of these trends, your competitors will, so get with it.

This is democracy?

No.

This is OUTSOURCING!

No wonder so many of the media pundits are so rude about blogs - they're defending their conception of themselves as hard-to-replace highly skilled labor.

But on the other hand, why am I supposed to be so excited that many skilled jobs are turning into unskilled jobs or cheap-labor jobs? Well, there is of course the populist joy in seeing an arrogant profession brought low. But putting aside heart-warming Schadenfreude at their humbling, the end result here seems to be the exact opposite of what's preached. That is, overall, there will be more power for management, not labor.

Posted by Seth Finkelstein at 11:26 PM | Comments (7) | Followups
December 16, 2003

Seth Finkelstein GrepLaw Interview (Censorware, Copyright, and Blogs)

GrepLaw has an interview with me today:

Seth Finkelstein on Censorware, Copyright, and Blogs
http://grep.law.harvard.edu/article.pl?sid=03/12/16/0526234&mode=nocomment

It's over 6,000 words long. Of course, I think it's well worth reading. But I'm biased there.

The topics range over censorware, copyright, DMCA, free-speech activism, committing bloggery, and more. I went on at length. So even if you've heard me say it all before, it might be worth a look just for the collected edition.

[Update: URL now goes to archived text on my site]

Posted by Seth Finkelstein at 07:24 AM | Comments (3) | Followups
November 26, 2003

Google Bayesian Spam Filtering Problem?

New Google report from Seth Finkelstein:

Google Bayesian Spam Filtering Problem?
http://sethf.com/anticensorware/google/bayesian-spam.php

Abstract: This report describes a possible explanation for recent changes in Google search results, where long-time high-ranking sites have disappeared. It is hypothesized that the changes are a result of the implementation of a "Bayesian spam filtering" algorithm, which is producing unintended consequences.

Posted by Seth Finkelstein at 09:07 AM | Followups
November 15, 2003

Google Deskbar

Google Deskbar is the latest little tool from Google. It's a self-contained searching program, which is very lightweight and fits snugly in a desktop screen (PR: "Google Deskbar enables you to search with Google from any application without lifting your fingers from the keyboard. Installs easily in your Windows taskbar.")

I was poking around at its innards in order to see if there was anything interesting inside. Internally, it seems to be a "microbrowser". That is, I think it hooks into Windows/Internet Explorer services in order to do a search, exactly as if you had typed it into the Internet Explorer browser. And then uses the Windows Operating System display routines to present the results.

On the one hand, that makes it heavily operating-system dependent in terms of code. On the other hand, it's extremely cheap in terms of development, a neat little hack.

The most socially interesting thing about it, is that given it's tying into Windows/Internet Explorer services, it appears to share the Google cookie with Internet Explorer, and use the Google cookie itself in all searching. That's not obvious, though it makes sense in retrospect.

It's actually a little strange, in terms of coming full circle with applications, to realize it's a microbrowser. That is, the original web browsers were simple programs devoted to rendering simple code. Then the inevitable "creeping-featurism" took over ("2. More generally, the tendency for anything complicated to become even more complicated because people keep saying "Gee, it would be even better if it had this feature too"."). So the browser became a behemoth, of often not-quite-working plugins, handling sound and video and cascades of style bleats. It's now so bloated that writing a small and fast program to do one common operation and display the results quickly, is some sort of innovation. Somewhere there's a lesson in that.

Update: I should have mentioned Dave's Quick Search Taskbar Toolbar Deskbar, thanks to LISnews

Posted by Seth Finkelstein at 11:56 PM | Followups
October 27, 2003

Whitehouse.gov iraq robots.txt directories - an explanation?!

Update 10/28: The White House says it's merely a design issue, from

http://www.2600.com/news/view/article/1803

Per: http://www.bway.net/~keith/whrobots/whresp.html


[(10/27) Just sent this to Dave Farber's list, about the whitehouse iraq robots.txt directories (update: note for more background, see http://www.bway.net/~keith/whrobots/ )]

Archived at

The White House And Iraq Directories
http://sethf.com/domains/whitehouse-iraq/

Dave, I've been analyzing the robots.txt file, exactly because the directories are so strange. I have a theory on what's happened. But it's so jaw-dropping that I'm hesitant to rush it into a formal report/release. In short:

There's no conspiracy.

There's a real-life instance of the joke genre which runs "I thought you said ..."

For example, here's one of the jokes: "After a California earthquake, Dan Quayle is sent to visit the most damaged site. But he never arrives there. Finally, he's found in Florida. He says, shocked, "Go to the EPIcenter? I thought you said ..." [EPCOT Center]

The joke here? Someone said:

"Don't have the search engines looking at the Iraq documents index"

And that was heard as:

"Don't have the search engines looking at every "index" with Iraq"

Really!

The evidence for this is that the robots.txt file has lines for

Disallow: /disk2/www/htdocs/infocus/iraq
Disallow: /disk2/www/htdocs/infocus/iraq/news/infocus/iraq

These are the only lines where there's never any matching pattern of "iraq" and "text" at all. They're obviously special in some way. And they look like they're a searchable index.

Then there's the fact that some people are confused between directories, the function of the file "index.html", and that a bare directory will display as "Index of <directory name>" in some servers.

So ... "Iraq index" ... "Index of <directory name>" ... Oooops!

Never attribute to malice that which can be explained by stupidity.

This is hard to believe. But it fits!


Update - the robots.txt file has been changed. Grab it from

http://sethf.com/domains/whitehouse-iraq/wh-robots.txt

Or while it lasts, the Google cache:

http://216.239.41.104/search?q=cache:tCfemw3M-aUJ:www.whitehouse.gov/robots.txt

Posted by Seth Finkelstein at 09:46 PM | Comments (4) | Followups
October 21, 2003

"Cites & Insights" November 2003, and math of six degrees of separation

Walt Crawford just published the November 2003 edition of his library 'zine (not blog) "Cites & Insights". It's excellent reading over many topics. More excellent, to me :-), is that I'm mentioned in three different places, in discussions of censorware, copyright, and perspectives on legal risks. I sent a few clarifications, though I don't think it's worth the space of going through the items for a post.

Rather, to do a change of pace, the discussion of the "Six Degrees Of Separation" idea caught my eye:

Once you leave a field, you need to look for other communities--and lots of us don't belong to that many communities. I'd be astonished if "six degrees of separation" for the world as a whole, or even for the United States, worked out in practice. It's a community thing.

The result is right. Formally, it's a graph-theory mathematical result. Given a graph of 6 billion nodes, and each node connected to (a few hundred? a thousand?) or so total other nodes, what's the average length of the smallest path between two nodes? I don't have a reference to the exact answer, but it's low.
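As a back-of-the-envelope check, here is a sketch using the standard rough estimate for a random graph, where the typical shortest path is about ln(N) / ln(k) for N nodes with about k connections each. That formula is my addition, not something from the studies under discussion, and it ignores community clustering, so treat the output as an order-of-magnitude figure only.

```python
import math


def estimated_path_length(nodes, contacts_per_node):
    """Rough random-graph estimate: ln(N) / ln(k).

    Assumes connections are spread at random, which real social
    networks are not; clustering makes real paths somewhat longer.
    """
    return math.log(nodes) / math.log(contacts_per_node)


POPULATION = 6_000_000_000  # the "6 billion nodes" above

for contacts in (300, 1000):  # "a few hundred? a thousand?"
    hops = estimated_path_length(POPULATION, contacts)
    print(f"{contacts} contacts each: about {hops:.1f} hops")
```

With these inputs the estimate comes out at about 3.9 hops for 300 contacts and about 3.3 hops for 1000, which is the sense in which the answer is "low".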

The interesting experimental result of these studies is that estimating a good path in the real-world is actually practical. The key is that, while there's community clustering, people can figure out how to "route" a message across communities, if they want. The critical factor is figuring out the maximal jump per each link. As the results show, it's do-able.

Note that asking "What's the number of hops for a connection?" is very different from asking "How many connections are made, versus die of disinterest?". That's akin to the issue of average life expectancy, where historically, there's a big difference between "Average everyone's lifespans, from 0 to 100", versus "If you survive childhood, how much longer do you live?" - because many people used to die around "0". And many message chains die around "0" too.

That is, overall, very few people may be interested in being routers (there's a lot of dropped packets). So if a path completes (every person is being a router), it has only a few hops necessary. But don't expect many paths to complete. Two different ideas.
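To put a number on "don't expect many paths to complete", here is a toy calculation. It assumes every person in the chain independently forwards the message with the same probability; that assumption and the sample values are mine, purely for illustration, not measurements from any study.

```python
def completion_probability(forward_probability, forwarders):
    """Chance that every one of `forwarders` people passes the message on.

    Toy model: each person forwards independently with the same
    probability, so the chances multiply.
    """
    return forward_probability ** forwarders


# Hypothetical values: six forwarders, each a coin flip.
print(completion_probability(0.5, 6))  # prints 0.015625
```

Even with a generous coin-flip chance at each step, under two percent of six-person chains finish, while the ones that do finish are still only six hops long.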

Posted by Seth Finkelstein at 11:59 PM | Comments (1) | Followups
October 07, 2003

Google Spam Filtering Gone Bad

I believe I've uncovered the cause of the "Google NACK", a problem where Google is returning no or very few results for certain combinations of search terms. I conjecture it is a consequence of trying to eliminate spam search results, but instead wrongly eliminating all subsequent results. Read:

Google Spam Filtering Gone Bad
http://sethf.com/anticensorware/general/google-spam.php

Abstract: This report describes a problem which caused Google to return very few, or no, results for particular combinations of search terms. It is almost certain this is a consequence of search results being post-processed by spam-defense which has gone awry.

Feel free to verify my methodology. Google has an incentive to rapidly patch any publicized examples.

[Hmm, maybe I should go into "Google studies", Google doesn't sue people!]

Posted by Seth Finkelstein at 03:37 PM | Followups
October 05, 2003

Blogs, Journalism, and Mathematics

The fallacy of "blogging == journalism revolution" has been on my mind today, from BloggerCon. I've figured out the key reasoning error:

People assume production is the same as audience

This is wrong. This is false. This is an unwarranted leap of logic ("then a miracle occurs") that has very little to recommend it, and much to argue against it.

A recent blog survey, "The Blogging Iceberg", has a good paragraph on this:

Nanoaudiences are the logical outcome of continued growth in blogs. Assume for a moment that one day 100 million people regularly read blogs and that they each read 50 other peoples' blogs. That translates into 5 billion subscriptions (50 * 100 million). Now assume on that same day there are 20 million active bloggers. That translates into 250 readers per blog (5 billion / 20 million) - far smaller audiences than any traditional one-to-many communication method. And this is just an average; in practice many blogs have no more than two dozen readers.

Everyone can't have an audience of millions. That's a simple mathematical fact.
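
The arithmetic in the quoted paragraph can be checked in a few lines. The figures are the survey's hypothetical ones; the last calculation (how many blogs could possibly have a million readers each) is my own extension of the same numbers.

```python
readers = 100_000_000          # people who regularly read blogs
blogs_read_per_reader = 50     # blogs each of them reads
active_bloggers = 20_000_000   # active blogs on the same day

subscriptions = readers * blogs_read_per_reader
average_audience = subscriptions // active_bloggers
print(subscriptions)      # total reader-to-blog subscriptions
print(average_audience)   # readers per blog, on average

# Upper bound on blogs with a million readers each, if they absorbed every subscription.
max_million_reader_blogs = subscriptions // 1_000_000
print(max_million_reader_blogs)
```

Under these assumptions the average is 250 readers per blog, and at most 5,000 of the 20 million blogs could have a million readers, which would leave nothing for anyone else.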

So, what's the result of traditional media + blogs? Are the media which do have an audience of millions going to just go away? Why would that happen?

There's a reasoning disconnect, from a very idealist dream, of everyone reading and writing to each other (on an assumed equal or at least meritocratic basis), to the practical constraint that it can't happen in implementation. Because everything from economies of scale to clustering tendencies ("power laws") is going to produce a relatively few large-audience outlets, and everything else is noise.

Posted by Seth Finkelstein at 11:58 PM | Followups
September 16, 2003

Verisign Typosquatting Explorer

I wrote a little Perl program to examine what domain names were being suggested by Verisign from their current foray into typosquatting.

If anyone's interested, go to my page for Verisign Typosquatting Explorer.

I haven't had much time to look to see if there's much in the results.

Posted by Seth Finkelstein at 08:52 AM | Comments (1) | Followups
September 03, 2003

Eeyores vs. Tiggers

Derek Slater had an extensive post On Bunner, where he remarked in a passage:

Seth "Eeyore" Finkelstein (who's been posting a lot about Bunner) and I discussed this awhile back. ....

The reference caught my eye, in an amusing way. Hmm, I thought, wasn't Lessig also Eeyore?

That inspired me: Forget Liberals vs. Libertarians or Geeks vs. Suits. An unexamined divide is Eeyores vs. Tiggers.

Especially when I saw this quote from Eeyore, which sums up much:

'Sometimes he thought sadly to himself "Why?" and sometimes he thought "Wherefore?" and sometimes he thought "Inasmuch as which?" - and sometimes he didn't quite know what he was thinking about.'

Remember the Tigger is described as:

Their tops are made out of rubber. Their bottoms are made out of springs. They're bouncy, trouncy, flouncy, pouncy Fun, Fun, Fun, Fun, Fun!

Unfortunately, there isn't only one (link omitted out of self-preservation). Anyway, it's fun to be a Tigger. (fun, fun, fun, fun, fun!) You get to be bouncy, trouncy, flouncy, pouncy. To sing of "Emergent Pundocracy" and "Smart Snobs", go on about "The Second Soupy Powder". Who wouldn't want to live in "Cyber's Place", the new home of wunderkind?

By contrast, being an Eeyore is indeed pretty gloomy. It's no - fun - at - all. Copyblight and shrinking-wrap and trade-bleakness and De-'Em-See-Away. Lawsuits and lawyers and liability and losing.

However, the Eeyores tend to be right, while the Tiggers get to be popular. But to quote Eeyore,

'Pathetic. That's what it is. Pathetic.'

Posted by Seth Finkelstein at 11:09 PM | Followups
August 27, 2003

DVD-CCA v. Bunner, my punditry on What It Means

What follows are some thoughts I have about what the recent decision in the Bunner DVD trade-secret case actually means. Note I am not a lawyer, and the views below are my own, no warranty expressed or implied, free advice is worth what you pay for it, and so on.

In general, this is, in the abstract, a formal, procedural decision. It is not a factual ruling. It's a matter of law. However, within those formal, procedural, matter-of-law constraints, I see things being said which are not good. But I see it as problematic in a much more complex fashion than the popular press is reporting it.

The popular reporting may be that this decision ruled the facts against Bunner. That's wrong. But I also think it's too abstract (though not strictly wrong), to infer nothing at all about how the facts are likely to be ruled on "remand" stemming from what's written in this decision.

My understanding is that the Appeals Court says:
(emphasis mine in all the below)
http://www.eff.org/IP/Video/DVDCCA_case/20011101_bunner_appellate_decision.html

"Preliminary injunctions are ordinarily reviewed under the deferential abuse-of-discretion standard. We consider only whether the trial court abused its discretion in evaluating two interrelated factors."

They would like to let Bunner off. But they have a problem. They will have a very hard time doing that under a "deferential" "abuse-of-discretion standard". So they make a big jump:

"However, not all restraining preliminary injunctions are entitled to such deferential review. ... Thus, in order to determine the appropriate standard of review, we must first decide whether the restraint imposed by the trial court's preliminary injunction implicated Bunner's First Amendment right to free expression. If so, we exercise independent review. "

This jump gets them out of the "deferential" state, and into the "independent review" state. And they are happy, because they then can write on about the importance of free speech, as a principle.

But this jump lands in the CA Supreme Court. The CA Supreme Court slams it, hard. Not valid, error, core dump, etc. They send it back to the Appeals Court.

We have now returned from the jump. Since no further ruling on facts has formally been made, we could abstractly be said to be no worse off than before. That would be the formal answer. However, informally, I think the key is in this part:

"If, after this examination, the court finds the injunction improper under California's trade secret law, then it should find that the trial court abused its discretion. (See ibid. [holding that, in determining whether the "issuance of a preliminary injunction constitutes an abuse of " discretion under the First Amendment, the reviewing court must independently review the factual findings subsumed in the constitutional determination]; ... [holding that preliminary injunctions are reviewed "under an abuse of discretion standard"].) Otherwise, it should uphold the injunction."

The Appeals Court didn't want to do that review under an "abuse of discretion" standard. So though the case is now being returned back to a favorably-inclined court, it's going back with extremely strong "guidance" to be decided in a way that the Appeals Court wanted to avoid - for the obvious reason that such a path strongly implied upholding the injunction, as a practical matter.

The Appeals Court is now locked back into the "abuse of discretion" box. Along with plenty of attitude conveyed, that the defendant is a bad guy and the plaintiff is a good guy. In theory, they could still have a favorable ruling. But I see them as being told here to uphold the injunction unless they can come up with an extremely good reason why not (again, "abuse of discretion").

Of course, I-Am-Not-A-Lawyer. But I'm trying not to be a defendant either 1/2 :-).

Update: A smart, top-flight, veteran, California lawyer tells me that I'm misreading that key standard of review aspect. The Appeals Court is in fact being told to exercise a fully independent review, not a deferential review. If so, I'll own up to misreading the above.

Again, IANAL

Posted by Seth Finkelstein at 12:57 AM | Comments (9) | Followups
June 25, 2003

"CIPA-compliant" library censorware

The idea of minimal, "open-source", library-specific censorware is being widely discussed (see, e.g. Edward Felten's comments - thanks Donna)

Here's the problem:

1) If any library wanted to play challenge-the-law, all they would need to do is sit back and say "Give us the specific, judicially-decided, URLs to be banned, and we'll ban them -- but not one URL more." And then wait for the compliance lawsuit to be brought. Very simple.

2) If they don't want to be challenging the law, why would they undertake what will certainly be a major PR hassle? That is, anyone can come up with harsh-but-not-illegal sites and say "Library X allows these PORNOGRAPHY sites to be viewed!". So do they get added to the blacklist or not? You mean the library is going to stand up to a constant barrage of bad PR like this? If they were willing to do that, we'd be in case #1.

Two words: Robert Mapplethorpe.
Blacklisted or not? Think through your answer in either case.

What happens when the "North American Man-Boy Love Association" asks to be whitelisted?

The idea of Open-Source Censorware (more accurately, an Open-Source Censorware Blacklist) is one which is very appealing from 10,000 feet. But it falls apart on any close examination.

OpenCensorware is far more work than may be apparent.

Here are the best-known people who are trying it:

http://www.squidguard.org/
http://www.squidguard.org/blacklist/

Heard of them? No? Consider that there are reasons why.

By the way, the Australians tried this idea too:
http://zem.squidly.org/software/guilt.html

http://www.anu.edu.au/mail-archives/link/link0002/0275.html

"Announcing the GnU Internet Lust Terminator, an open-source censorware proxy that only filters ABA-supplied banned URLs.

The software is being developed by Zem for 2600 Australia and will be eventually submitted to the IIA for inclusion as an approved filter"

The Australian government didn't approve it.

[Update 6/26 - I've also suggested privoxy (http://www.privoxy.org/) ]

Before people write back, here's my challenge:

Don't tell me this is such a great idea. Find libraries who will use it who agree it's such a great idea!

[Disclaimer - I said to one proponent of this idea that I'd help make it happen, if he could find libraries which wanted it, and funding for it, as part of a challenge above]

Posted by Seth Finkelstein at 11:25 AM | Comments (7) | Followups
June 13, 2003

DMCA vs fair-use

DMCA/fair-use blog party!

Donna and Derek and Kerr and Balkin and Solum and Frank ...

Let me jam too.

I think I understand what Balkin is saying, and also what Kerr is saying.

Here's the deep question, which is being batted around:

Is fair-use a substantive limit, or a technical exception?

The side Kerr is arguing, what some call "affirmative defense", I call the "technical exception" view. That is, it conceives of fair use as having no overarching meaning, no deep significance. It's just a procedural reply in some particular sections of copyright law. The implication here, being that if one creates a new section of the copyright law - such as the DMCA - there's no carry-over, no principle to apply. The sections of the laws are partitioned, and never the twain shall meet.

The side Balkin is arguing, I call the "substantive limit" view. Fair use is an aspect of the First Amendment. It's intrinsic to any copyright-associated law by virtue of drawing power from the First Amendment's scope and reach, as a Constitutional provision. It's a bit like an all-pervasive Holy Spirit that way (the DMCA makes baby Jesus cry).

Now, Balkin is reading the Eldred decision as having a kind of genuflection to the pervasive spirit of fair use. How he does this, from perhaps the largest copyright-grab in history, is awesome to behold. The idea is that the court says the copyright-grab is OK in part since it didn't change fair use:

But when, as in this case, Congress has not altered the traditional contours of copyright protection, further First Amendment scrutiny is unnecessary.

So, goes the thought, this is a shining reaffirmation of the importance of fair use as substantive limit. And that strengthens the argument of those who argue that the DMCA is a restriction of this substantive limit. Follow the reasoning?

Frankly, this strikes me not as making lemonade out of lemons, but rather, wading through a pile of manure and trying to find a pony.

The cyanide in this lemonade is that it in fact doesn't help much against the "legal hack" that the DMCA doesn't affect fair use:

* (c) Other Rights, Etc., Not Affected. - (1) Nothing in this section shall affect rights, remedies, limitations, or defenses to copyright infringement, including fair use, under this title.

So the DMCA defenders are going to argue that in fact "[the DMCA] has not altered the traditional contours of copyright protection". Why? It says so right there, see? "Nothing in this section shall affect ...". But, respond the DMCA opponents, fair use is a substantive limit! No, say the DMCA defenders, fair use is a technical exception ...

Roundabout, here we come, right back where we started from ...

Posted by Seth Finkelstein at 05:30 PM | Followups
May 20, 2003

Googlewash, Nunberg, Orlowski

[Semi-name-dropping disclaimer - I like Andrew Orlowski's articles, and think they're asking good questions even if not immediately having the best answer to the question. I've even been quoted, willingly, in one Register Google piece. I've never talked to Nunberg, but I believe he's used some of my censorware investigations research in his CIPA expert testimony, so I also have incentive to favor him.]

I was puzzled recently when Edward W. Felten wrote:

Sunday's New York Times ran a piece by Geoffrey Nunberg complaining about (among other things) the relative absence of major-press articles from the top ranks of Google search results. ...

The real explanation is simpler : The Times forbids Google to index its site.

Huh? This took me aback. I couldn't even find that "complaining" in the piece at first. Some digging, via John Palfrey to Doc Searls finally let me figure it out. I believe what's fueling a certain reaction is this:

People think that the Nunberg/New York Times article is in part complaining about their Google PageRank - because that is what concerns net-writers!

No, folks. New York Times writers don't care about their PageRank. They don't need it! They're heard already. By people who read short briefing papers prepared by staff. The New York Times is at the top, and it's a very different world up there, from down here.

If anything, I read Nunberg as being ever so slightly critical of Orlowski, and quite accepting of the Google results. I think he was saying very roughly that Google returns what people were talking about, and more people were talking about a "blog" topic than a "major-press" topic here, so that's what you get. Then people viewed this as somehow being a "complaint". But I didn't see Nunberg as complaining, so much as stating that chatter may be popular, but it isn't authoritative, and shouldn't be expected to be so. The same sentiment I express as "Google is good, but not God."

Posted by Seth Finkelstein at 03:45 PM | Followups
March 10, 2003

"Michael Owen" is the UK injunction "mystery footballer"

The "mystery footballer" who cannot be named in the British press (due to an injunction), has apparently been named on a Norwegian newspaper's website.

Says a poster from New Zealand ("godzone_kiwi@xtra.co.nz"), in
http://groups.google.com/groups?selm=IAfaa.3250%248b.444863%40news02.tsnz.net

"http://www.vg.no/pub/vgart.hbs?artid=32734"

And, while I don't read Norwegian, the name "Michael Owen" is clear in the article (hey, I'm in the US. I can say it!)

And in fact, automatic translation gives something where the gist of the article can be derived:

http://www.tranexp.com:2000/Translate/index.shtml?from=nor&to=eng&type=url&url=http%3A%2F%2Fwww.vg.no%2Fpub%2Fvgart.hbs%3Fartid%3D32734

"allegedly document that Michael Owen (22) has been faithless against ... Louise Bonsall, as am pregnant in seventh month"

Isn't the Internet amazing?

Update: More coverage, in English now, from Singapore:
Owen's no Saint Michael
http://straitstimes.asia1.com.sg/football/story/0,1870,176015,00.html
Also available at http://groups.google.com/groups?selm=8ac92a26.0303121333.15fc3530%40posting.google.com

Posted by Seth Finkelstein at 10:23 PM | Followups
February 07, 2003

UK Parliament Mail - The Ministry Of Silly Messages

From: Seth Finkelstein
To: Seth Finkelstein's InfoThought list
Subject: IT: UK Parliament Mail - The Ministry Of Silly Messages
Date: Fri, 7 Feb 2003 13:45:06 -0500

New report:

UK Parliament Mail - The Ministry Of Silly Messages
http://sethf.com/anticensorware/general/uk.php

Abstract: This report examines messages being rejected by a mail system in use by the UK parliament.

I've reverse-engineered why the system used by the UK parliament to scan mail for "inappropriate content" was bouncing messages ranging from Welsh newsletters to a Shakespeare quote. Censorware is not fond of pussy-cats and tit-willows.

URLs:

E-mail vetting blocks MPs' sex debate
http://news.bbc.co.uk/1/hi/uk_politics/2723851.stm

Software blocks MPs' Welsh e-mail
http://news.bbc.co.uk/1/hi/wales/2727133.stm

Plaid up in arms as Commons spam filter bans Welsh
http://www.theregister.co.uk/content/6/29199.html

UK Parliament Mail - The Ministry Of Silly Messages
http://sethf.com/anticensorware/general/uk.php

NTK (Need-To-Know) coverage
http://www.ntk.net/2003/02/07/

Cyber-Rights & Cyber-Liberties (UK)
http://www.cyber-rights.org/

--
Seth Finkelstein
Anticensorware Investigations - http://sethf.com/anticensorware/
Seth Finkelstein's Infothought blog - http://sethf.com/infothought/blog/
List sub/unsub: http://sethf.com/mailman/listinfo.cgi/infothought

Posted by Seth Finkelstein at 01:54 PM | Followups
February 06, 2003

Program vs. data as wave vs. particle, dualities

Let me try this from another direction. In physics, for light, there's a phenomenon called "wave-particle duality". That is, in some ways a photon of light acts as if it's a tiny billiard ball (a particle) and in other ways it acts as if it's a ripple in material (a wave).

So asking "Is something program or data?" is a bit like asking "Is light a particle or wave?". As an intrinsic property, it's always both. But this doesn't mean everything stops there. Depending on extrinsic considerations, one or the other aspect is the way it is taken in a particular situation.

In a legal analogy, I mentioned the same action being accident or murder depending on the state of mind. What I was attempting to express there, was less the specific idea that the distinction between accident and murder can be based on intent, and more the general idea that it's based on certain extrinsic rules on how to place the very same action. Did the person intend to do harm? How much did they intend? Even if they did intend, is that intent excusable? ("justifiable homicide"). However, the target is just as dead, regardless of the outcome of this rule-based determination procedure of what legal category should apply to the action.

I do think what might be called "program-data" (or "speech-code") duality has profound implications. But I also think discussion of those implications often gets derailed into an uninteresting side-path where people ask

"How can treating dual-thing as aspect-1 in situation-1, be reconciled with the fact that dual-thing is treated as aspect-2 in situation-2? Is dual-thing actually aspect-1 or aspect-2? Surely, since dual-thing can be both aspect-1 and aspect-2, then it must be treated also as aspect-2 in situation-1, and aspect-1 in situation-2. Ha-ha-gotcha!"

As a purely philosophical objection, I don't think this works. Legally, line-drawing is done all the time. The deep problem, as I see it, is if the objection works as a practical issue. As in the following part of the DeCSS decision:

FN275. During the trial, Professor Touretzky of Carnegie Mellon University, as noted above, convincingly demonstrated that computer source and object code convey the same ideas as various other modes of expression, including spoken language descriptions of the algorithm embodied in the code. Tr. (Touretzky) at 1068-69; Ex. BBE, CCO, CCP, CCQ. He drew from this the conclusion that the preliminary injunction irrationally distinguished between the code, which was enjoined, and other modes of expression that convey the same idea, which were not, id., although of course he had no reason to be aware that the injunction drew that line only because that was the limit of the relief plaintiffs sought. With commendable candor, he readily admitted that the implication of his view that the spoken language and computer code versions were substantially similar was not necessarily that the preliminary injunction was too broad; rather, the logic of his position was that it was either too broad or too narrow. Id. at 1070-71. Once again, the question of a substantially broader injunction need not be addressed here, as plaintiffs have not sought broader relief.

Posted by Seth Finkelstein at 12:57 PM | Followups
February 05, 2003

Programs vs. Data, a simple example

Edward Felten discusses Programs vs. Data, and trying to distinguish them. Here's an example I've given to people before, for consideration:

The ROT13 algorithm explained ("Caesar Cipher")

1) The decryption algorithm for ROT13 is to take the range of letters from a-z, and for those twenty-six letters, replace the first thirteen of them with the range of letters from n-z and the second thirteen of them with the range of letters from a-m

2) To un-ROT13, do a tr/a-z/n-za-m/ over each character in the file

3) perl -pe 'tr/a-z/n-za-m/;' < infile > outfile

Where did I step over the line, from "speech" to "code"?

Or where did I make the transition between "data" and "program"?

Ed Felten says "it seems unsatisfactory to call something a program or not based on the state of mind of its author." I submit that for legal purposes, something along those lines of "primary use" or "dominant purpose" is the only system which will work. It's a bit like the difference between accident/manslaughter/murder-second-degree/murder-first-degree. The same "data" (outcome) is treated differently depending on a legal "program" (ruling) regarding intent and effect.
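
For comparison, here is a fourth description of the same thing, this time in Python rather than Perl. It's a minimal sketch: like the tr/a-z/n-za-m/ in step 3, it only touches lowercase a-z and leaves everything else alone. The function and table names are mine.

```python
import string

LOWER = string.ascii_lowercase
# Map a-m onto n-z and n-z onto a-m, the same mapping as tr/a-z/n-za-m/.
ROT13_TABLE = str.maketrans(LOWER, LOWER[13:] + LOWER[:13])

def rot13_lower(text: str) -> str:
    """Apply ROT13 to lowercase letters only; other characters pass through."""
    return text.translate(ROT13_TABLE)

scrambled = rot13_lower("speech or code?")
print(scrambled)
print(rot13_lower(scrambled))  # applying it twice gives back the original
```

Whether that table of twenty-six letter pairs is "data" and the one-line function is "program" is exactly the question.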

Posted by Seth Finkelstein at 05:48 PM | Followups
February 02, 2003

Domains With Typographical Errors - A Google Search Strategy

I was inspired this weekend, and cross-connected the earlier domain searching to Google.

Domains With Typographical Errors - A Google Search Strategy
http://sethf.com/domains/typos-google/
by Seth Finkelstein

Abstract: This paper describes a strategy for searching for domain names with typographical differences by using Google, and compares the results to a previous search using approximate string matching.

This is in response to a report
Large-Scale Registration of Domains with Typographical Errors
http://cyber.law.harvard.edu/people/edelman/typo-domains/
by Benjamin Edelman.

He describes an extensive series of domain names with typographical errors which have been registered by a cybersquatter, and asks for help in identifying these targets. This creates what might be called an "inverse problem", of determining what the targets of the squatted typo'ed names are.

Posted by Seth Finkelstein at 11:57 PM | Followups
February 01, 2003

Space Shuttle Columbia

I remember the Challenger Accident.

Bad deja vu.

Posted by Seth Finkelstein at 08:46 PM | Followups
January 31, 2003

Domains with Typographical Errors - A Simple Search Strategy

Domains with Typographical Errors - A Simple Search Strategy
http://sethf.com/domains/typos/
by Seth Finkelstein

Abstract: This paper describes a simple strategy for searching for domain names with typographical differences, and the results of one such search.

This is in response to a report
Large-Scale Registration of Domains with Typographical Errors
http://cyber.law.harvard.edu/people/edelman/typo-domains/
by Benjamin Edelman.

He describes an extensive series of domain names with typographical errors which have been registered by a cybersquatter, and asks for help in identifying these targets. This creates what might be called an "inverse problem", of determining what the targets of the squatted typo'ed names are.

Note Donna Wentworth at Copyfight described my paper beautifully - "Seth F. gets agrep on the problem".
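
For anyone who hasn't met agrep: it does approximate string matching, finding strings within a small number of errors of a pattern. The sketch below shows the general idea in Python using plain edit distance. It is not the procedure from the paper, the domain names are made up for illustration, and the function names are mine.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

def likely_targets(typo: str, candidates: list[str], max_errors: int = 1) -> list[str]:
    """Candidate names within max_errors edits of the typo'ed name."""
    return [name for name in candidates if edit_distance(typo, name) <= max_errors]

# Hypothetical data: one typo'ed name and a short list of possible targets.
candidates = ["example.com", "sample.com", "examine.com"]
print(likely_targets("exmple.com", candidates))
```

Note that a swapped pair of letters counts as two errors under plain edit distance, so a real search would want a somewhat looser threshold or a distance that treats transpositions as one error.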

Posted by Seth Finkelstein at 11:58 PM | Followups
January 23, 2003

Matt Blaze Master Key security paper, earlier attack descriptions

I read with great interest Matt Blaze's paper,
"Cryptology and Physical Security: Rights Amplification in Master-Keyed Mechanical Locks"

He wrote:

It is always difficult to be sure that an attack is completely novel in the sense of not having previously been discovered independently; the lack of a coherent and open body of literature on locks makes it especially so. In this case, several correspondents have suggested that similar approaches to master key reverse engineering have been discovered and used illicitly in the past. However, there do not appear to be references to this particular attack in the written literature of either the locksmith or underground communities.

I was able to supply him with two references to earlier descriptions of the attack, in one case 15 years ago.

Compare:

2.2.2 The Attack

For each pin position, p from 1 to P , prepare H - 1 test keys cut with the change key bitting at every position except position p. At position p, cut each of the H -1 keys with each of the possible bitting heights excluding the bitting of the change key at that position. Attempt to operate the lock with each of these test keys, and record which keys operate the lock.

With the following item (note: from 1987)

http://yarchive.net/security/master_keys.html

From gwyn@brl-smoke.arpa (Doug Gwyn) 12-Nov-1987 17:36:05
Subj: [1137] Re: mastered systems

"Obtain one extra key blank per pin column (7 for the typical institutional Best lock); duplicate the operating key except for one column on the blanks, omitting a different column on each blank. Then, for each blank, try it with the omitted column cut to number 0 (high), then 1, then 2, ... and record which bittings open the lock. That tells you what the splits are in that column. The whole set of trials tells you what all the splits are in all columns."

And a similar one (note: from 1994)

http://groups.google.com/groups?selm=2jcejp%24csc%40coyote.rain.org

From: jay@coyote.rain.org (Jay Hennigan)
Newsgroups: alt.locksmithing
Subject: Master key hacking Was:Re: Legality of picks...
Date: 9 Feb 1994 20:53:13 -0800

If you have a "change" (industry term for normal non-master) key and the lock that it fits, as a guest in a hotel would, as well as a number of blanks, you can do the following: Cut a key identical to your key, but with the first pin position uncut or a "0" cut. Try it in the lock. If it works go on to step 2. If not, take the first pin down one depth using a key gauge or micrometer (or a Le Gard or other code cutting machine). Try it again until the key works. When you hit the depth of the cut on your original key, it should obviously work, as the keys should be identical. If so, continue going deeper. You are likely to find a depth on the first pin _in addition_ to the one on your key that opens the lock. If not, then cut another blank with the first position identical to yours, and the second one at the top or "0" cut.

Step 2: Repeat as above with the next pin position.

The object is to find the cut at each pin position that is different from the single-lock key you have, but still opens the lock. This will be the master key bitting. Having two different keys (and locks) from two different areas of the masterkeyed system will make things a bit easier, as you'll have a way of cross checking, especially if there are more than two breaks in some pins. This exercise, if you're precise, and lucky, can take as few as 5 or 6 key blanks. At most, a dozen. No real skill in picking or impressioning is needed. [... rest of article snipped]
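
All three descriptions boil down to the same search, which can be simulated in a few lines. This is only a sketch under simplifying assumptions of mine: the "lock" is a function that accepts, at each pin position, either the change-key cut or the master-key cut (so no more than two breaks per pin), and the bittings are made-up numbers.

```python
def make_lock(change_key, master_key):
    """Simulated lock: each pin accepts the change-key cut or the master-key cut."""
    accepted = [{c, m} for c, m in zip(change_key, master_key)]

    def opens(key):
        return all(cut in ok for cut, ok in zip(key, accepted))

    return opens

def recover_master(change_key, opens, heights):
    """Vary one position at a time; any other height that opens is the master cut."""
    master = list(change_key)
    keys_tried = 0
    for position in range(len(change_key)):
        for height in range(heights):
            if height == change_key[position]:
                continue  # the change key's own cut tells us nothing new
            test_key = list(change_key)
            test_key[position] = height
            keys_tried += 1
            if opens(test_key):
                master[position] = height
    return master, keys_tried

# Hypothetical 5-pin lock with 10 possible cut heights per pin.
change = [3, 1, 4, 1, 5]
lock = make_lock(change, master_key=[3, 6, 4, 2, 8])
master, keys_tried = recover_master(change, lock, heights=10)
print(master, keys_tried)
```

The trial count here is P * (H - 1) test keys, as in the paper's description. The Usenet versions get by with far fewer blanks because one blank per position can be cut progressively deeper.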

Update: There's some interesting commentary on Dave Farber's IP list:

Donald Eastlake commentary:
http://www.interesting-people.org/archives/interesting-people/200301/msg00136.html

Bob McClure commentary:
http://www.interesting-people.org/archives/interesting-people/200301/msg00147.html

Matt Blaze reply message, "Keep it secret, stupid!":
http://www.interesting-people.org/archives/interesting-people/200301/msg00153.html

And a thread on the newsgroup alt.security.alarms
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=3e32f52d_2%40corp.newsgroups.com&rnum=1&prev=/groups%3Fq%3D%2522master%2Bkey%2522%2Bimpression%26hl%3Den%26lr%3D%26ie%3DUTF-8%26scoring%3Dd%26selm%3D3e32f52d_2%2540corp.newsgroups.com%26rnum%3D1

There's also discussion with postings from Matt Blaze himself, on alt.locksmithing
http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&threadm=1043353164.31996%40cswreg.cos.agilent.com&rnum=1&prev=/groups%3Fq%3Dg:thl2517357864d%26dq%3D%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D1043353164.31996%2540cswreg.cos.agilent.com

Posted by Seth Finkelstein at 11:57 AM | Followups
January 14, 2003

More on "Blog Politics of Form"

Copyfight's Donna Wentworth replied to my earlier message on Blog blather. Let me hasten to reassure that I'm not against thinking about Grand Ideas. I do it myself :-). And I never want to seem to be too hard on people for such thoughts, and regret if I come across that way. I sympathize with the feeling.

There are occasions when an innovation seems world-changing, revolutionary. And sometimes it is world-changing - but often the world doesn't change in the ways you expect. This problem is one reason I've resolved to stop arguing with people over You-Can't-Censor-The-Net. Because someone who is caught up in the latest iteration often takes it that I'm raining on their parade, when I tell them about earlier times where that battle-cry didn't work.

To be clear, I'm not describing Donna's post in particular, here. She was presenting an idea and asking a question or two--not engaging in a full-scale examination of an issue.

Instead, I'm describing a genre of discussion where I find it difficult to discern any meaning. The problem is that such discussions often don't consider the politics of form, but something more akin to a literary theory of form. That is, they don't take into account practical strategies of organizing, where there are winners and losers, big money, entrenched interests, and the outcome is often not what would be considered just. Rather, the focus is typically more on the experience of writing and reading, and then large leaps to Profound Thoughts.

And then there's the next fad, and the whole punditry process starts all over again.

Literary theory isn't wrong. But a little goes a long way.

I'm all for examining the impact of innovations on society. But I mean, really examining it, as in, also taking into account what doesn't change, and what counter-intuitive results occur.

Rule of thumb : Any examination which doesn't reveal at least one serious negative consideration isn't worthwhile, because it's just hype.

Posted by Seth Finkelstein at 03:31 AM | Followups
January 13, 2003

Reply to "Blog Politics of Form"

Donna Wentworth at Copyfight talks about the "politics of form". While I think that's an interesting topic, I also think it won't get discussed meaningfully. Because the meaningful material is likely to be more specialized and unsexy than befits Grand Ideas. No offense, but as I read over everything, I thought again:

AARRGGHH! More blather!

Some days, I think I would be vastly more popular if I took journo-blathering seriously. Let's see ... "Yes, the weblog is yet another pinnacle in the postmodern [neat word!] cyber-democratization [neat prefix!] of the infosphere [neat phrase!]. It is not the ``I Media'' of the top-down organizational form of the old regime, but as others have noted, the ``We Media'' [neat term!] of a spontaneously self-organized complex system [a sprinkling of pseudoscience jargon is always good!]. We must ask "What Does It All Mean"? [big broad questions are excellent filler!] And answer that the meaning is a unique new frontier in human expression [nothing is ever an old retread!] ..."

Sorry, but I get curmudgeonly over this stuff. I lived through the growth of mailing-lists, Usenet, the early Internet, and so on. I remember when there really was an aspect of egalitarianism and democratization with networked communications. But it was a fragile state, stemming from the fact that the community was small and insular then, and it didn't last.

I think the key insight is the following:

PUNDITRY ISN'T DEMOCRACY

More opportunities for punditry don't necessarily mean society becomes more egalitarian - this is the fundamental error of 95% of the noise on the topic. It connects to the idea of commentators being the watchdogs of a well-functioning world. So then more comments equals a better world. But rather, it just means more people have a chance at becoming professional chatterers, and/or the existing chatterers have yet another outlet. Indeed, that's a change, certainly a change worth studying - but not a unique, unprecedented change. And the implications are likely to be much less than the hype over them.

Posted by Seth Finkelstein at 01:30 PM | Followups
January 09, 2003

Reply to "Replace Copyright with Watermarks, Taxes"

Donna Wentworth at Copyfight asks for thoughts on the following music proposal:

Fisher's first choice, he said, would be to recognize that copyright law is increasingly dysfunctional for handling music royalties and to (1) Authorize artists to insert simple watermarks in their creations, (2) Tax, at the multilateral or national level, things such as ISP access and various technologies upon which music is performed, (3) Count the frequency with which each digital product is consumed, (4) Distribute revenue from the taxes in the proportion in which the various products are accessed. Once the system is in place, he said, copyright law can be "lifted."

I think the general outlines are good, and many people (including myself :-)) have said vaguely similar things in the past. However, the devil is in the details. In particular, I've emphasized point #3 for a reason. HOW does he intend to "Count the frequency with which each digital product is consumed"? Super-spyware? Require every player to recognize the watermark? That would of course require non-watermark-responding players to be illegal, right ... (umm ... didn't we just go through this?)

Don't get me wrong, again, the overall idea, of some sort of mandatory license and statistical royalties seems to be the right thing. However, getting the details correct is the tough part. Arguably, this idea worked reasonably well in the "Audio Home Recording Act", with its tax on digital recording media. And maybe Fisher's riffing off of it.

But if so, it's a riff in a "visionary" manner, where the details are being neglected for the Grand Idea. It's one thing to tax digital tapes, where there's a discrete object, and the tax is small compared to the price. But what is "various technologies upon which music is performed"? The $10 (?) for the motherboard sound chips? The speakers? He's not planning to tax free-software Linux players as a "technology", I hope (I'm having a bad DeCSS flashback here, with code as technology!) The bandwidth? It seems like there's just not enough money there.

Maybe he can make it work. But the acid test for any proposal is to work with free (in both speech and beer) software, and come up with some in-the-ballpark numbers.

Posted by Seth Finkelstein at 03:28 AM | Followups
December 29, 2002

N2H2 (censorware co) - financially "dead company walking"

I've finally finished ploughing through N2H2's recent financial report and attempting to figure out just how near they are to death's door (approaching? threshold? already through?). I think it's a matter of "dead company walking".

I've finally made sense of their announced "cash flow positive" quarter. Remember, N2H2 loses around $1.7 million each quarter.

Now, look at the N2H2 Balance Sheet

Note how "Cash and Cash Equivalents" takes a big jump up on Sep 30, 2002.

But projecting, the estimated numbers would be (in thousands)

"Net Tangible Assets":
+2,152(down 1,400 to) +752(down 1,017 to) -265(down 1,507 to) -1,772
"Cash and Cash Equivalents":
+6,000(down 1,740 to) +4,260(down 1,485 to) +2,775(now project) 728? (est?)
calculate "Cash and Cash Equivalents" - "Net Tangible Assets":
+3,848  +3,508  +3,040(now project) +2,5? (est?)

That is, the next number in the series for "Cash and Cash Equivalents" should be down by "1,something", leaving less than 1,000 remaining. Projecting from the shrinking difference between "Cash and Cash Equivalents" and "Net Tangible Assets" gives around 728.

Instead, they record a total "Cash and Cash Equivalents" UP to +4,684. That's an overage of (4,684 - 728? = 3,956?). Where are they getting that extra 3,956 or so?
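For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It only re-does the subtraction on the figures quoted above (in thousands); the variable names are mine, and the 728 projection is taken as given from the estimate in this post, not derived from any official N2H2 number.

```python
# Figures in thousands of dollars, copied from the series quoted in the post.
net_tangible_assets = [2152, 752, -265]
cash = [6000, 4260, 2775]

# Fourth value of the "Net Tangible Assets" series, and the post's own
# rough projection for cash (an estimate, not a reported figure).
next_nta = -1772
projected_cash = 728
reported_cash = 4684

# Difference between cash and net tangible assets for each quarter.
gaps = [c - n for c, n in zip(cash, net_tangible_assets)]

# Quarter-over-quarter drops in each series.
nta_drops = [a - b for a, b in zip(net_tangible_assets, net_tangible_assets[1:])]
cash_drops = [a - b for a, b in zip(cash, cash[1:])]

# Gap implied by the projection, and the unexplained extra cash.
projected_gap = projected_cash - next_nta
overage = reported_cash - projected_cash

print("gaps:", gaps)
print("NTA drops:", nta_drops)
print("cash drops:", cash_drops)
print("projected gap:", projected_gap)
print("overage:", overage)
```

The gap shrinks each quarter, and the reported cash figure sits well above where that trend points. The overage is only as good as the 728 estimate, so treat it as in-the-ballpark.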

Note that nothing dramatic has changed in terms of income and expenses for all of N2H2 fiscal year 2002. So their recent layoffs can't be the cause of this dramatic change.

Look at the "Other Current Liabilities" line. There's a big change, going from 4,475 to 8,179, an increase of 3,704. That seems to be the jump.

The "Cash Flow Statement" agrees.

Income hasn't changed much. The cash jump is from "Changes In Liabilities".

What's this 8,179 liability?

Searching the annual report, under
"LIABILITIES AND SHAREHOLDERS' EQUITY (DEFICIT)"
They have "Deferred revenue" of 8,179

What's this "Deferred revenue"? The only reference I can find is (my emphasis)

"Subscription agreements and most maintenance services are evidenced by signed contracts, which are generally 12, 24 or 36 months in duration. Subscription and maintenance revenues are recognized on a straight-line basis over the life of the contract. Contracts billed in advance of services provided are recorded as deferred revenue."

Hmm? What's going on here?

It appears they counted much of *expected Financial Year 2003* revenue, as "deferred revenue" for the last quarter of Financial Year 2002. And listed what they billed as part of "Cash and Cash Equivalents".

That is, the only reason they're "cash f