UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK

BARBARA NITKE and THE NATIONAL COALITION FOR SEXUAL FREEDOM,
Plaintiffs,

-against-

JOHN ASHCROFT, ATTORNEY GENERAL OF THE UNITED STATES OF AMERICA, AND THE UNITED STATES OF AMERICA,
Defendants.

Index No. 01 Civ. 11476 (RMB)

DECLARATION OF SETH FINKELSTEIN
SETH FINKELSTEIN, pursuant to 28 U.S.C. §1746, certifies that the following statements are true and correct and understands that these statements are made under penalty of perjury:
This declaration is made in lieu of direct testimony at trial, as directed by the court. I am an expert witness in the case, and all of the statements set forth below are based upon personal knowledge and/or belief.
I graduated from the Massachusetts Institute of Technology in 1985 with Bachelor of Science degrees in both Mathematics and Physics.
Since 1985, I have worked as a computer consultant for a variety of companies. [Redacted]
Over the past two decades, I have gained extensive experience in computer programming and in navigating the architecture of the internet. This experience has allowed me to become familiar with the ways in which governments, internet service providers, and website operators have placed controls on access to content on the internet. My research has focused on how content controls conflict with rights to anonymity and free expression.
In 1997, I cofounded the Censorware Project, an organization that attempted to determine what websites were being blocked by various access control programs, which are also referred to as “censorware.” I developed decryption programs that sought to determine what websites were blocked by various censorware programs. We determined that popular commercially-available censorware programs prevented access to numerous websites with unobjectionable content.
As part of the research I conducted for my expert report in this case, I examined how geolocation programs determine the likely physical location of an internet visitor. Geolocation programs use routing trails and IP addresses to approximate the location of individuals who access internet content. I found that no commercially-available programs can identify with certainty the physical location of internet visitors. I also found that it is relatively easy for anyone who wishes to disguise his or her physical location to use these programs to generate data that geolocation programs will use to make a false approximation of a visitor’s geolocation.
There are numerous ways in which individuals seeking access to a target website can disguise their geolocation. If they relay their request for the target website through a proxy server, the target will receive an inquiry from an IP address that belongs to the proxy, not the individual. The geolocation software will use this information to approximate the internet visitor’s physical location. Unless the proxy server happens to be located in the same area as the individual, the website’s geolocation software will not be able to accurately identify the individual’s geolocation.
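By way of illustration only, and not as a record of any experiment described in this declaration, the relay described above can be sketched as a short simulation. The addresses are taken from reserved documentation ranges, and the lookup table, function names, and the pairing of addresses with cities are all hypothetical assumptions made for this sketch; no real geolocation product is being modeled.

```python
# Hypothetical illustration only: addresses come from reserved documentation
# ranges (RFC 5737) and the lookup table is invented for this sketch.
GEO_TABLE = {
    "192.0.2.10": "Cambridge, Massachusetts",      # stands in for the visitor
    "198.51.100.20": "San Francisco, California",  # stands in for a proxy
}


def geolocate(source_ip):
    """Return the location the table associates with the source address."""
    return GEO_TABLE.get(source_ip, "unknown")


def target_site(source_ip):
    """The target website only sees the address the request arrived from."""
    return geolocate(source_ip)


def via_proxy(proxy_ip, visitor_ip):
    """The proxy relays the request, so the target sees proxy_ip.

    visitor_ip is accepted only to show that it never reaches the target.
    """
    return target_site(proxy_ip)


direct = target_site("192.0.2.10")
relayed = via_proxy("198.51.100.20", "192.0.2.10")
print("direct request located at:", direct)
print("relayed request located at:", relayed)
```

In this sketch the same visitor is placed in two different locations depending solely on whether the request was relayed, because the target's lookup is keyed on the address it receives.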
There are numerous types of proxies that internet visitors can use if they wish to hide their geolocation. The most common example is what is known as an anonymizing proxy server, which relays the individual’s request through a group of servers located in different areas of the US, and sometimes even outside the US. Examples of such proxy servers are anonymize.net (Exhibit 45a) and anonymizer.com (Exhibit 45b). There are over a hundred anonymizing proxy servers that I am aware of, as listed in Exhibit 45c, and there may be many more of which I am not aware.
In preparing my expert report I conducted experiments using an IP location service, ip-to-location.com (http://www.ip-to-location.com) through various privacy proxies. IP2Location makes the following claim of accuracy: “The database has over 95% of accuracy in country and ISP level, 70% in region level and 65% in city level, which is higher than any of our competitors.” See Exhibit 45s. An accuracy of "70% in region level and 65% in city level" might be fine for targeting of advertisements or generating a report on customer demographics. These services can be useful for businesses, where the worst case is that marketing money is misspent. However, for the extraordinarily demanding context of criminal liability, such an accuracy rate is not acceptable. As an article from Interactive Week put it: "It's impossible for these guys to be 100 percent accurate," says Peter Christy, a Jupiter Media Metrix analyst. "You can't use this for life-and-death situations." See Exhibit 45t.
A similar point was made at length for taxation, in a report issued by the Information Technology Association Of America (http://www.itaa.org/) regarding "Ecommerce Taxation And The Limitations of Geolocation Tools" (Exhibit 45u). The report concludes: “Geolocation technologies do provide valuable non-tax commercial functionalities (i.e., marketing data, etc.) where a high degree of accuracy regarding a user's jurisdiction is not required at a transaction level. However, given the current inability of such technologies to overcome obstacles presented by corporate networks, anonymizers, AOL users, IPv6, and the other issues discussed above, coupled with their lack of complete certainty as to customer location, they cannot be relied upon for consumption tax purposes.” See also Exhibit 45v.
At the time I conducted my experiments using IP2Location, I was physically located in Cambridge, Massachusetts. Barbara Nitke’s website can be accessed using various IP-disguising websites. Many of the items in Exhibit 45 demonstrate the various geographic locations that were returned by IP2Location after I accessed its geolocation service using various IP-disguising websites:
Exhibit 45f: iplocation1.gif - The ip-to-location.com service, default behavior, shows my location as Boston, Massachusetts, which is very close to my actual location in Cambridge.
Exhibit 45g: iplocation2.gif - viewed through babelfish translator (acting as a proxy in effect) shows my location as Saugerties, New York.
Exhibit 45h: iplocation3.gif - viewed through proxyone.com shows my location as Saint Petersburg, Florida.
Exhibit 45i: iplocation4.gif - viewed through the-cloak.com shows my location as San Francisco, California.
Exhibit 45j: iplocation5.gif - viewed through pureprivacy.com shows my location as San Antonio, Texas.
Exhibit 45k: iplocation8.gif - viewed through mdsme.de shows my location as GERMANY.
Exhibit 45l: iplocation6.gif - viewed through guardster.com shows my location as San Francisco, California.
Exhibit 45m: iplocation7.gif - viewed through proxyweb.net shows my location as Vancouver, British Columbia, CANADA.
Exhibit 45n: metaspinner1.gif - example of using metaspinner proxy, located in GERMANY.
Exhibit 45o: iplocation9.gif - viewed through anonymouse.ws shows my location as Washington, DC.
Exhibit 45p: iplocation10.gif - viewed through proxyify.com shows my location as Valley Stream, New York.
I have done extensive research on the capabilities of anonymizing proxy servers, and contributed to a project organized by the Voice of America on Chinese internet usage. The Chinese government prevents people in China from gaining access to many non-Chinese websites. I worked with a group of programmers who were attempting to determine how to circumvent blocking by what is called “The Great Firewall of China.”
It is relatively easy for an individual who has never used an anonymizing proxy server to find many no-cost servers through an internet word search. Once one is located, in most cases it takes only a few seconds to access the website through the anonymizing proxy server. See the anonymous browsing quick start page attached as Exhibit 45d.
It would not be easy for a website operator to design a geolocation system to reject requests from anonymizing proxy servers because anonymizing proxy servers use a large number of servers and IP addresses and change them rapidly. While analysis of the pattern of servers and IP addresses might yield some of the IP addresses, most likely any effort to track all of these would miss the new ones and generate a list of servers and IP addresses that were no longer used by the proxy server.
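For illustration only, the difficulty of keeping such a list current can be sketched with a toy model. The rotation schedule, pool size, and addresses below are hypothetical assumptions chosen for this sketch (the addresses come from a reserved documentation range); they do not describe any actual proxy service.

```python
import ipaddress


def pool_at(step, size=4):
    """Hypothetical rotation: at each step the proxy retires two addresses
    and brings two new ones into service, keeping `size` in use."""
    base = ipaddress.ip_address("203.0.113.0")
    start = step * 2
    return {str(base + start + i) for i in range(size)}


# Snapshot of proxy addresses compiled by the website operator at step 0.
blocklist = pool_at(0)

# One rotation later, half of the snapshot is already out of date.
current_soon = pool_at(1)
missed_soon = current_soon - blocklist   # in use by the proxy, not on the list
stale_soon = blocklist - current_soon    # on the list, no longer in use

# Three rotations later, nothing on the snapshot matches the proxy.
current_later = pool_at(3)
missed_later = current_later - blocklist
stale_later = blocklist - current_later

print("missed after one rotation:", sorted(missed_soon))
print("stale after one rotation:", sorted(stale_soon))
print("missed after three rotations:", len(missed_later), "of", len(current_later))
```

Under these assumed numbers the snapshot both fails to block new addresses and continues to list addresses the proxy has abandoned, which are the two failure modes described above.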
Placing controls on anonymizing proxy servers located in the United States will not affect the ability of internet visitors to hide their geolocation because they can use proxy servers located outside of the United States. Many of these exist today and the delay that most of them cause in accessing a website is minimal.
Another type of proxy server that hides the user’s geolocation is a translation website, such as Babelfish. These sites translate the text on websites from one language to another. However, if an individual wants to view the website in its original language without providing identifying information he or she can request that the translation site translate the target from any non-English language (say, Chinese) to English. Assuming that there is no Chinese on the website, the translation website will provide the individual with an exact copy of the target, while the target thinks it is receiving an inquiry from the server on which the translator is located. This will also frustrate any efforts to accurately approximate the geolocation of the individual.
Another reason why geolocation programs will not prevent access to material on the internet is the existence of internet archives, such as archive.org. These are websites that make a record of the content of other websites at a particular point in time, and make these records available to internet visitors even after the content on the other websites has changed. See Exhibits 45q and 45r for examples of material from Barbara Nitke’s website that has been archived and remains on the internet outside the control of its originator. The existence of these archives prevents individuals from controlling access to the content of their websites once it has been placed on the internet.
Currently the largest internet service provider in the US is AOL, which has approximately 25 million subscribers. Because all internet traffic on AOL dial-up services is routed through Virginia, as noted in Exhibit 45s, efforts to geolocate website visitors who are using AOL dial-up have so far been unsuccessful. This means that a large group of internet visitors cannot be geolocated using existing technology. I am aware of no programs that are able to circumvent this problem.
I have read the expert report provided by Quova. In my opinion, it substantially overstates the accuracy of Quova’s technology because it fails to take into account efforts to frustrate geolocation. While Quova may be correct in evaluating the effectiveness of its technology when internet users are cooperating, it is not correct if they are attempting to evade geolocation. Such evasion is likely when accurate geolocation would result in restrictions on their access to the internet.
This creates the paradox that the more effective the geolocation technology, the more likely it is that internet users will attempt to evade it. Even if cooperative geolocation were 100% accurate, the availability of anonymizing proxy servers, translators, and AOL dial-up would frustrate a large percentage of efforts to accurately identify the geolocation of non-cooperative individuals.
I am generally familiar with the cost of software from my programming experience. In my opinion, it would be very expensive to develop software that was capable of approximating the geolocation of visitors to a website and showing them only that portion of the website which was not obscene according to the community standards of the place where they accessed the internet.
The reason that the cost would be so high is that the software would have to incorporate a large number of tests to ensure that each image met the applicable community standards. Since programs cannot read graphic images, each image on the site would have to be evaluated for a large number of factors. These factors would have to be correlated with the standards for each community in the United States. Since I am not aware that there is currently a database that evaluates every aspect of a photograph that could be considered obscene, creating such a database would be very time-consuming and costly.
The cost of this geolocation software would be prohibitive for a website that does not charge for access. Only commercial websites with very high volume and user fees would be able to afford the cost of developing and installing such software.
I declare, under penalty of perjury, that the foregoing is true and correct.
Dated: Cambridge, Massachusetts
October 6, 2004

_________________________
Seth Finkelstein