[I could not resist a chance to use that title]
I spent some time trying to figure out what caused the recent sexblog kerfuffle. I noticed affected sites all seemed to link to commercial erotic sites (for example Comstock Films?).
My speculation as to what happened is that Google's anti-spam algorithm got set a little too aggressively in terms of what sites are considered porn-spam. The twist is that this didn't hit the affected sexbloggers directly, but indirectly, as they then got hit by a linking-to-spam penalty. That is, it's not that they were marked as spam themselves, but rather that they were suddenly seen as closely associated with porn spam.
Such an indirect change wouldn't necessarily affect all blogs which link to the spam-false-positive commercial erotic sites. It's just one factor, and other factors could override any penalty. The actual calculation involved could be very complex. No way to prove this, just a theory.
It's an amusing thought that somewhere deep in the innards of Google's anti-spam algorithm, there might be an honest-to-Potter-Stewart (I-know-it-when-I-see-it) line between "pornography" and "erotica".
Regarding Valleywag.com's original article, which seems to have done a certain amount of poisoning the well:
Some word Violet [Blue] wrote probably triggered a Google ban, inadvertently, but the search engine's rules are opaque, as is the procedure for an appeal against deletion.
Never eat at a place called "Mom's", never play cards with a man named "Doc", and don't take search engine analysis from a site called "Valleywag". There's far more to Google's criteria than simple word counting.
"SEO superstition" strikes again:
Chronicle writer disappears in porn clampdown
The personal blog of San Francisco's Violet Blue, a sex writer published in the San Francisco Chronicle and Valleywag's sister site, has been removed from the Google index, along with several other adult sites. Tiny Nibbles, which runs a well-known annual list of the year's sexiest geeks, does not show in Google's search results, even if filters are turned off. Other sites affected include ErosBlog, a sex news site, and Comstock Films, which makes adult movies of real-life couples. The content's all legal, and naughty, rather than degrading. Some word Violet wrote probably triggered a Google ban, inadvertently, but the search engine's rules are opaque, as is the procedure for an appeal against deletion. You think there are other search engines, so that's okay? There are no other search engines.
IT'S A BUG! The sites haven't been removed from the index. If you go further into the results, the sites are still there. They're apparently "sandboxed" [update: "marked as spam-like"] for some unknown reason, so they're showing up much lower than normal. Almost exactly 30 spots, in fact.
Google doesn't hate you. Really.
A good person to contact about these things is Matt Cutts.
[Sigh ... *why* *bother*? Who is going to hear me in the face of the sensationalism?]
[Update: Some of the sites are back, the Google people know about the issue. Again, this is really about spam false-positives, not censorship].
Jimmy Wales, the founder of Wikipedia, the online encyclopaedia, is set to launch an internet search engine ... that he hopes will become a rival to Google and Yahoo!
The idea is not Wikipedia itself as a search engine, but using some of the techniques of getting people to work for free, outsourcing to suckers, err, I meant to say collective intelligence, for a search engine.
Digression: I am amused that Jimbo Wales is going around correcting reports:
The Wikia Search project homepage explains: Amazon has nothing to do with this. :) Help me spread the word?
Myself, I don't take the view that it can't work, but rather that there are some tough problems (e.g. see discussion in Nick Carr's post) that have to be addressed for it to work. I believe Google already does some investigation for feedback of its results, in sampling occasionally what people click-through in results.
I find Wikipedia fascinating in part for the hodgepodge of ways it has managed to solve the problem of getting material (dream-selling, intellectual "extortion", plagiarism, and more), combined with the really elaborate ideological defenses it's evolved to deflect criticism of its flaws. It's all not a combination one would be able to foresee working in advance. People often mystify this, but it's not that the elements are unknown, it's that making a going concern out of them all is very hard.
But I could see some comparable approaches it would be interesting to at least try for a search system, especially if someone else is paying for it all with venture capital money. Heck, if I didn't have such baggage as a sometimes-critic of Wikipedia, and similarly vis-a-vis the Harvard Berkman Center (lesson there: [Seth], no agreement with you needs to be kept), I'd make a proposal to Wales.
1. When I was a senior in high school (Bronx High School Of Science), I placed eighth in the nation in the Westinghouse National Science Talent Search (that was probably the high point of my failed ambition to become a mathematical physicist). There's even a picture of me with the other winners in an old Discover magazine (I almost had an opportunity to tell off then Vice-President Bush, but it didn't materialize).
2. I've never shaved my beard or moustache. I've trimmed an errant strand or two with scissors, but nothing more. When people occasionally ask how long it took to grow my beard, I answer "All my adult life". It's not for a religious reason. But ever since it began to grow, I've always thought it a positive feature, and never wanted to remove it. I may change that one day, but so far not.
3. I have a metal rod in my right leg, repairing an injury from a car accident. This gives me the minor pseudopower of being a metal-detector-detector. It really can set off airport security. So I always know when the screeners have metal detectors on high, as opposed to lesser threat levels. So far, I've never had more than minor delays, but it can be annoying when traveling or going into high-security buildings.
4. I spent some months using a wheelchair as part of recovering from the above-mentioned injury. One really does get treated differently when in that position.
5. I've never taken a formal class in computer programming (my degrees are Math and Physics). I went into programming because it was a job that was far more congenial than other options, e.g. substitute teaching (no offense meant). It both allows one to sleep late, and not wear a tie, two critical conditions for me. The occupation description I use, "Consulting Programmer", is a nod to Sherlock Holmes, Consulting Detective.
Now I have to not break the chain, err, find five other people, who I presume haven't already played. Given that this is a holiday weekend, let's make this some love for linklorn:
Karen Coyle, Elisabeth Riba, Dave Rogers, Matthew Skala, Michael Zimmer.
The Google SOAP API, a system for getting Google search results in a way programmers can easily use them, is no longer being supported by Google (non-techies: SOAP is a protocol, like Java is a programming language), in favor of, essentially, a web ad box (aka "AJAX API"). The system hasn't been working well for a while now, and it looks like the plug is being pulled on it.
The basic meaning of this is that Google is telling independent search developers to get lost, in favor of billboard displayers.
Everybody talks about search-as-a-service, but few people want to do something about it. I suspect this is one of those projects where the cost to run it exceeds what people will really pay for it. I've had ideas of my own in this direction, but the economics is daunting.
Anyway, in the ensuing discussion, there's been relatively little attention paid to the projects to reverse-engineer Google's "web ad box". This mention may be useless in terms of dissemination, but I'll do it anyway:
Written by Matthew Wilkinson
Monday 18 December 2006 20:20:09
Recently, Google disabled the use of it's Google Search SOAP API. It now recommends that you use the Google AJAX Search API, which displays a search box on your website, much like a widget. This of course denies developers the means by which to fetch Google Search results and use them in their website. However, me and Martin Porcheron over at MPWEBWIZARD, decided to crack this new API to get some search results out of it.
Memo to any Yahoo corporate readers: I assume you already know this, but there's a golden opportunity to grab some of the "cool" from Google here. Set up a compatible server, so anyone who has a Google SOAP API application can switch over to using Yahoo just by switching servers. Yes, it's a lot of server work for no direct revenue, and Yahoo already has a search API, and Google may make threatening legal noises. But you'll rarely have a better opportunity to grab mindshare from developers than now: "Google doesn't want you - but we do!".
There's a recent column in the Guardian discussing libel law and the Internet, where it's argued:
The internet brings a fundamental change to the relationship of publisher and subject: now the subject can publish, too. So Susan Crawford, a professor at New York's Cardozo Law School and a member of Icann, the board that oversees internet structure, has blogged that in this era, "libel law seems much less relevant - rather than sue, you can just write back". A commenter on my blog responded that some bloggers boast larger audiences than others, so this playing field isn't as level as it seems: "On occasion, a weak target can become a cause celebre". True. But I still argue that libel law was built for an era when few owned the press and the doctrine must be updated to account for the democratised and accelerated means of response today.
That commenter was me. The full version of my comment was:
Jeff, if you seriously want thoughts, it is my deeply-considered view, after many, many, years of observing this issue, that the discussion becomes somewhere between absurd and cruel. It is not "conversation" when one person speaks to ten, hundreds, of thousands, and the target may have some obscure response off somewhere read by a few friends and family.
Some people don't believe in libel law as a matter of principle - they say it's the province of the rich who don't need it, that the little guy who might need it can't fight back anyway, if you're smeared, just "take it" because attempting any sort of defense will only make the situation worse. That's one general point of view, and it has nothing specific to do with blogs or Internet.
Alternately, if one does believe in libel law, then we know ("Power Law Distribution") that there are vast, enormous, audience disparities, which apply to blogs and the Internet as well as other media. It's a mathematical fact, and denying that doesn't make it go away. Some "bloggers" are for all intents and purposes the same as a mainstream media syndicated columnist. How many times have you heard some boast like "I have [huge number] readers, that's more than [media outlet]!". And we can't all have a zillion readers, again, that's just a fact.
Sometimes disputes take place between relative equals. On occasion, a weak target can become a cause-celebre. But that such cases exist does not invalidate that there are plenty of situations where a person who is libeled has no EFFECTIVE means to reach any sort of comparable audience. To rebut the idea that it *could* happen, individually, we all *could* win the lottery - but almost all of us won't.
Again, there's an argument that libel law is more harm than good. But for heaven's sake, don't tell the Great Unread, who make tiny mostly-unheard squeaks compared to the booming megaphones of A-listers, that they can eat cake.
Further, Z-lister saith not.
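To put rough numbers on the "Power Law Distribution" point in that comment, here is a minimal sketch. It assumes audience size falls off as 1/rank (a Zipf-style curve), which is an illustrative assumption and not a measurement of any real blog ranking. The function name and the site counts are made up for the example.

```python
def zipf_audience_share(total_sites: int, top_n: int, exponent: float = 1.0) -> float:
    """Fraction of all readers held by the top_n sites.

    ASSUMPTION: the site at popularity rank r has audience proportional
    to 1 / r**exponent. Real audience data would have to be measured.
    """
    if not 1 <= top_n <= total_sites:
        raise ValueError("top_n must be between 1 and total_sites")
    weights = [1.0 / rank ** exponent for rank in range(1, total_sites + 1)]
    return sum(weights[:top_n]) / sum(weights)


def single_site_share(total_sites: int, rank: int, exponent: float = 1.0) -> float:
    """Fraction of all readers held by the one site at the given rank."""
    total = sum(1.0 / r ** exponent for r in range(1, total_sites + 1))
    return (1.0 / rank ** exponent) / total


if __name__ == "__main__":
    sites = 100_000  # hypothetical number of blogs
    print("share of top 10 sites:", zipf_audience_share(sites, 10))
    print("share of the median site:", single_site_share(sites, sites // 2))
```

Under that assumption a handful of top-ranked sites hold a large slice of all attention, while a typical site's share is a tiny fraction of it, which is the disparity the comment describes.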
In my recent post on the structure of Bubble 2.0, I ended by saying:
"That feast is starting now, and the main dish is YOU."
While perhaps I should play, what more is there for me to say? I've said it all before, and what good did it do?
Bubble 2.0 is the province of a very small, extremely incestuous elite (A-list), of clever men (mostly) who run around marketing dreams. I can decry the academic cheerleading for unpaid freelancing, trying to get across my contention:
"Popularity Data-Mining Businesses Are Not A Model For Civil Society"
But there's no upside for me, and plenty of downside. The people who do well at this pander to reactionaries, and there's little market for "technology-positive" social criticism (especially compared to e.g. a goal of a $100 million venture capital fund).
So the hype may be "You", but the question is "Why?"
The Google Employee Stock Options coverage has been a case study in uncritical thinking. I know, what else is new, but I'll say it anyway.
About the best other criticism I've found is an excellent post on SearchViews, doing time-value calculations, about how the plan dramatically shortens the time of the option when the employee sells it.
Initially hailed as an innovative HR strategy, then called "good for investors", the option plan has received so much praise that Internet Outsider asks, "If anyone has figured out the drawbacks of Google's new transferable option plan, please weigh in, because at first glance it looks like a win all around." Though numerous 'draw backs' have been suggested, including "an employee rush for exits", "shareholder dilution" and "arrogance", I'm surprised that no one has pointed out the most important nugget from plan's fine print: [detailed calculation]
But it's almost all been echoing of Google's announcement, or confusion over what the "transfer"/sale system does - and what it does not do. For example, there is no innovation here in determining the value of a Google stock price option. There's already a big public market in trading such options. The auction is basically just to determine who is the low bidder for handling the employee option transaction, given there are some weird constraints in the process. Which brings me to one simple example, in discussing the program, where what should be grist for serious reporting has apparently passed unnoticed:
Institutional buyers, who will be invited by Google to participate, will not be able to resell the employee stock options.
No offense meant to any reporter, but what in the world does this sentence MEAN? That is, it should be a big red flag that something strange is going on. Options on a stock are bought and sold all the time. How is the institutional buyer intended to distinguish between "the employee stock options" and "the stock options bought from yesterday's sheep-shearing"?
And this connects to the earlier issue of why not just let employees sell their stock options on the open market? After thinking about it for a while, I *suspect* this has to do with the connection between the options and the underlying stock, maybe that if employee options were released into the open market, they would have to be covered by the company issuing stock (or something similar). But if they're just "transferred" to an institution, they still exist in accounting format as options, so certain negative effects (from Google's point of view) are avoided.
Wouldn't you like to know what this is all about? I would. I'm sure there's a professor of finance out there somewhere, who could explain it all. And might even be *blogging* - to an audience of a few hundred people. But they definitely haven't been found by the big echo chambers. And if that person ever did receive a little attention, the blog-evangelists would shout from their hilltops, the bogosphere triumphs - there's a specialist somewhere on the planet, so "overall" - not counting the endless hype reverberating from the massive audience "blogs", and also discounting that "old media" includes small trade newsletters too - blogs win!
I really think it says something profound about the failure of journalism in terms of civic structure, that random unpaid volunteers are supposed to provide the work that isn't supported otherwise.
STEWART: ... it's not so much that it's bad, as it's hurting America.
STEWART: You're on CNN. The show that leads into me is puppets making crank phone calls.
[Update: Changed title from earlier version - share the blame]
Google Transferable Stock Options (TSOs) are, in my cynical view, Google's way of solving the problem of having someone else take the fall for what happens to its employees' stock options when the Google stock price eventually takes a dive.
Disclaimer: I thought Google stock was a sucker bet all the way up, which may bear on how seriously to take this post. I have never owned it, and have no transaction involving it (i.e. am not "short").
Briefly: Google has a stock problem. A stock's price, even in a bubble, can't rise forever. Google gives employees "options", a right to buy the stock at a specific price. These take time to become active ("vest"). A look at the calendar shows there must be a lot coming due. And I suspect many employees are thinking they should get out while the getting is good.
Employee options, to turn into money, normally first have to turn into stock. This dumps stock on the market. Which can drive down the stock price. Which will cause more employees to think they have to get out while the getting is good. Which will further drive down the stock price ...
See the problem?
Moreover, there's a complicated tax law "gotcha" which can hurt employees enormously with bubble-stock options. I won't explain it all, but if you wait to sell the stock, and the stock craters, you can basically end up owing huge taxes on profits which no longer exist. Which is another incentive to sell the stock as fast as possible.
Google employees who get caught in that trap would likely be very unhappy.
So, what to do? Google came up with a great solution: Pass the hot potato to the outsiders, the folks who are hearing the tale of the endless fountain of money. Let employees *sell* the options to outsiders.
At first glance, this sounds like a great idea. After all, options are bought and sold every day. Why should employees not be able to sell theirs? Well, in ordinary circumstances, it wouldn't be a problem. But in a situation like Google's, it's a set-up to rip off the ultimate buyers for the benefit of Google and its employees.
First, someone better versed in the technicalities of options mathematics should check me on this, but I believe many simple option valuation models will give a "wrong" answer for the value of an option in a situation like Google's stock, which is relatively new and has gone almost straight-up. That is, intuitively, the stock price behavior is going to change dramatically at some point, to leveling-off or dropping, and that's not accounted-for in any theory where the calculation has internally modelled an infinite time series dramatically different from the existing series (technically: "misestimation of implied volatility"?).
If everyone is using the same "wrong" rule for their cost, then it's all just a standard market game of Greater Fool. But if some sellers have a *zero* cost, to buyers at an "inflated" cost, that's taking the game to another level.
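As a rough illustration of how much a "simple" model's answer depends on what goes into it, here is a sketch of the textbook Black-Scholes price for a European call option. It is not Google's or any bidder's actual valuation method, employee options differ from this idealized contract, and the share price, strike, volatilities and terms below are hypothetical numbers chosen only to show the sensitivity.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, years: float,
                       vol: float, rate: float = 0.0) -> float:
    """Textbook Black-Scholes value of a European call, no dividends.

    spot   - current share price
    strike - exercise price of the option
    years  - time remaining until expiry, in years
    vol    - assumed annualized volatility (e.g. 0.4 for 40%)
    rate   - assumed continuously compounded risk-free rate
    """
    if vol <= 0:
        raise ValueError("volatility must be positive")
    if years <= 0:
        # At expiry only the intrinsic value is left.
        return max(spot - strike, 0.0)
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


if __name__ == "__main__":
    # HYPOTHETICAL inputs: an at-the-money option on a 500-dollar stock.
    for years in (10.0, 2.0):
        for vol in (0.25, 0.40, 0.60):
            value = black_scholes_call(500.0, 500.0, years, vol, rate=0.05)
            print(f"term {years:>4} years, volatility {vol:.2f}: {value:8.2f}")
```

The same option gets a very different price depending on the volatility figure plugged in and on how much time is left, so a model fed with the wrong picture of future price behavior, or a shortened term, can move the answer a great deal.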
And more deeply, normally, a market in options is limited in the ability to cause a stability problem, because kind of like matter/anti-matter pairs, a normal option transaction has two people on opposite sides of that transaction, so "financial energy" is obviously conserved. An *exception* to this situation is company-granted options, like Google is doing now. Which is roughly comparable to energy creation (money) while shunting the corresponding equivalent anti-matter (stock) to a time-displaced future date. There's still conservation in an overall sense (Google is not God, so can't get around that constraint), but it can be very imbalancing to put off the day of reckoning. And more importantly, the people who get wrecked at that future date tend to be different from the people who make out like bandits at the time of creation.
Note - this is a complicated topic. I know, "stock options accounting" are fighting words to many. This is a blog post, not a financial treatise.
But here's the "beauty" of what Google is doing - by selling the employee options, the employees can take profits without the options turning into stock! Which keeps the stock price up. Which encourages the buyer to hold onto the options. Which further puts off the day they turn into stock ... Brilliant!
That is, it keeps the party going on by putting off one pressure to sell stock. And who pays the ultimate bill? The buyer of the employee option, who at some point eventually gets stuck paying a high price for something which (oversimplified) becomes worthless when the stock starts leveling-off / going-down.
Doing Evil? Well, depends on whether you're the seller (who's at Google) or the buyer (aka citizen-lunchmeat)...
I'm mentioned substantively twice in the book.
About the Al Gore / Internet story:
The only redeeming part of this story is that it's simple to document the falsity -- because of the Internet. Seth Finkelstein, a programmer and anti-censorware activist, has created a page on the Internet collecting the original interview and the subsequent reports about it. His is the model of the very best the Internet could be. That virtue, however, didn't carry too far beyond the Internet.
And concerning the Nitke court case and geographic location on the Net:
But it is still possible to evade identification. Civil liberty activist Seth Finkelstein has testified to the relative ease with which one can evade this tracking. Yet as I will describe more below, even easily evaded tracking can be effective tracking. And when tied to the architectures for identity described above, this sort will become quite effective.
While there's been much discussion that the Google PageRank of websites can lead to lots of shady deals around buying and selling links, it's been less remarked that this also provides a way to profit from cracking a website. It used to be that most websites just weren't that interesting. The sites that take credit-cards for data are comparatively few, often use a third-party service for the billing transaction, and redirecting an order page to steal that information will be noticed quickly.
But every site has its position in the recommendation social network, its ability to link, its PageRank and "trust".
Thus, if a bad guy finds a security flaw in some website software, rather than being reduced to writing "d00dz rul3z!" on a page, which is not profitable, there's now a brand-new way to make money off the cracking: Insert links to boost another site's search engine results.
One "advantage" of this scam is that sites of non-profit organizations are likely to have a lot of rank and trust, but overworked and underpaid webmasters, which makes such sites a "sweet spot" for exploitation.
So obscure, "hidden" links inserted in various places are not likely to be noticed, and finding someone to fix the page won't set off the sort of red alert reaction involved in credit-card theft.
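For a webmaster who wants to check for this, here is a minimal sketch of one possible detection approach: walk a page's HTML and report links that sit inside an element styled to be invisible. It only looks at inline `style` attributes, and that rule is my assumption about one common hiding trick. Links hidden through external stylesheets, scripts, tiny fonts, or off-screen positioning would not be caught. The class and function names are invented for the example.

```python
import re
from html.parser import HTMLParser

# ASSUMPTION: "hidden" means an inline style with display:none or visibility:hidden.
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.IGNORECASE)

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenLinkFinder(HTMLParser):
    """Collects href values of <a> tags that are inside a hidden element."""

    def __init__(self):
        super().__init__()
        self._open = []          # (tag, hides_content) for each open element
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        hides = bool(HIDING_STYLE.search(attr_map.get("style") or ""))
        inside_hidden = any(flag for _, flag in self._open)
        if tag == "a" and attr_map.get("href") and (hides or inside_hidden):
            self.hidden_links.append(attr_map["href"])
        if tag not in VOID_TAGS:
            self._open.append((tag, hides))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed children.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break


def find_hidden_links(html: str) -> list:
    """Return the hrefs of links that appear inside inline-hidden markup."""
    finder = HiddenLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_links
```

Running something like this over a site's pages on a schedule, and comparing the output against a list of links the webmaster actually put there, is one cheap way to notice an insertion before months go by.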
The United Nations Educational, Scientific and Cultural Organization (UNESCO) has now been hit by this scam, as well as many other sites.
At this point, it's unclear who the actual cracker is, and whether the link-receiving sites knew about the cracking or were unaware of the criminality. The cracker seems to have exploited a bug in some forum and link-cataloging software.
I mailed the UNESCO webmaster about their site being cracked, and there's now some attention to this particular event. But the general problem is likely to get worse, as the potential becomes more exploited.
"Given enough eyeballs, no scam is too shallow."
If niche markets for micro-content are supposedly The Revolution, why not niche markets for micro-scams? Indeed, abstractly, it would seem to be an even better fit. Product and monetary "exit" are combined into one!
Moreover, one can't even have simple faith in a technological solution to a social problem, because the inbuilt technological arms race can attract programmers on the side of evil - the "botnets" used in spam-attacks are arguably impressive feats of subversion and distributed coordination.
No easy answers. Maybe we can only hope that some spam-fads, like some disease epidemics, eventually burn-out:
There is another aspect the scammers forgot to realise: there may be millions of people willing to order Viagra from shady websites, but investors willing to make a habit of buying spammed stocks and losing money are certainly in short supply.
A new online tool designed to circumvent government censorship of the internet already appears to be a runaway success, a University of Toronto researcher who helped develop the software says.
Some 30,000 copies of psiphon (pronounced sigh-fawn) had been downloaded by 2 p.m. Monday after it was made available at 1 p.m. last Friday, Michael Hull, the program's lead engineer told CBC News Online.
That rate of interest by far surpasses his highest estimate for the total number of downloads anticipated, he said.
"I thought we were going to have maybe 10,000 downloads," he said, noting that traffic to the site was still on the rise. "I was amazed."
He said it, I didn't ("amazed").
I'm all for this project, but the activism lesson I draw from its prominent coverage is NOT necessarily a happy one. There have been activists working on this sort of stuff for years and years. The critical variable here is not technology, since those reporters wouldn't be able to tell a Tor from a FreeNet. What matters is *ATTENTION*. The backing from the various organizational sponsors is the reason for the widespread publicity.
Don't get me wrong. The attention being devoted to Psiphon is good. But I worry people are going to draw some very wrong lessons from the media frenzy. I've said this before, but it gets repeatedly demonstrated. Without some sort of support from an attention-system, it doesn't matter what you do in terms of fighting censorship, you'll talk to the crickets!
This is "Work" as in "free" - NYT: Yahoo and Reuters Want You to Work for Their News Service.
He said it, not me:
"This is looking out and saying, `What if everybody in the world were my stringers?'" Mr. Ahearn said.
And who's getting paid? Not you! (well, a little if your work is usable, but not much, unless you're really, really, lucky)
Users will not be paid for images displayed on the Yahoo and Reuters sites. But people whose photos or videos are selected for distribution to Reuters clients will receive a payment. Mr. Ahearn said the company had not yet figured out how to structure those payments. The basic payment may be relatively small, but he said Reuters was likely to pay more to people offering exclusive rights to images of major events. ...
And later in the article, certain Usual Suspects appear - i.e. certain projects which aim to repackage minor writers and researchers for potential mass media syndication (though this is not how they describe themselves).
I'm tempted to ask my question again: What's so great about the outsourcing of journalism (and who thinks it's so wonderful)? What's so fantastic about unpaid freelancing? But I should know better.
Shortly after my previous post was published, and echoed at Google Blogoscoped (a popular Google-oriented blog, ranks #44 of all blogs on Technorati), whatever Google penalty flag had affected the Wikipedia Watch site was removed. Search position for relevant terms skyrocketed. It's clear this wasn't a transitory problem, as it had persisted for months. The most likely explanation is that someone at Google who had the power to clear the flag saw the Google Blogoscoped item, and fixed the false positive.
I will not flatter myself to think they saw my post! In terms of audience, the Google Blogoscoped echo only sent around 39 hits. Now, all readers gratefully accepted, but it was a revealing statistic. Another data-point in what I think of as The Meaning Of Exponential Distribution Of Attention.
From another angle, this case was an example of the problems of Google's spam algorithms, and needing to "know someone" to get a problem fixed.
This turns out to be interesting, as I was able to refine the tests to a sharper outcome. Now, let's keep in mind the difference between the facts, and the theory to explain them. There's something I call "SEO superstition", which is the very understandable way random variations can mislead people to form bad theories. And "never attribute to malice that which can be explained by stupidity". So a particular pattern may not be even real, or if it is, that doesn't necessarily indicate that Google's editing search results to marginalize critics (we should be so threatening ...).
So, with that in mind, comparing search terms, I found the following rankings today for www.wikipedia-watch.org for the indicated strings of words (searching as a set of words, not a quoted phrase):
[can you sue Wikipedia]
Yahoo - #1 and #4
MSN - #1 and #2
Google - more than #300 (!)
Yahoo - #5
MSN - #3
Google - I had to go past #700 before I found a result for wikipedia-watch.org
[phenomenon of Wikipedia]
Yahoo - #3
MSN - #4
Google - somewhere around #80
Note the Wikipedia Watch site ranks #1 in Google for the search [Wikipedia Watch], but I think that may be misleading.
Feel free to try to reproduce, it's not difficult.
Conclusion: This is a real differential. It's too much to be explained by various SEO factors. Something is amiss here.
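For anyone who wants to repeat the comparison, here is a small sketch of the arithmetic behind "real differential". The ranks are the ones listed above (taking the best position where a site appeared twice, and treating "more than #300", "past #700" and "around #80" as the approximate figures 300, 700 and 80). The factor-of-ten threshold and the function name are my own illustrative choices, not any standard SEO test.

```python
from statistics import median

# Ranks of www.wikipedia-watch.org as reported in this post.
# Google entries are lower bounds or approximations.
OBSERVED_RANKS = {
    "can you sue Wikipedia": {"Yahoo": 1, "MSN": 1, "Google": 300},
    "(second query, search string not shown)": {"Yahoo": 5, "MSN": 3, "Google": 700},
    "phenomenon of Wikipedia": {"Yahoo": 3, "MSN": 4, "Google": 80},
}


def outlier_engines(ranks_by_engine: dict, factor: float = 10.0) -> list:
    """Engines whose rank is more than `factor` times worse than the
    median rank given by the other engines for the same query.

    ASSUMPTION: factor=10 is an arbitrary cutoff for "too big to be noise".
    """
    flagged = []
    for engine, rank in ranks_by_engine.items():
        others = [r for name, r in ranks_by_engine.items() if name != engine]
        if others and rank > factor * median(others):
            flagged.append(engine)
    return flagged


if __name__ == "__main__":
    for query, ranks in OBSERVED_RANKS.items():
        print(f"[{query}] outliers: {outlier_engines(ranks)}")
```

A single odd query could be chance, which is the "SEO superstition" trap. The point of checking several queries is to see whether the same engine is the outlier every time.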
I think wikipedia-watch.org has somehow tripped a spam penalty on Google. This is not necessarily Daniel Brandt's fault. But there is a downgrading of the site.
[PS: Invocation - Spammeister Matt Cutts, you might want to check this out. I know Daniel gives you a hard time about being an ex-NSA spook, but look at it this way - a bug's a bug].
[Update Sun Dec 3 09:30:41 EST - the site's issue has been fixed now, for reasons unknown, and the Wikipedia Watch page updated accordingly showing dramatic ranking increases]