February 28, 2011

Google "Farmer" Update Blesses Wikipedia, Curses Mahalo, Centralization

The Google "Farmer" Update results, that is, the winners and losers from Google's latest algorithm change regarding "content farms", have now been analyzed. So we have outcomes such as:

Let's see in detail what Google did to the affected domains. The first conclusion is quite straightforward: the number of keywords these domains are ranking for dropped dramatically. Looking at mahalo.com as an example, it went from 33,875 keywords before the update to just 9,740 keywords after the update went public – a decrease of more than 70%.
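As a quick check on the arithmetic in that quote, here is a minimal sketch using only the two keyword counts given above for mahalo.com. The function name `percent_decrease` is just an illustrative helper, not anything from the original analysis.

```python
# Keyword counts quoted above for mahalo.com (before and after the update).
before = 33_875
after = 9_740


def percent_decrease(before: int, after: int) -> float:
    """Return the relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


drop = percent_decrease(before, after)
print(f"{drop:.1f}% decrease")  # prints "71.2% decrease"
```

So the "more than 70%" figure holds up, at least for the numbers as quoted.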

Versus gainers

1. Amazon.com
2. eHow.com
3. NexTag.com
4. Wikipedia.com [sic - should be wikipedia.org]
5. Walmart.com
6. Target.com
7. Etsy.com
8. Answers.Yahoo.com
9. Sears.com
10. bestonlinecoupons.com

Note a pattern above? Another step to centralization, with some aggregator sites anointed as winners, and some as losers. And Wikipedia ends up even more dominant on Google.

I have to remind myself I'm basically completely unable to get the law/policy types to realize the enormous extent to which Wikipedia is de facto subsidized by Google. Here, not only is Wikipedia getting yet another boost, but some of its arguable commercial competitors are being killed! It's not because Wikipedia has some magic itself, in "community" or "civility", or whatever hucksterism is being hyped. Rather, it has the algorithmic support of Google.

Another gem noted - "Google also said that if its YouTube site gained, that was 'happenstance.'" If a big ISP made a network management change that just by "happenstance" might have benefited an enormous media property it owned, accusations of bias and favoritism would be rife.

Bonus link - Search Neutrality as Disclosure and Auditing (Frank Pasquale)

Given these parallels, I've compared principles of broadband non-discrimination and search non-discrimination. But virtually every time the term "search neutrality" comes up in conversation, people tend to want to end the argument by saying "there is no one best way to order search results - editorial discretion is built into the process of ranking sites." ... To critics, a neutral search engine would have to perform the (impossible) task of ranking every site according to some Platonic ideal of merit. ... Neutrality is a very broad term, and the obvious differences between the technical operation of physical infrastructure and search engines should not stop us from applying certain broad principles to each entity.

But there's no money behind that.

By Seth Finkelstein | posted in google | on February 28, 2011 11:59 PM (Infothought permalink)

Seth Finkelstein's Infothought blog (Wikipedia, Google, censorware, and an inside view of net-politics)

Comments

I despise Jimmy Wales; I despise Wikipedia bureaucrats, particularly deletionists; but when I do a Google search on a random topic that I want a gloss of, I almost always want the Wikipedia page for that topic. I think Google's algorithm is giving good results by placing Wikipedia so highly; I disagree with your implication that there is some kind of unsavoury collusion going on between Google and Wikipedia - in fact, I think a claim like that would be laughable on its face.

Search engines can't be "neutral"; they are by necessity majoritarian, which is not a neutral stance (it penalizes minorities). But that's not a flaw; that's a good thing.

Posted by: Barry Kelly at March 1, 2011 06:29 AM

Oh, and the guys over at that "Concurring Opinions" should look into things like the groupthink behind New York Times and Washington Post way ahead of search results as a source of hidden biases. Mainstream media with its superficial and biased eclecticism in what it covers is far more damaging... "Concurring" opinions masquerading as facts are one of the most pernicious of dangers.

Posted by: Barry Kelly at March 1, 2011 06:36 AM

I think you're missing my overall point about centralization driven by algorithmic choices. No implication about illegal collusion is intended, but that should not be the end of thought. I suggest reading the full article on search neutrality, since common objections are addressed.

Posted by: Seth Finkelstein at March 1, 2011 08:11 AM

I'm inclined to agree with your concern about centralization. However, I'm not yet confident in the implication to be drawn from this data. (Putting aside particulars, and thinking about a theoretical long tail type distribution, I would expect that demoting "spam" would favor everyone else in proportion to which they are already favored.) But I'm not much up on SEO analysis/metrics.

In any case, with respect to Wikipedia, I think it odd to say Google is "subsidizing" Wikipedia when one could also say Google is "appropriating value" from Wikipedia. I suppose both are true if by subsidies one is speaking of attention, and by appropriation it is a non-rivalrous type of consumption.

Also, if you are referring to me by way of civility/community hucksterism/hype, my argument is in the context of the Zeroeth Law (how could Wikipedia work at all?), Godwin's law (given that people are often jerks online), and the myth of wiki-pixiedust (so it must simply be the gee-whiz technology).

If one defines success simply as page views, Wikipedia reaps the effects of Google's algorithms, no doubt, as does everyone else. If by success one means a civil community or higher quality articles, it's possible that Wikipedia would've fared better without the prominence provided by Google. (If it wasn't always in the top 5 returns, perhaps it wouldn't be such a target for POV pushing.)

Posted by: Joseph Reagle at March 2, 2011 07:21 AM

Joseph, correct me if I'm mistaken, but I don't believe your argument considers the Google subsidy effect at all. At the core, you simply have a well-worn virtue narrative, where success is attributed to the moral platitudes of the successful. This is even worse than "gee-whiz technology"; it's an old type of toadyism to the powerful.

Posted by: Seth Finkelstein at March 3, 2011 10:29 AM

I think the main difference between Wikipedia and the other content farms is their approval process. It is almost impossible to edit Wikipedia with "staying" power because of the community and their approval process. Google recognizes this, and the fact that their content is the most current is the main difference. Also, they only have 1 page dedicated to a certain topic, while most content farms have multiple, which makes them much larger.

Posted by: Tony at March 3, 2011 05:11 PM