July 22, 2006

Google News problem parsing "Techdirt" site

Techdirt:

Google News changed something on July 6th, so that all of our stories appearing in Google News now show up with the headline "Permalink to this story." We hadn't made any changes to our site, and Google News had always performed flawlessly in the past. ... What followed was a series of emails from Google staff (much of it sounding like boilerplate "canned" responses) that in almost every case blamed us for their own glitch. That's what we get for trying to point out a glitch to them.

This might be an interesting case study of algorithmic quirks. I assume Google isn't doing this deliberately. But I suspect there's a long list of trade-offs in Google News' ad-hoc parsing that is leading to a poor result for Techdirt, and that Google doesn't want to devote the time of a programmer skilled in debugging to diagnosing the issue.

I sent Techdirt a suggestion: Put a class="permalink" attribute on your permalinks (<a class="permalink" href="...">). That *may* fix it.
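
To make the reasoning behind that suggestion concrete, here is a toy sketch in Python. It is not Google News' parser, whose workings are not public. It assumes a hypothetical heuristic that takes the title of the first link on a page as the headline, and a hypothetical option to skip links marked class="permalink". The sample markup, the /story/1 URL, and the names HeadlineGuesser, guess_headline, and skip_permalinks are all invented for illustration.

```python
from html.parser import HTMLParser


class HeadlineGuesser(HTMLParser):
    """Toy heuristic: use the title attribute of the first link as the headline.

    Purely hypothetical; this is an assumption about how an ad-hoc
    parser might go wrong, not a description of any real crawler.
    """

    def __init__(self, skip_permalinks=False):
        super().__init__()
        self.skip_permalinks = skip_permalinks
        self.headline = None

    def handle_starttag(self, tag, attrs):
        # Only look at links, and stop once a headline has been chosen.
        if tag != "a" or self.headline is not None:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        # The suggested fix only helps if the parser honors this class.
        if self.skip_permalinks and "permalink" in classes:
            return
        title = attrs.get("title")
        if title:
            self.headline = title


def guess_headline(html, skip_permalinks=False):
    parser = HeadlineGuesser(skip_permalinks)
    parser.feed(html)
    return parser.headline


# Invented sample markup: a permalink icon placed before the real headline link.
page = (
    '<a class="permalink" href="/story/1" title="Permalink to this story">#</a>'
    '<a href="/story/1" title="Actual headline">Actual headline</a>'
)

print(guess_headline(page))                        # Permalink to this story
print(guess_headline(page, skip_permalinks=True))  # Actual headline
```

The point of the sketch is only that a class attribute helps if, and only if, the parser on the other end happens to look for it.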

If they try my idea, and I'm right, I'll write more about it.

[Update: I guess not. Google works in mysterious ways.]

By Seth Finkelstein | posted in google | on July 22, 2006 11:59 PM (Infothought permalink)
Seth Finkelstein's Infothought blog (Wikipedia, Google, censorware, and an inside view of net-politics)


Comments

Seth,

Thanks for your suggestion.. it's worth a shot, so I've added class="permalink" to the Permalinks on Techdirt.. We'll see if it makes the googlenewsbot happy..

dennis.

Posted by: dennis at July 24, 2006 06:03 PM

For my blog, which freely mixes posts-that-share-the-page-with-other-posts and posts-that-have-their-own-page, I had to give Google a special page as they're only able to parse the latter: http://blog.outer-court.com/articles.html

Posted by: Philipp Lenssen at July 26, 2006 12:09 PM

Nope, as of July 27, they added the class but Google still does it for the article I checked.

Posted by: Jimmy at July 27, 2006 01:41 PM