March 09, 2007

Unpaid-Labor-Powered Search Engine Still Vaporware For Now

Segueing to search: Wikipedia founder says to challenge Google, Yahoo

TOKYO (Reuters) - The online collaboration responsible for Wikipedia plans to build a search engine to rival those of Google Inc. and Yahoo Inc., the founder of the popular Internet encyclopaedia said on Thursday.
Wikia Inc., the commercial counterpart to the non-profit Wikipedia, is aiming to take as much as 5 percent of the lucrative Internet search market ...

I've been on the project's public mailing list, and there hasn't been much activity lately. Down in the land of developers outside of Wikia, there still isn't even a machine set up for initial exploration. Not that this means much, as projects go. But in case anyone wants, or cares, to read something besides gush, the citizen-mushrooms are still in the dark here.

I wonder if one Wikia "exit strategy" under consideration is to be bought by Yahoo or Microsoft. Note the search project might not actually produce much in terms of search innovation, but if it prompts an acquisition somehow, that's a big win for the Wikia investors. Again, just speculation on my part.

By Seth Finkelstein | posted in wikia-search | on March 09, 2007 11:50 PM (Infothought permalink)

Comments

I doubt that Jimbo is clever enough to be angling for a Wikia, Inc. buyout at this point. For one thing, the search engine talk is clearly hot air, and that will count for less as Web 2.0 declines over the next year. How do I know it's hot air? Because the hardest part of a search engine is the crawling, and Jimbo's Jumbo Engine doesn't plan to do any of this. They plan to show all that niche fancruft from Wikia, but they don't plan to crawl the web. No one wants to read the stuff on Wikia.

It's more likely that Jimbo is simply trying to divert attention from Wikipedia's current Essjay controversy by throwing out some Web 2.0 hype about challenging Google's dominance. After all, it was only a couple of months ago that Jimbo blamed his own "big mouth" for prematurely talking about his search engine plans. Nothing has happened since then to make such talk any less premature.

Jimbo still has a "big mouth" (his words), but that's preferable to talking about Essjay at Jimbo's various pre-scheduled appearances around the world.

Jimbo is playing the high-tech media coverage game. It's the only thing he's good at. And that's only because there are too many high-tech reporters chasing too few stories. They'll grab at anything for a couple paragraphs of copy.

Posted by: Daniel Brandt at March 10, 2007 01:04 PM

How about making the search results behind Records and Archives in the News,
by the indomitable Peter Kurilecz, transparent in,
for example, a frequently asked questions (FAQ) type format
or a format like one finds on websites with an "about" link?
Fanatics who appreciate RAIN could more accurately review
the keyword descriptors, the search engines and such and
offer their own ideas that might even improve or vary
the results!

Or devise their own techniques
modeled on the current techniques used by Peter Kurilecz.
Attributing the original technique to Peter Kurilecz
is surely appropriate. Giving others the opportunity
to use that successful technique is a teaching moment.
A learning moment for us who value the energies that
go into it.

Records and Archives in the News RAIN at
http://lists.ufl.edu/archives/recmgmt-l

What luck it would be that others would devote energies
to take the model technique of Peter Kurilecz and devise
something else also of value!

The sharing of knowledge, the teaching of techniques
is a beautiful model for a profession that is deserving
of greater recognition, for example in programs like that at
http://www.albany.edu/cci/

Maybe sometime the resources will be available to offer
a grant for Peter Kurilecz' great energies that produced
Records and Archives in the News RAIN, an effort that
would also be wonderful on a website with features to make
even better use of the content of the results of the
searches. And the explanations that would encourage others
to try producing something using the Peter Kurilecz model.

The Peter Kurilecz device is an art!
Here's one source of grants for artists
http://artdeadlineslist.com

Another aspect is
how to incorporate a thesaurus in the search engine.
Imagine making up a search and the search engine suggests
alternate keyword descriptors you might also try.
Or the search engine suggests at the point of searching
hints, tips and pointers to refine your particular search.
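
As a rough illustration of the thesaurus idea, here is a minimal sketch in Python. The THESAURUS table and the suggest_alternates function are hypothetical names invented for this example, and the synonym entries are made up; a real engine would presumably draw on a much larger curated vocabulary.

```python
from typing import Dict, List

# Hypothetical, hand-made thesaurus: keyword -> alternate descriptors.
THESAURUS: Dict[str, List[str]] = {
    "records": ["archives", "documents"],
    "retention": ["preservation", "storage"],
}


def suggest_alternates(query: str,
                       thesaurus: Dict[str, List[str]] = THESAURUS) -> List[str]:
    """Return alternate queries, each swapping one word for a listed synonym."""
    words = query.lower().split()
    suggestions: List[str] = []
    for i, word in enumerate(words):
        for synonym in thesaurus.get(word, []):
            # Replace only the i-th word, leaving the rest of the query intact.
            suggestions.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return suggestions


if __name__ == "__main__":
    for alt in suggest_alternates("records retention"):
        print("You might also try:", alt)
```

Words with no thesaurus entry simply produce no suggestions, so an unknown query returns an empty list.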

Have you kind folks out there read
The Extreme Searcher's Internet Handbook,
a guide for the serious searcher,
extreme tools and techniques for web information users.
by Randolph Hock
Foreword by Gary Price
http://www.amazon.com/gp/reader/0910965684
http://worldcat.org/oclc/53090956

Protect editorial privilege,
critique the dynamics of editorial privilege.

Posted by: dsaklad@gnu.org at March 10, 2007 06:56 PM

Daniel: Huh? I believe you're thinking of a different project, "Wikiseek" / searchme.com. They're a separate start-up (really). The Wikia-search does plan to crawl the web in general.

Posted by: Seth Finkelstein at March 10, 2007 11:35 PM

I work for a search-engine company. You may consider the following a conflict of interest or a sign that I have some clue whereof I speak....

Nothing I have seen about this "collaborative search" gives me one clue about how this technique is supposed to give better results than Google/Yahoo/MSN/etc. I mean, if you go to our company's Web site, it's pretty easy to figure out what we're doing that's worth (certain corporations and government agencies) paying money for. A few years ago, I interviewed at Endeca, another search-engine company in this neighborhood, and it didn't take long for me to understand what they were selling.

By contrast, Wales seems to be selling vaporware with the scent of magic pixie dust. Especially since the Open Directory Project is already doing some volunteer-assisted Web indexing work.

And note that both MetaCarta and Endeca sell to institutions, and we use our technology to crawl, analyze, and serve up our customers' internal Web sites. Google also sells an appliance that allows customers to apply their technology to intranets. I don't see how a wiki-based search engine can get traction in this very high-margin slice of the search-engine market.

Posted by: Seth Gordon at March 11, 2007 07:31 PM

Surprisingly, I think Wales has an actual argument. It's basically hand-tweaked search results. That is, the base will be a commodity search engine, with a value-added component of hand-adjustment for the popular (and hence spam-intensive) terms. To the objection that this has been tried before and failed, because of the labor involved, he could say (not in the following terms, but you get the idea) that he's an expert at building a cult social system to get lots of people to work for free. Very few projects can claim that, and while ODP could be cited, it was not nearly as well-connected and heavily supported by marketing hypesters.
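
To make the "hand-tweaked results" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: base_search stands in for whatever commodity engine would be used, and the PINNED and BLOCKED tables are hypothetical volunteer-maintained overrides. Nothing here reflects an actual Wikia design.

```python
from typing import Callable, Dict, List, Set

# Hypothetical volunteer-maintained overrides for popular, spam-prone queries.
PINNED: Dict[str, List[str]] = {
    "cheap flights": ["https://example.org/flight-guide"],
}
BLOCKED: Dict[str, Set[str]] = {
    "cheap flights": {"https://spam.example/flights"},
}

# Canned results standing in for a commodity search engine (an assumption).
_CANNED: Dict[str, List[str]] = {
    "cheap flights": [
        "https://spam.example/flights",
        "https://example.com/airline",
        "https://example.org/flight-guide",
    ],
}


def base_search(query: str) -> List[str]:
    """Stand-in for the commodity engine: return ranked URLs for a query."""
    return list(_CANNED.get(query.strip().lower(), []))


def curated_search(query: str,
                   base: Callable[[str], List[str]] = base_search) -> List[str]:
    """Apply hand adjustments on top of the base ranking."""
    key = query.strip().lower()
    blocked = BLOCKED.get(key, set())
    # Hand-picked results go first, unless they have also been blocked.
    pinned = [url for url in PINNED.get(key, []) if url not in blocked]
    # Keep the engine's order for everything else, minus spam and duplicates.
    rest = [url for url in base(query)
            if url not in blocked and url not in pinned]
    return pinned + rest


if __name__ == "__main__":
    print(curated_search("Cheap Flights"))
```

Queries with no overrides pass through unchanged, which is the point of the argument: the human labor is spent only on the popular terms.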

Will it work? Well, there's plenty to be said against it. But the above does seem like a rational argument.

Posted by: Seth Finkelstein at March 12, 2007 12:35 AM