May 11, 2008

Wikia Search Infrastructure And Organization Discussion

The Wikia Search project (Wikipedia-model search engine) mailing list has had a thread discussing the status of updates and the social organization of the project, and which sort of arrangement would best serve the goal of having an open search project. There's been some confusion over who owns which part of the current technical infrastructure. Paul Vixie, who runs the Internet Systems Consortium (ISC), clarified (comment reformatted for readability):

The servers were donated to ISC, not Wikia. The bandwidth they use is provided by ISC. Wikia has donated a 15-ton air conditioner, a smattering of network switches and front end servers, and a heaping lot of [Wikia search lead]'s time and other Wikia staff time. Wikia has agreed in principle to underwrite the power costs of the ISC physical plant used by the crawlers and indexers. But as for a [different organization] there already is one (see www.isc.org, ugly though it is) and I think it's odd that anybody is still worried about that part. ISC is a 501(c)(3) [nonprofit] whose mission is public benefit. If anybody here thinks we're either incompetent or untrustworthy with regard to owning and operating a search engine backend, I'd thank you very much to call me on the phone and explain your concerns to me realtime.

My view is that the problem is not ISC's bona fides. Rather, it's more the issue of Wikia's incentives as a VC-backed startup, versus the optimum structure of an open-source search project. Focusing on Paul Vixie or ISC shifts towards what I call the "positive" ad-hominem argument (roughly: "I'm a good person. Therefore, what I do must be good. If you say it's bad, you're saying I'm a bad person. Because only a bad person does bad things. But I'm a good person, and you are then a bad person for saying otherwise. Therefore, you must be wrong. Because I'm a good person").

I'm a strong supporter of the ideas of Free Software and Open Source. But Wikia seems to be doing the worst implementation of this kind of project: the arrangement where supposedly programmers do unpaid labor because they just luvvvv programming, and businesses make money off the honey from the little worker-bees.

Semi-related, there's an amusing graph of the Wikia Search Hype Cycle:

Are you going to help take it to the next level?

Hmmm ...that's "you", as in you-Yes-YOU? No. I don't think so. I'm very wary of the way Wikia is positioned to "brand" everyone's work as its own (after all, that's happening right now with ISC's servers), and to commercialize itself anything of value from independent developers.

By Seth Finkelstein | posted in wikia-search | on May 11, 2008 04:41 PM (Infothought permalink)
Seth Finkelstein's Infothought blog (Wikipedia, Google, censorware, and an inside view of net-politics)

Comments

Brand. Yes. You got the point. Brand is the most important thing that Wikia is donating to the project. Of all the wannabe search engines out there, Wikia Search was the one to get attention from the media and thus from the potential participants of the project.

Maybe people thought that since this was a project from the "wikipedia guy" it had higher chances of being successful, and went to find out more about it. Then they realized that the "wikipedia guy" didn't really have a very clear idea about how to make Wikia Search successful, but after taking the time to find out more about the project, some of them decided it was interesting nevertheless and stayed to contribute.

Open source works because of the crowds of available developers, but there are many projects competing for their attention, so if there is something, such as a brand, that can attract them, then this isn't something bad.

Except for the fact that Wikia isn't non-profit, this isn't any different from other Open Source projects. If you contribute to any of the dozens of Apache projects, what people will see from outside is just "Apache" and not you.

So the difference would be in "to commercialize itself anything of value from independent developers." Let's take a different example then. Instead of Apache, let's talk about JBoss (an important Java Application Server). That is another project where lots of people contribute. But there is a small group that owns the brand and sells documentation, consulting, etc. Sun, IBM, etc. support Open Source projects because they also think it can help them sell services. They pay a few programmers and benefit from the work of thousands of others that they don't have to pay. So, how is that different from Wikia? Except that Wikia is still as clueless about how it will make money as it is about how it will make the project work. Wikia is already paying the person who is making the difference in the project, who instead of the "wikipedia guy" is the "jabber guy".

So that's already four contributions to the project: brand, (a few) people, the idea and hardware. What else exactly do you expect from them, that they would be lacking compared to other companies who sponsor open source projects?

Posted by: Bani at May 11, 2008 05:34 PM

When the brand is built on exploitation and digital-sharecropping, I think that's very bad indeed. "Except for the fact that Wikia isn't non-profit ..." - that's a BIG except. I believe you're way underestimating the incentives that Wikia has to not give back the reputation and patronage compensation to programmers for open source participation, but rather to defect on contributors for its own profit (anti-strawman: Wikia didn't invent this issue, the problem is generic - I'm saying Wikia has an extremely strong version of the problem, and there's plenty of reason to be concerned over it).

The difference is that Wikia is not paying for a product to help it sell services, and taking in minor tweaks (usually from other paid programmers). Roughly, Wikia is expecting everyone else to build it the product in the first place, which it will then sell, and trying to pull that off by presenting the free work it wants as a public service! (really amazing if one breaks that down)

Note they didn't come up with either the idea or the hardware. The idea is common, the hardware is ISC (though they do get credit for getting ISC on board, but that's a different matter). Note part of this post is specifically about the very issue of what their getting credit for the hardware in the public mind means as a harbinger for the future.

What I'd want from them is again giving back in terms of the norms of reputation and patronage compensation to programmers for open source participation. I doubt they even know how to do that, because their entire company is built on a model of extracting unpaid work.

Posted by: Seth Finkelstein at May 11, 2008 06:52 PM

What better alternatives are there for people interested in contributing to the wikipedia weblinks?...

Posted by: the zak at May 11, 2008 08:19 PM

Paul Vixie just posted again to the list and has asked a lot of questions, one of which was "2. why is Wikia's the only frontend?" Is it possible *right now* for someone other than Wikia to create a front-end to the search data? Can someone point me to a link about this?

Posted by: Anthony at May 11, 2008 09:27 PM

I think it's *possible* in theory, but I suspect in practice it'll require some co-operation and technology transfer from Wikia or maybe ISC. Good question to ask - good task to try 1/2 :-).

Posted by: Seth Finkelstein at May 11, 2008 09:47 PM

Most Linux hackers don't work for free (even if we define free as "in return for reputation or patronage"). Developers who work on serious open source projects (like the kernel, databases etc. as opposed to MyMp3Player) are employed either by consortiums (e.g. Linus) or corporations (e.g. Alan Cox at Red Hat).

Now, is Wikia Search in the MyMp3Player category or the Apache, MySQL, kernel category?

Posted by: anon at May 12, 2008 02:29 PM

Anthony, yes, this is possible, documented here:
http://search.wikia.com/wiki/Tech/Open_Index
Seth, the technology (index server, request and response formats) is there already.

Posted by: Rainer at May 14, 2008 04:55 PM