Wikipedia: A community of editors or a community of authors?

Having a bit more time than usual over the holidays, I caught up on various types of reading, including following old links. One of the pieces I came across that I can’t believe I missed when it was first published back in 2006 is Aaron Swartz’s Who Writes Wikipedia?

This piece is a must-read for anyone who cares about the future of publishing. Aaron argues that Jimmy Wales’ account of how Wikipedia happens is wrong:

I first met Jimbo Wales, the face of Wikipedia, when he came to speak at Stanford. Wales told us about Wikipedia’s history, technology, and culture, but one thing he said stands out. “The idea that a lot of people have of Wikipedia,” he noted, “is that it’s some emergent phenomenon — the wisdom of mobs, swarm intelligence, that sort of thing — thousands and thousands of individual users each adding a little bit of content and out of this emerges a coherent body of work.” But, he insisted, the truth was rather different: Wikipedia was actually written by “a community … a dedicated group of a few hundred volunteers” where “I know all of them and they all know each other”. Really, “it’s much like any traditional organization.”

The difference, of course, is crucial. Not just for the public, who wants to know how a grand thing like Wikipedia actually gets written, but also for Wales, who wants to know how to run the site. “For me this is really important, because I spend a lot of time listening to those four or five hundred and if … those people were just a bunch of people talking … maybe I can just safely ignore them when setting policy” and instead worry about “the million people writing a sentence each”.

Aaron makes the argument that when you count words rather than edits (as Jimmy Wales does when citing Wikipedia contributor statistics), most of the content in Wikipedia does indeed come from contributors outside the core.
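
To make the difference between the two measures concrete, here is a minimal sketch in Python. The revision history, the contributor names, and the word counts are all invented for illustration; this is not Aaron's data or his actual method, only a toy showing how counting edits and counting words can point at different people.

```python
from collections import Counter

# Hypothetical revision history for a single article:
# (contributor, words_added). All values are made up for illustration.
revisions = [
    ("occasional_1", 1200),
    ("core_editor", 15),
    ("core_editor", 8),
    ("core_editor", 22),
    ("occasional_2", 900),
    ("core_editor", 5),
]


def share_by_edits(revs):
    """Return each contributor's fraction of the total number of edits."""
    counts = Counter(name for name, _ in revs)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def share_by_words(revs):
    """Return each contributor's fraction of the total words added."""
    added = Counter()
    for name, words in revs:
        added[name] += words
    total = sum(added.values())
    return {name: n / total for name, n in added.items()}


edit_share = share_by_edits(revisions)
word_share = share_by_words(revisions)

# The contributor who leads each ranking.
top_by_edits = max(edit_share, key=edit_share.get)
top_by_words = max(word_share, key=word_share.get)
```

In this made-up history the frequent editor leads when edits are counted, while an occasional contributor leads when words are counted. Which ranking matters depends on whether you are asking who maintains the article or who wrote it.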

Aaron’s principal point was that from a governance point of view, Wikipedia should focus more on random, individual contributors of content, rather than on the core of editors. (And recent controversies have supported Aaron’s contention that Wikipedia’s core community may be too ingrown.) But I was interested for another reason.

Unlike Aaron, I think that Jimmy is right: Wikipedia does have a lot in common with traditional publishing organizations. But Aaron is also right: you have to value the contributors. Take O’Reilly’s book publishing operations: we have far more outside authors than we have employees. Many of them are passionate experts rather than professional writers or editors, just like Wikipedia authors. Their work is improved by an editing team and brought to market in the context of brands that we’ve created, but we couldn’t do what we do without them. This is just as true of any publishing company. Did Bloomsbury’s editors invent Harry Potter? No, it was a welfare mom who dreamed up the idea while riding on the train.

Any publishing organization needs both: a large network of contributors and a core of committed regulars.

This is why I’ve always found the publishing disdain for “user generated content” to be so perplexing. The fundamental job of publishing is curation — finding good stuff and bringing it to an audience that might not otherwise encounter it. It often (but not always) includes editing and improving it. (Some of our most successful books required very little editing.) Sometimes it involves commissioning or creating it, but that is far from the norm.

This is why publishers should be studying Wikipedia (and YouTube, and Google): because they are all showing us the new face of publishing. At their heart, they involve new means of content creation, yes, but more profoundly, they involve new means of curation. Wikipedia creates a context within which authors can exercise their skills, displaying their knowledge and their passion. Yes, it allows for collaborative creation, and that’s good. But the core framework of Wikipedia was developed by a small team, and a small team provides the editing work that keeps it on track.

This isn’t fundamentally new. It’s a different and better way of doing some tasks that publishers already perform.

Ditto Google. PageRank might be thought of as a way of getting millions of readers to work on the slushpile of web content, and promoting the best material to the top, where it can become professionalized.

(Incidentally, this is one of the reasons why we bring technologists and publishers together at our Tools of Change for Publishing conference. A great deal of what is happening on the web is the reinvention of the practices of publishing, not creating an alternative to them, but recreating them, reinforcing them, and showing publishers what is most important about what they do, and how to re-discover their core competencies in the new medium.)

P.S. If Aaron’s analysis is right, it demonstrates that Wikipedia is significantly different in its contribution pattern from open source software, which, according to Ohloh’s contribution statistics, has a pattern much like Jimmy Wales’ official story about Wikipedia: most of the work is done by a small core community.

  • John H

    My instinct is that while there is a core of in-depth articles written by the core team of editors, there is a long tail of articles on more obscure subjects written by outsiders.

    The long tail articles are generally of poorer quality, because they’re often written by people who ‘dip in’ to Wikipedia to have a go, and don’t necessarily dive into the reams of content guidelines. It would really do Wikipedia good to improve the quality of many of these articles, and concentrating on the core does not achieve this (not that I know what would).

  • http://tim.oreilly.com Tim O'Reilly

    John,

    You may be right. But Aaron’s point was that MOST of the articles are written by outsiders. They are then edited and improved by the insiders. The “long tail of articles”, as you put it, aren’t written by a different demographic, but they haven’t benefited from Wikipedia’s *editing* community.

  • http://borisoneclipse.blogspot.com Boris Bokowski

    Open source development as tracked by Ohloh only looks at commits, which are almost by definition done by a small core community, the committers.

    I am a committer on the Eclipse Platform, and we are seeing more and more contributions from non-committers, especially for mature components that are more or less in maintenance mode. Because of the necessary IP process, it is not as easy as in Wikipedia to get those contributions into the codebase, but they do make their way in eventually.

    Overall, I don’t think there is that much of a difference in the contribution patterns between open source projects and Wikipedia, as long as the open source projects explain well how ‘outsiders’ can contribute.

  • http://en.wikipedia.org/wiki/User:Mav Daniel Mayer

    I’ve been an editor and author contributing at Wikipedia since early 2002. At first, I authored little but edited a great deal and helped shape content and behavior policy. But once those policies were mostly in place and countless others with more energy did the editing, I switched to putting most of my energy into authoring content and only editing articles that I’ve brought to featured status or that I’m expanding to that status.

    I now enjoy adding content and letting editors help clean it up per all the content policies and guidelines along with adding small improvements to the content. I’m an exception though; in my experience, most content authors at Wikipedia don’t stick around for long. I think the editor culture may be too much for many of them; it takes some effort to let your work be “edited mercilessly” and you need to keep a watchful eye on your best work or it may slowly degrade by a “death of a 1000 edits” by less informed editors.

    But I completely understand the need for editors since I used to be one. So I persist. We just need a better way to help content authors preserve good content by limiting damage from less informed editors (usually the ones outside of or new to the community). I hope that the upcoming stable version feature will help in that regard.

  • http://www.openlogic.com/blogs/author/stormy/ Stormy

    I wonder what would happen if you could apply that model (outside contributions with core editors and integrators) to software? Eclipse plug-ins might be like that. You could get pieces of functionality from many different sources with different talents and problems to solve and integrate them into one solution. Firefox extensions might be another example of something closer to Wikipedia than your typical open source software project. I think it’s a really powerful model.

  • Andrew Burcin

    Tim, in your postscript you comment that “if Aaron’s analysis is right, it demonstrates that Wikipedia is significantly different in its contribution pattern from open source software”. This would hardly be surprising as the type of knowledge generally required to contribute to open source is a limited sliver of human knowledge (computer programming) whereas the type of knowledge required to contribute to Wikipedia in fact spans the whole of human knowledge. It would therefore make far more sense that there would be a huge range of contributors and a core of editors who would then help to provide the proper writing style. Just based on this single observation I would expect a very different contribution structure than in OSS.

  • http://tim.oreilly.com Tim O'Reilly

    Absolutely, Andrew. But it’s one of those things that doesn’t necessarily jump out at you. Because of course Wikipedia has a great deal in common with open source software: anyone can see the source (and in fact it’s even more radical in enabling anyone to edit the source), everything is under version control, there is a discussion list for each entry. So it’s easy to see Wikipedia as a kind of “open source for text.” But Aaron’s observation (and your comment) focus on what makes it different. And these small differences can end up being really meaningful.

  • http://rvgolfer.blogspot.com Dale Archibald

    I’ve been having a fling at Wikipedia in spare moments. I’ve started a couple of subjects, and made minor editing changes in others. (I’ve been in writing/editing all my life.)

    I love the concept of people of good will sharing expertise. Now, if we can just teach the readers not to trust it without doublechecking… because there are those who are of ill will, and those who are just plain boneheadedly wrong.

  • http://notes.computernotizen.de Torsten

    The new challenge for Wikipedia is a balance between the core group and interested authors. In the first years there was much consensus; the Wikipedia community was small enough to discuss important issues with everybody who was interested.

    Today there are so many rules that are used to keep outsiders out. Policies like WP:POINT can be interpreted in many ways and people get angry when their articles are deleted and they don’t understand why.

    To write a new article is not a “wiki-wiki” thing anymore. You have to do a lot of research not only about the subject of the article but on the procedures in Wikipedia to make your article stick.

  • http://makarevitch.org/rant/wikipedia.html Nat

    There is no wisdom in crowds (someone wrote that any crowd is a headless monster), but only a few folks who are ‘gems’ for the topic at hand because they master it and like (or have) to show it. They know the topic because they care (and vice-versa: positive feedback), while others don’t.

    Efficient ‘crowdsourcing’ is about tapping a crowd to efficiently mobilize, for each goal, its pertinent ‘gems’. A project leveraging a crowd must, for each given problem, encourage gems and somewhat, in the long term, mute the others. It works because ‘gems’ care, therefore (in an adequate context) they tend to stick and, if necessary, fight polluters, ignorants, jokers, phonies(…).

    This leads to ‘groups’, ‘gems’ of a particular topic who get to know and appreciate each other.

    Wikipedia (and all similar projects) have very few safeguards against two major distorting forces. The first is the fact that ‘officials’ (sysops…) slowly begin to act just as if the project belongs to the more active and useful contributors (them) and as if the community made them Imperators deciding upon ‘strategic matters’, free to give the boot to , while the community only gave them privileges usually granted to cops in order to have them enforce the rules, not to define them. This is a ‘coup’ which lasts thanks to some consent of the governed, which only exists because those last are not perceiving, in this huge project, such a slow grinding. Some of them may be booted later (“Horatii sysops” vs “Curiatii dispersed lucid ones”, news at 10).

    The second distorting force is made of propagandists, coming because Wikipedia content is read and believed by many (“Propaganda is to democracy what the bludgeon is to a totalitarian state”, Chomsky) because the content of many ‘applied’ articles (sciences, technology…) is pretty good. Members of this second group, in order to favor their material in the contents, try hard to become members of the first one.

    ‘NPOV’ is a utopia to me and won’t magically appear in such a context.

    As a sidenote: an OSS development project does not work this ‘crowdy’ way because it evolves under direction and is elitist: let’s find a single known OSS project enabling a potential contributor who is unknown to the team to commit to the official repository, or even to submit a patch which will be seriously reviewed for integration. Don’t hold your breath. An OSS project’s members form a ‘group’, not a ‘crowd’.

  • http://en.citizendium.org/wiki/User:Tim_Chambers Tim Chambers

    Thanks for giving this classic article a new look, Tim! I found your article today via a Google alert.

    I switched to Citizendium when Larry Sanger started the pilot, and I continue to believe that CZ has a better approach. By eliminating anonymity it is definitely excluding some who wish to contribute content, and that’s unfortunate. Then again, WP is already filling that niche, and this article and the responses to it (both here and attached to Aaron’s article) document a sample of the problems that anonymous contributors face. And problems caused by anonymous users are high-profile embarrassments for WP. By requiring real names, a true community can form at CZ based on real-life identities. And real-life reputations are put up as collateral against the quality of the content. Furthermore, editors are required to have demonstrated expertise — so far dominated by academic credentials, as one would expect to find on a project that desires to become the Internet’s source for reliable human knowledge. This isn’t unlike the model that you, with roots in dead-tree publishing, use to distribute reliable knowledge, Tim. And when you say you agree with Jimmy, I don’t think you mean to say that you agree with his willingness to overlook Essjay’s fraud, to cite one high-profile example of WP’s built-in sloppiness. I am also curious to know what you think of WP’s anti-elitism, explained in detail by Larry Sanger three years ago.

    I think the problem with WP is that it doesn’t have qualified editors. And its culture is allergic to measures of expertise that the rest of the world (including traditional print publishers) takes for granted. That, I think, is the key contribution that CZ makes to the endeavor to create a free encyclopedia: editors who can prove that they know what they are talking about.

  • http://www.ccil.org/~cowan John Cowan

    The ideal Wikipedian is someone quite different from the ideal O’Reilly author. The latter is a passionate subject-matter expert, and a great many of O’Reilly’s authors actually live up to that ideal, which is why O’Reilly’s books set the standard for excellence in their field.

    To be a good Wikipedia contributor/editor, you don’t need to necessarily know anything about what you are writing about, and being a passionate expert is probably a disadvantage, since it makes you confuse what you know as part of your expertise with what you can document from the source literature, leading to many unsourced statements. What you need is the ability to research that literature and boil it down into encyclopedia articles.

    That is why I no longer do anything on Wikipedia except correct obvious errors, usually copy-editing errors where subject-matter expertise is irrelevant. I have never written an article (except Egbert B. Gebstadter) and undoubtedly never will.

  • http://www.i2.psychologie.uni-wuerzburg.de/ao/staff/schroer.php Joachim Schroer

    There are a number of research papers on Wikipedia that might be relevant for this topic, and that I found most instructive when I tried to read up on the discussion between Jimmy Wales and Aaron Swartz.

    First, edits to Wikipedia seem to follow the Pareto principle. I.e., 80% of all edits are performed by 20% of the contributors:

    Spek, S., Postma, E., & van den Herik, H. J. (2006). Wikipedia: Organisation from a bottom-up approach. Paper presented at the 1st International Symposium on Wikis (WikiSym 2006), Odense, Denmark. Retrieved from http://www.cs.unimaas.nl/s.spek/spek-wikisym06.pdf.

    This distribution is even more pronounced when the popularity of the respective articles is taken into account by analyzing content that is actually read (or at least retrieved from Wikipedia). In this case, 90% of the words retrieved from Wikipedia are written by about 10-15% of contributors:

    Priedhorsky, R., Chen, J., Lam, S. K., Panciera, K., Terveen, L., & Riedl, J. (2007). Creating, destroying, and restoring value in Wikipedia. Paper presented at the GROUP 07 conference, Sanibel Island, FL. Retrieved from http://www-users.cs.umn.edu/%7Ereid/papers/group282-priedhorsky.pdf

    Together, these 10-20% of contributors should be considered “core” contributors. However, it is also true that occasional contributors sometimes add high-quality content:

    Anthony, D., Smith, S. W., & Williamson, T. (2007). The quality of open source production: Zealots and good samaritans in the case of Wikipedia (Technical Report No. 2007-606). Dartmouth: University. Retrieved from http://www.ists.dartmouth.edu/library/358.pdf

    (One should note that “quality” in this is defined as “retained/not changed by other editors”. This definition, however, risks favoring contributions to unpopular articles because they are edited less frequently.)

  • http://thehealthlinks.com Pat

    Wikipedia is biased. It’s representative of the participants. The only way to remove these biases is to get more participants with diverse backgrounds.