Open Source, Community and Audiobooks: Q&A with LibriVox Founder Hugh McGuire

LibriVox is a volunteer effort with a big goal: record audiobook editions for every title in the public domain. In the following Q&A, LibriVox founder Hugh McGuire discusses the project’s beginnings, the organic development of the LibriVox community, and the distinctions (or lack thereof) between “professional” and “amateur” efforts.

How did LibriVox start?

LibriVox came about in August 2005, when I was looking for free, full-length audiobooks online for a long car trip. I went to gutenberg.org, but found mostly machine-read stuff there, which I don’t like. Eventually I found someone who had recorded half of Lady Chatterley’s Lover, which was enough for my trip, but it occurred to me, when I got through the first half, that I would have to wait months to hear the rest.

At the time, I’d been thinking and writing a fair bit about the free software/open source movement, and how it might apply to non-software projects. Wikipedia was a big inspiration there, as was Brewster Kahle’s Internet Archive, and his call for “universal access to all human knowledge.” I’d been enjoying podcasting, and in particular had been excited by AKMA’s project to get a group of volunteers to record and distribute Lawrence Lessig’s Free Culture, also an inspiration. Project Gutenberg has been a long-time inspiration, too.

So all those things percolated together, and I thought, well, why not try to start a big open source project to get volunteers around the world to record public domain texts for free? I put up a blog, sent out some emails, and had 13 people agree to read our first book in the first day. Three months later, we’d completed eight books; within a year we had 257 books.

How many audiobooks do you offer?

Currently our collection includes 1,896 completed works, all free, all public domain. We produce anywhere from 60 to 115 works in a given month, which puts us among the most prolific publishers of audiobooks in the world. We’ve got books in 26 different languages, including Finnish, Japanese, Spanish, Chinese, Russian and German. Our books include classics from Twain, Austen, Nietzsche, Zola, Plato, Shakespeare, Sun Tzu, as well as many more obscure books, such as The Romance of Rubber.

What file formats are LibriVox audiobooks available in?

128 kbps MP3, 64 kbps MP3 and Ogg Vorbis. All our files are hosted on the Internet Archive.

How many volunteers have participated? How do these folks find LibriVox?

We have 12,362 users registered on our forum, 2,476 of whom have volunteered to read. Many of our volunteers started as listeners. The rest just find us one way or another on the Web.

Do volunteer readers typically read entire books? Or, do books feature multiple volunteers reading different chapters?

Roughly half of our projects are collaborative projects, and half solo projects. That ratio has remained stable over the past few years (and, frankly, was something of a surprise to me: I did not expect so many people to record entire works, since it is a challenging thing to do).

What do you think motivates people to participate?

A whole range of things, but probably the main thing is that our volunteers enjoy recording texts. Many were read to as kids, and enjoy being read to, or reading to others; some of us have idealistic motivations about free access to knowledge; others have ambitions to become professional voice actors (a number of our volunteers have gotten gigs as pro readers). There is probably a certain satisfaction in being the voice of a writer you love for thousands of people. For a long time, Pride and Prejudice was our most popular book, downloaded hundreds of thousands of times, read by a library student from Missouri. I expect that’s a pretty wonderful feeling, having so many people get so much pleasure from something you’ve done for your own enjoyment.

We also have a wonderful, helpful online community, so I think many people just enjoy hanging out on our forum. The main thing that motivates people, and keeps us going, is that it’s fun.

Do you find that the same core group of volunteers continues to participate year after year, or do volunteers come and go?

There is a core of a few of us who have been around since the beginning, but we’ve had lots of turnover. It’s the kind of thing that becomes an obsession for many people, and so there is a natural burnout process. But there always seems to be a new crop of people to jump in. We have about 25 moderators/admins, and we are probably adding and subtracting about three people every three months or so.

As with Wikipedia, a huge portion of our recordings are done by a small number of readers. The 20 most prolific readers have read 30 percent of the sections in our catalog (!).

Was there a moment when the LibriVox community seemed to take on a life of its own? If so, when did this happen and how did you know it?

That’s easy: on Sept 12, 2005, when Boing Boing wrote about us. Traffic went from a couple of hundred a day, to 10,000 in one day. Nothing’s been the same since!

What are the biggest challenges you’ve faced in building and maintaining the LibriVox community?

In the early days the main challenge was dealing with the growth in the community and production. With a few books I gathered all the files myself, and uploaded all the files to the Internet Archive as they came in. But by the time we got to 10 projects that was too much for me (and I’m no good at organizing that sort of thing); at 100 projects we needed to streamline our system. We currently have about 400 active projects (that’s typical) and we are releasing an average of eight hours of audio a day. So the whole management of that process evolved organically, but took a fair bit of thinking about.

The other challenge more recently is a change in the sorts of people who are deeply involved in the project. In the early days, it was kind of like the wild west, and we attracted a motley crew of open sourcey types, with a broad range of skillsets (Web design, coding, etc). These days it seems like there are fewer of those kinds of people around (or, because the community is much bigger, it’s harder to find them), so some things we’d like to get done (for instance making our Web site more accessible) have been on the back burner for a long time. So that’s a challenge we have yet to figure out.

Have you marketed LibriVox, either through traditional advertising channels or via grassroots campaigns?

We’ve never done any marketing, except sending the odd email to Boing Boing and places like that. We get something like 40,000 visits a day on our site, all of it driven by general interest on the Web — small blogs writing about us, podcasters talking about us, and once in a while a big media piece (New York Times, Reason, LA Times, BBC, NPR etc). But mainly it’s just old-fashioned netroots marketing that seems to take care of itself.

Which titles and genres are most popular? Why did these titles/genres catch on?

The big ones are Bronte, Twain, Austen, L.M. Montgomery, Thomas Hardy, Dickens, and Conan Doyle. I think these are the classic stories of English literature, and so they are the writers most people seek out. But Einstein’s “Relativity: The Special and General Theory” has been downloaded 38,000 times, so it’s a pretty broad range of interest displayed by listeners. There are a number of sites that select out the best-of LibriVox, and that probably drives a fair bit of traffic, but it’s all a bit of a mystery to me how certain things in our collection become popular.

In previous coverage, it’s been noted that LibriVox’s goal is to “record every book in the public domain.” Do you have a sense of how many books that would involve and how long it would take to accomplish that goal?

I have no idea how many books that would be. In theory, the corpus of texts in the public domain should increase every year, as copyright terms expire. But the US Congress keeps extending copyright term, so we seem to be stuck with a fixed number of texts, mainly those published before 1923. Maybe someday that will change. I hope so.

But to the question: Project Gutenberg has 25,000 public domain books available to us, and the Open Content Alliance/Internet Archive just passed the 1 million mark of scanned public domain books. Then there is Google’s project. So, I’m not sure what the total number would be, but ideally we’d like to do all public domain books in all languages. We have our work cut out for us.

Our plan is to continue our efforts until we’re finished. If we up our production a little bit, to say 1,000 books/year, it will take us 1,000 years to get through the Open Content Alliance’s collection (which contains the entire Gutenberg collection). But if we can really get cracking and push to increase production by a factor of 10, we could cut that to 100 years.

Let’s split the difference and say 550 years.

Is there any distinction between “amateur” and “professional” on LibriVox? How do you define quality in a volunteer effort? Does quality even matter in this case?

No, there is no distinction really. Everyone is encouraged to join us. We have a wide range of quality, from truly exceptional (in a traditional sense), to good, to not so great. Our goal, however, is to record the books, and to make a platform that allows anyone to contribute to the effort. We ask no questions, require no auditions, make no judgments about style or technique, and are happy for every single audio file someone chooses to contribute to the project. So in many important ways we are not like a traditional publisher: our focus is more on our volunteers, helping them to record in order to contribute to our mission:

“To make all books in the public domain available, for free, in audio format on the Internet.”

And in some ways it’s a wonderful side-benefit that the world gets free audiobooks as a result of our efforts.

I personally like the more idiosyncratic recordings in our collection — the birds chirping in the background, and the rustling papers, the odd cough or stumble. These bring a different sense of humanity to the books than do professional readings. But that’s my personal feeling, and I do love the more traditionally “good” recordings as well.

But my general feeling is that the Internet is very good at sifting through piles of complex information, so other sites should come along and rank and sort our content, by whatever criteria they find important. It’s out there and available for all to use for free, however they would like to do so.

We have a policy against rating, and against un-asked-for criticism on our forum. It tends to discourage participation, and we need as many people to help out as we can convince.

However, you can search our catalog by reader; you can search for just solo works; and you are also encouraged to submit another version of recordings. A good number of our books have multiple versions.

So in short: we don’t do the sorting ourselves (though we have started to compile a list of favourite recordings from among our community), but we encourage others to do it.
