Tools

Piracy and Advertising: An Unlikely Union that Just Might Work

In a surprisingly progressive move, a number of major publishers are using YouTube's Video ID tool to monetize pirated content. The tool flags questionable material and presents copyright owners with a choice:

Copyright holders can choose what they want done with their videos: whether to block, promote, or even--if a copyright holder chooses to partner with us--create revenue from them, with minimal friction. [Emphasis added.] -- (From YouTube's Video ID about page.)

YouTube's phrasing seems overly optimistic, but the New York Times says some publishers are choosing the partnership option:

David King, a product manager at YouTube, said in an interview that 90 percent of the copyright claims made using the identification tool remain on the site and are converted to advertising inventory. The other 10 percent are either removed from the site or tracked by the content owner.

The Times article notes that at this point advertising revenue from Web video is minuscule and publishers using the tool are still skeptical. Nonetheless, it's encouraging to see a piracy approach that doesn't default to heavy-handed tactics.

Q&A with Developer Who Turns Ebooks into iPhone Applications

Ebook files and e-reader software usually exist as separate entities, but Tom Peck of AppEngines merged the two to create individual ebook applications for the iPhone App Store. In the following Q&A, Peck discusses his ebook software development process, consumer response to his apps, and future ebook projects.

Why did you opt to bundle individual ebooks as software applications rather than create a single e-reader program?

I have been reading ebooks (mostly from eReader.com) for many years. I wanted to make a book reader program for the iPhone that was as simple to use as possible. I feel that the way existing ebook solutions work is too complex for many users: they have to download the ebook software, then go to a separate Web site and create an account, enter credit card data, and then find and purchase content.

The iPhone App Store sales and distribution process makes it simpler and more convenient to have an ebook reader as part of an ebook itself. Developers can only distribute applications through the App Store; there is no way to distribute data files like ebooks. Therefore, it made sense to me that each book had to be a complete application.

Although this is more convenient for App Store customers to get a book, the process of making each book into an app takes more time for development. Each book becomes its own Xcode project, requires testing, and requires time to load all of the data (descriptions, screen shots, application file) to the App Store. I have developed tools and techniques that automate as much as possible, but each book takes several hours to complete, not counting the many hours spent writing the ebook reader itself.

Have you used any of the e-reader applications available through the App Store (e.g. Stanza, eReader, etc.)? If so, how do these compare to your own apps?

I have used the eReader software. I am a long-time eReader customer, having purchased dozens of their books and read them on my Treo. I have not used Stanza.

The biggest difference is that those products let the user download content from the Internet. Some let users create their own content and download it to the iPhone, which is nice. My reader is purely a book reader.

The eReader app supports a bookshelf list, showing all the ebooks. With my apps, each ebook appears as its own icon on the home screen.

My current reader program compares nicely to eReader. At the moment, I do not support landscape mode, which eReader does. Both offer text search and table of contents. I admit that the search function in my first batch of books was not very usable; newer books have a much better implementation, even better than eReader's. Both programs support different font sizes, images embedded within the text, layout options such as indenting and centering, and font styles.

One feature my reader has is instant repagination when the user changes font size. Using my reader, the user can increase or decrease font size using the "pinch" gesture, similar to zooming in and out of photos, and the results are immediate. I spent a lot of time to make this very, very fast. Changing the font size in eReader requires the program to repaginate in the background, a process that can take over 30 seconds for the entire book.

How many ebooks have you made available through the App Store?

Currently, about 140. More are in the pipeline, all newer, copyrighted works from other publishers and authors.

What has the response been like?

Response has been very good. My current download total for all books (not counting several free books) is almost 1,000 books a day. The numbers per book vary day by day, with some books having as many as 50 downloads a day. Most of the public domain titles have counts around five per day.

Most encouraging is that the newer works are selling just as well as the classic stuff. iPulp, a publisher of science-fiction and adventure short stories for young adults, has four works in the store right now with six more in review. These are priced at $0.99 and $1.99 and have sales of about 10 per day. The two Max Quick novels sell for $5.99 each. Currently they are selling about 13 copies per day and the numbers are increasing (they've been in the store for less than two weeks).

Are you selling ebooks or ebook applications through other platforms?

Right now, I am only working with the App Store. I am watching to see what other cell phone vendors and carriers do. As some of your blog postings have noted, the success of the App Store is making other carriers look at copying Apple.

I have spent time with Google's Android platform and have a version of the ebook software that runs on Android.

How much of your ebook content comes from Project Gutenberg?

My initial group of books, about 110, were all from Project Gutenberg. I constantly get requests from customers to add new books, so I have added more Project Gutenberg stuff. Now that I am working with publishers and authors to produce their works as ebooks, I will focus primarily on new works.

Can you list some of these publishers/authors? How did your relationships with these publishers and authors come together?

In the store now are a book on computer security by Neal Puff and a memoir by Teresa Wright. All relationships came about because of my presence in the App Store with the initial set of ebooks. I've been contacted by small publishers and individual authors to turn their works into ebooks for the iPhone. I work with them to get the content in an appropriate format, get the various graphic elements (cover art, icons, etc.), produce the ebook app, have them review the app, and put the app into the App Store.

Do publishers pay you a flat fee to prep App Store titles or is it a revenue share?

Revenue share.

Did you anticipate this type of publisher response?

I was a bit surprised at how quickly publishers contacted me. I thought I would have to market to them.

Are there other content sources or types you'd like to incorporate?

One publisher I am working with offers textbooks. That would be an interesting type of content. A textbook could take advantage of the ebook being a standalone app, offering more interactive content for quizzes that would appear within the book.

Some App Store reviewers complain that you're making money off of public domain content. How do you address these complaints?

The Project Gutenberg license clearly allows people to sell works based on the Gutenberg files. I am following the license, and I do send 20 percent of the revenue earned to the Project Gutenberg Foundation. Mobipocket, eReader and Amazon Kindle all sell public domain works for much more than $0.99.

Each book requires a lot of manual work. The Project Gutenberg text files are a good starting point, but I have to edit each one to add information about chapter starts, poems, songs, emphasized text, etc. Many files have extra data like page numbers that have to be cleaned up. I tried to automate this part, but there is so much variety in the files that only hand editing can get the correct results.
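
To illustrate the kind of cleanup Peck describes, here is a minimal Python sketch of one narrow step: dropping lines that contain nothing but a page number. Peck does not describe his actual tools, so the pattern, the function name, and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a line holding nothing but a page number,
# optionally wrapped in brackets, braces, or parentheses ("42", "[42]", "{42}").
PAGE_NUMBER_LINE = re.compile(r"^\s*[\[{(]?\s*\d{1,4}\s*[\]})]?\s*$")

def strip_page_numbers(text):
    """Drop lines that consist only of a page number; keep everything else."""
    kept = [line for line in text.splitlines() if not PAGE_NUMBER_LINE.match(line)]
    return "\n".join(kept)

# Made-up sample text, not taken from any Project Gutenberg file.
sample = "It was the best of times,\n[12]\nit was the worst of times.\n1859 was the year."
cleaned = strip_page_numbers(sample)
```

A rule like this also shows why hand editing remains necessary: a line containing only "1859" would be removed even if it were a chapter title, and page numbers embedded mid-line would be missed entirely.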

Since your ebooks are applications, and iPhone apps are stored on the device's docking screens, is there a concern about clutter? Do you have any organization tips for people who buy multiple ebook apps?

I would say that this is a general problem with the iPhone Home Screen user interface. iPhone blog sites describe users with 100 apps or more on their devices, and finding a specific app can become a problem.

iTunes does allow users to selectively install apps on individual devices. This is probably the best way to deal with lots of apps: for users to only install the apps they need, and keep the rest on their desktop machine. Personally, I tend to read about two books at a time, then I remove them from the device when finished.

What near-term features or products are you planning?

I am working on a new version of the reader software that adds many new features: bookmarks, notes, landscape mode, etc. Once completed, I will re-release all existing books with the new features. Customers will get the updates for free.

I also am working on several non-ebook iPhone apps.

Links: The Simple Solution for Context

A recent report from the Associated Press finds that news consumers are engaged in a futile search for depth and context. Ethan Zuckerman offers a different perspective in his excellent analysis of the findings:

The [report] authors argue that news fatigue is a function not just of negativity, but of too many headlines. Some of the people in the study (basically, everyone who has internet access at work) report restlessly reloading news websites waiting for something new to appear. This is a pretty unsatisfying experience with most news stories, which don't change all that fast, but it's an easy form of news to get and one that cable news networks now appear obsessed with. It was less clear to me than from the researchers that this constitutes a consumer desire for depth - it simply looked like boredom with the same old headlines to me. [Emphasis added]

My take is that these seemingly insurmountable and divergent needs -- avoiding boredom and finding context -- can both be served by one simple tool: hyperlinks. A series of well-placed, hand-picked links expands the boundaries of a particular story without affecting the core narrative. No other medium offers such an elegant and powerful mechanism. No other medium gives readers a choice to go deeper.

Unfortunately, that choice is only available if editors aggregate and embed links. Simply making content available through Web sites, mobile devices, newsletters, RSS feeds and Twitter isn't enough. As the AP report suggests, consumers want something deeper (or less boring), and editors are well positioned to provide that service by exercising the curatorial skills they've developed in the news trade. Ignoring links -- or relegating them to rarely read closing paragraphs -- is an egregious disservice to the audience because it withholds the very things consumers crave.

(Via the Reading 2.0 list)

Books Fail to Crack Top 100 in iTunes App Store

Over at Radar, Ben Lorica analyzes sales and category data for the iTunes App Store and makes an interesting discovery about the store's book section:

The Book category is comprised mostly of ebooks and while there are over 150 such "apps", it was the only category not represented in the Top 100 rankings ...

As Ben notes, most of the applications in the App Store's book category are individual ebooks -- most drawn from Project Gutenberg -- wrapped up as stand-alone software packages. The user reviews attached to these ebook apps fall into two camps: critics who cry foul over public domain titles repurposed with a price tag, and advocates who see value in the applications' low cost (most are $0.99) and easy access.

Photo Blog Shows Innovation Still Alive in Media Orgs

Alan Taylor, a Web developer at the Boston Globe, hit the sweet spot between immersive storytelling and simple technology with his photo blog, The Big Picture. Taylor discussed the genesis of the blog with Waxy.org in a June interview. Here are a few notable excerpts relevant to publishers:

I have an advantage in that my main role is as a developer here, so I could build all my own templates, format my own style, and so on. I sort of bulldozed some things through though, like extra width, few ads, and I made it simple internally by doing it mostly on my own, no requests for development time, marketing or promotion.

Taylor's photo selection process combines technology and editorial curation. He selects photos from Web searches, photography sites, and wire services. Then he uses custom scripts to extract metadata and resize images for the blog.

When I find an image I like, I save it to a local folder until I get about 25 or so good ones to choose from. Then I open all 25 in Photoshop, arrange the windows in a horizontal tile and drag them around to get a rough ordering that makes sense. Then I start to edit out images that don't make the cut, run a couple of recorded Photoshop Actions to size the images, and do some hand-cropping if necessary.

On his personal site, Taylor explains the simple ideas that brought The Big Picture together:

When I see quality photography consigned to the archives, or when I see bandwidth readily given up to video streams of dubious quality, or when I see photo galleries that act as ad farms, punishing viewers into a click-click-click experience just to drive page views - those times are the times I'm glad I was able to get this project off the ground (many thanks to my friends within boston.com)

The Big Picture brought in 1.5 million page views in its first 20 days, phenomenal numbers for any upstart blog. More importantly, the site shows how tech skill sets and big media resources (those wire services aren't cheap) can catalyze innovation within a large publishing organization.

Web Community Management Tips

Whether intentional or not, Bob Garfield from NPR's "On the Media" reopened an old wound when he questioned the need for user comments on newspaper Web sites.

The "comments issue" is polarizing. Die-hard community advocates believe comments are an integral part of the online experience. Detractors draw a straight line between user comments and the apocalypse. It's a contentious topic with very little middle ground.

For our purposes, there's no point in looking at all the arguments and counter-arguments. The comments debate has been going on for at least 10 years (much longer, if you count Usenet), and it will persist as long as trolls continue to lower the conversational bar. That's just the way it is.

However, this latest flare-up offers an opportunity to redirect the focus to some of the time-tested best practices for managing Web communities. Derek Powazek (whom we recently interviewed for an unrelated piece) offers an excellent starting point with "10 Ways Newspapers Can Improve Comments," and Cory Doctorow's "How To Keep Hostile Jerks From Taking Over Your Online Community" is also recommended reading.

I've also picked up a few bits of wisdom from my own experiences as a community manager:

  1. Nurture the Good -- The majority of people want to do the right thing. They want to engage in fruitful and fulfilling conversations. They want to build and protect special communities. These are the people you focus on.
  2. Push Trolls to the Margins -- All popular communities will eventually suffer through a troll infestation. The trick is to minimize a troll's impact by not taking the bait. Moderators should never engage in a public argument, and key community members should be encouraged via private messages and back channels to ignore troll attacks. A marginalized troll is a useless troll, and they know it.
  3. Share Ownership -- I focused on inclusiveness in my first community because I was unsure about my own voice and opinions. In a serendipitous twist, the "we're all equal and we're all in this together" perspective led to a shared sense of ownership. It took a while for folks to buy what I was selling, but a consistent focus on collaboration and equality eventually led to individual responsibility and effective self-policing. I've used this same technique on subsequent communities and the results have always been positive.
  4. Calm by Example -- Experienced community managers know that the Web is a fickle place; today's egregious opinion often evaporates within a matter of days. A measured community manager allows fiery debates to run their course without spilling out of control, and on those rare occasions when guidance is required, a calm force is far more powerful.

What community tips do you have? Please share your thoughts in the comments area (unless you're a troll).

Processing the Deep Backlist at the New York Times

At the O'Reilly Open Source Convention (OSCON), Derek Gottfrid of the New York Times led a fascinating session on how the Times was able to utilize Amazon's cloud computing services to quickly and cheaply get their huge historical archive online and freely viewable to the public.

How big is the archive? Eleven million individual articles from 1851 to 1980, or 4 terabytes of data (over 4,000 gigabytes). The Times got it ready for distribution in 24 hours, for a total cost of $240 in computing fees and $650 in storage fees.

As part of their original TimesSelect subscription service, the paper had scanned their entire print archive. Each full-page scan was cut into individual articles. Typical of newspaper format, the articles often spanned column or page boundaries, which meant that many articles were composed of several scans. In the original subscription-based program, whenever a reader requested one of these historical articles, the Times computer would need to stitch together all of the scans for a particular article before presenting it.

This on-demand process used significant computing resources, but because TimesSelect was subscription-based there was never much traffic. Once the archive was opened to the public, usage was expected to climb, and the safest approach under heavier traffic was to serve pre-generated versions of all 11 million articles. Using traditional software development practices -- with a single computer churning through one article at a time -- the processing could potentially take weeks and tie up Times servers that were needed for other tasks.

Gottfrid turned to Amazon Web Services (AWS) and its two main products:

Amazon Elastic Compute Cloud (EC2) is a form of "virtualization" where one very large computer is divided up into many virtual computers that can be individually leased out for use. Traditional hosting costs money whether the server is working or idle; in EC2 you pay only as long as the virtual computer is running. When it's no longer needed, it's shut down. This makes the service ideal for one-off processing jobs.

In addition, Amazon doesn't care whether you use one EC2 "instance" 100 times, or 100 instances all at once -- the cost is the same. The payoff comes when a job can be usefully divided into 100 concurrent tasks, because it then finishes in roughly 1/100th of the time.

Amazon's other major AWS offering is the Simple Storage Service (S3), for large-scale file hosting. Like EC2, it is a leased model -- you pay only for the space that you use in a given time period.

Gottfrid leveraged these technologies in combination with a relatively new software library called Hadoop. Hadoop is written in the Java language and is based on work done at Google. It allows programmers to very easily write programs that can be run simultaneously on multiple computers.
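
The underlying idea, splitting a job into independent tasks that run at the same time, can be sketched in a few lines. This is not Hadoop's API and not Gottfrid's code: Python's standard `concurrent.futures` module stands in for Hadoop on a single machine, and the article IDs, fragment names, and the `stitch` function are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each article ID maps to its scan fragments, in reading order.
ARTICLE_SCANS = {
    "1851-09-18-001": ["scan-a", "scan-b"],
    "1912-04-16-001": ["scan-c", "scan-d", "scan-e"],
    "1969-07-21-001": ["scan-f"],
}

def stitch(item):
    """Join one article's fragments. A stand-in for real image stitching."""
    article_id, fragments = item
    return article_id, "+".join(fragments)

def pregenerate(articles, workers=4):
    """Stitch every article independently; no task depends on another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(stitch, articles.items()))

stitched = pregenerate(ARTICLE_SCANS)
```

Because each article can be stitched without reference to any other, the work divides cleanly across workers, which is the property that makes a job like this a good fit for a cluster.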

Combining Hadoop concurrency with EC2 and S3, the Times was able to run a job that might have taken weeks of processing time and complete it in 24 hours, using 100 EC2 instances. They were pleased enough with S3 that it became their permanent hosting platform for the scans. Hosting with Amazon or other cloud computing services is usually cheaper and has much better bandwidth than the average provider, although downtime can and does occur.
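
The reported figures imply a simple cost model, sketched below in Python. The hourly rate is derived from the article's own numbers ($240 for 100 instances over 24 hours), not quoted from Amazon's price list, and the `job` function assumes the work divides evenly with no coordination overhead.

```python
# Figures reported in the article.
COMPUTE_COST_USD = 240
INSTANCES = 100
WALL_CLOCK_HOURS = 24

# Total machine time consumed, in instance-hours.
instance_hours = INSTANCES * WALL_CLOCK_HOURS

# Implied hourly rate per instance (derived, not a quoted price).
implied_rate = COMPUTE_COST_USD / instance_hours

def job(instances):
    """Return (wall_clock_hours, cost_usd) for the same workload on `instances` machines.

    Assumes the work divides evenly and ignores startup and coordination overhead.
    """
    hours = instance_hours / instances
    cost = instances * hours * implied_rate
    return hours, cost

parallel = job(100)  # the configuration the Times used
serial = job(1)      # the same work on a single instance
```

Under these assumptions the bill is the same either way; only the wall-clock time changes, from 24 hours on 100 instances to 2,400 hours (100 days) on one.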

At last year's OSCON, the Times announced the formation of its developer blog, Open. You can read more about the original AWS project as well as TimesMachine, a project that became economically feasible due to the low cost of AWS.

How Hackers Show It's Not All Bad News at the New York Times

News of a looming downgrade of NYT stock to "junk" status by Standard & Poor's sadly isn't all that shocking. I'm certainly glad I'm not an investor holding any NYT.

But there's something going on at the Times that probably won't make it to Silicon Alley Insider, much less the mainstream business press, and it's something that's starting to make me think the Times just might succeed in adapting to the changing rules of the media and publishing game (though there will almost certainly be many more casualties before it's over).

So what's the Times doing that's so important? They're hacking.

Not hacking in the nefarious sense, but in the original sense of experimentation, and curiosity, and solving interesting problems (as Paul Graham put it, "Great hackers think of it as something they do for fun, and which they're delighted to find people will pay them for"). How many other publishers are running blogs about their work with open source software? Even fewer are developing and releasing their own high-quality open source software:

Quite frankly, we wanted to scale the front-end webservers and backend database servers separately without having to coordinate them. We also needed a way to flexibly reconfigure where our backend databases were located and which applications used them without resorting to tricks of DNS or other such "load-balancing" hacks. Plus, it just seemed really cool to have a JSON-speaking DB layer that all our scriptable content could talk to. Thus, the DBSlayer was born.

That is not typical newsroom conversation.

But this isn't just about open source software, or even about some developers building cool software to run backend systems. The Times has put developers right in the middle of the newsroom. At a MediaBistro event in May, Aron Pilhofer from the "Interactive News Technology" group at the Times (sharing the stage with their Editor of Digital News, Jim Roberts) talked about how the Minnesota bridge collapse made them realize they needed to develop their own tools to cover the news with the web, and not just on the web. Less than a year later, when Hillary Clinton's infamous public schedule was released, they had the people and the skills in place to crunch 12,000 PDF documents (containing images of scanned documents) through a text-recognition program, onto Amazon's Elastic Compute Cloud, and finally into a Ruby on Rails Web application providing full-text search across all eight years of calendars.

Just this week, the Times' Derek Gottfrid gave a talk at O'Reilly's Open Source Convention (OSCON) titled "Processing Large Data with Hadoop and EC2" based on work he'd done on the Times' archives. Again, this is the kind of talk you're not likely to hear at most newspapers (or magazines, or book publishers) these days:

I was able to create a Hadoop cluster on my local machine and wrap my code with the proper Hadoop semantics. After a bit more tweaking and bug fixing, I was ready to deploy Hadoop and my code on a cluster of EC2 machines. For deployment, I created a custom AMI (Amazon Machine Image) for EC2 that was based on a Xen image from my desktop machine. Using some simple Python scripts and the boto library, I booted four EC2 instances of my custom AMI. I logged in, started Hadoop and submitted a test job to generate a couple thousand articles -- and to my surprise it just worked.

Earlier this month at FOO Camp I had the pleasure of meeting another hacker from the Times, Nick Bilton, part of the Times R&D lab -- the folks who built the impressive NYT iPhone App.

UPDATE: Nick Bilton points out via email that:

There were people from nytimes.com that were instrumental in building the NYT iPhone app also ... Is there any way you can add a couple of words that the R&D Group 'worked with nytimes.com' to help build the iPhone app?

If you're worried about EBITDA and EPS, then you're rightly worried about the Times right now. But if you're worried about the future of journalism, and about the ability of established media companies to adapt to a digital world, there's also reason to be excited about the Times.

Tech Publisher Asks "Are Ebooks Ready for Technical Content?"

Dave Thomas from the Pragmatic Programmers is mulling whether to make their books available on the Kindle, and encountering many of the same issues we faced here at O'Reilly regarding technical content and the limitations of current ebook devices:

In fact, we've had a prototype form of that capability for a while now, but we've always held back. Frankly, we didn't think the devices worked well with our kind of content. Basically, the .mobi format used by the Kindle is optimized for books that contain just galleys of text with the occasional heading. Throw in tables, monospaced code listings, sidebars and the like, and things start to get messy.

Dave's post has sparked a great conversation within the comments, including one from Shelley Powers, whose book Painting the Web was among those included in our pilot program:

I think that providing the package deal that O'Reilly does (with PDF, epub, and mobi), in addition to downloadable code is the way to go. If you sell Kindle books, you definitely need to make both your figures and your source available, separately. For instance, I have my Painting the Web figures in an online gallery and the examples are available at O'Reilly--takes care of a lot of issues related to Kindle. Another approach could be to make available (for no additional cost) a PDF of just the figures, or the figures and code.

Preparing a book for the ebook market may seem like a lot of work, but you have the potential to reach a new audience of book buyers. Buyers used to the internet and having access to immediate information; who may not want to order a book and wait a week for it to arrive, but who will buy a book if it means they can have access to it now. I wouldn't have considered myself an "impulse buyer" when it comes to books, but I have probably at least a dozen books I bought because the ebook format was cheaper (that's a key element), and I could get the book _right now_.

On one hand, merely working to replicate a print experience isn't the right way to exploit the benefits of the new platform; on the other hand, publishers (and as usual, I use that term quite loosely) should be able to expect at least minimal rendering of common elements like tables, along with support for at least the same core 14 fonts available in Acrobat (speaking of fonts, if you're looking for a laugh check out this mock "font conference").

POD Opens Door to Magazine Experiments and Customization

MagCloud is a new print-on-demand (POD) service targeting the magazine industry. In the following Q&A, MagCloud consultant Derek Powazek -- co-founder of JPG Magazine and founder of Fray -- discusses the utility of POD and the evolving relationship between print and Web content.

How did you get involved with MagCloud?

I came into the project over a year ago -- it had been percolating in HP Labs for a long time before that, led by Andy Fitzhugh, Udi Chatow, and Andrew Bolwell. Andy is the one who brought me in. We had this meet and greet lunch to talk about the future of publishing and it turned out we had the same vision. He kept saying, "Right, now push that further."

When did you first encounter POD?

Years ago, when Heather [Champ] and I were exploring ways to make a photography magazine, Lulu was really the only game in town. We learned so much creating JPG there, and starting with a POD service allowed us to experiment, develop the voice and vision of the magazine, and build an audience. I think it's a very natural way to start a magazine.

How did you gravitate toward a POD model for magazines?

It's all about the Giant Pile. I've worked on a lot of newspaper and magazine projects, and they all had one thing in common: A huge print run, followed by the slow, terrible realization that you've gotta get rid of all that paper.

POD banishes the Giant Pile to the dustbin of history where it belongs. Because, with a POD system, you don't print it until somebody wants it. It avoids the pile. It avoids creating trash (70 percent of all magazines are never bought). It brings some of the elegance of the Internet to this very old industry.

But mostly it was just a financial decision. Heather and I weren't out to become publishing magnates. We just had an idea that we thought people would like. We wouldn't have been able to do it at all if not for POD.

What types of magazine publishers (large, small, individuals, etc.) are best suited for MagCloud?

I think that magazines are about nurturing a community. If you look at the most successful magazines (Rolling Stone in the '60s, Wired in the '90s, Make now), they've always been the ones that surfed the zeitgeist. They found a growing community of people and reflected it, and in that reflection, began to lead it for a time.

But if you tell people in the publishing industry that they're really in the community business, they'll say "shut up, hippy" and go back to monetizing their audience metrics.

So the trick is to find those niche audiences that need a voice. And there are a lot of them. And the truth is, they know who they are better than we do. So, with MagCloud, the idea is to open up the tools so that those communities can create their own magazines. We think they're going to make amazing things.

Do you see larger magazine publishers eventually moving to POD, or will this be a niche option?

Not only do I think that large magazine publishers will move to digital printing, but I think that the idea that we used to print millions of things that were exactly the same will someday be seen as a cute historical artifact. "You mean every copy of this magazine was the same for everyone, Grandpa? Weird!"

For the biggies, it's just a matter of economics. As soon as the price per page for printing on digital is cheaper than traditional offset printing, the biggies will move. The quality of POD is already the same as or better than offset.

It'll start with smaller publications because they're the most agile, and they don't see the real price savings of scale anyway. Right now, if you're printing a few thousand copies, digital printing is the same cost as traditional offset. (I've been wrestling with this for Fray.com -- we're right at the cusp. Our first issue was printed via traditional offset, but issue two will be printed with MagCloud.)

And once magazines move to POD, they'll realize it opens up opportunities they never had before. When you can really tailor each issue for each subscriber, what will you do? Exciting, huh?

Book publishers often focus on the short-term elements of POD, most notably POD's higher cost per page. Some industry folks try to cite the long-range benefits, such as efficiency, higher retail prices via customization, etc., but the per-page discrepancy continues to be a sticking point. Have you encountered similar obstacles on the magazine side?

Magazines are a better fit for POD because, unlike books, they're usually all color and timeliness is much more of a factor. Plus, the price per page for digital print is falling fast, while the price per page of traditional offset has remained very steady. Still, the exciting part is all the opportunities digital printing enables. Ultimately, POD services like MagCloud will enable a degree of customization that is not only cheaper, but just plain impossible to do via traditional means.

Beyond strict numbers, what do you see as the upside to print editions? Does a print product carry a higher level of esteem for a writer or consumer?

I love the Web. I think it's still a publisher's dream come true. But, inconveniently, we humans are still real world creatures. And no matter how much connectivity blankets the planet, and how good our devices get, there will still be a role for print.

I don't say this because I'm some ancient technology fetishist. I don't own a tube amp. I sold all my CDs. It's just that print is a really good delivery mechanism for some kinds of experiences. Reading a physical magazine is a different experience than surfing hypertext online.

And, yes, I think the scarcity of print does give it a higher level of importance for its creators and consumers. On the Web, where every page is just a click away from any other, there's no relative importance communicated. But in a magazine, you know that a team of writers and editors picked this story to go here. That has a profound effect on how that media is consumed.

The Media Industry's Perspective Problem

A newsroom survey conducted by the Pew Research Center's Project for Excellence in Journalism touches on one of the major issues -- and failings -- affecting mainstream media: the power of flawed perspective. Here's an excerpt from "The Changing Newsroom" report:

Staffing for coverage of sports, local government and politics, police and investigative reporting, all grew in 30% of the newsrooms surveyed. Although not specifically measured in the survey, anecdotal evidence suggests that at least some of these gains have been driven by pressure to provide web content during the course of the day. Some of this content is often then "reversed published" back into the newspaper. [Emphasis added.]

There's a huge difference between "published" and "reversed published." A published piece of content -- be it an article, a podcast, a broadcast, or even a book -- is pushed into the world with a clear intent (inform, entertain, influence, etc.). But reversed published content has been stripped of intent. Its sole purpose is to fill space; whether it entertains, informs, or influences is secondary.

The whole concept of "reversed published," and the adjacent issues of print vs Web vs mobile vs broadcast, illustrates a fundamental flaw in the media perspective. Content should be defined by its audience, not by its container. If an article is initially published on the Web, that article must be geared toward the Web audience. If the same material later appears in the paper, that material needs to be geared toward the newspaper audience. Same goes for mobile consumers and broadcast consumers.

Repurposing material without regard for its audience is a luxury the media industry used to enjoy when it was a primary information conduit. The only difference is that years ago the Web was where rehashed shovelware was dumped ("Story continues on A12", anyone?). Early Web users quickly tired of media's detritus, so they looked elsewhere for useful information. Apparently, media organizations didn't learn from this past mistake because now they're pulling the "repurposed content" maneuver with traditional audiences. No one wants rehashed bits.

This is where perspective comes in. If a media organization continues to think in terms of content containers rather than content consumers, then it will inevitably default to "reverse publishing" and other bad habits. These days, as audiences scatter and company valuations plummet, every piece of content needs the justifications and intentions of fully published material.

Cloud Computing's Potential Impact on Publishing

If you use Google Docs or access email via a Web browser, you're already versed in cloud computing. Access to Web-based material is taking the place of downloads.

In the early going, cloud computing focused on software as a service (SaaS) applications, but Amazon, Netflix, Google, Apple, Microsoft and others are now tapping the cloud for content delivery (some of these companies focus on streaming entertainment, while others focus on content creation/management).

An interesting conversation about the cloud's impact on content publishers popped up recently on Peter Brantley's Read 20 list. Peter, by way of an article link, noted that Amazon is moving some of its video distribution business into the cloud. From Last100:

Not only is Amazon utilizing streaming in order to deliver "instant" playback but it also means that content doesn't have to be permanently stored on a user's hard drive. As a result, Amazon is able to offer another potential benefit to customers: a virtual video library of previously purchased content, stored in the 'cloud' (on the company's own servers) ready to be streamed as many times and to as many compatible devices as the user has access to. While this will initially consist of PCs running Mac OSX or Windows, along with select TVs from Sony, in the future this could extend to many different devices, either through specific partnerships like the one currently forged with Sony, or by utilizing browser-based standards or any other technology or protocol Amazon chooses to support.

Expanding on Peter's post, Mike Shatzkin said the centralization of cloud-based content raises issues around digital rights management (DRM) and other access limits:

The cloud changes everything in terms of piracy and copyright. We are living in a transitional period where computer storage is decentralized. When that period is over, and the time is now not far off, everything is accessed from the cloud and it will be a relatively easy matter for rules about content access to be enforced by the content originator or distributor.

As others on the Read 20 list pointed out, cloud computing brings up additional questions around copyright and ownership. Toss in concerns about system reliability, open vs. closed clouds, and the potential for lock-in (or lock out) and you can see this rabbit hole growing deeper.

Cloud adoption may also represent an important moment in book publishing's digital transition. Publishers have so far enjoyed the luxury of learning digital lessons from the media, music and film industries, but the wait-and-see approach may not work this time. If consumers come to expect access to their content -- all their content -- anywhere/anytime, publishers will need to meet that expectation ... or risk watching an unaffiliated company or industry step in.

Open Question: Should Publishers Develop Software Apps?

Book publishing's response (or lack thereof) to the iPhone 3G and the App Store has stirred up an interesting question around publishing and software development: namely, should publishers create their own software applications?

Sara Lloyd from thedigitalist says a focus on content, not software, is key:

Interestingly the price of apps [in Apple's store] is already plummeting as free apps get more highly and more frequently rated and the paid-for apps drop down the ratings. Perhaps this suggests even more strongly that the App is not The Thing; it is merely a container or a channel for the content, which will still be The Thing.

On the other side, James Bridle from booktwo.org says publishers are the natural source for e-reader apps:

Most ereader technologies are built by techies who put the technology before the reading experience: the combined skills of typesetters, print designers, editors and technologists that only publishers possess could, with the right direction, produce a far superior ereader app than any we've seen so far.

What's your take? Should book publishers move into the software domain? Please post your thoughts in the comments area.

Survey of Book Industry Reaction to New iPhone and App Store

Kassia Krozser struck a nerve earlier this week with criticism of the publishing industry's slow approach to the new iPhone and the just-opened App Store. From Booksquare:

Call me crazy, but I'd expect an industry that salivates over moving 150,000 units to be all over the potential for reaching seven million "mobile is the future" customers. Are you not out there, listening to readers, gauging their interest? They want, you have, and you're still hiding the goods. I get this isn't the largest market you have, but is that an excuse to sit on the sidelines?

Sara Lloyd doesn't see long-term value in this current burst of iPhone excitement. From thedigitalist:

... apart from a few digital PR points scored against competing publishers, there doesn't seem to me to be any huge value in first mover advantage here for publishers, unless we want to make the decision to become software developers. The perception is that the App Store has 'opened up' the iPhone to publishers and to e-reading. The reality is that the iPhone has always been enabled for e-reading ... So, whilst we have been awaiting the launch of the App Store with interest, we didn't see enormous advantage in, for example, creating a reading app ourselves or Being There on Day One, just for the sake of it.

Expanding on the software theme, James Bridle says book publishers are uniquely positioned to develop ebook applications that meet consumer needs. From booktwo.org:

... who better than publishers to craft such software? Most ereader technologies are built by techies who put the technology before the reading experience: the combined skills of typesetters, print designers, editors and technologists that only publishers possess could, with the right direction, produce a far superior ereader app than any we've seen so far.

Broadening the analysis, Michael Cairns says the "silo" mentality displayed in this iPhone debate is a competitive obstacle that needs to be put aside. From PersonaNonData:

To bring us back to the iPhone circumstance, as long as publishers continue to think in terms of traditional functional silos and roles and responsibilities they limit themselves in their ability to leverage their assets. In contrast witness Amazon which has never considered any aspect of the publishing value chain to be off limits and more publishers need to think in this manner if they want to redress some of the advantages Amazon and others retain (or new competitors develop) in the marketplace.

(Many of the links and call-outs in this post were provided by Peter Brantley via his Read 20 list.)

"Lost" Builds Community through Book Club and Web Games

Producers of ABC's "Lost" often sneak books into the fabric of episodes so die-hard fans can hunt for clues (or red herrings) in external literary sources. Seeing an opportunity, ABC is launching the official "Lost Book Club" through ABC.com and iTunes. From UPI:

Also available on ABC.com will be a message board to discuss the titles, a synopsis of each book, along with when and how it was referenced in the show, and an introduction by co-creator/executive producer Damon Lindelof and executive producer Carlton Cuse, ABC said.

Two years ago, Hyperion published Bad Twin, a book "written" by one of the passengers on "Lost's" ill-fated flight Oceanic 815 (if you're a fan of the show, you'll recognize the author as the guy who got sucked into the engine moments after 815 crashed).

Response to Bad Twin was tepid, but the universe beyond "Lost" episodes has been successfully mined through a number of intricate alternate reality games that reveal clues about the show's secondary mysteries. Speaking as a full-fledged "Lost" junkie myself, I know of a number of folks who spent dozens of hours playing these games.

Book publishers with mythology-laden source material may want to take a note from "Lost," "Harry Potter," "Star Wars" and other series. These franchises create organic affinity communities that thrive on interactivity and story expansion, and they can be fostered through forums, social networks, and real-world meetups at related events. Outside observers and casual viewers may not understand the impulse to dress like Boba Fett or write "Lost" fan fiction, but the ardent enthusiasm of a dedicated community presents opportunities that should not be tossed off.

(Via Publishers Weekly)

Open Question: Do You Use Twitter?

Mediabistro recently conducted an informal round-up of publishers and authors who use Twitter to publicize titles and interact with readers. Within TOC, we use Twitter (plug: follow us here) to exchange quick bursts of information and story ideas, and we've also found it to be a surprisingly effective beat coverage tool -- breaking stories and new memes often appear on Twitter before they hit the blogosphere and mainstream media outlets.

This anecdotal evidence suggests Twitter is gaining steam in the publishing world, but is that really the case? Are you using Twitter? Have you even heard of Twitter? Please share your thoughts in the comment area.

Google Book Search: It's All About the Index

Adam Hodgkin says the grand design of Google Book Search is aimed at creating a massive index, not an all-encompassing, locked-down reading system. From Exact Editions:

Some of Google's critics suppose that the aim of the GBS [Google Book Search] project is to capture, corale [sic] and deliver to readers the whole of the world's literature in a readable format. But perhaps the business goal has all along been to produce a complete searchable index of literature, not the monopolistic reading medium. [Bold text included in original post.]

Google's initiatives have always focused on the creation and expansion of digital content platforms rather than individual content products. The public's mistaken focus on Google products -- rather than Google platforms -- was noted in Wired's recent story about Google's mobile project, Android:

Those hoping for a new gadget to rival the iPhone finally understood that Google had something radically different in mind. Apple's device was an end in itself -- a self-contained, jewel-like masterpiece locked in a sleek protective shell. Android was a means, a seed intended to grow an entire new wireless family tree. Google was never in the hardware business. There would be no gPhone -- instead, there would be hundreds of gPhones.

I can understand where the confusion comes from: The creation of a gargantuan reading service seems to be in Google's wheelhouse because they're one of the few companies that can actually attempt such a project. But as we learned with Android, Google isn't a product-centric company -- all those individual tools and services plug into bigger platforms. Development of a searchable full-text book index that can be distributed across all sorts of devices is more in line with Google's history and its focus.

Calling Google a Publisher Underestimates its Platform

Google has never positioned itself as a publisher, but a recent News.com piece looking at Google's role in Web advertising says the company's 2006 YouTube acquisition moved Google into the publishing space:

Google itself is a publisher, at least in one sense: it offers countless videos through [its] YouTube service. So Google has more incentive than just its DoubleClick division to improve display advertising.

YouTube is certainly content-centric, but Google didn't pay $1.65 billion for all those videos. It shelled out big bucks for YouTube's audience and, more importantly, its platform.

Publishers tend to see the world through singular products -- books, newspapers, magazines, Web sites -- but platform companies, like Google, see these same products as an aggregated stream of general content that needs to be delivered. If you control the delivery mechanism, you can mine it for revenue -- something Google has already done through its AdSense and AdWords programs, which piggyback on Google's search tools to deliver contextual advertising. Now that Google has monetized and claimed the Web search market, the company is expanding its platform into harder-to-crack content spheres: books, TV, and radio. This is why Google Book Search isn't just an archive. It's a content pipe that plugs into Google's overall architecture.

Google clearly recognizes that its platform is only effective if it serves up useful material, as illustrated in this passage from the same News.com piece:

People are consuming more and more media on the Internet but paying less and less, [Google Chief Exec Eric] Schmidt said. "That's bad for Google. We are critically dependent on high-quality content," he said.

Publishers are experts at producing the content Google needs, but incorrectly labeling Google a publisher -- and, ostensibly, a competitor -- obscures the essential relationship between Google and actual publishers.

So, in an effort to keep publishers on target in the platform discussion, here are a few top-level items to consider:

Identify the platforms -- Platform companies are focused on distribution, both through their own Web properties and via underlying delivery technologies. They may own popular Web sites that generate revenue through some forms of content (e.g. YouTube), but their real interest lies in aggregating and disseminating material. Google is the big platform provider, but Facebook and Amazon are both making moves into the platform arena. Even if you ultimately dismiss a particular company, it's still important to competitively -- and correctly -- identify its platform moves.

Consider how your content can be delivered through available platforms -- Look at user patterns. Ask yourself: How do people use these platforms to find and consume content? How are other companies effectively delivering their material? The newspaper industry offers an important case study for this point: It initially relied on subscription models for its Web content, but in recent years many papers have removed subscription restrictions so each article can be discovered -- and mined for ad revenue -- through Google and other search platforms. The industry is finally working with user behavior, not against it.

Look for revenue streams -- We've recently harped on the importance of tie-backs and analytics in digital experiments, and those same warnings apply here as well. If you're going to distribute your material through a platform, you need to have revenue streams in mind. This could take the form of advertising, affiliate relationships, trialware, or links/call-outs to upsell products. It could also be part of a larger branding campaign.

Add open formats to the production process -- Google is a massive platform player, but the Internet's open and distributed infrastructure allows other companies to develop their own platforms. Publishers looking for platform-friendly positioning can take advantage of future platforms -- including those not yet envisioned -- by incorporating open formats (XML, HTML, RSS, EPUB, etc.) into their production processes. There's no reason to gamble on proprietary formats and exclusivity because the big platforms, and the smart platform companies, will use methods that have already been adopted by the widest possible audience. And if a closed format does reach critical mass (iTunes and AAC, for example), commonly used open formats will be incorporated into conversion tools and projects.

These general points require deeper contextualization for particular companies and initiatives, and the business threats presented by large platform companies need to be rationally examined and acknowledged (particularly, centralization and lock-in). Nonetheless, publishers need to recognize that misrepresentations are where the real threat lies. Incorrect platform assumptions limit the significant opportunities.

Q&A with Susan Danziger, CEO of DailyLit

DailyLit is a digital service that delivers short, scheduled book installments to subscribers by email and RSS. The company offers free and pay-per-read titles in plain text, which makes them accessible through nearly all email clients, browsers or mobile devices. In the following Q&A, DailyLit CEO Susan Danziger discusses the company's philosophy, process, and upcoming services.

How many titles do you offer through DailyLit? How many do you hope to have by the end of 2008?

We currently have over 950 titles (450 or so of which are available on a pay-per-read basis), and by the end of this year, we're targeting several thousand pay-per-read titles.

Releasing titles in plain text seems like a simple way to avoid the formatting needs and device restrictions that come with proprietary ebook formats. Was this your intention, or was plain text just an easier way to get started?

It was definitely our intention to allow the installments to come in on any device, which is an important part of how we designed the experience. We started with plain text because it was the easiest to implement, and we will be launching HTML shortly as well.

Do most DailyLit users read installments on mobile devices?

10%-20% of our readers currently read their installments on mobile devices, but as the reading quality on mobile devices improves (the iPhone is a great start), we're confident that more and more people will be reading their installments on these devices.

Can readers purchase print editions or ebooks through the site/service?

At this point, only DailyLit editions are available. We're starting to allow publishers and others to sponsor certain titles, which would allow a link to purchase other editions. With this sponsorship model, sponsors would pay for the titles instead of readers.

How will the sponsorship model work? Also, in regards to "other editions," are you only referring to printed editions, or does this include different ebook formats as well?

DailyLit readers would have access to free DailyLit versions of the books under the sponsorship model. Sponsors of titles would be able to include links that would lead to their sites (or other sites that sponsors indicate). "Other editions" could be printed editions or other digital formats.

How many DailyLit users receive updates via email? How many via RSS?

About 90% of our readers receive installments via e-mail; 10% via RSS.

How much time goes into prepping books for delivery? Is production handled in-house?

The production time depends on the format in which the book is delivered. If the book is delivered in PDF, the production time can be up to eight weeks. We prefer it if books are delivered in EPUB or XHTML, which greatly reduces the production time, not to mention cost. Production is handled in-house for certain titles, but for most titles we use an outside production house.

How are installments defined? Is it by word count? Average reading time?

Installments are usually around 1,000 words, which is under five minutes of reading. If a chapter is about to end, we'll adjust the length of the installments accordingly. Certain books, such as books of quotes, have much shorter installments. Under the "Manage Your Subscriptions" feature, folks can personally adjust the length of each installment (to 2 times or 4 times the length), so an avid reader can read more.
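The splitting rule described in this answer can be sketched in a few lines of Python. This is only an illustration of the stated behavior (roughly 1,000-word installments, stretched when a chapter is about to end, with an optional 2x or 4x reader setting), not DailyLit's actual implementation, which the interview does not describe. The function name, the `slack` threshold for what counts as "about to end," and the choice to split on whitespace are all assumptions made for the example.

```python
def split_installments(chapters, target_words=1000, slack=0.25, multiplier=1):
    """Split a book into installments of roughly `target_words` words.

    chapters   -- list of strings, one per chapter (hypothetical input shape)
    slack      -- assumed threshold: if the words left in a chapter after a
                  full installment are at most slack * installment size, they
                  are folded into that installment so the chapter ends cleanly
    multiplier -- reader setting; the interview mentions 2x and 4x lengths
    """
    size = target_words * multiplier
    installments = []
    for chapter in chapters:
        # Naive whitespace tokenization; paragraph breaks are not preserved.
        words = chapter.split()
        start = 0
        while start < len(words):
            end = start + size
            remaining = len(words) - end
            # Chapter is about to end: absorb the short tail instead of
            # sending a tiny leftover installment.
            if remaining <= size * slack:
                end = len(words)
            installments.append(" ".join(words[start:end]))
            start = end
    return installments


# Example: a 2,200-word chapter followed by a 500-word chapter.
book = [" ".join(["word"] * 2200), " ".join(["word"] * 500)]
print([len(part.split()) for part in split_installments(book)])
# prints [1000, 1200, 500]
```

Installments never cross a chapter boundary in this sketch, which is one plausible reading of "if a chapter is about to end, we'll adjust the length." Very short works, such as the books of quotes mentioned above, would need a smaller `target_words`.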

Are fiction titles the easiest to serialize, or does any chapter-based book work?

Fiction titles are probably easier to serialize since they're more straightforward. With non-fiction titles, we need to account for footnotes and other ancillary materials. That said, we're featuring titles from all different genres, from science fiction, such as books by Cory Doctorow, to such non-fiction best-sellers as Skinny Bitch. We also feature language books, such as titles from Berlitz, business books, as well as romance titles from Harlequin.

What types of books don't lend themselves to serialization?

Reference books that readers do not want to read cover to cover don't work in serialized form. Apart from that, since DailyLit is intended for those readers too busy to read (or who want to sneak in an extra book during the day), any other kind of book works well. After all, folks are avidly reading War and Peace, Moby Dick, The Art of War and Pride and Prejudice, and none of these books were originally intended to be serialized.

Who sets the pricing for titles?

Since we've structured this as a licensing deal with publishers, DailyLit sets the price.

Are you licensing a specific version of a book (i.e. "text-only" or a particular ebook format)?

We characterize it as "digital serialization rights" so it's a combination of serialization (typically understood as a license) and a digital rendition of the book. Depending on rights available for the title, we might license text only or with illustrations/photographs.

How have publishers responded to DailyLit?

We've had a great response from publishers. On the whole, they've been really excited about this new format, which combines marketing and potential incremental revenue. We've also been developing innovative technology -- several initiatives will be rolled out shortly -- which will help the publishers market their titles and expand their reader base.

What sorts of tools will you be releasing?

One such tool is public subscriptions, which will allow publishers, authors or third parties to serialize a book publicly on their site. Each day on that site, folks will be able to view a new installment of a book. This is a way to build community on their site and would be an alternative to giving away free PDFs of books. We'll also offer readers the opportunity to receive a personal e-mail or RSS subscription to that title if they don't want to return to the site each day, but for that they [consumers] would need to pay. As such, it's a neat viral marketing tool as well as having potential for incremental revenue.

Do you use digital rights management (DRM) on titles?

We put the reader's experience first, which means that there are no attachments or files that need to be opened with a special device or software. With respect to illustrations or photographs, we are able to track where they go and, in the event of a hot link, we can disable use of an illustration associated with a particular subscription.

Have you run into any piracy issues? Is this a concern?

We haven't run into any piracy issues. Since books are divided into hundreds of installments, there is less concern that individual installments are copied or forwarded. In fact, any installments forwarded by readers have been viewed by publishers and authors as a way to virally market their titles.

In addition to books, you feature Wikipedia tours, language lessons and SAT prep. Are other non-book projects in the works? Where do you see DailyLit expanding?

We're in the process of adding newly created titles for DailyLit, including allowing authors and publishers to create content that works well in the serialized format. We're also developing lots of interesting technology to help market books and expand the current reach to additional readers. For instance, we recently launched via Twitter a group read or virtual book club so that folks can read books according to the same schedule. Folks can sign up now to participate.

Release Early, Release Often: Agile Software Development in Publishing

"How do Web startups release three or four new versions of a product in the time it takes publishers to launch just one new feature on their online platforms?"

This question framed "The Agile IT Organization," a lively and well-informed discussion at the recent Society for Scholarly Publishing annual conference in Boston. As a software engineer, I've used both agile and traditional product development methodologies and I was interested to hear the perspectives of other programmers as well as publishers who've gone through the process.

Geoffrey Bilder of CrossRef provided an introduction to agile development practices, which are concisely summarized in plain English by a core set of principles.

Summarizing even further, agile development means:

  1. Minimal up-front specification. A project has high-level goals (e.g. "make our back catalog searchable and available for print-on-demand purchase"), but is not fully described before development begins.
  2. Frequent, short-cycle releases. A project is broken up into mini-projects, each with a small set of features that take only a few weeks to implement. Every release ("iteration") has a specification, development and testing phase. This means that every couple of weeks the software is fully usable, although it may have very few features at the start.
  3. Change to the product design is accommodated and even expected. Market conditions, corporate re-organization or user demands may mean that new features are added or old ones are re-worked. Changes are treated as just another iteration.

The panel at SSP focused on two approaches: internal, IT-driven products, and those developed by a third-party vendor. Larry Belmont, manager of online development at the American Institute of Physics, gave an excellent presentation on the in-house approach. His organization ran its first agile project with a timeline measured in days rather than weeks or months.

Leigh Dodds, CTO of Ingenta, provided the vendor perspective, and described the principles of a formal type of agile development known as Scrum.

The panel was, to their credit, enthusiastic about the approach, but agile development requires commitment and is not right for every organization or project. Some caveats that need to be emphasized:

  • Short development cycles come with a price: you will be asked to review and comment on small pieces of the larger project, and be involved on an almost daily basis. Many publishers need vendors they can treat like plumbers: "I want a new sink put here, it should look like this, call me when it's done." If someone in your organization isn't prepared to think very hard every day about copper pipe fittings, agile isn't right for you.
  • Project managers must be empowered to make decisions. Whether the project is in-house or vendor-driven, every day the PM will be asked to make calls without appealing to higher powers. When editorial buy-in is required, or when the product needs a larger review, consider a hybrid approach: appoint a single decision-maker with deep editorial knowledge to work on evaluating, testing and approving each iteration, but use a more traditional alpha/beta/gold release process for the wider group.
  • Product features may change, but time and budget should be invariant. Hard deadlines might seem to be antithetical to the free-wheeling, change-friendly agile approach, but in my experience they're critical. They focus the entire team: key decision-makers cannot spend weeks in committee, IT personnel don't fear the "death march" project with no end in sight, and it's more difficult to introduce budget overruns that cause friction with management and vendors. If an agile project does run out of time, you will still have a launchable product that's been thoroughly tested and reviewed all the way down the line, not something just out of beta with weeks of QA ahead. Many agile methodologies use the hard deadline, or timebox, as the primary method of structuring the project.

"Release early, release often" can sound a lot like "throw whatever we've got out the door." This is one reason why the iterative approach has been so embraced by Web startups: each small release has been thoroughly tested and evaluated, and there's never a moment where the software doesn't work. It's possible to go live with a project that might not be "finished" according to the original master plan, but might otherwise be caught up in insurmountable technical hurdles or tied up in editorial review.

If publishers are going to be ready for an "iPod moment," this kind of flexibility and responsiveness is critical.
