Recently by Liza Daly

How to Read any Type of Document on the Kindle (Almost)

There are a few options for readers who want to convert PDFs or other non-supported files to the Kindle's AZW format. Amazon's recommended method is to email the file to your personal Kindle email address. It's also possible for users to convert PDFs and other document types themselves using Mobipocket Creator or Stanza.

All of the above methods have the same flaw: AZW does not support the kind of advanced layout available in formats like PDF, and non-Latin fonts aren't easy to convert. What if you need to review a complex legal form, read a graphic novel, or read a document in Chinese? A hidden feature can help.

The Kindle has an undocumented picture-viewing mode that was first uncovered by Igor Skochinsky. Although the black and white E Ink screen is not especially good at displaying actual photographs, it is quite good at rendering line art and text.

Here's how to do it, using PDF as an example. Note that unofficial features may be buggy and could damage your Kindle; proceed at your own risk.

  1. Convert the PDF to a series of images. Commercial versions of Acrobat should be able to do this in batch, but users of free readers may have to convert a page at a time. The Kindle can read JPEG, PNG and GIF; the latter two will work best. Because the picture-viewing application doesn't support a table of contents, you'll need to name the image files in ascending alphabetical or numeric order (e.g. "0001.jpg," "0002.jpg," etc.). For best results, resize each image to 600 x 800, the resolution of the Kindle screen.
  2. Connect the Kindle to your computer using the USB cable. Once connected, browse to the Kindle's drive. If you have an SD card installed, it will appear on your computer as well. The following procedure works on either the Kindle or the SD card. I prefer to do everything on the SD card -- it feels safer.
  3. Create a folder called "pictures," and a folder inside of that with the name of your "document." Put the images in the document folder. Disconnect the Kindle from the PC. When you go to the Kindle's home screen, nothing will have changed. This is where the secret feature comes in:
  4. Press Alt-Z from the home screen. Your book title should appear in the list.
  5. Click on the book title. It will open the first image. Use the normal Kindle next/previous buttons to page through the "book." The picture viewer has menu options of its own to control the size of the image and how it's rendered.
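
Steps 1 and 3 are easy to script. The following is a minimal sketch, not a tested Kindle tool: the function names and the `kindle_root` argument (the path where the Kindle or its SD card is mounted on your computer) are my own placeholders. It copies a list of page images, already in reading order, into pictures/<document> with zero-padded names, and separately works out the dimensions a page would need to fit the 600 x 800 screen. It does not resize anything itself, since that needs an imaging tool.

```python
import shutil
from pathlib import Path

KINDLE_SCREEN = (600, 800)  # width x height in pixels, as given in step 1
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def fit_within(width, height, box=KINDLE_SCREEN):
    """Return (w, h) scaled to fit inside box while keeping the aspect ratio."""
    scale = min(box[0] / width, box[1] / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def stage_document(page_images, kindle_root, document_name):
    """Copy pages to <kindle_root>/pictures/<document_name>/ as 0001.ext, 0002.ext, ...

    page_images must already be in reading order. Files are copied unchanged;
    only the names are rewritten so that they sort in page order.
    """
    target = Path(kindle_root) / "pictures" / document_name
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for number, source in enumerate(page_images, start=1):
        source = Path(source)
        suffix = source.suffix.lower()
        if suffix not in IMAGE_SUFFIXES:
            raise ValueError(f"unsupported image type: {source.name}")
        destination = target / f"{number:04d}{suffix}"
        shutil.copyfile(source, destination)
        copied.append(destination)
    return copied


# A US Letter page rendered at 850 x 1100 pixels would be scaled to:
print(fit_within(850, 1100))  # (600, 776)
```

To use it, pass the mounted drive as `kindle_root` (whatever path your operating system assigns), then disconnect the Kindle and continue with step 4.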

[Screenshot from the Kindle picture viewer. Credit: Octopus Pie]

Of course, because the "PDF" is really a series of images, it's not possible to search the document or rescale the fonts. Text-heavy PDFs should be converted in one of the recommended ways.

This same technique can be used to load image-based documents directly, such as comics. (Peeking inside the "pictures" folder after it's been read by the Kindle reveals a file with the extension "manga," suggesting that the picture viewer was intended to be used for this purpose.)

It's also possible to convert documents in Russian, Chinese or other non-Latin scripts this way. The Kindle does have support for embedded non-Latin fonts as part of its "Topaz" file format, but there are no tools for end-users that output Topaz.

(Screenshots courtesy the undocumented Alt-Shift-G feature, which saves to the root of the SD card.)

Optimizing Web Content for the Kindle Browser

Amazon's Kindle store is convenient, easy-to-use and stocked with thousands of titles. But what about publishers and content distributors who want to reach the estimated 240,000 Kindle users without going through Amazon's program? And what about content formats that the Kindle does not directly support?

One selling point of the device is its free, ubiquitous Internet service and Web browser. Amazon has filed the browser under "Experimental" but it's quite usable as-is. With a few simple changes to a Web site's HTML code, it's even possible to specially cater to Kindle users.

The screenshots used in this article are from the mobile version of Bookworm, my Web application for reading ebooks in the EPUB format. Although what's being displayed is ebook content, it's being delivered by the Kindle's browser, not the Kindle ebook technology, which does not yet support EPUB.

Because the mobile Web version is already heavily optimized for small devices, the layout is simpler than a traditional Web site. What works for an iPhone or other wireless device will also be a good starting point for the Kindle, although we'll see there are some special considerations that don't apply to any other device.

Default or Advanced Mode?

When the Kindle ships, its Web browser is in "default mode." It will not load images or CSS styles, but it does render basic HTML tags like the italic tag <i>. Personally, I prefer "advanced mode," which displays Web pages more like a traditional browser, but some sites can be unreadable in this mode.

When optimizing for the Kindle it's best to consider that most users will not change from "default mode," or even realize that the option exists.

How different are these modes? Here is a comparison shot of the same screen from Bookworm in both modes:

[Screenshot: My list of books in Advanced mode, showing tabular layout and more advanced font styles]
[Screenshot: My list of books in Default mode]

In Default mode, all the information about the books runs together. It would be better to present this as a simple vertical list, the way the Amazon Kindle store does, rather than as a table.

Font Size Considerations

You can choose from six font sizes in the Kindle browser. As a content creator, you can specify a wider range of font sizes in your Kindle-formatted Web page, but take care that they aren't too small. The device doesn't clearly display fonts smaller than the smallest of its six built-in sizes.

In this screenshot, the table of contents for a Bookworm book is not readable, even though this page has already been tailored for the small display of mobile phones:

[Screenshot: Bookworm table of contents in the Kindle browser]

This problem is only likely to occur in Advanced mode where stylesheets are activated.

Usability

The Kindle's method of selecting and traversing hyperlinks is unique. The user activates links by selecting along the vertical, or Y-axis, using the scroll wheel. When multiple links fall on the same line, the Kindle will open a dialog box so the user can clarify which link is the target.

In Bookworm, users move to the next or previous chapter by selecting navigation links lined up horizontally (see the top row of the first image). In the Kindle, this presentation forces the user to click a second time to select the appropriate one:

[Screenshot: Kindle dialog box for choosing among links on the same line]

For commonly-used navigational items like this, line up the links in a vertical row:

  1. Next
  2. Contents
  3. Previous

Now no second click (and accompanying page refresh) is necessary.
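
If your navigation is generated by code, the change can be very small. Here is a minimal sketch of the idea, assuming a plain line break is enough to put each link on its own row in the Kindle browser (I have not verified how every tag renders in Default mode). The function name and the URLs are made up for illustration.

```python
from html import escape


def render_vertical_nav(links):
    """Render (label, url) pairs as HTML with exactly one link per line."""
    lines = [
        f'<a href="{escape(url, quote=True)}">{escape(label)}</a><br>'
        for label, url in links
    ]
    return "\n".join(lines)


print(render_vertical_nav([
    ("Next", "/book/chapter-3"),
    ("Contents", "/book/toc"),
    ("Previous", "/book/chapter-1"),
]))
```

Because the layout relies on markup rather than a stylesheet, it should look the same whether or not CSS is loaded.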

It's also important to remember that the Kindle is a black-and-white device. If your site uses text color to convey any useful information (such as what is or is not a hyperlink), re-work the design to accommodate a grayscale display.

Finally, keep pages short. The Kindle cannot scroll; long Web pages are paginated like books. Pagination with E Ink devices is slow relative to scrolling on a computer screen. If possible, keep all your content on the first Kindle "page" when viewed at the default font size.

Targeting the Kindle

Web browsers are identified using their "user-agent" string. The current version of the Kindle is broadcasting this user-agent: Mozilla/4.0 (compatible; Linux 2.6.10) NetFront/3.3 Kindle/1.0 (screen 600x800). It's beyond the scope of this article to describe how to set up your Web site to deliver different kinds of content to different browsers, a process that varies considerably with your site's technology.
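
The detection step itself is simple, whatever your site runs on. As a rough sketch, assuming future Kindle browsers keep the same "Kindle/<version>" and "(screen WxH)" tokens (which Amazon has not promised), a helper like the following could pull out the pieces. The function name and the returned dictionary are my own invention.

```python
import re

KINDLE_TOKEN = re.compile(r"\bKindle/(\d+(?:\.\d+)*)")
SCREEN_TOKEN = re.compile(r"\(screen (\d+)x(\d+)\)")


def parse_kindle_user_agent(user_agent):
    """Return {'version', 'screen'} for a Kindle user-agent, or None otherwise."""
    match = KINDLE_TOKEN.search(user_agent or "")
    if match is None:
        return None
    screen = SCREEN_TOKEN.search(user_agent)
    size = (int(screen.group(1)), int(screen.group(2))) if screen else None
    return {"version": match.group(1), "screen": size}


ua = "Mozilla/4.0 (compatible; Linux 2.6.10) NetFront/3.3 Kindle/1.0 (screen 600x800)"
print(parse_kindle_user_agent(ua))  # {'version': '1.0', 'screen': (600, 800)}
```

A request handler could call this on the incoming User-Agent header and choose the simplified template when the result is not None.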

How do you test your layout if you don't have a Kindle? There's no substitute for having the real device (tell your boss it's for "research"), and currently Amazon does not offer any kind of browser emulator. Some possibilities:

  1. Disable stylesheets on your browser and look at the output. Does it still make sense? (Instructions for disabling stylesheets; Firefox users should install the Web Developer add-on)
  2. Use a text-only browser like Lynx
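
A third possibility is to strip your own pages down before looking at them. The sketch below is only an approximation of Default mode as described above (no images, no CSS), not an emulator: it removes style elements, stylesheet links, inline style attributes and images, and passes everything else through. The class and function names are hypothetical, and comments and doctype declarations are dropped as a side effect.

```python
from html import escape
from html.parser import HTMLParser


class DefaultModePreview(HTMLParser):
    """Re-emit HTML without stylesheets, inline styles or images."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []
        self._in_style = 0

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style += 1
            return
        rel = (dict(attrs).get("rel") or "").lower().split()
        if tag == "img" or (tag == "link" and "stylesheet" in rel):
            return
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name != "style"
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>; no separate end tag is emitted.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = max(0, self._in_style - 1)
        elif tag != "img":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._in_style:
            self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_for_default_mode(html_text):
    """Return html_text with CSS and images removed."""
    parser = DefaultModePreview()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Save the output of `strip_for_default_mode` to a file and open it in any desktop browser to see whether the page still reads sensibly.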

Some Last Advice

Don't spend too much time on this process. The next version of the Kindle is expected soon, no doubt with an improved browser. Indeed, Amazon could offer a new version of the existing browser at any time. Most of the changes recommended above should take little time and money to implement, and can make a great difference in user experience.

In addition, optimizing your site for small-screen browsers has other benefits: it gives a growing number of mobile users quick access to your content, and it aids accessibility for screen readers and other non-standard browser types.

Processing the Deep Backlist at the New York Times

At the O'Reilly Open Source Convention (OSCON), Derek Gottfrid of the New York Times led a fascinating session on how the Times was able to utilize Amazon's cloud computing services to quickly and cheaply get their huge historical archive online and freely viewable to the public.

How big is the archive? Eleven million individual articles from 1851 to 1980, or 4 terabytes of data (over 4,000 gigabytes). The Times got it ready for distribution in 24 hours, for a total cost of $240 in computing fees and $650 in storage fees.

As part of their original TimesSelect subscription service, the paper had scanned their entire print archive. Each full-page scan was cut into individual articles. Typical of newspaper format, the articles often spanned column or page boundaries, which meant that many articles were composed of several scans. In the original subscription-based program, whenever a reader requested one of these historical articles, the Times computer would need to stitch together all of the scans for a particular article before presenting it.

This on-demand process used significant computing resources, but because TimesSelect was subscription-based there was never much traffic. Once this archive was open to the public it was expected to generate greater usage, and the safest approach in those cases is to serve pre-generated versions of all 11 million articles. Using traditional software development practices -- with a single computer churning through one article at a time -- the processing could potentially take weeks and tie up Times servers that were needed for other tasks.

Gottfrid turned to Amazon Web Services (AWS) and its two main products:

Amazon Elastic Compute Cloud (EC2) is a form of "virtualization" where one very large computer is divided up into many virtual computers that can be individually leased out for use. Traditional hosting costs money whether the server is working or idle; in EC2 you pay only as long as the virtual computer is running. When it's no longer needed, it's shut down. This makes the service ideal for one-off processing jobs.

In addition, Amazon doesn't care whether you use one EC2 "instance" 100 times, or 100 instances all at once -- the cost is the same. The difference is when you can usefully divide a job into 100 concurrent tasks, because then it takes 1/100th the total time.
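
To make the trade-off concrete, here is a small back-of-the-envelope model. It is a simplification of my own, not Amazon's billing logic: it assumes the job splits into independent, equal-length tasks and that each instance is shut down the moment it runs out of work.

```python
import math


def billed_and_elapsed(tasks, hours_per_task, instances):
    """Return (instance_hours_billed, wall_clock_hours) for independent tasks.

    Assumes each instance runs one task at a time and is shut down as soon
    as it has no more work, so idle time is never billed.
    """
    billed = tasks * hours_per_task
    rounds = math.ceil(tasks / instances)  # task count on the busiest instance
    return billed, rounds * hours_per_task


one_at_a_time = billed_and_elapsed(tasks=100, hours_per_task=1, instances=1)
all_at_once = billed_and_elapsed(tasks=100, hours_per_task=1, instances=100)
print(one_at_a_time)  # (100, 100)
print(all_at_once)    # (100, 1)
```

The billed hours are identical in both cases; only the wall-clock time changes.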

Amazon's other major AWS offering is the Simple Storage Service (S3), for large-scale file hosting. Like EC2, it is a leased model -- you pay only for the space that you use in a given time period.

Gottfrid leveraged these technologies in combination with a relatively new software library called Hadoop. Hadoop is written in the Java language and is based on work done at Google. It allows programmers to very easily write programs that can be run simultaneously on multiple computers.
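
The underlying pattern is to apply the same function to many independent inputs at once. The sketch below is not Hadoop and not the Times' code; it uses Python's standard library worker pool on a single machine purely to show the shape of such a job. The `stitch_article` function is a stand-in that joins text fragments, where the real job would combine image scans.

```python
from concurrent.futures import ThreadPoolExecutor


def stitch_article(article):
    """Stand-in for the real work: join one article's fragments in order."""
    article_id, fragments = article
    return article_id, "".join(fragments)


def process_archive(articles, workers=4):
    """Apply stitch_article to every article using a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(stitch_article, articles))


archive = [
    ("1851-09-18-a", ["column one, ", "column two"]),
    ("1851-09-18-b", ["a single scan"]),
]
print(process_archive(archive))
```

Because no article depends on any other, the work can be spread over as many workers as are available, which is the property that a cluster framework exploits across many machines.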

Combining Hadoop concurrency with EC2 and S3, the Times was able to run a job that might have taken weeks of processing time and complete it in 24 hours, using 100 EC2 instances. They were pleased enough with S3 that it became their permanent hosting platform for the scans. Hosting with Amazon or other cloud computing services is usually cheaper and has much better bandwidth than the average provider, although downtime can and does occur.
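
The reported figures allow a quick sanity check. The arithmetic below assumes that all 100 instances ran for the full 24 hours, which the session did not state explicitly, so treat the derived rates as rough estimates rather than published prices.

```python
instances = 100
hours = 24
compute_cost = 240.00      # dollars, as reported
articles = 11_000_000      # as reported

instance_hours = instances * hours            # 2400
implied_rate = compute_cost / instance_hours  # dollars per instance-hour
articles_per_instance_hour = articles / instance_hours
cost_per_thousand_articles = compute_cost / articles * 1000

print(f"{instance_hours} instance-hours")
print(f"${implied_rate:.2f} per instance-hour")
print(f"{articles_per_instance_hour:.0f} articles per instance-hour")
print(f"${cost_per_thousand_articles:.3f} per thousand articles")
```

Under that assumption, the compute bill works out to about ten cents per instance-hour and roughly two cents per thousand articles.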

At last year's OSCON, the Times announced the formation of its developer blog, Open. You can read more about the original AWS project as well as TimesMachine, a project that became economically feasible due to the low cost of AWS.

ALA 2008: Librarians and Patrons Want More Openness

At this year's American Library Association (ALA) conference in Anaheim, Calif., one theme emerged in talk after talk: librarians and the readers they serve demand more flexibility, transparency and openness in publishers' offerings. This affects not just digital-only reference works, but also book acquisition via library catalogs and standalone ebooks.

Reference publishing and resource discovery -- Reference publishers invest time and money in bespoke search interfaces for advanced users, but are users finding them? In the ALA panel "The Future of Electronic Reference Publishing," librarians repeatedly commented that multiple reference sources are confusing to users, and that resources must also be discoverable via Google and the library's own digital catalog.

If users do go directly to an individual resource or platform, the search interface should behave "like Google." Although the panel of major reference publishers did state that they are converging on Google's query language, many legacy systems remain that would be economically infeasible to re-tool.

Library catalogs and systems -- The need for more transparent, network-based services applies to the library catalog as well. In the marathon session, "The Ultimate Debate on the Future of the Library Catalog," speakers identified a critical need for geo-based services and APIs for finding what's in my local library -- now. Once a book is located I should be only a few clicks away from reserving it or even ordering it for delivery to my home.

That dream is still far off -- even with a service like WorldCat it's not currently possible for me to find and reserve a book at my local library. The closest offering presented on WorldCat is Harvard University's library, which is not about to lend to the likes of me. The problem is even worse for rural libraries. As for my local library -- I love books and this post is the first time it even occurred to me to visit their site. I'm not alone in that.

Ebooks -- This is a transitional time in publishing, and while many patrons still prefer print, an increasing number are asking for electronic books, especially in university libraries. Students and academics emphatically reject DRM and restrictions on usage, but many ebooks sold to libraries have technical barriers to printing, cut-and-paste and downloading.

Licensing and subscription costs are also a concern for libraries. Ebooks may be re-priced or re-bundled, challenging the basic assumption that once a library buys a title, it owns the book indefinitely. Librarians want assurances that the products they purchase are either available perpetually, or at least have clearly-stated licensing terms that do not change without notice.

The ability to safely and permanently archive electronic books has been a long-time concern of some librarians, but the floods in New Orleans and Iowa have changed some minds. Off-site electronic archiving would save at least some resources, especially for very small or rural libraries that can't afford state-of-the-art preservation facilities.

Exploring DIY E-Reader Platforms

I've been working with the EPUB open ebook format a lot lately, but when I want to read a book in it, I have to use my computer. There just aren't any devices which support it yet. Naturally this leads me to wonder whether I could build my own e-reader.

I'm not a hardware person, but the last few years have seen an emergence of open hardware platforms designed to allow even ordinary programmers like me to modify and customize small devices. As far as software goes, an e-reader is pretty straightforward: it's just some text on a screen. That shouldn't be too hard, right?

Surveying the landscape of hardware options, I've ranked below a variety of devices from "friendliest" to "most-intensive DIY." I'm not addressing PDA or phone devices here, largely because I consider their screen size and text rendering insufficient (but plenty of people disagree).

The Chumby -- With a 3.5" touch screen and reasonable $175 price tag, this little wireless computer in a bean bag is an obvious candidate. There's a full-fledged development environment and large community of users. Most Chumby applications are written in a lightweight version of Flash, which is easy enough to develop in.

It has a few downsides, though. The Chumby doesn't have much storage space at all, so any ebooks would have to be saved online and streamed to it, a page or a chapter at a time. Since it's meant to be an always-on wireless device, that seems doable. The screen might be too small to comfortably read lots of text, as it's meant for short bursts like RSS feeds or Twitter updates.

Unfortunately, it's powered by a wall outlet, with only a small 9-volt battery for emergency backup. People on the hardware forums have managed to hack in rechargeable batteries, and I wouldn't be surprised if a totally-wireless Chumby is forthcoming. [Disclosure: O'Reilly AlphaTech Ventures is an investor in Chumby Industries.]

BugLabs -- The most open of the commercial hardware platforms, BugLabs sells individual pluggable modules that support various features, from touchscreens to cameras to GPS. It looks like a great platform, but it's very expensive ($349 for the base module plus $119 for the 2.5" touch-sensitive screen). The screen is probably too small for comfortable reading, but the company Web site promises a larger display soon.

Both the Chumby and BugLabs have touchscreens, which is key for making small screens more usable.

The Kindle -- All the heavy lifting has been done already to get into the Kindle filesystem and peek inside. It's probably too difficult to extend the existing Kindle environment without true source code, but it might be possible to do some simple things, like add new fonts. Few people have really explored hacking on e-ink devices, largely due to high cost and low availability. I suspect when the first color e-ink devices come out, used black and white ones will become popular playthings for enthusiasts.

YBox2 -- For the ultimate DIY experience, the YBox2 platform is a pile of electronic parts you solder together and assemble in an Altoids tin. It doesn't come with a touch-screen, or any screen at all: you connect it to a television or monitor. It uses the tiny Propeller chip, which powers many hobbyist devices and small robots. Like the Chumby, YBox2 comes with networking capability but little storage, and would need to stream book content from the Internet. The networking isn't wireless and of course there's no handy rechargeable battery, but if you are the kind of person who can build a YBox2 you probably know how to make those too. I am not that kind of person.

While I'd be happy to crawl around a hacked Kindle, I know I'm not ready to program my own microcontroller. BugLabs seems great from a developer standpoint, especially when they release a larger screen, but I'm unwilling to shell out almost $500 just to experiment. The Sony Reader doesn't have networking, so that's much less interesting. Perhaps a Chumby is in my future. Any other options?

Release Early, Release Often: Agile Software Development in Publishing

"How do Web startups release three or four new versions of a product in the time it takes publishers to launch just one new feature on their online platforms?"

This question framed "The Agile IT Organization," a lively and well-informed discussion at the recent Society for Scholarly Publishing annual conference in Boston. As a software engineer, I've used both agile and traditional product development methodologies and I was interested to hear the perspectives of other programmers as well as publishers who've gone through the process.

Geoffrey Bilder of CrossRef provided an introduction to agile development practices, which are concisely summarized in plain English by a core set of principles.

Summarizing even further, agile development means:

  1. Minimal up-front specification. A project has high-level goals (e.g. "make our back catalog searchable and available for print-on-demand purchase"), but is not fully described before development begins.
  2. Frequent, short-cycle releases. A project is broken up into mini-projects, each with a small set of features that take only a few weeks to implement. Every release ("iteration") has a specification, development and testing phase. This means that every couple of weeks the software is fully usable, although it may have very few features at the start.
  3. Change to the product design is accommodated and even expected. Market conditions, corporate re-organization or user demands may mean that new features are added or old ones are re-worked. Changes are treated as just another iteration.

The panel at SSP focused on two approaches: internal, IT-driven products, and those developed by a third-party vendor. Larry Belmont, manager of online development at the American Institute of Physics, gave an excellent presentation on the in-house approach. His organization ran its first agile project with a timeline measured in days rather than weeks or months.

Leigh Dodds, CTO of Ingenta, provided the vendor perspective, and described the principles of a formal type of agile development known as Scrum.

The panel was, to their credit, enthusiastic about the approach, but agile development requires commitment and is not right for every organization or project. Some caveats that need to be emphasized:

  • Short development cycles come with a price: you will be asked to review and comment on small pieces of the larger project, and be involved on an almost daily basis. Many publishers need vendors they can treat like plumbers: "I want a new sink put here, it should look like this, call me when it's done." If someone in your organization isn't prepared to think very hard every day about copper pipe fittings, agile isn't right for you.
  • Project managers must be empowered to make decisions. Whether the project is in-house or vendor-driven, every day the PM will be asked to make calls without appealing to higher powers. When editorial buy-in is required, or when the product needs a larger review, consider a hybrid approach: appoint a single decision-maker with deep editorial knowledge to work on evaluating, testing and approving each iteration, but use a more traditional alpha/beta/gold release process for the wider group.
  • Product features may change, but time and budget should be invariant. Hard deadlines might seem to be antithetical to the free-wheeling, change-friendly agile approach, but in my experience they're critical. They focus the entire team: key decision-makers cannot spend weeks in committee, IT personnel don't fear the "death march" project with no end in sight, and it's more difficult to introduce budget overruns that cause friction with management and vendors. If an agile project does run out of time, you will still have a launchable product that's been thoroughly tested and reviewed all the way down the line, not something just out of beta with weeks of QA ahead. Many agile methodologies use the hard deadline, or timebox, as the primary method of structuring the project.

"Release early, release often" can sound a lot like "throw whatever we've got out the door." This is one reason why the iterative approach has been so embraced by Web startups: each small release has been thoroughly tested and evaluated, and there's never a moment where the software doesn't work. It's possible to go live with a project that might not be "finished" according to the original master plan, but might otherwise be caught up in insurmountable technical hurdles or tied up in editorial review.

If publishers are going to be ready for an "iPod moment," this kind of flexibility and responsiveness is critical.

What OpenID Can Do for Academic Publishers

OpenID is a free, decentralized system for managing your identity online. What does that mean? It's easy to explain by example.

Right now you probably have dozens of accounts on different Web sites. It's likely that you use the same (or similar) user names and passwords on all of them. OpenID solves the problem of creating nearly-identical accounts on different services, and also allows you to control how much personal information you provide to each service that asks for your OpenID.

What makes OpenID interesting in the publishing community is that it distinguishes between two concepts that are often conflated:

  1. Identity: Who am I?
  2. Authentication: What do I have access to?

Traditional user name and password schemes are used for both purposes, but they are actually quite different.

Identity only -- When I shop at Amazon.com (assuming I'm not boycotting it), I only need to provide my identity. I don't need any special permission to access Amazon's search and browse features. What I do want to protect are my account information and shopping cart, but arguably those belong to me, not Amazon.

Identity and authentication -- When I want to post to the TOC blog, I need to provide both types of credentials: identity, so the blog software can put my name under my post, but also authentication to prove that I'm a registered contributor. If you write a comment to this post, you'll only be asked to provide identity.

Authentication only -- The third case -- authentication without identity -- is common in subscription-based journals and research material. I can go to the Boston Public Library, sit at a terminal, and get access to hundreds of online resources in the deep web that aren't available to the general public. The library has paid for the right to access the resources, but those sites only need to know that I'm authenticated through an institutional subscription, not who I am as an individual. This is the correct default behavior, and it's admirable that librarians fight hard on behalf of patrons to explicitly protect users' identities.

This leaves academic and journal publishers without an obvious way to offer their users some of the benefits of identity-based systems: bookmarking, tagging, annotating, and sharing. One solution is to build another layer of access control: first I authenticate, either by using a library terminal or entering my library card number, and then I identify myself with yet another user name and password. Only then do I get the ability to save searches, bookmark documents and possibly share those with other authenticated users of the resource.

Publishers could instead use OpenID to handle identity management in these products. Compared with building such a system from scratch, OpenID is inexpensive and is already fully-implemented in many programming languages.
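
The separation is easy to see in a data model. The sketch below makes no OpenID library calls and skips the step of actually verifying an identifier; every name in it is hypothetical. It only shows the two questions being answered by two independent fields: one for how access was granted, and one for who (optionally) the reader says they are.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    # Set when access comes through a subscription, e.g. a library terminal.
    institution: Optional[str] = None
    # Opaque identifier the reader chose to present; assumed already verified.
    openid: Optional[str] = None


def can_read(session):
    """Access to content depends only on the subscription."""
    return session.institution is not None


def can_personalize(session):
    """Bookmarks and saved searches also need an identity to attach to."""
    return can_read(session) and session.openid is not None


def bookmark_key(session, document_id):
    """Return the key under which a bookmark would be stored."""
    if not can_personalize(session):
        raise PermissionError("bookmarking requires both access and an identity")
    return (session.openid, document_id)


walk_in = Session(institution="Boston Public Library")
print(can_read(walk_in), can_personalize(walk_in))  # True False
```

Note that the stored key contains only the identifier the reader supplied, so an anonymous patron loses nothing and a patron who opts in reveals only what that identifier reveals.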

Users benefit in several ways: they don't have to create a new account and remember another set of credentials, and now they have new options for personalizing their research experience. It also opens up the possibility of tying together saved resources across multiple products owned by different publishers, similar to some types of citation management software.

Currently, signing up and using OpenID can be a bit confusing for novices, but the user experience is expected to improve. In the near future it's likely to be largely opaque to end-users, who will only need to know that their identity is managed by a source they already trust.

One last point that's relevant to library users: an OpenID account can still provide anonymity. There's no requirement or guarantee that my OpenID account name has anything to do with my legal name. It's likely that many users will have multiple OpenIDs in the same way that people use throwaway email accounts when registering on Web sites. However, the onus is still on the end-user to be careful where and how they distribute their personal information.

Storytelling 2.0: Alternate Reality Games

Publishers are experimenting with an emerging form of interactive entertainment known as Alternate Reality Games (ARG). ARGs are mediated by the Web but they also extend into the real world, with players traveling to physical places and interacting with game characters via email, text messaging, Twitter, and even "old-fashioned" telephones.

I spoke to the founders of ARG design firm Fourth Wall Studios, the company that created the first publishing ARG, Cathy's Book. I wanted to know if ARGs are a viable form of commercial storytelling, if they can be packaged up after the experience has ended, and if they can engage with a wider audience beyond hard-core gamers.

Q: Do you think the high level of engagement required of an ARG limits the audience? Is there such a thing as a "casual" ARG, that can be enjoyed in the spare moments between soccer practice and dinner time?

A: Elan Lee, Fourth Wall Studios Founder/Chief Designer: ARGs up until now have been like rock concerts. Thousands (if not millions) of people come together at one point in time to collectively experience something incredible. They have a good time, sing along, maybe buy a t-shirt, but when they go home to tell their friends about it, there's no action their friends can take other than to hope they don't miss the next one. The traditional ARG is an experience that exists between the start and end date of the campaign, and if you weren't there at the right time, you simply miss out.

To continue the metaphor, think of our games [at Fourth Wall] as ARG "albums" instead of concerts: something you can play when, where, and how you want. Ultimately, it is only through this "album" approach that this new form of entertainment is going to evolve into a mainstream genre of storytelling.

Q: Many ARGs have been developed as promotional tools for other media: music releases, films, TV series, video games, and now books. Is there a perception that ARGs have to be in support of something else, rather than entertainment themselves?

A: Elan Lee: ARGs have had their roots in marketing because frankly, at this early stage, that's a great place to find money. Marketers have a tougher job every day of finding ways to get their message heard above the noise, and they have a lot of money to throw at the problem. It's a great situation for both sides: marketers get to engage their audience in a way that attracts, involves, and maintains an audience around a product. ARGs benefit in that we get to run wild and ground-breaking experiments as we birth this new art form.

Also, at least in the case of Nine Inch Nails' Year Zero and Cathy's Book, the ARG elements were not conceived as marketing, but as an inextricable part of the content. An album or a book was the spine of the experience, but the work of art itself was conceived as an interactive multimedia whole.

Q: Cathy's Book was targeted at a young adult (YA) audience. Do you think YA is a strong market for this kind of interactive entertainment? Would it be possible to engage even younger children?

A: Sean Stewart, Fourth Wall Studios Founder/Chief Creative: Cathy's Book and the new hardcover, Cathy's Key, are designed to be first and foremost a fun (and funny) adventure story. We've added a lot of "fourth wall" elements -- you can call Cathy's phone number and leave her a message, investigate clues she doesn't have time to investigate or write to email addresses you find in the book and see what responses come back to you. Cathy even hosts a gallery where readers can submit their own artwork -- the best of which will be published in the paperback of Cathy's Key. The basic impulse behind this series is to make books -- a traditionally passive, solitary activity -- something with an active, social component as well.

"Fourth Wall" fiction -- experiences that play out at least partly over your browser, your phone, your life -- feels somehow very right for this new age; it's a kind of storytelling that arises naturally from the world of three-way calls, instant messenger, text messaging, and shooting a friend an email with a link to something cool you saw on the Web. To that extent, it's going to feel the most natural to the people most comfortable with that kind of wired world.

When I was in New York last year, meeting with the publisher of Cathy's Book, my 12-year-old daughter emailed me a PowerPoint slide deck, complete with music and animations, explaining why I should get her a Mac laptop for Christmas. Yeah, I think her generation finds interactive entertainment more natural than mine. And yes, I think it would be not only possible, but really effective to build interactive, exploratory stories for even younger kids -- but to do that, we need to get away from the traditional ARG's willingness to be confusing. Most people like to have some clue what the heck they are supposed to do next. It won't surprise you to learn that this is another crucial design issue Fourth Wall Studios has set out to solve.

Q: Reading is usually a solitary pursuit, but there's an almost universal desire to "live" in some genres, whether it's idealized period romances, spy novels, or detective stories (murder mystery parties, especially popular in the 1980s, illustrate this). How important are traditional fiction genres in ARG? Can there be an element of role-playing involved? Are there genres that haven't been explored yet that have potential?

A: Sean Stewart: The first paid writing I ever did, actually, was for live action role playing games and murder mystery dinner parties in the '80s. I never would have guessed that writing for those things would turn out to be extremely important training for me, but in fact the intersection of writing and theater, where you try to find ways for the audience to participate in the story, lies at the heart, I think, of the next evolution in storytelling.

We believe that immersing yourself in a world is a fundamental part of what makes fiction fun. Any time I follow a character -- whether in a Jane Austen novel or a "Matrix" movie -- I am imagining what that must be like. One of the biggest pay-offs in an ARG is that you don't just imagine a fictional world, as in a book, or see it, as in a movie: you actually inhabit it. When I read a Harry Potter novel, I get to go to Hogwarts vicariously; when I play an ARG, I get to go myself. I am finding Web sites on my browser, I am talking to characters on my phone: the world of the fiction has reached out to me.

That proposition, by the way, shouldn't be limited by genre. ARGs have often had a thriller/science fiction slant to them, but even inside our games we've done romantic comedies, spy plots, documentary-style slice-of-life experiences, tragedies, and even Westerns. Fourth-wall fiction isn't about a given genre: it's a set of tools and approaches for letting the audience participate in any kind of story.

Q: What happens when the game is over? Is it possible to package up an ARG as a complete work (whether online or in print) to be experienced linearly? Or is the experience meaningless without real-time participation?

A: Elan Lee: Here's where I'm going to try to get as much mileage out of the "rock concert" metaphor as I can. There is no denying the electric energy present at a concert and there is absolutely no substitute for "being there." However, there are only so many available seats per venue, and only so many venues you can play before exhaustion sets in (both for the artist and the audience). For ARGs to evolve into a mainstream form of entertainment, they must create their own version of "albums" to complement the "concert." Don't get me wrong, I'm not saying we have to find a way to put a package around these things and call it a day; I only suggest that both pieces of the experience must exist for the real potential of the form to be realized.

What Makes a Collaborative Writing Project Successful?

Penguin's collaborative writing experiment A Million Penguins was launched in February 2007 and completed in March 2007. This month saw its final scholarly assessment published in a research report out of De Montfort University in Leicester, UK.

The results? Terrible, according to Gawker, echoing a consensus that the project failed as literature. As a study of online behavior, though, it's quite fascinating, and the research paper describes examples of all types of user contributions, from the grandiose and self-serving to the quietly constructive.

But if "every book needs its author," game-like fiction has been shown to be more amenable to collaboration. Each of Penguin's We Tell Stories pieces was co-written by interactive developers and a novelist. This month, the Guardian has launched a participatory interactive fiction project.

Although technically a type of computer game, interactive fiction has a long association with print authors, starting with the commercially successful adaptation of Douglas Adams' The Hitchhiker's Guide to the Galaxy (1984). In 2003 Adam Cadre (Ready, Okay!, HarperCollins, 2000) wrote the game Narcolepsy incorporating 12 dream sequences written by different authors (of which I was one). In a more experimental vein, the recent UpRightDown project released its first story, which generated submissions in multiple media, including some interactive works.

One lesson from these experiments is that while a work of fiction may not need a single author, it does need a single editor or authority to weave together disparate contributions and reject the obvious vandals. A unified final work has the potential to be a marketable product rather than a research project. (On the other hand, if the printed German Wikipedia sells, all bets are off.) Scale is important as well: two or even three dozen contributors are probably manageable; A Million Penguins had 1,700.

The Guardian's interactive fiction project is being managed using wiki software at textadventure.org.uk. The organizers are soliciting both programmers and non-technical writers. It is scheduled to run through at least the end of May.

Iliad Book Edition E-Reader Coming to UK

Just in time for our discussion on the ideal e-book reader comes a new product that will be the first e-reader sold in the United Kingdom.

Trading Wi-Fi for increased storage and an overall price drop, the iLiad Book Edition is a successor to the iLiad 2. Both use the same iRex e-ink technology and feature a tablet-based touch screen. There is no bundled online service or book store, but both iLiads have support for open formats such as PDF. 50 public domain books are preloaded.

Borders UK will sell the device in a small number of stores, and will launch an online ebook store shortly thereafter.

Unfortunately, even this "reduced" price of £399/€499 is unlikely to win over e-reader skeptics, especially without network connectivity. Buying books will always require tethering the device to a computer and completing the purchase over the Web.

Other iLiad Book Edition technical specs:

  • 8.1-inch (diagonal) Electronic Paper Display
  • 8.5 inch high x 6.1 inch wide, weight 15.3 ounces
  • 768 x 1024 pixels resolution, 160 DPI, 16 levels of grey-scale
  • File formats supported: PDF, HTML, TXT, JPG, BMP, PNG, PRC (Mobipocket)
  • 128MB accessible flash memory; storage expandable via USB, MMC or CF cards
  • Built-in stereo speakers and mini-headphone jack
  • USB Connectivity to PC
  • Optional external 10/100MB Ethernet networking via Travel hub

Tutorial: Add AB Meta Tagging to Your Blog

Many publishers use blogs to promote new products and engage customers. Dedicated blog readers will subscribe and receive every post, but the best way to reach a wider audience is still via search engines.

Embedding simple machine-readable code is a key component of the "semantic" Web, in which search engines don't just treat Web pages as a jumble of keywords, but instead can understand their meaning.

Technology firm Adaptive Blue has recently released a scheme for tagging books, movies and other media to enable search engines to label media products appropriately. Because Adaptive Blue's AB Meta is so new, there aren't yet dedicated tools for it. Fortunately, the scheme is very simple and re-uses basic Web tagging. Publishers can use this scheme -- today -- to enrich blogs and product pages.

Here we provide instructions for adding AB Meta content to a WordPress blog. Examples for integrating the format into other blogging software can be found in the description of AB Meta.

Using AB Meta with WordPress

  1. Download the HeadMeta plugin
  2. Unzip the plug-in and copy the headmeta folder to your wp-content/plugins directory.
  3. Enable the plug-in in the WordPress Plugin Management page (/wp-admin/plugins.php)
  4. When writing a new post, look under Advanced Options -> Custom Fields.

The Custom Fields form will allow you to set two items: a key and a value:

  1. The key will always be "head_meta".
  2. The value will be in the following general format:
    name="an AB Meta field" content="the field's value"

Here's an example for a book title:

[Screenshot: the WordPress Custom Fields form (wordpress-advanced-lg.jpg)]

To qualify as AB Meta content, one field is required and should always be added:

name="object.type" content="book"

After that, you will add fields that are specific to your book content. Here are some examples from the Adaptive Blue site for the book The Kite Runner:

name="object.type" content="book"
name="book.title" content="The Kite Runner"
name="book.author" content="Khaled Hosseini"
name="book.isbn" content="1594480001"
name="book.year" content="2004"
name="book.link" content="http://books.com/1594480001.html"
name="book.image" content="http://books.com/1594480001.jpg"
name="book.tags" content="fiction, afghanistan, bestseller"
name="book.description" content="Story of an Afghan immigrant."

For WordPress, in the Custom Fields option, these would all be entered like this:

In the key field: head_meta
In the value field: name="object.type" content="book"

In the key field: head_meta
In the value field: name="book.title" content="The Kite Runner"

... and so on, through all of the metadata fields to be included with the blog post.
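Typing a dozen key/value pairs by hand for every post gets tedious, so here is a minimal Python sketch that builds the value strings from a dictionary of book metadata. The function names are hypothetical, not part of AB Meta or the HeadMeta plugin, and the `<meta>` wrapper in `as_meta_tag` is only an assumption about what ends up in the page's head section.

```python
from html import escape


def head_meta_values(fields):
    """Return one Custom Fields value string per AB Meta field.

    Hypothetical helper: it only formats strings in the
    name="..." content="..." pattern described in this tutorial.
    """
    if "object.type" not in fields:
        raise ValueError("AB Meta content requires the object.type field")
    return [
        f'name="{escape(name, quote=True)}" content="{escape(value, quote=True)}"'
        for name, value in fields.items()
    ]


def as_meta_tag(value):
    """Wrap a value string in a <meta> element (assumed output format)."""
    return f"<meta {value} />"


book = {
    "object.type": "book",
    "book.title": "The Kite Runner",
    "book.author": "Khaled Hosseini",
    "book.isbn": "1594480001",
}

values = head_meta_values(book)
for value in values:
    # Each line is one Custom Fields entry: the key is always head_meta.
    print("key: head_meta | value:", value)
```

Each printed value can be pasted into the value field as-is. Escaping the field contents keeps a stray quotation mark in a title or description from breaking the attribute.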

What advantages are there to using AB Meta?

At the time of this writing, there are no applications that are specifically indexing AB Meta content. However, the scheme is quite simple, both for human and computer readers, and is likely to see widespread adoption. Tagging content with it now means that when these tools become available, you will already have significant inventory indexed. In addition:

  1. Many of the fields in AB Meta correspond to values in the Google Book Search API. This should make it trivial for Google to match articles about books to specific entries in Google Books, where customers can preview content before buying.
  2. It's likely that tools based on Amazon Web Services will be built on top of AB Meta to allow those tags to generate direct or affiliate links to the Amazon.com book store.
  3. Some XML-based workflows already store book metadata in the Dublin Core schema, and AB Meta supports Dublin Core directly.
  4. Simpler blog plug-ins that support, or can even auto-generate, AB Meta are certain to be developed.
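To give a sense of how little work a tool needs to do to read these tags, here is a rough Python sketch that pulls AB Meta fields out of a page using only the standard library. The sample page and the rule of keeping only `object.type` and `book.*` names are assumptions for illustration; no existing indexer is known to work this way.

```python
from html.parser import HTMLParser


class ABMetaParser(HTMLParser):
    """Collect <meta> tags whose names look like AB Meta book fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is None or content is None:
            return
        # Assumed filter: keep the required type field and book.* fields.
        if name == "object.type" or name.startswith("book."):
            self.fields[name] = content


def extract_ab_meta(page_html):
    """Return a dict of AB Meta fields found in the given HTML text."""
    parser = ABMetaParser()
    parser.feed(page_html)
    parser.close()
    return parser.fields


sample_page = """<html><head>
<title>Review</title>
<meta name="generator" content="WordPress" />
<meta name="object.type" content="book" />
<meta name="book.title" content="The Kite Runner" />
<meta name="book.isbn" content="1594480001" />
</head><body><p>Post text.</p></body></html>"""

book_fields = extract_ab_meta(sample_page)
print(book_fields)
```

With the ISBN in hand, a tool could go on to look the book up in whatever catalog it prefers, which is the kind of matching the list above anticipates.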

So get tagging! In the meantime we'll continue to monitor progress of AB Meta in terms of adoption and tools.

Ebook Format Primer

Amid all the recent ebook news, many publishers may still be unclear about the different formats and devices. How do ebooks actually get made? What changes need to be made to existing workflows to enable content distribution to ebook devices? We've put together this primer to help clear things up.

The simplest solution, of course, is to partner directly with the ebook manufacturers and let them take care of the details. These partnerships must be drawn up for each new platform and publishers are at the whims of the device-makers' terms of use. Innovative publishers may want to first experiment on their own and be prepared to shift platforms strategically: this means ebook distribution must fit into existing workflows. Although some of the formats below support digital rights management, consider eschewing DRM in favor of flexibility and cross-platform support.

Let's start with the major devices first:

  1. The Sony Reader primarily uses Sony's proprietary Broadband eBooks (BBeB) format for documents with DRM but also supports RTF and non-DRM PDF. Sony does not provide any official tools for end users to convert to BBeB although at least one unofficial open source tool can convert HTML to BBeB. The most flexible non-DRM formats are RTF and PDF. Microsoft Word can readily save to RTF and Microsoft offers detailed instructions on converting from XML to RTF, but pure open-source alternatives are not mature. XML to PDF conversion has stronger open source support but files may need to be specially tweaked for optimum display on the Reader.
  2. The Amazon Kindle uses Amazon's proprietary AZW format, which supports DRM. There are no tools available to directly convert to AZW, but AZW is a wrapper around the Mobipocket format and DRM-free Mobipocket files can be read on the device. Mobipocket documents can be created using a free (but not open-source) tool called Mobipocket Creator. As if the format wars weren't confusing enough already, "Mobipocket DRM" is not the same as AZW, and files created as Mobipocket DRM cannot be read on the Kindle. Mobipocket Creator does have a "batch" creation mode which could be integrated into an existing workflow, but the software is Windows-only. The Kindle also supports HTML and Word documents, but not PDF.

Specialized readers aren't the only way consumers may be viewing ebook content. Ultra-portable laptops like the Eee PC and OLPC XO are price-competitive with standalone readers. (I have an OLPC and reading by the pool in bright sunlight is quite a joy.) The next version of the iPhone is expected soon, and while the first edition was already a serviceable reader, the next version is likely to be more so, and to reach a wider audience.

All the devices listed above, except the Sony Reader, can read a common format: HTML. If XML is already a part of your workflow, converting to HTML is trivial. If not, HTML is a worthwhile investment for a number of reasons:

  1. XHTML is the standard markup for book content in OPS/.epub. .epub support is just getting off the ground but is expected to become widespread.
  2. If your publishing workflow includes HTML, your organization is able to distribute content to dozens of devices in addition to the open Web.

HTML is also the lingua franca of online search engines, and inclusion of partial or full HTML books will attract casual surfers and can drive community engagement with your content. Whether it's BBeB or AZW that becomes the Betamax of the next decade (and one, if not both, will be obsolete by then), HTML conversion is guaranteed to pay off in the foreseeable future.
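As a rough illustration of the XML-to-HTML step, here is a minimal Python sketch using only the standard library. The element names (`book`, `title`, `chapter`, `heading`, `para`) are a made-up schema, not any publisher's real one, and a production workflow would more likely use XSLT or a dedicated conversion tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; real book XML will use a different schema.
SAMPLE_XML = """<book>
  <title>Sample &amp; Example</title>
  <chapter>
    <heading>Chapter One</heading>
    <para>First paragraph.</para>
    <para>Second paragraph.</para>
  </chapter>
</book>"""


def book_xml_to_html(xml_text):
    """Convert the assumed book schema into a simple HTML page."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("title", default="Untitled"))
    parts = [
        "<html>",
        f"<head><title>{title}</title></head>",
        "<body>",
        f"<h1>{title}</h1>",
    ]
    for chapter in root.findall("chapter"):
        heading = chapter.findtext("heading")
        if heading:
            parts.append(f"<h2>{escape(heading)}</h2>")
        for para in chapter.findall("para"):
            # Only plain paragraph text is handled; inline markup is ignored.
            parts.append(f"<p>{escape(para.text or '')}</p>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


html_output = book_xml_to_html(SAMPLE_XML)
print(html_output)
```

The same loop could just as easily write one HTML file per chapter, which is roughly the shape an OPS/.epub package expects.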
