The line between book and Internet will disappear

A few months ago I posted a tweet that said:

The distinction between “the internet” & “books” is totally totally arbitrary, and will disappear in 5 years. Start adjusting now.

The tweet got some negative reaction. But I’m certain this shift will happen, and should happen (I won’t take bets on the timeline though).

It should happen because a book properly hooked into the Internet is a far more valuable collection of information than a book that is not. And once something is “properly hooked into the Internet,” that something is part of the Internet.

It will happen, because: what is a book, after all, but a collection of data (text + images), with a defined structure (chapters, headings, captions), metadata (title, author, ISBN), and some presentation design to pretty it up? In other words, what is a book, but a website that happens to be written on paper and not connected to the web?

An ebook is just a print book by another name

Ebooks to date have mostly been approached as digital versions of print books that readers can read on a variety of digital devices, with some thought given to enhancing them with a few bells and whistles, like video. The false battle between ebooks and print books will continue (you can read one on the beach, with no batteries; you can read the other at night, with no bedside lamp), but these battles only scratch the surface of what the move to digital books really means. They continue to ignore the real, though as-yet unknown, value that comes with books being truly digital, as opposed to the phony, unconnected digital of our current understanding of “ebooks.”

Of course, thinking of ebooks as just another way to consume a book lets the publishing business ignore the terror of a totally unknown business landscape, and concentrate on one that looks at least similar in structure, if not P&L.

While you can list advantages and disadvantages of print books versus ebooks, these are all asides compared with the kind of advantages that we have come to expect of digital information that is properly hooked into the Internet.

Defining a book by what you cannot do

What’s striking about this state of affairs — though not surprising, given the conservative nature of the publishing business, and the complete unknowns about business models — is that we define ebooks by a laundry list of things one cannot do with them:

You cannot deep link into an ebook: say, to a specific page, paragraph, chapter, image, or table
Indeed you cannot really “link” to an ebook at all, only to various access points for instances of that ebook, because there is no canonical “ebook” to link to … there is no permalink for a chapter, and no Uniform Resource Locator (URL) for an ebook itself
You (usually) cannot copy and paste text, the most obvious thing one might wish to do
You cannot query across, say, all books about Montreal, written in 1942 — even if they are from the same publisher

You cannot do any of these things, because we still consider that books — the information, words, and data inside of them — live outside of the Internet, even if they are of the e-flavor. You might be able to buy them on the Internet, but the stuff contained within them is not hooked in. Ebooks are an attempt to make it easier for people to buy and read books, without changing this fundamental fact, without letting ebooks become part of the Internet.
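To make the first two items on that list concrete, here is a minimal sketch of what a deep link into a book could look like if books had canonical addresses. Everything in it is an assumption made for illustration: the `books.example.org` host, the ISBN/chapter/paragraph URL layout, the placeholder ISBN, and the `permalink` and `resolve` helpers are all hypothetical, and no publisher or ebook format is known to work this way.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory "book": chapter slug -> list of paragraphs.
# The ISBN is a placeholder, not a real identifier.
BOOK = {
    "isbn": "0000000000000",
    "chapters": {
        "chapter-1": ["First paragraph.", "Second paragraph."],
    },
}


def permalink(isbn: str, chapter: str, paragraph: int) -> str:
    """Build a canonical URL for one paragraph (assumed scheme, 1-based index)."""
    return f"https://books.example.org/{isbn}/{chapter}#p{paragraph}"


def resolve(url: str, book: dict) -> str:
    """Follow a deep link back to the paragraph it names."""
    parts = urlsplit(url)
    isbn, chapter = parts.path.strip("/").split("/")
    if isbn != book["isbn"]:
        raise KeyError(f"unknown book: {isbn}")
    index = int(parts.fragment[1:])  # fragment looks like "p2"
    return book["chapters"][chapter][index - 1]


link = permalink("0000000000000", "chapter-1", 2)
print(link)
print(resolve(link, BOOK))
```

The point is not this particular scheme, only that a stable address per chapter and paragraph is what makes linking, quoting, and querying possible in the way the web already allows.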

Many people don’t want books to become part of the Internet, because we just don’t know what business would look like if they were.

This will change, slowly or quickly. While the value of the digitization of books for readers has primarily been, to date, about access and convenience, there is massive and untapped (and unknown) value to be discovered once books are connected. Once books are accessible in the way well-structured websites are.

What lurks beneath the EPUB spec

The secret among those who have poked around EPUB, the open specification for ebooks, is that an .epub file is really just a website, written in XHTML, with a few special characteristics, and wrapped up. It’s wrapped up so that it is self-contained (like a book! between covers!), so that it doesn’t appear to be a website, and so that it’s harder to do the things with an ebook that one expects to be able to do with a website. EPUB is really a way to build a website without letting readers or publishers know it.
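You can see this for yourself with a few lines of code. The sketch below, using only the Python standard library, builds a toy EPUB-shaped ZIP archive in memory and then reads it back: a `mimetype` entry, a `META-INF/container.xml` that points at the package file, and an XHTML page. The file names under `OEBPS/`, the stub package file, and the helper names `build_toy_epub` and `describe_epub` are illustrative assumptions, and the toy archive is not a validated EPUB.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def build_toy_epub() -> bytes:
    """Build a tiny EPUB-shaped archive in memory (illustrative, not validated)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry is written first and stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr(
            "META-INF/container.xml",
            '<container version="1.0" '
            'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
            '<rootfiles><rootfile full-path="OEBPS/content.opf" '
            'media-type="application/oebps-package+xml"/></rootfiles>'
            "</container>",
        )
        zf.writestr("OEBPS/content.opf", "<package/>")  # stub package file
        zf.writestr(
            "OEBPS/chapter1.xhtml",
            '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
            "<h1>Chapter 1</h1><p>Hello, reader.</p></body></html>",
        )
    return buf.getvalue()


def describe_epub(data: bytes) -> dict:
    """Open an EPUB as the ZIP archive it is and report what is inside."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        mimetype = zf.read("mimetype").decode("ascii").strip()
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        pages = sorted(n for n in zf.namelist() if n.endswith(".xhtml"))
    return {
        "mimetype": mimetype,
        "package": rootfile.get("full-path"),
        "pages": pages,
    }


info = describe_epub(build_toy_epub())
print(info)
```

Pointing `describe_epub` at the bytes of a real, DRM-free .epub file should list the same kinds of entries: web pages, plus a little packaging.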

But everything exists within the EPUB spec already to make the next obvious — but frightening — step: let books live properly within the Internet, along with websites, databases, blogs, Twitter, map systems, and applications.

There is little talk of this anywhere in the publishing industry that I know of, but the foundation is there for the move, as it should be. And if you are looking at publishing with any kind of long-term business horizon, this is where you should be looking. (Just ask Google, a company that has been laying the groundwork for this shift with Google Books.)

An API for books

An API is an “Application Programming Interface.” It’s what smart web companies build so that other innovative companies and developers can build tools and services on top of their underlying databases and services.

For instance:

Google Maps has an API so that geolocation services (for instance Yelp) can use Google Maps and the business data contained therein to better serve their niche customers
Twitter has an API so that other services can build Twitter clients, search Twitter, provide Twitter analytics, etc.
Amazon has an API that lets developers easily find and point to product information.
Wikipedia has an API, so that you can do things like make books out of every edit made to the Wikipedia article “The Iraq War”

We are a long, long way from publishers thinking of themselves as API providers — as the Application Programming Interface for the books they publish. But we’ve seen countless times that value grows when data is opened up (sometimes selectively) to the world. That’s really what the Internet is for; and that is where book publishing is going. Eventually.

I don’t know exactly what an API for books would look like, nor do I know exactly what it means.

I don’t know what smart things people will start to do when books are truly of the Internet.

But I do know that it will happen, and the “Future of Publishing” has something to do with this. The current world of ebooks is just a transition to a digitally connected book publishing ecosystem that won’t look anything like the book world we live in now.
