The line between book and Internet will disappear

The inevitability of truly connected books and why publishers need APIs.

A few months ago I posted a tweet that said:

The distinction between “the internet” & “books” is totally totally arbitrary, and will disappear in 5 years. Start adjusting now.

The tweet got some negative reaction. But I’m certain this shift will happen, and should happen (I won’t take bets on the timeline though).

It should happen because a book properly hooked into the Internet is a far more valuable collection of information than a book not properly hooked into the Internet. And once something is “properly hooked into the internet,” that something is part of the Internet.

It will happen because what is a book, after all, but a collection of data (text + images), with a defined structure (chapters, headings, captions), metadata (title, author, ISBN), and prettied up with some presentation design? In other words, what is a book, but a website that happens to be written on paper and not connected to the web?

An ebook is just a print book by another name

Ebooks to date have mostly been approached as digital versions of print books that readers can read on a variety of digital devices, with some thought given to enhancing them with a few bells and whistles, like video. While the false battle between ebooks and print books will continue (you can read one on the beach, with no batteries; you can read the other at night with no bedside lamp), these battles only scratch the surface of what the move to digital books really means. They continue to ignore the real, though as-yet unknown, value that comes with books being truly digital, not the phony, unconnected digital of our current understanding of “ebooks.”

Of course, thinking of ebooks as just another way to consume a book lets the publishing business ignore the terror of a totally unknown business landscape, and concentrate on one that looks at least similar in structure, if not P&L.

While you can list advantages and disadvantages of print books versus ebooks, these are all asides compared with the kind of advantages that we have come to expect of digital information that is properly hooked into the Internet.

Defining a book by what you cannot do

What’s striking about this state of affairs — though not surprising, given the conservative nature of the publishing business, and the complete unknowns about business models — is that we define ebooks by a laundry list of things one cannot do with them:

  • You cannot deep link into an ebook, say to a specific page, paragraph, chapter, image, or table
  • Indeed you cannot really “link” to an ebook, only to various access points to instances of that ebook, because there is no canonical “ebook” to link to … there is no permalink for a chapter, and no Uniform Resource Locator (URL) for an ebook itself
  • You (usually) cannot copy and paste text, the most obvious thing one might wish to do
  • You cannot query across, say, all books about Montreal, written in 1942 — even if they are from the same publisher

You cannot do any of these things, because we still consider that books — the information, words, and data inside of them — live outside of the Internet, even if they are of the e-flavor. You might be able to buy them on the Internet, but the stuff contained within them is not hooked in. Ebooks are an attempt to make it easier for people to buy and read books, without changing this fundamental fact, without letting ebooks become part of the Internet.

Many people don’t want books to become part of the Internet, because we just don’t know what business would look like if they were.

This will change, slowly or quickly. While the value of the digitization of books for readers has primarily been, to date, about access and convenience, there is massive and untapped (and unknown) value to be discovered once books are connected. Once books are accessible in the way well-structured websites are.

What lurks beneath the EPUB spec

The secret among those who have poked around EPUB, the open specification for ebooks, is that an .epub file is really just a website, written in XHTML, with a few special characteristics, and wrapped up. It’s wrapped up so that it is self-contained (like a book! between covers!), so that it doesn’t appear to be a website, and so that it’s harder to do the things with an ebook that one expects to be able to do with a website. EPUB is really a way to build a website without letting readers or publishers know it.
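To make the "website, wrapped up" point concrete, here is a minimal Python sketch. It assumes only what is said above: that an .epub is a zip archive containing XHTML. The chapter text, file names, and function names are invented for illustration, and the archive is deliberately incomplete (a real EPUB also needs a container file and a package document, which are left out here).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical chapter content, invented for this sketch.
CHAPTER_XHTML = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body>
    <h1>Chapter 1</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
"""


def build_epub_like_archive() -> bytes:
    """Zip a tiny 'website' the way an .epub wraps its content.

    Not a valid EPUB: the container and package files are omitted.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # EPUB puts an uncompressed "mimetype" entry first in the archive.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML)
        archive.writestr("OEBPS/style.css", "p { margin: 1em 0; }")
    return buffer.getvalue()


def list_members(epub_bytes: bytes) -> list:
    """Return the file names inside the archive, in insertion order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


def paragraphs_in(epub_bytes: bytes, member: str) -> list:
    """Parse one XHTML member and return the text of each <p> element."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        root = ET.fromstring(archive.read(member))
    return [p.text for p in root.iter(XHTML_NS + "p")]


epub_bytes = build_epub_like_archive()
print(list_members(epub_bytes))
print(paragraphs_in(epub_bytes, "OEBPS/chapter1.xhtml"))
```

Once the wrapper is opened, what is inside is ordinary web content: markup, a stylesheet, and paragraphs that any XML or HTML tool can read.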

But everything exists within the EPUB spec already to make the next obvious — but frightening — step: let books live properly within the Internet, along with websites, databases, blogs, Twitter, map systems, and applications.

There is little talk of this anywhere in the publishing industry that I know of, but the foundation is there for the move — as it should be. And if you are looking at publishing with any kind of long-term business horizon, this is where you should be looking. (Just ask Google, a company that has been laying the groundwork for this shift with Google Books).

An API for books

An API is an “Application Programming Interface.” It’s what smart web companies build so that other innovative companies and developers can build tools and services on top of their underlying databases and services.

For instance:

We are a long, long way from publishers thinking of themselves as API providers — as the Application Programming Interface for the books they publish. But we’ve seen countless times that value grows when data is opened up (sometimes selectively) to the world. That’s really what the Internet is for; and that is where book publishing is going. Eventually.

I don’t know exactly what an API for books would look like, nor do I know exactly what it means.
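Purely as a thought experiment, and not a proposal for what a real book API should be, here is a small Python sketch of two things the "cannot do" list above asks for: a permalink for each paragraph, and a query across several books. Everything in it is an assumption made for illustration: the `Book` class, the `/books/<isbn>/chapters/<n>/paragraphs/<n>` path scheme, the placeholder ISBN, and the sample text.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    """Hypothetical in-memory book: chapters are lists of paragraph strings."""
    isbn: str
    title: str
    chapters: list = field(default_factory=list)


def permalink(book: Book, chapter: int, paragraph: int) -> str:
    """Build a 1-based path for a paragraph (invented URL scheme)."""
    return f"/books/{book.isbn}/chapters/{chapter}/paragraphs/{paragraph}"


def resolve(library: dict, path: str) -> str:
    """Return the paragraph text that a permalink path points to."""
    parts = path.strip("/").split("/")
    if (len(parts) != 6 or parts[0] != "books"
            or parts[2] != "chapters" or parts[4] != "paragraphs"):
        raise ValueError(f"not a paragraph permalink: {path}")
    chapter, paragraph = int(parts[3]), int(parts[5])
    if chapter < 1 or paragraph < 1:
        raise ValueError("chapter and paragraph numbers start at 1")
    book = library[parts[1]]
    return book.chapters[chapter - 1][paragraph - 1]


def search(library: dict, phrase: str) -> list:
    """Return permalinks of every paragraph containing the phrase."""
    needle = phrase.lower()
    hits = []
    for book in library.values():
        for c, chapter in enumerate(book.chapters, start=1):
            for p, text in enumerate(chapter, start=1):
                if needle in text.lower():
                    hits.append(permalink(book, c, p))
    return hits


# Placeholder data, invented for this sketch.
example = Book(
    isbn="0000000000000",
    title="Example Book",
    chapters=[
        ["Opening paragraph.", "A line about Montreal."],
        ["Second chapter begins."],
    ],
)
library = {example.isbn: example}

link = permalink(example, 1, 2)
print(link)
print(resolve(library, link))
print(search(library, "montreal"))
```

The point of the sketch is only that once paragraphs have stable addresses, linking and cross-book search become a few lines of code rather than a business negotiation.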

I don’t know what smart things people will start to do when books are truly of the Internet.

But I do know that it will happen, and the “Future of Publishing” has something to do with this. The current world of ebooks is just a transition to a digitally connected book publishing ecosystem that won’t look anything like the book world we live in now.

  • http://www.widescript.com jeroenvduffelen

    It will happen? It is happening already! At Widescript we are bringing the e-book (epub & zhook format) to every connected device through a web browser. At the moment we are developing some examples to show off the potential of web-enabled interactive e-books. We are considering building an API for some of the interactive features but don’t really know what should be made public through it. What kind of features do you think should be opened up with an API?

  • http://mobileministrymagazine.com Antoine RJ Wright

    I will agree with the above poster, this is already happening in niche areas. It is only a matter of time before such conventions are normal.

    With respect to publishers becoming more aligned with developers, we’ve seen this in the Biblical publishing arena with two events (Olive Tree partnering with Zondervan, and Logos’s work on the Biblia API). Both efforts blur the lines between publishers and developers, and set the stage for books being redefined (see the EEC commentary coming from Logos).

    Just a matter of time; and you folks at O’Reilly should be right on the leading edge of this as well :)

  • http://hughmcguire.net Hugh McGuire

    @jeroenvduffelen … certainly it *is* happening already. But on the fringes of the publishing world, which is where it *should* happen. As for what kind of info should be exposed via API, I don’t quite know. At least: all metadata, and associated images. Maybe you should be able to find paragraph #142, sentence #7 (or – be told that “To be or not to be” is found there). Maybe the API should allow readers or partners to add metadata to phrases, words, locations contained in the text – i.e. to build a semantic metadata layer “behind” the book, including links etc. Certainly all books should allow for commenting, to be displayed as: a) every comment, b) comments from just me, c) comments from just john smith, d) comments from a particular group of people (e.g. my editors, my family, my English lit class etc); probably the API should allow a metadata analysis tool to find place names within the text, and add appropriate links to, say, Google Maps.

    etc. etc. etc.

  • http://hughmcguire.net Hugh McGuire

    @Antoine: I totally agree about the religious publishers. This implementation of the Bible: http://www.youversion.com/ … especially the mobile versions, is, I believe, the most sophisticated implementation of truly webby books.

  • http://www.writediteach.com/blog/ Shawn Douglas

    I also saw the article about the publishing of the historiography of the Wikipedia “Iraq War” entry today. That article combined with your commentary here has my brain spinning in many interesting directions.

    My foremost thought is that there is solid potential in making edits and revised editions with ease when a book is connected via the Internet. Rather than having to live with a few errors that were missed by the editors after publishing, an Internet-connected book could receive supplementary edits and revisions caught both by the publishing house and by other readers. (Those readers would be able to send in their spotted errors through an Internet submission, of course.) Additionally, your Internet-connected book would include a historiography detailing what has been edited and when.

    A new pay model would probably have to be set up, however. Small edits and corrections would likely be free via either download or through an API. However new editions with revised material would need to be bought, though potentially it could be done at a discount if the user bought the previous edition.

    The “API” that you mention may not specifically be an API in the sense that we think of it now. It may rather be a tiny proprietary application a publisher requires you to install to manage the transfer of information between their system and your Internet-connected book. Or maybe it may be an open source standard application that publishers agree to use? I don’t know.

    I’m going to have to write more about this on my blog later today!

  • Bob Watson

    As long as DRM exists the “paper book” will have a strong presence for those who don’t want to pay for yet another download. (You won’t want to loan your Kindle to a friend.)

    If DRM disappears the “paper book” will have a weaker presence, but will linger for those who feel most familiar with them … and, perhaps, in rare collections which value permanence in our culture.

    The question will be … will authors value DRM as much as publishers do today?

  • http://www.levimontgomery.com Levi Montgomery

    I’ve said much the same thing. In fact, my timeline prediction even matches.

    http://www.levimontgomery.com/index.php/2010/08/17/the-perfect-ereader-system/

  • Alex Tolley

    I question the validity of the argument that: “… a book properly hooked into the Internet is a far more valuable collection of information than a book not properly hooked into the Internet”.

    Today you can do text search against Google Books, and still happily locate the text in a dead tree copy. Sure it might be more convenient to have the text appear directly in a digital copy, but how often do you really need to do that?
    If one is doing research on book contents, that is very different from just wanting to read a book and maybe check a reference or a phrase once in a while.

    The type of book is important too. A technical book is often just a collection of examples tied together with a narrative. Arguably that format could change drastically. OTOH, a novel is a cohesive narrative structure that cannot be easily decomposed.

    The beauty of a book, especially hard copies, is that they last and can be read even when slowly decaying. Electronic books are not proven to be robust, and an EMP in a cyber war will probably vaporize most of these books in an instant. Bradbury’s firemen never had it so easy.

  • http://hughmcguire.net Hugh McGuire

    @alex: i see no conflict between a hardcover and a book properly hooked into the internet. A properly hooked-in book could/can produce whatever reading format you like: hardcover, softcover, epub, pdf, mobi, text, rtf etc. That, in fact, is part of the point, and part of the distraction – the format doesn’t matter, if the information is housed, accessible, and properly structured.

    see, eg:
    http://booktwo.org/notebook/wikipedia-historiography/

  • http://www.marklogic.com Stephen Cohen

    Check out SciVerse from Elsevier, they’ve got APIs for their digital libraries http://www.sciverse.com

  • http://www.magellanmediapartners.com Brian O’Leary

    Good post, Hugh. I’d add the idea that you also can’t have a conversation inside the current version of the book.

    In the sense that we know how to interface with books, they do have APIs, but they are not open. So in this digital era, I’d also make the case plainly: “Open up your API, or someone else will.”

  • http://hughmcguire.net Hugh McGuire

    @brian: Yes, that’s a whole other angle… that if you as a maker of digital books don’t go in this direction, then others will likely do it for you, and you’ll lose control.

  • bowerbird

    hugh-

    thanks (i guess) for taking some of my ideas and
    putting ‘em here as your own, and those of o’reilly.

    remember that i also have good examples _and_
    working code for apps across the entire workflow
    if anyone should wanna buy a bridge to the future,
    instead of just admiring the view of there from here.

    -bowerbird

  • http://twitter.com/brij Brij

    Great post Hugh. As always, it will start from the fringes and will wait for a killer app to cross over into mainstream publishing. A lot of vendors are experimenting (including our own Fliplog framework) in this space.

    Imagine the possibility where the same book evolves (or rather mutates) for different age groups with the help of external APIs. There will be many book mashup possibilities down the road.

    Fun stuff overall.

    Thx
    Brij

  • http://www.bookglutton.com Aaron

    There are ways to do these things already — but most publishers don’t want them. BookGlutton has for a long time had a way to publish Epub content directly to the Web in a way that allows each individual paragraph to have its own URL. It’s live–you can try it. For over a year, we have also provided a completely free annotation API that allows anyone else to attach notes to those paragraphs and share them across reading systems. This means notes can be attached not just to the epubs in our system, but can appear anywhere else that those epubs are distributed. We do this in an entirely standards-compliant way. What we’ve built has become the reference implementation for a lot of other small companies, and we’ve talked directly with big publishing houses for three years — and there’s zero interest in seeing titles appear in this environment. With the sole exception of O’Reilly Media, no major publisher has wanted to deep link into their titles or allow people to cut/copy text and attach notes at the paragraph-level. That is the reality of the industry, and it’s why Apple, Amazon and Google are all operating silos of content instead of systems that allow book content to integrate seamlessly with the web. I think it will be much longer than 5 years before we see this, unfortunately, but I also think it’s inevitable.

  • http://www.canadianelectroniclibrary.ca/Cdn-public-policy-collection.html Bob Gibson

    Hugh: While I agree with your main premise, that ebooks have a long way to go, I wanted to say that with an authenticated online book in a library setting, because it has a persistent URL, a reader can do all the things you list as “can’ts,” including deep linking, advanced cross-searching, (limited) cutting/pasting, sharing, annotating, citing. The library (collective) environment is what allows this to happen with copyright material.

  • http://hughmcguire.net Hugh McGuire

    @Bob I’m not too familiar with the library set-up you describe, but librarians and libraries are on the cutting edge of this digital shift, because their concerns – metadata, findability, access, freedom, privacy, usability – are, indeed, the foundation of information culture. Further, librarians tend to have a historical view of books and publishing, that neither technologists, nor publishers – or most writers for that matter – have.

    So I’ll buy your claim that these kinds of solutions will come out of a library culture/implementation – at least in embryonic form.

  • http://www.ioscode.com Paul Willworth

    The Subutai Corporation is doing some smart things making books more “of the internet” starting with the Mongoliad https://mongoliad.com/faq

    In the fiction realm anyway.

  • bowerbird

    aaron said:
    > There are ways to do these things already
    > — but most publishers don’t want them.

    and here we come to the very crux of the problems…

    look at the i.d.p.f. members who control .epub, and
    suddenly the reason for the problems will be clear…

    we’re letting corporate publishers sabotage the future,
    in their (futile) attempt to cling to their business model,
    so the money that fuels their greed continues to come in.

    the .epub format is needlessly complex — because the
    corporations were hoping that’d thwart the revolution…

    we _still_ don’t have consistent or correct rendering of
    .epub by viewer-apps — because the corporations want
    to frustrate e-customers to the greatest extent possible.

    we _still_ don’t have good (i.e., simple and dependable)
    .epub authoring-tools — because the corporations want
    to make it as hard as possible for authors to self-publish.

    and — as we see here — these corporations will continue
    to drag their feet and stall, for as long as they possibly can.

    and because they got all you idiots to sucker-up for their
    “standard” format, now you’re stuck with their delay tactics.

    it will take _years_ for the .epub format to evolve so that
    even the basic functionalities mentioned here can happen.

    and every one of those new functionalities will be plagued
    by implementation differences by the various viewer-apps,
    which will take even _more_ years to get straightened out.

    and this is exactly how the corporate publishers _want_ it.
    you’re being played for fools, and you don’t even know it.

    it’s time to make a call for new e-book programs which are
    based on a _simple_ format, and which can already _attain_
    the vast array of possibilities the digital arena promises us…

    -bowerbird

  • Alex Tolley

    @Hugh McGuire – I think James Bridle is actually making my points.

    1. The destruction of the Library of Alexandria. While it was not a single point of failure (originals were confiscated and the copies handed back), it certainly indicates what happens when objects like books are stored in one place. The modern equivalent is Wikipedia. Destroy their servers and everything is lost.

    2. Where is linkage useful? James uses the Wikipedia example of the Iraq War to show the value of historiography. How useful would the same document history be for a novel? Similarly, his example of the Wikipedia Links race game shows the triviality of the use of links, rather than how useful they are.

    So the point about the internet adding extra value is, I think, not proven and needs more thought. It may add value in some cases; it may not in others.

    But part of your riposte was about there being no conflict between making a book internet ready and its format. I beg to disagree.
    A book as a collection of information (bits) will be stored in a limited number of servers or storage media. This number will be far fewer than the number of copies that could be owned. Indeed the logic will result in fewer hard copies and increase the likelihood of complete loss, like the Library of Alexandria. Digital media are fragile and the solution always touted is to increase replication to avoid this. But replicated electronic data remains fragile, even if it were done. So I question whether your idea will not lead us into a trap of easy loss through accident.

    In other words, the seductive nature of a completely internet ready book will bring on the decline of the safest way to preserve knowledge in the event of an accident – the hard copy book.
    Bruce Sterling has made the valid point that electronic books are the only viable way to disseminate knowledge around the world. I don’t disagree with that. Nor do I disagree that knowledge should be more easily linked (who wants to return to hard copy citation indexes), but what I do argue with is the process that will reduce the number of hard copy books and lead to a potential catastrophic failure of our knowledge base.

  • http://hughmcguire.net Hugh McGuire

    @alex: so lets make sure we have many hard copy back-ups, scattered across the globe. along with lots of redundant servers.

  • Laisvunas

    bowerbird is right. The real aim of IDPF is not advancing digital publishing but sabotaging it.

    It is enough to see their website at idpf.org – it could easily win first prize in a most-boring-website competition. No organization serious about its mission would put such a website online.

    If you read the specs they published you will find other very interesting things. For example, in section 2.3.7 of OPS 2.0, published in 2007, you will find this pearl:

    Reading Systems must not, by default, render the textual content of the script element, and should not execute the script itself.

    You might think that it was written in the early nineties, not in 2007!

    One might have hoped that the IDPF would improve the spec, since section 1.7 of the same document declared:

    Other themes deemed important for future versions include: [...] support for active content (e.g. multimedia, scripting)[...]

    So, three years went by, and in 2010 the IDPF published OPS v 2.0.1, and in it there is exactly the same section 2.3.7, word for word!

    Excellent leadership of the publishing industry in the time of digital revolution!

    It seems that those who really care about digital publishing should not expect anything from the IDPF but should establish an alternative body and prepare an alternative spec – simpler and more powerful.

  • Alex Tolley

    @Hugh McGuire “so lets make sure we have many hard copy back-ups, scattered across the globe. along with lots of redundant servers.”

    Guarantee that and I’m with you. The logic of the technology and economics is against you. Internet ready information will be in centralized servers, distributed for backups and redundancy, as it is today. There is no need to have a hard copy book, if the costs are lower, delivery and updates faster, and the utility of electronic ones are positive. The hard copy will disappear, the electronic copies not as widespread or secure as needed. It is a massive, single point of failure waiting to happen.

  • bowerbird

    alex said:
    > Guarantee that and I’m with you.

    you’re the one who’s worried, alex.
    so how about if _you_ guarantee it?
    and then we can all be “with you”…

    > It is a massive, single point of failure waiting to happen.

    so, alex, tell us please, what are you going to do to prevent it from happening?
    spit in the face of the digital revolution, and stop it cold and dead in its tracks?

    i agree with you, _wholeheartedly_… when the big bad electromagnetic pulse
    knocks out all of our computers, and all our internal combustion engines too,
    we’re gonna be stuck in a terrible world of hurt. so alex, what’s your solution?
    have you got a cabin in the woods, stocked with supplies and an encyclopedia,
    where beyoncé and i could shelter ourselves and then repopulate the planet?

    -bowerbird

  • http://www.x-cito.com Kevin Shockey

    I agree with you almost completely. For me I don’t think the question is so much, whether a book is better or not when connected to the Internet. Instead, it’s merely inevitable, almost evolutionary.

    As a popular form of content, it will merge with the web, because it is the least common denominator, the ultimate democratization of access to content. In addition, there’s just too many powerful tools to help sort and find what you’re looking for.

    Great article.

  • http://www.flatworldknowledge.com Brad Felix

    Hugh, and Aaron: Flat World Knowledge, though not a “major” publisher, lives more in this camp. We’ve got deep linking to paragraphs, para-level note taking, etc.

    But things certainly get more exciting with the API. Perhaps the API will evolve as the web did: Version 1 will be about “read” access, which for us means things like embedding our content into Learning Management Systems and the like. Version 2 will be “read/write”, allowing 3rd parties to participate in publishing. For us that means pretty cool possibilities – for example, external apps being able to customize our textbooks.

    Content is the commodity. Publishers are service providers. An API is a killer service. I can’t wait.

    -Brad

  • http://nataliaventre.com Natalia Ventre

    The Kindle has some social sharing features, but I’d love to have more interactions through books.

    Right now there are many ebook formats (ePub, mobi). A universal format with an Internet API would be great, but I don’t think I want ebooks to be an Internet-only thing, like a website, because it’s the offline mode that allows you to read everywhere.

  • http://www.katrinadoeparker.com Katrina Doe

    I see a huge benefit to both people/customers AND publishing companies to get this going sooner than later. I am an average, non-techie who just got a Twitter account in the last week – that’s actually how I found this article.

    I find that I search for a LOT of information online with an extremely varied list of topics. One of my ‘beefs’ with online information is that much of it only skims the surface or is the tip of the iceberg. So much MORE in-depth, specific information is offered in books. I find myself searching around the internet for information, compiling some notes, & then going to Amazon & entering in some key words that I think might pull up some books that could fill in the gaps. I read the reviews & then order the books with the best reviews that seem to fit my criteria the best.

    If books/entire texts were *searchable* on the internet, people searching the web could more easily find exact phrases/specific information in books. To me, this is HUGE! And I think there’s a large benefit to publishing companies to have this information available. Amazon already has viewable pages in many books. And, guess what, I haven’t seen the numbers but I can tell you from my own experience that *I* buy books that I can ‘look’ inside more often than I buy books that I can’t. I mean, think about it, it’s a major reason people still go to bookstores – because we can look *inside* a book to see if it will actually cover the information we want to learn. Not only that, we can read a little bit to see if we even like the author’s writing style.

    I agree that it’s only a matter of (hopefully short) time before we see this break wide open & be commonplace for the ‘average’ web surfer. I wanted to offer my ‘Joe Schmoe’ perspective because I think that a lot of times the techie-heads designing these interfaces add layers to things that interest them & that they can appreciate but that the average person will likely never utilize or even understand. And THEN, some of the most basic usability issues are overlooked because they’re so obvious to the people writing the code. I’m not criticizing, I’m simply observing.

    I loved the article & I now have high hopes for what the future of ‘books’ is going to look like in the coming years. I have been leaning more on the side of keeping my tangible books that I can place on a shelf & reference when I need them. After reading this article, however, my perspective has changed & I can see a lot of value in what’s been described.

    Thanks for the thoughtful article! Katrina :)

  • http://hughmcguire.net Hugh McGuire

    @aaron: bookglutton’s great platform/api is exactly the sort of thing we need, and the total lack of interest from big publishers is why I won’t take bets on the timeline. In the ideal world, something glutton-like becomes an open standard for books – but we’ve got a ways to go yet.

    @bowerbird & @Laisvunas nothing is stopping publishers from publishing straight-to-web/html (which achieves more or less what I’m talking about). But publishers don’t want to do that. So the IDPF is evolving based on what publishers want/need. If you’d like a different approach to publishing, you (and anyone else) could easily make a new publishing “company” (or collective, or technology) that does things the way you’d like them to. But there is a difference between building technology and getting people to use it; a difference between making a publishing company (or collective) and getting people to publish there; a difference between publishing books “right” and getting people to read them.

    @brad: “Content is the commodity. Publishers are service providers. An API is a killer service. I can’t wait.” … Flatworld & others are chipping away, and the question is when the demand/willingness to embrace these new ideas will a) pull enough readers away from other models, and/or b) make traditional publishers change their course.

    @natalia: ebooks won’t be web-only in the same way that email isn’t “web-only” …

  • bowerbird

    hugh said:
    > But publishers don’t want to do that.

    _corporate_ publishers don’t want to do that, which is what i said,
    it’s exactly what i said, but yes, hugh, i’m glad you agree with me.

    (of course, it would be very surprising if you took my idea and then
    disagreed with me, wouldn’t it? yes, i’d think that’d be surprising.)

    > So the IDPF is evolving based on what publishers want/need.

    the i.d.p.f. and the corporate publishers are one and the same.

    the mammals need to reject this dinosaur organization and its
    dinosaur format, and turn instead to a small fast flexible format,
    as the first step in a full campaign toward a larger revolution…

    -bowerbird

  • http://webseitz.fluxent.com/wiki Bill Seitz

    I think the revenue model will look like the NyTimes coming paywall model: you can read a bit for free (esp when following links from outside), then you pay for full access.

  • http://csarven.ca/ Sarven Capadisli

    Hi Hugh,

    “I don’t know exactly what an API for books would look like, nor do I know exactly what it means. I don’t know what smart things people will start to do when books are truly of the Internet.”

    If I understood you correctly, the following might be what you are looking for. If not, here is the connection that I drew:

    The ‘API’ is already here. It can be accomplished with RDF (Resource Description Framework) [1] where bibliographic records can be described with various ontologies (e.g., FRBR: Functional Requirements for Bibliographic Records [2], Book [3]), and identified with URIs like urn:isbn:9780812696110 or http://example.org/isbn/9780812696110

    The rest of the data within the record can be marked up with other ontologies (e.g., Dublin Core terms [4]), so that it can be globally identified, searched for, and merged with other data on the Web.
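
    A minimal sketch of what such a record could look like, using only the Python standard library. The URN form and the ISBN come from the comment above; the helper names (isbn13_is_valid, isbn_urn, to_ntriples) are hypothetical, and only two Dublin Core terms are used for illustration. Output is written as N-Triples, one statement per line.

```python
# Illustrative only: describe a book by ISBN URN with Dublin Core terms.
# Helper names are hypothetical, not part of any existing library.

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def isbn13_is_valid(isbn: str) -> bool:
    """Check the ISBN-13 checksum (weights alternate 1, 3, 1, 3, ...)."""
    normalized = isbn.replace("-", "").replace(" ", "")
    if len(normalized) != 13 or not normalized.isdigit():
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(normalized))
    return total % 10 == 0


def isbn_urn(isbn: str) -> str:
    """Turn an ISBN-13 into a URI of the form urn:isbn:<digits>."""
    if not isbn13_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-13: {isbn!r}")
    return "urn:isbn:" + isbn.replace("-", "").replace(" ", "")


def to_ntriples(subject: str, properties: dict) -> str:
    """Serialize {dcterms name: literal value} as N-Triples lines."""
    lines = []
    for term, value in properties.items():
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{DCTERMS}{term}> "{escaped}" .')
    return "\n".join(lines)


book = isbn_urn("9780812696110")
record = to_ntriples(
    book,
    {"title": "Bullshit and Philosophy", "identifier": "9780812696110"},
)
print(record)
```

    Because the subject is a shared URI rather than a local database key, anyone else who describes the same ISBN produces statements that can be merged with these.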

    Given this, I personally don’t see much future for ebooks that we know of today since they are not connected to the Web. Hence, I would predict that ebooks will either become obsolete in favour of published Web data, or simply transform to ordinary Web pages as you’ve described earlier.

    See also:
    * “The RDF book mashup demonstrates how Web 2.0 data sources like Amazon, Google or Yahoo can be integrated into the Semantic Web.” [5]. There is a search form [6], as well as a Firefox search addon [7].
    * Linked Data and Libraries – almost like being there [8]
    * Open Library [9]

    [1] http://en.wikipedia.org/wiki/Resource_Description_Framework
    [2] http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records
    [3] http://ontologi.es/book/vocab
    [4] http://dublincore.org/documents/dcmi-terms/
    [5] http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/
    [6] http://www4.wiwiss.fu-berlin.de/bookmashup/search.php?keywords=Bullshit+and+philosophy
    [7] https://addons.mozilla.org/en-US/firefox/addon/12821/
    [8] http://blogs.talis.com/nodalities/2010/08/linked-data-and-libraries-almost-like-being-there.php
    [9] http://openlibrary.org/

  • http://hughmcguire.net Hugh McGuire

    @sarven: what I’m thinking of is, for instance, starting to apply semantic mark-up, and further, metadata, to the text of the books themselves. This is an impossible job for a publisher; but if you allow access to a book through an API, suddenly, you can imagine groups of people providing this metadata.

    For instance, you might have a layer behind the Bible, linking to google map locations of every town mentioned. Or you might have a layer behind The Odyssey linking to, say, the wikipedia entry for each personage named.

    Again, getting a publisher to do this is impossible to imagine.

    Opening the text via API to allow others to do this – now that makes sense.
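
    One way to picture such a layer is as stand-off annotation: the book text stays untouched, and contributors supply character ranges with links. The sketch below is an assumption about how that might work, not a description of any existing book API; the Annotation class, both functions, and the sample sentence (which is made up, not a quotation from The Odyssey) are hypothetical.

```python
# Illustrative only: a contributed "layer" of links kept separate from the text.
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int        # offset of the first character of the mention
    end: int          # offset just past the last character
    link: str         # where the mention should point
    contributor: str  # who supplied the annotation


def annotate_mentions(text, name, link, contributor):
    """Return one Annotation for every occurrence of name in text."""
    annotations = []
    pos = text.find(name)
    while pos != -1:
        annotations.append(Annotation(pos, pos + len(name), link, contributor))
        pos = text.find(name, pos + len(name))
    return annotations


def render_html(text, layer):
    """Apply a layer to the text, producing HTML with linked mentions."""
    out, cursor = [], 0
    for a in sorted(layer, key=lambda a: a.start):
        if a.start < cursor:
            continue  # skip annotations that overlap one already applied
        out.append(html.escape(text[cursor:a.start]))
        href = html.escape(a.link, quote=True)
        out.append(f'<a href="{href}">{html.escape(text[a.start:a.end])}</a>')
        cursor = a.end
    out.append(html.escape(text[cursor:]))
    return "".join(out)


# Made-up sample sentence standing in for text served by a book API.
passage = "Penelope waited in Ithaca while Odysseus sailed home."
layer = annotate_mentions(
    passage, "Odysseus", "https://en.wikipedia.org/wiki/Odysseus", "volunteer-1"
) + annotate_mentions(
    passage, "Penelope", "https://en.wikipedia.org/wiki/Penelope", "volunteer-2"
)
linked = render_html(passage, layer)
print(linked)
```

    Keeping the layer separate means the publisher's text is never modified, and several groups can maintain competing or complementary layers over the same book.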

    Other thoughts:

    - allowing Wordnik to access book content – so wordnik.com can pull out usage examples of words.

    - opening the text and audio via API to allow people to link the two to make synced versions of texts

    - opening the text to Bite-Size Edits to allow aspiring editors to make an updated version of the text.

    etc.

    the point of APIs is that you don’t really know what people want to do with your underlying assets, but by opening them in a controlled kind of way, you make it easy for them to dream up things to do.

  • http://csarven.ca/ Sarven Capadisli

    @Hugh McGuire

    Something like http://openlibrary.org/books/add ?

    Further reasoning and data mash-ups can be automated by using the URIs of the item and its (meta)data and merge it with other data sets on the Web. When URIs match from various data sets, we have a global database of things.

    I think an example is due here. Say I make the following claim:

    “I have a copy of a book with ISBN 9780812696110 and titled ‘Bullshit and Philosophy’”

    And say you wrote that book and make a claim on your site as such:

    “9780812696110 has the description ‘.. event took place in Montreal’”

    Yet another site about geo data would claim:

    “Montreal has latitude 45.50884 and longitude -73.58781”

    This data can be easily linked to one another using URIs.
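
    The three claims above can be sketched as subject-predicate-object statements merged into one graph. This is a plain-Python illustration, not real RDF tooling: the predicate names, the example.org URIs, and the functions are hypothetical, and the author's free-text description is simplified to a direct "mentionsPlace" link.

```python
# Illustrative only: three independent sites publish statements that
# share URIs, so they can be merged and traversed as one data set.

BOOK = "urn:isbn:9780812696110"
MONTREAL = "http://example.org/place/Montreal"   # hypothetical URI
READER = "http://example.org/people/reader"      # hypothetical URI

reader_site = [
    (READER, "owns", BOOK),
    (BOOK, "title", "Bullshit and Philosophy"),
]
author_site = [
    (BOOK, "mentionsPlace", MONTREAL),  # simplified from the description
]
geo_site = [
    (MONTREAL, "latitude", "45.50884"),
    (MONTREAL, "longitude", "-73.58781"),
]


def merge(*sources):
    """Union of all statements; matching URIs are what join the data."""
    graph = set()
    for source in sources:
        graph.update(source)
    return graph


def objects(graph, subject, predicate):
    """All objects for a given subject and predicate, in sorted order."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def places_in_owned_books(graph, owner):
    """Follow owner -> book -> place -> coordinates across the merged data."""
    results = []
    for book in objects(graph, owner, "owns"):
        title = objects(graph, book, "title")[0]
        for place in objects(graph, book, "mentionsPlace"):
            lat = float(objects(graph, place, "latitude")[0])
            lon = float(objects(graph, place, "longitude")[0])
            results.append((title, place, lat, lon))
    return results


graph = merge(reader_site, author_site, geo_site)
found = places_in_owned_books(graph, READER)
print(found)
```

    No single site knows that the reader owns a book connected to a point on a map; that fact only appears once the statements are merged on their shared URIs.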

    Many vocabularies exist to describe classes and properties of people, organizations, bibliographic records, biological organisms, geo locations… you name it.

    You are right about a need to enter some of the core data in and it is being done independently by experts in their own domain using respective ontologies. On top of that, a lot of inferences can be automated based on whatever is there.

    I think RDF and SPARQL fit the bill.

  • http://www.webnovela.es Juan José Díez

    For example, this book is “truly of the Internet”: it is not an ebook but a cyberbook or web novel. It lives only on the web and is free and navigable, with hypertext, multimedia and interactivity.
    http://www.webnovela.es

  • http://www.writediteach.com/blog/ Shawn Douglas

    @Hugh Regarding my early comment on the tenth about high editability in an Internet-connected book, I just stumbled upon an example of one.

    DynamicBooks and College Open Textbooks Partner to Make Open Textbooks Easy to Edit and Customizable

    From the article:

    “Using the DynamicBooks editing tool, instructors can revise content, add or delete chapters or sections and include audio, video and course notes to make the textbooks more current and more relevant for their students. Students can access DynamicBooks online, download the books to their computer, print up to ten pages at a time. Printed, bound versions are also available for student purchase. Instead of copyright, all rights reserved, these textbooks are copyrighted with Creative Commons, GFDL, and customized open licenses that mean the textbooks can be freely shared and modified.”

    Not quite the API you were talking about, but it’s something similar.

  • http://www.writediteach.com/blog/ Shawn Douglas

    Oh, and while the DynamicBooks editable textbook I mentioned isn’t truly an Internet-connected book that can be hyperlinked to, I believe it’s a logical next step above the mostly static e-book of today.

  • Alex Tolley

    @bowerbird
    Unfortunately I don’t have a solution. I just want a little less wholesale cheerleading and a little more thought about the consequences. We’ve already had cultural losses – the BBC erasing videotaped programs from the 1960s to reduce costs. This would not have happened to less ephemeral media like film and now those archival tapes are gone permanently.
    I’m not saying the BBC should never have used video tape, but I am saying that there were secondary effects that resulted in avoidable losses of information because concerns about archiving for the future were not attended to.
    We see something similar with movies where old film stock is decaying. This is another case where greater replication may have saved old movies, but now the only remaining copies are degrading. And movies are just 100 years old.

    Another concern is proprietary data formats that could result in yet more data being lost due to opaqueness when trying to read their bits without the reader code.

    As a technology, hard copy books offer relatively decent replication as a consequence of their distribution and use model, ease of operation, and demonstrated long lifetimes if on good media.

    I’d really like that as a backstop. Perhaps something like the Library of Congress, but replicated and able to ensure long term storage and retrieval of materials.

  • http://hughmcguire.net Hugh McGuire

    @alex: something like the Internet Archive?
    http://www.archive.org/about/about.php

  • http://rock-n-code.com Julio Javier Cicchelli

    Evidently certain industries still can’t cope with the rapid changes we are all experiencing nowadays. For good or bad, they will just have to face reality and adapt quickly or face extinction.

    In the particular case of the publishing industry, I don’t believe that the book will be replaced by ebooks; rather, they will have to coexist and complement each other. The distinction between the book and the ebook is irrelevant to their main purpose (from the cave wall to the ebook, the point is to express ideas and opinions via a printed/written medium), even though each support can give the reader a completely different experience adapted to his/her particular tastes and needs. I agree that both supports contain data and meta-data that can be easily processed and provided as a service.

    Certainly, the ebook introduces a new way to consume books and broadens the boundaries and experiences the book format has imposed since its creation. This is where the opportunity of the emerging ebook publishing industry lies. Of course, not every reader is able to unlearn certain predefined concepts related to books and the experience of reading (and, as you may have guessed, that will take some more time). The publishing industry also has to take this into account.

    Recently, my company was hired to build a social-media platform for e-reading. Even though the complexity of creating such a system is moderate, I noticed that there are still a couple of challenges the publishing and IT industries have to sort out in order to pave the way to adoption. First and foremost, EPUB as a digital publishing standard should be treated as an intermediate step toward a format based on HTML5 (such as ZHOOK) that can easily handle all the data and meta-data an ebook can have. Currently, XHTML limits data/semantic processing of the content, which hinders the development of useful API services and of the business itself. Second, the vast majority of publishers fear a bubble in this emerging market. Even though the market for mobile devices able to consume this kind of data is growing exponentially, the publishing industry is still struggling with this new support for its content, much as the music business struggled with the rise of the MP3 format.

    In my sincere opinion, publishers must adopt API services for their content and let developers create innovative, value-added applications on top of them. This can create a bi-directional relationship between publishers and their readers with fewer intermediaries. Publishers can reach their readers more intimately (thanks to the content generated by the readers) and provide them with the content they want, while readers benefit from the adapted content and from the data and meta-data provided by the publishers and other readers.

  • http://www.facebook.com/JenniferStevensonAuthor Jen Stevenson

    Fascinating. But I’m a fiction writer. How would I use this kind of enhanced book–i.e. a website consisting of my novel plus extras?

    I can’t picture the application to pop fiction immediately, but I know who can. I’m going to send the URL of this thread to some fanfic writers. I’ll bet they can catch fire on this.

  • http://themiracleinjuly.com/story Michelle Anderson

    I have built “a website consisting of my novel plus extras” for a publishing experiment in the form of a semi-autobiography that would shine via an ebook API.

    My entire first draft of my story is online at http://themiracleinjuly.com/story using WordPress, a few plugins, and a hacked theme design to submerge the reader in the story. Almost 100,000 words in 28 chapters are embedded with hundreds of non-intrusive, interactive links to media relevant to the story. The reader

    But MIJ is web-based…trapped in the limitations of its delivery format. Some readers print the chapters for reading offline and go back online to click on links in the story that they’re particularly interested in. Many readers prefer to schedule blocks of time in which to sit in front of their computer to read a chapter or chapters and click on every single link (play all the music, view all the photos and videos) to fully experience the story as I’ve created it. Both of these situations require more effort than necessary. There has to be a better way.

    Anyone interested in mentoring me on the next steps in my publishing experiment? Please get in touch via the contact form at http://themiracleinjuly.com/story/donate/contact/

    Thanks!

    @mediaChick

  • http://www.philsimonsystems.com Phil Simon

    Not surprising that this would come from O’Reilly, the most progressive of the traditional publishers. Semantic technologies have to get moving to enable the types of searches described, i.e., those beyond mere keywords found on Google Books.

    Interesting stuff though.

  • http://twitter.com/slainson Suzanne Lainson

    I don’t really know why there should be online “books” at all, particularly if you are using them to promote something else rather than selling them. Ever since I got online (in 1993), everything I’ve written gets uploaded in blog or newsletter length segments. I’m pulling together topics that in the past I would have collected as chapters in a book. But now as I finish a segment, I publish it. It’s just as organized as if I were doing a book. I’m just not waiting until I have hundreds of pages written before I put it out.

    I don’t understand why anyone would want to sit on material until it reaches book length. And if you find new information later on that you want to incorporate into what you have already published, you can go back and edit or add to the earlier version.