
The Analog Hole: Another Argument Against DRM

Digital rights management (DRM) might be unpopular with the public and
plagued with social and technical challenges, but at least it’s a guarantee that digital books can’t be pirated — right?

Not so fast. Experienced computer crackers will find weaknesses in any encryption scheme, but regular folks with basic computer
skills can exploit the one weakness found in all DRM’ed media: the analog hole.

What is the Analog Hole?

The “analog hole” reflects a basic principle of physics: before humans
can consume any digital media, the ones and zeroes that computers
understand must be converted into an analog format
that our senses can perceive. For music, it’s sound waves; for
video and for digital books, it’s patterns of light.

If you’ve ever visited a major metropolitan city you’ve probably
seen the analog hole in action: street vendors selling pirated
copies of popular movies, often months before they’re officially
released on DVD. Most of these are “cam” films, shot in real
movie theaters using camcorders. Even without access to a
physical copy of the film, pirates are able to capture its analog
expression: the sound and pictures as perceived by a theater-goer.

In music, the analog hole is often used to get around software
preventing digital copying. A user simply plays the desired song
on their computer using the legal DRM-enabled software, and records
the audio coming out of their computer. Now they have a copy of
the sound recording, which can be re-imported into the computer
and digitally encoded, with the original DRM stripped out. (A
similar principle is at work when DRM systems go defunct and
users are told to pirate their own music, although the industry
uses the euphemism “making a backup.”)

Film and music companies are painfully aware of the analog hole
and have taken steps to close it, either by
monitoring patron behavior (as in movie theaters) or by
petitioning to legally limit the recording features of consumer electronics.

Because reading is a visual experience, digital books are open to
an analog hole exploit as well. Unlike camcorder copies or re-burned
MP3s, the copy can potentially be made with no loss in quality. And
with a little ingenuity, the process can be completely automatic.

One example: Ebooks and Optical Character Recognition (OCR)

Here’s a sample digital book as displayed in Adobe Digital Editions. (This
book is public domain and isn’t technically covered by DRM, but the principle
is exactly the same.)


I hid as much of the Digital Editions menus as I could and took a screenshot of this first page of Pride and Prejudice.

Next I downloaded some free optical character recognition (OCR) software. OCR
programs can “read” images and output the words in them as plain text. It’s a normal
part of digitization projects, in which archival printed material is first scanned
and its text is automatically extracted. At the consumer level, OCR software is often bundled with commercial scanners and fax machines.

I took my screenshot and fed it to the OCR software. Here’s what I got without any special fine-tuning or spell-checking. Note
that all typos are from the OCR software.

Chapter 1

It is a truth universally acknowledged, that a single man in
possession ofa large fortune must be in want of a wife, However little
known the feelings or views of such a man may be on his first entering
a neighbourhood, this truth is so well fixed in the minds of the
surrounding families, that he is considered the rightful property of
someone or other of their daughters.
“My dear Mr. Bennet,” said his lady to him one day, “have you heard
that Netherfield Park is let at last?”
Mr. Bennet replied that he had not.

…and on through the entire first page. This output was in HTML, ready to be
posted to the Web for anyone to read.
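
The typos in the sample can be put into numbers. What follows is a minimal sketch, not the tooling I used: it computes a character error rate by comparing one line of the OCR output with a reference line. The reference line is my assumption about what the page displayed, and the function names are only illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, reference: str) -> float:
    """Edit distance expressed as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(ocr_text, reference) / len(reference)


# One line of the OCR output shown above, typos included.
ocr_line = "possession ofa large fortune must be in want of a wife, However little"
# Assumed reference: the same line as it presumably appeared on screen.
reference_line = "possession of a large fortune must be in want of a wife. However little"

errors = edit_distance(ocr_line, reference_line)
rate = character_error_rate(ocr_line, reference_line)
print(f"{errors} character errors, {rate:.1%} error rate")
```

A missing space and a comma read in place of a period count as two character errors in a line of about seventy characters. Errors of that kind are what a spell-checking pass is good at catching.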

The OCR isn’t 100 percent accurate, of course, but neither are the widely available
pirated ebooks created by laborious scanning of physical books, page
after page. I was also using free software that requires careful fine-tuning to get working optimally; commercial OCR software is much better, especially when combined with

It wouldn’t be difficult to automate the process of advancing one page in Digital Editions, taking a screenshot, and passing that on to my OCR software. Once the workflow was in place, I could strip hundreds or thousands of books of their DRM in a matter of minutes.

Another Possibility: Speech Recognition

My local library is kind enough to allow me to check out digital audiobooks. Naturally,
they’re also secured with DRM (so much so that I can’t actually play them, as they require Windows Media Player and I have only Mac and Linux computers). But assuming I could play them, I’d have available to me a nice stream of professionally produced audio.

You’re using speech recognition software every time you call a customer service line and an
automated voice prompts you to speak your credit card number. If that’s happened
to you, you also know that speech recognition isn’t 100 percent accurate yet, but under
certain conditions it can be quite good. Automatic speech-to-text transcription
isn’t nearly as far along as optical character recognition, but it’s another
analog hole exploit that will eventually become trivial to perform.

Does This Mean Publishers Shouldn’t Produce Ebooks or Audiobooks?

No! What I hope to convey is that DRM is not a true safeguard against ebook piracy.
(It is, however, a known deterrent to ebook adoption.) I’ve heard a lot of passing the buck on DRM: publishers claim authors want it, booksellers claim publishers
insist on it. These days it’s hard to find someone to publicly state that they’re
actually for it.

I think of DRM like this: years ago my apartment was broken into and I
called a locksmith to replace the door. My landlord had authorized me to
get “the best lock possible,” and the locksmith obliged with a four-foot steel
bolt. It was almost too heavy to turn but made a very satisfying noise when it snapped shut.

I asked the locksmith, “Is this really unbreakable?”

“The lock is, sure.” He slapped the door frame. “But this is made out of
wood. If I really wanted to get in I’d just kick out the door. That’s why I’m honest
about what I sell.” When I looked puzzled, he handed me his business card.
It contained his name, phone number, and company slogan: “A feeling of security.”

Authors and publishers should be compensated for their talent and their hard work,
and the desire for DRM is understandable. Book lovers, too, want their
favorite authors to succeed. But digital books are a form of technology as
much as they are literature, and technologies that are successful adapt to people’s
needs. Is just a “feeling” of security worth the ire of good customers who
want to read their books wherever and however they like?


Comments: 2

  1. it’s quite commendable that you are
    willing to teach elementary school
    to today’s publishers, but how ’bout
    if we cut directly to the chase, ok?

    in order to be “a book” in the future,
    you will need to be online. _in_full_.

    every word on every page will have to
    be able to be the destination of a link.

    if a book isn’t _fully_exposed_ like that,
    it might as well be completely invisible,
    because no one is gonna care about it.

    that means anyone can read the book,
    without buying it (in paper _or_ in bits).

    and yeah, that means that it’s gonna be
    hard to force anyone to pay for the book.

    knowing this, full well, perhaps our society
    will create a way to provide compensation
    for a book depending on how many times
    its pages are viewed, in the same way that
    musicians receive money depending upon
    how often a song gets played on the radio.

    thinking about it, that’s somewhat similar
    to our current system of _libraries_, where
    lots of people can read a book “for free”,
    but a degree of offsetting compensation
    results because all of those local libraries
    have to _buy_ a copy of the book to lend.

    so society is already paying a certain price
    so that people can read books “for free”…
    think of this as a logical extension of that.

    so this kind of arrangement would be good.

    but it is still beside the point, which is that
    in order to be salient in the world-at-large,
    a book will have to be fully exposed online.

    there’s no other option. it needs to be said.

    a book that is not online will be like that tree
    that falls when there’s no one there to hear it.

    compared to the books which are fully online,
    a book which is only _partly_ online (let alone
    completely absent) simply won’t get traction.

    sooner or later, if you even try to sell such a
    “pig in a poke”, people will just laugh at you.
    the only way to sell it will be if it’s fully online.

    furthermore, if you are “a book”, you want
    this online exposition of yourself to be the
    _canonical_ version, meaning that all links
    to this book point to it, and nowhere else…

    that way, all additions, deletions, corrections,
    annotations, and such are done _right_there_,
    and you don’t have to track down all the copies
    that have been dispersed out into the universe.

    (indeed, all of those copies would “check back”
    with the canonical site to update themselves.)

    so listen up, publishers, this is what you need
    to know in order to proceed into your future…

    you cannot lock up books. you’ll fail if you try.
    d.r.m. is not just silly, not just unworkable, it is
    the exact opposite of what you should be doing.

    instead of trying to thwart the ease of copying
    in the digital arena, make it work _for_ you, by
    putting a book online, where you will control it.

    this is what tomorrow’s cyberlibrary will be like
    — a canonical version of every book, online, on
    the same site, where everyone can link to it, and
    the books even interlink between themselves…


  2. the above comment was posted _before_ the surprise
    settlement between google and the authors/publishers.