The Analog Hole: Another Argument Against DRM

Digital rights management (DRM) might be unpopular with the public and
plagued with social and technical challenges, but at least it’s a guarantee that digital books can’t be pirated — right?

Not so fast. Experienced computer crackers will find weaknesses in any encryption scheme, but regular folks with basic computer
skills can exploit the one weakness found in all DRM’ed media: the analog hole.

What is the Analog Hole?

The “analog hole” reflects a basic principle of physics: before humans
can consume any digital media, the ones and zeroes that computers
understand must be converted into an analog format
that our senses can perceive. For music, it’s sound waves; for
video and for digital books, it’s patterns of light.

If you’ve ever visited a major city, you’ve probably
seen the analog hole in action: street vendors selling pirated
copies of popular movies, often months before they’re officially
released on DVD. Most of these are “cam” films, shot in real
movie theaters using camcorders. Even without access to a
physical copy of the film, pirates are able to capture its analog
expression: the sound and pictures as perceived by a theater-goer.

In music, the analog hole is often used to get around software
preventing digital copying. A user simply plays the desired song
on their computer using the legal DRM-enabled software, and records
the audio coming out of their computer. Now they have a copy of
the sound recording, which can be re-imported into the computer
and digitally encoded, with the original DRM stripped out. (A
similar principle is at work when DRM systems go defunct and
users are told to pirate their own music, although the industry
uses the euphemism “making a backup.”)

Film and music companies are painfully aware of the analog hole
and have taken steps to close it, either by
monitoring patron behavior (as in movie theaters) or by
petitioning to legally limit the recording features of consumer electronics.

Because reading is a visual experience, digital books are open to
an analog hole exploit as well. Unlike camcorder copies or
re-burned MP3s, a copy of a book's text can be made with no
loss in quality. And with a little ingenuity, the process can
be completely automatic.

One example: Ebooks and Optical Character Recognition (OCR)

Here’s a sample digital book as displayed in Adobe Digital Editions. (This
book is public domain and isn’t technically covered by DRM, but the principle
is exactly the same.)


I hid as much of the Digital Editions menus as I could and took a screenshot of this first page of Pride and Prejudice.

Next I downloaded some free optical character recognition (OCR) software. OCR
programs can “read” images and output the words in them as plain text. OCR is a normal
part of digitization projects, in which archival printed material is first scanned
and its text is automatically extracted. At the consumer level, OCR software is often bundled with commercial scanners and fax machines.

I took my screenshot and fed it to the OCR software. Here’s what I got without any special fine-tuning or spell-checking. Note
that all typos are from the OCR software.

Chapter 1

It is a truth universally acknowledged, that a single man in
possession ofa large fortune must be in want of a wife, However little
known the feelings or views of such a man may be on his first entering
a neighbourhood, this truth is so well fixed in the minds of the
surrounding families, that he is considered the rightful property of
someone or other of their daughters.
“My dear Mr. Bennet,” said his lady to him one day, “have you heard
that Netherfield Park is let at last?”
Mr. Bennet replied that he had not.

…and on through the entire first page. This output was in HTML, ready to be
posted to the Web for anyone to read.
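
For a rough sense of how close that output is to the original, here is a minimal Python sketch that compares the first two OCR'd sentences against a hand-corrected copy using the standard library's difflib module. The corrected copy and the helper names char_accuracy and list_differences are my own assumptions for illustration; they aren't part of any OCR package, and the similarity ratio is only one of several ways to score OCR output.

```python
import difflib

# The first two sentences exactly as the OCR software produced them above.
ocr_text = (
    "It is a truth universally acknowledged, that a single man in "
    "possession ofa large fortune must be in want of a wife, However little "
    "known the feelings or views of such a man may be on his first entering "
    "a neighbourhood"
)

# Assumption: a hand-corrected copy of the same excerpt, fixing only the two
# visible OCR slips ("ofa" and the comma after "wife"). It has not been
# checked against any particular edition of the novel.
reference_text = (
    "It is a truth universally acknowledged, that a single man in "
    "possession of a large fortune must be in want of a wife. However little "
    "known the feelings or views of such a man may be on his first entering "
    "a neighbourhood"
)


def char_accuracy(ocr: str, reference: str) -> float:
    """Return a character-level similarity ratio between 0.0 and 1.0."""
    matcher = difflib.SequenceMatcher(None, reference, ocr, autojunk=False)
    return matcher.ratio()


def list_differences(ocr: str, reference: str) -> list[tuple[str, str, str]]:
    """Return (operation, reference_piece, ocr_piece) for each mismatch."""
    matcher = difflib.SequenceMatcher(None, reference, ocr, autojunk=False)
    return [
        (tag, reference[i1:i2], ocr[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    print(f"Similarity: {char_accuracy(ocr_text, reference_text):.3f}")
    for tag, expected, got in list_differences(ocr_text, reference_text):
        print(f"{tag}: expected {expected!r}, OCR gave {got!r}")
```

On this excerpt the only mismatches are a dropped space and a comma where a period belongs, the kind of slip a spell-checker or a quick proofread would catch.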

The OCR isn’t 100 percent accurate, of course, but neither are the widely available
pirated ebooks created by laborious scanning of physical books, page
after page. I was also using free software that requires careful fine-tuning to get working optimally; commercial OCR software is much better, especially when combined with

It wouldn’t be difficult to automate the process of advancing one page in Digital Editions, taking a screenshot, and passing that on to my OCR software. Once the workflow was in place, I could strip hundreds or thousands of books of their DRM in a matter of minutes.

Another Possibility: Speech Recognition

My local library is kind enough to allow me to check out digital audiobooks. Naturally,
they’re also secured with DRM (so much so that I can’t actually play them, as they require Windows Media Player and I have only Mac and Linux computers). But assuming I could play them, I’d have available to me a nice stream of professionally-produced audio.

You’re using speech recognition software every time you call a customer service line and an
automated voice prompts you to speak your credit card number. If that’s happened
to you, you also know that speech recognition isn’t 100 percent accurate yet, but under
certain conditions it can be quite good. Automatic speech-to-text transcription
isn’t nearly as far along as optical character recognition, but it’s another
analog hole exploit that will eventually become trivial to perform.

Does This Mean Publishers Shouldn’t Produce Ebooks or Audiobooks?

No! What I hope to convey is that DRM is not a true safeguard against ebook piracy.
(It is, however, a known deterrent to ebook adoption.) I’ve heard a lot of passing the buck on DRM: publishers claim authors want it, booksellers claim publishers
insist on it. These days it’s hard to find someone to publicly state that they’re
actually for it.

I think of DRM like this: years ago my apartment was broken into and I
called a locksmith to replace the door. My landlord had authorized me to
get “the best lock possible,” and the locksmith obliged with a four-foot steel
bolt. It was almost too heavy to turn but made a very satisfying noise when it snapped shut.

I asked the locksmith, “Is this really unbreakable?”

“The lock is, sure.” He slapped the door frame. “But this is made out of
wood. If I really wanted to get in I’d just kick out the door. That’s why I’m honest
about what I sell.” When I looked
puzzled he handed me his business card.
It contained his name, phone number, and company slogan: “A feeling of security.”

Authors and publishers should be compensated for their talent and their hard work,
and the desire for DRM is understandable. Book lovers, too, want their
favorite authors to succeed. But digital books are a form of technology as
much as they are literature, and technologies that are successful adapt to people’s
needs. Is just a “feeling” of security worth the ire of good customers who
want to read their books wherever and however they like?

