Recommended Reading on XML and Publishing

StartWithXML: Why and How

While clearing out some old files, I came across a folder of articles I collected about three years ago, while I was building the case for increasing our use of XML for book production. If you're looking to take a break from the steady stream of terrifying financial news, here are a few hours of time well spent on angle brackets. Much of this skews fairly technical (including actual math), but it offers useful context for any conversation about XML:

  • When Word-to-XML conversions get nasty from Mike Gross at Data Conversion Laboratory. "Before you begin a conversion, look through your source Word documents to see how well they were formatted but be prepared you may be horrified with what you find."

  • From the Journal of Digital Information, a paper by Terje Hillesund, Many Outputs -- Many Inputs: XML for Publishers and E-book Designers. Terje takes a contrarian view on XML, though he specifically calls out what many trade publishers primarily deal with as well suited to XML: "For many typographically simple genres, like most present fiction, reuse has already proved to be relatively easy ... In the future, XML-based workflows will make re-use of many fiction genres even easier, as these visually and navigationally uncomplicated texts can be made into a variety of paper and electronic editions from the same XML document by use of style sheets ..."

  • The response to Hillesund from XML guru Norm Walsh, XML: One Input -- Many Outputs: A response to Hillesund. "Before considering the flaws in each of [his] arguments, it is interesting, if slightly incongruous to his arguments, to note that Hillesund's paper includes no less than four examples of the successful use of XML precisely for the publication of multiple output formats from a single input document."

  • A fascinating paper from 1998, On the Pagination of Complex Documents, which discusses the challenges inherent in automated pagination of the kind found in many XML-based rendering systems (as well as older systems such as LaTeX). "Using competitive analysis we show that, under realistic assumptions, not only first-fit but any online pagination algorithm may produce results that are arbitrarily worse than necessary. This explains why so many people are not satisfied with paginations produced by LaTeX if no manual improvement is done."

  • It hasn't been updated since 2005, but Choosing an XML editor, from Thijs van den Broek, offers a nice survey of XML editors. "The study consisted of a literature search, surveys to identify user needs, current usage, existing editors, and (existing and desired) features of editors, as well as an evaluation exercise."

  • Here at O'Reilly our workflow is centered around DocBook XML, but DITA (Darwin Information Typing Architecture) is a more recent XML vocabulary, also designed primarily for technical information. IBM developerWorks has a nice overview, Introduction to the Darwin Information Typing Architecture. "This document is a roadmap for the Darwin Information Typing Architecture: what it is and how it applies to technical documentation. It is also a product of the architecture, having been written entirely in XML and produced using the principles described here."

  • Written from the perspective of a technical documentation group at Cisco, Low-Cost, Flat-File XML for the Masses is an interesting case study from a team committed to finding a way to use XML that was both better for writers and didn't require a large investment in new software: "You can realize the benefits of publishing from modularized XML, without the expense of an enterprise publishing system, by implementing the authoring environment on top of nothing more than your operating system's file system. Although this environment is not adequate for enterprise publishing needs, it is more than adequate for the needs small writing teams, businesses with a limited number of related products, proof-of-concept demonstrations, and even home users."

3 Comments


bowerbird said:
October 28, 2008 3:58 AM

do you sincerely recommend _reading_ those articles?

honestly? that's the best you can do?

they're all very old. not just old, but old and outdated.

surely you can do better than that...

-bowerbird

Bowerbird:

I have found much of the StartWithXML discussions to be surprisingly unaware of the results of the extensive thinking on architectural issues that XML people have done (and applied to solve real problems at real publishers) for over ten years. On the one hand, it's kind of neat to see a new generation discover the benefits of XML in publishing; on the other hand, by ignoring the past, they're condemned to repeat it, so they're slowly reinventing several wheels.

Reading some older literature by people like Norm is the best remedy for this.

Bob DuCharme

bowerbird said:
October 30, 2008 2:44 AM

bob-

why not direct people to recent material
covering older issues _and_ newer ones?

pointing them at outdated articles simply
_ensures_ that they won't be up to speed.

at the very least, have some discussions
that point out where the outdated stuff
would steer them wrong, and reveal why.

what i'd really like to see, however, are
some performance-based _challenges_,
where the costs and benefits of an x.m.l.
approach are compared with those of a
simpler methodology. i'd be most happy
to represent the "simpler" side in a bout.

-bowerbird
