
Can the Author Really Help?

A very experienced former book packager who has moved on to become an industry observer and critic of some note pushed back on my suggestion on Friday that authors could be involved in tagging content for contextual meaning. “Not in this lifetime,” was his comment, and he suggested that copy editors or managing editors might be the more likely candidates to mark what we’re looking for.

Among other things, our critic suggests that the usual consequence of having an author mess with the code is that the next job the packager or publisher has to do is pull out a lot of not-useful clutter.

What are we looking for? We want the biographies within a historical book that are used to introduce the characters. We want the place descriptions from any book — many from novels — that would be of interest to anybody visiting or looking for information about the place. We want to know which of the woodworking projects in a collection are suitable for Christmas, or require minimal tools, or a minimal skill level, so that we can create different collections for different audiences.
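
As a rough sketch of what that last example could look like in practice, suppose the author has marked each woodworking project with a few attributes. The element and attribute names here (`project`, `occasion`, `tools`, `skill`) and the sample book are assumptions made up for illustration, not an established publishing schema. Once the tags exist, pulling together a collection for a given audience is a simple filter:

```python
import xml.etree.ElementTree as ET

# Hypothetical author-tagged markup. The element and attribute names
# are illustrative assumptions, not a real house schema.
BOOK_XML = """
<book title="Weekend Woodworking">
  <project occasion="christmas" tools="minimal" skill="beginner">
    <title>Tree Ornaments</title>
  </project>
  <project occasion="any" tools="full-shop" skill="advanced">
    <title>Rocking Chair</title>
  </project>
  <project occasion="christmas" tools="full-shop" skill="intermediate">
    <title>Toy Train</title>
  </project>
  <project occasion="any" tools="minimal" skill="beginner">
    <title>Cutting Board</title>
  </project>
</book>
"""


def select_projects(xml_text, **criteria):
    """Return the titles of projects whose attributes match every criterion."""
    root = ET.fromstring(xml_text)
    titles = []
    for project in root.iter("project"):
        # Keep the project only if each requested attribute has the requested value.
        if all(project.get(name) == value for name, value in criteria.items()):
            titles.append(project.findtext("title"))
    return titles


# Two different collections built from the same tagged source.
christmas = select_projects(BOOK_XML, occasion="christmas")
easy = select_projects(BOOK_XML, tools="minimal", skill="beginner")
print(christmas)
print(easy)
```

The filter is the easy part. The judgment about which project counts as "beginner" or "Christmas" is the part that has to come from someone who knows the content.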

The author will know all of these things best, usually better than the copy editor. Also better than the acquiring editor. And certainly better than the managing editor.

Extracting the value of the author’s knowledge and developing the tool sets and workflows that make it practical to incorporate that knowledge into the XML document of a book will require a lot of reinvention. In that sense, “not in this lifetime” is an accurate metaphor for when it will happen. Publishing needs some “born again” changes and this is one of them.


Comments: 9

  1. I’d have to disagree with the “not in this lifetime” comment based on what I’ve seen here at O’Reilly. Some authors enthusiastically embrace tagging content and look at it as an opportunity rather than a burden. One such example is the authors of Version Control with Subversion (http://svnbook.red-bean.com/). They write in semantically rich DocBook XML and use that source to create HTML for their web site. We used that same XML to produce a PDF for the book. We’re seeing more and more examples of this kind at O’Reilly. We actively encourage all of our new authors to give DocBook a try, and you’d be surprised to see how often they take to it. When they do express a willingness to work in XML, we do our best to provide support and answer questions so that we can prevent the kind of mess that your critic worries about.

    Sure, some authors just want to write, and as writers of technical documentation, O’Reilly authors are more likely to be willing to use something like XML, but there are plenty of authors who see value in taking the time to accurately tag a manuscript. And I couldn’t agree more that no one is better suited to the task than the author herself.

  2. i see nothing in this post that can over-ride
    the observation from that “very experienced”
    book-packager that author tagging results in
    “a lot of not-useful clutter” that has to be
    removed by someone else later in the workflow.

    it’s as if you waved your hands and expected
    that that alone would make the point disappear.


  3. Whether we’re talking about an author’s writing or tagging, there will always be some cleanup to do. How much cleanup will depend on the author. But the cleanup in the workflow that I describe is not usually so bad that I find myself wishing the author hadn’t bothered tagging at all. And some cleanup of “not useful clutter” is still better (i.e., faster and less expensive) than having to convert the incoming files from whatever word-processing format the author used to whatever page layout format we’re using and then finally to XML.

  4. This is a tough one as I tend to agree with both sides on this point. For many years now I’ve tried to get publishers and authors to adopt standard style sheets to help streamline the production workflow. Getting the content “tagged” up front, whether through character styles, paragraph styles, or XML tags, does simplify the process greatly and helps to improve the overall efficiency and delivery of the final product to market. But at what cost? Do you want to provide training and support to a potentially large author base in the use of new tools? For someone like O’Reilly with tech-savvy technical writers, this is a less daunting task than it might be for others.

    The realization is that every publisher, no matter how similar the content, works differently and their level of comfort with modifying the front-end of their business varies greatly. Microsoft Word is still the foundation for most publishers. Providing a good tool set here can set the foundation for creating structured documents at a granular level.

    While it is true that the author has most of the domain knowledge, it is also true that an experienced copy editor can have this same domain knowledge and can provide the support they need to granularly tag the content that might be useful to an end user. A good publisher would not only look at the context of a book as a whole but also at how it might be able to be leveraged across multiple platforms, devices, or as a separate chunk of data that is integrated with content from another source (custom publishing). This is where the real value of granularly tagging content comes in. You have to get beyond the idea of just using content in the printed form. It potentially has much more value. Sometimes this value can only be exposed with the right amount of upfront tagging to allow adequate reuse.

  5. The idea that authors can’t be bothered with things like indexing is astonishing to me after having seen authors like Jim Elliott throw themselves at the task willingly. Good authors don’t run away from responsibility and just “throw a manuscript over the wall”. What I learned from Jim on the Harnessing Hibernate book is that a dedicated author cares enough about the content to add meaningful metadata, and I can now tell the difference between a book with an author-created index and a book with an index created by a part-time contractor who knows little about the subject area.

    I didn’t add indexing to the Maven book on my own yet, but I will add indexing to the online, free version soon. When I do, that online version will diverge even more from the separate fork that was required for the production process.

  6. tim, nobody said that such authors do not exist.


  7. I’m late to this conversation, but shouldn’t authors be “guided” in this effort? editorial AND marketing (don’t forget to include us marketing people – we might actually be able to help a lot more if we’re included from the getgo – and esp w/ xml tagging) should do the guiding – extract the key needed info for tagging from the authors. of course, cleanup will be necessary, but don’t discount the importance of the author in knowing their audience/market.

  8. @Kat Meyer. You are absolutely right that marketing has a tagging role. What we’re seeing in our research is that they tend to grab the tagging mantle FIRST, before editors. Makes sense, because marketers are in a place where they see the benefits of the tags; their utility. It’s more imaginary for editors. It could be that marketers and editors should be guiding authors. Previous books published in the same vertical niche would also guide authors: house taxonomies will definitely develop over time.

    And you’re not really “late” to the conversation. The whole conversation is still early: in our project and, particularly, across the industry.

  9. Late in the conversation, but frankly, it still continues today. It’s easy to blame the authors, but in reality the form factor and usability of the authoring tools aren’t really where they need to be.

    The industry still relies heavily on MS Word for authoring, and at best people will tag in-line content with character style sheets. Getting a template file out to writers with an agreed-upon vocabulary for style sheets seems like an easy thing to do, but in practice, try getting people to install or use a .DOT file, particularly if they’re not working on a machine that you’ve set up for them.

    True, there are authoring tools that offer better tagging functionality than Word, and there are also 3rd party modules in the MS Office space that can ameliorate the issue. However, if the industry is really going to get to XML-First and have authors tag content at inception, the tools are going to have to come a long way. For now, ‘post-tagging’ will remain the norm — by editors, by marketing, by technologists or service bureau resources that prep files for multi-channel delivery.

    As far as getting authors to do the tagging, I don’t quite agree that they’ll resist, but I can see how some non-digital types might. An incentive remains: if the author can prep the file according to some best practices (@katMeyer), their content will have a longer tail on Web, in Print, on e-reader devices, and beyond. That in itself should help motivate us all in this economy.

    In the end, the author should and must help — AND every participant in the workflow for content preparation should enrich the content accordingly.