XML

XML (short for "eXtensible Markup Language") defines a set of standard rules for creating markup languages. A markup language is a mechanism for labeling content, typically text, primarily so that it can be understood and processed by software. One common markup language you've probably seen before is HTML (HyperText Markup Language), which looks like this:

<p>I <b>really</b> dislike <a href="http://toc.oreilly.com/resources/drm.html">DRM</a>.</p>

When viewed as part of a Web page, that markup looks like this:

I really dislike DRM.

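The words inside the angle brackets (p, b, and a) are tag names. Each start tag, its matching end tag, and the content between them together make up an "element".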
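
To make the idea of elements concrete, here is a minimal sketch in Python that reads the paragraph above with the standard library's xml.etree.ElementTree module. It relies on the fact that this particular HTML snippet also happens to be well-formed XML; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The HTML paragraph shown earlier; it is also well-formed XML.
snippet = (
    '<p>I <b>really</b> dislike '
    '<a href="http://toc.oreilly.com/resources/drm.html">DRM</a>.</p>'
)

root = ET.fromstring(snippet)

# Walk the tree in document order and collect each element's name.
element_names = [element.tag for element in root.iter()]

# Attributes sit on the start tag and are read like dictionary entries.
link_target = root.find("a").get("href")

# Dropping the markup leaves only the text a browser would display.
visible_text = "".join(root.itertext())

print(element_names)
print(link_target)
print(visible_text)
```

The element names come back in the order the elements open in the document, and the joined text matches what the Web page displays.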

There is a wide variety of content to describe with a markup language, but there's a lot of benefit to using a common type of markup language, regardless of the content type. For example, to describe a document meant for the Web, you'd want labels for things like "paragraph" (<p>), "heading" (<h1>), and "ordered list" (<ol>).

But in the case of corporate financial data, you'd want to describe things like "inventory" (<ifrs-gp:Inventories>) and "total current assets" (<ifrs-gp:AssetsCurrentTotal>). By using XML to define the markup, even though the content is quite different, a common set of tools and techniques can be used for creating and processing both types of content.
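
As a rough illustration of that "common set of tools" point, the sketch below runs one small Python function over both kinds of content. The sample documents, the figures in them, the <report> wrapper, and the namespace URI bound to the ifrs-gp prefix are all placeholders made up for this example, not real IFRS data or identifiers.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(xml_text):
    """Return how many times each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


# A tiny Web-style document (made up for illustration).
web_page = "<body><h1>Notes</h1><p>One</p><p>Two</p><ol><li>A</li></ol></body>"

# A tiny financial document. The wrapper element, the numbers, and the
# namespace URI are placeholders, not real IFRS values.
financial = (
    '<report xmlns:ifrs-gp="http://example.com/ifrs-gp">'
    "<ifrs-gp:Inventories>1200</ifrs-gp:Inventories>"
    "<ifrs-gp:AssetsCurrentTotal>5400</ifrs-gp:AssetsCurrentTotal>"
    "</report>"
)

# The same function handles both, because both follow the rules of XML.
web_counts = count_elements(web_page)
financial_counts = count_elements(financial)

print(web_counts)
print(financial_counts)
```

Note that ElementTree reports prefixed names in expanded form, with the namespace URI in curly braces in front of the local name.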

Although few people work directly with XML regularly, almost everyone using the Web actually uses XML every day. For example, the feeds used to track blog updates (in formats such as RSS and Atom) are XML documents. Here's a snippet of the TOC feed, which uses the Atom format:

<feed xmlns="http://www.w3.org/2005/Atom">
<title>Tools of Change for Publishing</title>
<link rel="alternate" type="text/html" href="http://toc.oreilly.com/" />
<link rel="self" type="application/atom+xml" href="http://toc.oreilly.com/atom.xml" />
<id>tag:toc.oreilly.com,2008-01-24://40</id>
<updated>2008-07-18T21:01:03Z</updated>
<subtitle>Tools of Change for Publishing from O&apos;Reilly Media: Technology is transforming publishing. Are you ready for the future? </subtitle>
<generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.1</generator>

<entry>
<title>[TOC Directory] Recent Additions</title>
<link rel="alternate" type="text/html" href="http://toc.oreilly.com/2008/07/toc-directory-recent-additions-2.html" />
<id>tag:toc.oreilly.com,2008://40.25177</id>

<published>2008-07-22T14:30:00Z</published>
<updated>2008-07-18T21:01:03Z</updated>
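
Feed readers process a document like this with an ordinary XML parser. The following is a minimal sketch using Python's standard library, not a description of how any particular feed reader works. It uses a shortened copy of the feed above, and because the excerpt stops partway through the entry, the closing </entry> and </feed> tags are added here as an assumption so that the parser will accept it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the feed excerpt. The closing </entry> and </feed>
# tags are added for this example; the excerpt ends before them.
feed_xml = """<feed xmlns="http://www.w3.org/2005/Atom">
<title>Tools of Change for Publishing</title>
<updated>2008-07-18T21:01:03Z</updated>
<entry>
<title>[TOC Directory] Recent Additions</title>
<id>tag:toc.oreilly.com,2008://40.25177</id>
<published>2008-07-22T14:30:00Z</published>
</entry>
</feed>"""

# Every element in the feed belongs to the Atom namespace declared on
# <feed>, so lookups have to be qualified with that namespace.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

feed = ET.fromstring(feed_xml)
feed_title = feed.findtext("atom:title", namespaces=ATOM)

# Collect (title, published) pairs, one per <entry>.
entries = [
    (
        entry.findtext("atom:title", namespaces=ATOM),
        entry.findtext("atom:published", namespaces=ATOM),
    )
    for entry in feed.findall("atom:entry", ATOM)
]

print(feed_title)
for title, published in entries:
    print(published, title)
```

The "atom" prefix in the lookups is a local choice for this script; only the namespace URI has to match the one in the feed.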

XML is often used as a file storage format (either as the primary format or an alternative) in word processing and desktop publishing software like Word, InDesign, OpenOffice, and even Excel. All use the rules of XML to define their file formats, though each uses its own particular names for its elements and has different rules about how those elements can appear within a document. Indeed, one of the big advantages of XML is the ability to use a standard set of tools for defining the rules of a particular document ("tables are not allowed inside of sidebars" or "every image must have a caption"). Before XML, such rules were often codified in style manuals, or perhaps enforced with custom software such as Word macros or InDesign scripts.
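
To give a feel for what checking such rules means, here is a small hand-written sketch in Python. In practice, rules like these are normally declared in a schema language and enforced by a validating tool; this sketch does not do that, it simply walks the document and tests the two example rules directly. The element names (chapter, sidebar, table, figure, image, caption) are hypothetical and not taken from any real file format.

```python
import xml.etree.ElementTree as ET


def check_rules(xml_text):
    """Return a list of rule violations found in the document (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []

    # Rule 1: tables are not allowed inside of sidebars (at any depth).
    for sidebar in root.iter("sidebar"):
        if sidebar.find(".//table") is not None:
            problems.append("table inside sidebar")

    # Rule 2: every image must have a caption. Here a <figure> that
    # contains an <image> child must also contain a <caption> child.
    for figure in root.iter("figure"):
        if figure.find("image") is not None and figure.find("caption") is None:
            problems.append("image without caption")

    return problems


# Hypothetical documents: one that follows both rules, one that breaks both.
good = "<chapter><figure><image src='a.png'/><caption>A</caption></figure></chapter>"
bad = (
    "<chapter>"
    "<sidebar><table/></sidebar>"
    "<figure><image src='b.png'/></figure>"
    "</chapter>"
)

print(check_rules(good))
print(check_rules(bad))
```

The advantage of a shared schema language over a script like this one is that the rules live in one declarative file that any XML-aware tool can apply.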

For more technical information, see this in-depth technical overview of XML, this "Learning XML" course from the O'Reilly School of Technology, or "Learning XML, 2nd Edition".

TOC Stories Referencing XML

Some Tasty Bits from the StartWithXML UK Survey

We've got some raw results from the StartWithXML survey in the UK, and they are very different in some respects from the US survey we did. Some salient points: 48.7% of...

CSS in an XML Workflow

At the StartWithXML Forum in New York in January, Rebecca Goldthwaite of Cengage gave a great demonstration of how Cengage uses CSS in their XML workflow. Many publishers regard style...

StartWithXML is Going to London

StartWithXML will be continuing in London! On September 2nd, at the British Library, we'll be conducting a one-day forum similar to the one we held in New York last January,...

New on O'Reilly Labs: Open Feedback Publishing System

O'Reilly engineer Keith Fahlgren has formally launched our new Open Feedback Publishing System over on O'Reilly Labs: Over the last few years, traditional publishing has been moving closer to the...

Open XML API for O'Reilly Metadata

In addition to Bookworm, O'Reilly Labs now includes an RDF-based API into all of O'Reilly's books: Most publishers are familiar with the ONIX standard for exchanging metadata about books among...

At TOC: Bookworm Online EPUB Reader Now Part of O'Reilly Labs

Update: There are now 400+ shiny DRM-free EPUB books from O'Reilly if you want to give Bookworm a test drive. Much of what's on our complete list with a green...

StartWithXML Research Report Now Available for Sale

If you weren't able to attend the StartWithXML Forum last month in New York, the accompanying research report is available for sale. The report covers topics like: Where am I...

Webcast Video: Essential Tools of an XML Workflow

Below you'll find the full recording from the TOC webcast, "Essential Tools of an XML Workflow," with Laura Dawson....

New York Times Opens "Best Sellers API"

The New York Times on Tuesday opened up its "Best Sellers API," offering programmatic access to best-seller data (going back to 1930!) from the Times: The Times Best Sellers...

Presentations from the StartWithXML Forum

The following slides accompanied many of the presentations during the StartWithXML forum, held Jan. 13, 2009 in New York City. XML--Why Bother? David Young, Hachette Book Group USA As Chairman...

Coverage of StartWithXML

Turns out I was not the only one on Twitter for the StartWithXML Forum on January 13th. Joe Bachana was tweeting as well. Kind of interesting to see the posts...

BeyondPrint Offers Helpful Review of StartWithXML

A review of the StartWithXML forum and research paper supports the effort but questions why we are silent on the quality of XML tools.

Slides from "Essential Tools of an XML Workflow" Webcast

Laura Dawson has made her slides available from the recent TOC Webcast, "Essential Tools of an XML Workflow." A complete recording of the event will be posted here soon. View...

[TOC Webcast] Essential Tools of an XML Workflow

Tools of Change for Publishing, in conjunction with StartWithXML, will host "Essential Tools of an XML Workflow," a free webcast with presenter Laura Dawson, on Thursday, Dec. 11 at...

Webcast Video: What Publishers Need to Know about Digitization

Below you'll find the full recording from the recent TOC Webcast, "What Publishers Need to Know about Digitization," with Liza Daly....

Additional XML Links & Resources

Visit StartWithXML

XML Companies from the TOC Directory
