Documents In-Stream

In “ODF enters the Semantic Web,” a post on his blog An Antic Disposition, Rob Weir writes of the challenge of encoding the wide range of possible metadata describing a document in a semantically meaningful fashion within the constraints of a schema.

What if I wanted to annotate an academic paper, a work-in-progress, to mark which quotations have been verified and which ones remain to be verified? Or what if I want to annotate statements in testimony according to which statements contradict or corroborate another witness’s statements? This goes far beyond pattern matching. I need a way to encode my knowledge, my view of the subject, in the document.

This is a critical challenge for any analysis of a document-rendered proposition – this capacity must be present to facilitate a colloquy, and it is equally vital for the simple recording of one’s own cognitive digestion, as Weir notes.

With this challenge perceived and well-stated, Weir then leaps into a place that many of us have been speculating about: the realization that our understanding of, and our need for, the “document” has been dissolving as our presentation of thoughts and our speculations become ever more conversational, embedded in a longer, evolving, multi-threaded, perambulating discourse.

We have data in a document — “Words, words, words” as Hamlet tells Polonius. But for those who work with thoughts, the present constraints of encoding our knowledge as simple linear strings of Unicode characters is severe. In general text is multi-layered and hyper-linked in strange and marvelous ways. Your father’s word processor and word processor format are inadequate to the task. The concept of a document as being a single storage of data that lives in a single place, entire, self-contained and complete is nearing an end. A document is a stream, a thread in space and time, connected to other documents, containing other documents, contained in other documents, in multiple layers of meaning and in multiple dimensions. What we call a traditional document is really just a snapshot in time and space, a projection into print-ready output form, of what documents will soon become.

This edges into the neighborhood of Stowe Boyd’s recent thoughts on whatever the heck the third coming of the web might be like (thoughts that also caught the eye of Tim O’Reilly). Stowe muses:

Personally, I feel the vague lineaments of something beyond Web 2.0, and they involve some fairly radical steps. Imagine a Web without browsers. Imagine breaking completely away from the document metaphor, or a true blurring of application and information. That’s what Web 3.0 will be, but I bet we will call it something else.

In-stream documents will wind up reshaping what gets written, how it gets written, and how we read. In our struggling future, as we endeavor to lower the bar for dialogue with others, and as we re-stake expectations for how we judge and respect those who take part, more and more voices will be heard.

Bring on the cacophony.