Text and XML of All #TOC 2009 Tweets

I was planning to do some crunching last night and early today, but between an unexpected flight delay coming back from New York, and the pleasant surprise of getting Slashdotted about Bookworm, the day is quickly slipping away. I’ll give it a go over the weekend, but if anyone else is eager to play, here’s a super-raw text dump (the best I could do for getting around the API limit). Update: to be explicit, this covers roughly mid-afternoon Sunday 2/8 through late morning Thursday 2/12, so it includes the entire event, but not every #toc tweet.

Update #2: Using the raw text as a starting point, I’ve generated an XML file listing all of the people who tweeted with hashtag #toc during the conference, and listed each of their tweets. I’ll leave it as an exercise to the reader 🙂 to sort by time, or otherwise slice/dice (best visualization among those submitted in the comments by 2/24 at midnight EST gets a free pass to TOC 2010 — winner chosen by the TOC program committee, and announced 2/26).
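
For anyone who wants a head start on the sort-by-time exercise, here is a minimal Python sketch. Fair warning: the element and attribute names in it (`tweeter`, `name`, `tweet`, `time`) and the timestamp format are made up for illustration, as is the sample data, so check the actual XML and adjust before running it against the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Hypothetical sample: the real file's element names, attribute names,
# and timestamp format are assumptions and will likely differ.
SAMPLE = """
<tweeters>
  <tweeter name="alice">
    <tweet time="2009-02-10T09:15:00">Keynote starting #toc</tweet>
    <tweet time="2009-02-09T14:30:00">Arrived in New York #toc</tweet>
  </tweeter>
  <tweeter name="bob">
    <tweet time="2009-02-09T16:00:00">Tutorial day #toc</tweet>
  </tweeter>
</tweeters>
"""

def load_tweets(xml_text):
    """Flatten the per-user XML into (timestamp, user, text) tuples."""
    root = ET.fromstring(xml_text)
    rows = []
    for tweeter in root.iter("tweeter"):
        name = tweeter.get("name")
        for tweet in tweeter.iter("tweet"):
            when = datetime.strptime(tweet.get("time"), "%Y-%m-%dT%H:%M:%S")
            rows.append((when, name, (tweet.text or "").strip()))
    return rows

def sort_by_time(rows):
    """Return the tweets in chronological order, oldest first."""
    return sorted(rows, key=lambda row: row[0])

def tweets_per_user(rows):
    """Count how many tweets each user contributed."""
    return Counter(name for _, name, _ in rows)

if __name__ == "__main__":
    rows = sort_by_time(load_tweets(SAMPLE))
    for when, name, text in rows:
        print(when.isoformat(), name, text)
    print(tweets_per_user(rows).most_common())
```

Flattening everything into one list of tuples first keeps the later slicing and dicing simple: sorting, counting per user, or bucketing by hour all work on the same rows.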

Update #3: Unfortunately, the Twitter Search API appears to have returned only the first 15 or so of each user’s #toc tweets (nowhere near enough to include all of the 200+ tweets from the top tweeter, @thewritermama), so that XML doesn’t contain all of the tweets in the plain text. I’ve posted the intermediate XML I used, which contains less data about each tweet and tweeter, but does contain all of the tweets.

Update #4: For anyone interested in the gory details of where the XML came from, I’ve posted some background over at O’Reilly Labs.

