
World Wide Lexicon Toolbar changes the reading experience for the other 99% of web pages

Brian McConnell’s latest coding effort, World Wide Lexicon Toolbar,
meets my criterion for a piece of critical infrastructure: after two
days with it I can’t get along without it, and I plan to avoid any
browser that doesn’t have it installed.

Brian is a highly adaptive programmer. With roots in the telecom
industry and several start-ups on his resume, he also wrote Beyond
Contact: A Guide to SETI and Communicating with Alien Civilizations
for O’Reilly. The World Wide Lexicon project he’s been working on for
the past several years is again something totally different.


Install the add-on (currently experimental) in Firefox 3.5 or higher
and visit a page in some language other than your default. Before your
eyes, headings and text change into your native language. You can get
similar effects by submitting the page to a popular translator such as
Google (which is one of the tools used behind the scenes by the WWL
toolbar), but the instantaneous effect of the toolbar makes you feel
closer to the people whose sites you visit around the world.

There are several languages that I know well enough to get the gist of
a page, but where I miss some of the details and get frustrated by
gaps in my vocabulary. Therefore, I set the WWL toolbar to “Bilingual
view,” so each block element of the original text is shown together
with its translation. The bilingual view is considerably less
attractive, because it swells the size of each block element, but I
can tell already that it will improve my language skills quickly.

WWL is designed for volunteer translations. If it becomes more
popular, people will submit translations that are much more accurate
than the machine-generated ones the WWL must fall back on currently.

What’s the process behind this new dimension to web browsing?
McConnell let me in on some of the magic.

Volunteer translations

McConnell invented WWL several years ago with the core notion of
encouraging people to translate web pages they thought should get a
wider audience. When he first told me about the idea, I was skeptical
that he would get many volunteers. But then I heard of other volunteer
translation efforts. For instance, there’s a whole subculture of
people who write subtitles for popular Hollywood films. These subtitles
run afoul of copyright law, of course (and so, probably, do the copies
of the movies they’re attached to), but they show the lengths to which
crowdsourcing has progressed in the translation area.


FLOSS Manuals, a project I do volunteer work for, also finds dozens of
people willing to translate its open source documentation.

McConnell’s first set of tools was designed to facilitate on-the-fly
translations. Web designers could enhance their web sites by
downloading from the WWL site some JavaScript that made each text
element on the page editable. (I blogged about this in December
2007.) The paste-in displayed a little pencil
icon, signaling to viewers that they could do instant
translations. All they would have to do was click on an element, and a
text box would pop up where they could enter their translation. The
web site would then register the translation with the central WWL
site.

World Wide Lexicon API

The WWL API covers the entire life cycle of a translation: registering
a translation, rating translations for quality, searching for a
translation of a particular page into a particular language, and
retrieving a translation. Queries can specify a minimum rating.
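
To make that life cycle concrete, here is a minimal in-memory sketch in
Python. It is not the WWL API itself: the class name, method names, and
the 0-to-5 scoring are all assumptions made for illustration, and a real
client would talk to the WWL service over the network instead.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Translation:
    url: str
    target_lang: str
    text: str
    scores: list = field(default_factory=list)

    @property
    def rating(self) -> float:
        # Assumption: an unrated translation counts as 0.0.
        return mean(self.scores) if self.scores else 0.0


class TranslationStore:
    """Hypothetical stand-in for the central translation service."""

    def __init__(self):
        self._items = []

    def register(self, url, target_lang, text):
        """Record a new translation of a page."""
        item = Translation(url, target_lang, text)
        self._items.append(item)
        return item

    def rate(self, item, score):
        """Attach one reader's quality score to a translation."""
        item.scores.append(score)

    def search(self, url, target_lang, min_rating=0.0):
        """Find translations of a page into a language, best rated first."""
        hits = [
            t for t in self._items
            if t.url == url
            and t.target_lang == target_lang
            and t.rating >= min_rating
        ]
        return sorted(hits, key=lambda t: t.rating, reverse=True)

    def retrieve(self, url, target_lang, min_rating=0.0):
        """Return the text of the best match, or None if there is none."""
        hits = self.search(url, target_lang, min_rating)
        return hits[0].text if hits else None
```

A caller would register a translation, let readers rate it, and then
ask for the best translation of a page at or above some minimum rating.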

Toolbar

The latest achievement of the WWL project is the toolbar officially
released yesterday. It determines the user’s native language through
settings in the browser. When each page is visited, the toolbar uses
the domain name and various tests on the text to make a guess about
its language.
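
I don't know exactly which tests the toolbar runs, but the general idea
of combining a domain-name hint with evidence from the text can be
sketched as follows. The word lists, the weight given to the top-level
domain, and the function names are all assumptions chosen for
illustration.

```python
from urllib.parse import urlparse

# Hypothetical hints: a country-code domain suggests a language.
TLD_HINTS = {"fr": "fr", "de": "de", "es": "es"}

# Tiny illustrative lists of very common words per language.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is"},
    "fr": {"le", "la", "les", "et", "est"},
    "de": {"der", "die", "das", "und", "ist"},
    "es": {"el", "los", "las", "y", "es"},
}


def guess_language(url, text):
    """Guess a page's language code, or return None if there is no evidence."""
    words = text.lower().split()
    scores = {
        lang: sum(1 for w in words if w in stop)
        for lang, stop in STOPWORDS.items()
    }
    host = urlparse(url).hostname or ""
    hint = TLD_HINTS.get(host.rsplit(".", 1)[-1])
    if hint in scores:
        scores[hint] += 2  # assumed weight for the domain-name hint
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def needs_translation(url, text, user_lang):
    """Translate only when the guess differs from the user's language."""
    guess = guess_language(url, text)
    return guess is not None and guess != user_lang
```

Here `user_lang` stands in for the language the toolbar reads from the
browser settings.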

The toolbar then issues an API query to see whether any human
translations exist. If so, it displays the translations with a light
yellow or green background.

If no one has made a human translation (which is usually the case so
far), the toolbar resorts to well-known machine translation
services. It can make use of Google Translate, Apertium, and Moses,
each of which offers an API, and will also query Babelfish when its
API is ready. Machine translations are displayed with a light blue
or grey background.
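
The human-first, machine-second fallback can be sketched in a few lines
of Python. This is an illustration under assumptions, not the toolbar's
code: the engines are passed in as plain functions, they are tried in
list order (I don't know how the toolbar actually picks among them), and
I picked one colour from each pair mentioned above.

```python
HUMAN_BACKGROUND = "lightyellow"   # human translations
MACHINE_BACKGROUND = "lightblue"   # machine translations


def translate_block(text, lookup_human, machine_engines):
    """Return (translated_text, source_name, background_colour).

    lookup_human: function returning a human translation or None.
    machine_engines: list of (name, function) pairs tried in order.
    """
    human = lookup_human(text)
    if human is not None:
        return human, "human", HUMAN_BACKGROUND

    for name, engine in machine_engines:
        try:
            result = engine(text)
        except Exception:
            continue  # this engine failed; try the next one
        if result:
            return result, name, MACHINE_BACKGROUND

    # Nothing worked: leave the original text unstyled.
    return text, "untranslated", None
```

The background colour travels with the text, so a reader can always
tell whether a person or a program produced what they are reading.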

The progressive translation used by the toolbar is also interesting.
It starts with the first 10 or 20 elements, then translates heading
tags (<H1>, etc.), then the larger texts, and ultimately every element
on a page. (I displayed one page that embedded a Google ad, and the
translator recognized and translated that text too.) McConnell is
working on making the various translations run in parallel. Because
translation changes the sizes of elements, the toolbar makes various
accommodations to display the page as attractively as it can.
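
The ordering described above can be expressed as a small scheduling
function. This is a sketch under assumptions: elements are modelled as
(tag, text) pairs in document order, "larger texts" is read as "longest
text first," and the function name and the default of 10 are mine.

```python
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def translation_order(elements, first_n=10):
    """Return element indices in the order they should be translated.

    elements: list of (tag, text) pairs in document order.
    """
    # 1. The first few elements on the page, in document order.
    head = list(range(min(first_n, len(elements))))
    rest = range(len(head), len(elements))

    # 2. Heading tags among the remaining elements.
    headings = [i for i in rest if elements[i][0].lower() in HEADING_TAGS]

    # 3. Everything else, longest text first, so that every element
    #    is eventually covered.
    others = [i for i in rest if elements[i][0].lower() not in HEADING_TAGS]
    others.sort(key=lambda i: len(elements[i][1]), reverse=True)

    return head + headings + others
```

Every index appears exactly once in the result, so the schedule ends up
covering the whole page.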

In short, WWL is a cool combination of mash-ups, existing services,
crowdsourcing, and Ajax. I’m sure that in a year’s time I’ll think
back to its appearance today and be shocked at how primitive it was.
But it will remain a transformative tool for me.

  • http://cefaleias.com.br/enxaqueca Enxaqueca

    This is superb! Can’t wait to try…

  • Bruce

    Excellent news. It’s great to see this progression of the WWL project.

  • bowerbird

    it’s very very rare indeed these days that
    something impresses me with the smooth
    — a combination of radical, clever, useful,
    thoughtful, challenging, and cool — _but_
    this project most definitely has the smooth.

    brian mcconnell, get ready to receive your
    macarthur, that’s what the smooth can do.

    -bowerbird

  • http://www.right-hand-drive-jeep.com/ Ajeet

    Despite its obvious benefits, it will still face a steep challenge in becoming popular. I do not see this becoming a standard, by any standards.

  • http://twocroissants.wordpress.com Bertil Hatt

    I might have missed something, but doesn’t Google Translate offer a similar feature? It’s not a standalone browser, you have to go through their website, etc., but you can navigate with it.

  • Avi

    For those of us who are fluent in multiple languages, it would be nice to have the ability to specify a list of languages not to translate.

  • http://unhammer.wordpress.com Kevin

    @avi: post a review on the Firefox addons page ;-)

    @Bertil Hatt: Google only offers translation through Google translate. The WWL offers through any engine which has an API (eg. Apertium, which supports many of the minority languages not supported by Google). Additionally, the WWL shows user-created translations! Thus users can easily machine-translate a web page, and then some of them might fix a typo here or a grammar error there, just as if the whole Web were a wiki! Also, these translations will be available through the WWL API for the creators of machine translation systems, providing more data for improvement (especially important for free and open source projects like Apertium and Moses, which rely on freely available translational data and can’t (easily) make deals with publishers etc…)

  • http://meedan.net Chris Blow

    WWL is quite useful if you are running an Open Source multi-lingual news site like Meedan: http://meedan.net

    I work for Meedan; for us the advantage is that you can more easily operate a translation workflow. It’s quite a beautiful system IMO given that it is totally Open and distributes the translation process across multiple servers using a sensible API. The point is that we can share translations more easily than with a centralized server, and because it parses the front end paragraph by paragraph, the sources you are translating can rewrite parts of stories without losing your translation work.

    For application UI translation, we still manage private translations, but for us having an API for our news translation team to work is really great. If you use WWL you can get all of our translations (and translation metadata, including translation ratings) for free.