Translation of the web is all the rage these days. When you think about it, it’s kind of strange that the rage didn’t happen sooner, but that probably has lots to do with technical aspects of machine translation that I can’t begin to imagine.

The fact remains that, over the past year or so, all kinds of translation projects have emerged into public view in many shapes and sizes. A recent New York Times article, in the Business section of all places, catalogued some of them:

  • Global Voices Lingua Project, where volunteers translate blog posts into many different languages
  • Google Translate, which can now live directly in your email through a labs feature
  • Yamli.com, which allows a non-Arabic speaker like me to type in a word as it sounds in Latin letters and, if I can read the results, which I sometimes can, choose the Arabic search term. Results are displayed side-by-side in English and Arabic. (Update: Oops, the Times did not mention Yamli. I added it.)
  • TED Open Translation Project (which is based on the awesome dotsub, which the Times neglected to mention), and
  • Meedan.net, a new site that seeks to use translation as a bridge for dialogue between English and Arabic speakers about the Middle East and other issues 

In a follow-up post to the Times article, Ethan Zuckerman, a Global Voices founder and the person who coined the term “polyglot internet,” mentions some translation initiatives on the edge, including Worldwide Lexicon and Pootle, a translation management system, as well as the upcoming Open Translation Tools conference in Amsterdam next month, which SMEX will also attend.

A Recent Focus on Arabic Translation
What has been particularly interesting and exciting for us at Social Media Exchange is the focus on Arabic we’ve seen, both with the emergence of sites like Meedan and the attention of companies like Google, Twitter, and Facebook, the last of which released an Arabic-language interface in March. But still, as a recent Meedan post notes, the Arabic Web is struggling.

This is partly because of a lack of translation into Arabic, to be sure. It is also due to the fact, as mediaoriente writes, that

All the qualified training programmes -particularly in media field, but also in science, engineer, etc- all across the Arab world are.. in English! Go visit all the most important universities in the Arab world and you will see that the majority is offering courses and training in English. It’s not by chance that many foreigners that want to learn the Arabic language go to Syria.

But there are other factors, too, a few of which SMEX runs into quite often in translating our training materials, including the challenge of regional dialects. As the same Arabcrunch post that reported on the Arabic launch of Facebook says:

Arabic is the official language of more than 20 different countries including Palestine, Iraq, Algeria, Egypt, Saudi Arabia and Yemen, but each country has a different dialect, in which not every person from a certain Arabic dialect understand all others, and many times a certain word means something else in another Arabic dialect. So facebook used the official Arabic called Al Fusha ( الفصحى) – the Arabic dialect of Quran – which is the standard language used in media, publishing and other formal and religious circles.

Last year, when we talked with the founders of DigiActive about translating their Introduction to Facebook Activism (pdf), which we went on to do, one of them pointed out that even in Fusha, not all terminology relating to the Internet is commonly understood. We started playing around with ideas for posting a translation and letting people from various regions comment on the terminology, not just for tech terms but even for words like “activism,” where usage seems to diverge.

The Case of Lebanon
In Lebanon, where many people are functional in three languages, Arabic, English, and French, and where we do most of our training, things get even trickier. Many Lebanese spend their entire educational lives in a second language, English or French. While they, of course, speak the Lebanese dialect perfectly, many feel more comfortable writing and expressing themselves in their adopted language. And, as mediaoriente mentioned, the university and the workplace often demand this.

At a training last year in Jbeil (Byblos), about 45km north of Beirut, we had prepared Arabic-language handouts for our intro presentation on the Social Internet for Social Change. Prior experience doing training outside Beirut had told us that our audience outside the capital was less fluent in English. We presented the training in the Lebanese dialect, and that was fine. But when we distributed the handouts, we were met with a gasp of “shu haida?” (what’s this?) and were asked to provide the training materials in English.

Arabization of internet terms can feel almost as foreign to Arabic speakers, it seems, as to those of us who don’t know the language at all. Even many translators we work with don’t know the Arabic words for technology terms and have done extensive research to come up with the right expressions for things like tagging, social bookmarking, and RSS.

Fortunately, in Jbeil we were able to provide the materials in English, and we ended up giving a kind of hybrid presentation, with the spoken part in Lebanese and the written part in English. When the participants posted to the blog they created, it was in English, too. (I should note that this isn’t the case all over Lebanon. In other areas, French is more prominent, and in some areas, Arabic is just fine, as long as tech terms are still in English.) I do not know from experience, but I imagine the situation is similar in other Arab countries where Western colonial nations held power at some point. I’d love to hear comments.

Where are the Arabic Web fonts?
Another issue is Arabic fonts on the web. I’m wholly unqualified to lead this discussion, but I have heard many Arabic speakers and designers and coders lament the lack of a range of Arabic fonts on the web, in complete character sets. Monofont content must be less fun to publish, at the very least, inhibiting the production of Arabic-language content. I hope someone will say more about this, so that I can also understand it better. And another thing: Arabic speakers in our office say that the Arabic font in which most web content appears is hard to read, and they almost always have to enlarge it to read anything of substance. Imagine Western adoption rates if we had to do this in English, Spanish, French, or German.

Where to from here?
For the time being, SMEX will continue to translate into the written Modern Standard Arabic (Fusha) as we did last year with our materials and a video we produced. Increasingly, we’re considering producing video tutorials in Lebanese. For our online training-of-trainers course, which launches this Saturday, we’re also going to try to crowdsource an Arabic-English tech glossary with our training participants. We’ll report back on how all that goes.

As great as all the translation initiatives are (and I mean that; it is amazing to watch things opening up because of these tools), somehow I’m still not satisfied. I’d like to propose that we think of translation not as an end in itself but more as a catalyst for the increased production of original Arabic-language content on the Web. Translated text still holds a lot of the culture of the original language in it, to paraphrase mediaoriente. For equal exchange to take place, one culture or language can’t have primacy.

I’m looking forward to the day when I get to read vast tracts of original Arabic Web content translated into English. Although depending on how long it takes, by then I might already be fluent in Arabic—but I seriously doubt it.