
Bash and most shell tools are UTF-8 aware if you use the right locale, so you shouldn't have problems there. But I doubt they are bidi-aware.

The JVM is Unicode-ready. So is Haskell, which benefits from having less text and more punctuation. I've been there with some Clojure code where every identifier was named in Cyrillic. The libraries and syntax are still in English, so it doesn't look right.



Yes, most shells are UTF-8 aware, but terminals are largely not bidi-aware. Bidi, short for bidirectional, is what allows you to mix left-to-right and right-to-left scripts (Arabic, Farsi, Hebrew, Pashto, Urdu, etc.) and read and write both in harmony.
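
To make "bidi" a bit more concrete, here is a minimal Python sketch that looks up the Unicode bidirectional class of each character using the standard `unicodedata` module. The helper names `bidi_classes` and `has_rtl` are made up for illustration. This only classifies characters; it does not implement the reordering that a bidi-aware terminal has to perform.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs, e.g. 'L', 'R', 'AL', 'EN', 'WS'."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

def has_rtl(text):
    """True if any character is strongly right-to-left (Hebrew 'R' or Arabic 'AL')."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)

if __name__ == "__main__":
    # Escapes are used so the source itself contains no RTL text.
    sample = "git \u0633\u0644\u0627\u0645 1"
    for ch, cls in bidi_classes(sample):
        print(repr(ch), cls)
    print("needs bidi handling:", has_rtl(sample))
```

A program that wants to display such a line correctly still has to run the full Unicode Bidirectional Algorithm over these classes, which is the part most terminals skip.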

There are very, very few terminals that do this at all. I read and write Arabic, so I use mlterm. It is quite good and very flexible, but unlike other emulators it requires setup and learning its configuration. I use Mutt to read emails and mailing lists in Arabic, Farsi, and Urdu, and I rarely have problems.

In the Linux console, the situation is abysmal. There was a project that works there, bicon from the Arabeyes project, but it does not handle SIGWINCH signals. This is problematic when I share a tmux session between the Linux console and X.org with my preferred WM, for example. But for running small, mostly console programs with Arabic, bicon is old but can do some jobs, for some value of jobs.
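
For readers who haven't met SIGWINCH: it is the signal a Unix process receives when its terminal is resized, and a program that ignores it keeps drawing for the old dimensions. Below is a minimal Python sketch of handling it. This is not bicon's code, the names `state` and `on_winch` are made up, and it assumes a Unix-like system where `signal.SIGWINCH` exists.

```python
import shutil
import signal

# Hypothetical place to keep the last known terminal size.
state = {"size": shutil.get_terminal_size(), "resizes": 0}

def on_winch(signum, frame):
    """Re-read the terminal size whenever the window changes."""
    state["size"] = shutil.get_terminal_size()
    state["resizes"] += 1

# Register the handler (Unix only; must be called from the main thread).
signal.signal(signal.SIGWINCH, on_winch)

if __name__ == "__main__":
    print("columns x lines:", state["size"].columns, "x", state["size"].lines)
    print("resize the window, then press Enter")
    input()
    print("resizes seen:", state["resizes"], "now:", state["size"])
```

A wrapper that sits between the application and the console would also have to re-lay out its right-to-left text after each such event, which is where the shared tmux session goes wrong.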

A big problem is that Arabic input, and RTL input in general, is something where most devs in this space respond "each application must address it in its own way." This is why few console or GUI applications handle Arabic well.

FYI, until two or three years ago Firefox was the only browser in which I could view Arabic-language news, on any computer. The others were terrible. And this is coming from a linguistics guy; this was long before I wanted to dig deeper into the machine. Linux guys in the Arab world fight an uphill battle, since much software will not work for them and is ignored.

Come join arabeyes.org and help some of them out if you are interested in translating documentation, applications, or building tools to help the process.


tmux (and screen) don't even let you search the history for non-ASCII characters :-(

That said, the git and shell mentions in the article made things look worse than they are – e.g. in Xubuntu I didn't have to do anything special to at least display the characters, though in the wrong order: http://bildr.no/view/dFN0elo4

Emacs has had bidi support since 24.1, works great, and most importantly, those of us who get totally confused by it can turn it off (I still haven't gotten the hang of movement commands suddenly going the opposite direction, very confusing when RTL and LTR scripts are in the same file).


Konsole works OK with bidi languages. Navigating left/right with the arrow keys behaves oddly, but it does print things in the correct direction.


Hmm, what if standard libraries were available in multiple natural languages? You could always serialise the code as English, if you needed to interop with a multilingual team and that happened to be a good common language, but at display time, the identifiers can look however you want according to your locale and the available translations in the library.

Of course, if you hypothesise that API design could be influenced by natural language, then you might worry about things getting lost in translation. But APIs that use English words aren’t much like natural English already.


There is no requirement that the code you see/edit is the same code that is stored and compiled.

Just as an editor can show → for -> and λ for \ if adequately configured, it can also show 'если' for 'if' and 'Функция' for 'function' (or, say, the Chinese equivalents) while keeping the underlying code in English. It wouldn't even need any cooperation from language implementations or library developers, and it'd be compatible with all legacy code!
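
A rough sketch of that display layer, in Python, using the standard `tokenize` module so that only real name tokens are swapped and strings and comments are left alone. Everything here is hypothetical: the `DISPLAY` table, the function names, and the use of Python's `def` as a stand-in for 'function'. A real editor would do this in its rendering code rather than by rewriting text.

```python
import io
import tokenize

# Hypothetical display mapping; "def" stands in for 'function'.
DISPLAY = {"if": "если", "def": "функция"}
STORED = {shown: kept for kept, shown in DISPLAY.items()}

def swap_names(source, mapping):
    """Replace whole NAME tokens found in `mapping`; strings and comments are untouched."""
    lines = source.splitlines(keepends=True)
    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in mapping:
            edits.append((tok.start[0] - 1, tok.start[1], tok.end[1], mapping[tok.string]))
    # Apply right-to-left so earlier column offsets on a line stay valid.
    for row, start, end, new in sorted(edits, reverse=True):
        lines[row] = lines[row][:start] + new + lines[row][end:]
    return "".join(lines)

def to_display(source):
    """What the editor would show."""
    return swap_names(source, DISPLAY)

def to_stored(shown):
    """What the editor would write back to disk."""
    return swap_names(shown, STORED)

if __name__ == "__main__":
    code = 'def f(x):\n    if x:\n        return "if"  # if\n'
    print(to_display(code))
```

The round trip only holds as long as nobody already uses one of the display words as an ordinary identifier, since the reverse mapping cannot tell the two apart.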

I'm not saying that it's a good idea, though, just a possibility.


Excel does this for formulae (and in very old versions even for VBA).


I was thinking of translating actual APIs, not just keywords. Machine translation may not do so well with that.


That will fail when you have words in different languages that mean different things.


You only speak English, though, right?



