Friday, April 11, 2008

Transforming Unicode

I used to work in the localisation industry (terrible world).
Once, a former colleague asked me whether it would be possible to recover the Japanese characters from a Unicode-escaped string, that is, text written as \uXXXX escape sequences. With Perl, the answer is "YES".
Just make sure you have the module Unicode::Escape installed, save your escaped string in a text file (e.g. unicodetext.txt) and run the one-liner:

perl -MUnicode::Escape -ne 'print Unicode::Escape::unescape($_);' /path/to/unicodetext.txt

Here is an example with a string in Russian.
Given the Unicode-escaped string:

\u041f\u0440\u0435\u0434\u0438\u0441\u043b\u043e\u0432\u0438\u0435


the result would be:

Предисловие
