Aug. 16th, 2010

kake: The word "菜單" (Chinese for "menu") in various shades of purple. (菜單)

Just a quick post today, to mention one of the most useful computer tools I've found so far for helping me access and organise my vocab lists and transcribed menus — grep.

grep is a commandline tool that should be available on all Unixes (Linux, Solaris, OS X, etc), and on all those I have access to, it deals just fine with Chinese characters. This means that I can easily check through all my textfile documents to find, for example, dishes with prawns in: grep 蝦 *.txt

This is pretty powerful on its own, really, but the one thing it can't do is take account of simplified vs. traditional characters — and some of my lists/menus are copy-pasted from sources that use simplified characters, while the ones I've written/transcribed myself are in traditional characters.

So I wrote some Perl to make this easier, and you can find it on CPAN. It includes a commandline utility called dets (desensitise traditional-simplified) which builds a regexp from a string and can be used like so: grep `dets 蝦` *.txt (dets 蝦 returns [虾蝦]).
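To give a rough idea of what "builds a regexp from a string" means, here is a minimal Python sketch of the same idea. It is not the code from the CPAN module: the three-entry mapping table and the function name desensitise are made up for illustration, and a real tool would need a full traditional/simplified table.

```python
import re

# Hypothetical, hand-written mapping for illustration only (traditional -> simplified).
# A real tool would need a complete conversion table.
TRAD_TO_SIMP = {
    "蝦": "虾",  # prawn
    "雞": "鸡",  # chicken
    "麵": "面",  # noodles
}

# Reverse lookup: one simplified character can stand for several traditional ones.
SIMP_TO_TRAD = {}
for trad, simp in TRAD_TO_SIMP.items():
    SIMP_TO_TRAD.setdefault(simp, []).append(trad)


def desensitise(text):
    """Return regexp source that matches text in either script."""
    parts = []
    for ch in text:
        variants = {ch}
        if ch in TRAD_TO_SIMP:
            variants.add(TRAD_TO_SIMP[ch])
        variants.update(SIMP_TO_TRAD.get(ch, []))
        if len(variants) == 1:
            # No known variant: match the character literally.
            parts.append(re.escape(ch))
        else:
            # Known variants: match any of them with a character class.
            parts.append("[" + "".join(sorted(variants)) + "]")
    return "".join(parts)


pattern = desensitise("蝦")
print(pattern)
print(bool(re.search(pattern, "鲜虾云吞")))  # simplified text
print(bool(re.search(pattern, "鮮蝦雲吞")))  # traditional text
```

The point of emitting a character class rather than converting the files is that the search works on the text as it stands, whichever script each menu happens to be in.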

I realise I don't usually write about geek stuff on here, so eyes may be glazing over at this point — but if the owners of the remaining eyes have any comments, patches, or bug reports, I would love to hear them.

If you have any questions or corrections, please leave a comment (here's how) and let me know (or email me at kake@earth.li). See my introductory post to the Chinese menu project for what these posts are all about.
