May 26, 2012

Extracting g11n translation strings in Lithium

If you are like me, you are happy, and you use Lithium. If you are not like me, well then, it sucks to be you. I personally love the Lithium-Doctrine2 (by means of li3_doctrine2) combo. It suits my needs, and gives me peace of mind. Sort of like a good glass of scotch at the end of the day. I don’t drink scotch, by the way.

Anyway, if you are aiming to dominate the world, you need internationalization in your app. That is quite easy to do with Lithium. Go check the documentation; I’m not here to teach you about it. What I am here to do is fill a gap in li3’s g11n offering: extracting your translation strings and putting them in a nice gettext POT file, ready for translation. I’ve built a dead-simple Python script that gets the job done. Call it with the usual help argument and you’ll get:

$ scripts/g11n.py --help
usage: g11n.py [-h] [-a APP] [-o OUTPUT_FILE]

Globalization extract

optional arguments:
  -h, --help            show this help message and exit
  -a APP, --app APP     Application directory
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Where to write template

So you basically give it the application directory and tell it where to write the template file. It walks your application directory (skipping only the resources/g11n and resources/tmp paths), looks for PHP files, and extracts the strings from them. There is a catch, though: if you are using Lithium’s built-in translation lambdas ($t and $tn in your views), it will not work. I personally set up a couple of function aliases in my app/config/bootstrap/g11n.php file, so that I can use the same translation functions regardless of whether I’m in a view, a controller, an entity, or anything else. These two aliases look simple enough:

function _t($message, array $options = array()) {
	return \lithium\g11n\Message::translate($message, $options + array(
		'default' => $message
	));
}

function _tn($message1, $message2, $count, array $options = array()) {
	return \lithium\g11n\Message::translate($message1, $options + compact('count') + array(
		'default' => $count == 1 ? $message1 : $message2
	));
}

So once you start using the _t() and _tn() functions above, you can use my Python script. It relies on xgettext, a well-established tool, for the extraction, and on msgmerge for merging the result into an existing template. The script source code is:

#!/usr/bin/env python

import argparse, os, re, subprocess, sys, tempfile

if __name__ == '__main__':
	parser = argparse.ArgumentParser(description='Globalization extract')
	parser.add_argument('-a', '--app', default='app/', help='Application directory')
	parser.add_argument('-o', '--output-file', help='Where to write template')
	args = parser.parse_args()
	args.app = os.path.abspath(args.app)

	if not args.output_file:
		args.output_file = args.app + '/resources/g11n/default.pot'
	if not os.path.isdir(args.app):
		print('{i} is not a valid directory'.format(i=args.app))
		sys.exit(1)

	print('Preparing to extract strings from {i} and store them in {o}'.format(
		i = args.app,
		o = args.output_file
	))

	# Paths (relative to the application directory) that are never scanned
	skipPaths = [
		'resources/g11n',
		'resources/tmp'
	]
	count = 0
	o = tempfile.NamedTemporaryFile(delete=False)
	for root, dirs, files in os.walk(args.app, followlinks=True):
		d = re.sub(r'^' + re.escape(args.app + '/'), '', root)
		skip = False
		for skipPath in skipPaths:
			if re.match(r'^' + re.escape(skipPath) + r'(/|$)', d):
				skip = True
				break
		if not skip:
			for f in files:
				if f.endswith('.php'):
					o.write((os.path.join(root, f) + '\n').encode('utf-8'))
					count += 1
	o.close()

	print('Found {c} files to process'.format(c=count))

	if count > 0:
		output = args.output_file
		backup = None
		if os.path.exists(args.output_file):
			backup = args.output_file + '.old'
			output += '.new'
			os.rename(args.output_file, backup)

		subprocess.call([
			'xgettext',
			'--files-from=' + o.name,
			'--output=' + output,
			'--keyword',
			'--keyword=t',
			'--keyword=_t',
			'--keyword=tn:1,2',
			'--keyword=_tn:1,2',
			'--language=PHP',
			# No shell is involved here, so values must not carry their own quotes
			'--from-code=utf-8',
			'--package-name=workana',
			'--copyright-holder=Workana',
			'--package-version=1.0.0',
			'--msgid-bugs-address=g11n@workana.com'
		], stdout=None, stderr=subprocess.STDOUT)

		if os.path.exists(output):
			# xgettext may leave a CHARSET placeholder in the header; set it to UTF-8
			newOutput = output + '.sed'
			with open(newOutput, 'w') as f:
				subprocess.call([
					'sed',
					r's/\(Content-Type:\s*text\/plain;\s*charset=\)CHARSET/\1UTF-8/i',
					output,
				], stdout=f, stderr=subprocess.STDOUT)
			os.rename(newOutput, output)

		if backup:
			print('Merging with existing template')
			subprocess.call([
				'msgmerge',
				'-i',
				'-N',
				'-o',
				args.output_file,
				backup,
				output
			], stdout=None, stderr=subprocess.STDOUT)

			os.unlink(output)
			os.unlink(backup)
	os.unlink(o.name)
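
If you would rather test the file-selection step on its own, here is a minimal sketch of it as a standalone function. The names is_skipped and collect_php_files are mine and not part of the script above, and it uses plain string comparison instead of a regular expression, so treat it as one possible way to do it rather than the way the script does it.

```python
import os


def is_skipped(rel_dir, skip_paths):
    """Return True if rel_dir is one of skip_paths or lies beneath one."""
    rel_dir = rel_dir.replace(os.sep, '/')
    return any(
        rel_dir == skip or rel_dir.startswith(skip + '/')
        for skip in skip_paths
    )


def collect_php_files(app_dir, skip_paths=('resources/g11n', 'resources/tmp')):
    """Walk app_dir and return the sorted paths of .php files outside skip_paths."""
    app_dir = os.path.abspath(app_dir)
    found = []
    for root, dirs, files in os.walk(app_dir, followlinks=True):
        rel = os.path.relpath(root, app_dir)
        rel = '' if rel == '.' else rel
        if is_skipped(rel, skip_paths):
            dirs[:] = []  # prune: do not descend further into a skipped tree
            continue
        found.extend(os.path.join(root, f) for f in files if f.endswith('.php'))
    return sorted(found)
```

Clearing dirs in place tells os.walk not to descend into a skipped directory at all, which saves a bit of time when resources/tmp is large.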

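The script shells out to sed to replace the CHARSET placeholder in the template header. If sed is not available on your machine, something along these lines should do the same job in pure Python. This is only a sketch: fix_charset is a hypothetical helper of mine, and it replaces just the first match in the text, on the assumption that the placeholder appears once, in the header.

```python
import re

# Same pattern the sed expression looks for, written as a Python regex
_CHARSET_RE = re.compile(
    r'(Content-Type:\s*text/plain;\s*charset=)CHARSET',
    re.IGNORECASE,
)


def fix_charset(pot_text, charset='UTF-8'):
    """Replace the first CHARSET placeholder in POT text with charset."""
    return _CHARSET_RE.sub(lambda m: m.group(1) + charset, pot_text, count=1)
```

You would read the generated template into a string, pass it through fix_charset, and write the result back to the same file.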

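A note on the --keyword flags, since they are the part you will most likely want to change. A bare --keyword tells xgettext to ignore its default keywords, a plain name such as _t means the first argument is the message, and a spec such as _tn:1,2 means argument 1 is the singular and argument 2 the plural. If you use different alias names, a small helper like the one below can build the flags for you. keyword_flags is a made-up name for illustration, not something the script defines.

```python
def keyword_flags(keywords):
    """Build xgettext --keyword flags from a {function_name: argument_spec} dict.

    An empty argument_spec means the first argument is the msgid.
    """
    flags = ['--keyword']  # bare flag: disable xgettext's default keywords
    for name, spec in keywords.items():
        flags.append('--keyword=' + (name + ':' + spec if spec else name))
    return flags
```

The returned list can be spliced into the argument list passed to subprocess.call in place of the hard-coded keyword entries.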