Sunday, 15 August 2010

Internationalization and Localization under Google App Engine framework

I had been thinking about this for a year or two. One month ago I started coding a new web application. The project has to be available to the largest possible number of users, so it has to run on the most scalable architecture, and it has to have a good internationalization (i18n) framework.

So I had to make locales (especially date formats) and efficient translation (gettext) work under Google App Engine.
In a few words, the problem is how to do efficient localization and internationalization under a state-of-the-art MVC framework using the Python language.
Languages are evolving, and each time I write something like the code below I ask myself whether I am doing it the right way:
def dateToLocalizedString(pDt, strFormat):
    # Map each supported format name to the formatted date; None if unknown.
    return {
        'Y-m-d': str(pDt.year) + '-' + str(pDt.month) + '-' + str(pDt.day),
        'd/m/Y': str(pDt.day) + '/' + str(pDt.month) + '/' + str(pDt.year),
        'm/d/Y': str(pDt.month) + '/' + str(pDt.day) + '/' + str(pDt.year)
    }.get(strFormat)
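
One alternative worth considering is to let datetime's strftime do the formatting. The sketch below is only an illustration: the name date_to_localized_string and the STRFTIME_FORMATS table are assumptions, not part of the application, and unlike the function above strftime zero-pads the day and the month.

```python
from datetime import date

# Hypothetical mapping from the format keys used above to strftime patterns.
STRFTIME_FORMATS = {
    'Y-m-d': '%Y-%m-%d',
    'd/m/Y': '%d/%m/%Y',
    'm/d/Y': '%m/%d/%Y',
}


def date_to_localized_string(dt, str_format):
    """Return dt formatted per str_format, or None for an unknown key."""
    pattern = STRFTIME_FORMATS.get(str_format)
    if pattern is None:
        return None
    return dt.strftime(pattern)


print(date_to_localized_string(date(2010, 8, 15), 'd/m/Y'))  # 15/08/2010
```

Keeping the patterns in one table means a new format is a one-line change rather than a new string concatenation.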

Topic #1 Making i18n work

Google App Engine offers a customized version of the Django framework, which is able to handle i18n effectively, as described here. Other frameworks exist that can run under Google App Engine and do the same:
<title>{% trans "This is the title." %}</title>
I didn't try to make the code above work. Also, as Google says, "apps must load quickly in order to scale in the cloud". That should always have been the case.

While learning the App Engine technology I quickly heard that the views in my software have to load quickly. I didn't try it, but I guess that translating at each request is CPU-costly, possibly in the literal sense, since Google will charge me for CPU usage.
Generate once, run in every language! So I had to use gettext to generate my Django templates. man xgettext told me:
Choice of input file language:
-L, --language=NAME
recognise the specified language (C, C++, ObjectiveC, PO, Shell,
Python,   Lisp,  EmacsLisp,  librep,  Scheme,  Smalltalk,  Java,
JavaProperties,  C#,  awk,  YCP,  Tcl,  Perl,  PHP,  GCC-source,
NXStringTable, RST, Glade)
My goal was to use gettext to generate a sort of HTML.
I searched the web for projects facing the same issue, but I didn't find anything that corresponded to my needs. For reference I'll mention xml2po.py and gnunited.
I found an interesting article about using gettext to generate static files, which put me on the right track...

The key idea is to use PHP to generate the Python templates offline.

Practical experience led me to write the Django templates (HTML-like) with PHP code inserted in them. I used a PHP loop (undisclosed) on the development side to generate the templates for each language once, rather than at each load.

PHP: about 7,670,000,000 results
Python: about 32,100,000 results

Python is the language that has let me write my cleanest code for web applications. I suggest using PHP for the offline generation of templates. The Django templates will look like this:
<b><?php echo _("Welcome"); ?> {{user.email}}</b>

|<a href=?action=index><?php echo _("Home"); ?></a>
|<a href={{user.logout_url}}><?php echo _("Sign out"); ?></a>
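
The PHP loop itself is not shown here. As a rough illustration of the same offline step, written in Python rather than PHP, the sketch below replaces each <?php echo _("..."); ?> marker with a translation and writes one template per language. The CATALOGS dictionary, the function names and the output layout are assumptions made for the example; a real setup would read the translations from gettext catalogs.

```python
import os
import re
import tempfile

# Hypothetical in-memory catalogs; real ones would come from .po/.mo files.
CATALOGS = {
    'en': {},
    'fr': {'Welcome': 'Bienvenue', 'Home': 'Accueil', 'Sign out': 'Quitter'},
}

# Matches markers such as <?php echo _("Welcome"); ?>
MARKER = re.compile(r'<\?php\s+echo\s+_\("([^"]*)"\);\s*\?>')


def translate_template(source, catalog):
    """Replace every marker with its translation (or the original msgid)."""
    return MARKER.sub(lambda m: catalog.get(m.group(1), m.group(1)), source)


def generate_templates(source, out_dir, name='tpl.html'):
    """Write out_dir/<lang>/<name> once for each language in CATALOGS."""
    for lang, catalog in CATALOGS.items():
        lang_dir = os.path.join(out_dir, lang)
        if not os.path.isdir(lang_dir):
            os.makedirs(lang_dir)
        with open(os.path.join(lang_dir, name), 'w') as handle:
            handle.write(translate_template(source, catalog))


source = '<b><?php echo _("Welcome"); ?> {{user.email}}</b>'
print(translate_template(source, CATALOGS['fr']))
generate_templates(source, tempfile.mkdtemp())
```

The Django placeholders such as {{user.email}} pass through untouched, so the generated files are ordinary Django templates.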

Templates are generated in different directories and called like this:
payload = dict(user=main.user,etcs=etcs)
template_file = os.path.join(os.path.dirname(__file__),
'../tpl/' + main.user.defaultLang +'/tpl.html')
main.response.out.write(template.render(template_file, payload))
Note that main is called from myApp.py as described in app.yaml, and in my case main.user is a model in datastore terms.
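
Since the directory name comes from a user preference, it may be worth guarding against a language for which no template was generated. A minimal sketch, assuming a hypothetical SUPPORTED_LANGS tuple and a default of 'en' (neither is part of the application):

```python
import os

SUPPORTED_LANGS = ('en', 'fr')   # hypothetical: languages generated offline
DEFAULT_LANG = 'en'              # hypothetical fallback language


def template_path(base_dir, lang, name='tpl.html'):
    """Return the per-language template path, falling back to DEFAULT_LANG."""
    # Accept regional variants such as 'fr-CA' by keeping the primary subtag.
    primary = (lang or '').lower().split('-')[0]
    if primary not in SUPPORTED_LANGS:
        primary = DEFAULT_LANG
    return os.path.join(base_dir, 'tpl', primary, name)


print(template_path('/app', 'fr-CA'))
```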

Topic #2 Formatting dates

To make this work on App Engine I followed what Yu-Jie Lin (livibetter) describes in the tutorial Using Django's I18N in Google App Engine.
We let the Django templates do the work through the |date filter; the Google App Engine framework does the rest, since it is based on Django 0.96, which can handle this.
The |date filter is used in a template like this:
<input type="text" id="sampleDate"
name="sampleDate" value="{{myobject.sampleDate|date}}" size="9">
Following that (very good) tutorial, our main.py (or whatever module is referenced in app.yaml) looks like this:
from google.appengine.api import users
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.ext import db
from google.appengine.ext.webapp import template
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'myApp.settings'
from django.conf import settings

# Force Django to reload settings
settings._target = None
from django.utils import translation
from crebits.util import Cookies
# Other imports (such as models and views) were cut from this sample.


class MainPage(webapp.RequestHandler):

    # MEMCACHABLE
    def get_myAppUser(self):
        # Only logged-in users have a profile.
        user = users.get_current_user()
        if not user:
            return None
        query = db.GqlQuery("SELECT * FROM myAppUser WHERE email = :1",
                            user.email())
        myAppUsers = query.fetch(1)
        if myAppUsers:
            myAppUser = myAppUsers[0]
            myAppUser.defaultJsDateFormat = myAppUser.gotHookToHandleJSFormat()
        else:
            # First visit: store a profile with default preferences.
            myAppUser = models.myAppUser()
            myAppUser.email = user.email()
            myAppUser.defaultCurrency = "USD"
            myAppUser.defaultDateFormat = "m/d/y"
            myAppUser.put()
        return myAppUser

    def initialize(self, request, response):
        self.user = self.get_myAppUser()
        if self.user:
            self.user.logout_url = users.create_logout_url("/")
        self.request = request
        self.response = response
        request.COOKIES = Cookies(self)
        request.META = os.environ
        self.reset_language(request, response)
        webapp.RequestHandler.initialize(self, request, response)

    def reset_language(self, request, response):
        language = translation.get_language_from_request(request)
        # self.user is None for anonymous visitors, so guard the preferences.
        if self.user:
            if self.user.defaultCutOff is None:
                self.user.defaultLang = language
            settings.DATE_FORMAT = self.user.defaultDateFormat
        translation.activate(language)
        request.LANGUAGE_CODE = translation.get_language()
        # Set headers in response
        response.headers['Content-Language'] = translation.get_language()
        translation.deactivate()

    # -------------------
    # Main post handler
    # -------------------
    def post(self):
        user = users.get_current_user()
        if user:
            # (a statement using user.email() and the logout URL was cut here)
            # Main authenticated controller: dispatch on the 'action'
            # parameter (post method)
            {
                'index': views.index,
                'index2': views.index2,
            }.get(self.request.get('action'), views.index)(self)
        else:
            self.redirect(users.create_login_url(self.request.uri))


application = webapp.WSGIApplication([('/.*', MainPage)], debug=True)


def main():
    run_wsgi_app(application)


if __name__ == "__main__":
    main()
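
To make the language detection step less opaque, here is a simplified sketch of the kind of negotiation that translation.get_language_from_request performs on the Accept-Language header. It is not Django's implementation: the function name pick_language and the supported tuple are assumptions, and cookies and sessions are ignored.

```python
def pick_language(accept_language, supported=('en', 'fr'), default='en'):
    """Pick the supported language with the highest q-value in the header."""
    candidates = []
    for position, item in enumerate(accept_language.split(',')):
        parts = item.strip().split(';')
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            key, _, value = param.strip().partition('=')
            if key == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Negated quality sorts best first; position keeps header order on ties.
        candidates.append((-quality, position, tag))
    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means "not acceptable"
        if tag in supported:
            return tag
        primary = tag.split('-')[0]
        if primary in supported:
            return primary
    return default


print(pick_language('fr-FR,fr;q=0.9,en;q=0.8'))  # fr
```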

Conclusion

I don't think it's a new idea, since you can see things like this in many open source projects that generate templates from templates, such as Torque for example. But I think it's the least CPU-costly and also the easiest way to do i18n under App Engine.
It is possible that loading part of the Django i18n framework for date localization cancels out the gain expected from generating the internationalized templates offline.
Please tell me through the comments.
Best Regards.

Additional references: Entropy (provides php-cli with gettext support on Mac OS X), gettext tutorial, gettext manual.