PHP DevCenter

Internationalization and Localization with PHP

Message Objects



These phrases can also be stored as function return values instead of strings in an array. Storing the phrases as functions removes the need to use printf(). Functions that return a sentence look like this:

<?php
// English version
function i_am_X_years_old($age){
    return "I am $age years old.";
}

// Spanish version
function i_am_X_years_old($age){
    return "Tengo $age años.";
}
?>
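
As a rough comparison of the two storage styles, the sketch below builds the same Spanish sentence from a format string held in an array and from a function. The array key, the function name tengo_X_anos(), and the singular-form branch are assumptions made for illustration; they are not part of the article's catalog.

```php
<?php
// Format-string approach: the phrase lives in an array and needs sprintf().
$messages = array('I am X years old.' => 'Tengo %d años.');
$from_array = sprintf($messages['I am X years old.'], 15);

// Function approach: the phrase is a return value, so no format call is
// needed, and the function can hold language-specific logic
// (here, a hypothetical singular form).
function tengo_X_anos($age) {
    if ($age == 1) {
        return "Tengo 1 año.";
    }
    return "Tengo $age años.";
}

print $from_array . "\n";
print tengo_X_anos(15) . "\n";
print tengo_X_anos(1) . "\n";
```

The function version costs a little more typing per phrase, but it keeps any grammar rules next to the text they apply to.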

If some parts of the message catalog belong in an array, and some parts belong in functions, an object is a helpful container for a language's message catalog. A base object and two simple message catalogs look like this:

<?php
class pc_MC_Base {
    var $messages;
    var $lang;

    function msg($s) {
        if (isset($this->messages[$s])) {
            return $this->messages[$s];
        } else {
            error_log("l10n error:LANG:" .
                "$this->lang,message:'$s'");
        }
    }
}

class pc_MC_es_US extends pc_MC_Base {
    function __construct() {
        $this->lang = 'es_US';
        $this->messages = array(
            'chicken' => 'pollo',
            'cow' => 'vaca',
            'horse' => 'caballo'
        );
    }

    function i_am_X_years_old($age){
        return "Tengo $age años.";
    }
}

class pc_MC_en_US extends pc_MC_Base {
    function __construct() {
        $this->lang = 'en_US';
        $this->messages = array(
            'chicken' => 'chicken',
            'cow' => 'cow',
            'horse' => 'horse'
        );
    }

    function i_am_X_years_old($age) {
        return "I am $age years old.";
    }
}
?>

Each message catalog object extends the pc_MC_Base class to get the msg() method, and then defines its own messages (in its constructor) and its own functions that return phrases. Here's how to print text in Spanish:

<?php
$MC = new pc_MC_es_US;
print $MC->msg('cow');
print $MC->i_am_X_years_old(15);
?>

To print the same text in English, $MC just needs to be instantiated as a pc_MC_en_US object instead of a pc_MC_es_US object. The rest of the code remains unchanged.
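
One way to avoid hard-coding the class name is to pick it from a locale string at run time. The sketch below is a minimal illustration under stated assumptions: the helper pc_catalog_for(), its default of en_US, and the trimmed-down catalogs are hypothetical, and the classes use __construct(), which requires PHP 5 or later.

```php
<?php
class pc_MC_Base {
    public $messages = array();
    public $lang;

    function msg($s) {
        if (isset($this->messages[$s])) {
            return $this->messages[$s];
        }
        error_log("l10n error:LANG:$this->lang,message:'$s'");
        return null;
    }
}

class pc_MC_en_US extends pc_MC_Base {
    function __construct() {
        $this->lang = 'en_US';
        $this->messages = array('cow' => 'cow');
    }
}

class pc_MC_es_US extends pc_MC_Base {
    function __construct() {
        $this->lang = 'es_US';
        $this->messages = array('cow' => 'vaca');
    }
}

// Hypothetical helper: map a locale string to its catalog class,
// falling back to a default when the string is malformed or no
// matching class is defined.
function pc_catalog_for($locale, $default = 'en_US') {
    $class = 'pc_MC_' . $locale;
    if (!preg_match('/^[a-z]{2}_[A-Z]{2}$/', $locale) || !class_exists($class)) {
        $class = 'pc_MC_' . $default;
    }
    return new $class;
}

$MC = pc_catalog_for('es_US');
print $MC->msg('cow') . "\n";
```

Checking the locale against a strict pattern before building the class name matters if the string comes from user input, such as a URL parameter or a cookie.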

Localizing Images

Images need to be localized when you want to display images containing text in locale-appropriate languages.

Make an image directory for each locale you want to support, as well as a global image directory for images that have no locale-specific information. Create copies of each locale-specific image in the appropriate directories. Make sure that these images have the same filenames. Instead of printing image URLs directly, use a wrapper method similar to the msg() method demonstrated earlier.

The img() wrapper method looks for a locale-specific version of an image first, then a global one. If neither is present, it logs an error message. Building upon the pc_MC_Base class, the new class looks like this:

<?php
class pc_MC_Base {
    var $messages;
    var $images;
    var $lang;

    var $image_base_path = '/usr/local/www/images';
    var $image_base_url = '/images';

    function msg($s) {
        if (isset($this->messages[$s])) {
            return $this->messages[$s];
        } else {
            error_log("l10n error:LANG:" . 
                "$this->lang,message:'$s'");
        }
    }

    function img($f) {
        // Return the URL instead of printing it, so the result can be
        // passed to printf() or sprintf() by the caller.
        if (is_readable("$this->image_base_path/" .
            "$this->lang/$f")) {
            return "$this->image_base_url/$this->lang/$f";
        } elseif (is_readable("$this->image_base_path/" .
            "global/$f")) {
            return "$this->image_base_url/global/$f";
        } else {
            error_log("l10n error:LANG:" .
                      "$this->lang,image:'$f'");
        }
    }
}
?>

The img() method needs to know both the path to the image file in the filesystem ($image_base_path) and the path to the image from the base URL of your site ($image_base_url). It uses the first to test if the file can be read and the second to construct an appropriate URL for the image.
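
The lookup order can be tried out against a throwaway directory tree. This is only a sketch: pc_find_image() is a hypothetical stand-alone version of the lookup that takes the paths as arguments and returns the URL, and the temporary directory name is made up for the demonstration.

```php
<?php
// Hypothetical stand-alone version of the lookup: locale directory first,
// then the global directory, otherwise log an error and return null.
function pc_find_image($base_path, $base_url, $lang, $f) {
    if (is_readable("$base_path/$lang/$f")) {
        return "$base_url/$lang/$f";
    }
    if (is_readable("$base_path/global/$f")) {
        return "$base_url/global/$f";
    }
    error_log("l10n error:LANG:$lang,image:'$f'");
    return null;
}

// Build a throwaway image tree under the system temp directory.
$base = sys_get_temp_dir() . '/' . uniqid('pc_img_demo_');
mkdir("$base/es_US", 0777, true);
mkdir("$base/global", 0777, true);
touch("$base/es_US/new.gif");   // localized image
touch("$base/global/logo.gif"); // shared image with no locale-specific text

print pc_find_image($base, '/images', 'es_US', 'new.gif') . "\n";
print pc_find_image($base, '/images', 'es_US', 'logo.gif') . "\n";
```

The filesystem path is used only for the is_readable() test; the string handed back is always built from the URL prefix.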

A localized image must have the same filename in each localization directory. For example, an image that says "New!" on a yellow starburst should be called new.gif in both the images/en_US directory and the images/es_US directory, even though the file images/es_US/new.gif is a picture of a yellow starburst with the word "¡Nuevo!" on it. Don't forget that the alt text you display in your image tags also needs to be localized. A complete localized <img> tag looks like this:

<?php
$MC = new pc_MC_es_US;

printf('<img src="%s" alt="%s">',
    $MC->img('cancel.png'), $MC->msg('Cancel'));
?>

If the localized versions of a particular image have varied dimensions, store image height and width in the message catalog as well:

<?php
printf('<img src="%s" alt="%s" ' .
    'height="%d" width="%d">',
    $MC->img('cancel.png'), $MC->msg('Cancel'),
        $MC->msg('img-cancel-height'), 
        $MC->msg('img-cancel-width'));
?>

The localized messages for img-cancel-height and img-cancel-width are not text strings, but integers that describe the dimensions of the cancel.png image in each locale.
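
For illustration, the catalog entries might look like the sketch below. The pixel values and the image URL are invented for the example; only the key names come from the code above.

```php
<?php
// Hypothetical Spanish catalog entries: the pixel sizes are made up.
$messages = array(
    'Cancel'            => 'Cancelar',
    'img-cancel-height' => 20,
    'img-cancel-width'  => 72,
);

// %d formats the stored integers directly into the tag.
$tag = sprintf('<img src="%s" alt="%s" height="%d" width="%d">',
    '/images/es_US/cancel.png',
    $messages['Cancel'],
    $messages['img-cancel-height'],
    $messages['img-cancel-width']);

print $tag . "\n";
```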

If you use a consistent naming convention for your variable and file names, create an imgsrc() method to simplify matters:

<?php
function imgsrc($img) {
    $src = $this->img("$img.png");
    $alt = $this->msg(ucfirst($img));
    // Key the dimensions on the image name, not on the URL held in $src.
    $height = $this->msg("img-$img-height");
    $width = $this->msg("img-$img-width");
    return sprintf('<img src="%s" alt="%s" ' .
                   'height="%d" width="%d">', 
                   $src, $alt, $height, $width);
}
?>

To get the same results as the Cancel button example before, call it like this:

<?php
$MC = new pc_MC_es_US;

print $MC->imgsrc('cancel');
?>

Conclusion

With the help of the msg() and img() methods, you can quickly create message objects that allow you to localize your Web site using 100 percent pure PHP. Because it's an all-PHP solution, you can reuse all your existing code, and you don't need to install any new extensions. However, if you need to share message catalogs among many applications, PHP supports gettext. See Joao Prado Maia's article for more details on using gettext with PHP.
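
If you want to leave the door open for gettext, one possible arrangement is a wrapper that uses gettext() when the extension is loaded and an array catalog otherwise. This is a sketch, not a prescribed design: pc_translate() and its $use_gettext flag are hypothetical, and the gettext branch assumes a text domain has already been set up elsewhere with bindtextdomain() and textdomain().

```php
<?php
// Hypothetical wrapper: use gettext() when asked to and when the extension
// is loaded; otherwise look the phrase up in an array catalog and fall back
// to the untranslated phrase.
function pc_translate($s, $messages, $use_gettext = true) {
    if ($use_gettext && function_exists('gettext')) {
        return gettext($s);
    }
    return isset($messages[$s]) ? $messages[$s] : $s;
}

$es = array('cow' => 'vaca');
print pc_translate('cow', $es, false) . "\n";
```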

As you can see, internationalizing your PHP applications is not a labor of Hercules. When you organize your localizations within an object hierarchy, it's easy to extend your classes to support new countries and regions without difficulties.

Adam Trachtenberg is the manager of technical evangelism for eBay and is the author of two O'Reilly books, "Upgrading to PHP 5" and "PHP Cookbook." In February he will be speaking at Web Services Edge 2005 on "Developing E-Commerce Applications with Web Services" and at the O'Reilly booth at LinuxWorld on "Writing eBay Web Services Applications with PHP 5."



  • Creating the arrays from language config files
    2004-02-15 20:24:50  kenr@nodots.net

    I found the article very helpful--the app I am working on needs to be able to run on plain vanilla PHP, i.e., no gettext() functionality, but I wanted to be able to use gettext() if it is available.

    So, I created the standard gettext() directory structure:

    /locale/<LANGUAGE_STRING>/LC_MESSAGES

    Under LC_MESSAGES I created pipe-delimited files called messages.txt with the word or phrase pairings. For example:

    // computer terms
    computer|computadora
    hard drive|disco duro
    monitor|monitor

    I then wrote a PHP script to parse this into an array of the format suggested in the article and put it in a file called "locale.inc.php" that I include in my top-level include file. (For performance reasons, I didn't want to parse the file every time I needed a translation.) The script is designed to run as a cron job, so any changes to the text files are reflected in the array. Here is the script:

    <?php

    /* $Id$
    *
    * Author: Ken Riley <kenr@nodots.com>
    * Copyright (C) 2004 Nodots Development, Inc. All Rights Reserved.
    * <http://www.nodots.com/>
    *
    * Description: Reads pseudo-i18n files and generates arrays. Designed to run
    * as a cron process.
    */

    include('include/GLOBALS.inc.php');
    $dh = dir(LOCALE_DIR);
    $outFile = BASE_DIR."/include/locale.php";
    $localeFile = fopen($outFile,"w+");
    writeHeader($localeFile);
    $languageCt = 0;

    while (false !== ($file = $dh->read())) {
        if ($file != "." && $file != "..") {
            $languageFile = LOCALE_DIR."/".$file."/LC_MESSAGES/messages.txt";

            fwrite($localeFile,"\t'$file'=> array(\n");
            $fh = fopen($languageFile,"r");
            $s = "";
            while ( feof($fh) === false ) {
                $line = chop(fgets($fh));
                if ( substr($line,0,2) != "//" and $line != "" ) {
                    $translation = explode("|",$line);
                    $english = $translation[0];
                    $translated = $translation[1];
                    $s .= "\t\t'$english' => '$translated',\n";
                }
            }
            // strlen(), not sizeof(): $s is a string, and its last two
            // characters are the trailing ",\n" to strip.
            $strSize = strlen($s);
            $strSize = $strSize - 2;
            $cleanS = substr($s,0,$strSize);
            $cleanS .= "\n\t),";
            fwrite($localeFile,$cleanS);
            $languageCt++;
            fclose($fh);
        }
    }

    writeFooter($localeFile);
    $dh->close();

    function writeHeader($fh) {
        $timestamp = date("d-m-Y G:i:s");
        $headerString = "<?php\n";
        $headerString .= "// created ".$timestamp." by buildLanguageFile.php\n\n";
        $headerString .= "\$messages = array (\n";
        fwrite($fh,$headerString);
    }

    function writeFooter($fh) {
        $footerString = ");\n";
        $footerString .= "?>";
        fwrite($fh,$footerString);
    }

    ?>

    I then added the author's msg($s) function into my top-level include file. If I have a client with access to gettext(), I just modify that function to use gettext() rather than the array.

    So, you have simple text files for the translator, a process for automatically slurping those into an array, and a clear path to full-fledged i18n support in your app.

    Provecho,

    Ken Riley
    Nodots Development, Inc.
  • Non-technical aspects of i18n
    2002-12-20 13:45:15  Adam Trachtenberg | O'Reilly Author

    You're completely right about separating text from code. I'd never work on a project that mixed the two. (And I have developed multi-lingual web sites. The code here is based on "real life;" it wasn't written specifically for PHP Cookbook.)

    Unfortunately, the internationalization process is extremely complex. If you try to cover everything in one article, it'll end up running on forever. In this article, I decided to focus on the technical aspects. In order not to let the other details obscure my explanation of the code, I was forced to simplify other parts.

    I've written up some other thoughts on how to best organize files and handle the actual process of getting the text into the system, and posted them on my web log. Check them out and please add comments and suggestions.

    PS: To the person who felt that my choosing not to use Greek and Japanese in the article indicates a deficit in LAMP: you're wrong. Non-Western characters don't always render correctly in all browsers. An article that people can't read isn't very helpful, so I chose to stick to English and Spanish. You can read more on PHP's multi-byte string support in the manual.
    • Non-technical aspects of i18n
      2007-07-22 08:14:03  TheCaveman

      Do you have another URL to try? It is quite a while later now and your blog is still inaccessible even after signing up for the O'Reilly Network. Thanks.
    • Non-technical aspects of i18n
      2005-11-21 08:45:26  StarsInTheSky

      Hello, I cannot access your blog with the given link. Even after creating an O'Reilly account and logging in, access is restricted. Would a public link be possible (instead of http://www.oreillynet.com/cs/weblog/view/wlg/2470)?
  • even more mistaken than that...
    2002-12-11 17:56:42  anonymous2

    The article starts off talking about the problem of mixing French, Greek, and Japanese, then proceeds to demonstrate how to mix English with English, then for the advanced class, how to mix English and Spanish, and even that is done by hard coding strings into the source code.

    The LAMP platform is about the closest you can get to a worst-case scenario for anyone who does real internationalization for a living.

    What the author should have said was that if you need to create an app for a small organization with no international aspirations, LAMP may be acceptable, but otherwise you'd need to change to more professional tools like Java, .Net, Oracle, PostgreSQL, SQLServer, etc.

  • oh, are you mistaken...
    2002-12-11 02:48:13  kirkmc

    As a translator, I shudder when I read articles like this. It is the type of short-sighted attitude that makes our work so hard.

    As a rule, you should _never_ put strings of text that are to be localized in code of any kind. Not only does it require the translator to work in the code, taking far longer than necessary to translate simple texts, but it runs the risk of the code getting damaged in the translation process. While some translators know enough about code to work in it, most don't. The last thing you want is for your translated code to come back and find it doesn't work. Or, worse, not find out that it doesn't work until you discover your site is malfunctioning.

    The best way to work with localization is to have separate files for each language, which may contain variables and definitions, but which won't contain any real code. Each translator can then work on their language file, which you just roll back into your database.

    Kirk
    • Wrong perspective
      2003-03-28 13:18:08  Chris Shiflett | O'Reilly Author

      It's quite short-sighted to try to launch a semantic argument like this against the merits of the article. Your logic seems to rely completely on whether this article is a tutorial or a real-world example. It is a tutorial, even though it may be based on a real-world example.

      For example, take this:

      $messages = array (
          'en_US'=> array(
              'My favorite foods are' =>
                  'My favorite foods are',
              'french fries' => 'french fries',
              'biscuit' => 'biscuit',
              'candy' => 'candy',
              'potato chips' => 'potato chips',
              'cookie' => 'cookie',
              'corn' => 'corn',
              'eggplant' => 'eggplant'
          ),

          'en_GB'=> array(
              'My favorite foods are' =>
                  'My favourite foods are',
              'french fries' => 'chips',
              'biscuit' => 'scone',
              'candy' => 'sweets',
              'potato chips' => 'crisps',
              'cookie' => 'biscuit',
              'corn' => 'maize',
              'eggplant' => 'aubergine'
          )
      );


      This is just example code to create the array that he demonstrates how to use. This array doesn't have to be hard-coded, but this article's scope isn't about creating a friendly interface for translators, it's about internationalization for programmers. It is trivial to make a nice little Web application that provides a friendly data entry interface for translators that stores the information in a database. This database can be used to create the array.



      There are plenty of tutorials that demonstrate how to interact with a database. This one is about internationalization and localization.



©2009, O'Reilly Media, Inc.