PHP DevCenter


Internationalization and Localization with PHP

by Adam Trachtenberg, coauthor of PHP Cookbook
11/28/2002

While everyone who programs in PHP has to learn some English eventually to get a handle on its function names and language constructs, PHP can create applications in just about any human language. Some applications need to be used by speakers of many different languages. PHP's internationalization and localization support makes it easier to make an application written for French speakers useful for German speakers.

Internationalization (often abbreviated I18N--there are 18 letters between the first "i" and the last "n") is the process of taking an application designed for just one locale and restructuring it so that it can be used in many different locales. Localization (often abbreviated L10N--there are 10 letters between the first "l" and the "n") is the process of adding support for a new locale to an internationalized application.
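The arithmetic behind these abbreviations can be checked with a few lines of PHP. This is only an illustrative sketch; numeronym() is a hypothetical helper name, not a PHP built-in, and it assumes a plain ASCII word of at least two letters.

```php
<?php
// Hypothetical helper: first letter, count of letters in between, last letter.
// Assumes a single-byte (ASCII) word of at least two characters.
function numeronym($word) {
    $length = strlen($word);
    return $word[0] . ($length - 2) . $word[$length - 1];
}

print numeronym('internationalization') . "\n"; // i18n
print numeronym('localization') . "\n";         // l10n
```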

Localizing different kinds of content requires different techniques. This article covers an object-oriented method for localizing plain text messages and images. The PHP Cookbook contains additional recipes for dates, times, and currency. There are also recipes on using GNU gettext and other I18N and L10N topics.


Locales

A locale is a group of settings that describe text formatting and language customs in a particular area of the world. A locale name generally has three components. The first, an abbreviation that indicates a language, is mandatory. For example, "en" stands for English and "pt" for Portuguese. An optional country specifier comes next, after an underscore, to distinguish between versions of the same language spoken in different countries. For example, "en_US" and "en_GB" specify U.S. and British English respectively, while "pt_BR" and "pt_PT" identify Brazilian and European Portuguese. Finally, after a period, comes an optional character-set specifier. Chinese as used in Taiwan with the Big5 character set is written as "zh_TW.Big5". Note that while most locale names follow these conventions, some don't.
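As a rough sketch of that naming convention, the function below splits a locale name into its three parts. parse_locale_name() and its regular expression are assumptions made for illustration; they are not part of PHP, and real systems accept names that this pattern rejects.

```php
<?php
// Hypothetical helper: split "language[_COUNTRY][.charset]" into parts.
// Names that do not follow this convention yield null.
function parse_locale_name($name) {
    $pattern = '/^([a-z]+)(?:_([A-Z]+))?(?:\.(.+))?$/';
    if (!preg_match($pattern, $name, $m)) {
        return null;
    }
    return array(
        'language' => $m[1],
        // An optional group that did not match is either missing or empty.
        'country'  => (isset($m[2]) && $m[2] !== '') ? $m[2] : null,
        'charset'  => isset($m[3]) ? $m[3] : null,
    );
}

print_r(parse_locale_name('zh_TW.Big5'));
print_r(parse_locale_name('pt'));
```

A name such as "C", which does not start with a lowercase language abbreviation, makes this sketch return null, which is one example of a locale name that falls outside the convention.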

Message Catalog

To incorporate I18N support into your program, maintain a message catalog of words and phrases and retrieve the appropriate string from the message catalog before printing it. Here's a simple message catalog with foods in American and British English and a function to retrieve words from the catalog:

<?php
$messages = array (
    'en_US'=> array(
       'My favorite foods are' =>
           'My favorite foods are',
       'french fries' => 'french fries',
       'biscuit' => 'biscuit',
       'candy' => 'candy',
       'potato chips' => 'potato chips',
       'cookie' => 'cookie',
       'corn' => 'corn',
       'eggplant' => 'eggplant'
    ),

    'en_GB'=> array(
        'My favorite foods are' =>
            'My favourite foods are',
        'french fries' => 'chips',
        'biscuit' => 'scone',
        'candy' => 'sweets',
        'potato chips' => 'crisps',
        'cookie' => 'biscuit',
        'corn' => 'maize',
        'eggplant' => 'aubergine'
    )
);

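function msg($s) {
    global $LANG;
    global $messages;

    // Return the translation for the current locale if one exists.
    if (isset($messages[$LANG][$s])) {
        return $messages[$LANG][$s];
    } else {
        // No translation: log the locale and the message key.
        // The function then returns NULL implicitly.
        error_log("l10n error:LANG:" .
            "$LANG,message:'$s'");
    }
}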
?>

This short program uses the message catalog to print out a list of foods:

<?php
$LANG ='en_GB';

print msg('My favorite foods are').":\n";
print msg('french fries')."\n";
print msg('potato chips')."\n";
print msg('corn')."\n";
print msg('candy')."\n";
?>

My favourite foods are:
chips
crisps
maize
sweets

To have the program output in American English instead of British English, just set $LANG to en_US.
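To see both locales side by side, here is a minimal sketch that uses a trimmed copy of the catalog. It assumes a hypothetical lookup() function that takes the locale as a parameter instead of reading the global $LANG, purely so the example is self-contained.

```php
<?php
// Trimmed copy of the food catalog, for illustration only.
$catalog = array(
    'en_US' => array('french fries' => 'french fries', 'candy' => 'candy'),
    'en_GB' => array('french fries' => 'chips', 'candy' => 'sweets'),
);

// Hypothetical variant of msg() that takes the catalog and locale as
// parameters. Returns null when no translation is found.
function lookup($catalog, $locale, $s) {
    return isset($catalog[$locale][$s]) ? $catalog[$locale][$s] : null;
}

foreach (array('en_US', 'en_GB') as $locale) {
    print $locale . ': ' . lookup($catalog, $locale, 'french fries')
        . ', ' . lookup($catalog, $locale, 'candy') . "\n";
}
```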

Variable Phrases

You can combine the msg() message retrieval function with printf() to store phrases that require values to be substituted into them. Consider the English sentence "I am 12 years old." In Spanish, the corresponding phrase is "Tengo 12 años." The Spanish phrase can't be built by stitching together translations of "I am," the numeral 12, and "years old," because it literally reads "I have 12 years." Instead, store both phrases in the message catalogs as printf()-style format strings:

<?php
$messages = array(
    'en_US' => array(
        'I am X years old.' =>
            'I am %d years old.'),
    'es_US' => array(
        'I am X years old.' => 
            'Tengo %d años.')
);
?>

You can then pass the results of msg() to printf() as a format string:

<?php
$LANG ='es_US';

printf(msg('I am X years old.'), 12);
?>

Tengo 12 años.  

For phrases that require the substituted values to be in a different order in different languages, printf() supports changing the order of the arguments:

<?php
$messages = array(
    'en_US' => array(
        'I am X years and Y months old.' =>
        'I am %d years and %d months old.'),
    'es_US' => array(
        'I am X years and Y months old.'=>
        'Tengo %2$d meses y %1$d años.')
);
?>

With either language, call printf() with the same order of arguments (i.e., first years, then months):

<?php
$LANG ='es_US';

printf(msg('I am X years and Y months old.'),12,7);
?>

Tengo 7 meses y 12 años.  

In the format string, %2$ tells printf() to use the second argument, and %1$ tells it to use the first.
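The argument swapping can also be tried in isolation. The sketch below uses sprintf(), which takes the same format strings as printf() but returns the result; the $formats array is a stand-in for the message catalog, and the explicit %1$ and %2$ in the English string are an illustrative choice rather than something the catalog above requires.

```php
<?php
// Format strings must be single-quoted: inside double quotes PHP would
// try to interpolate $d as a variable.
$formats = array(
    'en_US' => 'I am %1$d years and %2$d months old.',
    'es_US' => 'Tengo %2$d meses y %1$d años.',
);

foreach ($formats as $locale => $format) {
    // Same argument order for every locale: years first, then months.
    print $locale . ': ' . sprintf($format, 12, 7) . "\n";
}
```

Keeping the call site identical for every locale means translators can reorder the values by editing only the format string.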
