Apache DevCenter

A Day in the Life of #Apache
Examples of RewriteMap in Action

by Rich Bowen, coauthor of Apache Cookbook
04/28/2005

Editor's note: Rich Bowen is back with another installment in his ongoing series based on conversations on #apache. This week, he provides examples of RewriteMap in action. Rich is a coauthor of O'Reilly's Apache Cookbook.

#apache is an IRC channel that runs on the irc.freenode.net IRC network. To join this channel, you need to install an IRC client (XChat, MIRC, and bitchx are popular clients) and enter the following commands:

/server irc.freenode.net
/join #apache

Day Twelve

A huge number of the questions on #apache have to do with mod_rewrite. And, fairly frequently, I find myself thinking that the problem being discussed would be so much easier to solve if we could just write a Perl script to deal with it.

Of course, you can, using the RewriteMap directive, but good examples of it are hard to come by, either in the documentation or elsewhere online.

As some of you may know, I'm working on the documentation, and, hopefully, it will soon contain some good examples of using RewriteMap. But, until then, this article will serve to provide a simple, as well as a not-so-simple, example.

I'll go ahead and give the caveat here, since you'd be really irritated with me if you got to the end and realized this little fact then. Although you can use a rewrite map anywhere (including in .htaccess files), you can only define one, using the RewriteMap directive, in your main Apache configuration file (at the server or virtual host level). This has to do with the fact that the map is loaded on server startup, and so defining it in a .htaccess file wouldn't really work.

We'll start with the simplest RewriteMap example, so that you can see how the syntax works, and how you'd use it in a simple map scenario. In this simplest form, RewriteMap allows you to create a one-to-one map between lookup keys and URLs. You can frequently use it to replace a lengthy list of RewriteRules with a single map file.

We'll start by creating the map file. We'll call it fish.map and put it in /usr/local/apache/conf, and it will look like this:

carp http://fish.net/carp.pl
trout http://fishermen.com/trout.php
guppy http://guppies.org/about.html
whale http://moby.dick.net/great.white.cfm

In the next step, we create a map name for that file, so that we can use it in RewriteRules.

RewriteMap fishmap txt:/usr/local/apache/conf/fish.map

And, finally, we'll use it in an actual RewriteRule. In this case, we want to redirect some requests for various fishes to sites about those fishes.

RewriteEngine On
RewriteRule ^/fish/(.*) ${fishmap:$1} [R]

Now, when someone visits http://myserver/fish/guppy, they will be redirected to http://guppies.org/about.html instead.

There's still one small problem, though. If they request the URL http://myserver/fish/salmon, the rule will be run, the fish will be looked up in the map, and nothing will match. If we want to provide a default place to go if nothing matches, we can add that to the RewriteRule:

RewriteRule ^/fish/(.*) ${fishmap:$1|http://no.fish.com/} [R] 
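To make the lookup-with-default behavior concrete, here is a minimal Perl sketch that mimics what ${fishmap:$1|http://no.fish.com/} does with the fish.map contents shown earlier. It is only an illustration of the idea, not how mod_rewrite is implemented; the helper names parse_map and lookup are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Contents of fish.map, inlined so the sketch is self-contained.
my $map_text = <<'END_MAP';
carp http://fish.net/carp.pl
trout http://fishermen.com/trout.php
guppy http://guppies.org/about.html
whale http://moby.dick.net/great.white.cfm
END_MAP

# parse_map (hypothetical helper): turn "key value" lines into a hash.
sub parse_map {
    my ($text) = @_;
    my %map;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($key, $value) = split ' ', $line;
        $map{$key} = $value;
    }
    return \%map;
}

# lookup (hypothetical helper): mimic ${mapname:key|default}.
sub lookup {
    my ($map, $key, $default) = @_;
    return exists $map->{$key} ? $map->{$key} : $default;
}

my $fishmap = parse_map($map_text);
for my $fish (qw(guppy salmon)) {
    printf "%-6s => %s\n", $fish, lookup($fishmap, $fish, 'http://no.fish.com/');
}
```

The guppy key is found in the map, while salmon is not and so falls through to the default URL.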

Alright, that's pretty simple, you say, but how does this help me if my needs are more complex than a simple one-to-one mapping? Well, that's where the prg: type of RewriteMap comes in. Whereas many rewrite rules can be expressed as a single line of regular expressions, some require several RewriteRule statements in a row, and others just seem to be more complex than one really wants to encode in an Apache configuration file. But you could write it in a few lines of Perl, right?

In fact, in a recent Apache class I taught, one of my students was rather irate that I left RewriteMap to the end. If I'd told them about that first, he said, the rest of it would have been unnecessary. I don't know if I'd go that far, but let's give a couple of small examples to illustrate.

In my first example, I want to replace all dashes (-) with underscores (_) in a URL. Now, you could do this with standard RewriteRule directives, using the [N] flag. But that gets icky, and people tend to get it wrong. However, it's pretty simple in Perl, so let's do it that way instead.

First of all, here's the Perl program that does the transformation. (This gets fired up when Apache starts, so you're not launching Perl with every request, or anything silly like that.)

    #!/usr/bin/perl
    use strict;
    use warnings;

    $| = 1;    # Turn off output buffering

    # Apache sends one lookup key per line; answer with one line each
    while (<STDIN>) {
        s/-/_/g;    # Replace - with _ globally
        print $_;
    }

We turn off buffering in the script because, in many cases, having buffered output can cause the rewriting process to hang indefinitely, waiting for the output to be returned.

We'll put this script in a file named dash2score.pl and put it in /usr/local/apache/conf, as we did with the other map, just for consistency. Make sure to make that script executable. Then we'll give the map a name:

RewriteMap dash2score prg:/usr/local/apache/conf/dash2score.pl 

Now we can use it in a RewriteRule:

RewriteEngine On
RewriteRule (.*-.*) ${dash2score:$1} [PT]

The pattern that I've used, (.*-.*), matches any requested URL that contains at least one dash, and causes the entire URL to be passed to the conversion script. The script does the conversion in one step and returns the result, and the RewriteRule passes the resulting URL back to the URL mapping engine to see what happens next.
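Before wiring the script into Apache, it can help to check the pattern and the substitution together from the command line. The following is a rough sketch under that assumption; the helpers dash2score and rewrite, and the sample URLs, are invented for illustration and only approximate what the rule does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# dash2score (hypothetical helper): the same substitution the map script makes.
sub dash2score {
    my ($url) = @_;
    (my $out = $url) =~ s/-/_/g;
    return $out;
}

# rewrite (hypothetical helper): apply the map only when the rule's
# pattern (.*-.*) matches; otherwise leave the URL alone.
sub rewrite {
    my ($url) = @_;
    return $url =~ /(.*-.*)/ ? dash2score($1) : $url;
}

for my $url ('/my-first-post.html', '/a-b/c-d-e.html', '/index.html') {
    print "$url -> ", rewrite($url), "\n";
}
```

URLs without a dash never reach the map at all, which is why the pattern insists on at least one dash.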

The more complex example involves database access. I came up with this example when trying to persuade WordPress to give me a particular kind of URL. I should note that, since then, some helpful WordPress developers have pointed out easier ways to do this. However, the technique itself was interesting enough that it inspired me to think of doing this article in the first place. So here it is.

In this case, we're going to look in a database for the information that we want:

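    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    $| = 1;    # Turn off output buffering

    my $dbh = DBI->connect('DBI:mysql:wordpress:dbserver',
                           'username', 'password')
        or die "Cannot connect: $DBI::errstr\n";
    my $sth = $dbh->prepare("SELECT ID FROM wp_posts
                            WHERE post_name = ?");

    # Rewrite permalink style links to article IDs
    while (my $post_name = <STDIN>) {
        chomp $post_name;
        $sth->execute($post_name);
        my ($ID) = $sth->fetchrow_array;    # undef if no post matches
        $sth->finish;

        # Answer NULL on a miss so mod_rewrite treats the lookup as failed,
        # rather than reusing the ID from an earlier request
        print defined $ID ? "/wordpress/index.php?p=$ID\n" : "NULL\n";
    }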

We create the rewrite function using the RewriteMap directive:

RewriteMap permalink prg:/usr/local/apache/conf/permalink.pl 

And then we can use it in rewrite rules:

RewriteRule ^/perm/(.*) ${permalink:$1} [PT]

In this case, a URL like http://servername/perm/wooga will cause a database lookup using the keyword "wooga."

One final word about how this works, and why it's not monstrously inefficient. The Perl script referred to in the RewriteMap directive starts when the Apache server is started, and keeps running for the life of the Apache server process. This is why the script needs a while (<STDIN>) loop, and why Apache doesn't need to relaunch the program with each request. If the directive were permitted in .htaccess files, the program would need to be launched with every request, which would be hugely inefficient.
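The request-and-reply cycle described above can be tried out without Apache. This sketch assumes a hypothetical serve_map helper that takes an input handle, an output handle, and a lookup code reference. In-memory filehandles stand in for the pipe from Apache, and a hash with made-up IDs stands in for the database. On a miss it answers with the string NULL, which is what mod_rewrite expects from a prg: map that has no value to return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# serve_map (hypothetical helper): read one key per line from $in and
# write exactly one answer line to $out for each key.
sub serve_map {
    my ($in, $out, $lookup) = @_;
    while (my $key = <$in>) {
        chomp $key;
        my $value = $lookup->($key);
        # The string NULL tells mod_rewrite that the lookup failed.
        print {$out} defined $value ? "$value\n" : "NULL\n";
    }
    return;
}

# A hash stands in for the wp_posts table; the IDs are made up.
my %post_id = (wooga => 42, 'hello-world' => 1);
my $lookup = sub {
    my ($name) = @_;
    return exists $post_id{$name}
        ? "/wordpress/index.php?p=$post_id{$name}"
        : undef;
};

# In-memory filehandles simulate the two ends of the pipe.
my $requests = "wooga\nno-such-post\n";
my $replies  = '';
open my $in,  '<', \$requests or die "Cannot open input: $!";
open my $out, '>', \$replies  or die "Cannot open output: $!";
serve_map($in, $out, $lookup);
close $out;

print $replies;
```

In a real map script the same helper would presumably be called as serve_map(\*STDIN, \*STDOUT, $lookup), with $| = 1 set first so that each reply is flushed immediately.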

I hope that this little tutorial will help you use RewriteMap for those cases when the RewriteRules are getting just a little too hairy.

See you on #apache.

Rich Bowen is a member of the Apache Software Foundation, working primarily on the documentation for the Apache Web Server. DrBacchus, Rich's handle on IRC, can be found on the web at www.drbacchus.com/journal.

Related Reading

Apache Cookbook
By Ken Coar, Rich Bowen

Showing messages 1 through 1 of 1.

  • RewriteMap
    2007-08-27 16:31:34  Jezzdk

    I recently came across your article on RewriteMap, and I have something to add in the case of the prg map example.
    Since Apache opens a kind of persistent connection to the script in question, any changes made to resources used by that script can break the entire thing, especially when databases are involved. In the given example:

    #!/usr/bin/perl
    use DBI;
    $|=1;
    my $dbh = DBI->connect('DBI:mysql:wordpress:dbserver',
                           'username', 'password');
    my $sth = $dbh->prepare("SELECT ID FROM wp_posts
                            WHERE post_name = ?");
    my $ID;

    # Rewrite permalink style links to article IDs
    while (my $post_name = <STDIN>) {
        chomp $post_name;
        $sth->execute($post_name);
        $sth->bind_columns(\$ID);
        $sth->fetch;

        print "/wordpress/index.php?p=$ID\n";
    }

    Imagine what would happen if the connection to the database is lost (e.g., due to a restart of the MySQL service). By putting the DBI->connect call inside the while loop, we make sure that a connection is established for each call.

    I'm no expert in either Perl or Apache, but for me it seems to work better this way. The question is, will it have an impact on performance (are connections closed properly, etc.)?


©2009, O'Reilly Media, Inc.