O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Listen Print Discuss Subscribe to Databases Subscribe to Newsletters

Using More Perl in PostgreSQL
Pages: 1, 2

Handing Data Back to the Database

On the other side of the coin, you can also now return records to PostgreSQL, either singly or in sets. The case of a set of records is probably the most interesting and useful. The simple way to return a set of records is to hand them back to PostgreSQL via the return statement, as an arrayref of hashrefs, with the column names as keys of the hashes, one hash per row. Notice that this is exactly the way that PostgreSQL hands rowsets to PL/Perl via spi_exec_query(). In fact, you could take the rows member of the result from spi_exec_query() and just return it, as long as it had the right keys in each hash.



Any missing field in a return value will become a NULL, as will any field whose value is undef.

You can also hand back a single record to PostgreSQL as a single hashref. PostgreSQL 8.1 allows the use of arrayrefs too.

How is this useful? Consider a task that is not easy in SQL but fairly simple using PL/Perl. Suppose that you have a table of exam scores. The table contains the name of the person with the score, the class he or she is in, and the actual score. The sample script contains some necessary type definitions as well as a PL/Perl function to populate the table with some test data. The goal is to augment that information with the ranking of the people, but only within their class. The function get_ranks() in the sample script will do that. Notice that the SQL orders the results by class and then descending score, in order to do control break processing.

In general, you need to use a PL function for any processing that needs to keep track of state, as this does. This example keeps track of the class so that it can reset the rank counter when it changes. The processing simply gets each row, looks to see whether there's a change in class and resets the counter if there is, stores the rank in the row, and stores the row on an arrayref to return at the end of processing. The ordering breaks ties in score (that's why it sorts by name), but implementing a tie-keeping mechanism would be a fairly simple alteration.

You can use a function like this one that returns a set of records as a table expression in the FROM clause of a SELECT query, just as you would a function returning a single value in the SELECT or WHERE clauses. The result looks like:

andrew=# select * from get_ranks();
     name  | class  | score | class_rank 
-------+--------+-------+------------
     Ellen | green  |    91 |          1
     Anne  | green  |    37 |          2
     Bill  | green  |    30 |          3
     May   | green  |    23 |          4
     Joe   | green  |    11 |          5
     Fred  | green  |     1 |          6
     Joe   | red    |    93 |          1
     Fred  | red    |    69 |          2
     Bill  | red    |    53 |          3
     May   | red    |    51 |          4
     Anne  | red    |    45 |          5
     Ellen | red    |    14 |          6
     Anne  | yellow |    88 |          1
     Ellen | yellow |    68 |          2
     May   | yellow |    25 |          3
     Joe   | yellow |    21 |          4
     Bill  | yellow |    11 |          5
     Fred  | yellow |     8 |          6
(18 rows)

Memory Problems with Lots of Data

Fetching data into PL/Perl using the above method requires storing the whole resultset in Perl's main memory. Similarly, if you just return an array of hashrefs, PL/Perl first creates these in Perl's memory all at once, before handing them off to PostgreSQL. Because databases are often about handling huge amounts of data, which can frequently exceed the size of available memory, this is not a good idea. In version 8.0 there's no real solution--you just have to make sure you don't SELECT or return huge data sets that will use lots of memory. Version 8.1 of PostgreSQL provides some much better ways of doing things that do not suffer from the memory bloat problem. The credit for this goes to Abhijit Menon-Sen, who has done some great work on Perl and PostgreSQL.

The first tip is that when fetching a large data set, you should use spi_query() and spi_fetchrow() instead of spi_exec_query(). spi_query() returns a handle to use as the parameter to spi_fetchrow(), which returns one row at a time, or undef when there are no more rows. So you never need to get all the rows into Perl memory at one time. Just fetch one row at a time in a loop.

To return values, use a similar mechanism--in fact, to return large amounts of data your function should never return it directly at all--the return value of the function should be undef! Instead, use a new callback function called return_next(), which you give a single return value.

The function get_ranks_revised() in the sample script uses the these new mechanisms to produce the same results as above. Transforming the function was quite trivial, and the new version not only uses less memory but runs faster too.

Next Article

The final article in this series will examine sharing data between PL/Perl functions, turning on strict mode, Unicode string handling, and the road map for further enhancements to PL/Perl.

Andrew Dunstan works for a small consulting and software company in the Triangle area of North Carolina, and contributes to PostgreSQL as an enthusiastic hobbyist as well as a sometime professional user.


Return to the Databases DevCenter


Have a question about PL/Perl? Ask Andrew here.
You must be logged in to the O'Reilly Network to post a talkback.
Post Comment


Tagged Articles

Be the first to post this article to del.icio.us

Sponsored Resources

  • Inside Lightroom

Related to this Article

Understanding Oracle Clinical Understanding Oracle Clinical
by Joan M. Johnson
May 2007
$9.99 USD

Inside SQLite Inside SQLite
by Sibsankar Haldar
April 2007
$9.99 USD

Advertisement
O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
O'Reilly FYI
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com