O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Listen Print Discuss Subscribe to Databases Subscribe to Newsletters

Upgrading a MySQL Application
Pages: 1, 2

Migrating Data

To eliminate the redundancy in the sales_staff table, the upgrade program first needs to run an ALTER TABLE statement to add a column for the employee id.



For tidiness, the developer chose to add the emp_id column after the sales_id column (the first and primary key column):

ALTER TABLE sales_staff 
ADD COLUMN emp_id INT 
AFTER sales_id;

Next, the employee id number must be determined for each row. This can be messy and make the developer wish he had planned his application better. Nevertheless, that's in the past and it has to be dealt with now. To update most of the records, the upgrade program will run the following SQL statements:

CREATE TABLE temp_empid
   (empid INT, salesid INT);

INSERT INTO temp_empid
   SELECT   employees.emp_id, sales_id
   FROM     employees, sales_staff
   WHERE    emp_first = rep_first
     AND    emp_last  = rep_last; 

UPDATE   sales_staff, temp_empid
   SET   emp_id   = empid
   WHERE sales_id = salesid;

DROP TABLE temp_empid;

The first SQL statement creates a temporary table, temp_empid to hold each employee's id number and the related sales id. The second SQL statement inserts into temp_empid each emp_id selected from employees and the related sales_id selected from sales_staff, where the first and last name columns of each table match. The third SQL statement updates sales_staff by setting the values of the new column emp_id equal to the matching empid column in temp_empid where the sales id columns of each of these tables match. Finally, the fourth SQL statement deletes the temporary table. All of these SQL statements could be replaced with one statement without the use of a temporary table:

UPDATE sales_staff, employees
   SET   sales_staff.emp_id = employees.emp_id
   WHERE emp_first          = rep_first
     AND emp_last           = rep_last;

This will work, but being split apart into a set of three SQL statements like above can help to solve some data migration problems.

Migration Problems

There are some potential problems with running the set of three SQL statements above unchecked. First, before running the UPDATE statement it's a good idea to extract a list of rows from temp_empid at least for where there are duplicates (if not for all rows) so that the user can adjust the information before updating sales_staff. The following SQL statement will help:

SELECT COUNT(sales_id), sales_id, CONCAT(rep_first, ' ', rep_last)
FROM     employees, sales_staff
WHERE    emp_first = rep_first
AND      emp_last  = rep_last
GROUP BY employees.emp_id;

A Perl program would use this statement to capture the count of rows and the name and sales id of each sales representative. For counts greater than one, the user would be given a list of salespeople. By capturing sales_id, the upgrade program can run another SELECT statement to retrieve the entire related record from sales_staff if requested, so that the user can have more data to associate rows in temp_empid to salespeople. The programs, of course, will have to keep circling back and querying the database again for duplicates until there are none. Then the UPDATE statement can be run.

For rows in sales_staff where there are no matches (probably due to different spellings of names), the user should be asked to adjust the data. The simplest way to deal with these records is to run the following SQL statement after the UPDATE statement is run to select only records that were not updated (where emp_id in sales_staff is still NULL):

SELECT sales_id, CONCAT(rep_first, ' ', rep_last)
FROM  sales_staff
WHERE emp_id IS NULL;

One potential issue with the all-in-one UPDATE statement is it refers to multiple tables. This feature was added to MySQL in version 4.0. Clients using an earlier version will need a workaround. This requires getting Perl involved in performing the updates instead of using a multi-table UPDATE statement. Here's one way in which it could be done in Perl after the CREATE TABLE and the INSERT statements are run, but before the DROP statement:

#!/usr/bin/perl -w

use strict;
use DBI;

# Client Variables
my $db   = $ENV{DB};
my $pw   = $ENV{PW};
my $user = $ENV{USR};

# Connect to MySQL
my $dbh = DBI->connect( "DBI:mysql:$db:localhost", $user, $pw )
	or die "Failed: " . DBI->errstr();

# Get rows from temp_empid
my $sql = "SELECT empid, salesid FROM temp_empid";
my $sth = $dbh->prepare($sql);
$sth->execute();

my $empids = $sth->fetchall_arrayref();
$sth->finish();

# Loop through array of temp_empid rows 
# and update sales_staff table
foreach my $emp (@$empids) {    
	my ($empid,$salesid) = @$emp;

	my $sql = "UPDATE sales_staff 
		SET emp_id = ?
		WHERE sales_id = ?";

	my $sth = $dbh->prepare($sql);
	$sth->execute( $empid, $salesid );
	$sth->finish();
}

$dbh->disconnect();

exit;

The opening lines of this script (sales_staff_update.plx) use the Perl DBI module and set up the client variables. The script then establishes a connection to the MySQL server or dies if it cannot. Next, the program basically gets all of the records at once from the temporary table. Each row of the results, that is, each salesperson, is stored in an array, itself stored in a comprehensive array. The scalar $empids holds a reference to this parent array. The script then loops over the salespeople (each row of temp_empid), updating the sales_staff table where the sales id numbers match. Once this is done, the script disconnects from the MySQL server and exits. The script above is run by the upgrade program like this:

perl sales_staff_update.plx 2> \
	'./logs/sales_staff_update'

Any error messages are redirected to a text file which may be sent to the developer as needed.

Updating a Reference Table

The application's database has a table (makes_models) that contains the names of hundreds of makes and models of automobiles. This table can be joined to the inventory table by a column called model_id. It saves the user from having to type the make and model name for each automobile entered into inventory. The client does, however, have the option of adding models not already in the table. As part of the upgrade, the developer wants to seize the opportunity not only to add new models to the client's makes_models table, but also to get a copy of the user's additions so that the developer can add them to his master table for future releases.

In preparation for the upgrade, the developer exported makes_models from his master copy of the database using mysqldump:

$ mysqldump --user='root' --password='abc123' \
	--add-locks --no-create-info car_dealer \
	makes_models > makes_models.sql

The --no-create-info option stops the CREATE TABLE statement from being added to the output file. There is a problem with the file that this will create, though. The developer doesn't want to run INSERT statements because they might cause duplicates (if not errors). Instead, he wants to run REPLACE statements, which will only insert a row if there is not already a row with the same primary key in the table. If a row already exists, then MySQL will replace it with the imported record. Unfortunately, there isn't an option with mysqldump to generate REPLACE statements instead of INSERT statements. But since the syntax of both statements are identical, a simple Perl one-liner like the following can fix the exported file before it's distributed to clients:

$ perl -p -i.bak -e 's/INSERT/REPLACE/;' makes_models.sql

The -p switch tells Perl to process each line of the input file. The -i.bak here causes Perl to create a backup file called makes_models.sql.bak in the current directory. The original file will be replaced. The -e switch tells Perl that what follows in quotes is to be executed by Perl. The bit in quotes substitutes the first occurrence of INSERT on each line with REPLACE.

With the makes_models.sql dump file properly adjusted, the upgrade program only has to run it on the client's server to update the client's makes_models table. After that has been done, the upgrade program will then need to query the database for user generated rows in makes_models.

As mentioned before, the primary key of makes_models is model_id. This is not an automatically incremented numeric column. Instead, it's a fixed length alphanumeric character column (CHAR(9)). The first four characters are letters representing the make and the last five are sequential numbers (e.g., FORD12345) This is a little old fashioned, but the developer is more comfortable with this method. When the user creates a new model, the application generates an identifier with the letters USER followed by a five character number. One advantage to this is that user created rows won't be overwritten by updates. Another advantage is that it makes it easy to extract these rows. The upgrade program will use the following SQL statement to create a temporary table (exp_models) based on makes_models, select the user added rows, and insert them into the temporary table:

CREATE TABLE exp_models
   SELECT * 
   FROM  makes_models 
   WHERE model_id LIKE 'USER%';

The first line here creates the temporary export table based on the types of the columns selected from makes_models—in this case, all (*) columns. As for the WHERE clause, the % is a wildcard. Rows that start with USER and end in any number of characters will be selected.

After this, the upgrade program merely needs to run mysqldump to export the temporary table and then use the system's e-mail service to mail it to the developer (preferably with the client's permission). Once this is done, it can issue a DROP statement to delete the temporary table.

Undoing or Finishing

With the table and data changes completed, the new and replacement scripts can be installed. This is just a matter of copying files with the cp command. Once it's done, the user should be asked to verify the integrity of the data and to make sure that the new scripts are working properly. If the user responds that there are problems, the upgrade program should offer to undo the upgrade and to restore the data. The scripts that are tarred and gzipped in the backup directory could be restored like this:

cd $SCRIPT_DIR
tar -xvfz $UP_DIR/backup/scripts.tar.gz

The first line changes to the directory containing the application's scripts. In the second line, the tar command extracts all of the files backed up and restores them. In this case, the z switch will gunzip the files. The x switch will then untar the file (the f switch). The v option (for verbose) will allow the user to see what's happening. This is good for reassuring the user. As for scripts that the upgrade added, the restoration program can run a series of rm command statements to remove them individually. All that will remain then is to restore the tables that were tampered with and their data. This can be done by using the mysqldump files that were created at the beginning:

mysql --user="$USR" --password="$PW" \
      $DB < ./backup/sales_staff.sql

This command causes MySQL to run the SQL statements contained in the text file created by mysqldump. After this, the restoration program should e-mail the developer a message that the upgrade failed, along with copies of the log files that the upgrade generated.

If the upgrade was successful, though, the upgrade program could then delete the backup files and the installation program. However, it would be prudent to leave everything in place in case the client notices a problem with the upgrade a few days later. The files could be deleted during the next upgrade.

Closing Comments

This article has only illustrated one table change and one reference table update. However, it has touched on many of the factors that need to be considered in a MySQL application upgrade. A real upgrade would simply be more of the same. As with most situations with databases, a database application upgrade is tedious and requires a tremendous amount of planning and testing. To save yourself plenty of grief, be prepared for failure by making backups and by developing a restoration script. This will minimize client complaints and make it easier to recover the client's data and to keep the client's patronage. You should also try to consider every deviation that the client might make, and assume that the clients have no technical abilities. The smoother the upgrade, the less follow-up telephone support that will be required. This will lead to more referrals and more money per hour for your work in developing your application.

Russell Dyer has worked full-time for several years as a free-lance writer of computer articles, primarily on MySQL.


Return to ONLamp.com.


Have you any tips on migrating and upgrading existing applications? Share them here.
You must be logged in to the O'Reilly Network to post a talkback.
Post Comment
Full Threads Oldest First

Showing messages 1 through 4 of 4.

  • DBI error checking
    2003-10-22 05:58:22  hmerrill [Reply | View]

    Good article, but the DBI code lacks error checking. The connect should at least be setting RaiseError *on* (to 1) so that any error with a DBI statement causes a die. Read about DBI error checking by doing 'perldoc DBI' at a command prompt, and searching (using the forward slash '/') for both 'RaiseError' and 'Transactions'.
  • locking table with mysql -e
    2003-10-22 02:02:34  anonymous2 [Reply | View]

    IIRC it's not correct because locks are dropped when client disconnects
    • locking table with mysql -e
      2004-02-02 07:08:25  anna4 [Reply | View]

      Agreed, better to use --lock-tables (not --add-locks as in article)
  • Version check
    2003-10-18 01:15:55  anonymous2 [Reply | View]

    I think you shoud have an "app_version" or "table_version" field somewhere in the database, so you can be sure you're not updating an already up-to-date application. Updating a table more than once could be dangerous for the data and for the backup.


Tagged Articles

Be the first to post this article to del.icio.us

Sponsored Resources

  • Inside Lightroom

Related to this Article

Understanding Oracle Clinical Understanding Oracle Clinical
by Joan M. Johnson
May 2007
$9.99 USD

Inside SQLite Inside SQLite
by Sibsankar Haldar
April 2007
$9.99 USD

Advertisement
O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
O'Reilly FYI
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com