O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Listen Print Discuss Subscribe to Databases Subscribe to Newsletters

Introducing Slony

by A. Elein Mustain
11/18/2004

Slony is the Russian plural for elephant. It is also the name of the new replication project being developed by Jan Weick. The mascot for Slony, Slon, is a good variation of the usual Postgres elephant mascot, created by Jan.

Slon, the Slony mascot
Figure 1. Slon, the Slony mascot.

Slony-I, the first iteration of the project, is an asynchronous replicator of a single master database to multiple replicas, which in turn may have cascaded replicas. It will include all features required to replicate large databases with a reasonable number of replicas. Jan has targeted Slony-I toward data centers and backup sites, implying that all nodes in the network are always available.

The master is the primary database with which the applications interact. Replicas are replications, or copies of the primary database. Since the master database is always changing, data replication is the system that enables the updates of secondary, or replica, databases as the master database updates. In synchronous replication systems, the master and the replica are consistent exact copies. The client does not receive a commit until all replicas have the transaction in question. Asynchronous replication loosens that binding and allows the replica to copy transactions from the master, rolling forward, at its own pace. The server issues a commit to the master client based on the state of the master database transaction.

Cascading replicas over a WAN minimizes bandwidth, enabling better scalability and also enables read-only (for example, reporting) applications to take advantage of replicas.

cascading replicas
Figure 2. Cascading replicas

Assume you have a primary site, with a database server and a replica as backup server. Then you create a remote backup center with its own main server and its backup replica. The remote primary server is a direct replica, replicating from the master over the WAN, while the remote secondary server is a cascaded replica, replicating from the primary server via the LAN. This avoids transferring all of the transactions twice over the WAN. More importantly, this configuration enables you to have a remote backup with its own local failover already in place for cases such as a data center failure.

Slony's design goals differentiate it from other replication systems. The initial plan was to enable a few very important key features as a basis for implementing these design goals. An underlying theme to the design is to update only that which changes, enabling scalable replication for a reliable failover strategy.

The design goals for Slony are:

  1. The ability to install, configure, and create a replica and let it join and catch up with a running database.

    This allows the replacement of both masters and replicas. This idea also enables cascading replicas, which in turn adds scalability, limitation of bandwidth, and proper handling of failover situations.
  2. Allowing any node to take over for any other node that fails.

    In the case of a failure of a replica that provides data to other replicas, the other replicas can continue to replicate from another replica or directly from the master.

    replication continues after a failure
    Figure 3. Replication continues after a failure

    In the case where a master node fails, a replica can receive a promotion to become a master. Any other replicas can then replicate from the new master. Because Slony-I is asynchronous, the different replicas may be ahead of or behind each other. When a replica becomes a master, it synchronizes itself with the state of the most recent other replica.

    In other replication solutions, this roll forward of the new master is not possible. In those solutions, when promoting a replica to master, any other replicas that exist must rebuild from scratch in order to synchronize with the new master correctly. A failover of a 1TB database leaves the new master with no failover of its own for quite a while.

    The Slony design handles the case where multiple replicas may be at different synchronization times with the master and are able to resynchronize when a new master arises. For example, different replicas could logically be in the future, compared to the new master. There is a way to detect and correct this. If there weren't, you would have to dump and restore the other replicas from the new master to synchronize again.

    It's possible to roll forward the new master, if necessary, from other replicas because of the packaging and saving of the replication transactions. Replication data is packaged into blocks of transactions and sent to each replica. Each replica knows what blocks it has consumed. Each replica can also pass those blocks along to other servers--this is the mechanism of cascading replicas. A new master may be on transaction block 17 relative to the old master, when another replica is on transaction block 20 relative to the old master. Switching to the new master causes the other replicas to send blocks 18, 19, and 20 to the new master.

    Jan, said, "This feature took me a while to develop, even in theory."

  3. Backup and point-in-time capability with a twist.

    It is possible, with some scripting, to maintain a delayed replica as a backup that might, for example, be two hours behind the master. This is done by storing and delaying the application of the transaction blocks. With this technique, it is possible to do a point-in-time recovery anytime within the last two hours on this replica. The time it takes to recover only depends on the time to which you choose to recover. Choosing "45 minutes ago" would take about one hour and 15 minutes, for example, independent of database size.
  4. Hot PostgreSQL installation and configuration.

    For failover, it must be possible to put a new master into place and reconfigure the system to allow the reassignment of any replica to the master or to cascade from another replica. All of this must be possible without taking down the system.

    This means that it must be possible to add and synchronize a new replica without disrupting the master. When the new replica is in place, the master switch can happen.

    This is particularly useful when the new replica is a different PostgreSQL version than the previous one. If you create an 8.0 replica from your 7.4 master, it now is possible to promote the 8.0 to master as a hot upgrade to the new version.

  5. Schema changes.

    Schema changes require special consideration. The bundling of the replication transactions must be able to join all of the pertinent schema changes together, whether or not they took place in the same transaction. Identifying these change sets is very difficult.

    In order to address this issue, Slony-I has a way to execute SQL scripts in a controlled fashion. This means that it is even more important to bundle and save your schema changes in scripts. Tracking your schema changes in scripts is a key DBA procedure for keeping your system in order and your database recreatable.

Related Reading

Practical PostgreSQL
By John C. Worsley, Joshua D. Drake

The first part of Slony-I also does not address any of the user interface features required to set up and configure the system. After the core engine of Slony-I becomes available, development of the configuration and maintenance interface can begin. There may be multiple interfaces available, depending on who develops the user interface and how.

Jan points out that "replication will never be something where you type SETUP and all of a sudden your existing enterprise system will nicely replicate in a disaster recovery scenario." Designing how to set up your replication is a complex problem.

The user interface(s) will be important to clarify and simplify the configuration and maintenance of your replication system. Some of the issues to address include the configuration of which tables to replicate, the requirement of primary keys, and the handling of sequence and trigger coordination.

The Slony-I release does not address the issues of multi-master, synchronous replication or sporadically synchronizable nodes (the "sales person on the road" scenario). However, Jan is considering these issues in the architecture of the system so that future Slony releases may implement some of them. It is critical to design future features into the system; analysis of existing replication systems has shown that it is next to impossible to add fundamental features to an existing replication system.

The primary question to ask regarding the requirements for a failover system is how much down time can you afford. Is five minutes acceptable? Is one hour? Must the failover be read/write, or is it acceptable to have a read-only temporary failover? The second question you must ask is whether you are willing to invest in the hardware required to support multiple copies of your database. A clear cost/benefit analysis is necessary, especially for large databases.

References

  • General Bits Slony Articles on Tidbits
  • The Slony-I Project documentation on GBorg
  • Slonik Commands
  • Jan Wieck's Original Slony-I Talk and Scripts, July 2004 in Portland, OR, sponsored by Affilias Global Registry Services
  • Information from IRC's #slony on freenode.net
  • The Slony1-general@gborg.postgresql.org mailing list

A. Elein Mustain has more than 15 years of experience working with databases, 10 of those working exclusively with object relational database systems.


Return to ONLamp.com.


Have a question about Slony and its aims, or a suggestion for Elein to cover next time? Let us know.
You must be logged in to the O'Reilly Network to post a talkback.
Post Comment
Full Threads Newest First

Showing messages 1 through 3 of 3.

  • nice article
    2004-11-23 11:13:10  adam@devtty.net [Reply | View]

    Thanks for this update on Slony. I've been aware that this released a 1.0 recently and meant to check it out.

    Nice job on giving the overview on the tech, I may actually be motivated to set this up...

  • a couple questions
    2004-11-23 21:13:34  dlang [Reply | View]

    this seems to be very nice but a couple questions came to mind as I read this

    1. in the case of a delayed replication (2 hour delay) that you want to restore to match the real one as of 45 min ago, why would it take 1 hour and 45 min to get there ? why couldn't you identify how far you need to go and then process the updates as fast as possible to get to that point in time?

    2. could something similar to the script delayed update be used for the traveling salesman problem?

    create an additional 'slave' that accepts updates and delayes them indefinantly until the intermittent client connects, downloads all the pending updates, and processes them.
    • a couple questions
      2005-09-14 13:48:13  DarcyB [Reply | View]

      1) There is a feature in the works in -HEAD to facilitate this very item, This will allow a replica to stop processing events as of a specific point in time, though it only applys to a currently subscribed set.

      2) With the advent of log shipping in 1.1 yes this could easily be accomplished.


Tagged Articles

Post to del.icio.us

This article has been tagged:

database

Articles that share the tag database:

MySQL FULLTEXT Searching (54 tags)

Live Backups of MySQL Using Replication (53 tags)

Advanced MySQL Replication Techniques (53 tags)

Dreaming of an Atom Store: A Database for the Web (49 tags)

How to Misuse SQL's FROM Clause (38 tags)

View All

postgresql

Articles that share the tag postgresql:

Managing Many-to-Many Relationships with PL/pgSQL (17 tags)

Writing PostgreSQL Functions with PL/pgSQL (11 tags)

Datamining Apache Logs with PostgreSQL (8 tags)

Introducing Slony (8 tags)

Batch Updates with PL/pgSQL (8 tags)

View All

postgres

Articles that share the tag postgres:

Writing PostgreSQL Functions with PL/pgSQL (8 tags)

Introducing Slony (5 tags)

Automating PostgreSQL Tasks (4 tags)

Modifying Slony Clusters (3 tags)

Managing Many-to-Many Relationships with PL/pgSQL (3 tags)

View All

db

Articles that share the tag db:

Live Backups of MySQL Using Replication (18 tags)

Configuring Database Access in Eclipse 3.0 with SQLExplorer (6 tags)

Going Native: Making the Case for XML Databases (6 tags)

Berkeley DB XML: An Embedded XML Database (5 tags)

Introducing Slony (4 tags)

View All

replication

Articles that share the tag replication:

Advanced MySQL Replication Techniques (80 tags)

Live Backups of MySQL Using Replication (42 tags)

Building and Configuring Slony (4 tags)

Introducing Slony (4 tags)

Modifying Slony Clusters (3 tags)

View All

Sponsored Resources

  • Inside Lightroom

Related to this Article

Understanding Oracle Clinical Understanding Oracle Clinical
by Joan M. Johnson
May 2007
$9.99 USD

Inside SQLite Inside SQLite
by Sibsankar Haldar
April 2007
$9.99 USD

Advertisement
O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
O'Reilly FYI
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com