What is Cricket? Cricket is an easy-to-set-up application for recording page-load times, and it has a nice web-based grapher that will generate charts to display the data in several formats.
Everyone wants something different from our web server. Marketeers want to know the geographical areas that hits are coming from. Editors want to know what content readers find most interesting. Advertisers demand to know how many times their banner ads are delivered, and from what pages.
And of course, as system administrator, I need to know that pages are being delivered promptly. I can't get that information from conventional web server log files. Instead, I need a system that can periodically load-test pages from each of my servers, and log how long it takes. By graphing that data, I can get a feel for how well my servers are running.
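To make the idea concrete, here is a minimal Python sketch of the kind of measurement I mean: fetch a page and note how long it took. It is only an illustration, not the script Cricket uses, and the function names time_call and fetch_url are made up for this example.

```python
import time
import urllib.request


def time_call(fetch):
    """Run fetch() once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fetch()
    return time.perf_counter() - start, result


def fetch_url(url, timeout=30):
    """Download url and return the number of bytes read."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())
```

To time a real page, you would call something like time_call(lambda: fetch_url("http://yoursystem/")) and log the first value of the returned pair. Run every few minutes, that gives you the raw numbers to graph.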
For the last year, I have been using a free software package, Cricket, that's perfect for this application.
Cricket can be easily set up to record page-load times, and it has a nice web-based grapher that will generate charts to display the data in several formats. Cricket is based on RRDtool, whose ancestor is MRTG (short for "Multi-Router Traffic Grapher"). RRDtool (Round Robin Database Tool) is a package that collects data in "round robin" databases: the database tables are sized when they are created and never grow larger, so running Cricket does not slowly fill up your disks. As the data ages, it's averaged. This works fine for the graphing we are doing here.
Each RRDtool table "wraps around": The detailed "hourly" table has readings at 5-minute intervals for the last 24 hours, the "daily" table has averages of the load times for the last seven days, and so on. It's pretty intuitive when you look at the graphs, but you should be aware that if you use RRDtool as your logging system, you won't be able to get back exact 5-minute interval readings for something that took place a month ago; only the interpolated averages are stored. This is good enough for most purposes, however.
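If the "wrap around" idea is new to you, this small Python sketch may help. It is a toy model and not RRDtool's actual file format: the class name RoundRobinArchive and the table sizes are assumptions chosen for illustration. Each archive holds a fixed number of rows, and each row is the average of a fixed number of readings, so old detail is averaged and then overwritten.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: keeps the newest `rows` averages, each over `steps` readings."""

    def __init__(self, steps, rows):
        self.steps = steps
        self.rows = deque(maxlen=rows)  # the oldest row is dropped automatically
        self._pending = []

    def update(self, value):
        """Add one reading; store an average once `steps` readings have arrived."""
        self._pending.append(value)
        if len(self._pending) == self.steps:
            self.rows.append(sum(self._pending) / self.steps)
            self._pending = []


# Illustrative sizes only: 288 five-minute readings cover 24 hours;
# 6 readings per row gives 30-minute averages, and 336 such rows cover 7 days.
detail = RoundRobinArchive(steps=1, rows=288)
averaged = RoundRobinArchive(steps=6, rows=336)

for minute in range(0, 3 * 24 * 60, 5):     # three days of made-up readings
    load_time = 1.0 + (minute % 60) / 60.0  # fake load time in seconds
    detail.update(load_time)
    averaged.update(load_time)

print(len(detail.rows), len(averaged.rows))  # prints: 288 144
```

After three days the detailed archive still holds only its last 288 readings, while the averaged archive has kept a coarser record of the whole period. Neither one will ever take more room than it did on the first day.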
Tracking page-load times is just one small facet of what Cricket can do for you. I use it to track how much traffic travels across our Etherswitch so that we can track bandwidth utilization of our subnet and each individual host. Cricket uses SNMP (Simple Network Management Protocol) to log data from all sorts of networked gear.
In addition, you can even write your own "collector" scripts that log data on just about anything, and then use Cricket to graph the data. I tested this by logging subscriber counts for a mailing list.
In this article, we're going to set up Cricket to track web server performance. Cricket comes with good documentation, and sample setups for network gear. Once you've got it running, refer to these documents to take on more ambitious projects. There is also a Cricket mailing list that you can sign up for to get more help. (See the Cricket web site for info on how to sign up.)
When Cricket is installed to log page-load times, I can show folks the before and after pictures when I perform system tuning operations such as adding memory or a new hard drive. This helps to justify the expense by showing concrete evidence that the change has improved load times.
If you make a major software change, you can track its effect on load times. Keep in mind that you have to select the right URLs to track. If you turn on server-side "includes" but point Cricket only at a page with no includes in it, you won't get as sharp a picture of the change resulting from the new software.
Next, we'll walk through the Cricket installation process step by step.
For our basic Cricket setup, let's assume you're going to install it on a Linux server that already has an Apache server running on it. You'll also need the basic development tools installed: make, the GCC compiler, and the Linux kernel headers.
I found these instructions in the beginner.txt file in the cricket/doc directory. My instructions are more explicitly geared for a Linux system; if you are running something else, you'll probably want to refer to the beginner.txt file as well. So, let's get started.
You will need a recent version of Perl -- version 5.004 or newer. To check the version number, use the perl -v command.
You will also need these packages from CPAN (Comprehensive Perl Archive Network); you may have some of them already.
| Package name | Where to get it |
| MD5 | CPAN: by-authors/id/GAAS/Digest-MD5-*.tar.gz |
| LWP | CPAN: by-authors/id/GAAS/libwww-perl-*.tar.gz |
| DB_File | CPAN: by-authors/id/PMQS/DB_File-*.tar.gz |
| Date::Parse | CPAN: by-authors/id/GBARR/Timedate-*.tar.gz |
| Time::HiRes | CPAN: by-authors/id/DEWEG/Time-HiRes-*.tar.gz |
If you have the CPAN module installed and configured, you can issue the following commands while running as "root." If you have the CPAN module but have not configured it, it will ask some questions the first time you run it. Go ahead and give it a shot; it's not that hard. Otherwise, you can go to the CPAN.org web site, find the modules, download and unpack them, and build and install each one by following its ReadMe file. This is known as doing it the hard way.
Each of these CPAN module commands will install the latest version of each package. It is safe to run the command; if the latest version is already installed, it will just tell you that and stop.
perl -MCPAN -eshell
cpan> install MD5
cpan> install LWP
cpan> install DB_File
cpan> install Date::Parse
cpan> install Time::HiRes
cpan> quit
There are two more packages that are not in the CPAN archives, so you have to fetch and install them separately. Use your web browser to find the latest version of each and download them to a spot on your system where you can unpack and build the package. Then use your rootly powers to install it.
First the SNMP_Session package: For this HTTP tracking project we actually don't need to use any SNMP services, but Cricket requires the package so you have to install it anyway.
Here are the abbreviated instructions on building version 0.76:
% tar xzf SNMP_Session-0.76.tar.gz
% cd SNMP_Session-0.76
% perl Makefile.PL
% make
% su root
Password:
# make install
# exit
% cd ..
Next, the RRDtool package, which Cricket uses to store its data. Here are the abbreviated instructions on building version 1.0.11:
% tar xzf rrdtool-1.0.11.tar.gz
% cd rrdtool-1.0.11
I had to do this to get configure to work on one of my systems:
% unset noclobber
% ./configure
% make
% su root
Password:
This next line will install the RRD Perl modules in your system's
standard site-perl directory tree instead of putting them in a separate
location (which is what make install does). This is necessary for
the Cricket scripts to find the RRD modules.
# make site-perl-install
Now, as root, create a user account that will run Cricket.
These commands work on a Linux system; use your own preference on your system to create a Cricket user. You don't strictly need a separate Cricket account, but I find it is a lot easier this way.
# groupadd cricket
# useradd -g cricket -c 'Cricket Traffic Grapher' cricket
# passwd cricket
# chmod 755 ~cricket
Set an alias to receive Cricket's mail.
# echo "cricket: root" >> /etc/aliases
# newaliases
# exit
Download and install Cricket.
% su - cricket
Password:
Now that you're running as cricket, use a browser to download the Cricket source archive from the Cricket web site.
Here are the abbreviated instructions on installing version 0.72:
% tar xzf cricket-0.72.tar.gz
Using this symbolic link will allow you to upgrade easily later:
% ln -s cricket-0.72 cricket
% cd cricket
Running "configure" will fix up the first line of each Cricket script so they can find Perl on your system.
% ./configure
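The reason the cricket symbolic link makes upgrades easy is that everything else refers to ~cricket/cricket, so switching versions only means repointing one link. The Python sketch below shows that idea on a Unix-like system. The function name repoint_symlink and the version number 9.99 are invented for the example; on a real system a plain ln command does the same job.

```python
import os
import tempfile


def repoint_symlink(link_path, new_target):
    """Point link_path at new_target, replacing any existing link with one rename."""
    temp_link = link_path + ".new"
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(new_target, temp_link)
    os.replace(temp_link, link_path)  # rename the new link over the old one


with tempfile.TemporaryDirectory() as home:
    # cricket-9.99 is a made-up future release, used only for this demonstration.
    for version in ("cricket-0.72", "cricket-9.99"):
        os.mkdir(os.path.join(home, version))
    link = os.path.join(home, "cricket")
    repoint_symlink(link, "cricket-0.72")  # the initial install
    repoint_symlink(link, "cricket-9.99")  # a later upgrade
    print(os.readlink(link))  # prints: cricket-9.99
```

The old directory stays on disk untouched, so if the new version misbehaves you can point the link back at it.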
Now we get into the configuration; this is the most
complicated step. Copy the sample configuration tree to
the ~cricket/cricket-config directory, which is where
the installed Cricket will look for it.
% cd ..
% cp -r cricket/sample-config cricket-config
There are lots of setup files in there that won't be used, but you
may want them later, and they won't hurt anything in the meantime. For now,
we are only interested in cricket-config/http-performance. Edit
the urls file.
% cd cricket-config/http-performance
% ls
Defaults urls
% emacs urls
The sample file looks like this. Remove those entries and put in entries for whatever you would like to monitor.
target cricket-home
short-desc = "The Cricket Homepage"
url = "http://www.munitions.com/~jra/test-file.txt"
target www.cnn.com
url = "http://www.cnn.com"
The target name (for example, cricket-home) will be used to name the database table, and as the default name in the web interface.
The short-desc text will override the target name in the web interface.
You can have as many targets as you like; each one will cause a database table of approximately 60 kilobytes to be created. If it takes more than five minutes to collect a set of data, you will receive warning messages telling you that the collection subtree is locked. You can ignore these messages or change the collection interval.
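As a rough illustration of how those entries fit together, the following Python sketch reads the sample file shown above, shows which label the web interface would use for each target, and estimates the disk space from the 60-kilobyte figure. It is a simplified reader written for this article's sample only, not Cricket's real configuration parser, and the names parse_targets and KB_PER_TARGET are my own.

```python
SAMPLE = '''\
target cricket-home
short-desc = "The Cricket Homepage"
url = "http://www.munitions.com/~jra/test-file.txt"
target www.cnn.com
url = "http://www.cnn.com"
'''

KB_PER_TARGET = 60  # approximate table size quoted above


def parse_targets(text):
    """Return {target_name: {key: value}} for the simple layout shown above."""
    targets = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("target "):
            current = targets.setdefault(line.split(None, 1)[1], {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip().strip('"')
    return targets


targets = parse_targets(SAMPLE)
for name, settings in targets.items():
    label = settings.get("short-desc", name)  # short-desc overrides the target name
    print(f"{label}: {settings['url']}")
print(f"about {len(targets) * KB_PER_TARGET} KB of data tables")
```

For the two sample targets this reports about 120 KB, which shows why you can afford to monitor quite a few URLs before disk space becomes a concern.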
The Defaults file in this directory contains settings to control how
the data is displayed. For a while, one
of my servers was consistently delivering pages in times greater than
five seconds. I had to change the setting for y-max so that I could see
more data on the graphs. I won't tell you how slow the server was -- too embarrassing. Every page on that site was generated by Perl
scripts.
Each time you make changes to the cricket-config files, you have to
recompile them. There will be error messages if you entered anything
incorrectly.
% cd ~
% cricket/compile
To test the data collector, run it now manually.
% cricket/collector /http-performance
If it works, you'll see a lot of messages like this indicating Cricket is testing each target in your configuration file:
[09-Feb-2000 22:26:55 ] Retrieving data
(EXEC: /home/cricket/cricket-config/../cricket/util/test-url
http://www.xml.com) for XML
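If you monitor many targets, you may want to check such messages with a script instead of by eye. Here is a small Python sketch that pulls the target name and URL out of the message shown above. The pattern is my own guess based on that single example, so treat the exact message format as an assumption and adjust it to what your collector prints.

```python
import re

# The example message from above, joined onto one line.
LOG_LINE = ("[09-Feb-2000 22:26:55 ] Retrieving data "
            "(EXEC: /home/cricket/cricket-config/../cricket/util/test-url "
            "http://www.xml.com) for XML")

PATTERN = re.compile(
    r"^\[(?P<when>[^\]]+?)\s*\] Retrieving data "
    r"\(EXEC: (?P<command>.+)\) for (?P<target>\S+)$"
)

match = PATTERN.match(LOG_LINE)
if match:
    url = match.group("command").split()[-1]  # last word of the EXEC command
    print(match.group("target"), url, match.group("when"))
    # prints: XML http://www.xml.com 09-Feb-2000 22:26:55
```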
The collector will also create the data tables in the
~cricket/cricket-data/http-performance directory the first time you run it.
The collector is run from a script called collect-subtrees.
You can set up Cron to run different collection sets at different
intervals. The file cricket/subtrees-sets defines what is in
each set. For our example, you will have to edit that file and change
the lines
set normal:
/routers
/router-interfaces
to
set normal:
/http-performance
Now, to test collection of the set, run the wrapper script:
% cricket/collect-subtrees normal
% exit
You won't see any output from this script, but the wrapper will create
another directory, cricket-logs, and log its output to a file there,
normal.0.
Now you are ready to set up a Cron job to run the collection script.
Make a Cron entry to run Cricket once every five minutes. I run Cricket
from the /etc/cron.d directory. You could run it directly under the
Cricket account (using the crontab -e command to edit the file), but I
find it easier to keep track of what administrative Cron jobs are
installed by putting them all in the /etc/cron.d directory.
% su root
# echo "*/5 * * * * cricket /home/cricket/cricket/collect-subtrees normal" > /etc/cron.d/cricket
# exit
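In case the */5 in the minute field looks cryptic, this short Python sketch spells out which minutes it matches and works out when the next run is due. It only models the minute field of the entry above; the function names are made up for the example.

```python
from datetime import datetime, timedelta


def minutes_matching(step):
    """Minutes of the hour matched by a cron minute field of the form */step."""
    return [m for m in range(60) if m % step == 0]


def next_run(now, step=5):
    """First time strictly after `now` whose minute is a multiple of `step`."""
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while candidate.minute % step != 0:
        candidate += timedelta(minutes=1)
    return candidate


print(minutes_matching(5))
# prints: [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
print(next_run(datetime(2000, 2, 9, 22, 26, 55)))
# prints: 2000-02-09 22:30:00
```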
Now wait until the next 5-minute increment rolls around and watch to
see if the data collection happens. Once Cricket has been running for
a while, you will see a series of files from normal.0 to normal.20;
each time collect-subtrees runs, it renumbers the files so the newest
one is always normal.0.
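The renumbering scheme is easy to picture with a small Python sketch. This is my own model of the behavior just described, not code taken from collect-subtrees; the function names and the keep=21 default (normal.0 through normal.20) are assumptions made for the example.

```python
import os
import tempfile


def rotate_logs(directory, base="normal", keep=21):
    """Shift base.N to base.N+1, dropping the file that would pass base.(keep-1)."""
    oldest = os.path.join(directory, f"{base}.{keep - 1}")
    if os.path.exists(oldest):
        os.remove(oldest)
    for n in range(keep - 2, -1, -1):  # work from the oldest down to base.0
        src = os.path.join(directory, f"{base}.{n}")
        if os.path.exists(src):
            os.rename(src, os.path.join(directory, f"{base}.{n + 1}"))


def write_new_log(directory, text, base="normal"):
    """Rotate the older logs, then write the newest run to base.0."""
    rotate_logs(directory, base)
    with open(os.path.join(directory, f"{base}.0"), "w") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as logdir:
    for run in range(25):  # simulate 25 collection runs
        write_new_log(logdir, f"run {run}\n")
    names = sorted(os.listdir(logdir), key=lambda n: int(n.split(".")[1]))
    print(names[0], names[-1], len(names))  # prints: normal.0 normal.20 21
```

With a run every five minutes, 21 files means you always have a little under two hours of collector output to look back through when something goes wrong.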
Cricket is now logging data. If you modify the files in cricket-config,
remember to re-run compile to update the configuration.
Next, we'll cover how to set things up for web browsing.
Basically everything is already installed, but you have to make
symbolic links from the Cricket account's public_html area into the
Cricket install. (Using symlinks instead of copying the files makes
upgrading very easy, so I highly recommend it.)
% su - cricket
Password:
% mkdir public_html
% cd public_html
% ln -s ../cricket/doc doc
% mkdir cricket
% cd cricket
% ln -s ../../cricket/VERSION
% ln -s ../../cricket/grapher.cgi
% ln -s ../../cricket/images
% ln -s ../../cricket/lib
% ln -s ../../cricket/mini-graph.cgi
You have to have your web server configured to allow viewing of user
directories. You need to allow symlinks and CGIs in the Cricket
subdirectory. This would be appropriate code to add to your Apache
httpd.conf file. (This part is pretty generic; if you have installed a recent Apache server, it's probably already there.)
UserDir public_html
<Directory /home/*/public_html>
AllowOverride FileInfo AuthConfig Limit
Options MultiViews Indexes SymLinksIfOwnerMatch
<Limit GET POST OPTIONS PROPFIND>
Order allow,deny
Allow from all
</Limit>
<Limit PUT DELETE PATCH PROPPATCH MKCOL COPY MOVE LOCK UNLOCK>
Order deny,allow
Deny from all
</Limit>
</Directory>
# This is for the Cricket Traffic Grapher
<Directory /home/cricket/public_html/cricket>
Options SymLinksIfOwnerMatch ExecCGI
</Directory>
Of course, if you have to add this to your httpd.conf file, you will have to
tell httpd that your configuration has changed. If you compiled and
installed Apache from sources, you can use the apachectl command to
signal httpd. This command should work for you.
apachectl restart
If you used the Red Hat RPM version of Apache, you can use this instead:
/etc/rc.d/init.d/httpd restart
Now you should be able to run the grapher.cgi program using this URL,
if you insert the name of your web server where it says yoursystem:
http://yoursystem/~cricket/cricket/grapher.cgi
One last tip: If it's a possibility for you, install Cricket on several different networks to get a more accurate picture of how the Internet affects page-load times. I run a copy on my personal ISP account across town.
If you get an "internal server" error message in the browser, the
first place to look is in your Apache error_log file. (The exact
location of this file depends on your system.)
When things are running normally, you will get a menu back from
grapher.cgi. You should be able to click on the link,
http-performance. This will return a list of the targets you set up
to be monitored. Click on one of them and you should get the graph
page. Alas, the first time you view a page there will be no data to
view! Patience. Come back in a few hours and more interesting graphs
will start to show up.
Copyright © 2009 O'Reilly Media, Inc.