Discovering System Processes Part II
Sometimes you may start a process and wish to stop it before it is finished. For example, in a spurt of inspiration you might decide that you want to see the name of every file on your FreeBSD system, so you type this at your terminal:
find / -print |more
However, you soon grow tired of pressing the spacebar and decide that you really didn't want to see all of your files at this time. In other words, you want to send an interrupt signal. One way to do this is:
^C
You'll know that your INT signal worked as you'll get your prompt back.
Retry the same find command, but this time send a signal 3 like so:
^\
Just before you get your prompt back, you'll see the following message:
Quit (core dumped)
If you press Alt-F1 to return to the console, you'll see a message similar to this:
Nov 19 13:50:09 genisis /kernel: pid 806 (find), uid 1001: exited on signal 3
Nov 19 13:50:09 genisis /kernel: pid 807 (more), uid 1001: exited on signal 3 (core dumped)
And if you do a directory listing at your original terminal, you should see a file called more.core. Normally, you won't be sending a signal 3 to a process unless you're a programmer and know how to examine the resulting core file with a debugger. I included the example to show the difference between a signal 2 (INT) and a signal 3 (QUIT); you can safely delete that *.core file.
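The signal numbers used so far each have a name, and the kill utility can translate one into the other. Here is a small sketch, assuming you run it with a Bourne-style shell (sh) rather than csh; it only looks up numbers that mean the same thing on every Unix:

```sh
#!/bin/sh
# Translate a few signal numbers into their names using "kill -l NUMBER".
# Only the numbers that are the same everywhere are looked up here.
for num in 2 3 9 15; do
    printf 'signal %s is %s\n' "$num" "$(kill -l "$num")"
done
```

Some higher signal numbers are not the same on every operating system, so it is usually safer to use names such as INT, QUIT, TERM, and KILL in scripts.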
Interprocess communication isn't much different from any other type of communication. You or another process can send a signal requesting a desired result, but it is up to the process receiving the signal to decide what it wants to do with that signal. Remember that processes are simply running programs; most programs use something called a "signal handler" to decide how and when to respond to signals. Usually if you send some type of termination signal, the signal handler will try to gracefully close all the files that process has opened, to prevent data loss, before the process itself exits. Sometimes, the signal handler will decide to just ignore your signal and will refuse to terminate the process.
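To make the idea of a signal handler concrete, here is a minimal sketch using the Bourne shell's trap builtin. The worker function and its message are made up for illustration; a real program would do its own cleanup in the handler. The worker handles TERM by tidying up and exiting, and ignores INT entirely:

```sh
#!/bin/sh
# Sketch of a signal handler using the Bourne shell's trap builtin.
# worker() and its message are hypothetical, for illustration only.

worker() {
    # On TERM: tidy up, then exit on our own terms.
    trap 'echo "worker: caught TERM, closing files"; exit 0' TERM
    # On INT: an empty handler means the signal is ignored.
    trap '' INT
    while :; do sleep 1; done
}

worker &            # run the worker in the background
pid=$!
sleep 1             # give it a moment to install its handlers

kill -INT "$pid"    # ignored by the worker
sleep 1
if kill -0 "$pid" 2>/dev/null; then
    survived_int=yes
else
    survived_int=no
fi
echo "alive after INT: $survived_int"

kill -TERM "$pid"   # handled: the worker cleans up and exits
wait "$pid"
status=$?
echo "worker exit status: $status"
```

Signal 0, used in the kill -0 line, sends nothing; it only checks whether the process still exists. Because the handler calls exit 0 itself, the worker's exit status shows that it shut down on its own terms instead of being cut off by the default action for TERM.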
However, some signals can't be ignored; for example, signal 9 (KILL) and signal 17 (STOP). Let's say that you wish to stop a process you've started, so you used ps and grep to find the PID of the process, used kill to send a TERM signal, then repeated your grep to ensure it worked, like so:
ps | grep processname
kill PID
ps | grep processname
However, the second grep still shows that PID, meaning your TERM signal was ignored by that process. Either one of these commands should fix it:
kill -9 PID
or
kill -KILL PID
If you now repeat your grep command, you should just have your prompt echoed back at you, meaning that PID was indeed terminated.
You may ask, "Why not just always send a signal 9 if it can't be ignored?" Signal 9 does indeed "kill" a process, but it doesn't give it time to gracefully save all of its work first, meaning that you may lose some data. It's better to try sending another type of terminating signal first, and save signal 9 for those processes that stubbornly refuse to terminate. Also, remember that as a regular user you will only be able to send signals to processes that are owned by you. The superuser can send a signal to any process.
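The "TERM first, KILL only if needed" advice can be sketched as a small Bourne shell function. The name stop_gently and the five-second default grace period are assumptions made for this example, not part of any standard tool:

```sh
#!/bin/sh
# stop_gently is a hypothetical helper: TERM first, KILL as a last resort.
# Usage: stop_gently PID [SECONDS_TO_WAIT]
stop_gently() {
    _pid=$1
    _grace=${2:-5}      # assumed default: wait up to 5 seconds

    # Ask politely first. This fails if the PID does not exist or is not ours.
    kill -TERM "$_pid" 2>/dev/null || return 1

    # Check once per second whether the process has gone away.
    while [ "$_grace" -gt 0 ]; do
        kill -0 "$_pid" 2>/dev/null || return 0   # it exited after TERM
        sleep 1
        _grace=$((_grace - 1))
    done

    # Still there: it ignored TERM, so send the signal it cannot ignore.
    kill -KILL "$_pid" 2>/dev/null
}
```

After loading the function into an sh session, something like stop_gently 1234 10 would give PID 1234 ten seconds to exit cleanly before resorting to KILL. As with kill itself, a regular user can only use it on processes they own.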
There may be times when you wish to terminate all the processes you own; this has different ramifications depending on whether you are a regular user or the superuser.
Let's demonstrate as a regular user. Log in to four different terminals and do a ps command:
ps
PID TT STAT TIME COMMAND
316 v0 Ss 0:00.39 -csh (csh)
957 v0 R+ 0:00.00 ps
317 v1 Is+ 0:00.20 -csh (csh)
915 v2 Is 0:00.12 -csh (csh)
941 v2 I+ 0:00.09 lynx
942 v2 Z+ 0:00.00 (lynx)
913 v3 Is 0:00.12 -csh (csh)
946 v3 I+ 0:00.01 /bin/sh /usr/X11R6/bin/startx
951 v3 I+ 0:00.04 xinit /home/genisis/.xinitrc --
955 v3 S 0:03.00 xfce
In this example, I've logged into terminals 0-3. I ran the ps command from the console, logged into the first terminal, started lynx on the second terminal, and started an XWindows session from terminal three, which resulted in a total of 10 processes owned by myself. If I use a PID of "-1" when I invoke the kill command, I will broadcast the signal I specify to all of my processes. So, let's send a TERM signal like so:
kill -TERM -1
Then check our results with the ps command:
ps
PID TT STAT TIME COMMAND
316 v0 Ss 0:00.41 -csh (csh)
969 v0 R+ 0:00.00 ps
317 v1 Ss+ 0:00.21 -csh (csh)
915 v2 Is+ 0:00.12 -csh (csh)
913 v3 Is+ 0:00.12 -csh (csh)
Looks like we terminated six of the original PIDs, but four processes ignored our TERM signal. Let's be a bit more aggressive:
kill -KILL -1
ps
PID TT STAT TIME COMMAND
317 v1 Ss 0:00.22 -csh (csh)
995 v1 R+ 0:00.00 ps
If you switch through your original four terminals, you'll see the login prompt at three of them. This last command killed all of your processes except the one you executed the kill command from; that is, everything except the C shell you ran the kill command in.
You'll note that if you make a typo and type:
kill 1
instead of:
kill -1
you'll receive the following error message:
1: Operation not permitted
-1 is the special PID that represents all of your processes; 1 is the PID of the process named init. Only the superuser can send signals to the init process, and even the superuser should only do so when the consequences are understood.
Now let's see what happens if we repeat this exercise as the superuser. First, I'll run the ps command on my test computer that is running all kinds of neat stuff: Apache, MySQL, Squid, NFS, etc.
ps -acux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
genisis 1050 0.0 0.2 428 244 v0 R+ 4:08PM 0:00.00 ps
root 1 0.0 0.2 532 304 ?? ILs 5:10AM 0:00.04 init
root 2 0.0 0.0 0 0 ?? DL 5:10AM 0:00.03 pagedaemon
root 3 0.0 0.0 0 0 ?? DL 5:10AM 0:00.00 vmdaemon
root 4 0.0 0.0 0 0 ?? DL 5:10AM 0:00.04 bufdaemon
root 5 0.0 0.0 0 0 ?? DL 5:10AM 0:02.62 syncer
root 27 0.0 2.0 70780 2540 ?? ILs 5:10AM 0:00.08 mount_mfs
root 30 0.0 0.1 208 92 ?? Is 5:10AM 0:00.00 adjkerntz
root 110 0.0 0.3 536 368 ?? Ss 10:10AM 0:00.22 dhclient
root 163 0.0 0.5 904 608 ?? Ss 10:10AM 0:00.19 syslogd
daemon 166 0.0 0.4 916 556 ?? Is 10:10AM 0:00.01 portmap
root 171 0.0 0.3 504 320 ?? Is 10:10AM 0:00.00 mountd
root 173 0.0 0.1 360 172 ?? Is 10:10AM 0:00.01 nfsd
root 175 0.0 0.1 352 164 ?? I 10:10AM 0:00.00 nfsd
root 176 0.0 0.1 352 164 ?? I 10:10AM 0:00.00 nfsd
root 177 0.0 0.1 352 164 ?? I 10:10AM 0:00.00 nfsd
root 178 0.0 0.1 352 164 ?? I 10:10AM 0:00.00 nfsd
root 181 0.0 0.5 263052 576 ?? Is 10:10AM 0:00.00 rpc.statd
root 197 0.0 0.6 1028 764 ?? Is 10:10AM 0:00.02 inetd
root 199 0.0 0.6 956 700 ?? Ss 10:10AM 0:00.19 cron
root 202 0.0 1.0 1424 1216 ?? Is 10:10AM 0:00.20 sendmail
root 227 0.0 0.4 876 488 ?? Is 10:10AM 0:00.00 moused
root 261 0.0 1.4 2068 1704 ?? Ss 10:10AM 0:00.98 httpd
root 275 0.0 0.4 620 448 con- I+ 10:10AM 0:00.02 sh
root 293 0.0 0.4 624 452 con- I+ 10:10AM 0:00.01 sh
mysql 303 0.0 1.4 10896 1796 con- S+ 10:10AM 0:00.43 mysqld
nobody 305 0.0 4.7 6580 5928 con- S+ 10:10AM 0:05.42 squid
nobody 308 0.0 1.4 2092 1704 ?? I 10:10AM 0:00.00 httpd
nobody 309 0.0 1.4 2092 1704 ?? I 10:10AM 0:00.00 httpd
nobody 310 0.0 1.4 2092 1704 ?? I 10:10AM 0:00.00 httpd
nobody 311 0.0 1.4 2092 1704 ?? I 10:10AM 0:00.00 httpd
nobody 312 0.0 1.4 2092 1704 ?? I 10:10AM 0:00.00 httpd
genisis 317 0.0 0.8 1336 960 v1 Is+ 10:10AM 0:00.24 csh
root 320 0.0 0.5 920 628 v4 Is+ 10:10AM 0:00.02 getty
root 321 0.0 0.5 920 628 v5 Is+ 10:10AM 0:00.01 getty
root 322 0.0 0.5 920 628 v6 Is+ 10:10AM 0:00.01 getty
root 323 0.0 0.5 920 628 v7 Is+ 10:10AM 0:00.01 getty
nobody 324 0.0 0.3 832 348 ?? Is 10:10AM 0:00.01 unlinkd
root 992 0.0 0.5 920 628 v2 Is+ 3:46PM 0:00.01 getty
root 993 0.0 0.5 920 628 v3 Is+ 3:46PM 0:00.01 getty
genisis 994 0.0 0.8 1336 956 v0 Ss 3:46PM 0:00.14 csh
root 0 0.0 0.0 0 0 ?? DLs 5:10AM 0:00.02 swapper
Then I'll send the KILL signal to the special PID -1 as the superuser:
su
Password:
kill -9 -1
That command was a little scarier, as it even kicked me out of the C shell I executed the kill command from. Once I logged back in, I assessed the damage like so:
ps -acux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
genisis 1070 0.0 0.2 396 244 v0 R+ 4:11PM 0:00.00 ps
root 1 0.0 0.2 532 304 ?? ILs 5:10AM 0:00.05 init
root 2 0.0 0.0 0 0 ?? DL 5:10AM 0:00.03 pagedaemon
root 3 0.0 0.0 0 0 ?? DL 5:10AM 0:00.00 vmdaemon
root 4 0.0 0.0 0 0 ?? DL 5:10AM 0:00.05 bufdaemon
root 5 0.0 0.0 0 0 ?? DL 5:10AM 0:02.65 syncer
root 1059 0.0 0.5 920 628 v3 Is+ 4:10PM 0:00.01 getty
root 1060 0.0 0.5 920 628 v2 Is+ 4:10PM 0:00.01 getty
root 1061 0.0 0.5 920 628 v7 Is+ 4:10PM 0:00.01 getty
root 1062 0.0 0.5 920 628 v6 Is+ 4:10PM 0:00.01 getty
root 1063 0.0 0.5 920 628 v5 Is+ 4:10PM 0:00.01 getty
genisis 1064 0.0 0.8 1336 956 v0 Ss 4:10PM 0:00.12 csh
root 1065 0.0 0.5 920 628 v4 Is+ 4:10PM 0:00.01 getty
root 1066 0.0 0.5 920 628 v1 Is+ 4:10PM 0:00.01 getty
root 0 0.0 0.0 0 0 ?? DLs 5:10AM 0:00.02 swapper
When the superuser sends a signal to -1, it is sent to every process except the system processes. If that signal happened to be the KILL signal, you would be hearing complaints from users who happened to have a file open at the time and lost their data.
This is one of the reasons only the superuser is allowed to run the reboot and halt commands. When one of these commands is issued, a TERM signal is sent to PID -1 to give all processes a chance to save their data; this is followed by a KILL signal to ensure that any remaining processes are terminated.
In next week's article, I'd like to continue a bit more on this theme and take a closer look at init and getty.
Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.
