Untwisting Python Network Programming
Retrieving Mails with Twisted
Programming a POP3 client to retrieve mails in the Twisted framework is more
complex and takes more code. The logic to retrieve mails is all in a class that
subclasses POP3Client (from twisted.mail.pop3client).
Because Twisted is event-driven, the natural structure is one method per step.
Each method starts an operation and registers two callbacks on it: a success
callback that enters the next step, and a failure callback that handles any error.
import email
from twisted.internet import reactor
from twisted.internet.protocol import ClientCreator
from twisted.mail.pop3client import POP3Client

# errorHandler, user, passwd, and deletion are defined elsewhere in the script.
class MyPOP3Client(POP3Client):
    def serverGreeting(self, msg):
        POP3Client.serverGreeting(self, msg)
        # Log in, then continue with do_stat on success.
        self.login(self.myuser, self.mypass).addCallbacks(
            self.do_stat, errorHandler)

    def do_stat(self, result):
        self.stat().addCallbacks(self.do_retrieve, errorHandler)
In the class MyPOP3Client, the first step to get mails is the
serverGreeting method, which Twisted invokes when the server's greeting
arrives after the client connects. This method invokes the superclass's
serverGreeting, and then logs in to the POP3 server with a user name and
password. The login method returns a Deferred object. Calling addCallbacks
on that Deferred registers the do_stat method (called upon successful login)
and the errorHandler function (called on login error).
Similarly, the do_stat method invokes POP3Client's
stat method to perform a POP3 STAT command, and registers the next
step as do_retrieve. Because the call to stat is asynchronous, it cannot
return its results to the caller as a return value.
Instead, Twisted passes the results as an argument to the success callback
registered on the Deferred that stat returns. The stats parameter of
do_retrieve (its second parameter, after self) is a sequence whose first
element is the number of messages in the POP3 account.
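The callback flow can be illustrated without a network connection. The sketch below uses a deliberately tiny stand-in for a Deferred; it is not Twisted's implementation. The names ToyDeferred, fake_stat, and error_handler are hypothetical and exist only for this illustration, as are the numbers passed as the STAT result. It shows how a result reaches the success callback as an argument, and how an error reaches the failure callback instead.

```python
class ToyDeferred(object):
    """A toy stand-in for a Deferred (hypothetical, for illustration only)."""

    def __init__(self):
        self._callbacks = []    # (on_success, on_failure) pairs, in order
        self._fired = False
        self._failed = False
        self._result = None

    def addCallbacks(self, on_success, on_failure):
        self._callbacks.append((on_success, on_failure))
        if self._fired:         # result already available: run immediately
            self._run()
        return self

    def callback(self, result):
        self._fire(result, False)

    def errback(self, error):
        self._fire(error, True)

    def _fire(self, value, failed):
        self._fired, self._failed, self._result = True, failed, value
        self._run()

    def _run(self):
        while self._callbacks:
            on_success, on_failure = self._callbacks.pop(0)
            handler = on_failure if self._failed else on_success
            try:
                self._result = handler(self._result)
                self._failed = False
            except Exception as exc:
                self._result, self._failed = exc, True


def fake_stat(d, ok=True):
    """Pretend the server answered a STAT command some time later."""
    if ok:
        d.callback((3, 4096))   # made-up (message count, mailbox size)
    else:
        d.errback(IOError("connection lost"))


log = []

def do_retrieve(stats):
    log.append("mailbox has %d messages" % stats[0])

def error_handler(error):
    log.append("error: %s" % error)

d = ToyDeferred()
d.addCallbacks(do_retrieve, error_handler)
fake_stat(d)                    # success path: do_retrieve receives the result

d2 = ToyDeferred()
d2.addCallbacks(do_retrieve, error_handler)
fake_stat(d2, ok=False)         # failure path: error_handler receives the error

print(log)
```

The caller of fake_stat never sees a return value; everything it learns arrives through whichever callback runs.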
    def do_retrieve(self, stats):
        self.format = "%-3s %-15s %s"
        self.num_messages = stats[0]
        self.cur_message = 0
        print self.myuser, "has", self.num_messages, "messages"
        if self.num_messages > 0:
            if self.deletion:
                print "Deleting", self.num_messages, "messages",
                self.delete(0).addCallbacks(self.do_delete_msg, errorHandler)
            else:
                print self.format % ("Num", "From", "Subject")
                self.retrieve(0).addCallbacks(self.do_retrieve_msg, errorHandler)
        else:
            reactor.stop()

    def do_retrieve_msg(self, lines):
        # The message arrives as a list of lines; rebuild it before parsing.
        msg = email.message_from_string("\r\n".join(lines))
        print self.format % (self.cur_message, msg["From"], msg["Subject"])
        self.cur_message += 1
        if self.cur_message < self.num_messages:
            self.retrieve(self.cur_message).addCallbacks(
                self.do_retrieve_msg, errorHandler)
        else:
            reactor.stop()
If there is no message in the mailbox, the code calls
reactor.stop to tell Twisted to shut down. Otherwise, in viewing mode, it invokes
retrieve(0) to get the first message. Its success callback,
do_retrieve_msg, handles the message by displaying its summary,
and then retrieves the next message. Because the method
do_retrieve_msg gets invoked for all subsequent messages, the code
uses an instance variable, cur_message, to keep track of the
current message number and to determine when it has handled all messages. When
it has processed everything, it stops the Twisted main loop.
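The one-message-per-callback pattern can be tried in isolation. The following is a minimal sketch that does not use Twisted at all: the retrieve function, the Walker class, the MAILBOX data, and the pending queue (a stand-in for the reactor's event queue) are all hypothetical names invented here. Only the counting logic mirrors do_retrieve_msg.

```python
from collections import deque

MAILBOX = ["Subject: one", "Subject: two", "Subject: three"]  # made-up data
pending = deque()   # stand-in for the event loop's queue of ready events


def retrieve(index, on_success):
    """Hypothetical asynchronous fetch: the callback runs later, from the loop."""
    pending.append(lambda: on_success(MAILBOX[index]))


class Walker(object):
    def __init__(self, num_messages):
        self.num_messages = num_messages
        self.cur_message = 0
        self.seen = []
        self.done = False

    def start(self):
        if self.num_messages > 0:
            retrieve(0, self.on_message)
        else:
            self.done = True

    def on_message(self, text):
        # Handle the current message, then ask for the next one (if any).
        self.seen.append((self.cur_message, text))
        self.cur_message += 1
        if self.cur_message < self.num_messages:
            retrieve(self.cur_message, self.on_message)
        else:
            self.done = True    # the article's code calls reactor.stop() here


walker = Walker(len(MAILBOX))
walker.start()
while pending:                  # a minimal event loop
    pending.popleft()()
print(walker.seen)
```

Only one request is outstanding at any time, which is why a single counter is enough to know which message a callback belongs to.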
Because the logic of mail retrieval is similar to deletion, both features
are in the same class, MyPOP3Client. The instance variable
deletion denotes the current mode of working. You can see its
initialization in __init__, along with the user name and password.
More interesting is the setting of allowInsecureLogin to True,
which allows login even when the server offers no authentication challenge
and the transport is not encrypted.
    def __init__(self):
        self.myuser = user
        self.mypass = passwd
        self.deletion = deletion
        self.allowInsecureLogin = True

    def do_delete_msg(self, result):
        print ".",
        self.cur_message += 1
        if self.cur_message < self.num_messages:
            self.delete(self.cur_message).addCallbacks(
                self.do_delete_msg, errorHandler)
        else:
            print " done."
            # QUIT commits the deletions; stop the reactor once it completes.
            q = self.quit()
            q.addCallbacks(lambda _: reactor.stop(), errorHandler)
To delete a mail, call the delete method of class
POP3Client. As with the core poplib module, this
method just marks the mail for deletion; the actual deletion occurs when
the client sends the POP3 command QUIT to the server, which the
quit method does. Finally, the success callback registered on quit stops
the Twisted reactor once the quitting action completes.
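For comparison, the same mark-then-commit behavior in the core poplib module can be sketched as follows. The function name delete_all_messages and its parameters are made up for illustration; the poplib calls themselves (POP3, user, pass_, stat, dele, quit) are the standard library's. Note that poplib numbers messages from 1, whereas the Twisted code above starts from index 0.

```python
import poplib


def delete_all_messages(host, user, password, port=110):
    """Mark every message for deletion, then commit the deletions with QUIT.

    Returns the number of messages that were marked.
    """
    conn = poplib.POP3(host, port)
    conn.user(user)
    conn.pass_(password)
    count, _size = conn.stat()           # (message count, mailbox size)
    for number in range(1, count + 1):   # POP3 message numbers start at 1
        conn.dele(number)                # only marks the message
    conn.quit()                          # the server removes marked messages now
    return count
```

If the connection drops before quit is sent, the server is expected to keep the marked messages, which is the same behavior the Twisted client relies on.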
pop3 = ClientCreator(reactor, MyPOP3Client)
d = pop3.connectTCP(host, 110)
reactor.run()
With the desired mail handling implemented in class
MyPOP3Client, you can launch the client. As its name suggests,
the class ClientCreator (from
twisted.internet.protocol) provides a convenient way to start a
communication client. This code passes the reactor and the
MyPOP3Client class to create a ClientCreator, and
begins the mail retrieval by calling connectTCP with the specified
server and port number. The call to reactor.run() then starts Twisted's
execution loop.
Invoking do_retrieve_msg repeatedly, once per message, is
tedious and lengthy compared to the
DeferredList mechanism, which keeps track of several actions and
notifies you when all of them complete, as in the case of sending mails.
However, collecting the Deferreds of multiple calls to
POP3Client's retrieve in a DeferredList
simply does not work in Twisted (versions 2.2.0 and 2.4.0): the success
callback never gets invoked (see mail-twisted.py).
Doing Telnet with Twisted
Twisted can power a Telnet client in a way similar to, but simpler than, the
POP3 client. The Telnet conversation logic goes in a subclass of
Telnet (from twisted.conch.telnet).
def stop(host):
    from twisted.internet.protocol import ClientCreator
    from twisted.conch.telnet import Telnet

    class MyTelnet(Telnet):
        def dataReceived(self, data):
            if "Login id:" in data:
                self._write("root\n")
            elif "Password:" in data:
                self._write("root\n")
            elif "Welcome" in data:
                d = self._write("shutdown\n")

        def connectionLost(self, reason):
            reactor.stop()
            print "done."

    mytelnet = ClientCreator(reactor, MyTelnet)
    d = mytelnet.connectTCP(host, 4555)
    reactor.run()
The class MyTelnet overrides two methods of class
Telnet. The first method, dataReceived, is called
when data arrives at the client. It checks the data received and calls the
_write method to send the user name, the password, or the shutdown
command to the server, as appropriate. The second method is
connectionLost, which Twisted calls when the server closes the
telnet session. In that case, the program simply terminates the Twisted
execution loop. The Telnet client starts by using the ClientCreator
class to connect to port 4555 of the James mail server.
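One caveat about matching prompts inside dataReceived: TCP delivers a byte stream, so a prompt such as "Login id:" may arrive split across two calls. A more defensive approach is to accumulate the incoming data and act only once a whole prompt is present. The sketch below shows that idea in plain Python, independent of Twisted; the PromptMatcher class, its feed method, and the sample chunks are hypothetical, and it uses text strings for simplicity.

```python
class PromptMatcher(object):
    """Buffer incoming chunks and answer prompts once they have fully arrived."""

    def __init__(self, responses):
        self.responses = list(responses)   # ordered (prompt, reply) pairs
        self.buffer = ""

    def feed(self, data):
        """Add a chunk; return the replies whose prompts are now complete."""
        self.buffer += data
        replies = []
        while self.responses:
            prompt, reply = self.responses[0]
            pos = self.buffer.find(prompt)
            if pos < 0:
                break                      # prompt not complete yet: keep waiting
            # Discard everything up to and including the matched prompt.
            self.buffer = self.buffer[pos + len(prompt):]
            self.responses.pop(0)
            replies.append(reply)
        return replies


matcher = PromptMatcher([
    ("Login id:", "root\n"),
    ("Password:", "root\n"),
    ("Welcome", "shutdown\n"),
])

# Simulated fragments: each prompt is split across two chunks.
chunks = ["Remote Admin\nLog", "in id:", "\nPass", "word:", "\nWelcome root"]
sent = []
for chunk in chunks:
    sent.extend(matcher.feed(chunk))
print(sent)
```

In a protocol class, feed would be called from dataReceived and each returned reply written to the transport. This sketch does not handle telnet option negotiation, and its buffer grows until a prompt matches.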
When to Be Twisted?
The two functionally equivalent programs, one using Python core modules and the other using the Twisted framework, differ significantly in programming style and amount of code. So when should you use each?
For basic programs such as the command-line client of this example, the Python core networking modules are more desirable due to the simplicity and performance advantages. However, many real-world networking programs are far more complex, and for those Twisted's asynchronous programming model is more effective. For example, BitTorrent, the popular peer-to-peer file sharing client that downloads many data chunks in parallel from different sources, uses Twisted. Twisted also works well in programs with a graphical user interface (GUI), because its asynchronous nature fits naturally with the event-driven programming models of modern GUI frameworks. In fact, Twisted integrates with popular GUI frameworks including PyGTK, Qt, Tkinter, wxPython, and Win32.
The other area where Twisted shines is in server programming. A typical network server uses multithreading so that it can handle multiple clients concurrently. Twisted's asynchronous mechanism relieves server programs of creating and managing those threads. In addition, Twisted provides several protocols on which to build new networking services, enabling rapid development of complex servers. One such project is Quotient, which uses Twisted to build a multiprotocol messaging server that supports a variety of protocols and services including SMTP, POP3, IMAP, webmail, and SIP.
Kendrew Lau is a consultant in Hong Kong, with focus on Java, Linux, and other OSS technologies.
-
Why untwisting?
2006-08-11 10:34:28 glyph@divmod.com
Many of the examples in this article are wrong. I'll just point out the errors in the telnet example, because I'm more familiar with Twisted than non-Twisted code, and I also don't have too much time to spend reviewing it...
By overriding Telnet's dataReceived method, it short-circuits the machinery which actually speaks the telnet protocol. By inspecting the 'data' argument for full strings, it depends on implementation accidents of the transport to deliver whole messages - TCP does not guarantee that, and although packets are rarely fragmented at such small sizes, as example code this is very bad form.
Try looking at TelnetTransport and ITelnetProtocol for a more correct way to implement this; you should probably use something like StatefulTelnetProtocol to verify that you're processing whole lines, and to deal with the negotiation of various telnet options.
"For basic programs such as the command-line client of this example, the Python core networking modules are more desirable due to the simplicity and performance advantages."
What do you mean by "performance advantages"? Have you done any measurements indicating that the core networking modules are ever faster?
-
Why untwisting?
2006-08-13 22:19:59 Kendrew
Thank you for pointing out the possibility of TCP segmenting, which may cause problems for dataReceived.
Regarding performance, I observed that the core modules run faster than Twisted when running the programs. I also briefly did some measurements and here are the results:
Using core modules:
start server (no net): 0.6040 sec
send mails (smtp): 1.5585 sec
view mails (pop3): 0.7506 sec
delete mails (pop3): 0.5159 sec
stop server (telnet): 0.5063 sec
Using Twisted:
start server (no net): 0.6668 sec
send mails (smtp): 2.4919 sec
view mails (pop3): 1.4418 sec
delete mails (pop3): 1.2992 sec
stop server (telnet): 2.1045 sec
(Windows XP, Python 2.4.3, Twisted 2.4.0)
These numbers are the average of 10 runs, and the mail server runs on localhost. While the measurements are by no means rigorous, they basically agree with the observations. Of course, the differences may not be significant in real use, when network delay accounts for the majority of the execution time.
-
performance
2006-08-14 08:16:53 radix
How were these numbers reached? Can you show us any python code or commands that you used to find them?
I notice that your synchronous version of the telnet code immediately closes the socket whereas the twisted version is waiting for the *server* to end the connection; this could definitely be affecting your times. If you add a transport.loseConnection call after your write call, the semantics should line up better, and I imagine performance will be closer to what we expect.
Also, why are you calling the private "_write" method in the telnet example?
-
performance
2006-08-17 04:57:06 Kendrew
The time includes starting the interpreter and running the whole script. It is true that the import statements (and any networking delay) contribute the majority of the execution time in these small programs. In fact, before measuring I did some tidying of the import statements so that only required items are imported before use.
For the telnet, I measured again with the addition of transport.loseConnection() and using StatefulTelnetProtocol instead of Telnet. The Twisted telnet runs faster than before, as expected:
Using core modules:
start server (no net): 0.5809 sec
send mails (smtp): 1.3601 sec
view mails (pop3): 0.7007 sec
delete mails (pop3): 0.5187 sec
stop server (telnet): 0.5124 sec
Using Twisted:
start server (no net): 0.5959 sec
send mails (smtp): 2.2488 sec
view mails (pop3): 1.4274 sec
delete mails (pop3): 1.3074 sec
stop server (telnet): 1.3213 sec
If you're interested, here is the Python program that measures the timing. It just invokes the various usages of the two networking programs and takes the average.
#!/usr/bin/python
# file: mail-timeit.py
# Measures the timing of invoking mail-core.py and mail-twisted.py
from time import sleep
from timeit import Timer

cmdline = ''

def doit(cmd, arg, array, rest):
    global cmdline
    cmdline = cmd + ' ' + arg
    print; print cmdline
    array.append(Timer('os.system(cmdline)',
        'import os; from __main__ import cmdline').timeit(1))
    sleep(rest)

def dostat(cmd, times):
    stat = [[], [], [], [], []] # for 1, s, v, d, 0
    for i in range(times):
        doit(cmd, '1', stat[0], 12)
        doit(cmd, 's', stat[1], 5)
        doit(cmd, 'v', stat[2], 1)
        doit(cmd, 'd', stat[3], 1)
        doit(cmd, '0', stat[4], 2)
    return stat

def avgstat(stat, fr, to):
    return [ sum(i[fr:to]) / (to-fr) for i in stat ]

def printavgs(avgs):
    labels = [
        'start server (no net)',
        'send mails (smtp)',
        'view mails (pop3)',
        'delete mails (pop3)',
        'stop server (telnet)']
    for i, j in zip(labels, avgs):
        print '%25s: %.4f sec' % (i, j)

if __name__ == '__main__':
    times = 11
    stat1 = dostat('mail-core.py', times)
    avgs1 = avgstat(stat1, 1, times)
    stat2 = dostat('mail-twisted.py', times)
    avgs2 = avgstat(stat2, 1, times)
    # print stat1
    printavgs(avgs1)
    # print stat2
    printavgs(avgs2)
# end of mail-timeit.py
-
Why untwisting?
2006-08-14 08:15:01 glyph@divmod.com
Are you measuring the time it takes to perform a task, or the amount of time it takes to start the interpreter, load every module, perform the task, and shut down the interpreter?
Twisted has more code in it than the Python standard library version, so unless you've carefully optimized the package for importing, the amount of time spent loading code will dwarf the amount of time spent actually doing anything.
-
Short examples don't show event-driven benefits
2006-08-23 11:11:38 andypurshottam
The main advantages of event-driven programming become visible when large amounts of data can be processed incrementally ("streaming") and when there are multiple event sources, especially a GUI toolkit. Small, cute programming examples typically do not need these resources. The smallest example I have seen of an application that needs and benefits from event-driven programming is the TCP proxy spy with a GUI window, like tcpwatch (done with async stuff from medusa, but it would be an instructive example with any event-driven system.)
I found I really understood POE after completing such a program, and would advise those trying to learn an event-driven system to code one. The only problem is that doing so is not trivial, especially given the small examples that come with most systems, which do not explain how to do the tricky things needed to code a proxy that also has a GUI or standard IO, and possibly multiple network connections.
Andy (andypurshottam@gmail.com)



I think you meant "newline" instead of "carriage return" (which is "\r").