A Potential Memory Leak in Twisted

I have recently been developing a server module in Python. The module opens thousands of connections to a Jabber server, and many clients connect to and disconnect from the module, so I created a pool to hold all the connections.

To maintain such a pool, the module has to construct objects frequently (when a user connects) and destroy them (when a user leaves). I implemented the module with the Twisted framework. After some tests it appeared to leak memory, which is not acceptable for a server-side application, so I hunted the leak down and, after more tests, finally found where the problem lies.

Let's start with a simple script:

from twisted.internet import protocol, reactor

import gc
gc.enable()
gc.set_debug(gc.DEBUG_LEAK)

class JabberProxy(protocol.Protocol):
    def __init__(self):
        pass

    def __del__(self):
        print "*** Destruct JabberProxy ***"

    def connectionMade(self):
        self.transport.write('<?xml version="1.0" encoding="UTF-8"?>' +
                             '<stream:stream xmlns:stream="http://etherx.jabber.org/streams" ' +
                             'xmlns="jabber:client" to="im.sky-mobi.com" version="1.0">')

        self.transport.write('<iq type="set" id="451166"><query xmlns="jabber:iq:auth">' +
                             '<username>100001</username><resource>skymobi</resource>' +
                             '<password>kickass</password></query></iq>')

        self.transport.write('<iq type="get" id="451167"><query xmlns="jabber:iq:roster"/></iq>')

    def connectionLost(self, reason):
        self.factory.proto = None

    def dataReceived(self, data):
        #print "=== Data received ===", repr(data)
        pass

class JabberFactory(protocol.ClientFactory):
    protocol = JabberProxy

    def __init__(self):
        pass

    def __del__(self):
        print "*** Destruct JabberFactory ***"

    def clientConnectionFailed(self, connector, reason):
        print "JabberFactory::clientConnectionFailed"

myFactory = JabberFactory()
myRosters = []

class Roster():
    def __init__(self, skyID, factory):
        self.skyID = skyID
        self.factory = factory

    def __del__(self):
        print "*** Destruct Roster ***"

    def connect(self):
        self.conn = reactor.connectTCP('192.168.1.254', 5222, self.factory)

    def disconnect(self):
        # Guard against disconnect() being called before connect().
        conn = getattr(self, 'conn', None)
        if conn is not None and conn.transport:
            conn.transport.write('')
            conn.transport.loseConnection()
            # Break the reference cycle between the Connector and its Client
            # transport. Without this line, the objects leak when the Roster
            # is deleted.
            conn.transport = None

def stopIt():
    # Disconnect and drop every Roster, then stop the reactor.
    while myRosters:
        myRosters[0].disconnect()
        del myRosters[0]

    reactor.stop()

myRosters.append(Roster('100001', myFactory))
myRosters[0].connect()

reactor.callLater(1, stopIt)
reactor.run()

gc.collect()
print "gc.garbage:", len(gc.garbage)

for item in gc.garbage:
    print item

Note the line self.conn.transport = None: it is the key to fixing the leak. Here are the details.

When reactor.connectTCP is called, a Connector object is created. To make the connection, the Connector creates a Client object and keeps a reference to it as transport. The Client object in turn holds a reference named connector that points back to the Connector. The two objects therefore form a reference cycle, and this cycle is what leaks in my test script when it tries to free a Roster object (del myRosters[0]). Adding the one-line patch (self.conn.transport = None) before the Roster is destroyed breaks the cycle and removes the leak.
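
The cycle can be reproduced without Twisted. The sketch below is only an illustration and is written for Python 3, whereas the script above uses Python 2 print statements. FakeConnector and FakeClient are hypothetical stand-ins that mimic nothing more than the transport/connector attribute pair described above; they are not Twisted's real classes. A weak reference is used to check whether reference counting alone frees the pair.

```python
import gc
import weakref


class FakeConnector:
    """Hypothetical stand-in for the Connector; holds the client as `transport`."""

    def __init__(self):
        self.transport = None


class FakeClient:
    """Hypothetical stand-in for the Client; points back via `connector`."""

    def __init__(self, connector):
        self.connector = connector


def freed_by_refcount(break_cycle):
    """Return True if the pair is freed without help from the cycle collector."""
    gc.disable()  # keep the cycle collector out of the picture
    try:
        connector = FakeConnector()
        connector.transport = FakeClient(connector)  # the cycle is formed here
        probe = weakref.ref(connector)
        if break_cycle:
            connector.transport = None  # same idea as the one-line patch
        del connector
        freed = probe() is None
    finally:
        gc.enable()
    gc.collect()  # clean up whatever the cycle left behind
    return freed


if __name__ == "__main__":
    print("cycle left intact, freed:", freed_by_refcount(False))
    print("cycle broken, freed:", freed_by_refcount(True))
```

On CPython the first call should report False and the second True: once the connector no longer points at its transport, dropping the last outside reference is enough to release both objects.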

The results before and after the patch:

Before

*** Destruct Roster ***
*** Destruct JabberProxy ***
gc: collectable <dict 0x8daa79c>
gc: collectable <Connector instance at 0x8d9102c>
gc: collectable <Client 0x8d9104c>
gc: collectable <dict 0x8daa934>
gc: collectable <tuple 0x8ae776c>
gc: collectable <tuple 0x8c0e22c>
gc: collectable <list 0x8cb47ec>
gc.garbage: 7
{'reactor': <twisted.internet.selectreactor.SelectReactor object at 0x8d5b7cc>, 'state': 'disconnected', 'factoryStarted': 0, 'bindAddress': None, 'factory': <__main__.JabberFactory instance at 0x8ae7aec>, 'host': '192.168.1.254', 'timeout': 30, 'port': 5222, 'transport': <<class 'twisted.internet.tcp.Client'> to ('192.168.1.254', 5222) at 8d9104c>}
<twisted.internet.tcp.Connector instance at 0x8d9102c>
<<class 'twisted.internet.tcp.Client'> to ('192.168.1.254', 5222) at 8d9104c>
{'_tempDataBuffer': ['</stream:stream>'], 'disconnected': 1, 'dataBuffer': '', '_tempDataLen': 16, 'realAddress': ('192.168.1.254', 5222), 'connector': <twisted.internet.tcp.Connector instance at 0x8d9102c>, 'logstr': 'JabberProxy,client', 'connected': 0, 'offset': 0, 'disconnecting': 1, 'reactor': , 'addr': ('192.168.1.254', 5222)}
('192.168.1.254', 5222)
('192.168.1.254', 5222)
['</stream:stream>']

After

*** Destruct Roster ***
*** Destruct JabberProxy ***
gc.garbage: 0
*** Destruct JabberFactory ***
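
When reading the dump above, it helps to know that gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, which makes the collector append every unreachable object it finds to gc.garbage instead of freeing it. The sketch below, again Python 3 and stdlib only, shows one way to summarize such a list by type. Node is a hypothetical class used only to build a cycle, and garbage_by_type is an illustrative helper, not part of the script above.

```python
import gc
from collections import Counter


class Node:
    """Hypothetical object used only to build a reference cycle."""

    def __init__(self):
        self.peer = None


def garbage_by_type():
    """Create one two-object cycle and count what the collector reports."""
    gc.collect()                      # collect anything already pending
    del gc.garbage[:]                 # start from an empty garbage list
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        a, b = Node(), Node()
        a.peer, b.peer = b, a         # a <-> b cycle
        del a, b
        gc.collect()
        counts = Counter(type(obj).__name__ for obj in gc.garbage)
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[:]             # release the saved objects again
    return counts


if __name__ == "__main__":
    for name, count in sorted(garbage_by_type().items()):
        print(name, count)
```

The Node count should be 2. Whether the instances' attribute dicts appear as separate entries, as the two dict lines do in the output above, depends on the Python version.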

Note that patching Twisted itself (breaking the cycle inside Connector or Client) would also solve the problem, but it raises another issue: a Connector object is reusable, and breaking the cycle inside the framework would make it no longer reusable. So I think the better solution is to break the cycle in the application rather than in the framework.
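
One way to keep the fix on the application side is to put it in a small helper so that every disconnect path goes through the same code. The sketch below is hypothetical: release_connector is not a Twisted function, and it assumes only what the script above already relies on, namely a connector object with a transport attribute whose value has a loseConnection() method. It is written for Python 3, and a SimpleNamespace stands in for the connector so the example runs without Twisted.

```python
from types import SimpleNamespace


def release_connector(conn):
    """Close conn's transport, if any, and drop the connector's reference to it.

    Hypothetical helper: it relies only on the `transport` attribute and the
    `loseConnection()` method used by the script above.
    Returns True if a transport was released, False if there was none.
    """
    transport = getattr(conn, "transport", None)
    if transport is None:
        return False
    transport.loseConnection()
    conn.transport = None  # break the connector <-> transport cycle
    return True


if __name__ == "__main__":
    calls = []
    fake_transport = SimpleNamespace(loseConnection=lambda: calls.append("lost"))
    fake_conn = SimpleNamespace(transport=fake_transport)
    print(release_connector(fake_conn))  # first call releases the transport
    print(release_connector(fake_conn))  # second call finds nothing to do
```

Because the helper only clears the attribute on the connector it is handed, the framework classes stay untouched.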

