Note: This question has been re-asked with a summary of all debugging attempts here.
    ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]
After running for a few days, the call is erroring with:
    File "/home/admin/sd-agent/checks.py", line 436, in getProcesses
    File "/usr/lib/python2.4/subprocess.py", line 533, in __init__
    File "/usr/lib/python2.4/subprocess.py", line 835, in _get_handles
    OSError: [Errno 12] Cannot allocate memory
However the output of free on the server is:
    $ free -m
                 total       used       free     shared    buffers     cached
    Mem:           894        345        549          0          0          0
    -/+ buffers/cache:        345        549
    Swap:            0          0          0
I have searched around for the problem and found this article which says:
Solution is to add more swap space to your server. When the kernel is forking to start the modeler or discovery process, it first ensures there’s enough space available on the swap store the new process if needed.
I note that there is no available swap from the free output above. Is this likely to be the problem and/or what other solutions might there be?
Update 13th Aug 09:

The code above is called every 60 seconds as part of a series of monitoring functions. The process is daemonized and the check is scheduled using sched. The specific code for the above function is:
    def getProcesses(self):
        self.checksLogger.debug('getProcesses: start')

        # Memory logging (case 27152)
        if self.agentConfig['debugMode'] and sys.platform == 'linux2':
            mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()
            self.checksLogger.debug('getProcesses: memory before Popen - ' + str(mem))

        # Get output from ps
        try:
            self.checksLogger.debug('getProcesses: attempting Popen')
            ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]
        except Exception, e:
            import traceback
            self.checksLogger.error('getProcesses: exception = ' + traceback.format_exc())
            return False

        self.checksLogger.debug('getProcesses: Popen success, parsing')

        # Memory logging (case 27152)
        if self.agentConfig['debugMode'] and sys.platform == 'linux2':
            mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()
            self.checksLogger.debug('getProcesses: memory after Popen - ' + str(mem))

        # Split out each process
        processLines = ps.split('\n')

        del processLines[0]  # Removes the headers
        processLines.pop()   # Removes a trailing empty line

        processes = []

        self.checksLogger.debug('getProcesses: Popen success, parsing, looping')

        for line in processLines:
            line = line.split(None, 10)
            processes.append(line)

        self.checksLogger.debug('getProcesses: completed, returning')

        return processes
This is part of a bigger class called checks which is initialised once when the daemon is started.
The entire checks class can be found at http://github.com/dmytton/sd-agent/blob/82f5ff9203e54d2adeee8cfed704d09e3f00e8eb/checks.py with the getProcesses function defined from line 442. This is called by doChecks() starting at line 520.
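For readers unfamiliar with sched, the following is a minimal sketch of how a check might be re-armed at a fixed interval. It is not taken from the agent: run_periodically, max_runs and the counter are hypothetical names introduced for illustration, and the syntax targets a current Python 3 rather than the Python 2.4 shown in the traceback.

```python
import sched
import time


def run_periodically(check, interval=60, max_runs=None):
    """Call check() every `interval` seconds using the stdlib sched module.

    Hypothetical helper: max_runs exists only so the loop can be stopped
    in a demonstration; a daemon would leave it as None and run forever.
    Returns the number of times check() was called.
    """
    scheduler = sched.scheduler(time.time, time.sleep)
    runs = [0]  # mutable counter shared with the nested function

    def _tick():
        check()
        runs[0] += 1
        # Re-arm the event so the check fires again after `interval` seconds.
        if max_runs is None or runs[0] < max_runs:
            scheduler.enter(interval, 1, _tick, ())

    scheduler.enter(0, 1, _tick, ())
    scheduler.run()  # blocks until no events remain in the queue
    return runs[0]
```

Because the same process keeps calling the check for days, any resource that is not released on each call accumulates, which is why a per-call leak only shows up after a long run.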
When you use Popen, you need to pass close_fds=True if you want it to close extra file descriptors.
Creating a new pipe, which occurs in the _get_handles function from the backtrace, creates 2 file descriptors, but your current code never closes them, and you're eventually hitting your system's max fd limit.
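One way to test this theory is to pass close_fds=True and watch whether the number of open descriptors in the process grows between calls. The sketch below assumes Linux (it reads /proc/self/fd) and a current Python 3; run_command and count_open_fds are hypothetical helper names, not part of the agent.

```python
import os
import subprocess


def count_open_fds():
    """Return how many file descriptors this process has open.

    Linux-only assumption: relies on the /proc/self/fd directory.
    """
    return len(os.listdir('/proc/self/fd'))


def run_command(args=('ps', 'aux')):
    """Run a command with close_fds=True and return its stdout as bytes."""
    proc = subprocess.Popen(list(args), stdout=subprocess.PIPE, close_fds=True)
    # communicate() reads stdout until EOF and waits for the child to exit.
    out, _ = proc.communicate()
    return out
```

Logging count_open_fds() before and after each call over a few hours should show whether descriptors are actually accumulating. If the count stays flat, the fd limit is probably not the cause.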
Not sure why the error you're getting indicates an out-of-memory condition: it should be a file descriptor error, as the return value of pipe() has an error code for this problem.
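To see which error code pipe() reports when the descriptor limit is reached, something like the following can be run. It is only a sketch, assuming a Unix system with the resource module and a current Python 3. The function name and the limit of 64 are hypothetical choices made for the demonstration.

```python
import errno
import os
import resource


def pipe_errno_at_fd_limit(limit=64):
    """Lower the soft fd limit, open pipes until one fails, return the errno."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit.
    new_soft = limit if hard == resource.RLIM_INFINITY else min(limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    opened = []
    try:
        while True:
            # Each pipe() call consumes two descriptors.
            opened.extend(os.pipe())
    except OSError as exc:
        return exc.errno
    finally:
        # Release everything and restore the original limit.
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == '__main__':
    code = pipe_errno_at_fd_limit()
    print(code, errno.errorcode[code], os.strerror(code))
```

On Linux the code returned here is EMFILE ("Too many open files"), which is a different value from the ENOMEM (Errno 12) in the traceback above, so the two conditions can be told apart from the exception's errno.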