How to close file descriptors in python? - python-2.7

I have the following code in python:
import os

class suppress_stdout_stderr(object):
    '''
    A context manager for doing a "deep suppression" of stdout and stderr in
    Python, i.e. will suppress all print, even if the print originates in a
    compiled C/Fortran sub-function.
    This will not suppress raised exceptions, since exceptions are printed
    to stderr just before a script exits, and after the context manager has
    exited (at least, I think that is why it lets exceptions through).
    '''
    def __init__(self):
        # Open a pair of null files
        self.null_fds = [os.open(os.devnull, os.O_RDWR) for x in range(2)]
        # Save the actual stdout (1) and stderr (2) file descriptors.
        self.save_fds = (os.dup(1), os.dup(2))

    def __enter__(self):
        # Assign the null pointers to stdout and stderr.
        os.dup2(self.null_fds[0], 1)
        os.dup2(self.null_fds[1], 2)

    def __exit__(self, *_):
        # Re-assign the real stdout/stderr back to (1) and (2)
        os.dup2(self.save_fds[0], 1)
        os.dup2(self.save_fds[1], 2)
        # Close the null files
        os.close(self.null_fds[0])
        os.close(self.null_fds[1])

for i in range(10**6):
    with suppress_stdout_stderr():
        print 'plop'
    if i % 50 == 0:
        print i
It fails at around iteration 5100 on OS X with OSError: [Errno 24] Too many open files. I'm wondering why this happens and whether there is a way to close the file descriptors. I'm looking for a context manager that suppresses stdout and stderr and closes them properly afterwards.

I executed your code on a Linux machine and got the same error but at a different number of iterations.
I added the following two lines in the __exit__(self, *_) function of your class:
os.close(self.save_fds[0])
os.close(self.save_fds[1])
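For clarity, here is the whole __exit__ with those two lines folded in (same logic as above, just merged):
    def __exit__(self, *_):
        # Re-assign the real stdout/stderr back to (1) and (2)
        os.dup2(self.save_fds[0], 1)
        os.dup2(self.save_fds[1], 2)
        # Close the null files *and* the saved duplicates, so all four
        # descriptors created in __init__ are released on every iteration.
        os.close(self.null_fds[0])
        os.close(self.null_fds[1])
        os.close(self.save_fds[0])
        os.close(self.save_fds[1])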
With this change I do not get an error and the script returns successfully. I assume that the duplicated file descriptors stored in self.save_fds are kept open if you don't close them with os.close(fd), and that is why you get the "too many open files" error.
Anyway, my console printed "plop", but maybe that depends on my platform.
Let me know if it works :)

Related

Python Popen doesn't capture stderr

I need to be able to read stdout and stderr as they occur, from a process that I spawn in Python. I am currently using:
from subprocess import Popen, PIPE

task = Popen('sh job.sh', stdout=PIPE, bufsize=1)
with task.stdout:
    for line in iter(task.stdout.readline, b''):
        stream.append(line)
        fileHandle.write(line)
This is getting the stdout, but stderr is getting sent to the console:
./tmp_2edd9d49-4108-43e8-a09f-30f34488c531: line 1: #echo: command not found
I tried adding stderr=PIPE, but that made the errors vanish. Is there a way of doing this so I can read both (I really would like the errors to appear in the right place)?
You can't omit the stderr argument if you want to capture it!
import subprocess as shell

raw_cmd = 'sh job.sh'
cmd_list = raw_cmd.split()
task = shell.Popen(cmd_list, stdout=shell.PIPE, stderr=shell.PIPE)

with task.stderr as stderr:
    for line in stderr:
        print line
with task.stdout as stdout:
    for line in stdout:
        print line
Basically the external program writes to two files, stdout and stderr, and we plug these "out-files" into our program. The way it is done in this example only lets you read all of stderr and then all of stdout, so right now there is no correlation between the two.
To track both files simultaneously, you would have to fall back to select, poll, or epoll, depending on installed libraries and OS.
e.g. on Linux:
...
from select import select
...
while 1:
    # `select` blocks until at least one file is ready!
    reads, writes, errors = select([task.stdout, task.stderr], [], [])
    for stdfile in reads:
        # read a single line per wakeup; iterating the whole file
        # here would block until EOF and defeat the select call
        if stdfile == task.stdout:
            print "stdout:", stdfile.readline()
        if stdfile == task.stderr:
            print "stdERR:", stdfile.readline()
...
Beware, the code above is untested, but it would allow a tighter out/err correlation. It is also not an optimal solution, just a pointer to possible avenues.
You let select block until any of the specified files/pipes is ready. Then you check which file is ready (e.g. if stdfile == task.stderr), print from it, and repeat the loop around select.
If you don't want this loop to block, you could move it into a separate thread, as sketched below, or make select non-blocking and poll repeatedly (see the select documentation).
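If you go the thread route instead, a minimal sketch (assuming the same task object as above, and untested in the same spirit as the select example):
from threading import Thread

def pump(pipe, label):
    # Drain one pipe line by line so the child never blocks
    # on a full OS pipe buffer.
    for line in iter(pipe.readline, b''):
        print label, line,
    pipe.close()

t_out = Thread(target=pump, args=(task.stdout, 'stdout:'))
t_err = Thread(target=pump, args=(task.stderr, 'stdERR:'))
t_out.start(); t_err.start()
t_out.join(); t_err.join()
task.wait()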

File opening operation misbehaving in Python 2.7

I am learning about exceptions, so I am performing some file operations and testing the various parts of the code that can possibly generate exceptions while working with files in Python. I am executing this Python 2.7 code in Canopy.
#!/usr/bin/python
import os

try:
    fp = open('testfile', 'r')
except IOError:
    print 'File not opened successfully'
else:
    print 'File opened successfully'
    try:
        fp.write('Hello!')
    except IOError:
        print 'Write not allowed on this file'
    else:
        print 'Write successful'
    try:
        fp.close()
    except IOError:
        print 'File not closed properly'
    else:
        print 'File closed successfully'
finally:
    if os.path.exists(fp.name):
        os.remove(fp.name)
When I execute this code, I get the following output:
File not opened properly
NameErrorTraceback (most recent call last)
/home/sr/Python/tcs.py in <module>()
--> 185 if os.path.exists(fp.name)
NameError: name 'fp' is not defined
But if I change the access mode of the file to 'w', then everything seems to work properly, with the correct output:
File opened successfully
Write successful
File closed successfully
I cannot understand why the 'r' mode does not open the file properly, so that the fp file object is never created. Please help me figure out the problem.
P.S.: I would also like to know if there is a better way of implementing the same thing, but that is optional.
Explanation
The error combined with your printout should be pretty self-explanatory: the variable fp does not exist if you can't open the file.
The mode 'r' indicates that you want to open the file for reading. You can not read something that is not there, so you end up going to the finally block in your code after processing the IOError. But the error occurs before fp was set, so there is no variable fp, hence the error. [Solutions below]
The mode 'w' indicates that you want to open the file for writing, from scratch. There is also an 'a' mode to append if the file already exists. You can write to a non-existent file just fine, so your code does not fail. In fact, if the file did exist in 'w' mode, it would be truncated and any previous contents would be lost.
Try creating an empty file and running with mode 'r'. You should get an exception that prints 'Write not allowed on this file'. That is because, as your error message correctly indicates, writing to a file opened in read mode is not allowed.
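A quick way to convince yourself of the difference between the two modes (the file name here is just for illustration):
import os

# 'r' needs the file to exist; 'w' creates it if it doesn't.
try:
    open('no_such_file', 'r')
except IOError as e:
    print 'IOError:', e          # [Errno 2] No such file or directory
fp = open('no_such_file', 'w')   # succeeds and creates the file
fp.close()
os.remove('no_such_file')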
Improvements
There are two major improvements that you can make to your code. One is fixing the logical flaws, the other is a major stylistic improvement using with statements.
You have two major logic errors. The first is in the outermost finally block that you already saw. The simplest fix is moving the contents of the finally block into the else, since you don't have any action to take if the file was not opened. Another solution is to refer to the file name you are trying to open in the first place. For example, you could store the file name into a variable and use that:
filename = 'testfile'
try:
    fp = open(filename, 'r')
    ...
finally:
    if os.path.exists(filename):
        os.remove(filename)
The second major logic error is that you do not close the file if your write fails. Notice that you call fp.close() only in the else clause of your try block. It should instead appear in a finally block. The print statement should of course stay in the else. Change
else:
    print 'Write successful'
try:
    fp.close()
...
to
else:
    print 'Write successful'
finally:
    try:
        fp.close()
...
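Putting both fixes together (and reusing the filename variable from above), the corrected flow would look roughly like this:
import os

filename = 'testfile'
try:
    fp = open(filename, 'r')
except IOError:
    print 'File not opened successfully'
else:
    print 'File opened successfully'
    try:
        fp.write('Hello!')
    except IOError:
        print 'Write not allowed on this file'
    else:
        print 'Write successful'
    finally:
        # runs whether or not the write succeeded
        try:
            fp.close()
        except IOError:
            print 'File not closed properly'
        else:
            print 'File closed successfully'
finally:
    if os.path.exists(filename):
        os.remove(filename)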
The entire code can be improved stylistically by using with blocks to manage your file operations. The obvious way to do this is as follows:
fname = 'testfile'
with open(fname, 'r') as fp:
    fp.write('Hello!')
if os.path.exists(fname):
    os.remove(fname)
You will not get as many detailed messages when things fail, but overall this code is cleaner, shorter and more robust than what you have. It guarantees that the file will be closed whether or not an exception occurs anywhere along the way. If you need the detailed error output that you currently have, keep using the current try blocks. The reason most people prefer with is that any error that occurs will carry a detailed description and the line number it happened on, so you get all the necessary information with a lot less work.
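If you want the with statement and a friendly message at the same time, the two combine naturally; a minimal sketch:
import os

fname = 'testfile'
try:
    with open(fname, 'r') as fp:
        fp.write('Hello!')    # the file is closed even if this raises
except IOError as e:
    print 'I/O error:', e     # covers both a missing file and a read-only handle
finally:
    if os.path.exists(fname):
        os.remove(fname)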
Here are some additional resources to help you understand with and context managers:
Understanding Python's "with" statement (from effbot.org)
Official documentation for with
SO 1, 2, 3, 4

Script failing to open and append multiple files simultaneously

So, trying to finish a very simple script that has given me an unbelievably hard time. It's supposed to iterate through specified directories, open all text files in them, and append the same specified string to each of them.
The issue is that it's not doing anything to the files at all. Using print to test my logic, I replaced lines 10 and 11 (the write and close calls) with print f, and got the following output:
<open file '/Users/russellculver/documents/testfolder/.DS_Store', mode 'a+' at
So I think it is storing the correct files in the f variable for the write function; however, I am not familiar with how Macs handle .DS_Store or the exact role it plays in temporary location tracking.
Here is the actual script:
import os

x = raw_input("Enter the directory path here: ")

def rootdir(x):
    for dirpaths, dirnames, files in os.walk(x):
        for filename in files:
            try:
                with open(os.path.join(dirpaths, filename), 'a+') as f:
                    f.write('new string content')
                    f.close()
            except:
                print "Directory empty or unable to open file."
            return x

rootdir(x)
And the exact return in Terminal after execution:
Enter the directory path here: /Users/russellculver/documents/testfolder
Exit status: 0
logout
[Process completed]
Yet nothing written to the .txt files in the provided directory.
The way the indentation is in the question, you return from the function right after handling the first file, so neither of the for-loops ever finishes. This is relatively easy to surmise from the fact that you only get one open file printed.
Since you're not doing anything with the result of the rootdir function, I would just remove the return statement entirely.
An aside: there is no need to call f.close() when you open a file with the with statement: it will be closed automatically (even upon an exception). That is in fact what the with statement was introduced for (see PEP 343 on context managers if necessary).
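A tiny illustration of that guarantee (the file name is arbitrary):
import os

try:
    with open('demo.txt', 'w') as f:
        f.write('partial')
        raise RuntimeError('boom')
except RuntimeError:
    print f.closed    # True: the with block closed the file despite the exception
os.remove('demo.txt')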
To be complete, here's the function the way I would have (roughly) written it:
def rootdir(x):
    for dirpaths, dirnames, files in os.walk(x):
        for filename in files:
            path = os.path.join(dirpaths, filename)
            try:
                with open(path, 'a+') as f:
                    f.write('new string content')
            except (IOError, OSError) as exc:
                print "Directory empty or unable to open file:", path
(Note that I'm catching only the relevant I/O errors; any other exceptions (though unlikely) will not be caught, as they are probably not related to a non-existing or unwritable file.)
The return was indented wrongly, ending the iteration after a single loop. It wasn't even necessary, so it was removed entirely.

python: reading executable's stdout, broken stream

I am trying to read the output of an executable (A), which is written in C++, from my Python script. I am working on Linux. The only way I have found so far is through the subprocess library.
Firstly I tried
from subprocess import Popen, PIPE, STDOUT
import sys

p = Popen(['executable', '-arg_flag1', arg1 ...], stdout=PIPE, stdin=PIPE, stderr=STDOUT)
print "reach here"
stdout_output = p.communicate()[0]
print stdout_output
sys.stdin.read(1)
which turned out to hang both my executable (with 99% CPU usage) and my script. Moreover, "reach here" is printed.
After that I tried:
import subprocess

f = open("out.txt", 'r+')
command = 'executable -arg_flag1 arg1 ... '
subprocess.call(command, shell=True, stdout=f)
f.seek(0)
content = f.read()
and this works, but I get output where some characters at the end of the content are repeated, or even more values are produced than expected.
Anyway, could someone enlighten me on a more proper way to do this?
Thanks in advance
The first solution is best. Using shell=True is slower, and has security issues.
The problem is that Popen doesn't wait for the process to complete, so Python stops, leaving the process without stdout, stdin and stderr and causing it to go wild. Adding p.wait() should do the trick!
Also, using communicate is a waste of time here. Just do stdout_output = p.stdout.read(). You'll have to check for yourself whether stdout_output contains anything, but this is still nicer than using communicate()[0].
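A minimal sketch of that advice applied to the first attempt (the executable name and flag are placeholders from the question):
from subprocess import Popen, PIPE, STDOUT

p = Popen(['executable', '-arg_flag1', 'arg1'], stdout=PIPE, stderr=STDOUT)
stdout_output = p.stdout.read()   # blocks until the child closes stdout (EOF)
p.wait()                          # reap the child so it does not linger
if stdout_output:
    print stdout_output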

ZMQ IOLoop instance write/read workflow

I am having a weird system behavior when using PyZMQ's IOLoop instance:
import time

import zmq
from zmq.eventloop import ioloop, zmqstream

def main():
    context = zmq.Context()
    s = context.socket(zmq.REP)
    s.bind('tcp://*:12345')
    stream = zmqstream.ZMQStream(s)
    stream.on_recv(on_message)
    io_loop = ioloop.IOLoop.instance()
    io_loop.add_handler(some_file.fileno(), on_file_data_ready_read_and_then_write, io_loop.READ)
    io_loop.add_timeout(time.time() + 10, another_handler)
    io_loop.start()

def on_file_data_ready_read_and_then_write(fd, events):
    # Read content of the file and then write back
    some_file.read()
    print "Read content"
    some_file.write("blah")
    print "Wrote content"

def on_message(msg):
    # Do something...
    pass

if __name__ == '__main__':
    main()
Basically the event loop listens on zmq port 12345 for JSON requests and, when content is available, reads it from a file (and when it does, manipulates it and writes it back; the file is a special /proc/ entry provided by a kernel module that was built for this).
Everything works well, BUT when looking at the strace output I see the following:
...
1. read(\23424) <--- Content read from file
2. write("read content")
3. write("Wrote content")
4. POLLING
5. write(\324324) # <---- THIS is the content that was sent using some_file.write()
...
So it seems that the write to the file was not done in the order of the Python script: the write system call for that file was issued AFTER the polling, even though it should have happened between lines 2 and 3 above.
Any ideas?
Looks like you're running into a buffering problem. If some_file is a file-like object, you can try explicitly calling .flush() on it; the same goes for a ZMQ socket, which can hold messages back for efficiency reasons as well.
As it stands, the file's contents are only being flushed when the some_file reference is garbage collected.
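Applied to the handler above, that would amount to something like:
def on_file_data_ready_read_and_then_write(fd, events):
    some_file.read()
    print "Read content"
    some_file.write("blah")
    some_file.flush()    # push the buffered write to the descriptor right away
    print "Wrote content"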
Additional:
use the context manager logic that newer versions of Python provide with open():
with open("my_file", "w") as some_file:    # "w" (or "r+") so the file is writable
    some_file.write("blah")
As soon as the context exits, some_file will automatically be flushed and closed.