ZMQ IOLoop instance write/read workflow - python-2.7

I am seeing weird system behavior when using PyZMQ's IOLoop instance:
import time
import zmq
from zmq.eventloop import ioloop, zmqstream

def main():
    context = zmq.Context()
    s = context.socket(zmq.REP)
    s.bind('tcp://*:12345')
    stream = zmqstream.ZMQStream(s)
    stream.on_recv(on_message)
    io_loop = ioloop.IOLoop.instance()
    io_loop.add_handler(some_file.fileno(), on_file_data_ready_read_and_then_write, io_loop.READ)
    io_loop.add_timeout(time.time() + 10, another_handler)
    io_loop.start()

def on_file_data_ready_read_and_then_write(fd, events):
    # Read content of the file and then write back
    some_file.read()
    print "Read content"
    some_file.write("blah")
    print "Wrote content"

def on_message(msg):
    # Do something...
    pass

if __name__=='__main__':
    main()
Basically, the event loop listens on ZMQ port 12345 for JSON requests, and reads content from a file when it becomes available (and when it does, manipulates it and writes it back; the file is a special /proc/ entry backed by a kernel module that was built for that).
Everything works well BUT, for some reason when looking at the strace I see the following:
...
1. read(\23424) <--- Content read from file
2. write("read content")
3. write("Wrote content")
4. POLLING
5. write(\324324) # <---- THIS is the content that was sent using some_file.write()
...
So it seems like the write to file was not done in the order of the python script, but the system call of write to that file was done AFTER the polling, even though it should have been done between lines 2 and 3.
Any ideas?

Looks like you're running into a buffering problem. If some_file is a file-like object, you can try explicitly calling .flush() on it after writing. The same goes for the ZMQ socket, which can hold messages for efficiency reasons as well.
As it stands, the buffered data only reaches the file later, when the buffer is flushed implicitly, for example when the some_file reference is garbage collected.
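To make the buffering effect concrete, here is a minimal sketch using an ordinary temporary file rather than the /proc file from the question. The helper name `demo_buffering` is hypothetical, and the code is written for Python 3; the same `write()` then `flush()` pattern applies to Python 2.7 file objects.

```python
import os
import tempfile

def demo_buffering():
    """Return the on-disk file size before and after an explicit flush()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "r+") as some_file:
            some_file.write("blah")
            # The data is still sitting in Python's userspace buffer here,
            # so no write() system call has been issued yet.
            size_before_flush = os.path.getsize(path)
            some_file.flush()
            # flush() hands the buffered bytes to the OS immediately.
            size_after_flush = os.path.getsize(path)
    finally:
        os.remove(path)
    return size_before_flush, size_after_flush
```

In the handler from the question, this would amount to calling some_file.flush() right after some_file.write("blah"), so the write system call happens before control returns to the poll loop.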
Additional:
Use the context manager support that open() provides in newer versions of Python:
with open("my_file", "r+") as some_file:
    some_file.write("blah")
As soon as execution leaves this block, some_file is automatically flushed and closed.

count all events in stream with babeltrace API

I have an LTTng trace, which I am parsing using the babeltrace API. I was wondering if I could count all events in a trace (or stream) without iterating over them. Which functions from the public API can I use to do that?
The very nature of CTF makes it impossible to count the event records of a given packet in constant time. The packet's context could include an event record count field somehow, but it's not specified, so generic tools would not use it.
Thus the only way to count events is to iterate the event records, unfortunately. The easiest way is to count the number of lines that the text format of the babeltrace(1) tool prints:
babeltrace /path/to/ctf/trace/directory | wc --lines
This works as long as there's one line per printed event record, which is the case unless an event record contains a string field which has a newline (currently not escaped in the text output).
You may also wish to consider discarded event records. They are not printed to the standard output by babeltrace(1), but the tool prints a message including the count to the standard error when they are detected.
There's no way with the current babeltrace(1) tool to only print the event records which belong to the packets of a given data stream. If you need this, what I suggest is that you remove all the data stream files except the one for which you need an event record count, and run the command above again.
Also consider the Babeltrace Python bindings, for example (not tested):
import babeltrace

def count_ctf_event_records(path):
    trace_collection = babeltrace.TraceCollection()
    trace_collection.add_trace(path, 'ctf')
    return sum(1 for event in trace_collection.events)

if __name__ == '__main__':
    import sys
    print(count_ctf_event_records(sys.argv[1]))
Saved as count.py, you can try this:
python3 count.py /path/to/ctf/trace/directory
Counting the event records of a specific data stream with the Python bindings is left as an exercise for the reader.
Having said this, I don't know if the Python bindings approach is faster than the babeltrace(1) one.

catch change in security permission of file without blocking

I want to catch a change in a file's security permissions without blocking until the change is made, like an event that fires when the change happens.
I also don't want to install any third-party modules or software.
The main requirement is that the solution uses only the win32 modules or built-in modules.
I'm currently watching for security change by this function:
import win32con, win32file

def security_watcher():
    specific_file = "specific.ini"
    path_to_watch = "."
    FILE_LIST_DIRECTORY = 0x0001
    hDir = win32file.CreateFile(path_to_watch,
                                FILE_LIST_DIRECTORY,
                                win32con.FILE_SHARE_READ |
                                win32con.FILE_SHARE_WRITE |
                                win32con.FILE_SHARE_DELETE,
                                None,
                                win32con.OPEN_EXISTING,
                                win32con.FILE_FLAG_BACKUP_SEMANTICS,
                                None)
    results = win32file.ReadDirectoryChangesW(hDir,
                                              1024,
                                              False,
                                              win32con.FILE_NOTIFY_CHANGE_SECURITY,
                                              None,
                                              None)
    print results
    for action, file_name in results:
        if file_name == specific_file:
            # wake another function to do something about that
            pass
Note: I need it to be non-blocking because I use this function in a GUI application and it freezes the GUI.
If you don't mind (or can't avoid) adding some threading overhead, you can launch a separate process or thread that waits on the blocking call to win32file.ReadDirectoryChangesW() that you already have. When it receives a change, it writes the result to a pipe shared with the main thread of your GUI.
Your GUI can periodically execute a non-blocking read at the appropriate point in your code (presumably, where you now call win32file.ReadDirectoryChangesW()). Do make sure to wait a bit between reads, or your app will spend 100% of its time on non-blocking reads.
You can see in this answer how to set up a non-blocking read on a pipe in an OS-independent way. The final bit will look like this:
try:
    results = q.get_nowait()
except Empty:
    pass
else:
    for line in results.splitlines():
        # Parse and use your result format
        ...
        if file_name == specific_file:
            ...
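A minimal sketch of the whole pattern, using a thread and a queue instead of a separate process and a pipe, might look like this. The names `watch_in_background` and `poll_once` are hypothetical, and `blocking_wait` stands in for any function that wraps the blocking win32file.ReadDirectoryChangesW() call. The code uses the Python 3 module name `queue`; in Python 2.7 the module is called `Queue`.

```python
import queue
import threading

def watch_in_background(blocking_wait, results_queue):
    """Run blocking_wait() in a loop on a daemon thread, queueing each result."""
    def worker():
        while True:
            # Blocks here, but only in the worker thread, never in the GUI.
            results_queue.put(blocking_wait())

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

def poll_once(results_queue):
    """Drain whatever results are pending without blocking.

    Intended to be called periodically, e.g. from a GUI timer callback.
    """
    pending = []
    while True:
        try:
            pending.append(results_queue.get_nowait())
        except queue.Empty:
            return pending
```

Calling poll_once() from a timer that fires a few times per second keeps the GUI responsive and avoids the busy loop mentioned above.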

How to close file descriptors in python?

I have the following code in python:
import os

class suppress_stdout_stderr(object):
    '''
    A context manager for doing a "deep suppression" of stdout and stderr in
    Python, i.e. will suppress all print, even if the print originates in a
    compiled C/Fortran sub-function.
    This will not suppress raised exceptions, since exceptions are printed
    to stderr just before a script exits, and after the context manager has
    exited (at least, I think that is why it lets exceptions through).
    '''
    def __init__(self):
        # Open a pair of null files
        self.null_fds = [os.open(os.devnull,os.O_RDWR) for x in range(2)]
        # Save the actual stdout (1) and stderr (2) file descriptors.
        self.save_fds = (os.dup(1), os.dup(2))

    def __enter__(self):
        # Assign the null pointers to stdout and stderr.
        os.dup2(self.null_fds[0],1)
        os.dup2(self.null_fds[1],2)

    def __exit__(self, *_):
        # Re-assign the real stdout/stderr back to (1) and (2)
        os.dup2(self.save_fds[0],1)
        os.dup2(self.save_fds[1],2)
        # Close the null files
        os.close(self.null_fds[0])
        os.close(self.null_fds[1])

for i in range(10**6):
    with suppress_stdout_stderr():
        print 'plop'
    if i % 50 == 0:
        print i
It fails at 5100 on OS X with OSError: [Errno 24] Too many open files. I'm wondering why, and whether there is a way to close the file descriptors. I'm looking for a context manager that suppresses stdout and stderr.
I executed your code on a Linux machine and got the same error but at a different number of iterations.
I added the following two lines in the __exit__(self, *_) function of your class:
os.close(self.save_fds[0])
os.close(self.save_fds[1])
With this change I do not get an error and the script returns successfully. I assume that the duplicated file descriptors stored in self.save_fds are kept open if you don't close them with os.close(fds) and so you get the too many files open error.
Anyway my console printed "plop", but maybe this depends on my platform.
Let me know if it works :)
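For reference, a sketch of the class with that fix folded in could look as follows. It is written for Python 3 and differs from the original in two ways that are additions, not part of the answer above: it flushes Python's own stdout/stderr buffers before swapping descriptors, and it closes all four descriptors in one loop.

```python
import os
import sys

class suppress_stdout_stderr(object):
    """Suppress stdout and stderr at the file-descriptor level."""

    def __init__(self):
        # Open a pair of null files.
        self.null_fds = [os.open(os.devnull, os.O_RDWR) for _ in range(2)]
        # Save duplicates of the real stdout (1) and stderr (2).
        self.save_fds = [os.dup(1), os.dup(2)]

    def __enter__(self):
        # Flush pending Python-level output so it is not lost or misrouted.
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(self.null_fds[0], 1)
        os.dup2(self.null_fds[1], 2)
        return self

    def __exit__(self, *_):
        # Push suppressed output into /dev/null before restoring.
        sys.stdout.flush()
        sys.stderr.flush()
        # Point (1) and (2) back at the real stdout/stderr.
        os.dup2(self.save_fds[0], 1)
        os.dup2(self.save_fds[1], 2)
        # Close the null files AND the saved duplicates; leaving the
        # duplicates open is what leaks two descriptors per iteration.
        for fd in self.null_fds + self.save_fds:
            os.close(fd)
```

Because every descriptor opened in __init__ is closed in __exit__, the loop from the question can create a new instance on every iteration without hitting the open-file limit.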

Test a program that uses tty stdin and stdout

I have a software made of two halves: one is python running on a first pc, the other is cpp running on a second one.
They communicate through the serial port (tty).
I would like to test the python side on my pc, feeding it with the proper data and see if it behaves as expected.
I started using subprocess but then came the problem: which stdin and stdout should I supply?
cStringIO does not work because there is no fileno()
PIPE doesn't work either because select.select() says there is something to read even if nothing is actually sent
Do you have any hints? Is there a fake tty module I can use?
Ideally you should mock that out and just test the behavior, without relying too much on terminal IO. You can use mock.patch for that. Say you want to test t_read:
@mock.patch.object(sys.stdin, 'fileno')
@mock.patch.object(sys.stdin, 'read')
def test_your_behavior(self, mock_read, mock_fileno):
    # this should make select.select return what you expect it to return
    mock_fileno.return_value = 'your expected value'
    # rest of the test goes here...
If you can post at least part of the code you're trying to test, I can maybe give you a better example.
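In the meantime, here is a self-contained sketch of the same idea that patches select.select directly. The function `t_read` is a hypothetical stand-in for the code under test, since the real one was not posted, and `run_checks` is an illustrative helper. It uses `unittest.mock` from the Python 3 standard library; on Python 2 the equivalent is the third-party `mock` package.

```python
import select
import sys
from unittest import mock

def t_read(stream=None):
    """Hypothetical code under test: read a line only if select reports data."""
    stream = stream or sys.stdin
    readable, _, _ = select.select([stream], [], [], 0)
    if readable:
        return stream.readline()
    return None

def run_checks():
    """Exercise t_read() with and without pending data, using no real tty."""
    fake_tty = mock.Mock()
    fake_tty.readline.return_value = "PING\n"

    # Case 1: select reports the stream as readable.
    with mock.patch("select.select", return_value=([fake_tty], [], [])):
        got_data = t_read(fake_tty)

    # Case 2: select reports nothing to read.
    with mock.patch("select.select", return_value=([], [], [])):
        got_nothing = t_read(fake_tty)

    return got_data, got_nothing
```

Since select.select is replaced for the duration of each block, the fake stream does not need a working fileno(), which sidesteps the cStringIO and PIPE problems from the question.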

Condor output file updating

I'm running several simulations using Condor and have coded the program so that it outputs a progress status in the console. This is done at the end of a loop where it simply prints the current time (this can also be percentage or elapsed time). The code looks something like this:
printf("START");
while (programNeedsToRun) {
    // Run repetitive code...
    // Print program status update
    printf("[%i:%i:%i]\r\n", hours, minutes, seconds);
}
printf("FINISH");
When executed normally (i.e. in the terminal/cmd/bash) this works fine, but on the Condor nodes the printf() status updates don't appear while the job runs. Only once the simulation has finished are all the status updates written to the file, by which point they are no longer useful. My *.sub file that I submit to Condor looks like this:
universe = vanilla
executable = program
output = out/out-$(Process)
error = out/err-$(Process)
queue 100
When submitted the program executes (this is confirmed in condor_q) and the output files contain this:
START
Only once the program has finished running its corresponding output file shows (example):
START
[0:3:4]
[0:8:13]
[0:12:57]
[0:18:44]
FINISH
Whilst the program executes, the output file only contains the START text. So I came to the conclusion that the file is not updated if the node executing program is busy. So my question is, is there a way of updating the output files manually or gather any information on the program's progress in a better way?
Thanks already
Max
What you want to do is use the streaming output options. See the stream_error and stream_output options you can pass to condor_submit as outlined here: http://research.cs.wisc.edu/htcondor/manual/current/condor_submit.html
By default, HTCondor stores stdout and stderr locally on the execute node and transfers them back to the submit node on job completion. Setting stream_output to TRUE will ask HTCondor to instead stream the output as it occurs back to the submit node. You can then inspect it as it happens.
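Applied to the submit file from the question, this might look like the fragment below. It is only a sketch: stream_output and stream_error are the two options named above, and every other line is copied unchanged from the question.

```
universe = vanilla
executable = program
output = out/out-$(Process)
error = out/err-$(Process)
stream_output = True
stream_error = True
queue 100
```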
Here's something I used a few years ago to solve this problem. It uses condor_chirp which is used to transfer files from the execute host to the submitter. I have a python script that executes the program I really want to run, and redirects its output to a file. Then, periodically, I send the output file back to the submit host.
Here's the Python wrapper, stream.py:
#!/usr/bin/python
import os,sys,time
os.environ['PATH'] += ':/bin:/usr/bin:/cygdrive/c/condor/bin'
# make sure the file exists
open(sys.argv[1], 'w').close()
pid = os.fork()
if pid == 0:
    os.system('%s >%s' % (' '.join (sys.argv[2:]), sys.argv[1]))
else:
    while True:
        time.sleep(10)
        os.system('condor_chirp put %s %s' % (sys.argv[1], sys.argv[1]))
        try:
            os.wait4(pid, os.WNOHANG)
        except OSError:
            break
And my submit script. The job ran sh hello.sh and redirected the output to myout.txt:
universe = vanilla
executable = C:\cygwin\bin\python.exe
requirements = Arch=="INTEL" && OpSys=="WINNT60" && HAS_CYGWIN==TRUE
should_transfer_files = YES
transfer_input_files = stream.py,hello.sh
arguments = stream.py myout.txt sh hello.sh
transfer_executable = false
It does send the output in its entirety each time, so take that into account if you have a lot of jobs running at once. Currently, it sends the output every 10 seconds; you may want to adjust that.
With condor_tail you can view the output of a running job.
To see stdout, just pass the job ID (and -f if you want to follow the output and see the updates immediately). Example:
condor_tail 314.0 -f