DLL Injection/IPC question

I'm working on a build tool that launches thousands of processes (compiles, links, etc.). It also distributes executables to remote machines so that the build can be run across hundreds of slave machines. I'm implementing DLL injection to monitor the child processes of my build process so that I can see that they opened/closed the resources I expected them to. That way I can tell if my users aren't specifying dependency information correctly.
My question is:
I've got the DLL injection working, but I'm not all that familiar with Windows programming. What would be the best/fastest way to call back to the parent build process with all the millions of file I/O reports that the children will be generating? I've thought about having them write to a non-blocking socket, but have been wondering whether pipes, shared memory, or maybe COM would be better.

First, since you're apparently dealing with communication between machines, not just within one machine, I'd rule out shared memory immediately.
I'd think hard about trying to minimize the amount of data instead of worrying a lot about how fast you can send it. Instead of sending millions of file I/O reports, I'd batch together a few kilobytes of that data (or something on that order) and send a hash of that packet. With a careful choice of packet size, you should be able to reduce your data transmission to the point that you can simply use whatever method you find most convenient, rather than trying to pick the one that's the fastest.
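One reading of the batching idea, as a rough sketch rather than a finished design: collect reports in a buffer and hand them on only once a size threshold is reached, together with a hash of the batch. The names ReportBatcher and fnv1a64, the one-report-per-line format, and the choice of FNV-1a are all assumptions made for illustration; the sink stands in for whatever transport ends up being used.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// 64-bit FNV-1a; picked here only because it is tiny and dependency-free.
inline std::uint64_t fnv1a64(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;                   // FNV prime
    }
    return h;
}

// Hypothetical helper: collects file I/O reports and passes them to `sink`
// in batches instead of sending one message per report.
class ReportBatcher {
public:
    using Sink = std::function<void(const std::string& batch, std::uint64_t hash)>;

    ReportBatcher(std::size_t max_bytes, Sink sink)
        : max_bytes_(max_bytes), sink_(std::move(sink)) {}

    // Flush the tail of the data when the batcher goes away.
    ~ReportBatcher() { flush(); }

    // Append one report, stored one per line; flush once the batch is full.
    void add(const std::string& report) {
        buffer_ += report;
        buffer_ += '\n';
        if (buffer_.size() >= max_bytes_) flush();
    }

    // Hand over whatever has accumulated; does nothing for an empty buffer.
    void flush() {
        if (buffer_.empty()) return;
        sink_(buffer_, fnv1a64(buffer_));
        buffer_.clear();
    }

private:
    std::size_t max_bytes_;
    Sink sink_;
    std::string buffer_;
};
```

With a threshold of a few kilobytes, the number of messages crossing the process boundary drops by orders of magnitude, which is the point of the suggestion: the transport then matters much less.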

If you stay in the Windows world (none of your machines runs Linux or anything else), named pipes are a good choice, because they are fast and can be accessed across machine boundaries. I think shared memory is out of the race, because it can't cross the machine boundary. Distributed COM lets you formulate the contract in IDL, but I think XML messages via pipes are also OK. XML messages have the benefit of being completely independent of the channel. If you need Linux later, you can switch to a TCP/IP transport and keep sending your XML messages.
Some additional techniques with limitations:
Another forgotten but hot candidate is RPC (remote procedure calls). A lot of Windows services rely on it, but I think RPC is hard to program.
If you are on the same machine and only need to send some status information, you can register a window message via RegisterWindowMessage() and send messages via SendMessage().

Apart from all the suggestions from Thomas, you might also just use a common database to store the results. And if that is too slow, use one of the more modern (and fast) key/value databases (like Tokyo Cabinet, MemcacheDB, etc.).

This sounds like a lot of overkill for the task of verifying the files used in a build. How about just scanning the build files, or capturing the output from the build tools?

Related

logging facilities for realtime and non realtime applications

We're developing both standard and real-time applications that run on RT-Linux.
The question is: what would be an efficient way of logging application traces from both real-time and non-real-time processes?
By efficient, I mean that logging application traces shouldn't cause a real-time performance hit by increasing latency, etc.
Traces should ideally be stored in a single file with timestamps, to make it easier to track interactions between processes.

For real-time logging I'd advise using a different approach than bare logging to files. Writing a lot of information to files will hurt your performance.
I can suggest other, lighter mechanisms:
Use statistics/counters to get a feeling for what your application is doing.
Write/encode logs in some binary format to be processed offline. This binary format may be more compact and thus lighter.
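To make the two suggestions concrete, here is a minimal sketch. The record layout, the names TraceRecord, append_record, read_record and g_frames_processed, and the idea of mapping event ids to names offline are all assumptions for illustration, not an established format.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size trace record; the field layout is an assumption.
struct TraceRecord {
    std::uint64_t timestamp_ns;  // e.g. taken from a monotonic clock
    std::uint32_t event_id;      // looked up in a table of event names offline
    std::uint32_t value;         // event-specific payload
};
static_assert(sizeof(TraceRecord) == 16, "record is expected to pack into 16 bytes");

// A counter that a real-time thread can bump without taking a lock.
std::atomic<std::uint64_t> g_frames_processed{0};

// Append the raw bytes of one record to an in-memory buffer.
// Reserve the buffer's capacity up front: a reallocation inside insert()
// would allocate memory, which a real-time path usually wants to avoid.
inline void append_record(std::vector<unsigned char>& buf, const TraceRecord& r) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&r);
    buf.insert(buf.end(), p, p + sizeof r);
}

// Offline side: decode the record at position `index`.
inline TraceRecord read_record(const std::vector<unsigned char>& buf, std::size_t index) {
    assert((index + 1) * sizeof(TraceRecord) <= buf.size());
    TraceRecord r;
    std::memcpy(&r, buf.data() + index * sizeof r, sizeof r);
    return r;
}
```

Sixteen bytes per event is far less than a formatted text line, and a lower-priority thread or process can write the buffer to disk later. The raw layout is only portable between machines with the same endianness and compiler ABI.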

Since you are on Linux, you can use syslog():
openlog() opens a connection to the system logger for a program.
This means your program forwards messages to another program, which can run at low priority.
If you want something fancier, then look at Boost logging.
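A minimal sketch of the syslog() route, using only the standard openlog()/syslog()/closelog() calls. The identity string "rt-demo", the wrapper function names, and the message text are placeholders.

```cpp
#include <syslog.h>

// Open the connection to the system logger once at start-up.
// LOG_NDELAY opens it immediately rather than on the first message,
// and LOG_PID adds the process id to every line.
inline void init_logging() {
    openlog("rt-demo", LOG_PID | LOG_NDELAY, LOG_USER);
}

// Forward one trace line to the logger; the daemon adds the timestamp.
inline void log_trace(int value) {
    syslog(LOG_INFO, "sample processed, value=%d", value);
}

// Close the connection at shutdown.
inline void shutdown_logging() {
    closelog();
}
```

Because every process logs through the same daemon, traces from real-time and non-real-time processes end up in one timestamped file. The call still has a cost in the caller, so it is worth measuring before using it in the tightest loops.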

ZeroC ICE vs 0MQ/ZeroMQ vs Crossroads IO vs Open Source DDS

How does ZeroC ICE compare to 0MQ? I know that 0MQ/Crossroads and DDS are very similar, but I can't seem to figure out where ICE comes in.
I need to quickly implement a system that offloads real-time market-data from C++ to C#, as a first phase of my project. The next phase will be to implement an Event Based architecture with an underlying Pub/Sub design.
I am willing to use TCP, but the system is currently running on a single 24-core server, so an IPC option would be nice. From what I understand, ICE is TCP only, while DDS and 0MQ have an IPC option.
Currently, I am leaning towards using Protobuf with either ICE or Crossroads IO. I got turned off by the OpenSplice DDS website. I've done lots of research on the various options and was originally considering OpenMPI + boost::mpi, but there does not seem to be an MPI for .NET.
My question is:
How does ICE compare to 0MQ? I can't wrap my head around this, and I was unable to find anything online that compares the two.
thanks in advance.
........
More about my project:
Currently using CMake and C++ on Windows, but the plan is to move to CentOS at some point. An additional desired feature is to store the tick data and all the messages in a "NoSQL" database such as HBase/Hadoop or HDF5. Do any of these middleware/messaging/pub-sub libraries have any database integration?

Some thoughts about ZeroC:
Very fast; able to have multiple endpoints; able to load balance across the endpoints; able to reconnect to a different endpoint in case one of the nodes goes down (this is transparent to the end user); has a good tool chain (IceGrid, IceStorm, IceBox, etc.); distributed, high availability, multiple failover, etc.
Apart from that, I have used it for hot swapping code modules (something similar to Erlang) by having the client create the proxy with multiple endpoints, and later on bringing down each endpoint one by one for a quick upgrade. With the transparent retry to a different endpoint, I could keep the system up and running the whole time I did an upgrade. Not sure if this is an advertised feature or an unadvertised side effect :)
Overall, it is very easy to scale out your servers if need be using ZeroC Ice.
I know ZeroMQ provides a fantastic set of tools and messaging patterns, and I will keep using it for my pet projects. However, the problem I see is that it is very easy to go overboard and lose track of all your distributed components. Keeping track of them is a must-have in a distributed environment. How will you know where your clients/servers are when you need to upgrade? If one of the components down the chain does not receive a message, how do you identify where the issue is: the publisher, the client, or any one of the bridges (REP/REQ, XREP/XREQ, etc.) in between?
Overall, ZeroC provides a much better toolset and ecosystem for enterprise solutions.
And it is open source :)

Jaybny,
ZMQ:
If you want really good performance and the only job for Phase 1 of your project is to move data from C++ to C#, then ZMQ is the best option.
Having a pub/sub model for an event-driven architecture is also something that ZMQ can help you with, thanks to its built-in messaging patterns.
ZMQ also supports your IPC requirements in this case. E.g., you can have one instance of your application that uses all 24 cores by multithreading and communicating via IPC.
ZeroC Ice:
Ice is an RPC framework, very much like CORBA.
Eg.
Socket/ZMQ - You send message over the wire. Read it at the other end, parse the message, do some action, etc.
ZeroC Ice - Create a contract between client and server. The contract is nothing but a template of a class. The client calls a proxy method of that class, and the server implements it and returns the value. Thus, int result = mathClass.Add(10,20) is what the client calls. The method, parameters, etc. are marshalled and sent to the server; the server runs its Add method and returns the result, and the client gets 30. Thus, on the client side, the API is nothing but a proxy for a servant running on a remote host.
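To illustrate the proxy/servant idea in plain C++ (this is deliberately not the Ice API; MathProxy, MathServant, Transport and the byte layout are invented for the sketch, and a std::function stands in for the network):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<unsigned char>;
// Stand-in for the wire: takes a request, returns the reply.
using Transport = std::function<Bytes(const Bytes&)>;

// Marshal one 32-bit integer onto the end of a buffer (host byte order).
inline void put_i32(Bytes& b, std::int32_t v) {
    unsigned char raw[sizeof v];
    std::memcpy(raw, &v, sizeof v);
    b.insert(b.end(), raw, raw + sizeof v);
}

// Unmarshal one 32-bit integer starting at `offset`.
inline std::int32_t get_i32(const Bytes& b, std::size_t offset) {
    assert(offset + sizeof(std::int32_t) <= b.size());
    std::int32_t v;
    std::memcpy(&v, b.data() + offset, sizeof v);
    return v;
}

// Server side: the servant holds the real implementation.
class MathServant {
public:
    std::int32_t Add(std::int32_t a, std::int32_t b) { return a + b; }

    // Unmarshal the arguments, call the method, marshal the result.
    Bytes dispatch(const Bytes& request) {
        Bytes reply;
        put_i32(reply, Add(get_i32(request, 0), get_i32(request, 4)));
        return reply;
    }
};

// Client side: same method signature, but the body only marshals and sends.
class MathProxy {
public:
    explicit MathProxy(Transport t) : transport_(std::move(t)) {}

    std::int32_t Add(std::int32_t a, std::int32_t b) {
        Bytes request;
        put_i32(request, a);
        put_i32(request, b);
        return get_i32(transport_(request), 0);
    }

private:
    Transport transport_;
};
```

A client that constructs MathProxy with a transport forwarding to MathServant::dispatch can call Add(10, 20) and get 30 back without ever seeing the bytes. In a real RPC framework the proxy and dispatch code are generated from the contract instead of written by hand.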
Conclusion:
ZeroC ICE has some nice enterprise features which are really good. However, for your project requirements, ZMQ is the right tool.
Hope this helps.

For me, the correct answer was Crossroads I/O. It does everything I need, although I am still unable to pub/sub when using protobufs. I'm sure ZeroC ICE is great for distributed IPC, but 0MQ/Crossroads gives you the added flexibility of inter-thread communication.
Note: on Windows, 0MQ does not have IPC.
So, all in all, the Crossroads fork of 0MQ is the best, but you will have to roll your own Windows IPC (or use tcp::127..) and publisher-side topic filtering for pub/sub.

nanomsg, from the guy who wrote Crossroads and 0MQ (I think).
http://nanomsg.org/

How to communicate between two processes

Hi, I'm working on a C++ project that I'm trying to keep OS independent, and I have two processes which need to communicate. I was thinking about setting up a third process (possibly as a service?) to coordinate the other two, asynchronously.
Client 1 will tell the intermediate process when data is ready, and send the data to it. The intermediate process will then hold this data until client 2 tells it that it is ready for the data. If the intermediate process has not received new data from client 1, it will tell client 2 to wait.
Since I am trying to keep this OS independent, I don't really know what to use. I have looked into using MPI, but it doesn't really seem to fit this purpose. I have also looked into Boost.ASIO, named pipes, RPC and RCF. I'm currently programming on Windows, but I'd like to avoid using the Win32 API so that the code could potentially be compiled on Linux.
Here's a little more detail on the two processes.
We have a back-end process/model (client 1) that will receive initial inputs from a GUI (client 2, written in Qt) via the intermediate process. The model will then proceed to work until the end condition is met, sending data to the intermediate process as it becomes ready. The GUI will ask the intermediate process for data at regular intervals and will be told to wait if the model has not updated the data. As the data becomes available from the model, we also want to be able to keep any previous data from the current session for exporting to a file if the user chooses to do so (i.e., we'll want the GUI to issue a command to the interface to export (or load) the data).
My modification privileges for the back end/model are minimal, other than to adhere to the design outlined above. I have a decent amount of C++ experience but not much parallel/asynchronous application experience. Any help or direction is greatly appreciated.

Standard BSD TCP/IP sockets are mostly platform independent. They work, with some minor differences, on both Windows and Unices (like Linux).
PS: older versions of Windows do not support AF_UNIX sockets (support arrived in Windows 10, version 1803).

I'd check out the Boost.Interprocess library. If the two processes are on the same machine, it offers a number of different ways to communicate between processes, and does so in a platform-independent manner.

I am not sure whether you have considered the message format, but if you are sending structured data between processes you should consider looking at Google Protocol Buffers.
These relate to the content of the messages (what is passed) rather than how they are passed.
boost::asio is platform independent, although using it doesn't imply C++ at both ends. Of course, when you are using C++ you can use boost::asio as your transport.
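Whichever transport is picked, the hold-until-asked behaviour of the intermediate process described in the question is small. A rough sketch using only the standard library; the class name Mailbox and its methods are made up, and a real version would sit behind the socket or message-queue layer:

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical core of the intermediate process: holds the latest data from
// the model (client 1) until the GUI (client 2) asks for it.
class Mailbox {
public:
    // Called on behalf of client 1 whenever new data is ready.
    void publish(const std::string& data) {
        std::lock_guard<std::mutex> lock(mutex_);
        history_.push_back(data);  // kept for a later export of the session
        fresh_ = true;
    }

    // Called on behalf of client 2. Returns true and fills `out` when there
    // is data the GUI has not seen yet; returns false to mean "wait".
    bool poll(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) return false;
        out = history_.back();
        fresh_ = false;
        return true;
    }

    // Snapshot of everything received this session, e.g. for export.
    std::vector<std::string> history() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return history_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> history_;
    bool fresh_ = false;
};
```

The mutex only matters if the two clients are served from different threads. With a single-threaded event loop, for instance one driven by boost::asio, the same logic works without it.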

Crossplatform background service + GUI

This seems to be a typical application:
1. One part of the program should scan for audio files in the background and write tags to the database.
2. The other part makes search queries and shows results.
The application should be cross-platform.
So, the main search loop, including adding data to the database, is not a problem. The questions are:
1. What is the best way to implement this background service? Boost (asio) or Qt (services framework?)?
2. What is the best approach: making a native service wrapper using the mentioned libraries, or emulating it with a non-GUI application?
3. Should I connect the GUI to the service (how would they communicate using Boost or Qt?) or directly to the database (could there be locking problems?)?
4. Will the choice made in point 1 consume all the CPU? How can I avoid that? How can I make scanning for files less CPU-intensive?

I like to use Poco which has a convenient ServerApplication class, which can be used in an application that can be run as either a normal command-line application, or as a Windows service, or as a *nix daemon without having to touch the code.
If you use a "real" database (MySQL, PostgreSQL, SQL Server), then querying the database from the GUI application is probably fine and easier to do. If you use another type of database that isn't necessarily multi-user friendly, then you should communicate with the service using loopback sockets or pipes.
As far as CPU usage goes, you could just use a bunch of "sleep" calls within the code that searches files to make sure it doesn't hog the CPU and I/O. Or use some kind of interval notification to allow it to search in chunks periodically.
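The sleep-between-chunks idea could look roughly like this. The function name scan_throttled and its parameters are invented for the sketch, and the list of paths is assumed to come from whatever directory walk the scanner already does:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Process `paths` in chunks of `chunk_size`, sleeping for `pause` between
// chunks so the scanner yields CPU and disk to other work.
// Returns the number of pauses taken.
inline std::size_t scan_throttled(
        const std::vector<std::string>& paths,
        const std::function<void(const std::string&)>& handle_file,
        std::size_t chunk_size,
        std::chrono::milliseconds pause) {
    assert(chunk_size > 0);
    std::size_t pauses = 0;
    for (std::size_t i = 0; i < paths.size(); ++i) {
        handle_file(paths[i]);  // e.g. read tags and write them to the database
        const bool chunk_done = (i + 1) % chunk_size == 0;
        const bool more_left = i + 1 < paths.size();
        if (chunk_done && more_left) {
            std::this_thread::sleep_for(pause);
            ++pauses;
        }
    }
    return pauses;
}
```

The chunk size and pause length are tuning knobs. Lowering the thread or process priority is a complementary option that lets the OS scheduler do the throttling instead.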

Flex4/AIR with NativeProcess: How to pass an image to the native process?

I am trying to make an AIR application that needs to pass an image (.jpg/.png) to a C++ app that does number crunching. (This needs to be done very often, like every 2-3 seconds.) I've managed to pass the image by saving it to disk via AIR and then opening this file with the C++ program (passing the filename as an argument to the C++ program), but this method is really slow, because it involves lots of disk I/O.
Is there a method to send an image directly to a native process?
Edit: There is a good Flash-C++ communication example at http://www.marijnspeelman.nl/blog/2008/03/06/face-detection-using-flash-and-c-revisited/ using sockets. The big problem with this method is that some firewall settings can block the communication (I get a Windows firewall warning when I start the app).

There are several ways to transmit data between two processes.
One of the most efficient, and easy to setup, is to use TCP sockets.
It means that your C/C++ program will listen for (TCP/HTTP) requests, and that your AIR program will send a request with all the data inside.
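TCP is a byte stream, so the C++ side needs some way to tell where one image ends and the next begins. A common approach, sketched here as an assumption rather than anything AIR prescribes, is to prefix each image with its length as a 4-byte big-endian integer. The function names frame_message and try_unframe are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Sender side: prefix the payload with its size as a 4-byte big-endian integer.
inline Bytes frame_message(const Bytes& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    Bytes out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<unsigned char>(n >> 24));
    out.push_back(static_cast<unsigned char>(n >> 16));
    out.push_back(static_cast<unsigned char>(n >> 8));
    out.push_back(static_cast<unsigned char>(n));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Receiver side: `stream` holds the bytes read from the socket so far.
// If it starts with one complete message, move that message into `payload`,
// drop it from `stream`, and return true; otherwise return false (read more).
inline bool try_unframe(Bytes& stream, Bytes& payload) {
    if (stream.size() < 4) return false;
    const std::uint32_t n = (static_cast<std::uint32_t>(stream[0]) << 24) |
                            (static_cast<std::uint32_t>(stream[1]) << 16) |
                            (static_cast<std::uint32_t>(stream[2]) << 8) |
                             static_cast<std::uint32_t>(stream[3]);
    if (stream.size() - 4 < n) return false;
    payload.assign(stream.begin() + 4, stream.begin() + 4 + n);
    stream.erase(stream.begin(), stream.begin() + 4 + n);
    return true;
}
```

The receive loop appends whatever recv() returns to the stream buffer and calls try_unframe until it returns false, so images split across several reads are handled naturally.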
