I am trying to understand interprocess communication (IPC) in CUDA and would like help applying it to a project I am working on.
I have an image acquisition system that provides N input images. Each raw input image is processed and then stored in a single variable called 'Result'. Four functions do the processing: Aprocess, Bprocess, Cprocess and Dprocess. Each time the system acquires a new image, these four functions are called to process it. The final image 'Result' is stored in Dprocess.
What I would like to do is:
Create a new process, 'process2', to which I can hand off the final image stored in 'Result' each time one is obtained, and have it placed in a buffer called 'Images'. I would like to do this for 10 images. 'process2' should wait for each new image and must not terminate, because the first process keeps calling the four functions to produce the next processed image.
What I have come across so far: cudaIpcGetMemHandle, cudaIpcOpenMemHandle and cudaIpcCloseMemHandle
Question: How do I use the above function names to achieve IPC?
The CUDA simpleIPC sample code demonstrates that.
There is also a brief mention of how to use CUDA IPC API in the programming guide.
Finally, the API itself is documented in the runtime API reference manual.
Note that this functionality requires a GPU of compute capability (cc) 2.0 or higher and a 64-bit Linux OS.
I'm using the MLT framework to create a video player for my app, in which users will be able to perform some small video edits for a specific task. I'm also using Qt for this app. I started with essentially the BuildOnMe example, which can be found here
The problem is the player crashes on videos after a certain time (always different).
At one point I printed the frame count to see whether the crash always happened on the same frame (it doesn't). When it crashed, it printed this: [mlt_pool] out of memory
Do I need to take care of memory management for MLT?
I'm using Qt 5.3
My code, in case it helps, can be found here (I didn't add the .h)
I found out that the problem came from the Mlt::Frame created in the function on_frame_show.
This frame needs to be deleted. In the example it is used by the Mac OpenGL class, but on Windows it is never used, so the undeleted frames quickly build up in memory.
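As a rough illustration of the fix described above, the sketch below hands each frame to a smart pointer so it is released as soon as the handler returns. The `Frame` struct is only a stand-in that counts live instances so the example compiles without MLT; `onFrameShow` and `deliverFrames` are hypothetical names, not part of the MLT API or the BuildOnMe example.

```cpp
#include <memory>

// Counts frames that are currently allocated (illustration only).
int g_liveFrames = 0;

// Stand-in for Mlt::Frame so the sketch compiles without MLT installed.
struct Frame {
    Frame() { ++g_liveFrames; }
    ~Frame() { --g_liveFrames; }
};

// Hypothetical handler: takes ownership of the frame it is given and
// releases it when the function returns, whether or not it was displayed.
void onFrameShow(Frame* raw) {
    std::unique_ptr<Frame> frame(raw);  // deleted automatically at scope exit
    // ... display or inspect *frame here ...
}

// Simulates the player delivering `count` frames to the handler.
int deliverFrames(int count) {
    for (int i = 0; i < count; ++i) {
        onFrameShow(new Frame());
    }
    return g_liveFrames;  // remains 0 when every frame is released
}
```

Without the `std::unique_ptr` (or an explicit `delete`), `g_liveFrames` would grow by one per delivered frame, which is the same pattern that eventually exhausts the pool.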
I am working on a Qt project that requires me to work with a MATLAB C++ shared library. I acquire images and need to do further processing on them later.
It is absolutely necessary that I acquire the images in C and then call MATLAB for processing whenever needed. My images arrive at high speed: about 100 frames per second.
The problem is that when I call MATLAB in a loop, I am able to process the acquired images, but not in real time. It takes one or two seconds between subsequent MATLAB calls. I assume it is discarding the other images and plotting only some of them.
Can you suggest a way to call the MATLAB function only once and have my inputs change in real time? I don't intend to use the MATLAB Engine, because that would require MATLAB to be installed on every computer my project runs on.
Are you creating a library from MATLAB code using MATLAB Compiler, and expecting to be able to call it 100 times per second?
That's not going to happen - the overhead of calling the library is too high. It sounds like your library might also be doing some plotting, which is likely to take too long as well.
You could perhaps look into using MATLAB Coder to convert your MATLAB image processing algorithm to C code, and then integrate the C code directly into your main code. Much of Image Processing Toolbox is supported by MATLAB Coder, as is Computer Vision System Toolbox and much of the Signal Processing-related toolboxes.
I need help with the following situation.
My project uses Simulink to simulate a robot. The outputs of the Simulink model are the robot's position and torques at each timestep. My problem is with data collection. I plan to use a buffer to store the Simulink output, and to use another MATLAB function to read from the same buffer for online data analysis. The requirement is that the Simulink model and the MATLAB data analysis function run simultaneously, and that the MATLAB data analysis function decides when to take data out of the buffer. This is like a producer-consumer problem, where Simulink is the producer and the MATLAB data analysis function is the consumer.
My question is how to protect the buffer with mutual exclusion. I do not want to use the To Workspace block, because it only updates data when the simulation is paused or stopped. I cannot find any semaphore or mutex-like structure provided by MATLAB or Simulink. I have tried the following approaches, but none of them works:
I have tried the Queue and Buffer blocks in the DSP toolbox. These two blocks provide mutual exclusion, but the size of the output data changes during the simulation. Basically, when the MATLAB function collects data, it takes all the data stored in the buffer at that moment, whereas the Buffer block seems to output items one by one at each Simulink timestep.
I have tried to implement a queue with a persistent variable in an embedded function. When the MATLAB function wants to collect data, it flips a signal flag to tell Simulink to output the data to the workspace. With this method, however, the MATLAB function needs two calls to get the data: the first call flips the flag and returns, and the second call searches the workspace for the data output by Simulink. My advisor rejected this method because it is not elegant.
I think RTW may solve this problem, but the Simulink model and the MATLAB analysis function change often, so for debugging purposes I plan not to convert the Simulink model to C/C++. I wonder, though, whether I can implement a mutex in C and call it from both Simulink and MATLAB. If so, how would I do this?
I really hope someone can help me out. Any suggestion is appreciated. By the way, I am using Linux.
Have a look at Access Block Data During Simulation in the Simulink documentation and also at Simulink Signal Viewing using Event Listeners and a MATLAB UI on the File Exchange. I think this will do what you want.
Hi, I'm working on a C++ project that I'm trying to keep OS independent, and I have two processes which need to communicate. I was thinking about setting up a third process (possibly as a service?) to coordinate the other two asynchronously.
Client 1 will tell the intermediate process when data is ready, and send the data to it. The intermediate process will then hold this data until client 2 tells it that it is ready for the data. If the intermediate process has not received new data from client 1, it will tell client 2 to wait.
Since I am trying to keep this OS independent, I don't really know what to use. I have looked into MPI, but it doesn't really seem to fit this purpose. I have also looked into Boost.ASIO, named pipes, RPC and RCF. I'm currently programming on Windows, but I'd like to avoid using the Windows API so that the code could potentially be compiled on Linux.
Here's a little more detail on the two processes.
We have a back end process/model (client 1) that will receive initial inputs from a GUI (client 2, written in Qt) via the intermediate process. The model will then work until the end condition is met, sending data to the intermediate process as it becomes ready. The GUI will ask the intermediate process for data at regular intervals and will be told to wait if the model has not updated the data. As data becomes available from the model, we also want to keep any previous data from the current session so it can be exported to a file if the user chooses (i.e., we'll want the GUI to issue a command to the interface to export (or load) the data).
My modification privileges for the back end/model are minimal, other than to adhere to the design outlined above. I have a decent amount of C++ experience but not much experience with parallel/asynchronous applications. Any help or direction is greatly appreciated.
Standard BSD TCP/IP sockets are mostly platform independent. They work, with some minor differences, on both Windows and Unices (like Linux).
PS: Windows historically did not support AF_UNIX sockets; support was added only in later Windows 10 releases.
I'd check out the Boost.Interprocess library. If the two processes are on the same machine, it offers a number of different ways to communicate between them, and does so in a platform-independent manner.
I am not sure whether you have considered the message format, but if you are sending structured data between processes you should consider looking at Google Protocol Buffers.
These relate to the content of the messages (what is passed) rather than how they are passed.
boost::asio is platform independent, and using it does not require C++ at both ends. Of course, when you are using C++ you can use boost::asio as your transport.
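Whichever transport is chosen, the hand-off logic the question describes (hold the newest data until the GUI asks, answer "wait" otherwise, keep the session history for export) is independent of it. A minimal sketch of that state follows; the class name `Mailbox`, its methods, and the use of `std::string` as the payload are all assumptions made for illustration, and the sockets or Boost code that would feed it is deliberately left out.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical broker state for the intermediate process.
class Mailbox {
public:
    // Called when the model (client 1) reports that data is ready.
    void put(std::string data) {
        std::lock_guard<std::mutex> lock(mutex_);
        history_.push_back(data);      // keep a copy for later export
        pending_ = std::move(data);    // newest data replaces any unread data
        hasPending_ = true;
    }

    // Called when the GUI (client 2) polls. Returns false for "wait",
    // true when `out` has been filled with data not yet delivered.
    bool tryTake(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasPending_) {
            return false;
        }
        out = std::move(pending_);
        pending_.clear();
        hasPending_ = false;
        return true;
    }

    // Everything received this session, e.g. for writing to a file.
    std::vector<std::string> history() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return history_;
    }

private:
    mutable std::mutex mutex_;
    std::string pending_;
    bool hasPending_ = false;
    std::vector<std::string> history_;
};
```

In this sketch a second `put` before the GUI polls overwrites the unread value while still recording both in the history; a queue would be the alternative if the GUI must see every update.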
The Nikon SDK provides a request/response system from PC to camera over USB, using the C programming language. When I create two camera objects in two separate threads, it is not possible to send two commands simultaneously to two separate cameras. One camera gets its command and sends back the response, and only then does the second camera get its command and send back a response. I think this is because the DLL that the Nikon SDK accesses uses global variables. The DLL is not open source, so I cannot change or verify this. I did make two separate copies of the DLL, and each thread accesses a separate copy. Is it possible to send two commands and get the responses back at the same time?
Even though you made two copies of the DLL, they are both being loaded into the same address space / process, so any conflicts will still overlap.
The first thing I would try is two separate EXEs, each loading the original DLL, so that they are running in different processes. If this allows the two cameras to be controlled independently and simultaneously, you will just need to build some kind of process isolation system :-)
The only way I know to do this (and it's not easy) is to build a COM wrapper around the Nikon DLLs and use IIS to isolate the two instances into their own processes. A slightly easier approach might be to build your own "server" for each camera, running in an EXE process, and send messages to it (maybe just Windows messages) from a third master process.
A brute force solution would be to run each process in its own virtual machine using VMWare Workstation or a similar virtual PC architecture. Of course, now you've got the problem of communicating between two virtual PCs...
Those md3 files are not thread safe and contain static functions. I got this working on the Nikon SDK by dynamically creating a new copy of the md3 file each time a camera is connected. I had one main md3 for detecting the cameras and then would create a new md3 each time I connected.
Finally, make sure your class is thread safe and contains no global or static functions. I recommend encasing the base Nikon code in a class. If you're writing a third-party DLL that requires static functions, use a pointer to the Nikon class: for each static call, pass the void* object that is created by your constructor.
The first thing I would try is spawning two instances of your application, one for each camera.
I'm not sure exactly what you're trying to accomplish. Does the answer take too long, so that you want to get the answers at the same time? Why not simply create a wrapper that makes each question/answer exchange synchronous, so that you can access the SDK from any thread? If thread X is waiting for a response and thread Y makes a request, thread Y will wait until thread X has received its response and then make its own request.
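The wrapper idea in the last answer could look roughly like the sketch below. The function `sdkSendCommand` is a hypothetical stand-in for whatever non-thread-safe SDK call is being wrapped (the real Nikon SDK functions and signatures are not shown in this thread), and `SerializedSdk` is an illustrative name. Note that this serializes the exchanges; it does not make the two cameras respond in parallel.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for a non-thread-safe SDK entry point.
std::string sdkSendCommand(int cameraId, const std::string& command) {
    return "camera " + std::to_string(cameraId) + ": " + command + " ok";
}

// Wrapper that lets any thread issue requests while ensuring that only
// one request/response exchange is in flight at a time.
class SerializedSdk {
public:
    std::string request(int cameraId, const std::string& command) {
        // A second caller blocks here until the first exchange finishes.
        std::lock_guard<std::mutex> lock(mutex_);
        return sdkSendCommand(cameraId, command);
    }

private:
    std::mutex mutex_;
};
```

Each thread would share one `SerializedSdk` instance and call `request` with its own camera id; the lock is released as soon as the response is returned.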