Running multiprocess applications from MATLAB - c++

I've written a multiprocess application in VC++ and tried to run it, with command-line arguments, using the system command from MATLAB. It runs, but only on one core. Any suggestions?
Update: In fact, it doesn't even see the second core. I'm using OpenMP, and I checked with omp_get_max_threads() and omp_get_thread_num(): omp_get_max_threads() returns 1 when I launch the application from MATLAB, but 2 (as expected) when I run it from the command window.
Question: My task manager reports that CPU usage is close to 100%. Could this mean that the aforementioned API is malfunctioning and it's still running as a multiprocess application?
Confirmation:
I used Process Explorer to check if there were any differences in the number of threads.
When I call the application from the command window, 1 thread goes to cmd.exe and 2 go to my application.
When I call it from MATLAB, 26 threads are for MATLAB.exe, 1 for cmd.exe and 1 for my application.
Any ideas?

The question is how Matlab is affecting your app's behavior, since it's a separate process. I suspect Matlab is modifying environment variables in a manner that affects OMP, maybe because it uses OMP internally, and the process you are spawning from Matlab is inheriting this modified environment.
Do a "set > plain.txt" from the command window where you're launching your app plain, and "system('set > from_matlab.txt')" from within Matlab, and diff the outputs. This will show you the differences in environment variables that Matlab is introducing. When I do this, the following appears in the environment inherited from Matlab, but not in the plain command window's environment:
OMP_NUM_THREADS=1
That looks like an OpenMP setting related to the function calls in your question. I'll bet your spawned app is seeing that and respecting it.
I don't know why Matlab is setting it. But as a workaround, when you launch the app from Matlab, instead of calling it directly, call a wrapper .bat file that clears the OMP_NUM_THREADS environment variable, or sets it to a higher number.
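If the application can be rebuilt, it can also report what it inherited, which confirms the diagnosis without diffing environments. The following is a minimal sketch, not part of OpenMP: `parse_thread_limit` and `inherited_omp_thread_limit` are hypothetical helper names, and the sketch only handles the simple case where OMP_NUM_THREADS holds a single positive integer.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>

// Parses a value such as "4". Returns 0 for anything that is not a single
// positive integer (for example a comma-separated list).
int parse_thread_limit(const char* raw) {
    assert(raw != nullptr);
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    const bool whole_string_consumed = (end != raw && *end == '\0');
    if (!whole_string_consumed || value <= 0 ||
        value > std::numeric_limits<int>::max()) {
        return 0;
    }
    return static_cast<int>(value);
}

// Returns the limit inherited through OMP_NUM_THREADS, or 0 when the
// variable is unset or not a plain positive integer.
int inherited_omp_thread_limit() {
    const char* raw = std::getenv("OMP_NUM_THREADS");
    return raw != nullptr ? parse_thread_limit(raw) : 0;
}
```

Logging the result of `inherited_omp_thread_limit()` at startup should show 1 when launched from Matlab and 0 from a plain command window, if the environment difference described above is the cause.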

Run the command outside of Matlab and see how many cores it's using. There should be no difference when running it from within Matlab, because it's just a call down to the operating system, i.e. equivalent to running it on the command line.
EDIT
Ok odd, what do you get when you call feature('NumCores') ? What version of Matlab are you using?
Does it help to enable this?

You have to execute this at the MATLAB command line:
setenv OMP_NUM_THREADS 4
if you want to use 4 threads.

Related

How to implement google test sharding in c++?

I want to parallelise my googletest cases in c++.
I have read the documentation on GoogleTest sharding, but I am unable to implement it in a C++ coding environment.
I'm new to coding, so can anyone please explain the documentation in the link below using code?
https://github.com/google/googletest/blob/master/googletest/docs/advanced.md
Does GoogleTest sharding only work across different machines, or can it be implemented on the same machine using multiple threads?
Sharding isn't done in code, it's done using the environment. Each machine sets two environment variables: GTEST_TOTAL_SHARDS, which is the total number of machines you are running on, and GTEST_SHARD_INDEX, which is unique to each machine. When GTEST starts up, it uses them to select a subset of the tests to run.
If you want to simulate this, then you need to set these environment variables (which can be done in code).
I would probably try something like this (on Windows) in a .bat file:
set GTEST_TOTAL_SHARDS=10
FOR /L %%I in (1,1,10) DO cmd.exe /c "set GTEST_SHARD_INDEX=%%I && start mytest.exe"
And hope that the new cmd instance has its own environment.
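To make the role of the two variables concrete, here is a small illustration of how a test binary could pick its share of the tests from them. It is a sketch under assumptions: the round-robin rule (test number i goes to shard i % total) is used here for illustration and should not be relied on as GoogleTest's documented behavior, and `env_int` and `tests_for_this_shard` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Reads an integer from the environment, or returns fallback when unset.
int env_int(const char* name, int fallback) {
    const char* raw = std::getenv(name);
    if (raw == nullptr || *raw == '\0') return fallback;
    return std::atoi(raw);
}

// Round-robin partition: test number i belongs to shard (i % total).
std::vector<std::string> tests_for_this_shard(
        const std::vector<std::string>& all_tests) {
    const int total = env_int("GTEST_TOTAL_SHARDS", 1);
    const int index = env_int("GTEST_SHARD_INDEX", 0);
    // Valid shard indices run from 0 to total - 1.
    assert(total > 0 && index >= 0 && index < total);

    std::vector<std::string> selected;
    for (std::size_t i = 0; i < all_tests.size(); ++i) {
        if (static_cast<int>(i % total) == index) {
            selected.push_back(all_tests[i]);
        }
    }
    return selected;
}
```

Because every process applies the same rule to the same list, each test lands in exactly one shard as long as every index from 0 to total - 1 is launched once.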
Running the following in a command window worked for me (very similar to James Poag's answer, but note change of range from "1,1,10" to "0,1,9", "%%" -> "%" and "set" to "set /A"):
set GTEST_TOTAL_SHARDS=10
FOR /L %I in (0,1,9) DO cmd.exe /c "set /A GTEST_SHARD_INDEX=%I && start mytests.exe"
After further experimentation it is also possible to do this in C++. But it is not straightforward and I did not find a portable way of doing it. I can't post the code as it was done at work.
Essentially, from main, create n new processes (where n is the number of cores available), capture the results from each shard, merge them and output to the screen.
To get each process running a different shard, the controller gives each child process the total number of shards and its own instance number.
This is done by retrieving and copying the current environment, and setting in the copy the two environment variables (GTEST_TOTAL_SHARDS and GTEST_SHARD_INDEX) as required. GTEST_TOTAL_SHARDS is always the same, but GTEST_SHARD_INDEX will be the instance number of the child.
Merging the results is tedious but straightforward string manipulation. I successfully managed to get a correct total at the end, adding up the results of all the separate shards.
I was using Windows, so used CreateProcessA to create the new processes, passing in the custom environment.
It turned out that creating new processes takes a significant amount of time, but my program was taking about 3 minutes to run, so there were good benefits to be had from running in parallel: the time came down to about 30 seconds on my 12-core PC.
Note that if this all seems overkill, there is a Python script which does what I have described here (I think; I haven't used it). This might be more straightforward.
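The environment-copying step described above can be sketched portably, since a Windows ANSI environment block is just "NAME=value" strings, each ending in a null character, with one extra null at the end. This is not the code from the answer (which was not posted); `make_shard_environment` is a hypothetical name, and the base environment is assumed to have already been read into a map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Builds an environment block for one child: a copy of the base
// environment with the two sharding variables set for that child.
std::vector<char> make_shard_environment(
        std::map<std::string, std::string> env,  // copied on purpose
        int total_shards, int shard_index) {
    assert(total_shards > 0 && shard_index >= 0 && shard_index < total_shards);
    env["GTEST_TOTAL_SHARDS"] = std::to_string(total_shards);
    env["GTEST_SHARD_INDEX"] = std::to_string(shard_index);

    std::vector<char> block;
    for (const auto& entry : env) {
        const std::string line = entry.first + "=" + entry.second;
        block.insert(block.end(), line.begin(), line.end());
        block.push_back('\0');  // terminates this entry
    }
    block.push_back('\0');  // terminates the whole block
    return block;
}
```

On Windows, `block.data()` would then be passed as the lpEnvironment argument of CreateProcessA, once per child with a different `shard_index`.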

DLL stop main thread when running

I call a DLL written in C++ (VS2012) from another program (LabVIEW); the DLL uploads a file to a server via FTP.
While the DLL is uploading the file (15MB) it does not let LabView continue with other tasks.
How could this problem be solved?
Regardless of what you have to do on the C++ side to make the call threadsafe, you will need to configure the call in LabVIEW not to run in the UI thread (which I believe is the default configuration, for safety reasons). Double click the node and select the run in any thread option.
Also, if you want to ensure running it in its own thread, you can put it in a separate VI and change the execution settings of that VI to run in a different execution system. LabVIEW doesn't give you direct control of threads, because it manages them on its own, but this should make the VI execute in a different thread.
FTP operations are long-running.
It is better to perform such operations in another thread.
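On the C++ side, moving the transfer to another thread could look roughly like the sketch below. Everything here is illustrative: `upload_file` is a stand-in that only sleeps instead of doing FTP, and `BackgroundUpload` is a hypothetical wrapper, not something LabVIEW or the original DLL provides.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>

// Stand-in for the real FTP transfer (hypothetical); it only sleeps.
bool upload_file(const std::string& path) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return !path.empty();
}

class BackgroundUpload {
public:
    // Starts the transfer on a worker thread and returns immediately.
    explicit BackgroundUpload(const std::string& path)
        : result_(std::async(std::launch::async, upload_file, path)) {}

    // Non-blocking poll: true once the worker has finished.
    bool finished() const {
        assert(result_.valid());
        return result_.wait_for(std::chrono::seconds(0)) ==
               std::future_status::ready;
    }

    // Blocks until the transfer is done and returns its outcome.
    // May be called only once.
    bool wait() {
        assert(result_.valid());
        return result_.get();
    }

private:
    std::future<bool> result_;
};
```

A DLL built this way would typically export one function that starts the upload and another that the caller polls, so the calling program is never blocked for the duration of the transfer.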

how to run one program created exe file from another program in turbo c?

I have developed a program with the Dev-C++ compiler; the source file is CorrectPrgm.cpp, which builds CorrectPrgm.exe. I want to run CorrectPrgm.exe from Le.cpp, which was developed with the Turbo C++ 3.0 compiler: when Le.cpp runs, it should invoke CorrectPrgm.exe. CorrectPrgm accepts a file name from the user and outputs a list of tokens.
I have tried this:
system("C:\\CorrectPrgm.EXE");
It does not work.
Is there any other way to call it?
Any help would be appreciated.
If you are on Windows Vista or above, you probably can't run it, as I believe this would be a 16-bit DOS application. If it's a 32-bit DOS app (protected mode through DPMI, but unlikely) then it might run too, but that was too long ago for me to remember how.
On Windows 7, you can install Windows XP mode (actually Virtual PC builtin kind of), and run it from there. XP still supports 16-bit apps.
I believe you can use one of the exec or spawn functions.
You can create a separate process for the program you want to invoke, but you will face a lot of problems. Firstly, CorrectPrgm.exe and Le.exe will execute in two separate processes, so you have to consider interprocess communication.
The best thing I'd suggest is to break the CorrectPrgm source file into functions and call the functions you need. You can even use a library and header file(s) to get the functionality of those functions.
You can also create threads, but then you have to design the threads (CorrectPrgm will run in one of them) very carefully.

time taken by forked child process

This is a sequel to my previous question. I am using fork to create child process. Inside child, I am giving command to run a process as follows:
if((childpid=fork())==0)
{
system("./runBinary ");
exit(1);
}
My runBinary has the functionality of measuring how much time it takes from start to finish.
What amazes me is that when I run runBinary directly on command-line, it takes ~60 seconds. However, when I run it as a child process, it takes more, like ~75 or more. Is there something which I can do or am currently doing wrong, which is leading to this?
Thanks for the help in advance.
MORE DETAILS: I am running on a Linux RHEL server with 24 cores. I am measuring CPU time. At any one time I fork only 8 children (sequentially), each of which is bound to a different core using taskset (not shown in the code). The system is not loaded except for my own program.
The system() function invokes the shell. You can do anything inside it, including running a script. This gives you a lot of flexibility, but it comes at a price: you're loading a shell, and then runBinary inside it. I don't think loading the shell is responsible for so much of a time difference (15 seconds is a lot, after all), but since you don't seem to need the shell, just to run the app, try using something from the exec() family instead.
Without profiling the application it is hard to be sure, but if the parent process that forks has a large memory space, you might find that time is spent in the fork itself, in attempts to duplicate the memory space.
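A minimal sketch of the exec() route on Linux follows. `run_without_shell` is a hypothetical helper name; it uses execvp, so the program is looked up on PATH unless a path such as "./runBinary" is given.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs args[0] with the remaining arguments, without an intermediate shell.
// Returns the child's exit code, 127 if the program could not be executed,
// or -1 if fork/waitpid failed or the child was killed by a signal.
int run_without_shell(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const std::string& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);  // execvp expects a null-terminated array

    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if execvp failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this, the child in the question would call `run_without_shell({"./runBinary"})`, which starts runBinary directly instead of starting a shell that then starts runBinary.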
This isn't a problem in Red Hat Enterprise Linux 6, but was in earlier versions of Red Hat Enterprise Linux 5.

Run Linux commands from Daemon

I need to run a Linux command such as "df" from my Linux daemon to find the free space, used space, total size of the partition and other info. I have options like calling system, exec, popen etc.
But since each of these spawns a new process, is it not possible to run the commands in the same process from which they are invoked?
Also, I need to run this command from a Linux daemon, and my daemon should not hold any terminal. Will it affect my daemon's behavior?
Or is there any C or C++ standard API for getting information about the mounted partitions?
There is no standard API, as this is an OS-specific concept.
However,
You can parse /proc/mounts (or /etc/mtab) with (non-portable) getmntent/getmntent_r helper functions.
Using the information about mounted filesystems, you can get their statistics with statfs.
You may find it useful to explore the i3status program source code: http://code.stapelberg.de/git/i3status/tree/src/print_disk_info.c
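The two steps above can be combined into a short Linux-only sketch. Note one deliberate swap: it calls POSIX statvfs rather than statfs, because statvfs specifies that block counts are in units of f_frsize. `MountUsage`, `query_usage` and `list_mounts` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <mntent.h>
#include <string>
#include <sys/statvfs.h>
#include <vector>

struct MountUsage {
    std::string mount_point;
    unsigned long long total_bytes;
    unsigned long long free_bytes;  // space available to unprivileged users
};

// Fills usage for one mount point; returns false if statvfs fails.
bool query_usage(const std::string& mount_point, MountUsage& out) {
    struct statvfs info;
    if (statvfs(mount_point.c_str(), &info) != 0) return false;
    const unsigned long long unit = info.f_frsize;
    out.mount_point = mount_point;
    out.total_bytes = unit * info.f_blocks;
    out.free_bytes = unit * info.f_bavail;
    return true;
}

// Walks /proc/mounts and collects usage for every entry statvfs can read.
std::vector<MountUsage> list_mounts() {
    std::vector<MountUsage> result;
    FILE* table = setmntent("/proc/mounts", "r");
    if (table == nullptr) return result;
    while (struct mntent* entry = getmntent(table)) {
        assert(entry->mnt_dir != nullptr);
        MountUsage usage;
        if (query_usage(entry->mnt_dir, usage)) result.push_back(usage);
    }
    endmntent(table);
    return result;
}
```

Used space can be derived as `total_bytes - free_bytes`, although the figure may differ slightly from what df prints because of blocks reserved for root.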
To answer your other questions:
But since each of these spawns a new process, is it not possible to run the commands in the same process from which they are invoked?
No; entire 'commands' are self-contained programs that must run in their own process.
Depending upon how often you wish to execute your programs, fork();exec() is not so bad. There is no hard limit beyond which it would be better to gather the data yourself rather than execute a helper program. Once a minute, you're probably fine executing the commands. Once a second, you're probably better off gathering the data yourself. I'm not sure where the dividing line is.
Also, I need to run this command from a Linux daemon, and my daemon should not hold any terminal. Will it affect my daemon's behavior?
If the command calls setsid(2), then open(2) on a terminal without including O_NOCTTY, that terminal might become the controlling terminal for that process. But that wouldn't influence your program, because your program already disowned the terminal when becoming a daemon, and as the child process is a session leader, it cannot change your process's controlling terminal.