C++: repeated call of system() - c++

I need some help with calling an external program from C++ code.
I have to call javap.exe (from the JDK) from my program many times (probably more than 100), but calling system("javap.exe some_parameters") is extremely slow. It works well enough for one set of parameters, but repeated calls to system() are not acceptable. I think the cost comes from disk access and application startup (but I'm not sure).
What can I do for better performance? Can I "save javap.exe in RAM" and call it "directly"?
Or maybe somebody knows how I can get a Java class description and method signatures without javap.exe?

The Java VM is not cheap to start running, and it's likely that its initialization is eating up the lion's share of your time. Luckily, the functionality of javap is available directly through Java code. I suggest that you write a small Java application which, while similar to javap, does with one invocation what you would otherwise need thousands for. (Though... maybe you could already use just one? javap will take multiple class files, after all...)

Calling system() is easy, but very inefficient, primarily because you are not just launching whatever program you specify. Rather, you are launching one process (a shell), and that shell will examine your parameter and launch a second process.
If you're on a system that supports fork() and exec*(), you're going to improve performance by using them instead. As a pseudo-code example, consider:
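#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

void replace_system(const char *command)
{
    pid_t child = fork();
    if (child < 0) {
        perror("fork");
        return;
    }
    if (child) {
        /* this is the parent, wait for the child to finish */
        int status;
        while (waitpid(child, &status, 0) < 0 && errno == EINTR)
            ;   /* retry if interrupted by a signal */
        return;
    }
    /* this is the new process */
    exec*(...);   /* pseudo-code: pick an exec* variant and pass it the split-up command */
    perror("failed to start the child");
    _exit(127);
}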
Choose one of the exec* functions based on how you want to arrange the parameters. You'll need to break your string of arguments into components, and possibly provide an environment of your liking. Once you call the exec* function, that function will never return (unless there is an error starting the command you've defined for it).
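As a concrete version of the pseudo-code above, here is a minimal sketch that picks execvp() and takes the arguments already split into a vector. The helper name run_program and the choice of execvp are assumptions made for illustration, not part of the original answer, and it assumes a POSIX system.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run a program directly (no shell) and return its exit
// code. Returns 127 if the program could not be executed (the usual shell
// convention) and -1 if fork failed or the child was killed by a signal.
int run_program(const std::vector<std::string>& args)
{
    if (args.empty())
        return -1;

    // execvp wants a NULL-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t child = fork();
    if (child < 0) {
        std::perror("fork");
        return -1;
    }
    if (child == 0) {
        // Child: replace this process image with the requested program.
        execvp(argv[0], argv.data());
        std::perror("execvp");   // only reached if exec failed
        _exit(127);
    }

    // Parent: wait for the child, retrying if interrupted by a signal.
    int status = 0;
    while (waitpid(child, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_program({"javap", "-s", "SomeClass"}) would then start javap without an intermediate shell; the arguments shown are only an example.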
Beyond performance considerations, another reason to use this is that, if desired, it allows you to redirect the child's standard streams. For example, you might be interested in the output of a child; if you replace its stdout with a pipe available to you, you can simply read what it prints. Research source code for the standard popen() call to find an example of this.
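The redirection described above can be sketched with pipe(), fork() and dup2(). The helper name capture_stdout is hypothetical, error handling is deliberately thin (for example, an interrupted read() is not retried), and a POSIX system is assumed.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] with the given NULL-terminated argument
// array and return everything it wrote to its standard output.
std::string capture_stdout(char* const argv[])
{
    int fds[2];
    if (pipe(fds) < 0)
        return "";

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (child == 0) {
        // Child: make the write end of the pipe its stdout, then exec.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);   // only reached if exec failed
    }

    // Parent: close the unused write end so read() sees EOF once the child
    // has exited and closed its copy.
    close(fds[1]);
    std::string output;
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        output.append(buf, static_cast<size_t>(n));
    close(fds[0]);

    int status = 0;
    waitpid(child, &status, 0);   // reap the child to avoid a zombie
    return output;
}
```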

Related

Injecting dll before windows executes target TLS callbacks

There's an app that uses TLS callbacks to remap its memory (via NtCreateSection/NtUnmapViewOfSection/NtMapViewOfSection) with the SEC_NO_CHANGE flag.
Is there any way to hook NtCreateSection before the target app uses it in its TLS callback?
You could use API Monitor to check whether it really is that function call. If I understand you correctly, you want to modify its invocation; API Monitor allows you to modify the parameters on the fly. If just "patching" the value when the application accesses the API is enough, you could then use x64dbg to craft a persistent binary patch for your application. But this requires you to at least know, or get familiar with, basic x86/x64 assembler.
I have no idea what you're trying to achieve exactly, but if you're trying to execute setup code before the main() function is called (to set up hooks), you could use the constructor of a static object. You would basically construct an object before your main program starts.
// In a .cpp file (do not put in a header as that would create multiple static objects!)
#include <iostream>

class StaticInitializer {
public:
    StaticInitializer() {
        std::cout << "This will run before your main function...\n";
        /* This is where you would set up all your hooks */
    }
};
static StaticInitializer staticInitializer;
Beware though, as any object constructed this way might get constructed in any order depending on compilers, files order, etc. Also, some things might not be initialized yet and you might not be able to achieve what you want to setup.
That might be a good starting point, but as I said, I'm not sure exactly what you're trying to achieve here, so good luck and I hope it helps a little.

Is it possible to restart a program from inside a program?

I am developing a C++ program and it would be useful to have some function, script or something else that makes the program restart. It's a big program, so resetting all the variables manually would take me a long time...
I do not know if there is any way to achieve this or if it is possible.
If you really need to restart the whole program (i.e. to "close" and "open" again), the "proper" way would be to have a separate program with the sole purpose of restarting your main one. AFAIK a lot of applications with auto-update feature work this way. So when you need to restart your main program, you simply call the "restarter" one, and exit.
You can use a loop in your main function:
int main()
{
while(!i_want_to_exit_now) {
// code
}
}
Or, if you want to actually restart the program, run it from a harness:
program "$@"
while [ $? -eq 42 ]; do
    program "$@"
done
where 42 is a return code meaning "restart, please".
Then inside the program your restart function would look like this:
void restart() {
std::exit(42);
}
On Unices, or anywhere else you have execve and it works like the man page specifies, you can just have the program exec itself. (Kill me for using atoi, because it's generally awful, except for this sort of case.)
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <count>\n", argv[0]);
        return 1;
    }
    printf("arg: %s\n", argv[1]);
    int count = atoi(argv[1]);
    if ( getchar() == 'y' ) {
        ++count;
        char buf[20];
        snprintf(buf, sizeof buf, "%d", count);
        char* newargv[3];
        newargv[0] = argv[0];
        newargv[1] = buf;
        newargv[2] = NULL;
        execve(argv[0], newargv, NULL);
        perror("execve");   /* only reached if execve failed */
    }
    return count;
}
Example:
$ ./res 1
arg: 1
y
arg: 2
y
arg: 3
y
arg: 4
y
arg: 5
y
arg: 6
y
arg: 7
n
7 | $
(7 was the return code).
It neither recurses nor explicitly loops -- instead, it just calls itself, replacing its own memory space with a new version of itself.
In this way, the stack will never overflow, though all previous variables will be redeclared, just like with any reinvocation -- the getchar call prevents 100% CPU utilisation.
In the case of a self-updating binary, since the entire binary (at least, on Unix-likes, I don't know about Windows) will be copied into memory at runtime, then if the file changes on disk before the execve(argv[0], ... call, the new binary found on disk, not the same old one, will be run instead.
As @CarstenS and @bishop point out in the comments, due to the unique way in which Unix was designed, open file descriptors are kept across fork/exec. To avoid leaking open file descriptors across calls to execve, you should therefore either close them before execve or open them with close-on-exec set in the first place (the "e" mode flag, FD_CLOEXEC, or O_CLOEXEC) -- more information can be found on Dan Walsh's blog.
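For illustration, a minimal sketch of both ways of setting close-on-exec on a descriptor, assuming a POSIX system. The helper names set_cloexec and open_cloexec are made up for this example.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Mark an already-open descriptor close-on-exec so that it is not inherited
// by whatever program a later execve() starts. Returns true on success.
bool set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Alternatively, request the flag atomically when the file is opened.
// Returns the new descriptor, or -1 on failure.
int open_cloexec(const char* path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}
```

Setting the flag at open() time avoids the window in which another thread could fork and exec between the open() and the fcntl() call.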
This is a very OS-specific question. In Windows you can use the Application Restart API or MFC Restart Manager. In Linux you could do an exec().
However most of the time there is a better solution. You're likely better off using a loop, as suggested in other answers.
This sounds like the wrong approach, like all your state is global and so the only clear-cut method you have of resetting everything (other than to manually assign "default" values to each variable) is to restart the whole program.
Instead, your state should be held in objects (of class type, or whatever). You are then free to create and destroy these objects whenever you like. Each new object has a fresh state with "default" values.
Don't fight C++; use it!
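A minimal sketch of that idea, with a hypothetical AppState struct whose members are invented purely for illustration: "restarting" becomes replacing the old object with a freshly constructed one.

```cpp
#include <string>
#include <vector>

// Hypothetical application state; the member names are made up.
struct AppState {
    int score = 0;
    std::string player = "anonymous";
    std::vector<int> history;
};

// "Restarting" is just overwriting the old state with a default-constructed
// one, so every member goes back to its initial value.
void restart(AppState& state)
{
    state = AppState{};
}
```

New members added to AppState later are reset automatically, which is the main advantage over reassigning each global by hand.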
You probably need a loop:
int main()
{
while (true)
{
//.... Program....
}
}
Every time you need to restart, call continue; within the loop, and to end your program, use break;.
When I develop realtime systems my approach is normally a "derived main()" where I write all code called from a real main(), something like:
The main.cpp program:
int main (int argc, char *argv[])
{
while (true)
{
if (programMain(argc, argv) == 1)
break;
}
}
The programmain.cpp, where all code is written:
int programMain(int argc, char *argv[])
{
// Do whatever - the main logic goes here
// When you need to restart the program, call
return 0;
// When you need to exit the program, call
return 1;
}
That way, every time programMain() returns 0 the program is restarted, and it only exits when programMain() returns 1.
Detail: all variables, globals and logic must be written inside programMain()- nothing inside "main()" except the restart control.
This approach works in both Linux and Windows systems.
It sounds to me like you're asking the wrong question because you don't know enough about coding to ask the right question.
What it sounds like you're asking for is how to write some code where, on a missed call, it loops back round to the initial state and restarts the whole call/location sequence. In which case you need to use a state machine. Look up what that is, and how to write one. This is a key software concept, and you should know it if your teachers were any good at their job.
As a side note, if your program takes 5s to initialise all your variables, it's still going to take 5s when you restart it. You can't shortcut that. So from that it should be clear that you don't actually want to kill and restart your program, because then you'll get exactly the behaviour you don't want. With a state machine you could have one initialisation state for cold startup where the system has only just been turned on, and a second initialisation state for a warm restart.
Oh, and 6 threads is not very many! :)
Depending on what you mean by "restarting" the program, I can see a few simple solutions.
One is to embed your whole program in some "Program" class that essentially provides a loop containing your actual program. When you need to restart the program, you call a public static method "Restart" that starts the loop again.
You could also try to make a system-specific call that would start your program again, and exit.
As suggested in another answer, you could create a wrapper program for this sole purpose (and check the return code to know whether to quit or restart).
The other simple option is to use goto. I know that people will hate me for even mentioning it, but let's face it: we want to make a simple program, not use beautiful boilerplate. A backward goto guarantees destruction of the local objects it jumps past, so you could create a program with a label at the beginning, and some function "Restart" that just goes back to the beginning.
Whatever option you choose, document it well, so others (or you in the future) will use one WTF less.
PS. As mentioned by alain, goto will not destroy global or static objects; the same goes for the enclosing class. Therefore any approach that does not start a new program in place of the current one should either refrain from using global/static variables, or take proper action to reset them (although that might be tedious, as with each added static/global you need to modify the restart routine).
Simple and clean way to do this is to add a wire from an unused data pin to the RESET pin and set it low to reset! :-)

Create an executable that calls another executable?

I want to make a small application that runs another application multiple times for different input parameters.
Is this already done?
Is it wrong to use system("myAp param"), for each call (of course with different param value)?
I am using kdevelop on Linux-Ubuntu.
From your comments, I understand that instead of:
system("path/to/just_testing p1 p2");
I shall use:
execl("path/to/just_testing", "path/to/just_testing", "p1", "p2", (char *) 0);
Is it true? You are saying that execl is safer than system and it is better to use?
In the non-professional field, using system() is perfectly acceptable, but be warned, people will always tell you that it's "wrong." It's not wrong, it's a way of solving your problem without getting too complicated. It's a bit sloppy, yes, but certainly is still a usable (if a bit less portable) option. The data returned by the system() call will be the return value of the application you're calling. Based on the limited information in your post, I assume that's all you're really wanting to know.
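One detail worth knowing on POSIX systems such as Ubuntu: the int returned by system() is a wait status, so the application's exit code has to be extracted with the macros from <sys/wait.h>. A minimal sketch, with the helper name run_via_shell being an assumption for illustration:

```cpp
#include <cstdlib>
#include <sys/wait.h>

// Hypothetical helper: run a command line through the shell and return the
// command's exit code, or -1 if it could not be run or did not exit normally.
int run_via_shell(const char* command)
{
    int status = std::system(command);
    if (status == -1)
        return -1;                    // no child process could be created
    if (WIFEXITED(status))
        return WEXITSTATUS(status);   // normal termination: the exit code
    return -1;                        // e.g. terminated by a signal
}
```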
DIFFERENCES BETWEEN SYSTEM AND EXEC
system() will invoke the default command shell, which will execute the command passed as argument.
Your program will stop until the command is executed, then it'll continue.
The value you get back is the termination status of the command shell (in the format used by waitpid()), or -1 if no child process could be created, so the command's own exit code has to be extracted from it.
A plus of system() is that it's part of the standard library.
With exec(), your process (the calling process) is replaced. Moreover you cannot invoke a script or an internal command. You could follow a commonly used technique: Differences between fork and exec
So they are quite different (for further details you could see: Difference between "system" and "exec" in Linux?).
A fairer comparison is between POSIX posix_spawn() and system(). posix_spawn() is more complex, but it lets you read the external command's return code.
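A minimal sketch of the spawn-based approach, assuming a POSIX system. It uses posix_spawnp(), the variant that searches PATH, and the helper name spawn_and_wait is made up for this example.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern "C" char** environ;   // the current process environment

// Hypothetical helper: start argv[0] without a shell, wait for it, and return
// its exit code (or -1 if it could not be spawned or did not exit normally).
int spawn_and_wait(char* const argv[])
{
    pid_t pid;
    // posix_spawnp returns an error number on failure rather than -1/errno.
    int rc = posix_spawnp(&pid, argv[0], nullptr, nullptr, argv, environ);
    if (rc != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike exec(), the calling process keeps running, and unlike system(), no shell parses the arguments.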
SECURITY
system() (or popen()) can be a security risk since certain environment variables (like $IFS / $PATH) can be modified so that your program will execute external programs you never intended it to (i.e. a command is specified without a path name and the command processor path name resolution mechanism is accessible to an attacker).
Also the system() function can result in exploitable vulnerabilities:
when passing an unsanitized or improperly sanitized command string originating from a tainted source;
if a relative path to an executable is specified and control over the current working directory is accessible to an attacker;
if the specified executable program can be spoofed by an attacker.
For further details: ENV33-C. Do not call system()
Anyway... I like Somberdon's answer.

How to run another app in C++ and communicate with it, cross platform

I want to run another program from my C++ code. system() returns int, as every program can only return int to the os. However, the other program I want to call will generate a string that I need in my base app. How can I send it to the parent process?
The two apps will be in the same folder, so I think that the child app can save the string to "temp.txt" and then the main app may read and delete it (it's not performance critical process, I will call another process just to call open file dialog in my main opengl app). However this is a bit ugly solution, are there better cross platform solutions?
Thanks
You could use popen(), which opens a process that you can write data to or read data from. AFAIK this is also cross-platform.
// crt_popen.c
/* This program uses _popen and _pclose to receive a
 * stream of text from a system process.
 */
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    char psBuffer[128];
    FILE *pPipe;
    /* Run DIR so that it writes its output to a pipe. Open this
     * pipe with read text attribute so that we can read it
     * like a text file.
     */
    if((pPipe = _popen("dir *.c /on /p", "rt")) == NULL)
        exit(1);
    /* Read pipe until end of file or until an error occurs. */
    while(fgets(psBuffer, sizeof psBuffer, pPipe) != NULL)
        fputs(psBuffer, stdout);   /* never pass the data as a format string */
    /* Close pipe and print return value of pPipe. */
    printf("\nProcess returned %d\n", _pclose(pPipe));
    return 0;
}
Although it's not part of the C++ standard, nearly all reasonably current systems provide a popen (or _popen, in Microsoft's case) that will let you spawn a child process and read from its standard output as a C-style FILE * in the parent. At least if memory serves, popen is included in POSIX, so you can expect it to be present in essentially all Unix-like systems (and, as implied above, it's also available on Windows, at least with most compilers).
In other words, about the only place you'd likely encounter that it's not available would be something like a small embedded system where it might well be pretty meaningless (e.g., no file system to find the other executable in, and quite possibly no ability to create new processes either).
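For the original problem of getting a string back from the child, a minimal sketch using POSIX popen() could look like the following. The helper name read_command_output is an assumption; with Microsoft's runtime the calls would be _popen and _pclose instead, as noted above.

```cpp
#include <stdio.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: run a command through the shell and return everything
// it printed on its standard output.
std::string read_command_output(const std::string& command)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe)
        throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    // fgets returns NULL at end of file or on error, which ends the loop.
    while (fgets(buffer, sizeof buffer, pipe) != nullptr)
        output += buffer;

    pclose(pipe);   // also waits for the child to finish
    return output;
}
```

The child only has to print its result to stdout, so no temporary file is needed.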
Though there is no standard way of achieving interprocess communication, there is a relatively pain free library, ported to many OS/compilers: Boost.Interprocess. It covers most necessities:
Shared memory.
Memory-mapped files.
Semaphores, mutexes, condition variables and upgradable mutex types to place them in shared memory and memory mapped files.
Named versions of those synchronization objects, similar to UNIX/Windows sem_open/CreateSemaphore API.
File locking.
Relative pointers.
Message queues.

Logging code execution in C++

Having used gprof and callgrind many times, I have reached the (obvious) conclusion that I cannot use them efficiently when dealing with large (as in a CAD program that loads a whole car) programs. I was thinking that maybe, I could use some C/C++ MACRO magic and somehow build a simple (but nice) logging mechanism. For example, one can call a function using the following macro:
#define CALL_FUN(fun_name, ...) \
fun_name (__VA_ARGS__);
We could add some clocking/timing stuff before and after the function call, so that every function called with CALL_FUN gets timed, e.g
#define CALL_FUN(fun_name, ...) \
    time(&t0); \
    fun_name (__VA_ARGS__); \
    time(&t1);
The variables t0, t1 could be found in a global logging object. That logging object can also hold the calling graph for each function called through CALL_FUN. Afterwards, that object can be written in a (specifically formatted) file, and be parsed from some other program.
So here comes my (first) question: Do you find this approach tractable? If yes, how can it be enhanced, and if not, can you propose a better way to measure time and log call graphs?
A colleague proposed another approach to deal with this problem, which is annotating each function (that we care to log) with a specific comment. Then, during the make process, a special preprocessor must be run to parse each source file, add logging logic for each function we care to log, create a new source file with the newly added (logging) code, and build that code instead. I guess that reading CALL_FUN... macros (my proposal) all over the place is not the best approach, and his approach would solve this problem. So what is your opinion about this approach?
PS: I am not well versed in the pitfalls of C/C++ macros, so if this can be developed using another approach, please say so.
Thank you.
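For comparison, the CALL_FUN idea can also be sketched without macros, as a function template that times the call and forwards the result. The name timed_call and the choice of std::chrono::steady_clock are assumptions made for illustration, not something proposed in the thread.

```cpp
#include <chrono>
#include <utility>

// Hypothetical macro-free CALL_FUN: calls f(args...) and adds the elapsed
// wall-clock time to `total`. Works for functions returning void or a value.
template <typename F, typename... Args>
auto timed_call(std::chrono::nanoseconds& total, F&& f, Args&&... args)
    -> decltype(std::forward<F>(f)(std::forward<Args>(args)...))
{
    // The local stopwatch adds the elapsed time in its destructor, so the
    // measurement is taken even if f throws.
    struct Stopwatch {
        std::chrono::nanoseconds& total;
        std::chrono::steady_clock::time_point start;
        explicit Stopwatch(std::chrono::nanoseconds& t)
            : total(t), start(std::chrono::steady_clock::now()) {}
        ~Stopwatch()
        {
            total += std::chrono::duration_cast<std::chrono::nanoseconds>(
                std::chrono::steady_clock::now() - start);
        }
    };
    Stopwatch watch(total);
    return std::forward<F>(f)(std::forward<Args>(args)...);
}
```

Compared with the macro, the arguments are evaluated exactly once and the return value is passed through to the caller.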
Well, you could do some C++ magic to embed a logging object, something like:
class CDebug
{
public:
    CDebug() { /* ... log somehow ... */ }
    ~CDebug() { /* ... log somehow ... */ }
};
In your functions you then simply write:
void foo()
{
    CDebug dbg;
    ...
    // you could add some debug info
    dbg.heythishappened();
    ...
} // note: the dtor is called here, or earlier if the function is left by an early return or an exception
I am bit late, but here is what I am doing for this:
On Windows there is the /Gh compiler switch, which makes the compiler insert a call to a hidden _penter function at the start of each function. There is also a switch (/GH) for getting a _pexit call at the end of each function.
You can utilize this to get callbacks on each function call. Here is an article with more details and sample source code:
http://www.johnpanzer.com/aci_cuj/index.html
I am using this approach in my custom logging system for storing the last few thousand function calls in a ring buffer. This turned out to be useful for crash debugging (in combination with MiniDumps).
Some notes on this:
The performance impact very much depends on your callback code. You need to keep it as simple as possible.
You just need to store the function address and module base address in the log file. You can then later use the Debug Interface Access SDK to get the function name from the address (via the PDB file).
All this works surprisingly well for me.
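The ring buffer part of such a logger can be sketched as below. This is only an illustration of the data structure: the class name CallRing is hypothetical, the sketch is not thread-safe, and the _penter hook that would call record() is not shown.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size ring buffer that keeps the last N recorded function
// addresses; older entries are overwritten once the buffer is full.
template <std::size_t N>
class CallRing {
public:
    void record(const void* function_address)
    {
        entries_[next_ % N] = function_address;
        ++next_;
    }

    // Number of entries currently retained (at most N).
    std::size_t size() const { return next_ < N ? next_ : N; }

    // Entry i, where i = 0 is the oldest retained call.
    const void* at(std::size_t i) const
    {
        std::size_t first = next_ < N ? 0 : next_ - N;
        return entries_[(first + i) % N];
    }

private:
    std::array<const void*, N> entries_{};
    std::size_t next_ = 0;   // total number of calls recorded so far
};
```

Because record() is a single store and an increment, it stays cheap enough to be called on every function entry.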
Many nice industrial libraries have functions' declarations and definitions wrapped into void macros, just in case. If your code is already like that, go ahead and debug your performance problems with some simple asynchronous trace logger. If not, inserting such macros after the fact can be unacceptably time-consuming.
I can understand the pain of running a 1Mx1M matrix solver under valgrind, so I would suggest starting with the so-called "Monte Carlo profiling method": start the process and in parallel run pstack repeatedly, say once a second. As a result you will have N stack dumps (N can be quite significant). Then, the mathematical approach would be to count the relative frequency of each stack and draw a conclusion about the most frequent ones. In practice you either immediately see the bottleneck or, if not, you switch to bisection, gprof, and finally to valgrind's toolset.
Let me assume the reason you are doing this is you want to locate any performance problems (bottlenecks) so you can fix them to get higher performance.
As opposed to measuring speed or getting coverage info.
It seems you're thinking the way to do this is to log the history of function calls and measure how long each call takes.
There's a different approach.
It's based on the idea that mainly the program walks a big call tree.
If time is being wasted it is because the call tree is more bushy than necessary,
and during the time that's being wasted, the code that's doing the wasting is visible on the stack.
It can be terminal instructions, but more likely function calls, at almost any level of the stack.
Simply pausing the program under a debugger a few times will eventually display it.
Anything you see it doing, on more than one stack sample, if you can improve it, will speed up the program.
It works whether or not the time is being spent in CPU, I/O or anything else that consumes wall clock time.
What it doesn't show you is tons of stuff you don't need to know.
The only way it can not show you bottlenecks is if they are very small,
in which case the code is pretty near optimal.
Here's more of an explanation.
Although I think it will be hard to do anything better than gprof, you can create a special class LOG, for instance, and instantiate it at the beginning of each function you want to log.
class LOG {
public:
    LOG(const char* function, const char* file /* , ... */) {
        // log time_t of the beginning of the call
    }
    ~LOG() {
        // calculate the total time spent,
        // by difference between current time and that saved in the constructor
    }
};
void somefunction() {
    LOG log(__FUNCTION__, __FILE__ /* , ... */);
    // .. do other things
}
Now you can integrate this approach with the preprocessing one you mentioned. Just add something like this in the beginning of each function you want to log:
// ### LOG
and then you replace the string automatically in debug builds (shouldn't be hard).
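A runnable variant of the LOG idea might look like the sketch below. The names ScopeLog, CallStats and call_log are hypothetical, std::chrono is used instead of time_t for finer resolution, and the global table is not thread-safe.

```cpp
#include <chrono>
#include <map>
#include <string>

// Per-function totals collected by the scope logger.
struct CallStats {
    long calls = 0;
    std::chrono::nanoseconds total{0};
};

// Hypothetical global logging object: function name -> accumulated stats.
std::map<std::string, CallStats>& call_log()
{
    static std::map<std::string, CallStats> log;
    return log;
}

// Records the time between its construction and destruction, i.e. the time
// spent in the enclosing scope, under the given name.
class ScopeLog {
public:
    explicit ScopeLog(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopeLog()
    {
        CallStats& stats = call_log()[name_];
        ++stats.calls;
        stats.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start_);
    }

    ScopeLog(const ScopeLog&) = delete;
    ScopeLog& operator=(const ScopeLog&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

void some_function()
{
    ScopeLog log(__func__);   // one line at the top of each logged function
    // ... the work being measured ...
}
```

After the run, call_log() can be dumped to a file and parsed by another program, as the question suggests.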
Maybe you should use a profiler. AQTime is a relatively good one for Visual Studio. (If you have VS2010 Ultimate, you already have a profiler.)