How do I use the system() function in C++ if I want to run a command that takes a path, and my path has spaces in it?
Example code:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
int main()
{
std::string command = "ls -la /home/testuser/this is a folder/test/";
std::cout << "Click enter to execute command..." << std::endl;
getchar();
std::system(command.c_str());
return 0;
}
This doesn't work supposedly because the shell needs backspaces in front of a space.
Unfortunately this doesn't work either:
std::string command = "ls -la /home/testuser/this\b is\b a\b folder/test/";
Any idea what I'm doing wrong or how I could do it better? Thanks.
Ask yourself: how would you execute this command, yourself, from a shell prompt?
$ ls -la /home/testuser/this is a folder/test/
This, of course, will not work for the same reason your program fails. Instead, as every primer on shell scripting teaches you, you need to quote the parameter:
$ ls -la "/home/testuser/this is a folder/test/"
That will work, and you use system() in exactly the same way:
std::string command = "ls -la \"/home/testuser/this is a folder/test/\"";
But what's even better is not using system() in the first place. All system() is, for all practical purposes, is a fork(), followed by exec() in the child process, with the parent process wait()ing for the child process's termination.
The problem is that the child process exec()s the system shell, which parses the command according to its rules. This includes all the normal things that occur when executing the command via the shell directly: filename expansion, globbing, and other things.
If the string that gets passed to exec() includes any special shell characters, they'll be interpreted by the shell. In this case, you're intentionally using this to correctly parse the command string, in order to pass the correct arguments to /bin/ls.
When executing a specific, fixed command, this is fine. But when the actual command varies, or contains externally-specified parameters, it is your responsibility to correctly handle any special shell characters in order to get your intended result. Hilarity ensues, otherwise. In that situation, you will find that using fork() and exec() yourself will produce far more deterministic and reliable results, where you are in complete control of all arguments that get passed to the command being executed, instead of relying on the system shell to do it for you.
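As a rough illustration of the fork() and exec() approach described above, here is a minimal sketch. The helper name run_command is hypothetical, and the sketch assumes a POSIX system. Each element of the vector reaches the program as exactly one argument, so a path with spaces needs no quoting.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs a program directly, without involving the shell.
// args[0] is the program name (execvp() looks it up on PATH).
// Returns the program's exit status, or -1 if the child could not be created
// or did not exit normally.
int run_command(const std::vector<std::string>& args)
{
    assert(!args.empty());         // at least the program name is required

    // execvp() expects a null-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                 // fork() failed
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);                // only reached if execvp() failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this, the original command could be written as run_command({"ls", "-la", "/home/testuser/this is a folder/test/"}), and the whole path is passed to ls as a single argument.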
I am having trouble using system() from libc on Linux. My code is this:
system( "tar zxvOf some.tar.gz fileToExtract | sed 's/some text to remove//' > output" );
std::string line;
int count = 0;
std::ifstream inputFile( "output" );
while( std::getline( inputFile, line ) )
++count;
I run this snippet repeatedly and occasionally I find that count == 0 at the end of the run - no lines have been read from the file. I look at the file system and the file has the contents I would expect (greater than zero lines).
My question is should system() return when the entire command passed in has completed or does the presence of the pipe '|' mean system() can return before the part of the command after the pipe is completed?
I have explicitly not used a '&' to background any part of the command to system().
To further clarify, I do in practice run the code snippet multiple times in parallel, but the output file has a unique filename built from the thread ID and a static integer incremented per call to system(). I'm confident that the file being written and read is unique for each call to system().
According to the documentation:
The system() function shall not return until the child process has terminated.
Perhaps capture the output of "output" when it fails and see what it is? In addition, checking the return value of system would be a good idea. One scenario is that the shell command you are running is failing and you aren't checking the return value.
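Checking the return value of system() as suggested above might look like the following sketch. The helper name shell_exit_code is hypothetical, and the decoding relies on the POSIX wait-status macros.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/wait.h>

// Hypothetical helper: runs a shell command through system() and decodes the
// result. Returns the command's exit status (0-255), or -1 if the child
// process could not be created or the command was killed by a signal.
int shell_exit_code(const char* command)
{
    assert(command != nullptr);    // system(NULL) only asks whether a shell exists

    int status = std::system(command);
    if (status == -1)
        return -1;                 // child process could not be created
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                     // terminated by a signal
}
```

Note that the status of a shell pipeline is normally that of its last command, so a failing tar in front of sed would still be reported as 0 here.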
system(...) calls the standard shell to execute the command, and the shell itself should return only after the shell has regained control over the terminal. So if there's one of the programs backgrounded, system will return early.
Backgrounding happens through suffixing a command with & so check if the string you pass to system(...) contains any & and if so make sure they're properly quoted from shell processing.
system() will only return after its command has completed, and the file output should be readable in full after that. But ...
... multiple instances of your code snippet run in parallel would interfere because all use the same file output. If you just want to examine the contents of output and do not need the file itself, I would use popen instead of system. popen allows you to read the output of the pipe via a FILE*.
In case of a full file system, you could also see an empty output while the popen version would have no trouble with this condition.
To notice errors like a full file system, always check the return code of your calls (system, popen, ...). If there is an error the manpage will tell you to check errno. The number errno can be converted to a human readable text by strerror and output by perror.
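A minimal sketch of the popen approach, assuming only the line count is needed and not the file itself. The helper name count_output_lines is hypothetical; the command is still run by the shell, so pipes inside it keep working.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: runs a shell command with popen() and counts the lines
// it writes to stdout. Returns -1 if the pipe cannot be opened or the command
// does not finish with exit status 0.
int count_output_lines(const std::string& command)
{
    assert(!command.empty());

    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr)
        return -1;

    int count = 0;
    int last = '\n';
    int c;
    while ((c = std::fgetc(pipe)) != EOF) {
        if (c == '\n')
            ++count;
        last = c;
    }
    if (last != '\n')
        ++count;                   // final line without a trailing newline

    // pclose() waits for the command and returns its wait status.
    int status = pclose(pipe);
    return status == 0 ? count : -1;
}
```

Because the data is read straight from the pipe, no intermediate file has to be created, named uniquely, or cleaned up.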
On a Linux platform, I have C++ code that goes like this:
// ...
std::string myDir;
myDir = argv[1]; // myDir is initialized using user input from the command line.
std::string command;
command = "mkdir " + myDir;
if (system(command.c_str()) != 0) {
return 1;
}
// continue....
Is passing user input to a system() call safe at all?
Should the user input be escaped / sanitized?
How?
How could the above code be exploited for malicious purposes?
Thanks.
Just don't use system. Prefer execl.
execl ("/bin/mkdir", "mkdir", myDir.c_str(), (char *)0);
That way, myDir is always passed as a single argument to mkdir, and the shell isn't involved. Note that you need to fork if you use this method.
But if this is not just an example, you should use the mkdir C function:
mkdir(myDir.c_str(), someMode);
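For the mkdir case specifically, a sketch of the library-call route could look like this. The helper name make_directory and the default mode 0755 are assumptions, not part of the answer above.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: creates a directory with the mkdir() system call, so no
// shell and no child process are involved. The mode 0755 is an assumed
// default; the process umask is still applied to it.
// Returns 0 on success, or the errno value set by mkdir() on failure.
int make_directory(const std::string& path, mode_t mode = 0755)
{
    assert(!path.empty());

    if (::mkdir(path.c_str(), mode) != 0)
        return errno;
    return 0;
}
```

A name such as "somedir ; rm -rf /" is then just an unusual directory name, because nothing ever parses it as a command.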
Using system() call with command line parameters without sanitizing the input can be highly insecure.
The potential security threat could be a user passing the following as directory name
somedir ; rm -rf /
To prevent this, use a mixture of the following:
use getopt to ensure your input is sanitized
sanitize the input
use execl instead of system to execute the command
The best option would be to use all three.
Further to Matthew's answer, don't spawn a shell process unless you absolutely need it. If you use a fork/execl combination, individual parameters will never be parsed by a shell, so they don't need to be escaped. Beware of null characters, however, which will still prematurely terminate the parameter (this is not a security problem in some cases).
I assume mkdir is just an example, as mkdir can trivially be called from C++ much more easily than these subprocess suggestions.
Reviving this ancient question as I ran into the same problem and the top answers, based on fork() + execl(), weren't working for me. (They create a separate process, whereas I wanted to use async to launch the command in a thread and have the system call stay in-process to share state more easily.) So I'll give an alternative solution.
It's not usually safe to pass user input as-is, especially if the utility is designed to be sudo'd; in order to sanitize it, instead of composing the string to be executed yourself, use environment variables, which the shell has built-in escape mechanisms for.
For your example:
// ...
std::string myDir;
myDir = argv[1]; // myDir is initialized using user input from the command line.
setenv("MY_DIR", myDir.c_str(), 1);
if (system("mkdir \"${MY_DIR}\"") != 0) {
return 1;
}
// continue....
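Wrapped into a function, the environment-variable approach might look like the sketch below. The helper name mkdir_via_env is hypothetical, and the "--" argument is an addition that is not in the answer above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: creates a directory through the shell while passing the
// untrusted name in the environment variable MY_DIR instead of pasting it into
// the command text. Returns the raw status from system(), or -1 if setenv()
// fails.
int mkdir_via_env(const std::string& dir)
{
    assert(!dir.empty());

    if (setenv("MY_DIR", dir.c_str(), 1) != 0)
        return -1;

    // The command string is a constant. The shell expands ${MY_DIR} inside the
    // double quotes to exactly one argument and does not re-parse its contents.
    // "--" (an addition in this sketch) keeps a name that starts with '-' from
    // being read as an option.
    return std::system("mkdir -- \"${MY_DIR}\"");
}
```

One caveat: setenv() changes the environment of the whole process and is not safe to call concurrently from several threads, so a multithreaded caller would probably need to serialize these calls.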
I've got a GUI C++ program that takes a shell command from the user, calls forkpty() and execvp() to execute that command in a child process, while the parent (GUI) process reads the child process's stdout/stderr output and displays it in the GUI.
This all works nicely (under Linux and MacOS/X). For example, if the user enters "ls -l /foo", the GUI will display the contents of the /foo folder.
However, bash niceties like output redirection aren't handled. For example, if the user enters "echo bar > /foo/bar.txt", the child process will output the text "bar > /foo/bar.txt", instead of writing the text "bar" to the file "/foo/bar.txt".
Presumably this is because execvp() is running the executable command "echo" directly, instead of running /bin/bash and handing it the user's command to massage/preprocess.
My question is, what is the correct child process invocation to use, in order to make the system behave exactly as if the user had typed in his string at the bash prompt? I tried wrapping the user's command with a /bin/bash invocation, like this: /bin/bash -c the_string_the_user_entered, but that didn't seem to work. Any hints?
P.S. Just calling system() isn't a good option, since it would cause my GUI to block until the child process exits, and some child processes may not exit for a long time (if ever!)
If you want the shell to do the I/O redirection, you need to invoke the shell so it does the I/O redirection.
char *args[4];
args[0] = "bash";
args[1] = "-c";
args[2] = ...string containing command line with I/O redirection...;
args[3] = 0;
execv("/bin/bash", args);
Note the change from execvp() to execv(); you know where the shell is - at least, I gave it an absolute path - so the path-search is not relevant any more.
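Put together with fork(), the invocation above could be sketched as follows. The helper names spawn_shell and wait_for_exit are hypothetical, and plain fork() stands in for the forkpty() call used in the question. The parent gets the pid back immediately, so it does not block.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: starts `shell -c command` in a child process and
// returns at once with the child's pid (or -1 if fork() fails).
pid_t spawn_shell(const std::string& command, const char* shell = "/bin/bash")
{
    assert(shell != nullptr);

    pid_t pid = fork();
    if (pid != 0)
        return pid;                // parent (pid > 0) or fork failure (pid == -1)

    // Child: three strings plus the terminating null pointer, four slots in all.
    char* const args[] = {
        const_cast<char*>(shell),
        const_cast<char*>("-c"),
        const_cast<char*>(command.c_str()),
        nullptr
    };
    execv(shell, args);
    _exit(127);                    // only reached if execv() failed
}

// Hypothetical helper: blocks until the child exits and returns its exit
// status, or -1 if it did not exit normally.
int wait_for_exit(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A GUI would more likely poll with waitpid(pid, &status, WNOHANG) from its event loop than call a blocking helper like wait_for_exit.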
I am using execv() to run commands from /bin/ such as 'ls', 'pwd', 'echo' from my c++ program, and I am wondering what value I should provide in argv[0];
const char * path = getPath();
char ** argv = getArgs();
execv(path,argv);
argv[0] is supposed to be the program name. It's passed to the program's main function. Some programs differentiate their behavior depending on what string argv[0] is. For example the GNU bash shell will disable some of its features if called using sh instead of bash. Best give it the same value that you pass to path.
On Linux, argv[0] is the process name displayed by the top utility (which it probably gets from reading entries in /proc/).
argv[0] should be the full path of the command that you want to run.
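Following the advice to give argv[0] the same value as path, the argument array could be assembled roughly like this. The helper name build_argv is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the argument array for execv(). argv[0] is set
// to the same string as the path, followed by the real arguments and the
// terminating null pointer that execv() requires.
// The returned pointers refer to `path` and `args`, so both must outlive the
// returned vector.
std::vector<char*> build_argv(const std::string& path,
                              const std::vector<std::string>& args)
{
    assert(!path.empty());

    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);
    return argv;
}
```

The result would then be passed along as execv(path.c_str(), argv.data()), with path and args kept alive until that call.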
I know that this is not the answer you're looking for but is there a specific reason why you're doing this? The reason I ask is that most if not all of the actions people normally run with either system() or execv() are available in libraries on either Windows or Unix and are safer, faster and less likely to suffer from circumstantial errors. By that I mean, for example, when the PATH changes and suddenly your code stops working.
If you're passing in a string, either in whole or in part, and running it then you also leave yourself open to a user gaining access to the system by entering a command that could be damaging. E.g. imagine you've implemented a file search using find /home -name and your user types in:
"%" -exec rm {} \;
Ouch!
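To make the library alternative concrete, here is a small sketch of a name search done in-process with POSIX directory functions. The helper name find_by_name is hypothetical, and unlike find it only looks at a single directory and does not recurse.

```cpp
#include <cassert>
#include <dirent.h>
#include <fnmatch.h>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of the entries in `dir` that match
// the glob-style `pattern`. No command is executed, so the pattern is only
// ever treated as a pattern. Returns an empty list if `dir` cannot be opened.
std::vector<std::string> find_by_name(const std::string& dir,
                                      const std::string& pattern)
{
    assert(!dir.empty());

    std::vector<std::string> matches;
    DIR* handle = opendir(dir.c_str());
    if (handle == nullptr)
        return matches;

    while (dirent* entry = readdir(handle)) {
        const std::string name = entry->d_name;
        if (name == "." || name == "..")
            continue;
        if (fnmatch(pattern.c_str(), name.c_str(), 0) == 0)
            matches.push_back(name);
    }
    closedir(handle);
    return matches;
}
```

With this, the malicious input above is simply a pattern that matches nothing.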