c++ - When can a system() call to mv fail?

I read somewhere that one should use WEXITSTATUS to check the return status of system() calls.
However, I don't think that a call like system("mv /a/b/c /a/b/d") needs to be checked for failure.
Under what conditions can this call fail?

Some possibilities:
/a/b/c does not exist
/a/b does not exist
You have insufficient access to /a/b/c
You have insufficient access to /a/b/d
/a/b/d already exists
/a/b/c isn't moveable
There is no shell
mv does not exist
You have insufficient access to mv
You have no file system mounted
You have no storage available at all
And many, many more...

system("mv /a/b/c /a/b/d"): very probably both /a/b/c and /a/b/d lie on the same mounted file system. I am guessing you have a POSIX system, perhaps Linux.
Then it is much simpler to use the rename(2) syscall (which is called by /bin/mv when relevant!) and directly code:
if (rename("/a/b/c", "/a/b/d")) {
    perror("rename failed");
    exit(EXIT_FAILURE);
}
and you would have, through errno(3) (i.e. through perror(3)), the error code explaining why the rename failed. All the error conditions in rename(2) are relevant failure cases of mv; see also mv(1)!
Read Advanced Linux Programming.
If (for some strange reason, e.g. bind mounts, symlinks, ...) /a/b/c and /a/b/d don't lie in the same filesystem, you would get errno == EXDEV (and you might handle that case by a copy followed by an unlink of the source).
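The EXDEV fallback described above could be sketched roughly as follows. This is only an illustration under some assumptions: move_file is a hypothetical helper name, the fallback handles regular files only (not directories or special files), and C++17 std::filesystem is available.

```cpp
#include <cerrno>
#include <cstdio>        // std::rename
#include <filesystem>
#include <system_error>

// Hypothetical helper: move `from` to `to`. Tries rename(2) first and falls
// back to "copy, then remove the source" only when errno == EXDEV.
// Returns true on success; on failure `ec` holds the reason.
bool move_file(const char* from, const char* to, std::error_code& ec) {
    ec.clear();
    if (std::rename(from, to) == 0)
        return true;                              // same file system
    if (errno != EXDEV) {
        ec.assign(errno, std::generic_category()); // e.g. ENOENT, EACCES
        return false;
    }
    // Different file systems: copy the contents, then unlink the source.
    namespace fs = std::filesystem;
    if (!fs::copy_file(from, to, fs::copy_options::overwrite_existing, ec))
        return false;
    fs::remove(from, ec);
    return !ec;
}
```

Unlike rename(2), the copy-and-remove path is not atomic, so a crash in the middle can leave both files (or a partial destination) behind.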
In general, using system("mv ...") should be avoided; several other answers explain why. Also, your user may have a different mv in their PATH (or some alias), so mv should at least be /bin/mv. If . (or even worse /tmp!) is early in their PATH and contains a symlink mv ➙ /bin/rm, your user won't be happy!
BTW, you generally don't call system with a compile-time constant string starting with mv. That string is generally built at run time. And if you don't quote things correctly (imagine one argument being ; rm -rf $HOME), havoc can happen.
Also, system(3) can fail e.g. because fork(2) failed (too many user processes, reaching the RLIMIT_NPROC limit of setrlimit(2)...).
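If an external mv really is needed, one way to sidestep both the PATH lookup and the quoting problem is to skip the shell entirely and pass the arguments as an argv array. A minimal sketch, assuming a POSIX system; run_program is a hypothetical helper name, not an existing API:

```cpp
#include <cerrno>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: run the program at absolute path `path` with `args`,
// without going through /bin/sh. Returns the child's exit code, or -1 if
// fork/waitpid failed or the child was killed by a signal.
int run_program(const std::string& path, const std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));     // argv[0]
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);                              // execv needs a NULL end

    pid_t pid = fork();
    if (pid < 0)
        return -1;                    // e.g. EAGAIN when RLIMIT_NPROC is reached
    if (pid == 0) {
        execv(path.c_str(), argv.data());
        _exit(127);                   // only reached if execv failed
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_program("/bin/mv", {"--", source, destination}) then hands each string to mv as a single argument, so a value like ; rm -rf $HOME is treated as an odd file name rather than as shell syntax.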

Related

Mounting ecryptfs using C++ mount function

I am trying to mount ecryptfs from within a C++ program. I can definitely mount it without it asking questions by issuing this command at the prompt:
sudo mount -t ecryptfs -o "rw,key=passphrase:passphrase_passwd=geoff,ecryptfs_cipher=aes,ecryptfs_key_bytes=32,ecryptfs_passthrough=n,ecryptfs_enable_filename_crypto=n,no_sig_cache" ~/source/ ~/target/
Note that in reality, I am passing a full canonical path in case that matters.
But from within the program I get failure with errno=EINVAL after trying by using the mount() function with the same arguments:
mount("~/source/", "~/target/", "ecryptfs", MS_NODEV, "rw,key=passphrase:passphrase_passwd=geoff,ecryptfs_cipher=aes,ecryptfs_key_bytes=32,ecryptfs_passthrough=n,ecryptfs_enable_filename_crypto=n,no_sig_cache")
The program does launch with root privileges and I have checked that I have CAP_SYS_ADMIN.
The mount() function returns -1 and sets errno to EINVAL.
Have I got the arguments correct? Is this maybe a privileges issue?
EDIT: I got it to work by executing mount externally via system(), but would still like to use the function because of reasons.
I believe this is because mount -t ecryptfs is actually calling the helper executable mount.ecryptfs, and it's processing some of the options (in particular, key=) itself. What's actually passed to the kernel is different (you can see this by looking at /proc/mounts afterward).
If you look closely at https://manpages.ubuntu.com/manpages/kinetic/en/man7/ecryptfs.7.html, key= and ecryptfs_enable_filename_crypto= are listed under "MOUNT HELPER OPTIONS" - the actual kernel module's options are ecryptfs_sig=(fekek_sig) and ecryptfs_fnek_sig=(fnek_sig).
So, if you want to bypass the helper and do the mount directly, you'd need to load the tokens into the kernel's keyring with https://man7.org/linux/man-pages/man2/keyctl.2.html and replace key= with the resulting token signatures, like mount.ecryptfs did.
It does appear that there is a libecryptfs with functions in ecryptfs.h, such as ecryptfs_add_passphrase_key_to_keyring, which you can (presumably; not tested) use to do this in a way that matches what mount.ecryptfs does.

Are ALL system() calls a security risk in c++?

A post in this (Are system() calls evil?) thread says:
Your program's privileges are inherited by its spawned programs. If your application ever runs as a privileged user, all someone has to do is put their own program with the name of the thing you shell out too, and then can execute arbitrary code (this implies you should never run a program that uses system as root or setuid root).
But system("PAUSE") and system("CLS") shell to the OS, so how could a hacker possibly intervene if it ONLY shells to a specific secure location on the hard-drive?
Does explicitly flushing (using fflush or _flushall) or closing every stream before calling system eliminate all risk?
The system function passes command to the command interpreter, which executes the string as an operating-system command. system uses the COMSPEC and PATH environment variables to locate the command-interpreter file CMD.exe. If command is NULL, the function just checks whether the command interpreter exists.
You must explicitly flush—by using fflush or _flushall—or close any stream before you call system.
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/system-wsystem
In case there are any doubts, here's the actual snippet from Microsoft's implementation (very simple and straightforward):
// omitted for brevity
argv[1] = _T("/c");
argv[2] = (_TSCHAR *) command;
argv[3] = NULL;
/* If there is a COMSPEC defined, try spawning the shell */
/* Do not try to spawn the null string */
if (argv[0])
{
    // calls spawnve on value of COMSPEC variable, if present
    // omitted for brevity
}
/* No COMSPEC so set argv[0] to what COMSPEC should be. */
argv[0] = _T("cmd.exe");
/* Let the _spawnvpe routine do the path search and spawn. */
retval = (int)_tspawnvpe(_P_WAIT,argv[0],argv,NULL);
// clean-up part omitted
As to concerns about what _tspawnvpe may actually be doing, the answer is: nothing magical. The exact invocation sequence for spawnvpe and friends goes as follows (as anybody with a licensed version of MSVC can easily learn by inspecting the spawnvpe.c source file):
Do some sanity checks on parameters
Try to invoke _tspawnve on the passed file name. spawnve will succeed if the file name represents an absolute path to an executable or a valid path relative to the current working directory. No further checks are done - so yes, if a file named cmd.exe exists in current directory it will be invoked first in the context of system() call discussed.
In a loop: obtain the next path element using _getpath()
Append the file name to the path element
Pass the resulting path to spawnvpe, check if it was successful
That's it. No special tricks/checks involved.
The original question references POSIX, not Windows. Here there is no COMSPEC (there is SHELL, but system() deliberately does not use it); however, /bin/sh is completely, utterly vulnerable.
Suppose /opt/vuln/program does system("/bin/ls"); Looks completely harmless, right? Nope!
$ PATH=. IFS='/ ' /opt/vuln/program
This runs the program called bin in the current directory. Oops. Defending against this kind of thing is so difficult it should be left to the extreme experts, like the guys who wrote sudo. Sanitizing environment is extremely hard.
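For illustration only, one small piece of that defence is to start the child with a fixed environment rather than the caller's. The sketch below assumes a POSIX system with posix_spawn; run_with_clean_env is a hypothetical helper name and the PATH value is an assumed default. It is not a complete sanitizer and does not make a privileged program safe on its own.

```cpp
#include <cerrno>
#include <spawn.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <vector>

// Hypothetical helper: run the program at absolute path `path` with `args`,
// giving the child a small fixed environment instead of the inherited one.
// Returns the exit code, or -1 if the child could not be started or did not
// exit normally.
int run_with_clean_env(const std::string& path,
                       const std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    // Assumed default; nothing from the caller's environment is passed on.
    char env_path[] = "PATH=/usr/bin:/bin";
    char* envp[] = { env_path, nullptr };

    pid_t pid;
    if (posix_spawn(&pid, path.c_str(), nullptr, nullptr, argv.data(), envp) != 0)
        return -1;
    int status = 0;
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because no shell is involved and the path is absolute, neither PATH nor IFS from the caller influences which binary is executed here.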
So you might be wondering what the system() API is for. I don't actually know why it was created, but if you wanted a feature like the one ftp has, where !command is executed locally in the shell, you could do ... else if (terminalline[0] == '!') system(terminalline+1); else ... Since it's going to be completely insecure anyway, there's no point in making it secure. Of course, a truly modern use case wouldn't do it that way because system() doesn't look at $SHELL, but oh well.

What could cause unzip command returning -1 in my scenario?

I run unzip via a system() call in my C++ code in below format:
/usr/bin/unzip -o -q /<my_path_to_zip_file>/cfg_T-KTMAKUCB.zip -d /<my_path_to_dest>/../
This succeeds about 90% of the time. I cannot understand what could make it fail from time to time with a -1 return code. Any ideas?
According to my local man system,
The value returned is -1 on error (e.g. fork(2) failed), and the return status of the command otherwise.
and the POSIX spec says,
If a child process cannot be created, or if the termination status for the command language interpreter cannot be obtained, system() shall return -1 and set errno to indicate the error
Finally, the manpage for unzip lists various return codes, but -1 isn't among them.
If the command itself can't return -1, the problem is probably with the initial fork/exec, due to something like a system-wide or per-user limit (memory exhausted; process table full; maximum processes, open files or VM size limit for the user etc. etc).
You should be checking errno when system fails anyway. Running the whole thing under strace -f will also show what happens.
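As a rough sketch of that check, the value returned by system() can be decoded as below. describe_system_result is a hypothetical helper name; the caller is assumed to save errno immediately after system() returns.

```cpp
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: turn the value returned by system() into a message.
// `saved_errno` is errno as captured right after the system() call.
std::string describe_system_result(int status, int saved_errno) {
    if (status == -1)   // no child was created, or its status was unavailable
        return std::string("system() itself failed: ") + std::strerror(saved_errno);
    if (WIFEXITED(status))
        return "command exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "command killed by signal " + std::to_string(WTERMSIG(status));
    return "unexpected wait status " + std::to_string(status);
}
```

Typical use would be: int st = std::system(cmd); int err = errno; and then logging describe_system_result(st, err). For the intermittent -1 described in the question, the logged errno text (for example from EAGAIN or ENOMEM) is what should narrow down which limit is being hit.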

c++ - Way to know if Linux command exists before executing it

I'd like to write a function that generate gz file. The function will only be operational on Linux so I'd like to use gzip command (just execute external command).
So far I have this:
bool generate_gz( const String& path )
{
    bool res = false;
    // LINUX
#ifndef __WXMSW__
    if( !gzip_command_exists())
        cout << "cannot compress file. 'gzip' command is not available.\n";
    else
        res = (0 == execute_command(String::Format("gzip %s", path.c_str())));
    // WINDOWS
#else
    // do nothing - result will be false
#endif
    return res;
}
bool gzip_command_exists()
{
    // TBD
}
Question
Is there a way to implement gzip_command_exists()? If so, does it have to involve running ( or trying to run) gzip command?
The simplest way is to run "which gzip" via system() and look at the exit code of the call:
RETURN VALUE
The value returned is -1 on error (e.g. fork(2) failed), and the return status of the command otherwise. This latter return status is in the format specified in wait(2). Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed, the exit status will be that of a command that does exit(127).
What to look for:
:~$ which gzip
/bin/gzip
:~$ echo $?
0
:~$ which gzip11
:~$ echo $?
1
If you do not want to spawn an external command, you can use the stat function to check if a file exists and if it is executable on a POSIX system.
If you do not want to hard-code the path to gzip, it is slightly more complicated. You will have to obtain the PATH environment variable, split it on colons, and then check each directory for gzip. Again, the name and format of path variables are POSIX-specific. Use the getenv function to read the path; you could use strtok to split it.
It is questionable if it is worth checking, though, vs. just trying to run it and handling any errors.
You could use popen(3) to read the output of /usr/bin/which gzip (and you could also use it to compress on the fly by write-popen-ing a gzip > file.gz command). You could also have: FILE* pgzipv = popen("gzip --version", "r"); then fgets the first line and pclose.
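The popen approach might look roughly like this. It is a sketch assuming a POSIX system; first_line_of is a hypothetical helper name.

```cpp
#include <stdio.h>      // popen, pclose, fgets (POSIX)
#include <string>
#include <sys/wait.h>

// Hypothetical helper: run `command` through the shell, store the first line
// of its standard output in `line` (without the trailing newline), and return
// true only if the command could be run and exited with status 0.
bool first_line_of(const char* command, std::string& line) {
    line.clear();
    FILE* pipe = popen(command, "r");
    if (!pipe)
        return false;                       // fork or pipe creation failed
    char buf[256];
    if (fgets(buf, sizeof buf, pipe)) {
        line = buf;
        if (!line.empty() && line.back() == '\n')
            line.pop_back();
    }
    while (fgets(buf, sizeof buf, pipe)) {} // drain the remaining output
    int status = pclose(pipe);
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

For example, first_line_of("gzip --version", line) would be expected to return false when the shell cannot find gzip, since the shell then exits with a non-zero status.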
You could consider using getenv("PATH") then making a loop on it with an access test to each constructed path obtained by appending /gzip to each element in the PATH, etc... You could also fork then execvp using gzip --version with stdout and stderr suitably redirected, etc..
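The PATH loop with an access test could be sketched as follows, assuming a POSIX system. find_in_path is a hypothetical helper name, and the extra stat check (to skip directories that happen to share the name) is an addition not mentioned above.

```cpp
#include <sstream>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: return the first "<dir>/<name>" in the colon-separated
// list `path_var` that is a regular file we may execute, or "" if none is.
std::string find_in_path(const std::string& name, const char* path_var) {
    if (!path_var)
        return "";
    std::istringstream dirs(path_var);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty())
            dir = ".";                  // an empty element means the current directory
        const std::string candidate = dir + "/" + name;
        struct stat st;
        if (stat(candidate.c_str(), &st) == 0 && S_ISREG(st.st_mode) &&
            access(candidate.c_str(), X_OK) == 0)
            return candidate;
    }
    return "";
}
```

A call would look like find_in_path("gzip", getenv("PATH")). The result is only a snapshot: the file can still disappear before it is actually executed.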
Notice that both popen(3) and system(3) would fail when asked to execute a non-existing program (since they both fork(2) a /bin/sh shell with -c). So you don't need to test the existence of gzip and you always need to test the success of system or popen (which can fail for many reasons, see below for fork failure, and the documentation for other failures).
To be picky, checking that gzip exists is useless: the file /bin/gzip could (though it is unlikely) have been removed between your check (e.g. with access as below or with popen as above) and your later invocation of system or popen, so your first check for gzip doesn't gain you anything.
On most Linux systems, gzip is generally available at /bin/gzip (and in practice gzip is always installed);
this is required by the Filesystem Hierarchy Standard (which says that if gzip is installed it should be at that file path). Then you could just use access(2), e.g. with code like
#define GZIP_PATH "/bin/gzip" /* per FHS, see www.pathname.com/fhs */
if (access(GZIP_PATH, X_OK)) { perror(GZIP_PATH); exit(EXIT_FAILURE); }
Finally, you don't need to fork a gzip process at all to gzip-compress a file. You could (and you should) simply use a library like zlib (which is required according to the Linux Standard Base as libz.so.1); you want its gzopen, gzwrite, gzprintf, gzputs, gzclose, etc. functions! That would be faster (no need to fork(2) any external process) and more reliable (no dependency on an external program like gzip; it would work even if fork is not possible because limits have been reached; see setrlimit(2) with RLIMIT_NPROC and the ulimit builtin of bash(1)).
See also Advanced Linux Programming

Executing a command from C++, What is expected in argv[0]?

I am using execv() to run commands from /bin/ such as 'ls', 'pwd', 'echo' from my C++ program, and I am wondering what value I should provide in argv[0]:
const char * path = getPath();
char ** argv = getArgs();
execv(path,argv);
argv[0] is supposed to be the program name. It's passed to the program's main function. Some programs differentiate their behavior depending on what string argv[0] is. For example the GNU bash shell will disable some of its features if called using sh instead of bash. Best give it the same value that you pass to path.
On Linux, argv[0] is the process name displayed by the top utility (which it probably gets from reading entries in /proc/).
argv[0] should be the full path of the command that you want to run.
I know that this is not the answer you're looking for but is there a specific reason why you're doing this? The reason I ask is that most if not all of the actions people normally run with either system() or execv() are available in libraries on either Windows or Unix and are safer, faster and less likely to suffer from circumstantial errors. By that I mean, for example, when the PATH changes and suddenly your code stops working.
If you're passing in a string, either in whole or in part, and running it then you also leave yourself open to a user gaining access to the system by entering a command that could be damaging. E.g. imagine you've implemented a file search using find /home -name and your user types in:
"%" -exec rm {} \;
Ouch!