Parse command line arguments string into array for posix_spawn/execve - c++

Given a single string cmd holding a program's command-line arguments, how can I get an array of strings argv that can be passed to posix_spawn or execve?
The various forms of quoting (and escaped quotes) should be processed appropriately, so that the resulting invocation is the same as in a POSIX-compatible shell. Support for other escape characters would be desirable. Examples: #1, #2, #3.

As Shawn commented, in Linux and other POSIXy systems, you can use wordexp(), which is provided as part of the standard C library on such systems. For example, run.h:
#ifdef __cplusplus
extern "C" {
#endif
/* Execute binary 'bin' with arguments from string 'args';
'args' must not be NULL or empty.
Command substitution (`...` or $(...)) is NOT performed.
If 'bin' is NULL or empty, the first token in 'args' is used.
Only returns on failure. Return value:
-1: error in execv()/execvp(); see errno.
-2: out of memory. errno==ENOMEM.
-3: NULL or empty args.
-4: args contains a command substitution. errno==EINVAL.
-5: args has an illegal newline or | & ; < > ( ) { }. errno==EINVAL.
-6: shell syntax error. errno==EINVAL.
In all cases, you can use strerror(errno) for a descriptive string.
*/
int run(const char *bin, const char *args);
#ifdef __cplusplus
}
#endif
and compile the following C source to an object file you link into your C or C++ program or library:
#define _XOPEN_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <wordexp.h>
#include <string.h>
#include <errno.h>
int run(const char *bin, const char *args)
{
/* Empty or NULL args is an invalid parameter. */
if (!args || !*args) {
errno = EINVAL;
return -3;
}
wordexp_t w;
switch (wordexp(args, &w, WRDE_NOCMD)) {
case 0: break; /* No error */
case WRDE_NOSPACE: errno = ENOMEM; return -2;
case WRDE_CMDSUB: errno = EINVAL; return -4;
case WRDE_BADCHAR: errno = EINVAL; return -5;
default: errno = EINVAL; return -6;
}
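if (w.we_wordc < 1) {
wordfree(&w);
errno = EINVAL;
return -3;
}
if (!bin || !*bin)
bin = w.we_wordv[0];
if (!bin || !*bin) {
wordfree(&w);
errno = ENOENT;
return -1;
}
/* Note: w.we_wordv[w.we_wordc] == NULL, per POSIX. */
if (strchr(bin, '/'))
execv(bin, w.we_wordv);
else
execvp(bin, w.we_wordv);
/* exec failed: free the words, but keep the errno set by exec. */
{
int saved_errno = errno;
wordfree(&w);
errno = saved_errno;
}
return -1;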
}
For example, run(NULL, "ls -laF $HOME"); will list the contents of the current user's home directory. Environment variables will be expanded.
run("bash", "sh -c 'date && echo'"); executes bash, with argv[0]=="sh", argv[1]=="-c", and argv[2]=="date && echo". This lets you control what binary will be executed.
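Since the question also mentions posix_spawn, here is a minimal sketch of the same wordexp() approach feeding posix_spawnp() instead of exec. The helper names split_args and spawn_and_wait are hypothetical, and error handling is reduced to a single -1 return, so treat it as an illustration rather than a drop-in replacement for run().

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <wordexp.h>

extern char **environ;

/* Hypothetical helper: split 'cmd' using shell-like quoting rules.
   Returns 0 on success; the caller must wordfree(out) afterwards. */
int split_args(const char *cmd, wordexp_t *out)
{
    if (!cmd || !*cmd)
        return -1;
    return wordexp(cmd, out, WRDE_NOCMD) == 0 ? 0 : -1;
}

/* Hypothetical helper: run 'cmd' via posix_spawnp() and wait for it.
   Returns the child's exit status (0..255), or -1 on any failure. */
int spawn_and_wait(const char *cmd)
{
    wordexp_t w;
    pid_t pid;
    int status = 0;
    int rc = -1;

    if (split_args(cmd, &w) != 0)
        return -1;
    /* we_wordv is NULL-terminated, so it can be used directly as argv. */
    if (w.we_wordc > 0 &&
        posix_spawnp(&pid, w.we_wordv[0], NULL, NULL, w.we_wordv, environ) == 0 &&
        waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status))
        rc = WEXITSTATUS(status);
    wordfree(&w);
    return rc;
}
```

With this sketch, spawn_and_wait("sh -c 'exit 3'") should return 3, because the single-quoted part arrives in the child as one argument.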

Related

C++ Windows _pclose return value: _cwait needed?

I want to get the exit status of a pipe in C++, both on Linux and on Windows, to check whether a command ran successfully.
On Linux (or POSIX more generally), it appears that the macros in <sys/wait.h> are needed to get the correct exit status, such as in the first answer to the question
Does pclose() return pipe's termination status shifted left by eight bits on all platforms?
#include <cstdio>
#include <iostream>
#ifdef _WIN32
#define popen _popen
#define pclose _pclose
#else
#include <sys/wait.h>
#endif
int main(){
FILE* pipe {nullptr};
pipe = popen( "echo 123", "r" );
int status {0};
status = pclose(pipe);
#ifndef _WIN32
/* ask how the process ended to clean up the exit code. */
if ( WIFEXITED(status) ){
/* Add code here if needed after pipe exited normally */
status = WEXITSTATUS(status);
} else if ( WIFSIGNALED(status) ){
/* Add code here if needed after pipe process was terminated */
status = WTERMSIG(status);
} else if ( WIFSTOPPED(status) ){
/* Add code here if needed after pipe stopped */
status = WSTOPSIG(status);
}
#else
/* but what about windows? */
#endif
std::cout << "Exit status: " << status << '\n';
return 0;
}
I couldn't find anything about Windows though. The C runtime lib reference for _pclose includes a remark about _cwait and states that
"The format of the return value [of _pclose] is the same as for _cwait, except the low-order and high-order bytes are swapped".
So how do I get the correct exit status on Windows?

Convert errno to exit codes

I'm working on a library that uses a lot of file system functions.
I want my functions to return distinct error codes (not just -1) depending on errno when a file system function fails.
I could use the errno values directly, but I want an abstraction layer between my functions' error codes and the system errno (e.g. my error values start at -1000 and are negative, whereas errno values are positive).
My question is: what is the best way to implement this?
For now I see two possible solutions:
Use an enum with error codes and a switch-case function to translate, e.g.:
typedef enum {
MY_ERROR_EPERM = -1104, /* Operation not permitted */
MY_ERROR_ENOENT = -1105, /* No such file or directory */
// ...
} MyReturnCodes_t;
int ErrnoToErrCode(unsigned int sysErrno) {
int error = ENOSYS;
switch(sysErrno) {
case EPERM: error = MY_ERROR_EPERM; break;
case ENOENT: error = MY_ERROR_ENOENT; break;
// ...
}
return error;
}
Do the translation directly in the enum:
#define ERR_OFFSET -1000
typedef enum {
MY_ERROR_EPERM = ERR_OFFSET - EPERM, /* Operation not permitted */
MY_ERROR_ENOENT = ERR_OFFSET - ENOENT, /* No such file or directory */
MY_ERROR_ESRCH = ERR_OFFSET - ESRCH, /* No such process */
// ...
} MyReturnCodes_t;
Which way is more consistent?
One more point: this library should be usable on both QNX and Linux. What is the proper way to align the errno codes (which differ in some cases)?
I'd go for a std::map in a dedicated function. You don't have to care about gaps or anything as long as you use the provided error macros:
#include <iostream>
#include <errno.h>
#include <map>
namespace MyError
{
enum MyReturnCode: int
{
MY_INVALID_VAL = 0 , /* Invalid Mapping */
MY_ERROR_EPERM = -1104, /* Operation not permitted */
MY_ERROR_ENOENT = -1105, /* No such file or directory */
};
MyReturnCode fromErrno(int e)
{
static const std::map<int, MyReturnCode> mapping {
{ EPERM, MY_ERROR_EPERM},
{ ENOENT, MY_ERROR_ENOENT}
};
const auto it = mapping.find(e);
return it != mapping.end() ? it->second : MY_INVALID_VAL;
}
}
int main()
{
std::cout << MyError::fromErrno(ENOENT) << std::endl;
std::cout << MyError::fromErrno(42) << std::endl;
return 0;
}
http://coliru.stacked-crooked.com/a/1da9fd44d88fb097
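If the library has to stay in plain C, the same table-driven idea can be sketched without std::map. The names my_error_from_errno and MY_ERROR_UNKNOWN are hypothetical. Because the table is keyed on the errno macros rather than on their numeric values, the same source should compile on both Linux and QNX even where the numbers differ.

```c
#include <errno.h>
#include <stddef.h>

enum {
    MY_ERROR_UNKNOWN = -1000, /* hypothetical fallback for unmapped values */
    MY_ERROR_EPERM   = -1104, /* Operation not permitted */
    MY_ERROR_ENOENT  = -1105  /* No such file or directory */
};

/* One row per translated errno value; extend as needed. */
static const struct {
    int sys_errno;
    int my_code;
} errno_table[] = {
    { EPERM,  MY_ERROR_EPERM  },
    { ENOENT, MY_ERROR_ENOENT },
};

/* Linear search is enough for a table of this size. */
int my_error_from_errno(int e)
{
    size_t i;

    for (i = 0; i < sizeof errno_table / sizeof errno_table[0]; i++)
        if (errno_table[i].sys_errno == e)
            return errno_table[i].my_code;
    return MY_ERROR_UNKNOWN;
}
```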

How can I determine filesystem type (name) with Linux API for C?

I need to get a C string that contains the filesystem name.
There are a lot of commands that print the filesystem name in a terminal, but I cannot find an easy way to get it in C/C++ code.
You parse /proc/mounts.
If you have used one of the stat() family of functions, and have a filesystem identifier (st_dev) in numerical form, you simply stat() the mounted directory at each mount point listed in /proc/mounts (append /./ to each mount point, so that you stat the mounted directory, and not the mount point in its parent filesystem), until you see one that matches. Using that entry (line) you obtain the type of the filesystem, as the kernel sees it.
Remember that /proc/ and /sys/ in Linux systems are not on disk, but are the proper interface for the kernel to expose certain details. Currently mounted filesystems (in /proc/mounts) is one of these.
In POSIX C, this is very simple to implement using fopen(), getline(), fclose(), free(), and strtok() or sscanf() or your own line-splitting function. Remember that as a kernel interface, the contents of the files in /proc/ and /sys/ are never localized; they are always in the default C/POSIX locale.
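The steps described above can be sketched roughly as follows. The function name fs_type_of and the buffer sizes are assumptions, and the sketch does not decode the octal escapes (such as \040 for a space) that the kernel uses in mount point names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: copy the filesystem type of 'path' (as listed in
   /proc/mounts) into 'type'. Returns 0 on success, -1 on failure. */
int fs_type_of(const char *path, char *type, size_t size)
{
    struct stat target, mounted;
    char *line = NULL;
    size_t cap = 0;
    int found = -1;
    FILE *in;

    if (!path || !type || size < 1 || stat(path, &target) != 0)
        return -1;
    in = fopen("/proc/mounts", "r");
    if (!in)
        return -1;
    while (getline(&line, &cap, in) != -1) {
        char dir[4096], fstype[64];

        /* Fields: device, mount point, type, options, dump, pass. */
        if (sscanf(line, "%*s %4092s %63s", dir, fstype) != 2)
            continue;
        /* Stat the mounted directory, not the mount point in its parent. */
        strcat(dir, "/./");
        if (stat(dir, &mounted) == 0 && mounted.st_dev == target.st_dev) {
            /* Keep going: a later mount on the same device wins. */
            snprintf(type, size, "%s", fstype);
            found = 0;
        }
    }
    free(line);
    fclose(in);
    return found;
}
```

Calling fs_type_of("/tmp", buf, sizeof buf) would then leave a name such as tmpfs or ext4 in buf, depending on the system.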
The Linux C API has statfs() (inspired by BSD; for other Unix-like systems, have a look at how stat from coreutils does it).
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/vfs.h>
// https://github.com/linux-test-project/ltp/tree/master/include/tst_fs.h
/* man 2 statfs or kernel-source/include/linux/magic.h */
#define TST_BTRFS_MAGIC 0x9123683E
#define TST_NFS_MAGIC 0x6969
#define TST_RAMFS_MAGIC 0x858458f6
#define TST_TMPFS_MAGIC 0x01021994
#define TST_V9FS_MAGIC 0x01021997
#define TST_XFS_MAGIC 0x58465342
#define TST_EXT2_OLD_MAGIC 0xEF51
/* ext2, ext3, ext4 have the same magic number */
#define TST_EXT234_MAGIC 0xEF53
#define TST_MINIX_MAGIC 0x137F
#define TST_MINIX_MAGIC2 0x138F
#define TST_MINIX2_MAGIC 0x2468
#define TST_MINIX2_MAGIC2 0x2478
#define TST_MINIX3_MAGIC 0x4D5A
#define TST_UDF_MAGIC 0x15013346
#define TST_SYSV2_MAGIC 0x012FF7B6
#define TST_SYSV4_MAGIC 0x012FF7B5
#define TST_UFS_MAGIC 0x00011954
#define TST_UFS2_MAGIC 0x19540119
#define TST_F2FS_MAGIC 0xF2F52010
#define TST_NILFS_MAGIC 0x3434
#define TST_EXOFS_MAGIC 0x5DF5
#define TST_OVERLAYFS_MAGIC 0x794c7630
#define TST_FUSE_MAGIC 0x65735546
// https://github.com/linux-test-project/ltp/tree/master/lib/tst_fs_type.c
const char *tst_fs_type_name(long f_type)
{
switch (f_type) {
case TST_TMPFS_MAGIC:
return "tmpfs";
case TST_NFS_MAGIC:
return "nfs";
case TST_V9FS_MAGIC:
return "9p";
case TST_RAMFS_MAGIC:
return "ramfs";
case TST_BTRFS_MAGIC:
return "btrfs";
case TST_XFS_MAGIC:
return "xfs";
case TST_EXT2_OLD_MAGIC:
return "ext2";
case TST_EXT234_MAGIC:
return "ext2/ext3/ext4";
case TST_MINIX_MAGIC:
case TST_MINIX_MAGIC2:
case TST_MINIX2_MAGIC:
case TST_MINIX2_MAGIC2:
case TST_MINIX3_MAGIC:
return "minix";
case TST_UDF_MAGIC:
return "udf";
case TST_SYSV2_MAGIC:
case TST_SYSV4_MAGIC:
return "sysv";
case TST_UFS_MAGIC:
case TST_UFS2_MAGIC:
return "ufs";
case TST_F2FS_MAGIC:
return "f2fs";
case TST_NILFS_MAGIC:
return "nilfs";
case TST_EXOFS_MAGIC:
return "exofs";
case TST_OVERLAYFS_MAGIC:
return "overlayfs";
case TST_FUSE_MAGIC:
return "fuse";
default:
return "unknown";
}
}
void print_filesystem(const char* path)
{
if (path == NULL)
return;
struct statfs s;
if (statfs(path, &s)) {
fprintf(stderr, "statfs(%s) failed: %s\n", path, strerror(errno));
return;
}
printf("'%s' filesystem: %s\n", path, tst_fs_type_name(s.f_type));
}
int main(int argc, char *argv[]) {
print_filesystem("/");
print_filesystem("/tmp");
return 0;
}
Example:
$ gcc -Wall filesystem.c -o filesystem && ./filesystem
'/' filesystem: ext2/ext3/ext4
'/tmp' filesystem: tmpfs

How to initialize parameters depending on the environment that we call a program?

In a header file I have a parameter that specifies the name of a control file:
#define CTLFILE "server.ini"
This works fine. But now I want something like this:
If I am on the server
#define CTLFILE "server.ini"
else if I am on the client
#define CTLFILE "client.ini"
How can I implement this?
You can pass an option when you launch your program.
For example, try calling the following program, passing server or client:
#include <stdio.h>
#include <string.h>
#define SERVER_FILE "server.ini"
#define CLIENT_FILE "client.ini"
int main (int argc, char *argv[])
{
if (argc<2)
{
fprintf(stderr, "You must pass the type of environment!\n");
return 1;
}
if (strcmp(argv[1], "server") == 0)
{
printf ("File selected: %s\n", SERVER_FILE);
}
else if (strcmp(argv[1], "client") == 0)
{
printf ("File selected: %s\n", CLIENT_FILE);
}
else
{
fprintf(stderr, "Unsupported environment %s\n", argv[1]);
return 1;
}
return 0;
}
You can use conditional compilation with an #ifdef ... #endif pair.
For example, in the code, write:
#ifdef SERVERSIDE
#define CTLFILE "server.ini"
#else
#define CTLFILE "client.ini"
#endif
Then, when compiling, pass the -DSERVERSIDE option to the compiler (reference: gcc).
You can't do it that way, because #define and all the other # directives are preprocessor instructions. That means they are gone after compilation, so you cannot make the same binary behave differently with preprocessor instructions.
There are several choices:
*) Compile the program twice, with a different #define CTLFILE each time.
*) Develop something like a configuration file to configure the execution of your program.
This needs extra development, since you will have to change the string dynamically. It's up to you.
*) Just test for the existence of the "server.ini" or "client.ini" file.
This works if the two files never exist at the same time.
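The last option (testing which file exists) could look roughly like this sketch. The helper names first_existing and pick_ctlfile are hypothetical, and access() only checks existence at the moment of the call.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return the first path in the NULL-terminated
   list 'candidates' that exists, or NULL if none of them does. */
const char *first_existing(const char *const *candidates)
{
    size_t i;

    for (i = 0; candidates && candidates[i]; i++)
        if (access(candidates[i], F_OK) == 0)
            return candidates[i];
    return NULL;
}

/* Hypothetical wrapper for the two file names from the question;
   server.ini is preferred if both happen to exist. */
const char *pick_ctlfile(void)
{
    static const char *const names[] = { "server.ini", "client.ini", NULL };

    return first_existing(names);
}
```

The program would then call pick_ctlfile() once at startup in place of the CTLFILE macro and report an error if it returns NULL.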

Is it possible to set a gdb watchpoint programmatically?

I want to set a watchpoint (break on hardware write) temporarily in my C++ program to find memory corruption.
I've seen all the ways to do it manually through gdb, but I would like to actually set the watchpoint via some method in my code so I don't have to break into gdb, find out the address, set the watchpoint and then continue.
Something like:
#define SET_WATCHPOINT(addr) asm ("set break on hardware write %addr")
Set a hardware watchpoint from a child process:
#include <signal.h>
#include <syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/user.h>
enum {
DR7_BREAK_ON_EXEC = 0,
DR7_BREAK_ON_WRITE = 1,
DR7_BREAK_ON_RW = 3,
};
enum {
DR7_LEN_1 = 0,
DR7_LEN_2 = 1,
DR7_LEN_4 = 3,
};
typedef struct {
char l0:1;
char g0:1;
char l1:1;
char g1:1;
char l2:1;
char g2:1;
char l3:1;
char g3:1;
char le:1;
char ge:1;
char pad1:3;
char gd:1;
char pad2:2;
char rw0:2;
char len0:2;
char rw1:2;
char len1:2;
char rw2:2;
char len2:2;
char rw3:2;
char len3:2;
} dr7_t;
typedef void sighandler_t(int, siginfo_t*, void*);
int watchpoint(void* addr, sighandler_t handler)
{
pid_t child;
pid_t parent = getpid();
struct sigaction trap_action;
int child_stat = 0;
sigaction(SIGTRAP, NULL, &trap_action);
trap_action.sa_sigaction = handler;
trap_action.sa_flags = SA_SIGINFO | SA_RESTART | SA_NODEFER;
sigaction(SIGTRAP, &trap_action, NULL);
if ((child = fork()) == 0)
{
int retval = EXIT_SUCCESS;
dr7_t dr7 = {0};
dr7.l0 = 1;
dr7.rw0 = DR7_BREAK_ON_WRITE;
dr7.len0 = DR7_LEN_4;
if (ptrace(PTRACE_ATTACH, parent, NULL, NULL))
{
exit(EXIT_FAILURE);
}
sleep(1);
if (ptrace(PTRACE_POKEUSER, parent, offsetof(struct user, u_debugreg[0]), addr))
{
retval = EXIT_FAILURE;
}
if (ptrace(PTRACE_POKEUSER, parent, offsetof(struct user, u_debugreg[7]), dr7))
{
retval = EXIT_FAILURE;
}
if (ptrace(PTRACE_DETACH, parent, NULL, NULL))
{
retval = EXIT_FAILURE;
}
exit(retval);
}
waitpid(child, &child_stat, 0);
if (WEXITSTATUS(child_stat))
{
printf("child exit !0\n");
return 1;
}
return 0;
}
int var;
void trap(int sig, siginfo_t* info, void* context)
{
printf("new value: %d\n", var);
}
int main(int argc, char * argv[])
{
int i;
printf("init value: %d\n", var);
watchpoint(&var, trap);
for (i = 0; i < 100; i++) {
var++;
sleep(1);
}
return 0;
}
Based on user512106's great answer, I coded up a little "library" that someone might find useful:
It's on github at https://github.com/whh8b/hwbp_lib. I wish I could have commented directly on his answer, but I don't have enough rep yet.
Based on feedback from the community, I am going to copy/paste the relevant code here:
#include <stdio.h>
#include <stddef.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/prctl.h>
#include <stdint.h>
#include <errno.h>
#include <stdbool.h>
enum {
BREAK_EXEC = 0x0,
BREAK_WRITE = 0x1,
BREAK_READWRITE = 0x3,
};
enum {
BREAK_ONE = 0x0,
BREAK_TWO = 0x1,
BREAK_FOUR = 0x3,
BREAK_EIGHT = 0x2,
};
#define ENABLE_BREAKPOINT(x) (0x1<<((x)*2))
#define ENABLE_BREAK_EXEC(x) (BREAK_EXEC<<(16+((x)*4)))
#define ENABLE_BREAK_WRITE(x) (BREAK_WRITE<<(16+((x)*4)))
#define ENABLE_BREAK_READWRITE(x) (BREAK_READWRITE<<(16+((x)*4)))
/*
* This function fork()s a child that will use
* ptrace to set a hardware breakpoint for
* memory r/w at _addr_. When the breakpoint is
* hit, then _handler_ is invoked in a signal-
* handling context.
*/
bool install_breakpoint(void *addr, int bpno, void (*handler)(int)) {
pid_t child = 0;
uint32_t enable_breakpoint = ENABLE_BREAKPOINT(bpno);
uint32_t enable_breakwrite = ENABLE_BREAK_WRITE(bpno);
pid_t parent = getpid();
int child_status = 0;
if (!(child = fork()))
{
int parent_status = 0;
if (ptrace(PTRACE_ATTACH, parent, NULL, NULL))
_exit(1);
while (!WIFSTOPPED(parent_status))
waitpid(parent, &parent_status, 0);
/*
* set the breakpoint address.
*/
if (ptrace(PTRACE_POKEUSER,
parent,
offsetof(struct user, u_debugreg[bpno]),
addr))
_exit(1);
/*
* set parameters for when the breakpoint should be triggered.
*/
if (ptrace(PTRACE_POKEUSER,
parent,
offsetof(struct user, u_debugreg[7]),
enable_breakwrite | enable_breakpoint))
_exit(1);
if (ptrace(PTRACE_DETACH, parent, NULL, NULL))
_exit(1);
_exit(0);
}
waitpid(child, &child_status, 0);
signal(SIGTRAP, handler);
if (WIFEXITED(child_status) && !WEXITSTATUS(child_status))
return true;
return false;
}
/*
* This function will disable a breakpoint by
* invoking install_breakpoint with a 0x0 _addr_
* and no handler function. See comments above
* for implementation details.
*/
bool disable_breakpoint(int bpno)
{
return install_breakpoint(0x0, bpno, NULL);
}
/*
* Example of how to use this /library/.
*/
/* Written from the signal handler, so it must be volatile sig_atomic_t. */
volatile sig_atomic_t handled = 0;
void handle(int s) {
(void)s;
handled = 1;
}
int main(int argc, char **argv) {
int a = 0;
if (!install_breakpoint(&a, 0, handle))
printf("failed to set the breakpoint!\n");
a = 1;
printf("handled: %d\n", (int)handled);
if (!disable_breakpoint(0))
printf("failed to disable the breakpoint!\n");
return 0;
}
I hope that this helps someone!
Will
In GDB, there are two types of watchpoints, hardware and software.
You can't easily implement software watchpoints (cf. GDB Internals):
Software watchpoints are very slow, since gdb needs to single-step the program being debugged and test the value of the watched expression(s) after each instruction.
EDIT:
I'm still trying to understand what hardware watchpoints are.
For hardware breakpoints, this article gives some techniques:
We want to watch reading from or writing into 1 qword at address 100005120h (address range 100005120h-100005127h)
lea rax, [100005120h]
mov dr0, rax
mov rax, dr7
and eax, not ((1111b shl 16) + 11b) ; mask off all
or eax, (1011b shl 16) + 1 ; prepare to set what we want
mov dr7, rax ; set it finally
Done, now we can wait until code falls into the trap! After accessing any byte at memory range 100005120h-100005127h, int 1 will occur and DR6.B0 bit will be set to 1.
You can also take a look at GDB's low-level files (e.g. amd64-linux-nat.c), but it (certainly) involves two processes: 1/ the one you want to watch, and 2/ a lightweight debugger that attaches to the first one with ptrace and uses:
ptrace (PTRACE_POKEUSER, tid, __regnum__offset__, address);
to set and handle the watchpoint.
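The DR7 value built in the quoted assembly can be computed with a little bit arithmetic. The sketch below assumes the usual x86 layout (enable bit L<n> at bit 2n, and a 4-bit field at bit 16 + 4n holding the R/W bits below the LEN bits). The names dr7_bits, dr7_update, and the WP_* constants are hypothetical.

```c
#include <stdint.h>

enum { WP_RW_EXEC = 0, WP_RW_WRITE = 1, WP_RW_READWRITE = 3 };
enum { WP_LEN_1 = 0, WP_LEN_2 = 1, WP_LEN_8 = 2, WP_LEN_4 = 3 };

/* Bits to set in DR7 for debug register 'slot' (0..3):
   the local-enable bit plus the R/W and LEN fields. */
uint32_t dr7_bits(unsigned slot, unsigned rw, unsigned len)
{
    uint32_t enable  = UINT32_C(1) << (slot * 2);
    uint32_t control = (uint32_t)((len << 2) | rw) << (16 + slot * 4);

    return enable | control;
}

/* Clear the old settings for 'slot' in 'old' and apply the new ones,
   mirroring the and/or pair in the quoted assembly. */
uint32_t dr7_update(uint32_t old, unsigned slot, unsigned rw, unsigned len)
{
    uint32_t mask = (UINT32_C(0xF) << (16 + slot * 4)) |
                    (UINT32_C(0x3) << (slot * 2));

    return (old & ~mask) | dr7_bits(slot, rw, len);
}
```

For slot 0, read/write, 8 bytes, dr7_bits() yields (1011b shl 16) + 1, the constant used in the assembly; a value like this is what gets passed for u_debugreg[7] in the PTRACE_POKEUSER call.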
The program itself can supply commands to GDB. You'll need a special shell script to run GDB, though.
Copy this code into a file named untee, and execute chmod 755 untee:
#!/bin/bash
if [ -z "$1" ]; then
echo "Usage: $0 PIPE | COMMAND"
echo "This script will read the input from both stdin and PIPE, and supply it to the COMMAND."
echo "If PIPE does not exist it will be created with mkfifo command."
exit 0
fi
PIPE="$1"
if [ \! -e "$PIPE" ]; then
mkfifo "$PIPE"
fi
if [ \! -p "$PIPE" ]; then
echo "File $PIPE does not exist or is not a named pipe" > /dev/stderr
exit 1
fi
# Open the pipe as a FD 3
echo "Waiting for $PIPE to be opened by another process" > /dev/stderr
exec 3<"$PIPE"
echo "$PIPE opened" > /dev/stderr
OPENED=true
while true; do
read -t 1 INPUT
RET=$?
if [ "$RET" = 0 ]; then
echo "$INPUT"
elif [ "$RET" -lt 128 ]; then
echo "stdin closed, exiting" > /dev/stderr
break
fi
if $OPENED; then
while true; do
read -t 1 -u 3 INPUT
RET=$?
if [ "$RET" = 0 ]; then
echo "$INPUT"
else
if [ "$RET" -lt 128 ]; then
echo "$PIPE closed, ignoring" > /dev/stderr
OPENED=false
fi
break
fi
done
fi
done
And now the C code:
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <signal.h>
#include <unistd.h>
void gdbCommand(const char *c)
{
static FILE * dbgpipe = NULL;
static const char * dbgpath = "/tmp/dbgpipe";
struct stat st;
if( !dbgpipe && stat(dbgpath, &st) == 0 && S_ISFIFO(st.st_mode) )
dbgpipe = fopen(dbgpath, "w");
if( !dbgpipe )
return;
fprintf(dbgpipe, "%s\n", c);
fflush(dbgpipe);
}
void gdbSetWatchpoint(const char *var)
{
char buf[256];
snprintf(buf, sizeof(buf), "watch %s", var);
gdbCommand("up"); /* Go up the stack from the kill() system call - this may vary by the OS, you may need to walk the stack more times */
gdbCommand("up"); /* Go up the stack from the gdbSetWatchpoint() function */
gdbCommand(buf);
gdbCommand("continue");
kill(getpid(), SIGINT); /* Make GDB pause our process and execute commands */
}
int subfunc(int *v)
{
*v += 5; /* GDB should pause after this line, and let you explore stack etc */
return *v;
}
int func()
{
int i = 10;
printf("Adding GDB watch for var 'i'\n");
gdbSetWatchpoint("i");
subfunc(&i);
return i;
}
int func2()
{
int j = 20;
return j + func();
}
int main(int argc, char ** argv)
{
func();
func2();
return 0;
}
Copy that to a file named test.c, compile it with gcc test.c -O0 -g -o test, then execute ./untee /tmp/dbgpipe | gdb -ex "run" ./test
This works on my 64-bit Ubuntu with GDB 7.3 (older GDB versions might refuse to read commands from a non-terminal).
If you happen to be using Xcode, you can achieve the required effect (automatic setting of watchpoints) by using an action on another breakpoint to set your watchpoint:
Set up a breakpoint somewhere where the variable you want to watch will be in scope that will be hit before you need to start watching the variable,
Right-click on the breakpoint and select Edit Breakpoint...,
Click on Add Action and add a Debugger Command with an LLDB command like: watchpoint set variable <variablename> (or if you're using GDB [1], a command like: watch <variablename>),
Check the Automatically continue after evaluating actions checkbox.
1: GDB is no longer supported in more recent versions of Xcode, but I believe it is still possible to set it up manually.