Getting the string(s) output from an exec command in C++ - c++

My problem is pretty simple, but I can't seem to find anything straightforward or specific to what I am trying to do. I'm simply using execl to list the files in the current folder that follow the same pattern (ie, execl("ls nameOfFile*.txt")). What I want to do now is grab those file names so that I can loop through and get the data out of them. Is there a simple way of doing this? Am I using the correct exec?
Thanks for any help or tips.

The signature of execl is
int execl(const char *path, const char *arg, ...);
You're supposed to pass the path to the executable as the first argument, and arguments for the executable as the subsequent arguments, so your calling syntax is wrong. Even if you fix that, it still won't do what you want. The only way execl and friends ever return control to the calling program is if an error occurs. This answer contains an excellent explanation of what execl does.
You were probably thinking of std::system, which you can pass an arbitrary string to, and have the OS execute that command. While that'll print the filenames to stdout, it's still not what you want: system returns only a status code resulting from executing the command line you specified, and it has no way of capturing and returning whatever the command writes to stdout.
Unfortunately, before C++17 there was nothing in the C++ standard library that allows you to list and iterate files from the filesystem (C++17 added std::filesystem, which is modeled on Boost.Filesystem). Without C++17, the preferred cross-platform approach is to use Boost.Filesystem. Otherwise, there are platform-specific APIs available, which are listed in this answer, along with a Boost usage example.
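If you do want the text a command prints, one common route on POSIX systems is popen(), which runs the command through the shell and gives you a FILE* connected to its stdout. The following is only a minimal sketch under that assumption; capture_lines is a made-up helper name, and popen/pclose are POSIX functions, not ISO C++.

```cpp
#include <stdio.h>     // popen, pclose, fgets (popen/pclose are POSIX, not ISO C++)
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: runs `command` through the shell and returns every
// line it wrote to stdout, with the trailing newline removed.
std::vector<std::string> capture_lines(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");

    std::vector<std::string> lines;
    std::string current;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe)) {
        current += buffer;                       // may be only part of a long line
        if (!current.empty() && current.back() == '\n') {
            current.pop_back();                  // drop the newline
            lines.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) lines.push_back(current);  // final line had no newline
    pclose(pipe);
    return lines;
}
```

Something like capture_lines("ls nameOfFile*.txt") would then give one entry per file name. Parsing ls output breaks on unusual file names (for example names containing newlines), so a directory or glob API is usually the more robust choice.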

Related

Can a function access args passed to main()?

Is there a way I could access program's args outside of main() without storing references to them?
Program arguments are stored within preserved space of the program, so I see no reason for not being able to access them. Maybe there is something like const char** get_program_arguments() and int get_program_arguments_count() but I cannot find it...
My problem comes from the fact that I am rewriting a library that is used now in many programs within the company, and I need to access these programs common arguments without changing them. For example I need program name, but I cannot use ::getenv("_") as they can be executed from various shells. I cannot use GNU extension because this needs to work on Linux, AIX, SunOS using gcc, CC and so on.
Some systems do provide access to the argument list, or at least argv[0]. But it’s common practice for main to mutate argc and argv during option processing, so there is no reliably correct answer as to what a global interface for them should return.
Add to that the general undesirability of global state, and the fact that it harms debugging to have whatever low-level functions attempt to analyze the arguments to a program they might know nothing about, and you end up with don’t do that. It’s not hard to pass arguments (or, better, meaningful flags that result from decoding them) to a library.
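To make the "pass them in" suggestion concrete, here is a rough sketch. Everything in it (LibrarySettings, decode_arguments, describe, the --verbose flag) is invented for illustration and is not part of any existing library; the point is only that main decodes the arguments once and the library receives the result explicitly.

```cpp
#include <string>

// Hypothetical settings the library needs; the field names are illustrative.
struct LibrarySettings {
    std::string program_name;
    bool verbose = false;
};

// Decode argc/argv once, in code called from main(), before any option
// processing has had a chance to mutate them.
LibrarySettings decode_arguments(int argc, const char* const argv[]) {
    LibrarySettings settings;
    if (argc > 0 && argv[0]) settings.program_name = argv[0];
    for (int i = 1; i < argc; ++i) {
        if (std::string(argv[i]) == "--verbose") settings.verbose = true;
    }
    return settings;
}

// Hypothetical library entry point: it receives the settings explicitly
// instead of reaching for global state.
std::string describe(const LibrarySettings& settings) {
    return settings.program_name + (settings.verbose ? " (verbose)" : "");
}
```

In main(int argc, char* argv[]) this would be called as describe(decode_arguments(argc, argv)).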

How to open a file ending with a particular extension in C++

I am trying to write a lexer using flex, and want to open and read from a file ending with a particular extension. E.g filename.k. I am only able to do it if I specify the file name as well as the extension.
FILE *myfile = fopen("a.k", "r");
if (!myfile) {
cout << "I can't open a.k!" << endl;
}
Can someone show me the way to open *.k files in C++.
I am running flex on Ubuntu. What I am trying to do is to run a flex program. The above code executes fine. I wanted a way where I can open a file with .k extension irrespective of the file name. Example. ./myprogram a.k or ./myprogram b.k. In the above example I always have to specify the file name in the code itself all the time.
Comment to Basile's answer:
[...] Such as ./myprogram a.k, I wanted a way where I can write any filename instead of a but ending with a .k extension.
While the cited answer technically is correct, I think your true problem is how to get some arbitrary, but specific file path from the command line:
Example: ./myprogram a.k or ./myprogram b.k
The thing is quite easy: you get the command line parameters passed directly to your main function, provided you use the variant accepting them:
int main(int argc, char* argv[]);
First parameter (argv[0]) is always the name of your programme (or an empty string, if not available), so argc will always be at least one. Afterwards the parameters provided follow, so invoking "./myprogram b.k" will result in argc being two and argv pointing to a char* array equivalent to the following:
char* argv[] =
{
"./myprogram",
"b.k",
nullptr // oh, yes, the array is always null terminated...
};
And then, the matter gets easy: Check, if the parameter is given at all: if(argc == 2) or, if you are willing to accept but ignore any additional parameters, if(argc >= 2) or simply if(argv[1]) (as it will be nullptr, if no parameter given, or the first parameter otherwise) and then use it for fopen or, if you prefer a more C++ like way, to open a std::ifstream. You might want to have additional checks, e. g. if the file name really ends with ".k", but that's up to you now...
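A small sketch of those checks, in case it helps. The helper names ends_with and input_file_from_args are my own, and treating a missing or wrongly named parameter as an empty string is just one possible design choice.

```cpp
#include <string>

// True when `name` ends with `suffix` (e.g. ".k").
bool ends_with(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the first command-line parameter if it is present and ends in ".k",
// otherwise an empty string. Additional parameters are accepted but ignored.
std::string input_file_from_args(int argc, const char* const argv[]) {
    if (argc >= 2 && argv[1] && ends_with(argv[1], ".k")) return argv[1];
    return std::string();
}

// Possible use inside main(int argc, char* argv[]), with <fstream> included:
//   std::string path = input_file_from_args(argc, argv);
//   std::ifstream input(path);
//   if (path.empty() || !input) { /* report the error and exit */ }
```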
Your fopen code looks fine, but it is probably running under conditions (e.g. in an unexpected working directory, or without sufficient permissions) that make fopen fail.
I recommend using errno (perhaps implicitly through perror) in that failure case to find out why it failed:
FILE *myfile = fopen("a.k", "r");
if (!myfile) {
perror("fopen of a.k");
exit(EXIT_FAILURE);
}
See e.g. fopen(3), perror(3), errno(3) (or their documentation for your particular implementation and system).
Notice that file extensions don't really exist in standard C++11 (but C++17 has filesystem). On Linux and POSIX systems, file extensions are just a convention.
Can someone show me the way to open *.k files in C++.
If you need to open all files with a .k extension, you may rely on globbing (on POSIX, run something like yourprog *.k in your shell, which will expand the *.k into a sequence of file names ending with .k before running your program, whose main would get an array of arguments; see glob(7)), or you have to loop explicitly using operating system primitives or functions (perhaps with glob(3), nftw(3), opendir(3), readdir(3), ... on Linux; for Windows, read about FindFirstFile etc...)
Standard C++11 doesn't provide a way to iterate over all files matching a given pattern. Some framework libraries (Boost, Poco, Qt) do provide such a way. Otherwise you need to use operating-system-specific functions (e.g. to read the current directory; directories are not known to C++11 and are an abstraction provided by your operating system). C++17 has filesystem, but you need a very recent compiler and C++ standard library to get that.
BTW, on Unix or POSIX systems, you could have one single file named *.k. Of course that is very poor taste and should be avoided (but you might run touch '*.k' in your shell to make such a file).
Regarding your edit, for Linux, I recommend running
./myprogram *.k
(then your shell will expand *.k into one or several arguments to myprogram)
and code the main of your program myprogram appropriately to iterate over its arguments. See this.
If you want to run just myprogram without any additional arguments, you need to code the globbing or the expansion inside it. See glob(3), wordexp(3). Or scan directories (with opendir(3), readdir(3), closedir, stat(2) or nftw(3))
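For the "expand the pattern inside the program" route, a minimal sketch with glob(3) might look like the following. It is POSIX-only, expand_pattern is a name I made up, and treating "no match" and real errors the same way (an empty result) is a simplification.

```cpp
#include <glob.h>    // POSIX glob(3) and globfree; not part of ISO C++
#include <string>
#include <vector>

// Hypothetical helper: returns the paths matching a shell-style pattern
// such as "*.k". Returns an empty vector on no match or on error.
std::vector<std::string> expand_pattern(const std::string& pattern) {
    std::vector<std::string> matches;
    glob_t results = glob_t();   // zero-initialized so globfree is always safe
    if (glob(pattern.c_str(), 0, nullptr, &results) == 0) {
        for (size_t i = 0; i < results.gl_pathc; ++i) {
            matches.push_back(results.gl_pathv[i]);
        }
    }
    globfree(&results);
    return matches;
}
```

Calling expand_pattern("*.k") from main would then give the same list the shell would have passed as arguments for ./myprogram *.k, with the pattern resolved relative to the current working directory.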

How to pass command-line arguments to a Windows application correctly?

Passing command-line arguments to an application in Linux just works fine with the exec* commands where you clearly pass each argument on its own. Doing so on Windows using the same functions is not an option if one wants to control the standard pipes. As those functions are based on CreateProcess() there are some clear rules on how to escape special characters like double quotes.
Sadly, this only works correctly as long as the called application retrieves its command-line arguments via main(), wmain() or CommandLineToArgvW(). However, if the called application gets those arguments via WinMain(), wWinMain(), GetCommandLineA() or GetCommandLineW() it is up to the application how to parse the command-line arguments as it gets the whole command-line rather than argument by argument.
That means a simple application named test using main() as entry point gets "abc" if called as test.exe \"abc\". Calling cmd.exe as cmd.exe /c "echo \"abc\"" will not output "abc" as expected but \"abc\".
This leads to my question: How is it possible to pass command-line arguments to Windows applications in a generic way despite these quirks?
In Windows, you need to think about the command as a whole, not as a list of individual arguments. Applications are not obliged to parse the command into arguments in any particular way, or indeed at all; consider the example of the echo command, which treats the command line as a single string.
This can be a problem for runtime library developers, because it means there is no reliable way to implement a POSIX-like exec function. Some library developers take the straightforward approach, and require the programmer to provide quote marks as necessary, and some attempt to quote the arguments automatically. In the latter case it is essential to provide some method for the programmer to specify the command line as a whole, disabling any automatic quotation, even if that means a Windows-specific extension.
However, in your scenario (as described in the comments) there shouldn't be a problem. All you have to do is make sure you ask the user for a command, not for a list of arguments. Your program simply doesn't need to know how the command will be split up into arguments, if at all; understanding the command's syntax is the user's job. [NB: if you don't think this is true, you need to explain your scenario much more clearly. Provide an example of what the user might enter and how you think your application would need to process it.]
PS: since you mentioned the C library _exec functions, note that they don't work as you might be expecting. The arguments are not passed individually to the child, since that's impossible; in the Microsoft C runtime, if I remember correctly, the arguments are simply combined together into a single string, with a single space as the delimiter, so ("hello there") will be indistinguishable from ("hello", "there").
PPS: note that calling cmd.exe to parse the command introduces an additional (and much more complicated) layer of processing. Generally speaking taking that into account would still be the user's job, but you may want to be aware of it. The escape character for cmd.exe processing is the caret.
It is the C language that requires a backslash before a double quote in C code. There is no such rule for shell processing. So if you are writing code to call CreateProcess and passing the literal string "abc", then you need to use backslashes because you are writing in C. But if you are writing a shell script to invoke your app and pass "abc", e.g. the echo example, then you don't use backslashes because there is no C code involved.
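For the case where the target program does split its command line with the Microsoft C runtime or CommandLineToArgvW, the quoting convention can be sketched as below. The function names are my own, this is plain string handling (no Windows API calls), and it does nothing for programs that parse the raw command line themselves or for the extra caret escaping cmd.exe needs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Quote one argument so that a parser following the MS C runtime /
// CommandLineToArgvW convention recovers it unchanged.
std::string quote_argument(const std::string& arg) {
    if (!arg.empty() && arg.find_first_of(" \t\n\v\"") == std::string::npos) {
        return arg;                              // nothing that needs quoting
    }
    std::string out = "\"";
    for (std::string::const_iterator it = arg.begin(); ; ++it) {
        std::size_t backslashes = 0;
        while (it != arg.end() && *it == '\\') { ++it; ++backslashes; }
        if (it == arg.end()) {
            out.append(backslashes * 2, '\\');   // double them before the closing quote
            break;
        } else if (*it == '"') {
            out.append(backslashes * 2 + 1, '\\');  // double them, then escape the quote
            out.push_back('"');
        } else {
            out.append(backslashes, '\\');       // backslashes are literal elsewhere
            out.push_back(*it);
        }
    }
    out.push_back('"');
    return out;
}

// Join already-separated arguments into one command-line string.
std::string build_command_line(const std::vector<std::string>& args) {
    std::string line;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i) line.push_back(' ');
        line += quote_argument(args[i]);
    }
    return line;
}
```

With this, the two calls ("hello there") and ("hello", "there") produce different command lines, which is exactly the distinction that is lost when arguments are simply joined with spaces.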

Create an executable that calls another executable?

I want to make a small application that runs another application multiple times for different input parameters.
Is this already done?
Is it wrong to use system("myAp param"), for each call (of course with different param value)?
I am using kdevelop on Linux-Ubuntu.
From your comments, I understand that instead of:
system("path/to/just_testing p1 p2");
I shall use:
execl("path/to/just_testing", "path/to/just_testing", "p1", "p2", (char *) 0);
Is it true? You are saying that execl is safer than system and it is better to use?
In the non-professional field, using system() is perfectly acceptable, but be warned, people will always tell you that it's "wrong." It's not wrong, it's a way of solving your problem without getting too complicated. It's a bit sloppy, yes, but certainly is still a usable (if a bit less portable) option. The data returned by the system() call will be the return value of the application you're calling. Based on the limited information in your post, I assume that's all you're really wanting to know.
DIFFERENCES BETWEEN SYSTEM AND EXEC
system() will invoke the default command shell, which will execute the command passed as argument.
Your program will stop until the command is executed, then it'll continue.
The value you get back is the shell's termination status (on POSIX systems encoded as for waitpid()); it normally carries the command's exit code, and -1 means that no child process could be created at all.
A plus of system() is that it's part of the standard library.
With exec(), your process (the calling process) is replaced. Moreover you cannot invoke a script or an internal command. You could follow a commonly used technique: Differences between fork and exec
So they are quite different (for further details you could see: Difference between "system" and "exec" in Linux?).
A fairer comparison is between posix_spawn() and system(). posix_spawn() is more complex, but it gives you the child's process ID, so you can wait for it and read the external command's return code directly.
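The fork-then-exec technique mentioned above can be sketched roughly as follows on POSIX systems. run_and_wait is a hypothetical helper name, I use execv (a sibling of execl that takes the arguments as an array), and returning 127 when exec fails simply mirrors the shell's convention.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>   // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>     // fork, execv, _exit
#include <vector>

// Runs the program at `path` with `args` (args[0] is conventionally the
// program name) and waits for it. Returns the child's exit code, 127 if the
// program could not be executed, or -1 if fork/waitpid failed or the child
// was killed by a signal.
int run_and_wait(const std::string& path, const std::vector<std::string>& args) {
    // Build the null-terminated argv array before forking.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        execv(path.c_str(), argv.data());   // replaces the child process on success
        _exit(127);                         // only reached if execv failed
    }
    int status = 0;
    if (waitpid(child, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the question's case this would be called once per parameter set, e.g. run_and_wait("path/to/just_testing", {"just_testing", "p1", "p2"}); no shell is involved, so the arguments are passed through exactly as given.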
SECURITY
system() (or popen()) can be a security risk since certain environment variables (like $IFS / $PATH) can be modified so that your program will execute external programs you never intended it to (i.e. a command is specified without a path name and the command processor path name resolution mechanism is accessible to an attacker).
Also the system() function can result in exploitable vulnerabilities:
when passing an unsanitized or improperly sanitized command string originating from a tainted source;
if a relative path to an executable is specified and control over the current working directory is accessible to an attacker;
if the specified executable program can be spoofed by an attacker.
For further details: ENV33-C. Do not call system()
Anyway... I like Somberdon's answer.

How do you determine full paths from filename command line arguments in a c++ program?

I am writing a program in c++ that accepts a filename as an argument on the command line:
>> ./myprogram ../path/to/file.txt
I know I can simply open an fstream using argv[1], but the program needs more information about the exact location (ie. full pathname) of the file.
I thought about appending argv[1] to getcwd(), however obviously in the example above you'd end up with /path/../path/to/file.txt. Not sure whether fstream would resolve that path automatically, but even if it did, I still don't have the full path without a lot of string processing.
Of course, that method wouldn't work at all if the path provided was already absolute. And since this program may be run on Linux/Windows/etc, simply detecting a starting '/' character won't work to determine whether the argument was a full path or not.
I would think this is a fairly common issue to deal with path names across multiple OSs. So how does one retrieve the full path name of a command line argument and how is this handled between operating systems?
Pathname handling is highly OS-specific: some OS have a hierarchy with just one root (e.g. / on Unix ), some have several roots a la MS-DOS' drive letters; some may have symbolic links, hard links or other kinds of links, which can make traversal tricky. Some may not even have the concept of a "canonical" path to a file (e.g. if a file has hard links, it has multiple names, none of which is more "canonical").
If you've ever tried to do path-name manipulation across multiple OS in Java, you know what I mean :-).
In short, pathname handling is system-specific, so you'll have to do it separately for each OS (family), or use a suitable library.
Edit:
You could look at Apache Portable Runtime, or at Boost (C++ though), both have pathname handling functions.
...you'd end up with /path/../path/to/file.txt. Not sure whether fstream would resolve that path automatically, but even if it did, I still don't have the full path without a lot of string processing.
It does, and you can use /path/../path/ for everything you want without problems.
Anyway, there is no standard function in C++ to do what you want. You would have to do it manually, and it wouldn't be trivial. I suggest you keep the path as it is; it shouldn't cause any problems.
It is OS-dependent. If you are using Linux you can look at realpath(). Windows has something comparable (e.g. GetFullPathName).
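A minimal realpath() sketch for the POSIX side, assuming a system that implements POSIX.1-2008 (where passing a null buffer makes realpath allocate the result). resolve_path is a name I made up; note that realpath also resolves symbolic links and fails if the file does not exist.

```cpp
#include <stdlib.h>   // realpath (POSIX), free
#include <string>

// Hypothetical helper: returns the absolute, canonical form of `path`,
// or an empty string if it cannot be resolved (e.g. the file is missing).
std::string resolve_path(const std::string& path) {
    char* resolved = realpath(path.c_str(), nullptr);  // realpath allocates the buffer
    if (!resolved) return std::string();
    std::string result(resolved);
    free(resolved);                                    // buffer came from malloc
    return result;
}
```

In the question's example, resolve_path(argv[1]) would turn ../path/to/file.txt into an absolute path, and it works the same way when the argument is already absolute.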
AFAIK there is no standard way.
However, you could try this approach (written in pseudocode):
string raw_dirname=get_directory_part(argv[1])
string basename=get_filename_part(argv[1])
string cwd=getcwd()
chdir(raw_dirname)
string absolute_dirname=getcwd()
chdir(cwd)
string absolute_filename=absolute_dirname + separator + basename
but note: I am not quite sure if there are issues when symbolic links come into play.