Turn std::string into array of char* const*'s - c++

I am writing a command shell in C++ using the POSIX API, and have hit a snag. I am executing via execvp(3), so I somehow need to turn the std::string that contains the command into a suitable array of char* const*'s that can be passed to:
int execvp(const char *file, char *const argv[]);
I have been racking my brain for hours but I can't think of any realistic or sane way to do this. Any help or insight on how I can achieve this conversion would be greatly appreciated. Thank you and have a good day!
edit:
As per request of Chnossos, here is an example:
const char *args[] = {"echo", "Hello,", "world!"};
execvp(args[0], args);

Assuming you have a string that contains more than "one argument", you will first have to split the string (a std::vector<std::string> would work to store the separate strings), then for each element in the vector, store the .c_str() of that string into a const char* args[MAXARGS] [or a std::vector<const char*> args; and use args.data() if you don't mind using C++11]. Do not forget to store a 0 or nullptr in the last element.
It is critical if you use c_str() that the string you are basing it on is not a temporary: const char* x = str.substr(11, 33).c_str(); will not give you what you want, because at the end of that statement the temporary string is destroyed and its storage freed.
If you have only one actual argument,
const char* args[2] = { str.c_str(), 0 };
would work.
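A minimal sketch of the steps described above, assuming arguments are separated by whitespace and no quoting is needed. The helper names split_words and make_argv are hypothetical, not part of any library. The std::vector<std::string> owns the characters, so it has to outlive the pointer vector.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split on whitespace; the returned vector owns the text.
std::vector<std::string> split_words(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    for (std::string w; in >> w; ) words.push_back(w);
    return words;
}

// Hypothetical helper: build a null-terminated pointer array that refers
// into `words`. The pointers stay valid only while `words` is alive and
// not modified.
std::vector<char*> make_argv(std::vector<std::string>& words) {
    std::vector<char*> argv;
    argv.reserve(words.size() + 1);
    for (std::string& w : words) argv.push_back(&w[0]);  // mutable pointer, no cast needed
    argv.push_back(nullptr);  // execvp requires a terminating null pointer
    return argv;
}
```

With these, a call could look like: auto words = split_words(cmd); auto argv = make_argv(words); if (!words.empty()) execvp(argv[0], argv.data());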

Example approach:
#include <string>
#include <vector>
#include <cstdio>
#include <cstring>
using namespace std;
// Stand-in for the real execvp from <unistd.h>
int execvp(const char *file, char *const argv[]) {
// doing sth
return 0;
}
int main() {
string s = "echo Hello world!";
char* cs = strdup(s.c_str());
char* lastbeg = cs;
vector<char *> collection;
for (char *itcs = cs; *itcs; itcs++) {
if (*itcs == ' ') {
*itcs = 0;
collection.push_back(lastbeg);
lastbeg = itcs + 1;
}
}
collection.push_back(lastbeg);
for (auto x: collection) {
printf("%s\n", x);
}
collection.push_back(nullptr); // argv must end with a null pointer
execvp(collection[0], &collection[0]);
}
Notice that the memory for cs isn't freed here; in your application you would need to take care of that (strdup allocates with malloc, so release it with free).
The number of elements in the array can be obtained from collection.size().

I use this:
command_line.hpp:
#pragma once
#include <vector>
#include <string>
namespace wpsc { namespace unittest { namespace mock {
class command_line final
{
public:
explicit command_line(std::vector<std::string> args = {});
explicit command_line(int argc, char const * const * const argv);
int argc() const;
/// @remark altering memory returned by this function results in UB
char** argv() const;
std::string string() const;
private:
std::vector<std::string> args_;
mutable std::vector<char*> c_args_;
};
}}} // wpsc::unittest::mock
command_line.cpp:
#include <wpsc/unittest/mock/command_line.hpp>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <utility>
namespace wpsc { namespace unittest { namespace mock {
command_line::command_line(std::vector<std::string> args)
: args_( std::move(args) ), c_args_( )
{
}
command_line::command_line(int argc, char const * const * const argv)
: command_line{ std::vector<std::string>{ argv, argv + argc } }
{
}
int command_line::argc() const
{
return static_cast<int>(args_.size());
}
char ** command_line::argv() const
{
if(args_.empty())
return nullptr;
if(c_args_.size() != args_.size() + 1)
{
c_args_.clear();
using namespace std;
transform(begin(args_), end(args_), back_inserter(c_args_),
[](const std::string& s) { return const_cast<char*>(s.c_str()); }
);
c_args_.push_back(nullptr);
}
return c_args_.data();
}
std::string command_line::string() const
{
using namespace std;
ostringstream buffer;
copy(begin(args_), end(args_), ostream_iterator<std::string>{ buffer, " " });
return buffer.str();
}
}}} // wpsc::unittest::mock
Client code:
int main(int argc, char** argv)
{
wpsc::unittest::mock::command_line cmd1{ argc, argv };
// wpsc::unittest::mock::command_line cmd2{ {"app.exe", "-h"} };
some_app_controller c;
return c.run(cmd1.argc(), cmd1.argv());
}

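If the parsing can actually be really complicated, I'd go with something like this:
std::string cmd = "some really complicated command here";
// execvp takes char *const argv[], so the const pointers need a cast;
// execvp itself does not modify the strings.
char * const args[] =
{
const_cast<char *>("sh"),
const_cast<char *>("-c"),
const_cast<char *>(cmd.c_str()),
nullptr
};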
execvp(args[0], args);

So the problem is the splitting of the line into individual arguments, and filling the argument vector with the respective pointers?
Assuming you want to split at the whitespace in the line, you replace whitespace in the string with null-bytes (in-place). You can then fill the argument vector with pointers into the string.
You will have to write a single loop to go through the string.
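The single loop described above could look like the following sketch. It assumes that any whitespace separates arguments and that no quote handling is wanted; split_in_place is a hypothetical name. The last word needs no explicit terminator because std::string keeps its own trailing '\0'.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Sketch: overwrite each whitespace character in `line` with '\0' and collect
// a pointer to the start of every word. `line` is modified and must outlive
// the returned pointers.
std::vector<char*> split_in_place(std::string& line) {
    std::vector<char*> argv;
    bool in_word = false;
    for (char& c : line) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            c = '\0';            // terminates the word that just ended, if any
            in_word = false;
        } else if (!in_word) {
            argv.push_back(&c);  // first character of a new word
            in_word = true;
        }
    }
    argv.push_back(nullptr);     // terminator expected by execvp
    return argv;
}
```

Runs of several spaces and leading or trailing whitespace produce no empty arguments, because a pointer is stored only at the first non-space character of each word.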

You need to decide what the rules will be for your shell and implement them. That's a significant fraction of the work of making a shell.
You need to write this code, and it's not simple. In a typical shell, echo "Hello world!" has to become { echo, Hello world! }, while echo \"Hello world!\" has to become { echo, "Hello, world!" }, because the escaped quotes no longer group anything. And so on.
What will " do in your shell? What will ' do? You need to make these decisions before you code this part.
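To make the point concrete, here is a sketch of one possible rule set. The rules are an assumption, loosely modeled on sh, and tokenize is a hypothetical name; a real shell has many more cases (variables, globbing, pipes).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed rules for this sketch:
//  - unquoted whitespace separates arguments
//  - '...' keeps everything literal
//  - "..." groups text; inside it, backslash escapes the next character
//  - outside quotes, backslash escapes the next character
// An unterminated quote is simply closed at end of input.
std::vector<std::string> tokenize(const std::string& line) {
    std::vector<std::string> out;
    std::string cur;
    bool have_token = false;  // true once the current token has started
    char quote = '\0';        // active quote character, or '\0' if none
    for (std::size_t i = 0; i < line.size(); ++i) {
        char c = line[i];
        if (quote == '\'') {
            if (c == '\'') quote = '\0'; else cur += c;
        } else if (c == '\\' && i + 1 < line.size()) {
            cur += line[++i];        // take the escaped character literally
            have_token = true;
        } else if (quote == '"') {
            if (c == '"') quote = '\0'; else cur += c;
        } else if (c == '"' || c == '\'') {
            quote = c;
            have_token = true;       // "" is a valid empty argument
        } else if (std::isspace(static_cast<unsigned char>(c))) {
            if (have_token) { out.push_back(cur); cur.clear(); have_token = false; }
        } else {
            cur += c;
            have_token = true;
        }
    }
    if (have_token) out.push_back(cur);
    return out;
}
```

Under these rules, echo "Hello world!" yields two arguments, while echo \"Hello world!\" yields three (echo, "Hello and world!"), since escaped quotes are ordinary characters.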

Related

Why am I getting seg faults from using the istream iterator?

void parse_and_run_command(const std::string &command) {
std::istringstream iss(command);
std::istream_iterator<char*> begin(iss), end;
std::vector<char*> tokens(begin, end); //place the arguments in a vector
tokens.push_back(NULL);
According to GDB, the segfault occurs after executing the second line with the istream_iterator. It did not segfault earlier when I was using string vectors.
You first need to create a std::vector of std::string which will own the string data; you can then transform that std::vector into a std::vector of pointers. Note that the pointers will only be valid for the lifetime of the std::vector of std::string (the non-const std::string::data() used here requires C++17):
#include <string>
#include <iostream>
#include <sstream>
#include <iterator>
#include <vector>
#include <algorithm>
void parse_and_run_command(const std::string &command) {
std::istringstream iss(command);
std::istream_iterator<std::string> begin(iss), end;
std::vector<std::string> tokens(begin, end);
std::vector<char*> ctokens;
std::transform(tokens.begin(), tokens.end(), std::back_inserter(ctokens), [](std::string& s) { return s.data(); });
ctokens.push_back(nullptr);
for (char* s : ctokens) {
if (s) {
std::cout << s << "\n";
}
else {
std::cout << "nullptr\n";
}
}
}
int main() {
parse_and_run_command("test test2 test3");
}
First, you need to split the std::string command into a list of tokens of type std::vector<std::string>. Then, you may want to use std::transform in order to fill the new list of tokens of type std::vector<char const*>.
Here is some sample code:
void parse_and_run_command(std::string const& command) {
std::istringstream iss(command);
std::vector<std::string> results(std::istream_iterator<std::string>{iss},
std::istream_iterator<std::string>());
// debugging
for (auto const& token : results) {
std::cout << token << " ";
}
std::cout << std::endl;
std::vector<const char*> pointer_results;
pointer_results.resize(results.size(), nullptr);
std::transform(
std::begin(results), std::end(results),
std::begin(pointer_results),
[](std::string const& str) {
return str.c_str();
}
);
// debugging
for (auto const& token : pointer_results) {
std::cout << token << " ";
}
std::cout << std::endl;
// execv expects NULL as last element
pointer_results.push_back(nullptr);
char **cmd = const_cast<char**>(pointer_results.data());
execv(cmd[0], &cmd[0]);
}
Note the last part of the function: execv expects last element to be nullptr.
Hm, very interesting. Sounds like an easy task, but there are several caveats.
First of all, we need to consider that there are at least 2 different implementations of execv.
There is one under POSIX / Linux and a Windows version.
Please note the different function signatures:
Linux / POSIX: int execv(const char *path, char *const argv[]);
Windows: intptr_t _execv(const char *cmdname, const char *const *argv);
In this case I find the Windows version a little bit cleaner, because the argv parameter is of type const char *const *. Anyway, the major problem is that we have to call legacy code.
Ok, let's see.
The execv function requires a NULL-terminated array of char pointers with the argument for the function call. This we need to create.
We start with a std::string containing the command. This needs to be split up into parts. There are several ways and I added different examples.
The simplest way is probably to put the std::string into a std::istringstream and then use std::istream_iterator to split it into parts. This is the typical short sequence:
// Put this into istringstream
std::istringstream iss(command);
// Split
std::vector parts(std::istream_iterator<std::string>(iss), {});
We use the range constructor for the std::vector. And we can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction").
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator will be constructed from the empty brace-enclosed default initializer with the correct type, because it will be deduced to be the same as the type of the first argument due to the std::vector constructor requiring that.
We can avoid the usage of std::istringstream and directly convert the string into tokens using std::sregex_token_iterator. It is very simple to use, and the result is a one-liner for splitting the original command string:
// Split
std::vector<std::string> parts(std::sregex_token_iterator(command.begin(), command.end(), re, -1), {});
All this then boils down to 6 lines of code, including the definition of the variable and the invocation of the execv function:
Please see:
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>
#include <memory>
#include <algorithm>
#include <regex>
const std::regex re{ " " };
// Define Dummy function for _execv (Windows style, everything const)
// Note: Type of argv decays to " const char* const* "
int _execv(const char* path, const char* const argv[]) {
std::cout << "\n\nPath: " << path << "\n\nArguments:\n\n";
while (*argv != 0) std::cout << *argv++ << "\n";
return 0;
}
// Define Dummy function for _execv (Posix style)
// Note: Type of argv decays to " char* const* "
int execv(const char* path, char* const argv[]) {
std::cout << "\n\nPath: " << path << "\n\nArguments:\n\n";
while (*argv != 0) std::cout << *argv++ << "\n";
return 0;
}
int main() {
{
// ----------------------------------------------------------------------
// Solution 1
// Initial example
char path[] = "path";
const char* const argv[] = { "arg1", "arg2", "arg3", 0 };
_execv(path, argv);
}
{
// ----------------------------------------------------------------------
// Solution 2
// Now, string, with command convert to a handmade argv array
std::string command{ "path arg1 arg2 arg3" };
// Put this into istringstream
std::istringstream iss(command);
// Split into substrings
std::vector parts(std::istream_iterator<std::string>(iss), {});
// Create the "argv" list: every part (program name first) plus the terminating null
std::unique_ptr<const char*[]> argv = std::make_unique<const char*[]>(parts.size() + 1);
// Fill argv array; by convention argv[0] is the program name
size_t i = 0U;
for (; i < parts.size(); ++i) {
argv[i] = parts[i].c_str();
}
argv[i] = nullptr;
// Call execv
// Windows
_execv(parts[0].c_str(), argv.get());
// Linux / Posix
execv(parts[0].c_str(), const_cast<char* const*>(argv.get()));
}
{
// ----------------------------------------------------------------------
// Solution 3
// Transform string vector to vector of char*
std::string command{ "path arg1 arg2 arg3" };
// Put this into istringstream
std::istringstream iss(command);
// Split
std::vector parts(std::istream_iterator<std::string>(iss), {});
// Fill argv
std::vector<const char*> argv{};
std::transform(parts.begin(), parts.end(), std::back_inserter(argv), [](const std::string& s) { return s.c_str(); });
argv.push_back(static_cast<const char*>(0));
// Call execv; by convention argv[0] is the program name and stays in the array
// Windows
_execv(argv[0], argv.data());
// Linux / Posix
execv(argv[0], const_cast<char* const*>(argv.data()));
}
{
// ----------------------------------------------------------------------
// Solution 4
// Transform string vector to vector of char*. Get rid of istringstream
std::string command{ "path arg1 arg2 arg3" };
// Split
std::vector<std::string> parts(std::sregex_token_iterator(command.begin(), command.end(), re, -1), {});
// Fill argv
std::vector<const char*> argv{};
std::transform(parts.begin(), parts.end(), std::back_inserter(argv), [](const std::string& s) { return s.c_str(); });
argv.push_back(static_cast<const char*>(0));
// Call execv; by convention argv[0] is the program name and stays in the array
// Windows
_execv(argv[0], argv.data());
// Linux / Posix
execv(argv[0], const_cast<char* const*>(argv.data()));
}
return 0;
}

How to Loop Through a const char**?

I have a const char** called glfwNames which holds the C version of a string array of the required GLFW library extensions. Would it be possible to loop through either the const char* (string), or the individual characters of the string separated by '\0'?
const char** glfwNames = glfwGetRequiredInstanceExtensions(&glfwCount);
for (const char** name = glfwNames; *name; ++name)
{
slog("GLFW Extensions to use: %s", *name);
}
This is what I've attempted from one of the answers. The return value of glfwGetRequiredInstanceExtensions is an array of extension names required by GLFW: http://www.glfw.org/docs/latest/group__vulkan.html#ga1abcbe61033958f22f63ef82008874b1
If glfwNames is nullptr-terminated:
#include <cstdio>
int main()
{
char const *glfwNames[] = { "foo", "bar", "baz", nullptr };
for (char const **p = glfwNames; *p; ++p)
std::puts(*p);
}
If you *know* the number of strings:
std::uint32_t glfwCount;
const char** glfwNames = glfwGetRequiredInstanceExtensions(&glfwCount);
for (std::uint32_t i{}; i < glfwCount; ++i)
{
slog("GLFW Extensions to use: %s", glfwNames[i]);
}
To also loop through the individual chars:
for (std::uint32_t i{}; i < glfwCount; ++i)
{
for(char const *p{ glfwNames[i] }; *p; ++p)
std::putchar(*p);
}
A common pattern that I use to loop through the arguments to main is via std::for_each:
#include <algorithm>
int main(int argc, char* argv[]) {
std::for_each( argv + 1, argv + argc, handler );
}
where handler is any function taking a const char*, const std::string&, or std::string_view (I use the latter).
Would a similar approach work for your problem? Notice that this approach requires you to know the length of your array of strings.
As a side note, it is important to know that the return value of std::for_each is the function provided (handler in this case). That enables the suggested pattern to make a last call once the input is known to have been exhausted:
#include <algorithm>
int main(int argc, char* argv[]) {
std::for_each( argv + 1, argv + argc, handler )("Argument To Last Call");
}
This can be used to implement state machines that receive the termination trigger at the end.
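As an illustration of that pattern, here is a small sketch with a stateful function object. Collector and collect_args are hypothetical names, and the "<end>" marker is an arbitrary choice for this example. Because std::for_each returns the function object, the state gathered during the loop is still available for the final call.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stateful handler: records every argument it is given.
struct Collector {
    std::vector<std::string> seen;
    void operator()(const char* arg) { seen.emplace_back(arg); }
};

// Applies a Collector to argv[1..argc) and then makes one final call with a
// marker, mirroring the "last call" pattern. Assumes argc >= 1.
std::vector<std::string> collect_args(int argc, const char* const* argv) {
    Collector c = std::for_each(argv + 1, argv + argc, Collector{});
    c("<end>");  // termination trigger; the marker text is arbitrary
    return c.seen;
}
```

A state machine would replace the vector with its current state and treat the final call as the end-of-input event.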

How to create and return string from function?

Would like to generate a string from a function, in order to format some data, so the function should return a string.
Tried to do the "obvious", shown below, but this prints garbage:
#include <iostream>
#include <string>
char * hello_world()
{
char res[13];
memcpy(res, "Hello world\n", 13);
return res;
}
int main(void)
{
printf(hello_world());
return 0;
}
I think this is because the memory on the stack used for the res variable, defined in the function, is overwritten before the value can be written, maybe when the printf call uses the stack.
If I move char res[13]; outside the function, thus making it global, then it works.
So is the answer to have a global char buffer (string) that can be used for the result?
Maybe doing something like:
char * hello_world(char * res)
{
memcpy(res, "Hello world\n", 13); // 11 characters + newline + 0 for string termination
return res;
}
char res[13];
int main(void)
{
printf(hello_world(res));
return 0;
}
Don't bother with that early-20th century stuff. By the end of the previous century we already had std::string, and that's straightforward:
#include <iostream>
#include <string>
std::string hello_world()
{
return "Hello world\n";
}
int main()
{
std::cout << hello_world();
}
You are programming in C. That's not bad, but your question is about C++, so this is the solution for the question you asked:
std::string hello_world()
{
std::string temp;
// todo: do whatever string operations you want here
temp = "Hello World";
return temp;
}
int main()
{
std::string result = hello_world();
std::cout << result << std::endl;
return 0;
}
Best solution would be to use std::string. However, if you must use an array, then it is best to allocate it in the calling function (in this case, main()):
#include <iostream>
#include <cstring>
void hello_world(char * s)
{
memcpy(s, "Hello world\n", 13);
}
int main(void)
{
char mys[13];
hello_world(mys);
std::cout<<mys;
return 0;
}
Still, if you want to write pure C code, you can do something like this.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *HelloWorld(char *s, int size)
{
snprintf(s, size, "Hello world!\n"); /* bounded write; size is the capacity of s */
return s;
}
int main (int argc, char *argv[])
{
char s[100];
printf(HelloWorld(s, 100));
return 0;
}

C++ regex - replacing substring in char

I have written the following function to replace substrings in a char. This way involves converting to a std::string then converting back to a const char. Is this the most efficient way or could I do it without this conversion or even a better way?!
const char* replaceInString(const char* find, const char* str, const char* replace)
{
std::string const text(str);
std::regex const reg(find);
std::string const newStr = std::regex_replace(text, reg, replace);
//Convert back to char
char *newChar = new char[newStr.size() + 1];
std::copy(newStr.begin(), newStr.end(), newChar);
newChar[newStr.size()] = '\0'; // terminating 0
return newChar;
}
const char* find = "hello";
const char* replace = "goodbye";
const char* oldStr = "hello james";
const char* newStr = m->replaceInString(find, oldStr, replace);
Assuming that you want a function to be called from a .c file you could use strdup (see code below).
#include <string>
#include <regex>
#include <cstdio>
#include <cstring> // strdup (POSIX)
#include <cstdlib>
char* replaceInString(char const *find, char const *str, char const *replace)
{
return strdup(std::regex_replace(std::string(str), std::regex(find), replace).c_str());
}
int main()
{
char *newStr = replaceInString("hello", "hello james", "goodbye");
printf("newStr = %s\n", newStr);
free(newStr);
return 0;
}
Note however that you have to free the returned memory after you're done.
Otherwise, as @jerry-coffin suggested, go all the way with std::string (see code below):
#include <string>
#include <regex>
#include <iostream>
std::string replaceInString(std::string const &find, std::string const &str, std::string const &replace)
{
return std::regex_replace(std::string(str), std::regex(find), replace);
}
int main()
{
std::string str = replaceInString(std::string("hello"), std::string("hello james"), std::string("goodbye"));
std::cout << str << std::endl;
return 0;
}

using strstr() function is breaking

I am using the strstr() function but I am getting a crash.
This part of code is crashing with error "Access violation reading location 0x0000006c."
strstr(p_czCharactersToDelete, (const char*)p_czInputString[index]))
Here is the complete code...
#include "stdafx.h"
#include <iostream>
#include <string>
void delchar(char* p_czInputString, const char* p_czCharactersToDelete)
{
for (size_t index = 0; index < strlen(p_czInputString); ++index)
{
if(NULL != strstr(p_czCharactersToDelete, (const char*)p_czInputString[index]))
{
printf_s("%c",p_czInputString[index]);
}
}
}
int main(int argc, char* argv[])
{
char c[32];
strncpy_s(c, "life of pie", 32);
delchar(c, "def");
// will output 'li o pi'
std::cout << c << std::endl;
}
The prototype of strstr() is as follows,
char * strstr ( char * str1, const char * str2 );
The function is used to locate a substring within a main string. It returns a pointer to the first occurrence of str2 in str1, or a null pointer if str2 is not part of str1.
In your case you are passing the wrong parameters to strstr(). You are calling
strstr(p_czCharactersToDelete, (const char*)p_czInputString[index]), which is wrong: p_czInputString[index] is a single char, and the cast turns its value into a pointer ('l' is 0x6c, hence the access violation reading location 0x0000006c). Also, the pointer p_czCharactersToDelete points to the substring constant and p_czInputString points to the main string. Call strstr() as strstr(p_czInputString, p_czCharactersToDelete); and make corresponding changes in the function delchar().
You are using the wrong function: strstr() searches for a substring, not for a single character.
You probably need strchr or strpbrk.
#include <cstring>
#include <algorithm>
class Include {
public:
Include(const char *list){ m_list = list; }
bool operator()(char ch) const
{
return ( strchr(m_list, ch) != NULL );
}
private:
const char *m_list;
};
void delchar(char* p_czInputString, const char* p_czCharactersToDelete){
Include inc(p_czCharactersToDelete);
char *last = std::remove_if(p_czInputString, p_czInputString + strlen(p_czInputString), inc);
*last = '\0';
}
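For comparison, a sketch of the same idea with a lambda and std::string instead of a functor class and a raw buffer (C++11 or later). The name remove_chars is hypothetical.

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Hypothetical helper: return a copy of `input` without any character that
// occurs in `to_delete`.
std::string remove_chars(std::string input, const char* to_delete) {
    auto is_listed = [to_delete](char ch) {
        // strchr also matches the terminating '\0', so exclude it explicitly
        return ch != '\0' && std::strchr(to_delete, ch) != nullptr;
    };
    // erase-remove idiom: remove_if compacts the kept characters to the front,
    // erase drops the leftover tail
    input.erase(std::remove_if(input.begin(), input.end(), is_listed), input.end());
    return input;
}
```

Called as remove_chars("life of pie", "def"), this gives "li o pi", the output the question expects.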