Combining std::string and std::vector<char> - c++

This is not the actual code, but this represents my problem.
std::string str1 = "head";
char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(buffer, buffer + strlen(buffer));
I want to put str1 and str2 to mainStr in an order:
headbody\0bodyfoot
So the binary data is maintained. Is this possible to do this?
PS: Thanks for telling the strlen part is wrong. I just used it to represent buffer's length. :)

There should be some way of defining length of data in "buffer".
Usually character 0 is used for this and most of standard text functions assume this. So if you use character 0 for other purposes, you have to provide another way to find out length of data.
Just for example:
char buffer[]="body\0body";
std::vector<char> mainStr(buffer,buffer+sizeof(buffer)/sizeof(buffer[0]));
Here we use array because it provides more information that a pointer - size of stored data.

You cannot use strlen as it uses '\0' to determine the end of string. However, the following will do what you are looking for:
std::string head = "header";
std::string foot = "footer";
const char body[] = "body\0body";
std::vector<char> v;
v.assign(head.begin(), head.end());
std::copy(body, body + sizeof(body)/sizeof(body[0]) - 1, std::back_inserter<std::vector<char> >(v));
std::copy(foot.begin(), foot.end(), std::back_inserter<std::vector<char> >(v));
Because the character buffer adds an NUL character at the end of the string, you'll want to ignore it (hence the -1 from the last iterator).

btw. strlen will not work if there are nul bytes in your string!
The code to insert into the vector is:
front:
mainStr.insert(mainStr.begin(), str1.begin(), str1.end());
back:
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
With your code above (using strlen will print)
headbodyfoot
EDIT: just changed the copy to insert as copy requires the space to be available I think.

You could use std::vector<char>::insert to append the data you need into mainStr.
Something like this:
std::string str1 = "head";
char buffer[] = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(str1.begin(), str1.end());
mainStr.insert(mainStr.end(), buffer, buffer + sizeof(buffer)/sizeof(buffer[0]));
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
Disclaimer: I didn't compile it.

You can use IO streams.
std::string str1 = "head";
const char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::stringstream ss;
ss.write(str1.c_str(), str1.length())
.write(buffer, 9) // insert real length here
.write(str2.c_str(), str2.length());
std::string result = ss.str();
std::vector<char> vec(result.c_str(), result.c_str() + result.length());

str1 and str2 are string objects that write the text.
I wish compilers would fail on statements like the declaration of buffer and I don't care how much legacy code it breaks. If you're still building it you can still fix it and put in a const.
You would need to change your declaration of vector because strlen will stop at the first null character. If you did
char buffer[] = "body\0body";
then sizeof(buffer) would actually give you close to what you want although you'll get the end null-terminator too.
Once your vector mainStr is then set up correctly you could do:
std::string strConcat;
strConcat.reserve( str1.size() + str2.size() + mainStr.size() );
strConcat.assign(str1);
strConcat.append(mainStr.begin(), mainStr.end());
strConcat.append(str2);
if vector was set up using buffer, buffer+sizeof(buffer)-1

mainStr.resize(str1.length() + str2.length() + strlen(buffer));
memcpy(&mainStr[0], &str1[0], str1.length());
memcpy(&mainStr[str1.length()], buffer, strlen(buffer));
memcpy(&mainStr[str1.length()+strlen(buffer)], &str2[0], str2.length());

Related

Copy a part of an std::string in a char* pointer

Let's suppose I've this code snippet in C++
char* str;
std::string data = "This is a string.";
I need to copy the string data (except the first and the last characters) in str.
My solution that seems to work is creating a substring and then performing the std::copy operation like this
std::string substring = data.substr(1, size - 2);
str = new char[size - 1];
std::copy(substring.begin(), substring.end(), str);
str[size - 2] = '\0';
But maybe this is a bit overkilling because I create a new string. Is there a simpler way to achieve this goal? Maybe working with offets in the std:copy calls?
Thanks
As mentioned above, you should consider keeping the sub-string as a std::string and use c_str() method when you need to access the underlying chars.
However-
If you must create the new string as a dynamic char array via new you can use the code below.
It checks whether data is long enough, and if so allocates memory for str and uses std::copy similarly to your code, but with adapted iterators.
Note: there is no need to allocate a temporary std::string for the sub-string.
The Code:
#include <string>
#include <iostream>
int main()
{
std::string data = "This is a string.";
auto len = data.length();
char* str = nullptr;
if (len > 2)
{
auto new_len = len - 2;
str = new char[new_len+1]; // add 1 for zero termination
std::copy(data.begin() + 1, data.end() - 1, str); // copy from 2nd char till one before the last
str[new_len] = '\0'; // add zero termination
std::cout << str << std::endl;
// ... use str
delete[] str; // must be released eventually
}
}
Output:
his is a string
There is:
int length = data.length() - 1;
memcpy(str, data.c_str() + 1, length);
str[length] = 0;
This will copy the string in data, starting at position [1] (instead of [0]) and keep copying until length() - 1 bytes have been copied. (-1 because you want to omit the first character).
The final character then gets overwritten with the terminating \0, finalizing the string and disposing of the final character.
Of course this approach will cause problems if the string does not have at least 1 character, so you should check for that beforehand.

splite string using strtok function for given string not working on strings [duplicate]

I tried the following code from one of the guys who answer my previous question.
My case is I am trying to get the value 1.2597 in this string, and my non functional requirement is to use strtok instead of boost which is recommend by many fellow coders here.
However I got a issue casting result into a cstr* that can use the strtok.
I want to know how I can get 1.2597, and echo it out with strtok
string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
char *first = strtok(result, ' ');
char *second = strtok(0, ' ');
You can't use strtok directly on a C++ std::string. It requires a mutable zero-terminated C-style string, and there is no standard way to access the contents of a std::string in that form.
The simplest option (without using Boost or other non-standard libraries) is to use a C++ stream instead:
std::stringstream stream(result);
std::string first, second;
stream >> first >> second;
That could also convert the second token directly into a numeric type, if that's what you want.
If you really want to use strtok for some reason, then one possibility is to copy the string into a temporary mutable buffer:
std::vector<char> temp(result.begin(), result.end());
temp.push_back(0);
char * first = strtok(&temp[0], ' ');
char * second = strtok(0, ' ');
Be aware of the other major fault with strtok: it is not thread safe.
This works:
std::string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
size_t pos0 = result.find(' ');+1
size_t pos1 = result.find(' ',pos0);
std::string final_result = result.substr(pos0,pos1-pos0);
Yeah, as Mike Seymore stated, you have to copy your std::string to a mutable buffer. Already including string.h, simply use strcpy to copy the std::string to a mutable char*. Dont forget to include space for null termination + actually inserting the '\0' ;D
std::string result = "...";
char* token;
char* buffer[res.length() + 1]; //Space for '\0'
strcpy(buffer, result.c_str());
buffer[res.length()] = '\0'; //insert '\0'
token = strtok(buffer, " ");
while (token != NULL) {
/* work with token */
token = strtok(NULL, " ");
}

StringCchCat does not append source string to destination string [duplicate]

Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");
strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.
In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
Because stuff is uninitialized before the call to strcpy. After the declaration stuff isn't an empty string, it is uninitialized data.
strcat appends data to the end of a string - that is it finds the null terminator in the string and adds characters after that. An uninitialized string isn't gauranteed to have a null terminator so strcat is likely to crash.
If there were to intialize stuff as below you could perform the strcat's:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");
Strcat append a string to existing string. If the string array is empty, it is not going go find end of string ('\0') and it will cause run time error.
According to Linux man page, simple strcat is implemented this way:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
As you can see in this implementation, strlen(dest) will not return correct string length unless dest is initialized to correct c string values. You may get lucky to have an array with the first value of zero at char stuff[100]; , but you should not rely on it.
Also, I would advise against using strcpy or strcat as they can lead to some unintended problems.
Use strncpy and strncat, as they help prevent buffer overflows.

How to check the contents of a LPTSTR string?

I'm trying to understand why a segmentation fault (SIGSEGV) occurs during the execution of this piece of code. This error occurs when testing the condition specified in the while instruction, but it does not occur at the first iteration, but at the second iteration.
LPTSTR arrayStr[STR_COUNT];
LPTSTR inputStr;
LPTSTR str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, (char*)&inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, (char*)str);
arrayStr[i] = str;
str = str + strlen((char*)str) + 1;
i++;
}
After reading this answer, I have done some research on the internet and found this article, so I tried to modify the above code, using this piece of code read in this article (see below). However, this change did not solve the problem.
for (LPTSTR pszz = pszzStart; *pszz; pszz += lstrlen(pszz) + 1) {
... do something with pszz ...
}
As assumed in this answer, it seems that the code expects double null terminated arrays of string. Therefore, I wonder how I could check the contents of the inputStr string, in order to check if it actually contains only one null terminator char.
NOTE: the number of characters in the string printed from printf instruction is twice the value returned by the lstrlen(str) function call at the first iteration.
OK, now that you've included the rest of the code it is clear that it is indeed meant to parse a set of consecutive strings. The problem is that you're mixing narrow and wide string types. All you need to do to fix it is change the variable definitions (and remove the casts):
char *arrayStr[STR_COUNT];
char *inputStr;
char *str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, &inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, str);
arrayStr[i] = str;
str = str + strlen(str) + 1;
i++;
}
Specifically, the issue was occurring on this line:
while( *str != '\0' )
since you hadn't cast str to char * the comparison was looking for a wide nul rather than a narrow nul.
str = str + strlen(str) + 1;
You go out of bounds, change to
str = str + 1;
or simply:
str++;
Of course you are inconsistently using TSTR and strlen, the latter assuming TCHAR = char
In any case, strlen returns the length of the string, which is the number of characters it contains not including the nul character.
Your arithmetic is out by one but you know you have to add one to the length of the string when you allocate the buffer.
Here however you are starting at position 0 and adding the length which means you are at position len which is the length of the string. Now the string runs from offset 0 to offset len - 1 and offset len holds the null character. Offset len + 1 is out of bounds.
Sometimes you might get away with reading it, if there is extra padding, but it is undefined behaviour and here you got a segfault.
This looks to me like code that expects double null terminated arrays of strings. I suspect that you are passing a single null terminated string.
So you are using something like this:
const char* inputStr = "blah";
but the code expects two null terminators. Such as:
const char* inputStr = "blah\0";
or perhaps an input value with multiple strings:
const char* inputStr = "foo\0bar\0";
Note that these final two strings are indeed double null terminated. Although only one null terminator is written explicitly at the end of the string, the compiler adds another one implicitly.
Your question edit throws a new spanner in the works? The cast in
strlen((char*)str)
is massively dubious. If you need to cast then the cast must be wrong. One wonders what LPTSTR expands to for you. Presumably it expands to wchar_t* since you added that cast to make the code compile. And if so, then the cast does no good. You are lying to the compiler (str is not char*) and lying to the compiler never ends well.
The reason for the segmentation fault is already given by Alter's answer. However, I'd like to add that the usual style of parsing a C-style string is more elegant and less verbose
while (char ch = *str++)
{
// other instructions
// ...
}
The scope of ch is only within in the body of the loop.
Aside: Either tag the question as C or C++ but not both, they're different languages.

C++ usage of strtok() on string

I tried the following code from one of the guys who answer my previous question.
My case is I am trying to get the value 1.2597 in this string, and my non functional requirement is to use strtok instead of boost which is recommend by many fellow coders here.
However I got a issue casting result into a cstr* that can use the strtok.
I want to know how I can get 1.2597, and echo it out with strtok
string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
char *first = strtok(result, ' ');
char *second = strtok(0, ' ');
You can't use strtok directly on a C++ std::string. It requires a mutable zero-terminated C-style string, and there is no standard way to access the contents of a std::string in that form.
The simplest option (without using Boost or other non-standard libraries) is to use a C++ stream instead:
std::stringstream stream(result);
std::string first, second;
stream >> first >> second;
That could also convert the second token directly into a numeric type, if that's what you want.
If you really want to use strtok for some reason, then one possibility is to copy the string into a temporary mutable buffer:
std::vector<char> temp(result.begin(), result.end());
temp.push_back(0);
char * first = strtok(&temp[0], ' ');
char * second = strtok(0, ' ');
Be aware of the other major fault with strtok: it is not thread safe.
This works:
std::string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
size_t pos0 = result.find(' ');+1
size_t pos1 = result.find(' ',pos0);
std::string final_result = result.substr(pos0,pos1-pos0);
Yeah, as Mike Seymore stated, you have to copy your std::string to a mutable buffer. Already including string.h, simply use strcpy to copy the std::string to a mutable char*. Dont forget to include space for null termination + actually inserting the '\0' ;D
std::string result = "...";
char* token;
char* buffer[res.length() + 1]; //Space for '\0'
strcpy(buffer, result.c_str());
buffer[res.length()] = '\0'; //insert '\0'
token = strtok(buffer, " ");
while (token != NULL) {
/* work with token */
token = strtok(NULL, " ");
}