C++ usage of strtok() on string

I tried the following code, which came from an answer to my previous question.
I am trying to extract the value 1.2597 from this string, and my non-functional requirement is to use strtok instead of Boost, which many fellow coders here recommend.
However, I have a problem converting result into a C-style string that strtok can use.
I want to know how I can get 1.2597 and print it using strtok.
string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
char *first = strtok(result, ' ');
char *second = strtok(0, ' ');

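You can't use strtok directly on a C++ std::string. It requires a mutable zero-terminated C-style string, and c_str() only gives you a const pointer that must not be written through. (Since C++17 the non-const data() overload does expose a mutable buffer, but strtok would still overwrite the delimiters inside your string.)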
The simplest option (without using Boost or other non-standard libraries) is to use a C++ stream instead:
std::stringstream stream(result);
std::string first, second;
stream >> first >> second;
That could also convert the second token directly into a numeric type, if that's what you want.
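As a minimal sketch of that numeric conversion, assuming the second token is always a plain decimal number: parse_rate is a hypothetical helper name, not something from the question.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: reads the first whitespace-separated token as a label
// and the second as a double. Returns false if either extraction fails.
bool parse_rate(const std::string& line, std::string& label, double& rate)
{
    std::istringstream stream(line);
    return static_cast<bool>(stream >> label >> rate);
}
```

Checking the return value matters here, because a line with a missing or non-numeric second token leaves the stream in a failed state.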
If you really want to use strtok for some reason, then one possibility is to copy the string into a temporary mutable buffer:
std::vector<char> temp(result.begin(), result.end());
temp.push_back(0);
char * first = strtok(&temp[0], " ");
char * second = strtok(0, " ");
Be aware of the other major fault with strtok: it is not thread safe.
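If thread safety is a concern, one option is a tokenizer that keeps no hidden state between calls. The following is only a sketch of that idea, not a drop-in replacement for strtok; split_tokens is a hypothetical helper name, and like strtok it skips empty tokens.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on any character in `delims`, skipping
// empty tokens. All state lives in local variables, so it is re-entrant.
std::vector<std::string> split_tokens(const std::string& text, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = text.find_first_of(delims, start);
        // When end is npos, substr simply takes the rest of the string.
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

With the string from the question and " " as the delimiter set, element [1] of the returned vector is "1.2597".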

This works:
std::string result= "CCY 1.2597 Down 0.0021(0.16%) 14:32 SGT [44]";
size_t pos0 = result.find(' ') + 1;
size_t pos1 = result.find(' ',pos0);
std::string final_result = result.substr(pos0,pos1-pos0);

Yeah, as Mike Seymore stated, you have to copy your std::string to a mutable buffer. With <cstring> already included, simply use strcpy to copy the std::string into a mutable char buffer. Don't forget to leave room for the null terminator; strcpy copies it for you.
std::string result = "...";
char* token;
std::vector<char> buffer(result.length() + 1); // space for '\0'
strcpy(buffer.data(), result.c_str()); // also copies the terminating '\0'
token = strtok(buffer.data(), " ");
while (token != NULL) {
/* work with token */
token = strtok(NULL, " ");
}

Related

Copy a part of an std::string in a char* pointer

Let's suppose I have this code snippet in C++:
char* str;
std::string data = "This is a string.";
I need to copy the string data (except the first and the last characters) into str.
My solution that seems to work is creating a substring and then performing the std::copy operation like this
std::string substring = data.substr(1, size - 2);
str = new char[size - 1];
std::copy(substring.begin(), substring.end(), str);
str[size - 2] = '\0';
But maybe this is overkill, because I create a new string. Is there a simpler way to achieve this? Maybe by working with offsets in the std::copy call?
Thanks

As mentioned above, you should consider keeping the sub-string as a std::string and use c_str() method when you need to access the underlying chars.
However, if you must create the new string as a dynamic char array via new, you can use the code below.
It checks whether data is long enough, and if so allocates memory for str and uses std::copy similarly to your code, but with adapted iterators.
Note: there is no need to allocate a temporary std::string for the sub-string.
The Code:
#include <algorithm> // std::copy
#include <string>
#include <iostream>
int main()
{
std::string data = "This is a string.";
auto len = data.length();
char* str = nullptr;
if (len > 2)
{
auto new_len = len - 2;
str = new char[new_len+1]; // add 1 for zero termination
std::copy(data.begin() + 1, data.end() - 1, str); // copy from 2nd char till one before the last
str[new_len] = '\0'; // add zero termination
std::cout << str << std::endl;
// ... use str
delete[] str; // must be released eventually
}
}
Output:
his is a string

There is:
int length = data.length() - 1;
memcpy(str, data.c_str() + 1, length);
str[length - 1] = 0;
This copies the string in data, starting at position [1] (instead of [0]), until length() - 1 bytes have been copied (-1 because you want to omit the first character).
The final character then gets overwritten with the terminating \0, which finalizes the string and drops the last character.
Of course, this approach causes problems if the string does not have at least 2 characters, so you should check for that beforehand.
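A sketch of the same memcpy idea with the length check built in, assuming it is acceptable to return an empty C string for inputs that are too short. copy_inner is a hypothetical name, and std::unique_ptr is swapped in for the raw new/delete[] of the question so the buffer is released automatically.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical helper: returns a NUL-terminated copy of `data` without its
// first and last characters, or an empty C string if `data` is too short.
std::unique_ptr<char[]> copy_inner(const std::string& data)
{
    const std::size_t inner = data.size() > 2 ? data.size() - 2 : 0;
    std::unique_ptr<char[]> out(new char[inner + 1]);  // +1 for the terminator
    if (inner > 0)
        std::memcpy(out.get(), data.data() + 1, inner);  // skip the first char
    out[inner] = '\0';
    return out;
}
```

If a raw char* is required by some API, get() provides it for as long as the unique_ptr is alive.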

How to check the contents of a LPTSTR string?

I'm trying to understand why a segmentation fault (SIGSEGV) occurs during the execution of this piece of code. This error occurs when testing the condition specified in the while instruction, but it does not occur at the first iteration, but at the second iteration.
LPTSTR arrayStr[STR_COUNT];
LPTSTR inputStr;
LPTSTR str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, (char*)&inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, (char*)str);
arrayStr[i] = str;
str = str + strlen((char*)str) + 1;
i++;
}
After reading this answer, I have done some research on the internet and found this article, so I tried to modify the above code, using this piece of code read in this article (see below). However, this change did not solve the problem.
for (LPTSTR pszz = pszzStart; *pszz; pszz += lstrlen(pszz) + 1) {
... do something with pszz ...
}
As assumed in this answer, it seems that the code expects double null terminated arrays of string. Therefore, I wonder how I could check the contents of the inputStr string, in order to check if it actually contains only one null terminator char.
NOTE: the number of characters in the string printed from printf instruction is twice the value returned by the lstrlen(str) function call at the first iteration.

OK, now that you've included the rest of the code it is clear that it is indeed meant to parse a set of consecutive strings. The problem is that you're mixing narrow and wide string types. All you need to do to fix it is change the variable definitions (and remove the casts):
char *arrayStr[STR_COUNT];
char *inputStr;
char *str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, &inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, str);
arrayStr[i] = str;
str = str + strlen(str) + 1;
i++;
}
Specifically, the issue was occurring on this line:
while( *str != '\0' )
since you hadn't cast str to char * the comparison was looking for a wide nul rather than a narrow nul.

str = str + strlen(str) + 1;
You go out of bounds, change to
str = str + 1;
or simply:
str++;

Of course you are inconsistently using TSTR and strlen, the latter assuming TCHAR = char
In any case, strlen returns the length of the string, which is the number of characters it contains not including the nul character.
Your arithmetic is out by one but you know you have to add one to the length of the string when you allocate the buffer.
Here however you are starting at position 0 and adding the length which means you are at position len which is the length of the string. Now the string runs from offset 0 to offset len - 1 and offset len holds the null character. Offset len + 1 is out of bounds.
Sometimes you might get away with reading it, if there is extra padding, but it is undefined behaviour and here you got a segfault.

This looks to me like code that expects double null terminated arrays of strings. I suspect that you are passing a single null terminated string.
So you are using something like this:
const char* inputStr = "blah";
but the code expects two null terminators. Such as:
const char* inputStr = "blah\0";
or perhaps an input value with multiple strings:
const char* inputStr = "foo\0bar\0";
Note that these final two strings are indeed double null terminated. Although only one null terminator is written explicitly at the end of the string, the compiler adds another one implicitly.
Your question edit throws a new spanner in the works. The cast in
strlen((char*)str)
is massively dubious. If you need to cast then the cast must be wrong. One wonders what LPTSTR expands to for you. Presumably it expands to wchar_t* since you added that cast to make the code compile. And if so, then the cast does no good. You are lying to the compiler (str is not char*) and lying to the compiler never ends well.
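To illustrate the double-null-terminated layout with narrow strings only, here is a minimal sketch. split_multi_string is a hypothetical helper, and it assumes the block really does end with an empty string; if it does not, the loop reads past the end of the buffer, which is the failure described in the question.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: walks a block of consecutive NUL-terminated narrow
// strings that ends with an empty string (two NULs in a row).
std::vector<std::string> split_multi_string(const char* block)
{
    assert(block != nullptr);
    std::vector<std::string> items;
    // strlen(p) + 1 steps over the current string and its terminator.
    for (const char* p = block; *p != '\0'; p += std::strlen(p) + 1)
        items.push_back(p);
    return items;
}
```

Called with the literal "foo\0bar\0", this yields the two strings "foo" and "bar", because the compiler appends the second terminator.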

The reason for the segmentation fault is already given by Alter's answer. However, I'd like to add that the usual style of parsing a C-style string is more elegant and less verbose
while (char ch = *str++)
{
// other instructions
// ...
}
The scope of ch is only within the body of the loop.
Aside: Either tag the question as C or C++ but not both, they're different languages.

Can you change the size of what a pointer points to

For example, suppose a pointer points to an array of chars that reads "Hello how are you?" and you only want the pointer to point to "Hello". I am passing in a char pointer, and when I cout it, it prints the entire array. I tried to cut it down using a for loop that breaks when it hits a ' ', but I am not having any luck figuring it out. Any ideas?
const char *infile(char * file )
{
cout<<file<<endl; //this prints out the entire array
int j;
for(j=0;j<500; j++)
{
if(file[j]==' ')
break;
}
strncpy(file, file, j);
cout<<file<<endl; //how to get this to print out only the first word
}

strncpy() does not append a null terminator if there isn't one in the first j bytes of your source string. And in your case, there isn't.
I think what you want to do is manually change the first space to a \0:
for (j = 0; j < 500; j++) {
if (file[j] == ' ') {
file[j] = '\0';
break;
}
}

First, avoid strtok (like the plague that it mostly is). It's unpleasant but sometimes justifiable in C. I've yet to see what I'd call justification for using it in C++ though.
Second, probably the easiest way to handle this (given that you're using C++) is to use a stringstream:
void infile(char const *file)
{
std::stringstream buffer(file);
std::string word;
buffer >> word;
std::cout << word;
}
Another possibility would be to use some of the functions built into std::string:
void infile(char const *file) {
std::string f(file);
std::cout << std::string(f, 0, f.find(" "));
}
...which, now that I think about it, is probably a bit simpler than the stringstream version of things.

A char* pointer actually just points to a single char object. If that object happens to be the first (or any) element of a string, you can use pointer arithmetic to access the other elements of that string -- which is how strings (C-style strings, not C++-style std::string objects) are generally accessed.
A (C-style) string is simply a sequence of characters terminated by a null character ('\0'). (Anything after the '\0' terminator isn't part of the string.) So a string "foo bar" consists of this sequence of characters:
{ 'f', 'o', 'o', ' ', 'b', 'a', 'r', '\0' }
If you want to change the string from "foo bar" to just "foo", one way to do it is simply to replace the space character with a null character:
{ 'f', 'o', 'o', '\0', ... }
The ... is not part of the syntax; it represents characters that are still there ('b', 'a', 'r', '\0'), but are no longer part of the string.
Since you're using C++, you'd probably be much better off using std::string; it's much more powerful and flexible, and frees you from having to worry about terminators, memory allocation, and other details. (Unless the point of this exercise is to learn how C-style strings work, of course.)
Note that this modifies the string pointed to by file, and that change will be visible to the caller. You can avoid that by making a local copy of the string (which requires allocating space for it, and later freeing that space). Again, std::string makes this kind of thing much easier.
Finally, this:
strncpy(file, file, j);
is bad on several levels. Calling strncpy() with an overlapping source and destination like this has undefined behavior; literally anything can happen. And strncpy() doesn't necessarily provide a proper NUL terminator in the destination. In a sense, strncpy() isn't really a string function. You're probably better off pretending it doesn't exist.
See my rant on the topic.
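A sketch of the local-copy approach mentioned above, assuming the caller's buffer must stay untouched. first_word is a hypothetical helper, and std::strchr replaces the fixed 500-iteration loop from the question.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: returns the text before the first space in `text`
// (or all of it when there is no space) without modifying the input.
std::string first_word(const char* text)
{
    assert(text != nullptr);
    const char* space = std::strchr(text, ' ');
    return space ? std::string(text, space) : std::string(text);
}
```

Because the result is a std::string, there is no terminator to place by hand and nothing to free.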

Doing this would be much easier
if(file[j]==' ') {
file[j] = 0;
break;
..
// strncpy(file, file, j);

Using strtok might make your life much easier.
Split up the string with ' ' as a delimiter, then print the first element you get from strtok.

Use 'strtok', see e.g. http://www.cplusplus.com/reference/clibrary/cstring/strtok/

If what you're asking is "can I dynamically resize the memory block pointed to by this pointer" then... not really, no. (You have to create a new block of the desired size, then copy the bytes over, delete the first block, etc.)
If you're trying to just "print the first word" then set the character at the position of the space to 0. Then, when you output the file* pointer you'll just get the first word (everything up to the \0.) (Read null terminated strings for more information on why that works that way.)
But this all depends on how much of what you're doing is an example to demonstrate the problem you're trying to solve. If you're really 'splitting up strings' then you'll at least want to look in to using strtok.

Why not just output each character at a time and then break once you hit a space.
const char *infile(char * file )
{
cout<<file<<endl; //this prints out the entire array
int j;
for(j=0;j<500; j++)
{
if(file[j]==' ')
break;
cout<<file[j];
}
cout<<endl;
}

This has nothing to do with the size of the pointer. A pointer always has the same size for a particular type.
strtok might be the best solution. This code uses strtok to break the string into substrings every time it meets a space, a ",", a "." or a "-":
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
Source : CPP STRTOK

std::copy(file, std::find(file, file+500, ' '),
std::ostream_iterator<char>(std::cout, ""));

If you allocated the space that a char * points to using malloc, you can change the size using realloc.
char * pzFile = (char *)malloc(sizeof("Hello how are you?")); // sizeof a literal already counts the '\0'
strcpy(pzFile, "Hello how are you?");
pzFile[5] = '\0'; // terminate right after "Hello"
pzFile = (char *)realloc(pzFile, 6); // 5 characters plus the terminator; real code should check for NULL
Note that if you do not set the null terminator, using the string can cause a problem.
If you were just trying to shorten the string, all you had to do is set the null terminator at index 5. The space allocated is larger than needed, but that's OK as long as it's not shorter.
I strongly advise that mostly what you want to do is COPY the string up to the space.
char * pzInput = (char *)malloc(sizeof("Hello how are you?")); // sizeof a literal already counts the '\0'
strcpy(pzInput, "Hello how are you?");
char * pzBuffer = malloc(BUFFER_SIZE);
char * pzSrc = pzInput;
char * pzDst = pzBuffer;
while (*pzSrc && ' ' != *pzSrc)
*(pzDst++) = *(pzSrc++);
*pzDst = '\0';
This also ends up with pzSrc pointing at the rest of the string for later use!
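For the realloc route, a minimal sketch that keeps the returned pointer and handles failure might look like this. shrink_string is a hypothetical helper, and it assumes the block was obtained from malloc.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: truncates a malloc-allocated C string to `keep`
// characters and shrinks the allocation. Returns the (possibly moved) block;
// if realloc fails, the original block is returned, still valid and truncated.
char* shrink_string(char* text, std::size_t keep)
{
    assert(text != nullptr && keep <= std::strlen(text));
    text[keep] = '\0';  // terminate first, while the index is certainly in bounds
    void* smaller = std::realloc(text, keep + 1);
    return smaller ? static_cast<char*>(smaller) : text;
}
```

The caller must continue with the returned pointer and eventually pass it to free, since realloc is allowed to move the block.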

Combining std::string and std::vector<char>

This is not the actual code, but this represents my problem.
std::string str1 = "head";
char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(buffer, buffer + strlen(buffer));
I want to put str1 and str2 to mainStr in an order:
headbody\0bodyfoot
So the binary data is maintained. Is this possible to do this?
PS: Thanks for telling the strlen part is wrong. I just used it to represent buffer's length. :)

There should be some way of defining length of data in "buffer".
Usually character 0 is used for this and most of standard text functions assume this. So if you use character 0 for other purposes, you have to provide another way to find out length of data.
Just for example:
char buffer[]="body\0body";
std::vector<char> mainStr(buffer,buffer+sizeof(buffer)/sizeof(buffer[0]));
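Here we use an array because it provides more information than a pointer: the size of the stored data. Note that sizeof also counts the terminating NUL that the compiler appends, so subtract 1 if you don't want it in the vector.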
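One way to capture the array size without writing the sizeof expression by hand is a small function template. This is a sketch only; bytes_of is a hypothetical name, and it assumes the argument is a string-literal-style array whose last element is the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector from a char array, keeping embedded
// NULs but dropping the final terminator appended to a string literal.
template <std::size_t N>
std::vector<char> bytes_of(const char (&text)[N])
{
    return std::vector<char>(text, text + N - 1);
}
```

For char buffer[] = "body\0body", sizeof(buffer) is 10 and the resulting vector holds 9 bytes, including the embedded NUL at index 4. Passing a plain char* does not compile, which is the point: the length is not recoverable from a pointer.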

You cannot use strlen as it uses '\0' to determine the end of string. However, the following will do what you are looking for:
std::string head = "header";
std::string foot = "footer";
const char body[] = "body\0body";
std::vector<char> v;
v.assign(head.begin(), head.end());
std::copy(body, body + sizeof(body)/sizeof(body[0]) - 1, std::back_inserter<std::vector<char> >(v));
std::copy(foot.begin(), foot.end(), std::back_inserter<std::vector<char> >(v));
Because the compiler appends a NUL character at the end of the string literal, you'll want to ignore it (hence the -1 on the last iterator).

btw. strlen will not work if there are nul bytes in your string!
The code to insert into the vector is:
front:
mainStr.insert(mainStr.begin(), str1.begin(), str1.end());
back:
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
With your code above (using strlen), this will print:
headbodyfoot
EDIT: just changed the copy to insert as copy requires the space to be available I think.

You could use std::vector<char>::insert to append the data you need into mainStr.
Something like this:
std::string str1 = "head";
char buffer[] = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(str1.begin(), str1.end());
mainStr.insert(mainStr.end(), buffer, buffer + sizeof(buffer) - 1); // -1 drops the terminating '\0'
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
Disclaimer: I didn't compile it.
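A compilable sketch of the same insert-based approach, assuming the caller knows the real length of the binary body and passes it explicitly. join_parts is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenates head, a binary body of known length, and
// foot. The caller supplies body_len because strlen would stop at the first NUL.
std::vector<char> join_parts(const std::string& head, const char* body,
                             std::size_t body_len, const std::string& foot)
{
    assert(body != nullptr || body_len == 0);
    std::vector<char> out;
    out.reserve(head.size() + body_len + foot.size());
    out.insert(out.end(), head.begin(), head.end());
    out.insert(out.end(), body, body + body_len);
    out.insert(out.end(), foot.begin(), foot.end());
    return out;
}
```

With "head", the 9-byte body "body\0body" and "foot", the result holds 17 bytes and the embedded NUL is preserved.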

You can use IO streams.
std::string str1 = "head";
const char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::stringstream ss;
ss.write(str1.c_str(), str1.length())
.write(buffer, 9) // insert real length here
.write(str2.c_str(), str2.length());
std::string result = ss.str();
std::vector<char> vec(result.c_str(), result.c_str() + result.length());

str1 and str2 are string objects that write the text.
I wish compilers would fail on statements like the declaration of buffer and I don't care how much legacy code it breaks. If you're still building it you can still fix it and put in a const.
You would need to change your declaration of vector because strlen will stop at the first null character. If you did
char buffer[] = "body\0body";
then sizeof(buffer) would actually give you close to what you want although you'll get the end null-terminator too.
Once your vector mainStr is then set up correctly you could do:
std::string strConcat;
strConcat.reserve( str1.size() + str2.size() + mainStr.size() );
strConcat.assign(str1);
strConcat.append(mainStr.begin(), mainStr.end());
strConcat.append(str2);
This assumes the vector was set up using buffer, buffer + sizeof(buffer) - 1.

mainStr.resize(str1.length() + str2.length() + strlen(buffer));
memcpy(&mainStr[0], &str1[0], str1.length());
memcpy(&mainStr[str1.length()], buffer, strlen(buffer));
memcpy(&mainStr[str1.length()+strlen(buffer)], &str2[0], str2.length());
