c++ strtok in function changes original string value as parameter - c++

When I use strtok to tokenize a C++ string, I run into a confusing problem. See the simple code below:
void a(string s){
strtok((char*)s.c_str(), " ");
}
int main(){
string s;
s = "world hello";
a(s);
cout<<s<<endl;
return 0;
}
The program outputs "world".
Shouldn't it output "world hello"? Since I pass the string by value to function a, strtok shouldn't modify the original s...
Can anyone explain this?
Thank you.

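The problem is (char*)s.c_str(): you are casting the constness away and modifying the string contents in a way that you are not supposed to. While the original s should not be modified, I presume you have been hit by a smart optimization that expects you to play by the rules. For instance, a copy-on-write (COW) implementation of string would show exactly this behavior: the copy inside a shares its buffer with the original until one of them is modified through the string interface, and writing through the cast pointer bypasses that check.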

c_str() returns a const pointer, which is a promise to the compiler that the thing being pointed at won't be modified. And then you're calling strtok which modifies it.
When you lie to the compiler, you will be punished.

That's the way strtok() works. It uses the first parameter as a buffer. By casting it to a char*, you allow it to modify the string. strtok() does not know about the original std::string. It also stores the string pointer in a static variable, which is why you have to call it with a null pointer on subsequent calls to continue parsing the same string.
By the way, in C++ you should use std::istringstream instead. It does not rely on an internal static variable, which is not thread-safe. And you can extract the parameters directly into int, double, etc., as we do with cin. Similarly, std::ostringstream replaces sprintf().

Related

How to get original string by string_view in c++?

For example,
std::string_view strv{ "Hello" };
strv.remove_prefix(1);
The original string should be "Hello".
I tried using strv.data() and std::string str(strv.begin(), strv.end());
I can only get "ello" instead of "Hello".
I have only found two methods so far:
1) Use a variable (e.g. string str) to store the original string, then build the string_view from it. When you need the original string, use str.
2) Use a variable (e.g. int removed_len) to record the total length of the removed prefixes. Then, when you need the original string, strv.data() - removed_len is where the original string starts.
The "Hello" you use to construct strv is your original string. However, you never store it in a named object, and a string_view does not remember the range it covered before remove_prefix was called. If you want to keep a copy of your original string, try something like:
std::string str{"Hello"};
std::string_view strv{str};
strv.remove_prefix(1);
str will hold your original string.

Pass char array by reference and return modified

I have a function definition like the one below:
void ConvertString(std::string &str)
{
size_t pos = 0;
while ((pos = str.find("&", pos)) != std::string::npos) {
str.replace(pos, 1, "and");
pos += 3;
}
}
The purpose of this function is to find & and replace it with "and". The function itself executes fine. I wrote it for strings in general, but in one place I am calling it in the following way:
char mystr[80] = "ThisIsSample&String";
ConvertString((std::string)mystr);
printf(mystr);
With the above call, I expect the console to print the modified string containing "and".
But the string modification is not working. Is there an error in the function?
This code:
char mystr[80] = "ThisIsSample&String";
ConvertString((std::string)mystr);
printf(mystr);
… creates a temporary string object and passes that as argument.
Since the formal argument type is by reference to non-const, this should not compile, but Visual C++ supports it as a language extension (for class types only, IIRC).
Instead, do something like
string s = "Blah & blah";
ConvertString( s );
cout << s << endl;
By the way, C style casts are in general an invitation to bugs, because the basic nature of such a cast can change very silently from e.g. const_cast to reinterpret_cast when the code is maintained.
It's safe enough in the hands of an experienced programmer, like a power tool such as a chain saw can be safe in the hands of an experienced woodsman, but it's not a thing that a novice should use just to save a little work.
It's because you create a temporary std::string object (whose initial content is the content of the array mystr), and pass that temporary object by reference to the function. This temporary object is then destructed when the call is done.
Did you read some documentation of std::string and of printf?
You need
std::string mystr = "ThisIsSample&String";
ConvertString(mystr);
printf("%s", mystr.c_str());
You obviously want to pass a string variable (technically an l-value) by reference to your ConvertString.
I believe your problem is that you cast char array to string.
ConvertString((std::string)mystr);
This line creates a temporary object of type std::string and passes it by reference. What you want is to convert it this way:
std::string convertedStr = (std::string)mystr;
ConvertString(convertedStr);
printf(convertedStr.c_str());
I am not very familiar with C++ pointer and reference syntax, but it's similar to this.
What you are doing is not correct! You should not convert a char* to a std::string with a C-style cast. What you should do is more like:
std::string mystr( "ThisIsSample&String" );
ConvertString(mystr);
edit:
thx for -reputation... this code isn't even compiling...
http://ideone.com/bCsmgf

Splitting a std::string into two const char*s resulting in the second const char* overwriting the first

I am taking a line of input which is separated by a space and trying to read the data into two integer variables.
for instance: "0 1" should give child1 == 0, child2 == 1.
The code I'm using is as follows:
int separator = input.find(' ');
const char* child1_str = input.substr(0, separator).c_str(); // Everything is as expected here.
const char* child2_str = input.substr(
separator+1, //Start with the next char after the separator
input.length()-(separator+1) // And work to the end of the input string.
).c_str(); // But now child1_str is showing the same location in memory as child2_str!
int child1 = atoi(child1_str);
int child2 = atoi(child2_str); // and thus are both of these getting assigned the integer '1'.
// do work
What's happening is perplexing me to no end. I'm monitoring the sequence with the Eclipse debugger (gdb). When the function starts, child1_str and child2_str are shown to have different memory locations (as they should). After splitting the string at separator and getting the first value, child1_str holds '0' as expected.
However, the next line, which assigns a value to child2_str not only assigns the correct value to child2_str, but also overwrites child1_str. I don't even mean the character value is overwritten, I mean that the debugger shows child1_str and child2_str to share the same location in memory.
What the what?
1) Yes, I'll be happy to listen to other suggestions to convert a string to an int -- this was how I learned to do it a long time ago, and I've never had a problem with it, so never needed to change, however:
2) Even if there's a better way to perform the conversion, I would still like to know what's going on here! This is my ultimate question. So even if you come up with a better algorithm, the selected answer will be the one that helps me understand why my algorithm fails.
3) Yes, I know that std::string is C++ and const char* is standard C. atoi requires a c string. I'm tagging this as C++ because the input will absolutely be coming as a std::string from the framework I am using.
First, the superior solutions.
In C++11 you can use the newfangled std::stoi function:
int child1 = std::stoi(input.substr(0, separator));
Failing that, you can use boost::lexical_cast:
int child1 = boost::lexical_cast<int>(input.substr(0, separator));
Now, an explanation.
input.substr(0, separator) creates a temporary std::string object that dies at the semicolon. Calling c_str() on that temporary object gives you a pointer that is only valid as long as the temporary lives. This means that, on the next line, the pointer is already invalid. Dereferencing that pointer has undefined behaviour. Then weird things happen, as is often the case with undefined behaviour.
The value returned by c_str() is invalid after the string is destructed. So when you run this line:
const char* child1_str = input.substr(0, separator).c_str();
The substr function returns a temporary string. After the line is run, this temporary string is destructed and the child1_str pointer becomes invalid. Accessing that pointer results in undefined behavior.
What you should do is assign the result of substr to a local std::string variable. Then you can call c_str() on that variable, and the result will be valid until the variable is destructed (at the end of the block).
Others have already pointed out the problem with your current code. Here's how I'd do the conversion:
std::istringstream buffer(input);
buffer >> child1 >> child2;
Much simpler and more straightforward, not to mention considerably more flexible (e.g., it'll continue to work even if the input has a tab or two spaces between the numbers).
input.substr returns a temporary std::string. Since you are not saving it anywhere, it gets destroyed. Anything that happens afterwards depends solely on your luck.
I recommend using an istringstream.

Character Pointers (allotted by new)

I wrote the following code:
char *pch=new char[12];
char *f=new char[42];
char *lab=new char[20];
char *mne=new char[10];
char *add=new char[10];
If initially I want these arrays to be null, can't I do this:
*lab="\0";
*mne="\0";
and so on.....
And after that if I want to add some cstring to an empty array can't I check:
if(strcmp(lab,"\0")==0)
//then add cstring by *lab="cstring";
And if I can't do any of these things, please tell me the right way to do it...
In C++11, an easy way to initialize arrays is by using brace-initializers:
char * p = new char[100] { 0 };
The reasoning here is that all the missing array elements will be zero-initialized. You can also use explicit value-initialization (I think that's even allowed in C++98/03), which is zero-initialization for the primitive types:
char * q = new char[110]();
First of all, as DeadMG says, the correct way of doing this is using std::string:
std::string lab; // empty initially, no further initialization needed
if (lab.size() == 0) // string empty, note, very fast, no character comparison
lab += "cstring"; // or even lab = "cstring", as lab is empty
Also, in your code, if you insist on using C strings, the correct check for the empty string after the initialization would be
if (*lab == '\0')
First of all, I agree with everybody else to use a std::string instead of character arrays the vast majority of the time. Link for help is here: C++ Strings Library
Now to directly answer your question as well:
*lab="\0";
*mne="\0";
and so on.....
This is wrong. *lab has type char, while "\0" is a string literal that decays to a pointer, so you're not assigning the "null terminator" to those arrays; you're trying to store a pointer value in the array's first character, which a C++ compiler should reject. Remember, your variables are pointers, not strings. If you're trying to just put a null character at the beginning, so that strlen or other C-string functions see an "empty" string, do this: *lab='\0'; The difference is that single quotes denote the character \0, whereas double quotes denote a string literal, which decays to a pointer to its first element. I hope that made sense.
Now for your second, again, you can't just "assign" like that to C-style strings. You need to put each character into the array and terminate it correctly. Usually the easiest way is with sprintf:
sprintf(lab, "%s", "mystring");
This may not make much sense, especially as I'm not dereferencing the pointer, but I'll walk you through it. The first argument says to sprintf "output your characters to where this pointer is pointing." So it needs the raw pointer. The second is a format string, like printf uses; here "%s" tells it to format the next argument as a string. And the third is what I want in there, a pointer to another string. This example would also work with sprintf(lab, "mystring").
If you want to get into C-style string processing, you need to read some examples. I'm afraid I don't even know where to look on the 'net for good examples of that, but I wish you good luck. I'd highly recommend that you check out the C++ strings library though, and the basic_string<> type there. That's typedef'd to just std::string, which is what you should use.

Pass contents of stringstream to function taking char* as argument

I have a function for writing ppm files (a picture format) to disk. It takes the filename as a char* array. In my main function, I put together a filename using a stringstream and the << operator. Then, I want to pass the results of this to my ppm function. I've seen this discussed elsewhere, often with very convoluted looking methods (many in-between conversion steps).
What I've done is shown in the code below, and the tricky part that others usually do in many steps with temp variables is (char*) (PPM_file_name.str().data()). What this accomplishes is to extract the string from stringstream PPM_file_name with .str(), then get the pointer to its actual contents with .data() (this is a const char*), then cast that to a regular (char*). More complete example below.
I've found the following to work fine so far, but it makes me uneasy because usually when other people have done something in a seemingly more convoluted way, it's because that's a safer way to do it. So, can anyone tell me if what I'm doing here is safe and also how portable is it?
Thanks.
#include <iostream>
#include <sstream>
#include <stdio.h>
#include <string>
using namespace std;
int main(int argc, char *argv[]){
// String stream to hold the file name so I can create it from a series of other variables
stringstream PPM_file_name;
// ... a bunch of other code where int ccd_num and string cur_id_str are created and initialized
// Assemble the file name
PPM_file_name << "ccd" << ccd_num << "_" << cur_id_str << ".ppm";
// From PPM_file_name, extract its string, then the const char* pointer to that string's data, then cast that to char*
write_ppm((char*)(PPM_file_name.str().data()),"ladybug_vidcapture.cpp",rgb_images[ccd_num],width,height);
return 0;
}
Thanks everyone. So, following a few people's suggestions here, I've done the following, since I do have control over write_ppm:
Modified write_ppm to take const char*:
void write_ppm(const char *file_name, char *comment, unsigned char *image,int width,int height)
And now I'm passing ppm_file_name as follows:
write_ppm((PPM_file_name.str().c_str()),"A comment",rgb_images[ccd_num],width,height);
Is there anything I should do here, or does that mostly clear up the issues with how this was being passed before? Should all the other char arguments to write_ppm be const as well? It's a very short function, and it doesn't appear to modify any of the arguments. Thanks.
This looks like a typical case of someone not writing const-correct code and it having the knock-on effect. You have several choices:
If write_ppm is under your control, or the control of anyone you know, get them to make it const-correct.
If it is not, and you can guarantee it never changes the filename, then use const_cast.
If you cannot guarantee that, copy your string into a std::vector plus the null terminator and pass &vec[0] (where vec represents the name of your vector variable).
You should use PPM_file_name.str().c_str(), since before C++11 data() isn't guaranteed to return a null-terminated string.
Either write_ppm() should take its first argument by const char* (promising not to change the string's content) or you must not pass a string stream (because you must not change its content that way).
You shouldn't use C-style casts in C++, because they don't differentiate between different reasons to cast. Yours is casting away const, which, if at all, should be done using const_cast<>. But as a rule of thumb, const_cast<> is usually only required to make code compile that isn't const-correct, which I'd consider an error.
It's safe and portable as long as write_ppm doesn't actually change the argument; if it does, the behavior is undefined. I would recommend using const_cast<char*> instead of a C-style cast. Also consider using the c_str() member instead of the data() member. The former is guaranteed to return a null-terminated string.
Use c_str() instead of data() (c_str() returns a null-terminated sequence of characters).
Why not simply use const_cast<char *>(PPM_file_name.str().c_str()) ?