char *strtok(char *s1, const char *s2)
How can I convert a string to a char* as required by strtok? I did
for (string line; getline(sourceFile, line);) {
tokens = strtok(line.c_str(), " {};");
}
Where sourceFile is an ifstream (sourceFile.open(filepath.c_str());)
I am getting:
argument of type "const char *" is incompatible with parameter of type "char *"
As others have said, you probably want to use something other than strtok.
However, to do what you are asking (and you probably shouldn't):
for (string line; getline(sourceFile, line);) {
char* line_cstr = strdup(line.c_str());
for (char* token = strtok(line_cstr, " {};"); token != NULL; token = strtok(NULL, " {};")) {
// use token here; the first token is processed too
}
free(line_cstr);
}
If you do something like this:
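char fp[256]; // must be large enough for the string plus its terminating NUL
strncpy(fp, filepath.c_str(), sizeof fp - 1); // fp is writable, unlike the const char* that c_str() returns
fp[sizeof fp - 1] = '\0'; // strncpy does not terminate the buffer if the source is too long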
You should be able to use strtok() against fp.
But you might want to consider a more C++ way of splitting strings.
As Jonathan Leffler's comment explains, strtok modifies the buffer in place, overwriting delimiters with NUL to terminate successive tokens. That's great for speed: it saves copying each token out to some other memory area while still giving you convenient access to it as a separate NUL-terminated sequence.
It's an interesting question whether it's legitimate to use strtok on a std::string instance. The c_str() member returns a const char* because you're not meant to write over parts of that buffer. But, as long as the string is not empty (for an empty string this was undefined behaviour before C++11), you can get a char* to the first element with &line[0] and mutate the characters at offsets [0]..[size()-1]. Before C++11 that buffer is not guaranteed to be NUL-terminated the way the c_str()-returned buffer is, which makes it unsuitable for strtok. From C++11 on, the storage is contiguous and followed by a NUL, so strtok(&line[0], ...) works, though it leaves NULs embedded in the string. Before C++11 you could append a NUL yourself (i.e. line += '\0'; NUL is legal std::string content), but that makes the code a bit hackish and harder to maintain, and you may need to remove the NUL afterwards if you plan to use the original value for anything.
If you really want to use strtok, seems best to copy the data to a separate writable NUL-terminated buffer first. For example:
char* p = strdup(line.c_str());
... use strtok on p ...
free(p);
You could use alloca(), a smart pointer, or a std::vector<char> to minimise the potential for memory leaks. (I don't see any particular reason to prefer C++ new and delete for the above, since you're using strtok anyway, but if you have smart-pointer libraries that expect it, go for it.)
All that said, finding another mechanism - like those in boost - is a better option. I'm guessing alternatives like that are discussed in the link chris posted in a comment under your question....
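For comparison, here is a minimal sketch of a split that avoids strtok altogether and never modifies the input. It uses only the standard library (find_first_of / find_first_not_of) rather than Boost; the function name split is an assumption for illustration.

```cpp
#include <string>
#include <vector>

// Split `line` on any character in `delims`, skipping empty tokens
// (the same behaviour strtok has). The input string is left untouched.
std::vector<std::string> split(const std::string& line, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = line.find_first_of(delims, start);
        // If end is npos, substr simply takes the rest of the string.
        tokens.push_back(line.substr(start, end - start));
        start = line.find_first_not_of(delims, end);
    }
    return tokens;
}
```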
There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?
Use std::string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.
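A wrapper that addresses both points might look roughly like this. To keep the sketch compilable without Winsock, SOCKET is a stand-in typedef and fake_send is a hypothetical replacement for the real send; in real code you would call send itself. Unlike the version above, this one does not transmit the terminating \0; add 1 to the length if the receiver expects it.

```cpp
#include <string>

// Stand-ins so the sketch compiles without Winsock; both are hypothetical.
typedef int SOCKET;
int fake_send(SOCKET, const char*, int len, int)
{
    return len;  // pretends that every byte was sent
}

// Forwards the flags and hands the result of the send call back to the caller.
int sendLine(SOCKET s, const std::string& buf, int flags = 0)
{
    return fake_send(s, buf.data(), static_cast<int>(buf.size()), flags);
}
```

With the real send, the return value may be smaller than the requested length, so the caller should check it.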
You can retrieve the length of a C string with the strlen(const char*) function.
Make sure all the strings are null-terminated, and keep in mind that strlen does not count the terminator: if you send it as well, the length grows by 1.
Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing length seems to me a little bit weird.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
size_t l = strlen(s); // l = 3 (the terminator is not counted)
printf("%s", s); // prints "foo"; never pass unknown text as the format string
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions: its c_str() member function returns a null-terminated char const * pointing to the contents of the string, and its size() member function tells you the number of characters, not counting the null terminator. For a std::string representing the word "foo", size() is 3, not 4. That is also what a C function like yours probably expects, but check the function's documentation to find out whether it wants the number of visible characters or the number of elements in memory.
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
send(s, line.c_str(), static_cast<int>(line.size()), 0); // send takes an int length and a flags argument
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
How can I do that, unless providing length as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
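To illustrate what "known at compile time" means here, this small sketch deduces the length of a character array as a template parameter. The name literal_length is made up. It works for string literals and arrays, but it cannot work for a const char* whose length is only known at run time.

```cpp
#include <cstddef>

// N is deduced by the compiler from the array type, so it is a
// compile-time constant. It includes the terminating '\0'.
template <std::size_t N>
std::size_t literal_length(const char (&)[N])
{
    return N - 1;
}
```

For example, literal_length("foo") yields 3, while passing a plain const char* does not compile because there is no array size to deduce.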
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
send(s, &data[0], static_cast<int>(data.size()), 0); // send takes an int length and a flags argument
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data". C++11 offers the easier-to-read data.data(), but &data[0] also works in older versions of C++. Note that &data[0] requires a non-empty vector.)
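A small sketch of why the explicit size matters for raw data: the vector below holds four bytes, two of them '\0', and size() reports all four, whereas a C string function stops at the first '\0'. Both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Length a C string function would see: it stops at the first '\0'.
// Requires `data` to contain at least one '\0'.
std::size_t c_string_length(const std::vector<char>& data)
{
    return std::strlen(&data[0]);
}

// Four bytes of "raw data" in which '\0' has no special meaning.
std::vector<char> sample_bytes()
{
    const char raw[] = { 'a', '\0', 'b', '\0' };
    return std::vector<char>(raw, raw + sizeof raw);
}
```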
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.
I am trying to convert an int to a cstring. I've decided to read the int into a regular string via stringstream, and then read the string into a char array. The following seems to be working, but I'm wondering if I'm just getting lucky with my compiler. Does the code seem sound? Thanks!
int zip = 1234;
char zipString[30];
stringstream str;
str << zip;
str >> zipString;
cout << zipString;
You can get a C++ std::string from the stream's str() function, and an immutable C-style zero-terminated string from the string's c_str() function:
std::string cpp_string = str.str();
char const * c_string = cpp_string.c_str();
You might be tempted to combine these into a single expression and store the result of str.str().c_str(), but that would be wrong; the temporary C++ string is destroyed at the end of that expression, leaving you with a dangling pointer.
What you are doing will work, as long as you're sure that the buffer is large enough; but using the C++ string removes the danger of overflowing the buffer. In general, it's best to avoid C-style strings unless you need to use an API that requires them (or, in extreme circumstances, as an optimisation to avoid memory allocation). std::string is usually safer and easier to work with.
Unless you have a specific reason that you need an array of char instead of a standard string, I'd use the latter. Although it's not strictly necessary in this case, I'd also normally use a Boost lexical_cast instead of explicitly moving things through a stringstream to do the conversion:
std::string zipString = boost::lexical_cast<std::string>(zip); // requires <boost/lexical_cast.hpp>
Then, if you really need the result as a c-style string, you can use zipString.c_str() to get that (though it's still different in one way -- you can't modify what that returns).
In this specific case it doesn't gain you much, but using the same idiom consistently for conversions of this general kind pays off, and if you're going to do that, you might as well use it here too.
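If you can use C++11 or later, the standard library offers std::to_string, which needs neither a stringstream nor Boost. A minimal sketch (zip_to_string is just an illustrative name):

```cpp
#include <string>

// Converts the int to its decimal text form (C++11 and later).
std::string zip_to_string(int zip)
{
    return std::to_string(zip);
}
```

Call .c_str() on the result when a C API needs a const char*, and keep the std::string alive for as long as that pointer is in use.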
The std::string's c_str() member function returns a const char* (aka a C-style string).
std::string str = "world";
printf("hello, %s", str.c_str());
In a recent question, I learned that there are situations where you just gotta pass a char* instead of a std::string. I really like string, and for situations where I just need to pass an immutable string, it works fine to use .c_str(). The way I see it, it's a good idea to take advantage of the string class for its ease of manipulation. However, for functions that require an input, I end up doing something like this:
std::string str;
char* cstr = new char[500]; // I figure dynamic allocation is a good idea just
getstr(cstr); // in case I want the user to input the limit or
str = cstr; // something. Not sure if it matters.
delete[] cstr;
printw(str.c_str());
Obviously, this isn't so, uh, straightforward. Now, I'm pretty new to C++ so I can't really see the forest for the trees. In a situation like this, where every input is going to have to get converted to a C string and back to take advantage of string's helpful methods, is it just a better idea to man up and get used to C-style string manipulation? Is this kind of constant back-and-forth conversion too stupid to deal with?
In the example you give, you can generally read a line into a std::string using the std::getline function: http://www.cplusplus.com/reference/string/getline/
Of course this doesn't do everything that a curses library does. If you need a non-const char* so that some C function can read into it, you can use a vector<char>. You can create a vector<char> from a string, and vice-versa:
std::string a("hello, world");
std::vector<char> b(a.begin(), a.end());
// if we want a string consisting of every byte in the vector
std::string c(b.begin(), b.end());
// if we want a string only up to a NUL terminator in the vector
b.push_back(0);
std::string d(&b[0]);
So your example becomes:
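std::vector<char> cstr(500);
getnstr(&cstr[0], static_cast<int>(cstr.size()) - 1); // leave room for the terminating NUL
printw("%s", &cstr[0]); // don't use user input as the format string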
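If you want the input back as a std::string, the pattern can be wrapped once and reused. In this sketch fake_getnstr is a hypothetical stand-in for a C input routine such as getnstr, so that the example compiles without curses; read_line is likewise a made-up name.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a C input routine such as getnstr: writes at most
// n characters plus a terminating '\0' into buf.
void fake_getnstr(char* buf, int n)
{
    std::strncpy(buf, "typed by the user", static_cast<std::size_t>(n));
    buf[n] = '\0';
}

// Reads into a temporary writable buffer and returns the result as a string.
std::string read_line(std::size_t max_chars)
{
    std::vector<char> buf(max_chars + 1);  // +1 for the terminator
    fake_getnstr(&buf[0], static_cast<int>(max_chars));
    return std::string(&buf[0]);           // copies up to the first '\0'
}
```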
Most std::string::c_str() implementations (if not all of them) simply return a pointer to an internal buffer. No overhead whatsoever.
Beware, however, that c_str() returns a const char*, not a char*, and that the pointer is only valid until the string is next modified or destroyed. So you cannot use it if the function writes back into the passed string or keeps a copy of the pointer beyond the string's lifetime.
I made a function like this:
bool IsSameString(char* p1, char* p2)
{
return 0 == strcmp(p1, p2);
}
The problem is that sometimes, by mistake, arguments are passed which are not strings (meaning that p1 or p2 is not terminated with a null character).
Then, strcmp continues comparing until it reaches non-accessible memory and crashes.
Is there a safe version of strcmp? Or can I tell whether p1 (and p2) is a string or not in a safe manner?
No, there's no (standard) way to tell whether a char * actually points to valid memory.
In your situation, it is better to use std::string rather than char *s for all your strings, along with the overloaded == operator. If you do this, the compiler would enforce type safety.
EDIT: As per the comments below, if you find yourself in a situation where you sometimes pass char *s that may or may not be valid strings to functions that expect null-terminated strings, then something is fundamentally wrong with your approach. So basically, see @janm's answer below.
In some cases std::strncmp can solve your problem:
int strncmp ( const char * str1, const char * str2, size_t num );
It compares up to num characters of the C string str1 to those of the C string str2.
Also, take a look at what the US DHS National Cyber Security Division recommends on this matter:
Ensure that strings are null terminated before passing into strcmp. This can be enforced by always placing a \0 in the last allocated byte of the buffer.
char str1[] ="something";
char str2[] = "another thing";
/* In this case we know strings are null terminated. Pretend we don't. */
str1[sizeof(str1)-1] = '\0';
str2[sizeof(str2)-1] = '\0';
/* Now the following is safe. */
if (strcmp(str1, str2)) { /* do something */ } else { /* do something else */ }
If you are passing strings to strcmp() that are not null terminated you have already lost. The fact that you have a string that is not null terminated (but should be) indicates that you have deeper issues in your code. You cannot change strcmp() to safely deal with this problem.
You should be writing your code so that can never happen. Start by using the string class. At the boundaries where you take data into your code you need to make sure you deal with the exceptional cases; if you get too much data you need to Do The Right Thing. That does not involve running off the end of your buffer. If you must perform I/O into a C style buffer, use functions where you specify the length of the buffer and detect and deal with cases where the buffer is not large enough at that point.
There's no cure for this that is portable. The convention states that there's an extra character holding a null character that belongs to the same correctly allocated block of memory as the string itself. Either this convention is followed and everything's fine or undefined behaviour occurs.
If you know the length of the string you compare against, you can use strncmp(), but this will not help if the string passed to your code is actually shorter than the string you compare against.
You can use strncmp, but if possible use std::string to avoid many problems :)
You can put an upper limit on the number of characters to be compared using the strncmp function.
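When the caller knows the size of each buffer, one possible sketch is to check that a terminator exists within those bounds before comparing. This only works if the sizes are known; a bare char* still cannot be validated. The function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// True if a '\0' occurs within the first `size` bytes of `p`.
bool is_terminated(const char* p, std::size_t size)
{
    return std::memchr(p, '\0', size) != NULL;
}

// Compares two buffers as strings only if both are terminated within bounds.
bool IsSameStringBounded(const char* p1, std::size_t size1,
                         const char* p2, std::size_t size2)
{
    if (!is_terminated(p1, size1) || !is_terminated(p2, size2)) {
        return false;  // treat an unterminated buffer as "not a string"
    }
    return std::strcmp(p1, p2) == 0;
}
```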
There is no best answer to this, as you can't verify that a char* is a string. The only solution is to create a type and use it for strings, for example std::string, or create your own if you want something lighter, i.e.
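#include <stdlib.h> // free
#include <string.h> // strlen, strcmp, strdup (POSIX)
struct MyString
{
MyString() : str(0), len(0) {}
MyString(const char* x) : str(strdup(x)), len(strlen(x)) {}
~MyString() { free(str); } // free(NULL) is a no-op
char* str;
size_t len;
private:
// copying would free str twice, so forbid it
MyString(const MyString&);
MyString& operator=(const MyString&);
};
bool IsSameString(const MyString& p1, const MyString& p2)
{
return 0 == strcmp(p1.str, p2.str);
}
MyString str1("test");
MyString str2("test");
if (IsSameString(str1, str2)) {}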
You don't say which platform you are using. Windows has the following functions:
IsBadStringPtr
IsBadReadPtr
IsBadWritePtr
IsBadStringPtr might be what you are looking for if you are using Windows, although Microsoft documents these functions as obsolete and advises against using them.
Const-correctness in C++ is still giving me headaches. In working with some old C code, I find myself needing to turn a C++ string object into a C string and assign it to a variable. However, the variable is a char * and c_str() returns a const char *. Is there a good way to get around this without having to roll my own function to do it?
Edit: I am also trying to avoid calling new. I will gladly trade slightly more complicated code for fewer memory leaks.
C++17 and newer:
foo(s.data(), s.size());
C++11, C++14:
foo(&s[0], s.size());
However this needs a note of caution: The result of &s[0]/s.data()/s.c_str() is only guaranteed to be valid until any member function is invoked that might change the string. So you should not store the result of these operations anywhere. The safest is to be done with them at the end of the full expression, as my examples do.
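Here is a small sketch of that pattern with a C-style function that really does write into the buffer. upper_in_place is a hypothetical stand-in for foo, and shout is a made-up wrapper; the pointer is obtained, used and dropped within a single statement.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical C-style function that modifies the buffer it is given.
void upper_in_place(char* buf, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        buf[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(buf[i])));
    }
}

// Lets the C-style function work directly on the string's own storage (C++11).
std::string shout(std::string s)
{
    if (!s.empty()) {
        upper_in_place(&s[0], s.size());  // pointer is not stored anywhere
    }
    return s;
}
```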
Pre-C++11 answer:
Since, for reasons I cannot explain, nobody has answered this the way I do now, and since other questions are now being closed pointing to this one, I'll add this here, even though coming a year too late will mean that it hangs at the very bottom of the pile...
With C++03, std::string isn't guaranteed to store its characters in a contiguous piece of memory, and the result of c_str() doesn't need to point to the string's internal buffer, so the only way guaranteed to work is this:
std::vector<char> buffer(s.begin(), s.end());
foo(&buffer[0], buffer.size());
s.assign(buffer.begin(), buffer.end());
This is no longer true in C++11.
There is an important distinction you need to make here: is the char* to which you wish to assign this "morally constant"? That is, is casting away const-ness just a technicality, and you really will still treat the string as a const? In that case, you can use a cast - either C-style or a C++-style const_cast. As long as you (and anyone else who ever maintains this code) have the discipline to treat that char* as a const char*, you'll be fine, but the compiler will no longer be watching your back, so if you ever treat it as a non-const you may be modifying a buffer that something else in your code relies upon.
If your char* is going to be treated as non-const, and you intend to modify what it points to, you must copy the returned string, not cast away its const-ness.
I guess there is always strcpy.
Or use char* strings in the parts of your C++ code that must interface with the old stuff.
Or refactor the existing code to compile with the C++ compiler and then to use std::string.
There's always const_cast...
std::string s("hello world");
char *p = const_cast<char *>(s.c_str());
Of course, that's basically subverting the type system, but sometimes it's necessary when integrating with older code.
If you can afford extra allocation, instead of a recommended strcpy I would consider using std::vector<char> like this:
// suppose you have your string:
std::string some_string("hello world");
// you can make a vector from it like this:
std::vector<char> some_buffer(some_string.begin(), some_string.end());
// suppose your C function is declared like this:
// some_c_function(char *buffer);
// you can just pass this vector to it like this
// (the vector is not NUL-terminated; push_back('\0') first if the function expects a C string):
some_c_function(&some_buffer[0]);
// if that function wants a buffer size as well,
// just give it some_buffer.size()
To me this is a bit more of a C++ way than strcpy. Take a look at Meyers' Effective STL Item 16 for a much nicer explanation than I could ever provide.
You can use the copy method:
len = myStr.copy(cStr, myStr.length());
cStr[len] = '\0';
Where myStr is your C++ string and cStr a char * with at least myStr.length()+1 size. Also, len is of type size_t and is needed, because copy doesn't null-terminate cStr.
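Put together as a complete function, this might look as follows. The sketch uses a std::vector<char> as the destination so that nothing has to be freed by hand; to_writable_c_string is a made-up name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns a writable, null-terminated copy of myStr.
std::vector<char> to_writable_c_string(const std::string& myStr)
{
    std::vector<char> cStr(myStr.length() + 1);              // room for the terminator
    std::size_t len = myStr.copy(&cStr[0], myStr.length());  // copy() does not null-terminate
    cStr[len] = '\0';
    return cStr;
}
```

Pass &result[0] to the C function; the buffer stays valid for as long as the vector does.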
Just use const_cast<char*>(str.data())
Do not feel bad or weird about it, it's perfectly good style to do this.
The contiguous layout is guaranteed from C++11 on. The fact that the pointer is const-qualified at all is arguably a mistake in the original standard; in C++03 it was permitted to implement string as a non-contiguous list of memory blocks, but no mainstream implementation is known to have done so. Note that the standard only sanctions writes made through &str[0] (or, from C++17, the non-const data()), so prefer those over the cast when the callee actually modifies the buffer.
If you know that the std::string is not going to change, a C-style cast will work.
std::string s("hello");
char *p = (char *)s.c_str();
Of course, p is pointing to some buffer managed by the std::string. If the std::string goes out of scope or is modified in a way that reallocates its buffer, p will be left dangling.
The safest thing to do would be to copy the string if refactoring the code is out of the question.
std::string vString;
vString.resize(256); // allocate some space, up to you
char* vStringPtr(&vString.front());
// assign the value to the string (by using a function that copies the value).
// don't exceed vString.size() here!
// now make sure you erase the extra capacity after the first encountered \0.
vString.erase(std::find(vString.begin(), vString.end(), 0), vString.end());
// and here you have the C++ string with the proper value and bounds.
This is how you turn a C++ string to a C string. But make sure you know what you're doing, as it's really easy to step out of bounds using raw string functions. There are moments when this is necessary.
If c_str() is returning to you a copy of the string object internal buffer, you can just use const_cast<>.
However, if c_str() is giving you direct access to the string object's internal buffer, make an explicit copy instead of removing the const.
Since c_str() gives you direct const access to the data structure, you probably shouldn't cast it. The simplest way to do it without having to preallocate a buffer is to just use strdup.
char* tmpptr;
tmpptr = strdup(myStringVar.c_str());
oldfunction(tmpptr);
free(tmpptr);
It's quick, easy, and correct.
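If you are worried about forgetting the free, a C++11 smart pointer with a custom deleter can own the copy. This is only a sketch: duplicate, FreeDeleter and CStringPtr are made-up names, and malloc plus memcpy stand in for strdup, which is POSIX rather than standard C++.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Deleter that releases malloc'ed memory with free instead of delete.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};
typedef std::unique_ptr<char, FreeDeleter> CStringPtr;

// Returns a malloc'ed, writable copy of s (what strdup(s.c_str()) would give),
// or an empty pointer if the allocation fails.
CStringPtr duplicate(const std::string& s)
{
    char* p = static_cast<char*>(std::malloc(s.size() + 1));
    if (p != NULL) {
        std::memcpy(p, s.c_str(), s.size() + 1);  // includes the terminating '\0'
    }
    return CStringPtr(p);
}
```

Pass the result's get() to the old function; the memory is freed when the CStringPtr goes out of scope.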
In C++, if you want a char * from string.c_str() (for example, to pass it to a function that only takes a char *), you can cast it to char * directly to drop the const, provided the function does not write through the pointer.
Example:
launchGame((char *) string.c_str());
C++17 adds a char* string::data() noexcept overload. So if your string object isn't const, the pointer returned by data() isn't either and you can use that.
Is it really that difficult to do yourself?
#include <algorithm> // std::min
#include <cstring>
#include <string>
// Returns a new[]-allocated copy; the caller must delete[] it.
char *convert(const std::string &str)
{
size_t len = str.length();
char *buf = new char[len + 1];
memcpy(buf, str.data(), len);
buf[len] = '\0';
return buf;
}
// Copies into a caller-supplied buffer of len chars (len must be at least 1), truncating if needed.
char *convert(const std::string &str, char *buf, size_t len)
{
size_t n = std::min(str.length(), len - 1); // never read past the end of str
memcpy(buf, str.data(), n);
buf[n] = '\0';
return buf;
}
// A crazy template solution to avoid passing in the array length
// but loses the ability to pass in a dynamically allocated buffer
template <size_t len>
char *convert(const std::string &str, char (&buf)[len])
{
return convert(str, buf, len);
}
Usage:
std::string str = "Hello";
// Use buffer we've allocated
char buf[10];
convert(str, buf);
// Use buffer allocated for us
char *heapBuf = convert(str);
delete [] heapBuf;
// Use dynamic buffer of known length
heapBuf = new char[10];
convert(str, heapBuf, 10);
delete [] heapBuf;