How to avoid providing length along with char*? - c++

There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?

Use std string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.

you can retrieve the length of C-string by using strlen(const char*) function.
make sure all the strings are null terminated and keep in mind that null-termination (the length grows by 1)

Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing lenght seems to me a little bit wierd.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
int l = strlen(s); // l = 3
printf(s); // prints "foo"
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions by providing the c_str() member function, which returns a null-terminated char const * pointing to the contents of the string, and it of course has a size() member function to tell you the number of characters without the null-terminated character (e.g. for a std::string representing the word "foo", size() would be 3, not 4, and 3 is also what a C function like yours would probably expect, but you have to look at the documentation of the function to find out whether it needs the number of visible characters or number of elements in memory).
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
send(s, line.c_str(), line.size());
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
How can I do that, unless providing lenght as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
send(s, &data[0], data.size());
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data. C++11 has nicer-to-read ways of doing this, but &data[0] also works in older versions of C++.)
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.

Related

std::string.c_str() has different value than std::string?

I have been working with C++ strings and trying to load char * strings into std::string by using C functions such as strcpy(). Since strcpy() takes char * as a parameter, I have to cast it which goes something like this:
std::string destination;
unsigned char *source;
strcpy((char*)destination.c_str(), (char*)source);
The code works fine and when I run the program in a debugger, the value of *source is stored in destination, but for some odd reason it won't print out with the statement
std::cout << destination;
I noticed that if I use
std::cout << destination.c_str();
The value prints out correctly and all is well. Why does this happen? Is there a better method of copying an unsigned char* or char* into a std::string (stringstreams?) This seems to only happen when I specify the string as foo.c_str() in a copying operation.
Edit: To answer the question "why would you do this?", I am using strcpy() as a plain example. There are other times that it's more complex than assignment. For example, having to copy only X amount of string A into string B using strncpy() or passing a std::string to a function from a C library that takes a char * as a parameter for a buffer.
Here's what you want
std::string destination = source;
What you're doing is wrong on so many levels... you're writing over the inner representation of a std::string... I mean... not cool man... it's much more complex than that, arrays being resized, read-only memory... the works.
This is not a good idea at all for two reasons:
destination.c_str() is a const pointer and casting away it's const and writing to it is undefined behavior.
You haven't set the size of the string, meaning that it won't even necessealy have a large enough buffer to hold the string which is likely to cause an access violation.
std::string has a constructor which allows it to be constructed from a char* so simply write:
std::string destination = source
Well what you are doing is undefined behavior. Your c_str() returns a const char * and is not meant to be assigned to. Why not use the defined constructor or assignment operator.
std::string defines an implicit conversion from const char* to std::string... so use that.
You decided to cast away an error as c_str() returns a const char*, i.e., it does not allow for writing to its underlying buffer. You did everything you could to get around that and it didn't work (you shouldn't be surprised at this).
c_str() returns a const char* for good reason. You have no idea if this pointer points to the string's underlying buffer. You have no idea if this pointer points to a memory block large enough to hold your new string. The library is using its interface to tell you exactly how the return value of c_str() should be used and you're ignoring that completely.
Do not do what you are doing!!!
I repeat!
DO NOT DO WHAT YOU ARE DOING!!!
That it seems to sort of work when you do some weird things is a consequence of how the string class was implemented. You are almost certainly writing in memory you shouldn't be and a bunch of other bogus stuff.
When you need to interact with a C function that writes to a buffer there's two basic methods:
std::string read_from_sock(int sock) {
char buffer[1024] = "";
int recv = read(sock, buffer, 1024);
if (recv > 0) {
return std::string(buffer, buffer + recv);
}
return std::string();
}
Or you might try the peek method:
std::string read_from_sock(int sock) {
int recv = read(sock, 0, 0, MSG_PEEK);
if (recv > 0) {
std::vector<char> buf(recv);
recv = read(sock, &buf[0], recv, 0);
return std::string(buf.begin(), buf.end());
}
return std::string();
}
Of course, these are not very robust versions...but they illustrate the point.
First you should note that the value returned by c_str is a const char* and must not be modified. Actually it even does not have to point to the internal buffer of string.
In response to your edit:
having to copy only X amount of string A into string B using strncpy()
If string A is a char array, and string B is std::string, and strlen(A) >= X, then you can do this:
B.assign(A, A + X);
passing a std::string to a function from a C library that takes a char
* as a parameter for a buffer
If the parameter is actually const char *, you can use c_str() for that. But if it is just plain char *, and you are using a C++11 compliant compiler, then you can do the following:
c_function(&B[0]);
However, you need to ensure that there is room in the string for the data(same as if you were using a plain c-string), which you can do with a call to the resize() function. If the function writes an unspecified amount of characters to the string as a null-terminated c-string, then you will probably want to truncate the string afterward, like this:
B.resize(B.find('\0'));
The reason you can safely do this in a C++11 compiler and not a C++03 compiler is that in C++03, strings were not guaranteed by the standard to be contiguous, but in C++11, they are. If you want the guarantee in C++03, then you can use std::vector<char> instead.

Is there a safe version of strlen?

std::strlen doesn't handle c strings that are not \0 terminated. Is there a safe version of it?
PS I know that in c++ std::string should be used instead of c strings, but in this case my string is stored in a shared memory.
EDIT
Ok, I need to add some explanation.
My application is getting a string from a shared memory (which is of some length), therefore it could be represented as an array of characters. If there is a bug in the library writing this string, then the string would not be zero terminated, and the strlen could fail.
You've added that the string is in shared memory. That's guaranteed readable, and of fixed size. You can therefore use size_t MaxPossibleSize = startOfSharedMemory + sizeOfSharedMemory - input; strnlen(input, MaxPossibleSize) (mind the extra n in strnlen).
This will return MaxPossibleSize if there's no \0 in the shared memory following input, or the string length if there is. (The maximal possible string length is of course MaxPossibleSize-1, in case the last byte of shared memory is the first \0)
C strings that are not null-terminated are not C strings, they are simply arrays of characters, and there is no way of finding their length.
If you define a c-string as
char* cowSays = "moo";
then you autmagically get the '\0' at the end and strlen would return 3. If you define it like:
char iDoThis[1024] = {0};
you get an empty buffer (and array of characters, all of which are null characters). You can then fill it with what you like as long as you don't over-run the buffer length. At the start strlen would return 0, and once you have written something you would also get the correct number from strlen.
You could also do this:
char uhoh[100];
int len = strlen(uhoh);
but that would be bad, because you have no idea what is in that array. It could hit a null character you might not. The point is that the null character is the defined standard manner to declare that the string is finished.
Not having a null character means by definition that the string is not finished. Changing that will break the paradigm of how the string works. What you want to do is make up your own rules. C++ will let you do that, but you will have to write a lot of code yourself.
EDIT
From your newly added info, what you want to do is loop over the array and check for the null character by hand. You should also do some validation if you are expecting ASCII characters only (especially if you are expecting alpha-numeric characters). This assumes that you know the maximum size.
If you do not need to validate the content of the string then you could use one of the strnlen family of functions:
http://msdn.microsoft.com/en-us/library/z50ty2zh%28v=vs.80%29.aspx
http://linux.about.com/library/cmd/blcmdl3_strnlen.htm
size_t safe_strlen(const char *str, size_t max_len)
{
const char * end = (const char *)memchr(str, '\0', max_len);
if (end == NULL)
return max_len;
else
return end - str;
}
Yes, since C11:
size_t strnlen_s( const char *str, size_t strsz );
Located in <string.h>
Get a better library, or verify the one you have - if you can't trust you library to do what it says it will, then how the h%^&l do you expect your program to?
Thats said, Assuming you know the length of the buiffer the string resides, what about
buffer[-1+sizeof(buffer)]=0 ;
x = strlen(buffer) ;
make buffer bigger than needed and you can then test the lib.
assert(x<-1+sizeof(buffer));
C11 includes "safe" functions such as strnlen_s. strnlen_s takes an extra maximum length argument (a size_t). This argument is returned if a null character isn't found after checking that many characters. It also returns the second argument if a null pointer is provided.
size_t strnlen_s(const char *, size_t);
While part of C11, it is recommended that you check that your compiler supports these bounds-checking "safe" functions via its definition of __STDC_LIB_EXT1__. Furthermore, a user must also set another macro, __STDC_WANT_LIB_EXT1__, to 1, before including string.h, if they intend to use such functions. See here for some Stack Overflow commentary on the origins of these functions, and here for C++ documentation.
GCC and Clang also support the POSIX function strnlen, and provide it within string.h. Microsoft too provide strnlen which can also be found within string.h.
You will need to encode your string. For example:
struct string
{
size_t len;
char *data;
} __attribute__(packed);
You can then accept any array of characters if you know the first sizeof(size_t) bytes of the shared memory location is the size of the char array. It gets tricky when you want to chain arrays this way.
It's better to trust your other end to terminate it's strings or roll your own strlen that does not go outside the bounderies of the shared memory segment (providing you know at least the size of that segment).
If you need to get the size of shared memory, try to use
// get memory size
struct shmid_ds shm_info;
size_t shm_size;
int shm_rc;
if((shm_rc = shmctl(shmid, IPC_STAT, &shm_info)) < 0)
exit(101);
shm_size = shm_info.shm_segsz;
Instead of using strlen you can use shm_size - 1 if you are sure that it is null terminated. Otherwise you can null terminate it by data[shm_size - 1] = '\0'; then use strlen(data);
a simple solution:
buff[BUFF_SIZE -1] = '\0'
ofc this will not tell you if the string originally was exactly BUFF_SIZE-1 long or it was just not terminated... so you need xtra logic for that.
How about this portable nugget:
int safeStrlen(char *buf, int max)
{
int i;
for(i=0;buf[i] && i<max; i++){};
return i;
}
As Neil Butterworth already said in his answer above: C-Strings which are not terminated by a \0 character, are no C-Strings!
The only chance you do have is to write an immutable Adaptor or something which creates a valid copy of the C-String with a \0 terminating character. Of course, if the input is wrong and there is an C-String defined like:
char cstring[3] = {'1','2','3'};
will indeed result in unexpected behavior, because there can be something like 123#4x\0 in the memory now. So the result of of strlen() for example is now 6 and not 3 as expected.
The following approach shows how to create a safe C-String in any case:
char *createSafeCString(char cStringToCheck[]) {
//Cast size_t to integer
int size = static_cast<int>(strlen(cStringToCheck)) ;
//Initialize new array out of the stack of the method
char *pszCString = new char[size + 1];
//Copy data from one char array to the new
strncpy(pszCString, cStringToCheck, size);
//set last character to the \0 termination character
pszCString[size] = '\0';
return pszCString;
}
This ensures that if you manipulate the C-String to not write on the memory of something else.
But this is not what you wanted. I know, but there is no other way to achieve the length of a char array without termination. This isn't even an approach. It just ensures that even if the User (or Dev) is inserting ***** to work fine.

Is there any safe strcmp?

I made a function like this:
bool IsSameString(char* p1, char* p2)
{
return 0 == strcmp(p1, p2);
}
The problem is that sometimes, by mistake, arguments are passed which are not strings (meaning that p1 or p2 is not terminated with a null character).
Then, strcmp continues comparing until it reaches non-accessible memory and crashes.
Is there a safe version of strcmp? Or can I tell whether p1 (and p2) is a string or not in a safe manner?
No, there's no (standard) way to tell whether a char * actually points to valid memory.
In your situation, it is better to use std::string rather than char *s for all your strings, along with the overloaded == operator. If you do this, the compiler would enforce type safety.
EDIT: As per the comments below if you find yourself in a situation where you sometimes pass char *s that may or may not be valid strings to functions that expect null-terminated strings then something is fundamentally wrong with your approach, so basically
#janm's answer below.
In some cases std::strncmp can solve your problem:
int strncmp ( const char * str1, const char * str2, size_t num );
It compares up to num characters of the C string str1 to those of the C string str2.
Also, take a look, what the US DHS National Cyber Security Division recommends on this matter:
Ensure that strings are null terminated before passing into strcmp. This can be enforced by always placing a \0 in the last allocated byte of the buffer.
char str1[] ="something";
char str2[] = "another thing";
/* In this case we know strings are null terminated. Pretend we don't. */
str1[sizeof(str1)-1] = '\0';
str2[sizeof(str2)-1] = '\0';
/* Now the following is safe. */
if (strcmp(str1, str2)) { /* do something */ } else { /* do something else */ }
If you are passing strings to strcmp() that are not null terminated you have already lost. The fact that you have a string that is not null terminated (but should be) indicates that you have deeper issues in your code. You cannot change strcmp() to safely deal with this problem.
You should be writing your code so that can never happen. Start by using the string class. At the boundaries where you take data into your code you need to make sure you deal with the exceptional cases; if you get too much data you need to Do The Right Thing. That does not involve running off the end of your buffer. If you must perform I/O into a C style buffer, use functions where you specify the length of the buffer and detect and deal with cases where the buffer is not large enough at that point.
There's no cure for this that is portable. The convention states that there's an extra character holding a null character that belongs to the same correctly allocated block of memory as the string itself. Either this convention is followed and everything's fine or undefined behaviour occurs.
If you know the length of the string you compare against you can use strncmp() but his will not help if the string passed to your code is actually shorter than the string you compare against.
you can use strncmp, But if possible use std::string to avoid many problems :)
You can put an upper limit on the number of characters to be compared using the strncmp function.
There is no best answer to this as you can't verify the char* is a string. The only solution is to create a type and use it for string for example str::string or create your own if you want something lighter. ie
struct MyString
{
MyString() : str(0), len(0) {}
MyString( char* x ) { len = strlen(x); str = strdup(x); }
⁓MyString() { if(str) free(str); }
char* str;
size_t len;
};
bool IsSameString(MyString& p1, MyString& p2)
{
return 0 == strcmp(p1.str, p2.str);
}
MyString str1("test");
MyString str2("test");
if( IsSameString( str1, str2 ) {}
You dont write, what platform you are using. Windows has the following functions:
IsBadStringPtr
IsBadReadPtr
IsBadWritePtr
IsBadStringPtr might be what you are looking for, if you are using windows.

Difference between string and char[] types in C++

For C, we use char[] to represent strings.
For C++, I see examples using both std::string and char arrays.
#include <iostream>
#include <string>
using namespace std;
int main () {
string name;
cout << "What's your name? ";
getline(cin, name);
cout << "Hello " << name << ".\n";
return 0;
}
#include <iostream>
using namespace std;
int main () {
char name[256];
cout << "What's your name? ";
cin.getline(name, 256);
cout << "Hello " << name << ".\n";
return 0;
}
(Both examples adapted from http://www.cplusplus.com.)
What is the difference between these two types in C++? (In terms of performance, API integration, pros/cons, ...)
A char array is just that - an array of characters:
If allocated on the stack (like in your example), it will always occupy eg. 256 bytes no matter how long the text it contains is
If allocated on the heap (using malloc() or new char[]) you're responsible for releasing the memory afterwards and you will always have the overhead of a heap allocation.
If you copy a text of more than 256 chars into the array, it might crash, produce ugly assertion messages or cause unexplainable (mis-)behavior somewhere else in your program.
To determine the text's length, the array has to be scanned, character by character, for a \0 character.
A string is a class that contains a char array, but automatically manages it for you. Most string implementations have a built-in array of 16 characters (so short strings don't fragment the heap) and use the heap for longer strings.
You can access a string's char array like this:
std::string myString = "Hello World";
const char *myStringChars = myString.c_str();
C++ strings can contain embedded \0 characters, know their length without counting, are faster than heap-allocated char arrays for short texts and protect you from buffer overruns. Plus they're more readable and easier to use.
However, C++ strings are not (very) suitable for usage across DLL boundaries, because this would require any user of such a DLL function to make sure he's using the exact same compiler and C++ runtime implementation, lest he risk his string class behaving differently.
Normally, a string class would also release its heap memory on the calling heap, so it will only be able to free memory again if you're using a shared (.dll or .so) version of the runtime.
In short: use C++ strings in all your internal functions and methods. If you ever write a .dll or .so, use C strings in your public (dll/so-exposed) functions.
Arkaitz is correct that string is a managed type. What this means for you is that you never have to worry about how long the string is, nor do you have to worry about freeing or reallocating the memory of the string.
On the other hand, the char[] notation in the case above has restricted the character buffer to exactly 256 characters. If you tried to write more than 256 characters into that buffer, at best you will overwrite other memory that your program "owns". At worst, you will try to overwrite memory that you do not own, and your OS will kill your program on the spot.
Bottom line? Strings are a lot more programmer friendly, char[]s are a lot more efficient for the computer.
Well, string type is a completely managed class for character strings, while char[] is still what it was in C, a byte array representing a character string for you.
In terms of API and standard library everything is implemented in terms of strings and not char[], but there are still lots of functions from the libc that receive char[] so you may need to use it for those, apart from that I would always use std::string.
In terms of efficiency of course a raw buffer of unmanaged memory will almost always be faster for lots of things, but take in account comparing strings for example, std::string has always the size to check it first, while with char[] you need to compare character by character.
I personally do not see any reason why one would like to use char* or char[] except for compatibility with old code. std::string's no slower than using a c-string, except that it will handle re-allocation for you. You can set it's size when you create it, and thus avoid re-allocation if you want. It's indexing operator ([]) provides constant time access (and is in every sense of the word the exact same thing as using a c-string indexer). Using the at method gives you bounds checked safety as well, something you don't get with c-strings, unless you write it. Your compiler will most often optimize out the indexer use in release mode. It is easy to mess around with c-strings; things such as delete vs delete[], exception safety, even how to reallocate a c-string.
And when you have to deal with advanced concepts like having COW strings, and non-COW for MT etc, you will need std::string.
If you are worried about copies, as long as you use references, and const references wherever you can, you will not have any overhead due to copies, and it's the same thing as you would be doing with the c-string.
One of the difference is Null termination (\0).
In C and C++, char* or char[] will take a pointer to a single char as a parameter and will track along the memory until a 0 memory value is reached (often called the null terminator).
C++ strings can contain embedded \0 characters, know their length without counting.
#include<stdio.h>
#include<string.h>
#include<iostream>
using namespace std;
void NullTerminatedString(string str){
int NUll_term = 3;
str[NUll_term] = '\0'; // specific character is kept as NULL in string
cout << str << endl <<endl <<endl;
}
void NullTerminatedChar(char *str){
int NUll_term = 3;
str[NUll_term] = 0; // from specific, all the character are removed
cout << str << endl;
}
int main(){
string str = "Feels Happy";
printf("string = %s\n", str.c_str());
printf("strlen = %d\n", strlen(str.c_str()));
printf("size = %d\n", str.size());
printf("sizeof = %d\n", sizeof(str)); // sizeof std::string class and compiler dependent
NullTerminatedString(str);
char str1[12] = "Feels Happy";
printf("char[] = %s\n", str1);
printf("strlen = %d\n", strlen(str1));
printf("sizeof = %d\n", sizeof(str1)); // sizeof char array
NullTerminatedChar(str1);
return 0;
}
Output:
strlen = 11
size = 11
sizeof = 32
Fee s Happy
strlen = 11
sizeof = 12
Fee
Think of (char *) as string.begin(). The essential difference is that (char *) is an iterator and std::string is a container. If you stick to basic strings a (char *) will give you what std::string::iterator does. You could use (char *) when you want the benefit of an iterator and also compatibility with C, but that's the exception and not the rule. As always, be careful of iterator invalidation. When people say (char *) isn't safe this is what they mean. It's as safe as any other C++ iterator.
Strings have helper functions and manage char arrays automatically. You can concatenate strings, for a char array you would need to copy it to a new array, strings can change their length at runtime. A char array is harder to manage than a string and certain functions may only accept a string as input, requiring you to convert the array to a string. It's better to use strings, they were made so that you don't have to use arrays. If arrays were objectively better we wouldn't have strings.

Can I get a non-const C string back from a C++ string?

Const-correctness in C++ is still giving me headaches. In working with some old C code, I find myself needing to assign turn a C++ string object into a C string and assign it to a variable. However, the variable is a char * and c_str() returns a const char []. Is there a good way to get around this without having to roll my own function to do it?
edit: I am also trying to avoid calling new. I will gladly trade slightly more complicated code for less memory leaks.
C++17 and newer:
foo(s.data(), s.size());
C++11, C++14:
foo(&s[0], s.size());
However this needs a note of caution: The result of &s[0]/s.data()/s.c_str() is only guaranteed to be valid until any member function is invoked that might change the string. So you should not store the result of these operations anywhere. The safest is to be done with them at the end of the full expression, as my examples do.
Pre C++-11 answer:
Since for to me inexplicable reasons nobody answered this the way I do now, and since other questions are now being closed pointing to this one, I'll add this here, even though coming a year too late will mean that it hangs at the very bottom of the pile...
With C++03, std::string isn't guaranteed to store its characters in a contiguous piece of memory, and the result of c_str() doesn't need to point to the string's internal buffer, so the only way guaranteed to work is this:
std::vector<char> buffer(s.begin(), s.end());
foo(&buffer[0], buffer.size());
s.assign(buffer.begin(), buffer.end());
This is no longer true in C++11.
There is an important distinction you need to make here: is the char* to which you wish to assign this "morally constant"? That is, is casting away const-ness just a technicality, and you really will still treat the string as a const? In that case, you can use a cast - either C-style or a C++-style const_cast. As long as you (and anyone else who ever maintains this code) have the discipline to treat that char* as a const char*, you'll be fine, but the compiler will no longer be watching your back, so if you ever treat it as a non-const you may be modifying a buffer that something else in your code relies upon.
If your char* is going to be treated as non-const, and you intend to modify what it points to, you must copy the returned string, not cast away its const-ness.
I guess there is always strcpy.
Or use char* strings in the parts of your C++ code that must interface with the old stuff.
Or refactor the existing code to compile with the C++ compiler and then to use std:string.
There's always const_cast...
std::string s("hello world");
char *p = const_cast<char *>(s.c_str());
Of course, that's basically subverting the type system, but sometimes it's necessary when integrating with older code.
If you can afford extra allocation, instead of a recommended strcpy I would consider using std::vector<char> like this:
// suppose you have your string:
std::string some_string("hello world");
// you can make a vector from it like this:
std::vector<char> some_buffer(some_string.begin(), some_string.end());
// suppose your C function is declared like this:
// some_c_function(char *buffer);
// you can just pass this vector to it like this:
some_c_function(&some_buffer[0]);
// if that function wants a buffer size as well,
// just give it some_buffer.size()
To me this is a bit more of a C++ way than strcpy. Take a look at Meyers' Effective STL Item 16 for a much nicer explanation than I could ever provide.
You can use the copy method:
len = myStr.copy(cStr, myStr.length());
cStr[len] = '\0';
Where myStr is your C++ string and cStr a char * with at least myStr.length()+1 size. Also, len is of type size_t and is needed, because copy doesn't null-terminate cStr.
Just use const_cast<char*>(str.data())
Do not feel bad or weird about it, it's perfectly good style to do this.
It's guaranteed to work in C++11. The fact that it's const qualified at all is arguably a mistake by the original standard before it; in C++03 it was possible to implement string as a discontinuous list of memory, but no one ever did it. There is not a compiler on earth that implements string as anything other than a contiguous block of memory, so feel free to treat it as such with complete confidence.
If you know that the std::string is not going to change, a C-style cast will work.
std::string s("hello");
char *p = (char *)s.c_str();
Of course, p is pointing to some buffer managed by the std::string. If the std::string goes out of scope or the buffer is changed (i.e., written to), p will probably be invalid.
The safest thing to do would be to copy the string if refactoring the code is out of the question.
std::string vString;
vString.resize(256); // allocate some space, up to you
char* vStringPtr(&vString.front());
// assign the value to the string (by using a function that copies the value).
// don't exceed vString.size() here!
// now make sure you erase the extra capacity after the first encountered \0.
vString.erase(std::find(vString.begin(), vString.end(), 0), vString.end());
// and here you have the C++ string with the proper value and bounds.
This is how you turn a C++ string to a C string. But make sure you know what you're doing, as it's really easy to step out of bounds using raw string functions. There are moments when this is necessary.
If c_str() is returning to you a copy of the string object internal buffer, you can just use const_cast<>.
However, if c_str() is giving you direct access tot he string object internal buffer, make an explicit copy, instead of removing the const.
Since c_str() gives you direct const access to the data structure, you probably shouldn't cast it. The simplest way to do it without having to preallocate a buffer is to just use strdup.
char* tmpptr;
tmpptr = strdup(myStringVar.c_str();
oldfunction(tmpptr);
free tmpptr;
It's quick, easy, and correct.
In CPP, if you want a char * from a string.c_str()
(to give it for example to a function that only takes a char *),
you can cast it to char * directly to lose the const from .c_str()
Example:
launchGame((char *) string.c_str());
C++17 adds a char* string::data() noexcept overload. So if your string object isn't const, the pointer returned by data() isn't either and you can use that.
Is it really that difficult to do yourself?
#include <string>
#include <cstring>
char *convert(std::string str)
{
size_t len = str.length();
char *buf = new char[len + 1];
memcpy(buf, str.data(), len);
buf[len] = '\0';
return buf;
}
char *convert(std::string str, char *buf, size_t len)
{
memcpy(buf, str.data(), len - 1);
buf[len - 1] = '\0';
return buf;
}
// A crazy template solution to avoid passing in the array length
// but loses the ability to pass in a dynamically allocated buffer
template <size_t len>
char *convert(std::string str, char (&buf)[len])
{
memcpy(buf, str.data(), len - 1);
buf[len - 1] = '\0';
return buf;
}
Usage:
std::string str = "Hello";
// Use buffer we've allocated
char buf[10];
convert(str, buf);
// Use buffer allocated for us
char *buf = convert(str);
delete [] buf;
// Use dynamic buffer of known length
buf = new char[10];
convert(str, buf, 10);
delete [] buf;