c++ char array out of scope or not? - c++

I have a method that requires a const char pointer as input (not null terminated). This is a requirement of a library (TinyXML) I'm using in my project. I get the input for this method from a string.c_str() method call.
Does this char pointer need to be deleted? The string goes out of scope immediately after the call completes; so the string should delete it with its destructor call, correct?

Do not delete the memory you get from std::string::c_str. The string is responsible for that (and it is entirely possible it gave you a pointer to its internal buffer, so if you deleted it, that would be a bad thing (tm)).

The char array returned by string.c_str() is null terminated. If TinyXML's function takes a char* buffer that is not null terminated, then you're probably going to get some unexpected behaviour.
const char* c_str ( ) const;
Get C string equivalent
Generates a null-terminated sequence
of characters (c-string) with the same
content as the string object and
returns it as a pointer to an array of
characters.
A terminating null character is
automatically appended.
No, it does not need to be released. String's destructor does that for you.
The returned array points to an
internal location with the required
storage space for this sequence of
characters plus its terminating
null-character, but the values in this
array should not be modified in the
program and are only granted to remain
unchanged until the next call to a
non-constant member function of the
string object.
Source

On another note,
If you don't need the char pointer to be null terminated, then you're better off using str.data() rather than str.c_str(). The difference is that, before C++11, .data() doesn't guarantee that what you get is null terminated (since C++11 both functions return the same null-terminated buffer). This is useful if your string just happens to occupy the entire length of the internal buffer allocated by string. In this case, calling .c_str() would force string to reallocate the data to a new, bigger buffer, one that contains enough space to add the '\0' at the end.
At any rate, of course you shouldn't delete the pointer returned. string will take care of that.

std::string.c_str() returns a pointer to a null terminated string. The actual array of characters is still owned by the std::string object, and it is valid as long as:
The std::string object is valid, and
No calls to non-const member functions on the std::string object are made (i.e. modifying the string invalidates any previous C-style string pointed to).
It's up to the string object itself to allocate and release the null terminated array-of-char it returns to you.
You can always use a null-terminated string as a non-null-terminated string. After all, an NTS is just a non-NTS with an extra zero at the end. As long as the string is correctly terminated as the function expects, it'll never see the "extra" null.
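A minimal sketch of the ownership rule the answers above describe. The function count_bytes is a hypothetical stand-in for a library call that takes a pointer plus a length; it is not TinyXML's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a library function that reads exactly `len`
// bytes from `buf` and never looks for a terminating null.
std::size_t count_bytes(const char* buf, std::size_t len, char wanted) {
    assert(buf != nullptr || len == 0);
    std::size_t hits = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (buf[i] == wanted) ++hits;
    }
    return hits;
}

std::size_t count_in_local_string(char wanted) {
    std::string xml = "<a><b/></a>";
    // The pointer from c_str() is owned by `xml`; we only borrow it.
    const char* p = xml.c_str();
    std::size_t n = count_bytes(p, xml.size(), wanted);
    // No delete here: xml's destructor releases the buffer on return.
    return n;
}
```

The borrowed pointer is only used while xml is alive and unmodified, which is the whole contract.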

Related

Pointer value to array C++

I'm trying to set a pointer array to a char array in class Tran, however it only applies the first letter of the string. I've tried many other ways but can't get the whole string to go into name.
edit: name is a private variable
char name[MAX_NAME + 1];
Trying to output it using cout << name << endl;
the input is:
setTran("Birth Tran", 1);
Help would be appreciated, thank you.
namee[0] == NULL
name[0] = NULL;
These are bugs. NULL is for pointers. name[0] as well as namee[0] is a char. It may work (by work, I mean it will assign the first character to be the null terminator character) on some systems because 0 is both a null pointer constant and an integer literal and thus convertible to char, and NULL may be defined as 0. But NULL may also be defined as nullptr in which case the program will be ill-formed.
Use name[0] = '\0' instead.
name[0] = *namee;
however it only applies the first letter of the string.
Well, you assign only the first character, so this is to be expected.
If you would like to copy the entire string, you need to assign all of the characters. That can be implemented with a loop. There are standard functions for copying a string though; You can use std::strncpy.
That said, constant length arrays are usually problematic because it is rarely possible to correctly predict the maximum required size. std::string is a more robust alternative.
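As a rough sketch of the copy suggested above: the question's class body is not shown, so the value of MAX_NAME and the member function names setName and getName are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t MAX_NAME = 20;  // assumed value; the question does not give it

class Tran {
public:
    Tran() { name[0] = '\0'; }

    // Copies at most MAX_NAME characters and always terminates the result.
    void setName(const char* namee) {
        assert(namee != nullptr);
        std::strncpy(name, namee, MAX_NAME);
        name[MAX_NAME] = '\0';  // strncpy does not terminate on truncation
    }

    const char* getName() const { return name; }

private:
    char name[MAX_NAME + 1];
};
```

Longer input is silently truncated to MAX_NAME characters here; a std::string member would avoid the limit altogether.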
The underlying issue is that you are trying to assign a const char* to a char* const. When declaring
char name[MAX_NAME + 1];
You are declaring a constant memory address containing mutable char data (char* const). When you are passing a const char* to your function, you are passing a mutable pointer containing constant data. This will not compile. You should be doing a deep copy of the char array by using:
strcpy_s(dst, buffer_size, src);
This copy function will make sure that your array does not overflow, and that it is null terminated.
In order to be able to assign a pointer to a char array, it would need to be allocated on the heap with
char* name = new char[MAX_NAME + 1];
This would allow assigning a char* or char* const to it afterwards. You however need to manage the memory dynamically at this point, and I would advise against this in your case, as passing "Birth Tran" would lead to undefined behaviours as soon as char* const namee goes out of scope.

How does the std::string constructor handle char[] of fixed size?

How does the string constructor handle char[] of a fixed size when the actual sequence of characters in that char[] could be smaller than the maximum size?
char foo[64];//can hold up to 64
const char* bar = "0123456789"; //Much less than 64 chars, terminated with '\0'
strcpy(foo,bar); //Copy shorter into longer
std::string banz(foo);//Make a large string
In this example will the size of the banz objects string be based on the original char* length or the char[] that it is copied into?
First you have to remember (or know) that char strings in C++ are really called null-terminated byte strings. That null-terminated bit is a special character ('\0') that tells the end of the string.
The second thing you have to remember (or know) is that arrays naturally decay to pointers to the array's first element. In the case of foo from your example, when you use foo the compiler really does &foo[0].
Finally, if we look at e.g. this std::string constructor reference you will see that there is an overload (number 5) that accepts a const CharT* (with CharT being a char for normal char strings).
Putting it all together, with
std::string banz(foo);
you pass a pointer to the first character of foo, and the std::string constructor will treat it as a null-terminated byte string. And from finding the null-terminator it knows the length of the string. The actual size of the array is irrelevant and not used.
If you want to set the size of the std::string object, you need to explicitly do it by passing a length argument (variant 4 in the constructor reference):
std::string banz(foo, sizeof foo);
This will ignore the null-terminator and set the length of banz to the size of the array. Note that the null-terminator will still be stored in the string, so if you pass a pointer (as retrieved by e.g. the c_str function) to a function which expects a null-terminated string, the string will seem short. Also note that the data after the null-terminator will be uninitialized and have indeterminate contents. You must initialize that data before you use it, otherwise you will have undefined behavior (and in C++ even reading indeterminate data is UB).
As mentioned in a comment from MSalters, the UB from reading uninitialized and indeterminate data also goes for the construction of the banz object using an explicit size. It will typically work and not lead to any problems, but it does break the rules set out in the C++ specification.
Fixing it is easy though:
char foo[64] = { 0 };//can hold up to 64
The above will initialize all of the array to zero. The following strcpy call will not touch the data of the array beyond the terminator, and as such the remainder of the array will be initialized.
The constructor called is one that takes a const char* as an argument. That constructor attempts to copy the character data pointed to by that pointer, until the first NUL terminator is reached. If there is no such NUL terminator then the behaviour of the constructor is undefined.
Your foo type is converted to a char* by pointer decay, then an implicit conversion to const char* occurs at the calling site.
Perhaps there could have been a templatised std::string constructor taking a const char[N] as an argument, which would have allowed the insertion of more than one NUL character (the std::string class after all does support that), but it was not introduced and to do so now would be a breaking change; using
std::string banz{std::begin(foo), std::end(foo)};
will also copy the entire array foo.
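The three constructions discussed in this thread can be compared side by side. This is only a sketch; the function names are made up for illustration, and the array is zero-initialized so that no indeterminate bytes are read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string>

// Length is found by scanning for the first '\0'; the array size is ignored.
std::size_t length_from_terminator() {
    char foo[64] = {0};
    std::strcpy(foo, "0123456789");
    std::string banz(foo);
    return banz.size();
}

// An explicit count copies the whole array, embedded nulls included.
std::size_t length_from_explicit_size() {
    char foo[64] = {0};
    std::strcpy(foo, "0123456789");
    std::string banz(foo, sizeof foo);
    // Readers that stop at the first '\0' still see only ten characters.
    assert(std::strlen(banz.c_str()) == 10);
    return banz.size();
}

// The iterator-pair constructor also copies the whole array.
std::size_t length_from_iterators() {
    char foo[64] = {0};
    std::strcpy(foo, "0123456789");
    std::string banz(std::begin(foo), std::end(foo));
    return banz.size();
}
```

The first function yields 10 and the other two yield 64, matching the explanation above.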

Pointer stores strings?

I recently started learning C++ and came across the concept of a pointer (which is a variable that stores the address of another variable). However, I also came across char* str = "Hello" and became confused. It looks like the string "Hello" is being assigned to the pointer str (which I thought could only store addresses). So can a pointer also store a string?
For future reference you should only use the language tag of the language you're using. C and C++ are two very different languages, and in this case there is a difference.
First the common part: Literal strings like "Hello" are stored by the compiler as arrays. In the case of "Hello" it's an array of six char elements, including the string null terminator.
Now for the part that's different: In C++ such string literal arrays are constant, they can not be modified. Therefore it's an error to have a non-const pointer to such an array. In C the string literal arrays are not constant, but they are still not modifiable, they are in essence read-only. But it's still allowed to have a non-const pointer to them.
And finally for your question: As with all arrays, using them makes them decay into a pointer to their first element, and that is basically what happens here. You make your variable str point to the first element in the string literal array.
A little simplified it can be seen like this (in C):
char anonymous_literal_array[] = "Hello";
...
char *str = &anonymous_literal_array[0]; // Make str point to first element in array
The pointer will store the address of the start of the string, therefore the first character. In this case "Hello" is an immutable literal. (Check the difference: Immutable vs constant)
More precisely, a pointer cannot store a string, or anything else other than an address; it points to a location containing data of the pointed-to type.
Since char* is a pointer to char, it points exactly to a char.
In this example, the pointer is the address of the first character in the string. This is inherited from C where a "string" is an array of characters terminated by a NULL character. In C and C++, arrays and pointers are closely related. When you do your own memory management, you often create an array with a pointer to the first element of the array. That is exactly what is going on here with the array holding the string literal "Hello".
In C and C++, strings are stored as arrays of characters. A string literal like "Hello" actually yields the address of the start of a read-only character array which holds this string.
A char* variable is a pointer to a single byte(char) in memory. The most common way of handling strings is called a c-style string where the char* is a pointer to the first character in the string and is followed by the rest of the characters in memory. The c-string will always end in a '\0' or null character to signify that you've reached the end of the string ( 'H', 'e', 'l', 'l', 'o', '\0' ).
The "Hello" is called a string literal. What happens in memory is at the very beginning of your program, before anything else is run, the program allocates and sets the memory for the "Hello" string where the other static constants are located. When you write char* str = "Hello"; The compiler knows you're using a string literal and sets str to the location of the first character of that string literal.
But be careful though. All string literals are stored in a portion of memory that you cannot write to. If you try to modify that string, you might get memory errors. To make sure this doesn't happen, when dealing with c-strings, you should always write const char* str = "Hello"; That way the compiler will never allow you to modify that memory.
To have a modifiable string, you will need to allocate and manage the memory yourself. I would suggest using std::string, or have some fun and make your own string class that handles the memory.
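A short sketch of the distinction drawn above between pointing at a literal and owning a writable copy. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Points at the literal's first character; the literal itself is read-only.
const char* literal_pointer() {
    const char* str = "Hello";
    return str;  // still valid after return: literals have static storage duration
}

// An array initialized from a literal is a separate, writable copy.
std::string modified_copy() {
    char buf[] = "Hello";       // six chars: 'H','e','l','l','o','\0'
    assert(sizeof(buf) == 6);
    buf[0] = 'J';               // fine: this changes the copy, not the literal
    return std::string(buf);
}
```

Declaring the pointer as const char* lets the compiler reject any attempt to write through it.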

wchar_t pointer

What's wrong with this:
wchar_t * t = new wchar_t;
t = "Tony";
I thought I could use a wchar_t pointer as a string...
Your code has two issues.
First, "Tony" is a pointer to a string of char's. L"Tony" is the appropriate wide string.
Second, you allocate a single wchar_t via new, then immediately lose track of it by reassigning the pointer to Tony. This results in a memory leak.
A pointer just points to a single value. This is important.
All you've done is allocated room for a single wchar_t, and point at it. Then you try to set the pointer to point at a string (remember, just at the first character), but the string type is incorrect.
What you have is a string of char, it "should" be L"Tony". But all you're doing here is leaking your previous memory allocation because the pointer holds a new value.
Rather you want to allocate enough room to hold the entire string, then copy the string into that allocated memory. This is terrible practice, though; never do anything that makes you need to explicitly free memory.
Just use std::wstring and move on. std::wstring t = L"Tony";. It handles all the details, and you don't need to worry about cleaning anything up.
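A minimal sketch of the std::wstring approach recommended above. The helper wide_length is a hypothetical stand-in for a C-style API that takes a const wchar_t*.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The wstring owns a copy of the wide literal, so it can be modified
// and is released automatically when it goes out of scope.
std::wstring make_name() {
    std::wstring t = L"Tony";
    t[0] = L'R';  // writable, unlike the literal itself
    return t;
}

// Hypothetical C-style consumer: counts characters up to the terminator.
std::size_t wide_length(const wchar_t* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != L'\0') ++n;
    return n;
}
```

Passing t.c_str() to such a function works as long as t outlives the call; there is no new and no delete anywhere.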
Since you are a C# developer I will point out a few things C++ does differently.
This allocates a new wchar_t and assigns it to t
wchar_t* t = new wchar_t
This is an array of constant char
"Tony"
To get a constant wchar_t array prefix it with L
L"Tony"
This reassigns t to point to the constant L"Tony" instead of your old wchar_t and causes a memory leak since your wchar_t will never be released.
t = L"Tony"
This creates a string of wide chars (wchar_t) to hold a copy of L"Tony"
std::wstring t = L"Tony"
I think the last line is what you want. If you need access to the wchar_t pointer use t.c_str(). Note that c++ strings are mutable and are copied on each assignment.
The c way to do this would be
const wchar_t* t = L"Tony"
This does not create a copy and only assigns the pointer to point to the const wchar array
What this does is first assign a pointer to a newly allocated wchar_t into t, and then try to assign a non-wide string into t.
Can you use std::wstring instead? That will handle all your memory management needs for you.
You can; it's just that "Tony" is a hardcoded string, and they're ANSI by default in most editors/compilers. If you want to tell the compiler you're typing in a wide (Unicode) string, then prefix it with L, e.g. t = L"Tony".
You have other problems with your code: your allocation is allocating a single wide character (2 bytes on Windows, typically 4 on other platforms), then you're trying to point the original variable to the constant string, thus leaking those bytes.
If you want to create a buffer of Unicode data and place data into it, you want to do:
wchar_t* t = new wchar_t[500];
wcscpy(t, L"Tony");
this is completely wrong.
There's no need to allocate two bytes, make t to point to them, and then overwrite the pointer t leaking the lost memory forever.
Also, "Tony" has a different type. Use:
const wchar_t *t = L"Tony";
IMHO better don't use wchars at all - See https://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful

What is the significance of string data member?

String data
What I am particularly confused about is this statement
"Its contents are guaranteed to remain unchanged only until the next call to a non-constant member function of the string object."
Can someone clarify what does this mean? When to use this and when to avoid using this?
They mean that you could store the pointer and use it later. If some non-const method is called between two accesses, the contents of the buffer your stored pointer refers to may change, and you will face unexpected behaviour.
const char* data() const;
This is saying that the const char * returned by calling str.data() will not change unless someone modifies the string that it came from. Once someone calls a non-constant member function, the returned pointer could be invalid, or could point to different data from what it pointed to immediately after the str.data() function returned.
It means you can pass the returned data to C functions, for example. It means you should not do something like:
const char *old = str.data();
size_t len = str.length();
// ...call a function that modifies str...
// cout << old << endl;
// Since old is not guaranteed to be null terminated (thanks MSalter),
// do something else with the old data instead of writing to cout.
// Inventiveness not at a high this morning; this isn't a particularly
// good example of what to do - a sort of string copy.
char buffer[256];
size_t n = MIN(sizeof(buffer)-1, len);  // never write past the buffer
memcpy(buffer, old, n);                 // old may already be invalid here
buffer[n] = '\0';
By the time the memory copying is done, old may not be valid any more, and len may also be incorrect.
Sometimes, you need to have access to the string formatted as an array of characters - usually because you need to pass the string to some function which expects the string like this (for example strcmp). You can do this by using the data or c_str members, but you have to respect the rules for calling the function which are spelled out plainly in the link you provided:
The returned array points to an
internal location which should not be
modified directly in the program. Its
contents are guaranteed to remain
unchanged only until the next call to
a non-constant member function of the
string object.
You cannot modify the array of characters - the string object assumes that you do not, and if you do this will lead to undefined behaviour.
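One safe pattern that follows from the rule above is to copy the characters while the pointer is still valid and only then modify the string. This is just a sketch; the function name is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Takes a snapshot of str's contents, then modifies str.
std::string snapshot_then_modify(std::string& str) {
    const char* old = str.data();
    std::string saved(old, str.length());  // copy made before any mutation
    assert(saved.size() == str.size());
    // append() is a non-const member: it may reallocate the buffer,
    // so `old` must not be dereferenced after this line.
    str.append(1000, 'x');
    return saved;
}
```

After the call, the returned copy still holds the original text regardless of what happened to the string's internal buffer.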