C++ string literal type and declaration

Two types of the declaration:
char* str1 = "string 1";
and
char str2[] = "string 2";
My compiler rejects the first declaration with an error:
incorrect conversion from const char[9] to char*. That looks reasonable;
a version like this:
const char* str1 = "string 1";
is accepted by the compiler.
Please clarify my understanding.
I believed that if we declare both versions, e.g. in main():
the first one (const char*): only the pointer is allocated on the stack, initialized with some address in the data segment.
the second version (char[]): the whole array of characters is placed on the stack.
As far as I can see, a string literal now always has the type const char[].
Is the use of const char* deprecated? Is it there for C compatibility only?
Where does each version store the string?

char* str1 = "string 1";
This is deprecated as of C++98, and is ill-formed as of C++11, as a string literal can't be modified. Modifying it would result in undefined behavior.
To avoid this, the standard prohibits assigning it to a pointer to modifiable char, since the characters might otherwise be modified later on by a programmer who doesn't realize they must not be.
char str2[] = "string 2";
Yes, this allocates an array of characters initialized with a copy of the literal. When declared inside a function, the array is stored on the stack.
const char* str1 = "string 1";
This isn't deprecated; it is the recommended way (the only way) to assign a string literal to a char pointer. Here, str1 points to const chars, i.e. they can't be modified through it. Thus it is safe to use it somewhere, as the compiler will enforce that the chars are never modified.
Here, str1 is stored on the stack, pointing to a string literal, which may or may not be stored in read-only memory (this depends on the implementation).
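To make the difference between the two accepted declarations concrete, here is a minimal sketch. The function name array_copy_is_independent is made up for illustration; it only shows that editing the array never touches the literal.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstring>   // std::strcmp

// Hypothetical helper: returns true if the array copy can be edited
// while the literal that str1 points to stays untouched.
bool array_copy_is_independent() {
    const char* str1 = "string 1";  // pointer to the literal's (read-only) characters
    char str2[] = "string 1";       // local array holding its own copy of the text

    str2[0] = 'S';                  // fine: modifies the copy
    // str1[0] = 'S';               // would not compile: str1 points to const char

    return std::strcmp(str1, "string 1") == 0 &&
           std::strcmp(str2, "String 1") == 0;
}
```

Calling array_copy_is_independent() from main() should return true on any conforming compiler.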

char str2[] = "string 2";
"string 2" is a string literal of type const char[9], typically stored in read-only memory.
char str2[] allocates a char array (with its size deduced from the initializer) in read-write memory.
= uses the string literal to initialize the char array (in effect a memcpy of the "string 2" content).
That is the principle. The actual machine code produced with optimizations may differ, setting up the content of str2 by less trivial means than a memcpy from the string literal.
char* str1 = "string 1"; - here you are trying to get the address of the string literal's own memory, but that memory is const, so you shouldn't assign or cast it to char *.
const char* str1 should work OK: the conversion from const char[] to const char * is valid (they are almost the same thing, except that with the array the original size is available at compile time, while the pointer is a size-less, dumbed-down version).
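The types involved can be checked at compile time. This is only a sketch (the function name array_and_pointer_types is invented for illustration), but the static_asserts should hold on any C++11 compiler.

```cpp
#include <cassert>      // assert, for exercising the sketch
#include <type_traits>  // std::is_same, std::decay

// A string literal is an lvalue of type "array of N const char",
// where N counts the terminating '\0'.
static_assert(std::is_same<decltype("string 2"), const char (&)[9]>::value,
              "8 visible characters plus the terminator");

// Hypothetical helper that inspects the two declarations.
bool array_and_pointer_types() {
    char str2[] = "string 2";       // array, size deduced from the initializer
    const char* str1 = "string 1";  // pointer, carries no size information

    static_assert(std::is_same<decltype(str2), char[9]>::value,
                  "the array type includes its size");
    static_assert(std::is_same<std::decay<decltype(str2)>::type, char*>::value,
                  "an array decays to a pointer to its first element");
    static_assert(sizeof(str2) == 9, "sizeof sees the whole array");

    // For the pointer, sizeof only reports the size of the pointer itself.
    return str2[8] == '\0' && sizeof(str1) == sizeof(const char*);
}
```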

Related

C++ different behaviour when trying to assign to const types [duplicate]

Why does the following code in C work?
const char* str = NULL;
str = "test";
str = "test2";
Since str is a pointer to a constant character, why are we allowed to assign it different string literals? Further, how can we protect str from being modified? It seems like this could be a problem if, for example, we later assigned str to a longer string which ended up writing over another portion of memory.
I should add that in my test, I printed out the memory address of str before and after each of my assignments and it never changed. So, although str is a pointer to a const char, the memory is actually being modified. I wondered if perhaps this is a legacy issue with C?
You are changing the pointer, which is not const (the thing it's pointing to is const).
If you want the pointer itself to be const, the declaration would look like:
char * const str = "something";
or
char const * const str = "something"; // a const pointer to const char
const char * const str = "something"; // same thing
Const pointers to non-const data are usually a less useful construct than pointer-to-const.
Further, how can we protect str from being modified?
char * const str1; // str1 cannot be modified, but the character pointed to can
const char * str2; // str2 can be modified, but the character pointed to cannot
const char * const str3; // neither str3 nor the character pointed to can be modified.
The easiest way to read this is to start from the variable name and read to the left:
str1 is a constant pointer to a character
str2 is a pointer to a character constant
str3 is a constant pointer to a character constant
NOTE: the right-to-left reading does not work in the general case, but for simple declarations it's a simple way to do it. I found a java applet based on code from "The C Programming Language" that can decipher declarations with a full explanation of how to do it.
On a related note, definitely take a look at "const pointer versus pointer to const". It helps with what some people call const correctness. I keep it in my bookmarks so that I can refer to it every now and then.
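The three combinations can be verified with type traits. A small sketch, using local buffers (buffer and other are made-up names) so that every write is legal:

```cpp
#include <cassert>      // assert, for exercising the sketch
#include <type_traits>  // std::is_const, std::remove_pointer

// Hypothetical helper demonstrating where const applies in each declaration.
bool const_placement_demo() {
    char buffer[] = "abc";
    char other[]  = "xyz";

    char* const       str1 = buffer;  // constant pointer to mutable char
    const char*       str2 = buffer;  // mutable pointer to constant char
    const char* const str3 = buffer;  // constant pointer to constant char

    static_assert(std::is_const<decltype(str1)>::value, "str1 cannot be reseated");
    static_assert(!std::is_const<std::remove_pointer<decltype(str1)>::type>::value,
                  "but its target can be written");
    static_assert(!std::is_const<decltype(str2)>::value, "str2 can be reseated");
    static_assert(std::is_const<std::remove_pointer<decltype(str2)>::type>::value,
                  "but its target is read-only through str2");
    static_assert(std::is_const<decltype(str3)>::value, "str3 cannot be reseated");

    str1[0] = 'A';      // allowed: writes into buffer
    str2 = other;       // allowed: reseats the pointer
    // str1 = other;    // error: str1 is a const pointer
    // str2[0] = 'X';   // error: str2 points to const char
    // str3 = other;    // error; str3[0] = 'X'; is an error too

    return buffer[0] == 'A' && str2 == other && str3 == buffer;
}
```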
What you're looking for may be the syntax...
const char* const str = NULL;
str = "test";
str = "test2";
Notice the second const, after char*, which makes both assignments fail to compile.
The string literals have static storage duration (they are not allocated on the stack), and all your assignments do is change the str pointer to point to their addresses. The constant characters it pointed to initially haven't changed at all.
Besides, declaring a variable as const means that variable is read-only; it does not mean the value is constant!
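A short sketch of what the assignments actually do. The names greeting and reseating_leaves_literals_intact are invented here; the point is that only the pointer variable changes, and that a literal outlives the function that mentions it.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstring>   // std::strcmp

// A string literal has static storage duration, so returning a pointer
// to it from a function is safe.
const char* greeting() { return "test"; }

// Hypothetical helper: reseat a pointer and check that the first literal
// is still intact.
bool reseating_leaves_literals_intact() {
    const char* first = greeting();
    const char* str = first;
    str = "test2";  // only the pointer variable str changes

    return std::strcmp(first, "test") == 0 &&
           std::strcmp(str, "test2") == 0;
}
```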

Difference Between const char[] and const char*

This article discusses declaring a string literal like const char* foo = "foo". It ends with the claim:
const char *foo = "foo";
is almost never what you want. Instead, you want to use one of the following forms:
For a string meant to be exported:
const char foo[] = "foo";
For a string meant to be used in the same source file:
static const char foo[] = "foo";
For a string meant to be used across several source files for the same library:
__attribute__((visibility("hidden"))) const char foo[] = "foo";
My understanding here is that const char* const foo = "foo" is equivalent to const char foo[] = "foo" simply because we're talking about a C-string pointer that can never be changed to point at anything else, whereas const char* foo = "foo" could be used to point at any other C-String.
Is this an accurate synopsis? Always use either const char* const or const char[]?
Let's get pedantic here.
char const * const p_foo = "foo";
The above defines a {constant} pointer to the {constant} string literal "foo". The pointer points to the single first character of the string literal.
const char bar[] = "bar";
The above defines a character array.
The character array is *read-only*.
The character array is the length of the text literal "bar" plus a nul terminator (4 characters).
The contents of the text literal are copied into the array. (The compiler may optimize this step away.)
Fundamentally, you have the difference between a pointer to the first character of a literal and an array.
The pointer is pointing to a single character. Incrementing the pointer may not point to a valid entity (since it is not an array, but a pointer to a single datum). There is an underlying assumption that the pointer can be incremented to the next character.
With an array you know that there are more than one character sequentially in memory (provided the array is of length 2 or more). You don't know if there is a terminating nul in the sequence (collection). You can assume that, but an array of characters does not guarantee that.
Usages
With the array declaration, the length of the text is known at compile time.
With the pointer declaration, you would need to use strlen to determine the length of the text at run-time. The run-time code doesn't know the length of the target data string; only a length of 1 can be guaranteed.
Sometimes, using static and const can help the compiler optimize.
For example:
static const char moo[] = "moo";
allows the compiler to access the text directly without creating an array variable and copying the text into the variable.
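The compile-time versus run-time length point from the usages above can be sketched like this. p_moo, moo_length and pointer_length are made-up names for illustration.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstddef>   // std::size_t
#include <cstring>   // std::strlen

static const char moo[] = "moo";    // array: the type carries the size
const char* const p_moo = "moo";    // pointer: the size is not part of the type

// Known at compile time, no scanning needed.
constexpr std::size_t moo_length = sizeof(moo) - 1;  // minus the terminating '\0'
static_assert(moo_length == 3, "length available to the compiler");

// Through the pointer, the length has to be found by walking to the '\0'.
std::size_t pointer_length() { return std::strlen(p_moo); }
```

Note that an optimizer may well fold the strlen call for a literal it can see; the sketch is about what the language guarantees, not about generated code.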
In a function that receives a pointer to a character, you can't guarantee that the pointer points to a valid location (the content of the pointer can be invalid).
Each declaration has its benefits and side-effects.
The choice is yours.
As Thomas Matthews' answer states, both a const char* and a const char* const are pointers, and a const char[] is an array.
But as justified here there are 3 problems with using pointers:
Memory for the pointer's storage is required
The indirection incurred by the pointer is required
A pointer requires separate storage of an end pointer for the array or an array size
Ultimately as justified in the link:
The simple answer is that when declaring a variable you should prefer a const char[].
I do agree that an array decays into a pointer when being evaluated but there are a few functionalities that come only with an array. For example when you declare an array you have additional information as to what the size of the array is.
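One way to use that extra size information is to take the array by reference in a template, so that the size is deduced before any decay happens. This is a sketch; text_length and length_demo are hypothetical names.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t

// Deduces N from the array type; N includes the terminating '\0'.
template <std::size_t N>
constexpr std::size_t text_length(const char (&)[N]) {
    return N - 1;
}

const char foo[] = "foo";
static_assert(text_length(foo) == 3, "size taken from the array type");
static_assert(text_length("hello") == 5, "works for literals too");

// const char* p = foo;
// text_length(p);   // would not compile: a pointer has no N to deduce

void length_demo() {
    const char bar[] = "bar";
    assert(text_length(bar) == 3);
}
```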
Also, for the (non-const) array case, memory is allocated specifically for foo. So you are allowed to change the contents of the array as usual, and the array is destroyed, releasing its memory, when it goes out of scope (a typical local variable).
When you define it as a pointer, the compiler (usually) places the literal "foo" into read-only memory and makes the pointer point to it. This is why constant strings are in most cases declared as const char*, and why the compiler warns you when you assign a literal to a non-const pointer.
#include <iostream>
int main()
{
char* a = "foo";
return 0;
}
This code would produce a warning like:
ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
char* a = "foo";
and any change you try to make to the string would typically lead to a segmentation fault.

in c++ is this a good practice to initialize char array with string literal?

in c++ is this a good practice to initialize char array with string?
such as:
char* abc = (char *) ("abcabc");
I see a lot of these in my co-worker's code. Should I change it to the right practice?
such as
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
This statement
char* abc = (char *) ("abcabc");
is simply bad. String literals in C++ have types of constant character arrays. So a valid declaration will look like
const char *abc = "abcabc";
Note: In C you indeed may write
char *abc = "abcabc";
Nevertheless string literals are immutable. Any attempt to modify a string literal results in undefined behaviour.
By the way, your code contains no character array initialized by a string literal. :) Maybe you mean the following
char abc[] = "abcabc";
Using standard class std::string does not exclude using character arrays and moreover pointers to string literals.
Take into account that these declarations
const char *abc = "abcabc";
and
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
are not equivalent. In the first declaration, the string literal has static storage duration and its address does not change during program execution.
In the second declaration, the pointer abc points to memory owned by abc_str, which can be reallocated if abc_str is changed. In that case the pointer becomes invalid.
Also, the first declaration implies that the array (string literal) pointed to by the pointer will not be changed. The second implies that the object of type std::string will be changed; otherwise there is little sense in declaring an object of type std::string instead of the pointer.
Thus the meanings of the declarations are simply different.
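A sketch of the lifetime difference, assuming nothing beyond the standard library (refetch_after_change is a made-up name). It deliberately never reads through the possibly stale pointer; it just fetches a fresh one after the string changes.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstring>   // std::strcmp, std::strncmp
#include <string>    // std::string

// Hypothetical helper: compare a pointer to a literal with a pointer
// obtained from std::string::c_str().
bool refetch_after_change() {
    const char* literal = "abcabc";      // static storage, valid for the whole program
    std::string abc_str = "abcabc";
    const char* view = abc_str.c_str();  // valid only while abc_str is left unchanged
    const bool same_text = std::strcmp(literal, view) == 0;

    // Growing the string may move its buffer.
    abc_str += " plus enough extra text to outgrow any small internal buffer";
    // 'view' may dangle now, so it must not be read. Fetch a fresh pointer instead.
    view = abc_str.c_str();

    return same_text &&
           std::strcmp(literal, "abcabc") == 0 &&
           std::strncmp(view, "abcabc plus", 11) == 0;
}
```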
char* abc = (char *) ("abcabc");
That is bad. Don't do it.
You are treating a string literal that is not supposed to be modified like it can be modified.
After that,
abc[0] = 'd';
will be accepted by the compiler but is not OK at run time. What you need to use is:
char abc[] = "abcabc";
This will create an array that is modifiable.
Both of those are bad.
char* abc = (char*) ("abcabc");
A string literal is a constant and, as such, may be stored in write protected memory. Therefore writing to it can crash your program, it is undefined behaviour.
Rather than cast away the constness you should keep it const and make a copy if you want to edit its contents.
const char* abc = "abcabc";
The other one should be avoided too:
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
Keeping it const is good but if the string is changed it could be reallocated to another place in memory leaving your pointer dangling.
Also, in pre-C++11 code the guarantees were weaker: implementations were allowed to share buffers between copies (copy-on-write), so even a first call to a non-const operator[] or begin() could invalidate the pointer.
Better to call abc_str.c_str() each time.
The chances are that because c_str() is such a trivial operation it will be optimized away by the compiler making it just as efficient as using the raw pointer.
Instead of both of those what you should be doing is using std::string all the way. If you absolutely need a const char* (for old legacy code) you can obtain it using c_str().
std::string abc_str = "abcabc"; // this is perfect why do more?
old_horrible_function(abc_str.c_str()); // only when needed
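Spelled out a little further, as a sketch: old_horrible_function here is a stand-in with an assumed signature (it just measures the text), and edit_then_call_legacy is an invented name. All editing happens on the std::string, and the pointer is obtained only at the call site.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstddef>   // std::size_t
#include <cstring>   // std::strlen
#include <string>    // std::string

// Hypothetical stand-in for a legacy API that only accepts C strings.
std::size_t old_horrible_function(const char* text) {
    return std::strlen(text);
}

// Edit the std::string freely, then hand out a pointer only when needed.
std::size_t edit_then_call_legacy() {
    std::string abc_str = "abcabc";
    abc_str[0] = 'd';   // safe: modifies the string's own buffer
    abc_str += "!";     // may reallocate, but no stale pointer exists yet
    return old_horrible_function(abc_str.c_str());  // pointer used immediately
}
```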

Remove const-ness from a variable

I'm trying to remove const-ness from a variable (char*), but for some reason when I try to change the value, the original value of the const variable still remains the same.
const char* str1 = "david";
char* str2 = const_cast<char *> (str1);
str2 = "tna";
Now the value of str2 changes but the original value of str1 remains the same. I've looked it up on Google but couldn't find a clear answer.
When using const_cast and changing the value, should the original of the const variable change as well?
The type of str1 is const char*. It is the char that is const, not the pointer. That is, it's a pointer to const char. That means you can't do this:
str1[0] = 't';
That would change the value of one of the const chars.
Now, what you're doing when you do str2 = "tna"; is changing the value of the pointer. That's fine. You're just changing str2 to point at a different string literal. Now str1 and str2 are pointing to different strings.
With your non-const pointer str2, you could do str2[0] = 't'; - however, you'd have undefined behaviour. You can't modify an object that was originally const. In particular, string literals are typically stored in read-only memory and attempting to modify them will bring you terrible misfortune.
If you want to take a string literal and modify it safely, initialise an array with it:
char str1[] = "david";
This will copy the characters from the string literal over to the char array. Then you can modify them to your liking.
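For completeness, const_cast is well defined when the object underneath is not actually const. A sketch with made-up names (storage, const_cast_on_mutable_storage): here writing through str2 does change what str1 sees, because both point into the same modifiable array.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstring>   // std::strcmp

// Hypothetical helper: cast away const from a pointer whose target
// is a genuinely modifiable array.
bool const_cast_on_mutable_storage() {
    char storage[] = "david";              // modifiable copy of the literal
    const char* str1 = storage;            // read-only view of that array
    char* str2 = const_cast<char*>(str1);  // fine: the underlying object is not const

    str2[0] = 'D';                         // writes into storage

    // Both pointers refer to the same characters, so both see the change.
    return std::strcmp(str1, "David") == 0 && str1 == str2;
}
```

Doing the same with a pointer that refers to a string literal would still be undefined behaviour, as explained above.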
str2 is simply a pointer. And your code just changes the value of the pointer, the address, not the string to which it points.
What's more, what you are attempting to do leads to undefined behaviour, and will most likely result in runtime errors. Most modern compilers store your string "david" in read-only memory, and attempts to modify that memory lead to memory protection errors.

Pointer to const char vs char array vs std::string

Here I've two lines of code
const char * s1 = "test";
char s2 [] = "test";
Both lines of code have the same behavior, so I cannot see any difference whether I should prefer s1 over s2 or vice-versa. In addition to s1 and s2, there is also the way of using std::string. I think the way of using std::string is the most elegant. While looking at other code, I often see that people either use const char * or char s []. Thus, my question is now, when should I use const char * s1 or char s [] or std::string? What are the differences and in which situations should I use which approach?
POINTERS
--------
char const* s1 = "test"; // pointer to string literal - do not modify!
char* s1 = "test"; // pointer to string literal - do not modify!
// (conversion to non-const deprecated in C++03 and
// disallowed in C++11)
ARRAYS
------
char s1[5] = "test"; // mutable character array copied from string literal
// - do what you like with it!
char s1[] = "test"; // as above, but with size deduced from initialisation
CLASS-TYPE OBJECTS
------------------
std::string s1 = "test"; // C++ string object with data copied from string
// literal - almost always what you *really* want
const char * s1 = "test";
char s2 [] = "test";
These two aren't identical. s1 points to constant memory: the characters it points to must not be modified (the pointer itself can still be reseated). Modifying string literals is undefined behaviour.
And yes, in C++ you should prefer std::string.
The first one points to constant characters, the second is a modifiable array. std::string is a class type and provides many useful member functions for string manipulation, making it much easier to use. C-style 'strings' with char pointers are harder to control and manipulate and often cause errors, but they don't have the overhead that std::string has. Generally it's better to stick to std::string because it is easier to maintain.
The only difference between the two that you should care about is this:
Which one is your project already using?
These two do not have the same behavior. s1 is a simple pointer which is initialized to point to some (usually read-only) area of the memory. s2, on the other hand, defines a local array of size 5, and fills it with a copy of this string.
Formally, you are not allowed to modify what s1 points to, that is, do something like s1[0] = 'a'. In particular, under weird circumstances, it could cause all other "test"s in your program to become "aest", because they may all share the same memory. This is the reason modern compilers yell when you write
char* s = "test";
On the other hand, modifying s2 is allowed, since it is a local copy.
In other words, in the following example,
const char* s1 = "test";
const char* s2 = "test";
char s3[] = "test";
char s4[] = "test";
s1 and s2 may very well point to the same address in memory, while s3 and s4 are two different copies of the same string, and reside in different areas of memory.
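A sketch of that last point (the function name arrays_are_distinct_copies is invented). Whether s1 == s2 holds is up to the implementation, so the code does not rely on it; the two arrays, however, are always separate objects.

```cpp
#include <cassert>   // assert, for exercising the sketch
#include <cstring>   // std::strcmp

// Hypothetical helper: two arrays initialized from the same literal
// are independent copies.
bool arrays_are_distinct_copies() {
    const char* s1 = "test";
    const char* s2 = "test";  // may or may not share an address with s1
    char s3[] = "test";
    char s4[] = "test";

    s3[0] = 'b';              // changes s3 only

    return &s3[0] != &s4[0] &&               // different objects in memory
           std::strcmp(s3, "best") == 0 &&
           std::strcmp(s4, "test") == 0 &&   // s4 is unaffected
           std::strcmp(s1, s2) == 0;         // the literals still read "test"
}
```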
If you're writing C++, use std::string unless you absolutely need an array of characters. If you need a modifiable array of characters, use char s[]. If you only need an immutable string, use const char*.
Use std::string unless you know why you need a char array / pointer to char.
Which one to use depends on your requirements. Pointers offer more flexibility, and in some cases more vulnerability. Strings are a safe option and they provide iterator support.