This article discusses declaring a string literal as const char* foo = "foo". It ends with the claim:
const char *foo = "foo";
is almost never what you want. Instead, you want to use one of the following forms:
For a string meant to be exported:
const char foo[] = "foo";
For a string meant to be used in the same source file:
static const char foo[] = "foo";
For a string meant to be used across several source files for the same library:
__attribute__((visibility("hidden"))) const char foo[] = "foo";
My understanding here is that const char* const foo = "foo" is equivalent to const char foo[] = "foo" simply because we're talking about a C-string pointer that can never be changed to point at anything else, whereas const char* foo = "foo" could be used to point at any other C-String.
Is this an accurate synopsis? Always use either const char* const or const char[]?
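To make the comparison in the question concrete, here is a minimal sketch. The names reseatable, fixed, as_array, and reseat are made up for illustration and do not come from the article. It suggests the two forms are close but not identical: neither can be reseated, yet only the array carries its size in its type.

```cpp
#include <type_traits>

const char* reseatable = "foo";   // pointer may later point elsewhere
const char* const fixed = "foo";  // pointer can never be reseated
const char as_array[] = "foo";    // an array object, not a pointer

// Only the plain const char* accepts assignment of a new address.
static_assert(std::is_assignable<decltype((reseatable)), const char*>::value,
              "a const char* can be reseated");
static_assert(!std::is_assignable<decltype((fixed)), const char*>::value,
              "a const char* const cannot be reseated");
static_assert(!std::is_assignable<decltype((as_array)), const char*>::value,
              "an array cannot be assigned to at all");

// The array knows its size; the pointer only knows the size of a pointer.
static_assert(sizeof(as_array) == 4, "three letters plus the terminator");
static_assert(sizeof(fixed) == sizeof(const char*), "size of a pointer");

// Reseating compiles only for the non-const pointer.
inline const char* reseat() {
    reseatable = "bar";
    return reseatable;
}
```

Swapping reseatable for fixed inside reseat() should be rejected by the compiler, which is the property the question is after.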
Let's get pedantic here.
char const * const p_foo = "foo";
The above defines a {constant} pointer to the {constant} character literal "foo". The pointer is to the single first character of the character literal.
const char bar[] = "bar";
The above defines a character array.
The character array is read-only.
The character array is the length of the text literal "bar" plus a nul terminator (4 characters).
The contents of the text literal are copied into the array. (The compiler may optimize this step away.)
Fundamentally, you have the difference between a pointer to the first character of a literal and an array.
The pointer is pointing to a single character. Incrementing the pointer may not point to a valid entity (since it is not an array, but a pointer to a single datum). There is an underlying assumption that the pointer can be incremented to the next character.
With an array you know that there is more than one character stored sequentially in memory (provided the array is of length 2 or more). You don't know whether there is a terminating nul in the sequence. You can assume that, but an array of characters does not guarantee it.
Usages
With the array declaration, the length of the text is known at compile time.
With the pointer declaration, you would need to use strlen to determine the length of the text at run-time. The run-time code doesn't know the length of the target data string; only a length of 1 can be guaranteed.
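A small sketch of that difference, reusing the p_foo and bar declarations from above. The helper names array_length and pointer_length are assumptions added for illustration.

```cpp
#include <cstddef>
#include <cstring>

const char* const p_foo = "foo";  // pointer to the literal's first character
const char bar[] = "bar";         // array of 4 chars: 'b', 'a', 'r', '\0'

// The array's size is part of its type, so it can be checked at compile time.
static_assert(sizeof(bar) == 4, "three letters plus the nul terminator");

// For the pointer, sizeof reports the size of the pointer itself.
static_assert(sizeof(p_foo) == sizeof(const char*), "size of a pointer");

// Text length of the array: a compile-time constant.
constexpr std::size_t array_length() { return sizeof(bar) - 1; }

// Text length via the pointer: strlen walks the characters at run time.
inline std::size_t pointer_length() { return std::strlen(p_foo); }
```

Both helpers return 3 here, but only array_length() can be used where a constant expression is required, such as an array bound.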
Sometimes, using static and const can help the compiler optimize.
For example:
static const char moo[] = "moo";
allows the compiler to access the text directly without creating an array variable and copying the text into the variable.
In a function that receives a pointer to a character, you can't guarantee that the pointer points to a valid location (the content of the pointer can be invalid).
Each declaration has its benefits and side-effects.
The choice is yours.
As Thomas Matthews' answer states both a const char* and a const char* const are pointers, and a const char[] is an array.
But as justified here there are 3 problems with using pointers:
Memory for the pointer's storage is required
The indirection incurred by the pointer is required
A pointer requires separate storage of an end pointer for the array or an array size
Ultimately as justified in the link:
The simple answer is that when declaring a variable you should prefer a const char[].
I do agree that an array decays into a pointer when being evaluated but there are a few functionalities that come only with an array. For example when you declare an array you have additional information as to what the size of the array is.
Also, for the fixed array case, memory is allocated specifically for foo. So you are allowed to change the contents of the array like you usually can and the array is destroyed, deallocating memory when it runs out of scope (typical local variable).
When you define it as a pointer, the compiler (usually) places the literal "foo" into read-only memory and makes the pointer point to it. This is why in most cases constant strings are declared as const char*, and why the compiler warns you when you assign a literal to a non-const pointer.
#include <iostream>
int main()
{
char* a = "foo";
return 0;
}
This code produces a warning like:
ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
char* a = "foo";
and any change you try to make to the string would typically lead to a segmentation fault.
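If the text really has to be modified, a safer route is to edit an array initialized from the literal rather than the literal itself. The following is a minimal sketch; the function name capitalize_copy is hypothetical.

```cpp
#include <string>

// Builds a modifiable copy of the literal and changes the copy only.
std::string capitalize_copy() {
    char text[] = "foo";  // the literal's characters are copied into text
    text[0] = 'F';        // fine: text is our own, writable array
    return std::string(text);
}
```

The literal "foo" is never written to, so there is no undefined behavior and no reliance on where the compiler stores it.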
Related
Why does this work:
char foo[6] = "shock";
while this does not work:
char* bar = "shock"; //error
Why does bar have to be const while foo doesn't? Arrays in C decay to pointers, so don't foo and bar technically have the same types?
Literals are held in reserved areas of memory that are not supposed to be changed by code. Changing the value held at the address storing that literal would mean that every time any other code tried to use that literal, it would find the wrong value in that memory. So it is illegal to modify that memory, and hence illegal to treat it as not constant.
Source
With this declaration:
char foo[6] = "shock";
Variable foo has type array of char and contains 6 non-const chars. The string literal contains const chars, which are copied into the array on initialization.
While with this declaration:
char* bar = "shock"; //error
Variable bar is type pointer to char. You are trying to make it point to the address of "shock" which is a string literal containing const char.
You can't point a pointer to non-const char at a const char.
So you must do this:
const char* bar = "shock";
because "shock" is a constant, so a pointer to it must be a pointer to const char.
For historical reasons, C allows the following (which causes many errors that lead to SO posts):
char* bar = "shock";
is roughly equivalent to
const char anonymousArray[6] = "shock";
char *bar = anonymousArray;
Arrays decay to pointers. That's not the same as actually being pointers.
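The distinction can be checked with type traits. This is a sketch using the foo and bar declarations from the question, with bar given the const it needs in C++; the helper decays_to_first_element is an illustrative assumption.

```cpp
#include <type_traits>

char foo[6] = "shock";      // array: owns six modifiable chars
const char* bar = "shock";  // pointer: refers to the literal's storage

// The declared types differ: one is an array, the other a pointer.
static_assert(std::is_same<decltype(foo), char[6]>::value, "foo is char[6]");
static_assert(std::is_same<decltype(bar), const char*>::value, "bar is a pointer");

// Decay is a conversion applied to the array, not its actual type.
static_assert(std::is_same<std::decay<decltype(foo)>::type, char*>::value,
              "foo decays to char*");
static_assert(sizeof(foo) == 6, "sizeof sees the whole array");

// When used as a pointer, the array yields the address of its first element.
inline bool decays_to_first_element() {
    const char* p = foo;
    return p == &foo[0];
}
```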
Assume a function with one parameter of const char* type. When I pass a string when calling that function, nothing goes wrong! I would expect that you could only pass characters included in ASCII, for example c or dec 68, but it seems otherwise. Take a look at the code below...
#include <iostream>
void InitFunction(const char* initString)
{
    std::cout << initString;
}
int main()
{
    InitFunction("Hello!");
}
When you run the code, no problem, warnings, or errors appear. I did some more testing on this as you can see from the following...
void InitFunction(char initString)
{
std::cout << initString;
}
int main()
{
InitFunction("Hello!"); // compiler error on this line
}
This compiles with an error since Hello! cannot be converted into char. I tried another test, but instead used const char as the function parameter type. The code still compiles with an error. But when I add * to the function parameter type, then everything compiles fine! It appears to me that the const is also necessary.
To my understanding of pointers, the function is asking for a pointer, and the pointer is identified as being a character. But this raises three problems for me.
First, Hello! is not in the form of a pointer. I would expect at least to have to use the reference operator (&).
Second, Hello! is not a char type.
And third, why do we need to include the const?
Am I missing something here? Do pointers and characters work in ways that I don't know?
"Hello!" is a const char array of characters. Arrays are treated like a pointer, they decay to a pointer when used in a function call as an argument. The C++ standard specifies this in order to be compatible with the way that the C language standard specifies arrays are to be treated.
By the way this array decay to a pointer happens with other cases where an array is being used where a pointer or the address operator could be used such as an assignment statement. See also Arrays are Pointers?
So "Hello!" will be put into the argument list of InitFunction() as a pointer which points to where the compiler has stored the array of characters with the terminating zero character added. When the compiler generates the code for the function call, the pointer to the array of characters used as an argument to the function is const char *, a pointer to a char which is const and should not be changed.
When you have the function prototype as InitFunction(const char *), the compiler is fine with a function call such as InitFunction("Hello!");. What you are doing is providing a const char * that points to "Hello!".
However if you remove the const then since "Hello!" is a const the compiler complains. The compiler complains because a const variable is being used in a function call whose argument list is non-const indicating that the function may change what the pointer is pointing to. Since "Hello!" is not supposed to be changed, since it is const, the compiler issues an error.
If you remove the asterisk, changing const char * to const char, the compiler complains for a different reason: "Hello!" is a char array that decays into a pointer to its first element, and you are passing that pointer for a parameter that is not a pointer. Here the problem is the data type itself, char versus char *.
The following lines of code would also be acceptable:
const char *p = "Hello!"; // create a pointer to a series of characters
char x1[] = "Hello!"; // create an array of char and initialize it.
char *p2 = x1; // create a pointer to char array and initialize it.
char *p3 = x1 + 2; // create a pointer to char array and initialize it with address of x1[2].
InitFunction (p); // p is a const char *
InitFunction (x1); // x1 decays to a char *
InitFunction (p2); // p2 is a char *
InitFunction (p3); // p3 is a char *
InitFunction (x1 + 3); // called with address of x1[3].
Note also C++ has an actual character string type, string, that results in a dynamic character text string whose underlying physical memory layout normally includes a pointer to an array of characters. The C style array of characters that is labeled a string is not the same thing as the C++ string type.
the function is asking for a pointer
Correct.
and the pointer is identified as being a character
No, a pointer is not a character. A pointer is a pointer. This pointer points to one or more characters, i.e. an array of characters. And that's exactly what your string literal is.
First, Hello! is not in the form of a pointer.
Yes, it is. The string literal "Hello!" has type const char[7] and this decays to a pointer.
I would expect at least to have to use the reference operator (&).
That's the address-of operator. You don't always need it to get a pointer. Example: this.
Second, Hello! is not a char type.
No, but each constituent character is.
And third, why do we need to include the const?
Because the string literal "Hello!" has type const char[7]. Dropping the const would violate const-correctness and is therefore not permitted.
This throws an error since Hello! cannot be converted into char.
That's right. A char is one byte. The string "Hello!" is not one byte. It is a string.
A C-string is typically/conventionally/usually provided in const char* form.
You should read the chapter in your book about this subject as it's a fundamental of the language. It has nothing to do with ASCII (text encodings are irrelevant to storing a sequence of bytes).
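The claim that "Hello!" has type const char[7] can be checked by the compiler. This is a sketch only; literal_size and count_chars are hypothetical helpers, not part of the question's code.

```cpp
#include <cstddef>

// Binds to the array itself (no decay), so N is deduced from the literal's type.
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N]) { return N; }

// Six visible characters plus the terminating '\0'.
static_assert(literal_size("Hello!") == 7, "type is const char[7]");

// Takes the decayed pointer; the size is gone, so it scans for the terminator.
inline std::size_t count_chars(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}
```

Passing the same literal to count_chars works without any & because the array decays to a pointer to its first element at the call.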
How is this even possible?
const char *cp = "Hello world";
I am currently reading C++ Primer and I found this example (I am a complete beginner).
Why is it possible to initialize a char pointer with a string? I really can't understand this example, as far as I know a pointer can only be initialized with & + the address of the object pointed OR dereferenced and THEN assigned some value.
String literals are really arrays of constant characters (including the terminator).
When you do
const char *cp = "Hello world";
you make cp point to the first character of that array.
A little more explanation: Arrays (not just C-style strings using arrays of char but all arrays) naturally decay to pointers to their first element.
Example
char array[] = "Hello world"; // An array of 12 characters (including terminator)
char *pointer1 = &array[0]; // Makes pointer1 point to the first element of array
char *pointer2 = array; // Makes pointer2 point to the first element of array
Using an array is the same as getting a pointer to its first element, so in fact there is an address-of operator & involved, but it's implied and not used explicitly.
As some of you might have noted, when declaring cp above I used const char * as the type, and in my array example with pointer1 and pointer2 I used a plain non-constant char * type. The difference is that the arrays the compiler creates for string literals are constant in C++; they cannot be modified. Attempting to do so leads to undefined behavior. In contrast, the array I created in my latter example is not constant. It is modifiable, and therefore the pointers to it need not be const.
"Hello world" is a read-only literal with a const char[12] type. Note that the final element is the NUL-terminator \0 which the language exploits as an "end of string" marker. C and C++ allow you to type a literal using " surrounding the alphanumeric characters for convenience and that NUL-terminator is added for you.
You are allowed to assign a const char[12] type to a const char* type by a mechanism called array-to-pointer decay.
pointer can only be initialized with & + the address of the object pointed
int i = 0;
int *pointer = &i;
That's correct, but it is not the only way to initialize pointers. Look at the example below.
int *pointer;
pointer = (int*)malloc(100 * sizeof(int));
This is how you initialize a pointer by allocating memory for it.
How is this even possible?
It works because "Hello World" is a string constant. You do not need to allocate this memory, it's the compiler's job.
By the way, always use smart pointers instead of raw pointers if you're using C++.
Two types of the declaration:
char* str1 = "string 1";
and
char str2[] = "string 2";
My compiler rejects the first declaration with the error
incorrect conversion from const char[8] to char*. That looks reasonable.
A version like this:
const char* str1 = "string 1";
is accepted by the compiler.
Please clarify my understanding.
I believed that if we declare both versions, e.g. in main():
the first one (const char*): only the pointer is allocated on the stack, initialized with some address in the data segment.
the second one (char[]): the whole array of characters is placed on the stack.
As far as I can see, a string literal now always has a const char[] type.
Is using const char* deprecated? Is it kept for C compatibility only?
Where does each version store the string?
char* str1 = "string 1";
This is deprecated as of C++98, and is ill-formed as of C++11, as a string literal can't be modified. Modifying it would result in undefined behavior.
To avoid this, the standard prohibits assigning it to a modifiable char pointer, as it might be modified later on without the programmer realizing that he/she shouldn't have modified it.
char str2[] = "string 2";
Yes, this allocates an array of characters, each of which is stored on the stack.
const char* str1 = "string 1";
This isn't deprecated; it is the recommended way (the only way) to assign a string literal to a char pointer. Here, str1 points to const chars, i.e. they can't be modified. Thus it is safe to use it anywhere, as the compiler will enforce that the chars are never modified.
Here, str1 is stored on the stack, pointing to a string literal, which may or may not be stored in read-only memory (this is implementation defined).
char str2[] = "string 2";
"string 2" is string literal const char[9] stored in read-only memory.
char str2[] will allocate a char array (of size deduced from the initializer size) in read-write memory.
= will use the string literal to initialize the char array (doing memcpy of the "string 2" content).
I mean in principle. The actual machine code produced with optimizations may differ, setting up str2 content by less trivial means than memcpy from string literal.
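The "in principle" model can be written out by hand. This is a sketch of the conceptual steps, not of what an optimizing compiler emits; the function name array_is_a_separate_copy is made up for illustration.

```cpp
#include <cstring>

// Compares the built-in initialization with an explicit memcpy from the literal.
bool array_is_a_separate_copy() {
    const char* literal = "string 2";  // points at the literal's own storage
    char str2[] = "string 2";          // array of 9 chars initialized from it

    char manual[9];                    // same size, filled by hand
    std::memcpy(manual, literal, sizeof(manual));

    // Same contents, but str2 is a distinct object from the literal.
    return std::strcmp(str2, manual) == 0 &&
           static_cast<const void*>(str2) != static_cast<const void*>(literal);
}
```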
char* str1 = "string 1"; - here you are trying to get the actual string literal memory address, but that one is const, so you shouldn't assign/cast it to char *.
const char* str1 works: converting from const char[] to const char * is valid (they are almost the same thing, except that the array keeps its size available at compile time, while the pointer is a size-less, dumbed-down version).
In C++, is it good practice to initialize a char array with a string?
such as:
char* abc = (char *) ("abcabc");
I see a lot of these in my co-worker's code. Should I change it to the right practice?
such as
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
This statement
char* abc = (char *) ("abcabc");
is simply bad. String literals in C++ have types of constant character arrays. So a valid declaration will look like
const char *abc = "abcabc";
Note: In C you indeed may write
char *abc = "abcabc";
Nevertheless string literals are immutable. Any attempt to modify a string literal results in undefined behaviour.
By the way, there is no character array initialized by a string literal in your example. :) Maybe you mean the following
char abc[] = "abcabc";
Using standard class std::string does not exclude using character arrays and moreover pointers to string literals.
Take into account that these declarations
const char *abc = "abcabc";
and
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
are not equivalent. Relative to the first declaration string literals have static storage duration and their addresses are not changed during the program execution.
In the second declaration, pointer abc points to dynamically allocated memory that can be reallocated if the object abc_str is changed. In that case the pointer becomes invalid.
Also the first declaration supposes that the array (string literal) pointed to by the pointer will not be changed. In the second declaration it is supposed that the object of type std::string will be changed. Otherwise there is no great sense to declare an object of type std::string instead of the pointer.
Thus the meanings of the declarations are simply different.
char* abc = (char *) ("abcabc");
That is bad. Don't do it.
You are treating a string literal that is not supposed to be modified like it can be modified.
After that,
abc[0] = 'd';
will be accepted by the compiler but is not OK at run time. What you need to use is:
char abc[] = "abcabc";
This will create an array that is modifiable.
Both of those are bad.
char* abc = (char*) ("abcabc");
A string literal is a constant and, as such, may be stored in write protected memory. Therefore writing to it can crash your program, it is undefined behaviour.
Rather than cast away the constness you should keep it const and make a copy if you want to edit its contents.
const char* abc = "abcabc";
The other one should be avoided too:
std::string abc_str = "abcabc";
const char* abc = abc_str.c_str();
Keeping it const is good but if the string is changed it could be reallocated to another place in memory leaving your pointer dangling.
Also in pre C++11 code the pointer stops being valid the second it is assigned because there is no guarantee it is not a temporary.
Better to call abc_str.c_str() each time.
The chances are that because c_str() is such a trivial operation it will be optimized away by the compiler making it just as efficient as using the raw pointer.
Instead of both of those what you should be doing is using std::string all the way. If you absolutely need a const char* (for old legacy code) you can obtain it using c_str().
std::string abc_str = "abcabc"; // this is perfect why do more?
old_horrible_function(abc_str.c_str()); // only when needed
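A slightly fuller sketch of that advice, assuming a legacy function that only accepts a C string. Both legacy_length and length_after_growth are hypothetical names standing in for real code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for an old API that only understands const char*.
std::size_t legacy_length(const char* s) { return std::strlen(s); }

std::size_t length_after_growth() {
    std::string abc_str = "abcabc";
    // Growing the string may move its buffer, so a pointer saved
    // from an earlier c_str() call could be left dangling here.
    abc_str += std::string(100, 'x');
    // Fetching the pointer at the point of use is always valid.
    return legacy_length(abc_str.c_str());
}
```

The pointer returned by c_str() never outlives the expression that needs it, so later changes to abc_str cannot invalidate it.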