Are you allowed to re-assign char* to a string literal? - c++

This is quite normal:
[const] char *str = "some text";
But initialisation is not the same as reassignment and string literals are a bit of a special case. What are the rules if you try to do this:
[const] char *str = "some text";
str = "some other text";
Note before someone says "try it" I'm asking what the language spec says, not what my particular compiler does.

Let's rephrase your question as being about assignment and reassignment of a const char*. This is because a string literal is a read-only array of characters terminated with \0, and compilers are lax in allowing assignment of a string literal to a char*. C++11 explicitly forbids this.
Reassignment of a const char* to a different literal is permissible: there is no danger of a memory leak here since the strings will be stored in a read-only section of your compiled binary.

About char*
As of C++11 all of that code is illegal when str is a plain char*. String literals can only be bound to char const* or, in general, to a [char] const array.
Notice that a char[] can be initialized with a string literal as per §8.5.2/1:
An array of narrow character type (3.9.1), char16_t array, char32_t array, or wchar_t array can be initialized by a narrow string literal, char16_t string literal, char32_t string literal, or wide string literal, respectively, or by an appropriately-typed string literal enclosed in braces (2.13.5). Successive characters of the value of the string literal initialize the elements of the array.
[Example:
char msg[] = "Syntax error on line %s\n";
shows a character array whose members are initialized with a string-literal. [...]
Previously char* was supported but considered deprecated. And anyway, modifications to the string via that char* were considered undefined behaviour.
As per §4.2/2 (pre-C++11):
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ] For the purpose of ranking in overload resolution (13.3.3.1.1), this conversion is considered an array-to-pointer conversion followed by a qualification conversion (4.4). [Example: "abc" is converted to “pointer to const char” as an array-to-pointer conversion, and then to “pointer to char” as a qualification conversion. ]
About reassigning the pointer
Reassigning a char* or a char const* is perfectly fine. The const there refers to the character, not the pointer. To prevent reassignment you would need char* const and char const* const respectively.
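A minimal sketch of that distinction, assuming a C++11 compiler. The alias names and reseat_demo are made up for this illustration; the static_asserts check at compile time which pointer types accept assignment.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical aliases, introduced only for this illustration.
using PtrToConstChar = const char*;             // pointee is const, pointer is not
using ConstPtrToConstChar = const char* const;  // pointee and pointer are both const

// A const char* lvalue can be given a new address...
static_assert(std::is_assignable<PtrToConstChar&, const char*>::value,
              "const char* can be reseated");
// ...but a const char* const lvalue cannot.
static_assert(!std::is_assignable<ConstPtrToConstChar&, const char*>::value,
              "const char* const cannot be reseated");

// Points a pointer at one literal, then at another, and returns the final value.
inline const char* reseat_demo() {
    const char* str = "some text";
    assert(str[5] == 't');    // still looking at "some text"
    str = "some other text";  // OK: only the pointer changes, not the characters
    return str;
}
```

Calling reseat_demo() returns a pointer into the second literal, which stays valid after the function returns because string literals have static storage duration.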

First of all it would be correct to write
const char *str = "some text";
^^^^^
because string literals in C++ have constant character array types. For example, the string literal "some text" has type const char [10].
An array name used in expressions is implicitly converted to a pointer to its first element.
For example in this declaration
const char *str = "some text";
the string literal is implicitly converted to a value of type const char * that holds the address of the first character of the string literal.
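The type and the conversion can be checked at compile time. This is only a sketch, assuming C++11 or later; first_char_via_pointer is a hypothetical name used for illustration.

```cpp
#include <cassert>
#include <type_traits>

// "some text" has 9 visible characters plus the terminating null character.
static_assert(sizeof("some text") == 10, "array of 10 const char");

// A string literal is an lvalue, so decltype yields a reference to the array type.
static_assert(std::is_same<decltype("some text"), const char (&)[10]>::value,
              "the literal is an array, not a pointer");

// Array-to-pointer conversion turns that array type into const char*.
static_assert(std::is_same<std::decay<decltype("some text")>::type, const char*>::value,
              "the array decays to const char*");

// Initializes a pointer from the literal and reads through it.
inline char first_char_via_pointer() {
    const char* str = "some text";  // implicit array-to-pointer conversion
    assert(str[9] == '\0');         // the terminator is part of the array
    return *str;
}
```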
Pointers may be reassigned. The assignment operator may be used with pointers.
So you may write
const char *str = "some text";
str = "some other text";
Now pointer str is reassigned and points to the first character of string literal "some other text".
However if you declare the pointer itself as a constant object as for example
const char * const str = "some text";
^^^^^
then in this case you may not reassign it. The compiler will issue an error for the statement
str = "some other text";

Related

Why is it possible to pass a string as a char pointer

I have a part of code, I don't understand how it works.
I have int Save(int _key, char *file);
And this method Save accepts string as a char pointer Save(i, "log.txt");
So what happens at the end is inside the Save method I use fopen(file, "a+") and it works perfectly fine.
However I don't understand how it accepts "log.txt" for char *file.
The string literal "log.txt" has type char const[N], as per §2.13.5/8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).
which decays to a pointer when passed as argument, as per §4.2/1:
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to a prvalue of type “pointer to T”. The result is a pointer to the first element of the array.
The conversion from a string literal to char* exists mostly for backward compatibility; it was deprecated in C++03 and removed in C++11.
"log.txt" isn't a std::string; it is actually an array of chars containing {'l','o','g','.','t','x','t','\0'}. Its type is const char[N], which decays to const char*, hence the call to Save(i, "log.txt"); works.
The call works, but the compiler prints a warning stating that converting a string literal to char* is deprecated in C++03 and invalid in C++11.
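One way to avoid relying on the deprecated conversion is to declare the parameter as const char*. The function below is a hypothetical stand-in for the Save() in the question, not its real implementation: it does not open any file and only reports what it was asked to do.

```cpp
#include <cassert>
#include <string>

// Hypothetical replacement signature for Save(int, char*): taking const char*
// lets a string literal be passed without any deprecated conversion.
inline std::string describe_save(int key, const char* file) {
    assert(file != nullptr);
    // A real Save() would call fopen(file, "a+") here; fopen itself takes const char*.
    return "key " + std::to_string(key) + " -> " + file;
}
```

With this signature, describe_save(1, "log.txt") compiles cleanly in C++11 and later, and a char* argument is still accepted because char* converts implicitly to const char*.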
In C++ it is perfectly fine to initialize a pointer to const char with a string literal.
After initialization we can index that pointer like an array, as in:
{
const char *s = "abc";
cout << s[0];
cout << s[1];
}

Would initialising a character array from a string literal be a case of array copy initialisation?

I had always thought it fine to, in my mind, replace any use of a literal with a temporary variable of that literal's type and value. If this is the case, since string literals are of type array of const char would initialising a character array through a string literal not be considered array copy-initialisation? E.g. wouldn't
const char test1[] = "hello";
be somewhat the same as doing...
const char temp[6] = {'h', 'e', 'l', 'l', 'o', '\0'};
const char test2[] = temp;
which would be forbidden since this is an example of array copy initialisation? How is it that string literals can be used to initialise an array if the literal's type is an array? Maybe somewhat related, if string literals are of type array of const char then how is it the following code seems to compile fine on my system?
char* test3 = "hello";
Since test3 is missing low-level const the compiler misses this unlawful conversion, but it compiles fine anyway? Of course trying to change any element through test3 causes the program to crash.
There is no difference between copy or direct-initialization for arrays. Both cases are handled identically by the compiler. The analogy you make in the beginning is more of a rule of thumb. In reality, an array cannot be initialized by another array unless it is a string literal. BTW your analogy is not entirely correct. The target array would be direct-initialized with the temporary array:
const char test2[](test1);
But this still won't compile for the same reason. This is how initialization of a character array works.
[dcl.init]/p17:
The semantics of initializers are as follows. The destination type is the type of the object or reference being initialized and the source type is the type of the initializer expression. If the initializer is not a single (possibly parenthesized) expression, the source type is not defined.
If the initializer is a (non-parenthesized) braced-init-list, the object or reference is list-initialized (8.5.4).
If the destination type is a reference type, see 8.5.3.
If the destination type is an array of characters, an array of char16_t, an array of char32_t, or an array of wchar_t, and the initializer is a string literal, see 8.5.2.
8.5.2:
An array of narrow character type (3.9.1), char16_t array, char32_t array, or wchar_t array can be initialized by a narrow string literal, char16_t string literal, char32_t string literal, or wide string literal,
respectively, or by an appropriately-typed string literal enclosed in braces (2.13.5). Successive characters of the value of the string literal initialize the elements of the array. [ Example:
char msg[] = "Syntax error on line %s\n";
shows a character array whose members are initialized with a string-literal. [..]
In your other example the string literal decays into a pointer to its first element, with which test3 is initialized. This code is invalid in C++11 [1], as the decayed pointer is const char*, but this was a valid conversion in C because string literals were non-const. It was allowed up to C++03, where it was deprecated.
[1] Some compilers still allow the conversion in C++11 as an extension.
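A short sketch of what 8.5.2 permits, assuming C++11. The array gets its own copy of the characters, so writing to it is fine, unlike writing through a pointer to the literal. modified_copy is a made-up name for illustration.

```cpp
#include <cassert>
#include <string>

// Initializes a char array from a literal, modifies the copy, and returns it.
inline std::string modified_copy() {
    char test1[] = "hello";  // array size deduced from the literal
    static_assert(sizeof(test1) == 6, "five letters plus the terminating null");
    test1[0] = 'H';          // fine: test1 is our own modifiable array
    assert(test1[5] == '\0');
    return std::string(test1);
}
```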

Type of a C++ string literal

Out of curiosity, I'm wondering what the real underlying type of a C++ string literal is.
Depending on what I observe, I get different results.
A typeid test like the following:
std::cout << typeid("test").name() << std::endl;
shows me char const[5].
Trying to assign a string literal to an incompatible type like so (to see the given error):
wchar_t* s = "hello";
I get a value of type "const char *" cannot be used to initialize an entity of type "wchar_t *" from VS12's IntelliSense.
But I don't see how it could be const char * as the following line is accepted by VS12:
char* s = "Hello";
I have read that this was allowed in pre-C++11 standards as it was for retro-compatibility with C, although modification of s would result in Undefined Behavior. I assume that this is simply VS12 having not yet implemented all of the C++11 standard and that this line would normally result in an error.
Reading the C99 standard (from here, 6.4.5.5) suggests that it should be an array:
The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence.
So, what is the type underneath a C++ string literal?
Thank you very much for your precious time.
The type of a string literal is indeed const char[SIZE] where SIZE is the length of the string plus the null terminating character.
The fact that you're sometimes seeing const char* is because of the usual array-to-pointer decay.
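To see the array type without relying on typeid, a function template can take the literal by reference, which suppresses the decay and lets the bound be deduced. literal_size and last_element are hypothetical helpers written for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Binding by reference prevents array-to-pointer decay, so N can be deduced.
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N]) {
    return N;
}

// "test" is an array of 5 const char: four letters plus the terminating null.
static_assert(literal_size("test") == 5, "size includes the terminator");

// Here the literal has decayed to a pointer, so the size must be passed separately.
inline char last_element(const char* p, std::size_t n) {
    assert(n > 0);
    return p[n - 1];
}
```

For example, last_element("test", literal_size("test")) yields the terminating '\0'.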
But I don't see how it could be const char * as the following line is accepted by VS12:
char* s = "Hello";
This was valid in C++03 (as a deprecated exception to the usual const-correctness rules), but the conversion was removed in C++11. A C++11 compliant compiler should not accept that code.
The type of a string literal is char const[N] where N is the number of characters including the terminating null character. Although this type does not normally convert to char*, the C++ standard prior to C++11 included a clause allowing a string literal to be assigned to a char*. This clause was added for compatibility, especially with C code written before C had const.
The relevant clause for the type in the standard is 2.14.5 [lex.string] paragraph 8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).
First off, the type of a C++ string literal is an array of n const char. Secondly, if you want to initialise a wchar_t pointer with a string literal you have to use a wide literal and a pointer to const:
const wchar_t* s = L"hello";
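The element type follows the literal's prefix, which can be confirmed at compile time. A small sketch, assuming C++11; wide_hello is a name invented for the example.

```cpp
#include <cassert>
#include <type_traits>

// An ordinary literal is an array of const char...
static_assert(std::is_same<decltype("hello"), const char (&)[6]>::value,
              "narrow literal: array of const char");
// ...while the L prefix gives an array of const wchar_t.
static_assert(std::is_same<decltype(L"hello"), const wchar_t (&)[6]>::value,
              "wide literal: array of const wchar_t");

// Returns a pointer to the first element of a wide string literal.
inline const wchar_t* wide_hello() {
    const wchar_t* s = L"hello";
    assert(s[5] == L'\0');
    return s;
}
```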

standard conversions: Array-to-pointer conversion (strings)

This is the paragraph from the ISO standard, Standard conversions, Array-to-pointer conversion, §4.2/2:
A string literal (2.13.4) that is not a wide string literal can be converted
to an rvalue of type “pointer to char”; a wide string literal can be
converted to an rvalue of type “pointer to wchar_t”. In either case,
the result is a pointer to the first element of the array. This conversion
is considered only when there is an explicit appropriate pointer target
type, and not when there is a general need to convert from an lvalue to
an rvalue. [Note: this conversion is deprecated. ]
For the purpose of ranking in overload resolution (13.3.3.1.1), this
conversion is considered an array-to-pointer conversion followed by a
qualification conversion (4.4).
[Example: "abc" is converted to “pointer to const char” as an array-to-pointer
conversion, and then to “pointer to char” as a qualification conversion. ]
Can anyone explain this, if possible with an example program?
I think I understand string literals, and I know what a wide string literal (the L prefix) means, but I would like the statement above explained, in particular the part about lvalue-to-rvalue conversions.
Before const was introduced into C, many people wrote code like this:
char* p = "hello world";
Since writing to a string literal is undefined behavior, this dangerous conversion was deprecated. But since language changes shouldn't break existing code, the conversion wasn't removed immediately.
Using a pointer to a constant character is legal, since const correctness does not let you write through it:
const char* p = "hello world";
And that's all there really is to it. Ask specific questions if you need more information.
When you write
cout << *str; //output :s
you obtain str[0], because str points to the first element of the char array. str[0] is a char, so cout prints exactly that: the first character of your char array.
Also,
cout << *(str+1);
will print 't'
char* str = "stackoverflow";
cout << str; //output :stackoverflow
This is because the type of str is 'pointer-to-char', so the output is the complete zero-terminated string that starts at str.
cout << *str; //output :s
Here, the type of *str is just char -- using the leading * dereferences the pointer and gives you back the 'thing-pointed-to', which is just the single char 's'.
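The difference between streaming a pointer-to-char and streaming a char can be captured in a string. This is a sketch only; render is a made-up helper, and the '|' separators are added just to keep the four outputs apart.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Streams a char pointer, a char, an offset pointer, and an offset char.
inline std::string render() {
    const char* str = "stackoverflow";
    assert(*str == 's');
    std::ostringstream out;
    out << str << '|'          // const char*: the whole null-terminated string
        << *str << '|'         // char: only the first character
        << (str + 1) << '|'    // const char*: the string starting at the second character
        << *(str + 1);         // char: only the second character
    return out.str();
}
```

render() returns "stackoverflow|s|tackoverflow|t".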

Return char* to string literal

Can you do this?
char* func()
{
char * c = "String";
return c;
}
is "String" here a globally allocated data by compiler?
You can do that. But it would be even more correct to say:
const char* func(){
return "String";
}
The c++ spec says that string literals are given static storage duration. I can't link to it because there are precious few versions of the c++ spec online.
This page on const correctness is the best reference I can find.
Section 2.13.4 of ISO/IEC 14882 (Programming languages - C++) says:
A string literal is a sequence of characters (as defined in 2.13.2) surrounded by double quotes, optionally
beginning with the letter L, as in "..." or L"...". A string literal that does not begin with L is an ordinary
string literal, also referred to as a narrow string literal. An ordinary string literal has type “array of n
const char” and static storage duration (3.7), where n is the size of the string as defined below, and is
initialized with the given characters. ...
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined.
The effect of attempting to modify a string literal is undefined.
You can do this currently (there is no reason to, though). But you cannot do this anymore with C++0x. They removed the deprecated conversion of a string literal (which has the type const char[N]) to a char *.
Note that this conversion is only for string literals. Thus the following two things are illegal, the first of which specifies an array and the second of which specifies a pointer for initialization
char *x = (0 ? "123" : "345"); // illegal: const char[N] -> char*
char *x = +"123"; // illegal: const char * -> char*
GCC incorrectly accepts both, Clang correctly rejects both.
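The second case can be checked with type traits, independent of which compiler extensions are enabled. A sketch assuming C++11; can_drop_const and after_unary_plus are names invented here.

```cpp
#include <cassert>
#include <type_traits>

// Unary plus forces the array-to-pointer conversion, so the result is a plain
// const char* prvalue and no longer a string literal.
static_assert(std::is_same<decltype(+"123"), const char*>::value,
              "+\"123\" has type const char*");

// There is no implicit conversion that drops const from the pointee...
constexpr bool can_drop_const = std::is_convertible<const char*, char*>::value;
static_assert(!can_drop_const, "const char* does not convert to char*");
// ...while adding const is always allowed.
static_assert(std::is_convertible<char*, const char*>::value,
              "char* converts to const char*");

// Stores the decayed pointer in a correctly qualified variable.
inline const char* after_unary_plus() {
    const char* p = +"123";
    assert(p[3] == '\0');
    return p;
}
```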
The constant is not allocated on the heap but it is a constant. You don't need to destroy it.
Not in a modern compiler. In C++11 and later, the type of "String" is const char[7], which decays to const char *, and that can't be assigned to a char * due to the const mismatch.
If you made c a const char * (and changed the return type of the function), the code would be legal. Typically the string literal "String" would be placed in the executable's data section by the linker, and in many cases, in a special section for read-only data.
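A const-correct version of the function might look like the sketch below. length_after_return is a hypothetical helper added only to show that the returned pointer is still usable after func() has returned, since the literal has static storage duration.

```cpp
#include <cassert>
#include <cstring>

// Returns a pointer to a string literal; no allocation and no cleanup needed.
inline const char* func() {
    return "String";
}

// Uses the returned pointer after func() has finished executing.
inline std::size_t length_after_return() {
    const char* s = func();
    assert(s != nullptr);
    return std::strlen(s);
}
```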