standard conversions: Array-to-pointer conversion (strings) - c++

This is the passage from the ISO standard, Standard conversions, Array-to-pointer conversion, §4.2/2:
A string literal (2.13.4) that is not a wide string literal can be converted
to an rvalue of type “pointer to char”; a wide string literal can be
converted to an rvalue of type “pointer to wchar_t”. In either case,
the result is a pointer to the first element of the array. This conversion
is considered only when there is an explicit appropriate pointer target
type, and not when there is a general need to convert from an lvalue to
an rvalue. [Note: this conversion is deprecated. ]
For the purpose of ranking in overload resolution (13.3.3.1.1), this
conversion is considered an array-to-pointer conversion followed by a
qualification conversion (4.4).
[Example: "abc" is converted to “pointer to const char” as an array-to-pointer
conversion, and then to “pointer to char” as a qualification conversion. ]
Can anyone explain this, if possible with an example program?
I think I understand string literals, and I know what a wide string literal is (the L prefix). What I need explained is the statement above, in particular the part about lvalue-to-rvalue conversions.

Before const was introduced into C, many people wrote code like this:
char* p = "hello world";
Since writing to a string literal is undefined behavior, this dangerous conversion was deprecated. But since language changes shouldn't break existing code, the conversion wasn't removed immediately: it stayed deprecated in C++03 and was only removed in C++11.
Using a pointer to a constant character is legal, since const correctness does not let you write through it:
const char* p = "hello world";
And that's all there really is to it. Ask specific questions if you need more information.
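To make the narrow and wide cases from the quoted paragraph concrete, here is a minimal sketch. The function name check_literal_pointers is made up for illustration; the asserts state what I would expect on a conforming C++11 compiler.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Hypothetical helper: exercises narrow and wide literals through pointers to const.
void check_literal_pointers() {
    const char* p = "abc";        // array of 4 const char decays to const char*
    const wchar_t* w = L"abc";    // L prefix: array of 4 const wchar_t decays to const wchar_t*

    assert(std::strlen(p) == 3);  // the terminating '\0' is not counted
    assert(std::wcslen(w) == 3);
    assert(*p == 'a');            // the pointer designates the first element
    assert(*w == L'a');

    // char* q = "abc";       // deprecated in C++03, ill-formed since C++11
    // wchar_t* v = L"abc";   // same rule for the wide literal
    // p[0] = 'x';            // does not compile: the pointee is const
}
```

The commented-out lines are the ones the deprecated conversion in §4.2/2 used to permit; with pointers to const the literals can be read but not written.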

When you write
cout << *str; // output: s
you obtain str[0]: str points to the first element of the character array, so dereferencing it yields that element. str[0] is a char, so cout prints a single character, the first one in your string.
Also,
cout << *(str + 1);
will print 't', whereas cout << (str + 1) prints the rest of the string, "tackoverflow".

const char* str = "stackoverflow";
cout << str; // output: stackoverflow
This is because the type of str is 'pointer-to-const-char', so the output is the complete zero-terminated string that starts at str.
cout << *str; // output: s
Here, the type of *str is just char: the leading * dereferences the pointer and gives you back the 'thing-pointed-to', which is the single char 's'.
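The difference between streaming the pointer and streaming the pointee can be checked with a small sketch like the following. The helper printed and the function check_pointer_output are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns what operator<< writes for a given value.
template <typename T>
std::string printed(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

void check_pointer_output() {
    const char* str = "stackoverflow";
    assert(printed(str) == "stackoverflow");     // const char*: whole zero-terminated string
    assert(printed(*str) == "s");                // char: first character only
    assert(printed(*(str + 1)) == "t");          // char: second character
    assert(printed(str + 1) == "tackoverflow");  // const char*: string starting at index 1
}
```

The overload of operator<< is chosen from the static type of the argument, which is why adding or removing one * changes the output from a whole string to a single character.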

Related

Why is it possible to pass a string as a char pointer

I have a piece of code and I don't understand how it works.
I have int Save(int _key, char *file);
This function accepts a string literal for its char pointer parameter: Save(i, "log.txt");
So what happens at the end is inside the Save method I use fopen(file, "a+") and it works perfectly fine.
However I don't understand how it accepts "log.txt" for char *file.
The string literal "log.txt" has type char const[N], as per §2.13.5/8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).
which decays to a pointer when passed as argument, as per §4.2/1:
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to a prvalue of type “pointer to T”. The result is a pointer to the first element of the array.
The conversion that lets a string literal initialize a char* exists mostly for backward compatibility; it was deprecated in C++03 and removed in C++11.
"log.txt" isn't a std::string; it is actually an array of chars containing {'l','o','g','.','t','x','t','\0'}. Its type is const char[N] (here N is 8), which decays to const char*, hence the call to Save(i, "log.txt"); works.
The call works, but the compiler prints a warning stating that converting from const char* to char* is deprecated in C++03 and invalid in C++11.
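One way to avoid the warning is to declare the parameter as const char*. The sketch below is a hypothetical variant of the question's Save, not the original function: the body (appending the key to the file) and the return codes are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical variant of Save: the parameter is const char*, so passing a
// string literal needs only the ordinary array-to-pointer conversion.
// Appends the key to the named file; returns 0 on success, -1 on failure.
int Save(int key, const char* file) {
    assert(file != nullptr);                // precondition: a valid C string
    std::FILE* f = std::fopen(file, "a+");  // fopen itself takes const char*
    if (f == nullptr) return -1;
    std::fprintf(f, "%d\n", key);
    std::fclose(f);
    return 0;
}
```

With this signature a call such as Save(i, "log.txt") compiles without relying on the deprecated conversion, because fopen never needed a pointer to non-const char in the first place.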
In C++ it is perfectly fine to initialize a pointer to const char with a string literal.
After initialization we can index through that pointer like an array, as in:
{
const char *s = "abc"; // pointer to the first character of the literal
cout << s[0];          // prints a
cout << s[1];          // prints b
}

Are you allowed to re-assign char* to a string literal?

This is quite normal:
[const] char *str = "some text";
But initialisation is not the same as reassignment and string literals are a bit of a special case. What are the rules if you try to do this:
[const] char *str = "some text";
str = "some other text";
Note before someone says "try it" I'm asking what the language spec says, not what my particular compiler does.
Let's repose your question as being about assignment and reassignment to a const char*. This is because a string literal is a read-only array of characters terminated with \0, and compilers are lax in allowing assignment of a string literal to a char*. C++11 explicitly forbids this.
Reassignment of a const char* to a different literal is permissible: there is no danger of a memory leak here, since the literals have static storage duration and are typically stored in a read-only section of your compiled binary.
About char*
As of C++11, the char* variants of that code are ill-formed. String literals can only be bound to a char const* or, more generally, treated as an array of const char.
Notice that a char[] can be initialized with a string literal as per §8.5.2/1:
An array of narrow character type (3.9.1), char16_t array, char32_t array, or wchar_t array can be initialized by a narrow string literal, char16_t string literal, char32_t string literal, or wide string literal, respectively, or by an appropriately-typed string literal enclosed in braces (2.13.5). Successive characters of the value of the string literal initialize the elements of the array.
[Example:
char msg[] = "Syntax error on line %s\n";
shows a character array whose members are initialized with a string-literal. [...]
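A short sketch of what this array initialization means in practice: the array is a separate, modifiable copy of the literal. The function name check_array_copy is hypothetical.

```cpp
#include <cassert>
#include <cstring>

void check_array_copy() {
    char msg[] = "abc";            // msg is its own array of 4 char, copied from the literal
    static_assert(sizeof(msg) == 4, "three characters plus the terminating '\\0'");

    msg[0] = 'x';                  // fine: msg is a modifiable array, not the literal
    assert(std::strcmp(msg, "xbc") == 0);

    const char* lit = "abc";       // points at a literal, which is never modified here
    assert(std::strcmp(lit, "abc") == 0);
}
```

This is the safe alternative when the characters really need to be changed: write to the array copy, never through a pointer to the literal.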
Previously char* was supported but considered deprecated. And anyway, modifications to the string via that char* were considered undefined behaviour.
As per §4.2/2 (pre-C++11):
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ] For the purpose of ranking in overload resolution (13.3.3.1.1), this conversion is considered an array-to-pointer conversion followed by a qualification conversion (4.4). [Example: "abc" is converted to “pointer to const char” as an array-to-pointer conversion, and then to “pointer to char” as a qualification conversion. ]
About reassigning the pointer
Reassigning a char* or a char const* is perfectly fine. The const there refers to the character, not the pointer. To prevent reassignment you would need char* const and char const* const respectively.
First of all, it would be correct to write
const char *str = "some text";
^^^^^
because string literals in C++ have types of constant character arrays. For example string literal "some text" has type const char [10].
An array name used in expressions is implicitly converted to a pointer to its first element.
For example in this declaration
const char *str = "some text";
the string literal is implicitly converted to a pointer of type const char * whose value is the address of the first character of the string literal.
Pointers may be reassigned. The assignment operator may be used with pointers.
So you may write
const char *str = "some text";
str = "some other text";
Now the pointer str is reassigned and points to the first character of the string literal "some other text".
However if you declare the pointer itself as a constant object as for example
const char * const str = "some text";
             ^^^^^
then you may not reassign it. The compiler will issue an error for the statement
str = "some other text";
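The two declarations can be compared side by side in a small sketch. The function name check_reassignment is hypothetical, and the commented-out line is the one a compiler is expected to reject.

```cpp
#include <cassert>
#include <cstring>

void check_reassignment() {
    const char* str = "some text";          // pointer to const char; the pointer itself is not const
    str = "some other text";                // fine: only the pointer changes
    assert(std::strcmp(str, "some other text") == 0);

    const char* const fixed = "some text";  // const pointer to const char
    // fixed = "some other text";           // error: fixed is a const object
    assert(std::strcmp(fixed, "some text") == 0);
}
```

Reading the declaration from right to left helps: the const to the right of the * applies to the pointer, the const to the left applies to the characters.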

Type of a C++ string literal

Out of curiosity, I'm wondering what the real underlying type of a C++ string literal is.
Depending on what I observe, I get different results.
A typeid test like the following:
std::cout << typeid("test").name() << std::endl;
shows me char const[5].
Trying to assign a string literal to an incompatible type like so (to see the given error):
wchar_t* s = "hello";
I get a value of type "const char *" cannot be used to initialize an entity of type "wchar_t *" from VS12's IntelliSense.
But I don't see how it could be const char * as the following line is accepted by VS12:
char* s = "Hello";
I have read that this was allowed in pre-C++11 standards as it was for retro-compatibility with C, although modification of s would result in Undefined Behavior. I assume that this is simply VS12 having not yet implemented all of the C++11 standard and that this line would normally result in an error.
Reading the C99 standard (6.4.5/5) suggests that it should be an array:
The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence.
So, what is the type underneath a C++ string literal?
Thank you very much for your precious time.
The type of a string literal is indeed const char[SIZE] where SIZE is the length of the string plus the null terminating character.
The fact that you're sometimes seeing const char* is because of the usual array-to-pointer decay.
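The array type and its decay can be checked at compile time. The following is a sketch using only standard type traits; it assumes a C++11 compiler.

```cpp
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array type.
static_assert(std::is_same<decltype("test"), const char (&)[5]>::value,
              "\"test\" is an array of 5 const char");
static_assert(sizeof("test") == 5, "four characters plus the terminating '\\0'");

// Array-to-pointer decay yields pointer to const char.
static_assert(std::is_same<std::decay<decltype("test")>::type, const char*>::value,
              "the array decays to const char*");

// The wide literal has the same shape with wchar_t elements.
static_assert(std::is_same<decltype(L"test"), const wchar_t (&)[5]>::value,
              "L\"test\" is an array of 5 const wchar_t");
```

If any of these assumptions were wrong the translation unit would fail to compile, which makes this a more reliable probe than reading an IDE's error message.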
But I don't see how it could be const char * as the following line is accepted by VS12:
char* s = "Hello";
This was allowed in C++03 (as a deprecated exception to the usual const-correctness rules), but the conversion was removed in C++11. A C++11 compliant compiler should diagnose that code.
The type of a string literal is char const[N] where N is the number of characters including the terminating null character. Although this type does not convert to char*, the C++ standard prior to C++11 included a clause allowing assignments of string literals to char*. This clause was added for compatibility, especially with C code written before C had const.
The relevant clause for the type in the standard is 2.14.5 [lex.string] paragraph 8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).
First off, the type of a C++ string literal is an array of n const char. Secondly, if you want to initialise a wchar_t pointer with a string literal, you have to use a wide literal and a pointer to const:
const wchar_t* s = L"hello";

"hello world" string literal can be assigned to char * type?

char* foo = "fpp"; //compile in vs 2010 with no problem
I thought a string literal is of const char* type.
And const type cannot be assigned to non-const type.
So I expect the code above to fail or am I missing something?
Edit: Sorry guys, I totally forgot that the compiler emits warnings too.
I was looking at the error list all this time.
I forgot to check the warnings.
Edit2: I set my project Warning Level to EnableAllWarnings (/Wall) and there's no warning about this.
So my question is still valid.
C++03 deprecates[Ref 1] use of string literal without the const keyword.
[Ref 1]C++03 Standard: §4.2/2
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ] For the purpose of ranking in overload resolution (13.3.3.1.1), this conversion is considered an array-to-pointer conversion followed by a qualification conversion (4.4). [Example: "abc" is converted to “pointer to const char” as an array-to-pointer conversion, and then to “pointer to char” as a qualification conversion. ]
C++11 simply removes the above quotation which implies that it is illegal code in C++11.
C++ inherited the declaration of a string literal without the const keyword from C. Note that the same is perfectly valid in C.
As I understand it, in C, before const was added, this was the way to assign a string to a pointer.
In C++ this is deprecated behavior, but still allowed to keep backwards compatibility. So don't use it.
In fact, I believe in C++11 it's completely invalid.
Not quite. Before C++11, a string literal is assignable to a char* type, but a string literal should never be modified.
This strange situation is for backwards compatibility with programs before const existed.
gcc -std=c++0x warns about this:
a.cpp:5:14: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
So, this is still allowed, but deprecated, because literal strings are const.
There is no such thing as a const type. The const keyword is a so-called type qualifier. It can be applied to any type; applied to the pointed-to type of a pointer, it means that the value pointed at must not be modified through that pointer.
You could also apply the const qualifier to the pointer itself this way:
const char* const p = "aaa";
This will protect the pointer variable from pointing to another string.
There's a special implicit conversion to support this, since it was a common idiom in legacy code (often written before const existed). The type of your string literal is char const[], and you should only use it as such. A good compiler will warn at the above, since the conversion was deprecated from the moment it was introduced.
Note that this is different from C, where the type of a string literal is char[] (but trying to modify it is still undefined behavior).
You are talking about C strings, which are actually arrays of char. In C++, the class std::string is normally used, and a constant string can be declared as const std::string.
Anyway, compilers reserve a piece of memory in the resulting program to store the string literals that show up in the source code. This part of the memory is considered read-only, so you should point to it with a const char *. Its size is exactly the size of the string plus one extra position for the trailing zero, marking the end of the string.
Compilers need to keep backwards compatibility, so they still accept literals to be pointed by char *. However, this is misleading, since you are not supposed to be able to modify that memory which could be stored in ROM in an embedded system.
In my system, I use clang:
$ clang --version
Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
Target: i386-pc-linux-gnu
Thread model: posix
In the clang C compiler, this code compiles without errors:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    char * str = "Hello, World!";
    printf( "%s", str );
    return EXIT_SUCCESS;
}
However, the very same code (with minor modifications, such as the header's names) throws the following warning when compiled as a C++ program:
kk.cpp:6:15: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
char * str = "Hello, World!";
^
1 warning generated.
Hope this helps.

Why is it possible to assign a const char* to a char*?

I know that for example "hello" is of type const char*. So my questions are:
How can we assign a literal string like "hello" to a non-const char* like this:
char* s = "hello"; // "hello" is type of const char* and s is char*
// and we know that conversion from const char* to
// char* is invalid
Does a string literal like "hello" occupy memory for the whole lifetime of my program, or is it like a temporary variable that gets destroyed when the statement ends?
In fact, "hello" is of type char const[6].
But the gist of the question is still right – why does C++ allow us to assign a read-only memory location to a non-const type?
The only reason for this is backwards compatibility to old C code, which didn’t know const. If C++ had been strict here it would have broken a lot of existing code.
That said, most compilers can be configured to warn about such code as deprecated, or even do so by default. Furthermore, C++11 disallows this altogether but compilers may not enforce it yet.
For Standardese fans:
[Ref 1]C++03 Standard: §4.2/2
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ] For the purpose of ranking in overload resolution (13.3.3.1.1), this conversion is considered an array-to-pointer conversion followed by a qualification conversion (4.4). [Example: "abc" is converted to “pointer to const char” as an array-to-pointer conversion, and then to “pointer to char” as a qualification conversion. ]
C++11 simply removes the above quotation which implies that it is illegal code in C++11.
[Ref 2]C99 standard 6.4.5/5 "String Literals - Semantics":
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
Does a string literal like "hello" occupy memory for the whole lifetime of my program, or is it like a temporary variable that gets destroyed when the statement ends?
It is kept in the program's data, so it is available for the whole lifetime of the program. You can return pointers and references to this data from the current scope.
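Because a literal has static storage duration, returning its address from a function is safe. The sketch below illustrates this; the names greeting and check_literal_lifetime are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: the literal outlives the call, so the returned
// pointer stays valid for the rest of the program.
const char* greeting() {
    return "hello";
}

// By contrast, returning the address of a local array would dangle:
// const char* broken() { char local[] = "hello"; return local; }

void check_literal_lifetime() {
    const char* s = greeting();
    assert(std::strcmp(s, "hello") == 0);  // still valid after greeting() returned
}
```

The commented-out broken() shows the case that does behave like a temporary: the local array is destroyed when the function returns, the literal is not.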
The only reason why a string literal is allowed to convert to char* is compatibility with C, for example with WinAPI system calls. This conversion is implicit, unlike any other removal of const.
Just use a string:
std::string s("hello");
That would be the C++ way. If you really must use char, you'll need to create an array and copy the contents into it.
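As a minimal sketch of why std::string sidesteps the whole problem: the string owns a copy of the characters, so modifying it is well-defined. The function name check_string_copy is hypothetical.

```cpp
#include <cassert>
#include <string>

void check_string_copy() {
    std::string s("hello");        // s owns its own copy of the characters
    s[0] = 'j';                    // modifying the copy is well-defined
    assert(s == "jello");
    assert(s.size() == 5);         // size() does not count a terminator

    const char* view = s.c_str();  // read-only, zero-terminated view for C-style APIs
    assert(view[5] == '\0');
}
```

When a C-style API needs a const char*, c_str() provides one without giving up ownership of the data.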
The answer to your second question is that the variable s is stored in RAM as type pointer-to-char. If it's global or static, it has static storage duration: it lives in the program's data area (not on the heap) and remains there for the life of the running program. If it's a local ("auto") variable, it's typically allocated on the stack and remains there until the current function returns. In either case, it occupies the amount of memory required to hold a pointer.
The string "Hello" is a constant, and it's stored as part of the program itself, along with all the other constants and initializers. If you built your program to run on an appliance, the string might be stored in ROM.
Note that, because the string is constant and s is a pointer, no copying is necessary. The pointer s simply points to wherever the string is stored.
In your example, you are not assigning, but initializing. std::string, for example, does have a std::string(const char *) constructor (actually it's more complicated, but it doesn't matter), and that constructor copies the characters. A char * is different: it is a plain pointer with no constructors, so nothing is copied.
No copy of "Hello" is made on the stack: the literal lives in static storage, and s is simply initialized with the address of its first character.