Constant Pointers vs Pointers to Constant Strings - c++

I know that if I use a global pointer to a constant string:
const char *h = "hello";
the variable h is stored in the writable data segment. But what if I use
constant pointer to a string
char * const h = "hello";
or a constant pointer to a constant string
const char * const h = "hello";
where would h be stored then?

The c++ language doesn't specify distinction between different areas of storage, other than these both variables have static storage duration. On one system, they might be stored in same area, on another in separate areas.
Given that the variable is const in the latter case, a language implementation might choose to use an area of memory that is protected from being overwritten.

First of all, you keep saying pointer to string, which isn't accurate. They're all pointers to a char, and this char is the beginning of the string.
For your question, where h will be stored when it's a constant pointer ?, It will be stored in a readonly part of your RAM. just like any constant variable like const int.

Related

Pointers in CPP, confusion regarding an example program

Working my way through accelerated c++. There's an example where there are multiple things that I do not understand.
double grade = 88;
static const double numbers[] = { 97,94,90,87,84,80,77,74,70,60,0 };
static const char* const letters[] = { "A+","A","A-","B+","B","B-","C+","C","C-","D","F" };
static const size_t ngrades = sizeof(numbers) / sizeof(*numbers);
for (size_t i = 0;i < ngrades;++i) {
if (grade >= numbers[i]) {
cout << letters[i];
break;
}
}
I don't understand what's going on with static const char* const letters[] = (...). First of all, I always thought a char was a single character delimited by '. Single or more characters delimited by " are for me a string.
The way I've understood pointers is that they're a value that represents the address of an object, although this would be initialized as int* p=&x;. They have the advantage of being able to be used like an iterator (kind of). But I really do not get what is going on here, we declare a pointer letters that gets assigned to it an array of values (not addresses), what does that mean? What would be a reason for doing this?
I know what static is in java, is the meaning similar in CPP? The author writes it means that the compiler will initialize the static values only once. But isn't that done with every variable within a certain scope? I've noticed in debug that I seem to skip over the values after having executed it the first time. But that would imply that even after my program is finished running these static values are still saved? That doesn't seem logical to me.
Regarding your first question, a string literal (like e.g. "A+") is an (read-only) array of characters, and as all arrays they can decay to pointers to their first element, i.e. a pointer to char. The variable letters is an array of constant pointers (the pointer in the array can't be changed) to characters that are constant.
For the third questions, what static means is different depending on which scope you declare the variable in. It's a linkage specifier when used in the global scope, and means that the variable (or function) will not be exported from the translation unit. If use for a variable in local scope (i.e. inside a function) then variable will be shared between invocations of the function, i.e. all calls to the function will have the same variable with the the value of it being kept between calls. Declaring a class-member as static means that it's shared between all object instances of the class.
1) Here
static const char* const letters[] = (...)
letters is actually array of const pointers to const characters. Hence the "".
2) Like said, above variable is array of pointers. So each
element in the array holds address to the string literal
defined in the array. So letters[0] holds address of memory
where "A+" is stored.
3) static has various uses in C++. In your case if its declared inside
function, its value is preserved between successive calls to that function.. More details.
Those are not characters but strings, so your understanding is correct. Letters here are not stored as char but in the form of const char*.
we declare a pointer letters that gets assigned to it an array of values (not addresses)
those are not pointer letters but literal strings, "a" is a literal, its type is const char[], which decays to const char* - which means its a poitner.
3.
I know what static is in java, is the meaning similar in CPP
in general yes, but there are differences - like you cant use static inside functions in java while you can in c++, also you can have global static variables in c++.
The author writes it means that the compiler will initialize the static values only once. But isn't that done with every variable within a certain scope?
non static variables will be created on stack and default initialized (if no explicit initialization is done) on each function run. static variables on the other hand will be initialized on the first run only.
But that would imply that even after my program is finished running these static values are still saved?
thats not true, after program is done ,they are freed - your process is dead then
static const char* const letters[]
This is an array of pointers to characters. In this case the initializer list sets each pointer to the first character of each of the strings specified in the initializer list.
const ... const
The pointers and the characters the pointers point to are constants.
static
If declared inside a function, similar to a global, but with local scope.
Links:
http://msdn.microsoft.com/en-us/library/s1sb61xd.aspx
http://en.wikipedia.org/wiki/Static_(keyword)

Static constant character pointer and why it's used in this fashion

static const char* const test_script = "test_script";
When and why would you use a statement such as the one above?
Does it have any benefits?
Why use a char* instead of a constant character?
A "constant Character pointer" (const char*) is a constant already and cannot be changed; Then why use the word static in the front of it? What good does it do?
A const char *p is not a constant pointer. It's a modifiable pointer to a const char, that is, to a constant character. You can make the pointer point to something else, but you cannot change the character(s) to which it points. In other words, p = x is allowed, but *p = y is not.
A char * const is the opposite: a constant pointer to a mutable character. *p = y is allowed, p = x is not.
A const char * const is both: a constant pointer to a constant character.
Regarding static: this gives the declared variable internal linkage (cannot be accessed by name from outside the source file). Since you're asking about both C and C++, note that this is where they differ.
In C++, a variable declared const and not explicitly declared extern has internal linkage by default. Since the pointer in question is const (I'm talking about the second const), the static is superfluous in C++ and doesn't do anything.
That's not the case in C, where const variables cannot be used as constant expressions and do not have internal linkage by default. So in C, the static is necessary to give test_script internal linkage.
The above discussion of static assumes the declaration is at file scope (C) or namespace scope (C++). If it's inside a function, the meaning of static changes. Without static, it would a normal local variable in the function—each invocation would have its own copy. With static, it receives static storage duration and thus persists between function calls—all invocations share that one copy. Since you're asking about both C and C++, I am not going to discuss class scope.
Finally, you asked "why pointer instead of characters." This way, the pointer points to the actual string literal (probably somewhere in the read-only part of the process's memory). One reason to do that would be if you even need to pass test_script somewhere where a const char * const * (a pointer to a constant pointer to constant character(s)) is expected. Also, if the same string literal appears multiple times in source code, it can be shared.
An alternative would be to declare an array:
const char test_script[] = "test_script";
This would copy the string literal into test_script, guaranteeing that it has its own copy of the data. Then you can learn the length at compile-time from sizeof test_script (which includes the terminating NUL). It will also consume slightly less memory (no need for the pointer) if it's the only occurence of that string literal. However, as it will have its own copy of the data, it cannot share the storage of the string literal (if that is used elsewhere in the code as well).

c++ const pointer with operator[]

char* const p = "world";
p[2] = 'l';
The first statement creates a string pointed by a const pointer p, and the second statement tries to modify the string, and it is accepted by the compiler, while in the running time, an access violation exception is poped, and could anyone explain why?
So your question is two-fold:
Why does it give an access violation: character literal strings are stored as literals in the CODE pages of your executable program; most modern operating system do not allow changes to the these pages (including MS-windows) thus the protection fault.
Why does the compiler allow it: the const keyword in this context refers to the pointer and not the thing it points at. Code such as p="Hello"; will cause a compiler error as you have declared p as constant (not *p). If you wanted to declare the thing it points to as constant then your declaration should be const char *p.
In the
char* const p = "World";
P points to const character array, which resides in the .rodata memory area. So, one can not modify the data pointed by the p variable and one cannot change the p to point to some other string as well.
char* const p = "world";
This is illegal in the current C++ standard (C++11). Most compilers still accept it because they use the previous C++ standard (C++03) by default, but even there the code is deprecated, and a good compiler with the right warning level should warn about this.
The reason is that the type of the literal "world" is char const[6]. In other words, a literal is always constant and cannot be changed. When you say …
char* const p = "world";
… then the compiler converts the literal into a pointer. This is done implicitly by an operation called “array decay”: a C array can be implicitly converted into a pointer that points to its beginning.
So "world" is converted into a value of type char const*. Notice the const – we still are not allowed to change the literal, even when accessed through the pointer.
Alas, C++03 also allows that literals be assigned to a non-const pointer to provide backwards compatibility with C.
Since this is an interview question, the correct answer is thus: the code is illegal and the compiler shouldn’t allow it. Here’s the corrected code:
char const* const p = "world";
//p[2] = 'l'; // Not allowed!
We’ve used two consts here: the first is required for the literal. The second makes the pointer itself (rather than the pointed-to value) const.
The thing is if you define a string literal like this:
char * p = "some text"
the compiler will prepare memory for pointer only, and the text location
will be by default set to .text section
in the other hand if you define it as:
char p[] = "some text"
the compiler will know that you want the memory for entire character array.
In the first case you can't read the value as the MMU is set to readonly access for the .text section of memory, in the 2nd case you can freely access and modify memory.
Another think (to prevent the runtime error) would be correct describing of the memory the pointer points to.
For the constant pointer it would be:
const char * const p = "blaaaah"
and for the normal one:
const char * p = "blaah"
char* const p = "world";
Here p is a pointer with constant memory address. So this will compile fine as there is no syntax error as per compiler and it throws exception in runtime bcoz as its a constant pointer you can change its value.
C++ reference constant pointers
Const correctness

Understanding C-strings & string literals in C++

I have a few questions I would like to ask about string literals and C-strings.
So if I have something like this:
char cstr[] = "c-string";
As I understand it, the string literal is created in memory with a terminating null byte, say for example starting at address 0xA0 and ending at 0xA9, and from there the address is returned and/or casted to type char [ ] which then points to the address.
It is then legal to perform this:
for (int i = 0; i < (sizeof(array)/sizeof(char)); ++i)
cstr[i] = 97+i;
So in this sense, are string literals able to be modified as long as they are casted to the type char [ ] ?
But with regular pointers, I've come to understand that when they are pointed to a string literal in memory, they cannot modify the contents because most compilers mark that allocated memory as "Read-Only" in some lower bound address space for constants.
char * p = "const cstring";
*p = 'A'; // illegal memory write
I guess what I'm trying to understand is why aren't char * types allowed to point to string literals like arrays do and modify their constants? Why do the string literals not get casted into char *'s like they do to char [ ]'s? If I have the wrong idea here or am completely off, feel free to correct me.
The bit that you're missing is a little compiler magic where this:
char cstr[] = "c-string";
Actually executes like this:
char *cstr = alloca(strlen("c-string")+1);
memcpy(cstr,"c-string",strlen("c-string")+1);
You don't see that bit, but it's more or less what the code compiles to.
char cstr[] = "something"; is declaring an automatic array initialized to the bytes 's', 'o', 'm', ...
char * cstr = "something";, on the other hand, is declaring a character pointer initialized to the address of the literal "something".
In the first case you are creating an actual array of characters, whose size is determined by the size of the literal you are initializing it with (8+1 bytes). The cstr variable is allocated memory on the stack, and the contents of the string literal (which in the code is located somewhere else, possibly in a read-only part of the memory) is copied into this variable.
In the second case, the local variable p is allocated memory on the stack as well, but its contents will be the address of the string literal you are initializing it with.
Thus, since the string literal may be located in a read-only memory, it is in general not safe to try to change it via the p pointer (you may get along with, or you may not). On the other hand, you can do whatever with the cstr array, because that is your local copy that just happens to have been initialized from the literal.
(Just one note: the cstr variable is of a type array of char and in most of contexts this translates to pointer to the first element of that array. Exception to this may be e.g. the sizeof operator: this one computes the size of the whole array, not just a pointer to the first element.)
char cstr[] = "c-string";
This copies "c-string" into a char array on the stack. It is legal to write to this memory.
char * p = "const cstring";
*p = 'A'; // illegal memory write
Literal strings like "c-string" and "const cstring" live in the data segment of your binary. This area is read-only. Above p points to memory in this area and it is illegal to write to that location. Since C++11 this is enforced more strongly than before, in that you must make it const char* p instead.
Related question here.

When to use const char * and when to use const char []

I know they are different, I know how they are different and I read all questions I could find regarding char* vs char[]
But all those answers never tell when they should be used.
So my question is:
When do you use
const char *text = "text";
and when do you use
const char text[] = "text";
Is there any guideline or rule?
As an example, which one is better:
void withPointer()
{
const char *sz = "hello";
std::cout << sz << std::endl;
}
void withArray()
{
const char sz[] = "hello";
std::cout << sz << std::endl;
}
(I know std::string is also an option but I specifically want to know about char pointer/array)
Both are distinctly different, For a start:
The First creates a pointer.
The second creates an array.
Read on for more detailed explanation:
The Array version:
char text[] = "text";
Creates an array that is large enough to hold the string literal "text", including its NULL terminator. The array text is initialized with the string literal "text".The array can be modified at a later time. Also, the array's size is known even at compile time, so sizeof operator can be used to determine its size.
The pointer version:
char *text = "text";
Creates a pointer to point to a string literal "text". This is faster than the array version, but string pointed by the pointer should not be changed, because it is located in an read only implementation defined memory. Modifying such an string literal results in Undefined Behavior.
In fact C++03 deprecates use of string literal without the const keyword. So the declaration should be:
const char*text = "text";
Also,you need to use the strlen() function, and not sizeof to find size of the string since the sizeof operator will just give you the size of the pointer variable.
Which version is better?
Depends on the Usage.
If you do not need to make any changes to the string, use the pointer version.
If you intend to change the data, use the array version.
EDIT: It was just brought to my notice(in comments) that the OP seeks difference between:
const char text[] and const char* text
Well the above differing points still apply except the one regarding modifying the string literal. With the const qualifier the array test is now an array containing elements of the type const char which implies they cannot be modified.
Given that, I would choose the array version over the pointer version because the pointer can be(by mistake)easily reseated to another pointer and the string could be modified through that another pointer resulting in an UB.
Probably the biggest difference is that you cannot use the sizeof operator with the pointer to get the size of the buffer begin pointed to, where-as with the const char[] version you can use sizeof on the array variable to get the memory footprint size of the array in bytes. So it really depends on what you're wanting to-do with the pointer or buffer, and how you want to use it.
For instance, doing:
void withPointer()
{
const char *sz = "hello";
std::cout << sizeof(sz) << std::endl;
}
void withArray()
{
const char sz[] = "hello";
std::cout << sizeof(sz) << std::endl;
}
will give you very different answers.
In general to answer these types of questions, use the one that's most explicit.
In this case, const char[] wins because it contains more detailed information about the data within -- namely, the size of the buffer.
Just a note:
I'd make it static const char sz[] = "hello";. Declaring as such has the nice advantage of making changes to that constant string crash the program by writing to read-only memory. Without static, casting away constness and then changing the content may go unnoticed.
Also, the static lets the array simply lie in the constant data section instead of being created on the stack and copied from the constant data section each time the function is called.
If you use an array, then the data is initialized at runtime. If you use the pointer, the run-time overhead is (probably) less because only the pointer needs to be initialized. (If the data is smaller than the size of a pointer, then the run-time initialization of the data is less than the initialization of the pointer.) So, if you have enough data that it matters and you care about the run-time cost of the initialization, you should use a pointer. You should almost never care about those details.
I was helped a lot by Ulrich Drepper's blog-entries a couple of years ago:
so close but no cigar and more array fun
The gist of the blog is that const char[] should be preferred but only as global or static variable.
Using a pointer const char* has as disadvantages:
An additional variable
pointer is writable
an extra indirection
accessing the string through the pointer require 2 memory-loads
Just to mention one minor point that the expression:
const char chararr[4] = {'t', 'e', 'x', 't'};
Is another way to initialize the array with exactly 4 char.