#include<stdio.h>
int main(){
char *ptr="Helio";
ptr++;
printf("%s\n",ptr);
//*ptr++;
printf("%c\n",++*ptr);/*Segmentation fault on GCC*/
return 0;
}
Q1) This works fine in Turbo C++, but on GCC it gives a segmentation fault. I can't work out the exact reason.
Maybe operator precedence is one of the reasons.
Q2) Does each compiler have different operator precedence?
As far as I can see, ++ has higher precedence than the dereference operator. Maybe GCC and Turbo C++ treat them differently.
No, operator precedence is defined by the C standard, and all compilers follow the same rules.
Turbo C++ and GCC give different results here because you modified a string literal, which is undefined behavior.
Change it to:
char arr[] = "Helio";
char *ptr = arr;
and you can now modify the content of the string. Note that arr itself is the array name and cannot be modified, so I added a new pointer variable ptr and initialized it to point to the first element of the array.
In your last printf() line, the expression ++*ptr is equivalent to ++ptr[0], which is, in turn, equivalent to ptr[0] = ptr[0]+1. Since ptr was already advanced by the earlier ptr++, ptr[0]=='e', so you are trying to change that character to 'f'.
That's the key problem there. Since ptr points into the constant "Helio", the attempt to change one of its characters is giving trouble, because it is Undefined Behaviour.
char* p = "some literal";
This is only legal because of a smelly argument that C people fought over during standard committee negotiations. You should consider it an oddity that exists for backward compatibility.
This is the message you get with GCC:
warning: deprecated conversion from string constant to 'char*'
Next time, please write the following:
char const* p = "some literal";
Make it a reflex in your coding habits. Then you would not have been able to compile your faulty line, which is:
++*ptr
Here you are taking a character of the constant literal (the 'e', since ptr was already incremented once) and trying to increment it to the next character, 'f'. But this memory happens to be in a write-protected page, because it holds a constant. The standard leaves this undefined and you should consider it illegal. Your segfault comes from here.
I suggest you run your program in valgrind next time to get more elaborate error messages.
In the answer that Yu Hao wrote for you, what happens is that all the characters get copied one by one, from the constant string pool where literals are stored, to a stack-allocated char array, by code that the compiler generates at the declaration site. That is why you can modify its content.
Related
I was trying strcpy like this:
int main()
{
char *c="hello";
const char *d="mello";
strcpy(c,d);
cout<<c<<endl;
return 0;
}
Compiling this gives a warning and running the code produces a Segmentation fault.
The warning is:
warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
char *c="hello";
The declaration of strcpy is: char * strcpy ( char * destination, const char * source );
So where am I wrong (regarding the warning)? IMO I have used a const char and a char, same as the function's declaration.
Does c* or d* not allocate memory to hold "hello" and "mello", due to
which it is throwing segmentation fault? How does the
initialization/definition of a variable like c* work?
Both variables are pointing to constant literal text.
Modifying constant literal text is undefined behavior.
Place the text into an array if you want to modify it:
char e[] = "fred";
e[0] = 'd';
Compiler warnings (and errors) always refer to a particular line of code (and a portion of that line as well). It's a good idea to pay close attention to where the problems are.
In your case, the warning is not about calling strcpy, but about initialising c. You're making a char *, a pointer to modifiable char, point to "hello", a read-only string literal. That's what the compiler is warning you about.
This used to be possible in C++, but with the caveat that you must never actually modify the memory pointed to (since string literals are immutable). Since C++11, converting a string literal to a char * is explicitly forbidden and will generate an error instead of a warning.
The reason for the segfault should be clear now: you're trying to overwrite a piece of read-only memory with the strcpy call.
The correct solution is to use writable memory for storing the destination string. You could do it like this:
char c[] = "hello";
This creates a new char array, which is writable normally (just like any other local variable), and initialises the array with the contents of the string literal (effectively a copy thereof).
Of course, since this is C++, you should be using std::string to store strings and not bother with char*, strcpy, and other C-isms at all.
The code you are trying seems to be written for Turbo C++. As drescherjm pointed out in the comments, char *c="hello"; is no longer legal in C++.
If you try your code in Turbo C++, it will work just as you expected, but that is not the case with modern C++.
So where am I wrong (regarding the warning)? IMO I have used a const char and a char, same as the function's declaration.
The warning is due to the reason mentioned above. Also, the warning isn't about the use of strcpy, it's about the declaration.
Does c* or d* not allocate memory to hold "hello" and "mello", due to which it is throwing segmentation fault? How does the initialization/definition of a variable like c* work?
Well, the compiler just places the string literals in static storage (typically a read-only section) and lets the c and d pointers point to them. No separate writable buffer is allocated. It's also risky because you lose access to the data if you make the pointer point to something else by accident.
struct st{
int a;
char *ptr;
}obj;
main()
{
obj.a=10;
obj.ptr="Hello World"; // (1) memory allocation?
printf("%d,%s",obj.a,obj.ptr);
}
ptr is declared in the struct. When "Hello World" is assigned, no memory has been allocated, and yet this program works fine and prints the expected output. Shouldn't it fail/crash when the assignment is done at marker (1)?
"Hello World" is a string literal residing in a read-only memory section (.rodata) of your program. You point to this section then print the contents. The program behavior is 100% well-defined and should not crash.
It is however good practice to always declare pointers to string literals as const char*, because you are not allowed to modify string literals.
It is perfectly valid.
At compile time (minus compiler optimizations), string literals are placed in the text/rodata segment of the program. Not sure if you are familiar with the layout of an executable in memory (also known as the runtime environment), but you have the Text, Data, BSS, Heap and Stack.
A statement like
obj.ptr="Hello World";
will place Hello World in a read-only part of memory and make obj.ptr a pointer to it, making any write to this memory illegal.
The statement involves two objects: an array of char holding "Hello World", which has no name and has static storage duration (meaning that it lives for the entire life of the program); and a variable of type pointer-to-char, called obj.ptr, which is set to the location of the first character in that unnamed, read-only array.
At runtime, the char pointer itself is ordinary writable storage (here obj is a global, so it has static storage rather than living on the stack) and is set to point to the area in memory where the string Hello World is.
First, the answer differs somewhat depending on the language (C
or C++), and in the case of C++, the version of the standard.
In both cases, however, "Hello World" is a string literal: in
C, it has type char [12], and in C++, type char const [12],
and the array has static lifetime, so exists for the lifetime of
the program.
In C, when you assign it to a char*, you have the standard
array to pointer conversion; in C++ pre-C++11, you have
a deprecated char const[] to char* conversion, which is only
valid if the char const[] is a string literal—the
compiler should warn; in C++11, the conversion is illegal, and
your program shouldn't compile (but I'm willing to bet that it
will for many years hence).
my compiler (g++ 4.7.2) throws a warning:
warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
If you don't get this warning, you can try to pass -Wall to your compiler (imo always a good idea)
When you assign a string literal to a char pointer, no memory is allocated for you!
The compiler checks whether the same string already exists in the program's literal storage; if it does, the pointer may simply point to the start of it, and if not, the string is added. Either way, no new writable memory is allocated, and the data the pointer refers to is constant (read-only): you can't safely change it. Sometimes that is fine, if you don't want to change the string, and sometimes it is not; it depends on what you want to do :)
As we know, the value of a constant variable is immutable. But we can use a pointer to the constant variable to modify it.
#include <iostream>
int main()
{
const int integer = 2;
void* tmp = (void*)&integer;
int* pointer = (int*)tmp;
(*pointer)++;
std::cout << *pointer << std::endl;
std::cout << integer << std::endl;
return 0;
}
the output of that code is:
3
2
So I am confused: what on earth did I modify? What does integer stand for?
Modifying const objects is undefined. The compiler is free to store const values in read-only portions of memory, in which case an attempt to change them raises an error (free to, not obliged to).
Undefined behavior is poor, undesirable and to be avoided. In summary, don't do that.
PS integer and pointer are variable names in your code, though not especially good names.
You have used unsafe, C-style casts to throw away the constness. C++ is not an inherently safe language, so you can do crazy stuff like that. It does not mean you should. In fact, you should not use C-style casts in C++ at all--instead use reinterpret_cast, const_cast, static_cast, and dynamic_cast. If you do that, you will find that the way to modify const values is to use const_cast, which is exactly how the language is designed.
This is an undefined behavior. The output you get is compiler dependent.
One possible explanation for this behavior is as follows.
When you declare integer as a constant and use it in an expression, the compiler may optimize and substitute it with the constant literal you assigned to it.
But the actual content of the memory location pointed to by &integer is changed. The compiler merely ignores this fact because you have defined it as a constant.
See Const Correctness in C++. Give some attention to the assembler output just above the 'The Const_cast Operator' section of this page.
You're wading into Undefined Behavior territory.
If you write
void* tmp = &integer;
the compiler would give you an error. If you wrote good C++ code and wrote
void* tmp = static_cast<void*>(&integer);
the compiler would still give you an error. But you went ahead and used a C-style unprotected cast which left the compiler no option but to do what you told it.
There are several ways the compiler could deal with this, not least of which:
It might take the address of a location in the code segment where the value was, e.g., being loaded into a register.
It might take the address of a location of a similar value.
It might create a temporary by pushing the value onto the stack, taking the address of the location, and then popping the stack.
You would have to look at the assembly produced to see which variant your compiler prefers, but at the end of the day: don't do it. It is undefined, and that means the next time you upgrade your compiler, build on a different system, or change optimizer settings, the behavior may well change.
Consider
const char h = 'h';
const char* hello = "hello";
const unsigned char num = 2 * 50 + 2 * 2; // 104 == 'h'
arg -= num; // sub 104, eax
char* ptr = (char*)(&h);
The compiler could choose to store an 'h' specially for the purpose of 'ptr', or it could choose to make 'ptr' point to the 'h' in hello. Or it could choose to take the location of the value 104 in the instruction 'sub 104, eax'.
The const keyword is mainly a compile-time check. The compiler tracks whether a variable is const, and if you modify a const variable directly, it reports an error. But for an ordinary local variable there is typically no mechanism in the variable's storage that protects it, so the operating system cannot know whether such a variable is const or not.
The nearest question on this website to the one I have had a few answers that didn't satisfy me.
Basically, I have 2 related questions :
Q. If I do something like :
char a[]= "abcde"; //created a string with 'a' as the array name/pointer to access it.
a[0]='z'; //works and changes the first character to z.
But
char *a="abcde";
a[0]='z'; //run-time error.
What is the difference? There is no "const" declaration, so I should be free to change contents, right?
Q. If I do something like :
int i[3];
i[0]=10; i[1]=20; i[2]=30;
cout<<*++i; //'i' is a pointer to i[0], so I'm incrementing it and want to print 20.
This gives me a compile-time error, and I don't understand why.
On the other hand, this works :
int *i=new int[3];
i[0]=10; i[1]=20; i[2]=30;
cout<<*++i; //Prints 20.
Thanks for the help.
Q
char *a="abcde";
a[0]='z'; //run-time error.
Ans - Here a is pointing to a string literal stored in a read-only location. You cannot modify a string literal.
Q
int i[3];
i[0]=10; i[1]=20; i[2]=30;
cout<<*++i;
Ans - Arrays and pointers are not the same thing. Here i is not a pointer.
The increment operator needs a modifiable lvalue as its operand, and an array is not one.
You can do:
int *p = &i[0];
std::cout<<*++p;
In the last case, operator new returns a pointer to the allocated memory, so incrementing it is possible.
Question 1
The trouble is that it is still const.
A string literal actually has the type char const[N] (an array of const char).
But when they designed C++03 they decided (unfortunately) that converting a string literal to char* is not an error. Thus the code:
char *a="abcde";
Actually compiles without error (even though the string is const). BUT the string literal is still const even though it is being pointed at via a non-const pointer, which makes it very dangerous.
The good news is that most compilers will generate a warning:
warning: deprecated conversion from string constant to ‘char*’
Question 2
cout<<*++i; //'i' is a pointer to i[0], so I'm incrementing it and want to print 20.
At this point i is not a pointer. Its type is still an array.
Incrementing an array is not a valid operation.
The thing you are thinking about; is that arrays decay into pointers at the drop of a hat (like when you pass them to functions). Unfortunately that is not happening in this situation and thus you are trying to increment an array (which does not make sense).
And where are literals in memory exactly? (see examples below)
I cannot modify a literal, so it would supposedly be a const char*, although the compiler let me use a char* for it, I have no warnings even with most of the compiler flags.
Whereas an implicit cast of a const char* type to a char* type gives me a warning, see below (tested on GCC, but it behaves similarly on VC++2010).
Also, if I modify the value of a const char (with the trick below, for which GCC really ought to give me a warning), it gives no error and I can even modify and display it on GCC (even though I guess it is still undefined behavior, I wonder why it did not do the same with the literal). That is why I am asking where those literals are stored, and where ordinary const objects are supposed to be stored.
const char* a = "test";
char* b = a; /* warning: initialization discards qualifiers
from pointer target type (on gcc), error on VC++2k10 */
char *c = "test"; // no compile errors
c[0] = 'p'; /* bus error when execution (we are not supposed to
modify const anyway, so why can I and with no errors? And where is the
literal stored for I have a "bus error"?
I have 'access violation writing' on VC++2010 */
const char d = 'a';
*(char*)&d = 'b'; // no warnings (why not?)
printf("%c", d); /* displays 'b' (why doesn't it do the same
behavior as modifying a literal? It displays 'a' on VC++2010 */
The C standard does not forbid the modification of string literals. It just says that the behaviour is undefined if the attempt is made. According to the C99 rationale, there were people in the committee who wanted string literals to be modifiable, so the standard does not explicitly forbid it.
Note that the situation is different in C++. In C++, string literals are arrays of const char. However, C++ (before C++11) allowed a string literal to be converted to char *. That conversion was deprecated and has since been removed.
I'm not certain what the C/C++ standards say about strings. But I can tell exactly what actually happens with string literals in MSVC. And, I believe, other compilers behave similarly.
String literals reside in a const data section. Their memory is mapped into the process address space. However, the memory pages they're stored in are read-only (unless explicitly modified during the run).
But there's something more you should know. Not all the C/C++ expressions containing quotes have the same meaning. Let's clarify everything.
const char* a = "test";
The above statement makes the compiler create a string literal "test". The linker makes sure it'll be in the executable file.
In the function body the compiler generates code that declares a variable a on the stack, which gets initialized with the address of the string literal "test".
char* b = a;
Here you declare another variable b on the stack which gets the value of a. Since a pointed to a read-only address, so does b. The fact that b has no const semantics doesn't mean you may modify what it points to.
char *c = "test"; // no compile errors
c[0] = 'p';
The above generates an access violation. Again, the lack of const doesn't mean anything at the machine level.
const char d = 'a';
*(char*)&d = 'b';
First of all - the above is not related to string literals. 'a' is not a string. It's a character. It's just a number. It's like writing the following:
const int d = 55;
*(int*)&d = 56;
The above code makes a fool out of the compiler. You say the variable is const, yet you manage to modify it. But this does not lead to a processor exception, since d resides in read/write memory nevertheless.
I'd like to add one more case:
char b[] = "test";
b[2] = 'o';
The above declares an array on the stack, and initializes it with the string "test". It resides in the read/write memory, and can be modified. There's no problem here.
Mostly historical reasons. But keep in mind that they are somewhat justified: String literals don't have type char *, but char [N] where N denotes the size of the buffer (otherwise, sizeof wouldn't work as expected on string literals) and can be used to initialize non-const arrays. You can only assign them to const pointers because of the implicit conversions of arrays to pointers and non-const to const.
It would be more consistent if string literals exhibited the same behaviour as compound literals, but as these are a C99 construct and backwards-compatibility had to be maintained, this wasn't an option, so string literals stay an exceptional case.
And where are literals in memory exactly? (see examples below)
Initialized data segment. On Linux it is either .data or .rodata.
I cannot modify a literal, so it would supposedly be a const char*, although the compiler let me use a char* for it, I have no warnings even with most of the compiler flags.
Historical reasons, as others have already explained. Most compilers allow you to choose, with a command line option, whether string literals should be read-only or modifiable.
The reason it is generally desired to have string literals read-only is that the segment with read-only data in memory can be (and normally is) shared between all the processes started from the executable. That obviously frees some RAM from being wasted to keep redundant copies of the same information.
I have no warnings even with most of the compiler flags
Really? When I compile the following code snippet:
int main()
{
char* p = "some literal";
}
on g++ 4.5.0 even without any flags, I get the following warning:
warning: deprecated conversion from string constant to 'char*'
You can write to c because you didn't make it const. Defining c as const would be correct practice since the right hand side has type const char*.
It generates an error at runtime because the "test" value is probably allocated to the code segment which is read-only. See here and here.