struct st{
int a;
char *ptr;
}obj;
main()
{
obj.a=10;
obj.ptr="Hello World"; // (1) memory allocation?
printf("%d,%s",obj.a,obj.ptr);
}
ptr is declared in struct. When the assignment of Hello world occurs, memory is not allotted and yet this program works fine and gives output properly. Shouldn't it fail/crash when assignment done at marker (1)?
"Hello World" is a string literal residing in a read-only memory section (.rodata) of your program. You point to this section then print the contents. The program behavior is 100% well-defined and should not crash.
It is however good practice to always declare pointers to string literals as const char*, because you are not allowed to modify string literals.
It is perfectly valid.
At compile time (minus compiler optimizations), they are placed in the text/rodata segment of the code. Not sure if you are familiar with the layout of an executable in memory (also known as the runtime environment), but you have the Text, Data, BSS, Heap and Stack.
like
obj.ptr="Hello World";
will place Hello World in the read-only parts of the memory and making obj.ptr a pointer to that, making any writing operation on this memory illegal.
It has no name and has static storage duration (meaning that it lives for the entire life of the program); and a variable of type pointer-to-char, called obj.ptr, which is initialised with the location of the first character in that unnamed, read-only array.
At runtime, the char pointer is allocated on the stack and is first set to point to the area in memory where the string Hello World is.
First, the answer differs somewhat depending on the language (C
or C++), and in the case of C++, the version of the standard.
In both cases, however, "Hello World" is a string literal: in
C, it has typechar [12], and in C++, typechar const [12]`,
and the array has static lifetime, so exists for the lifetime of
the program.
In C, when you assign it to a char*, you have the standard
array to pointer conversion; in C++ pre-C++11, you have
a deprecated char const[] to char* conversion, which is only
valid if the char const[] is a string literal—the
compiler should warn; in C++11, the conversion is illegal, and
your program shouldn't compile (but I'm willing to bet that it
will for many years hence).
my compiler (g++ 4.7.2) throws a warning:
warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
If you don't get this warning, you can try to pass -Wall to your compiler (imo always a good idea)
When you're assigning a string literal to a char pointer you're not assigned memory !
All it does is searching a part of the memory for the string, if it is found - it points to the start of it - if it is not found it creates it. Either way - no memory is assigned and the pointer turns into a constant (read-only) - just try and see you can't change it as it is a constant in memory - Sometimes it's good if you don't want to change that string and sometimes it's not - depends on what you want to do :)
Related
I was trying strcpy like this:
int main()
{
char *c="hello";
const char *d="mello";
strcpy(c,d);
cout<<c<<endl;
return 0;
}
Compiling this gives a warning and running the code produces a Segmentation fault.
The warning is:
warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
char *c="hello";
The declaration of strcpy is: char * strcpy ( char * destination, const char * source );
So where am I wrong (regarding the warning)? IMO I have used a const char and a char, same as the function's declaration.
Does c* or d* not allocate memory to hold "hello" and "mello", due to
which it is throwing segmentation fault? How does the
initialization/definition of a variable like c* work?
Both variables are pointing to constant literal text.
Modifying constant literal text is undefined behavior.
Place the text into an array if you want to modify it:
char e[] = "fred";
e[0] = 'd';
Compiler warnings (and errors) always refer to a particular line of code (and a portion of that line as well). It's a good idea to pay close attention to where the problems are.
In your case, the warning is not about calling strcpy, but about initialising c. You're making a char *, a pointer to modifiable char, point to "hello", a read-only string literal. That's what the compiler is warning you about.
This used to be possible in C++, but with the caveat that you must never actually modify the memory pointed to (since string literals are immutable). Since C++11, converting a string literal to a char * is expicitly forbidden and will generate an error instead of a warning.
The reason for the segfault should be clear now: you're trying to overwrite a piece of read-only memory with the strcpy call.
The correct solution is to use writable memory for storing the destination string. You could do it like this:
char c[] = "hello";
This creates a new char array, which is writable normally (just like any other local variable), and initialises the array with the contents of the string literal (effectively a copy thereof).
Of course, since this is C++, you should be using std::string to store strings and not bother with char*, strcpy, and other C-isms at all.
The code you are trying is for TurboC++ it seems. As drescherjm pointed out in the comments, the char *c="hello"; is no longer legal in c++.
If you try out your code in Turbo C++, it'll work just as you expected, although this isn't the same case with the modern c++.
So where am I wrong (regarding the warning)? IMO I have used a const char and a char, same as the function's declaration.
The warning is due to the reason mentioned above, plus the warning isn't on the use of strcpy, it's on the declaration.
Does c* or d* not allocate memory to hold "hello" and "mello", due to which it is throwing segmentation fault? How does the initialization/definition of a variable like c* work?
Well, the compiler would just put the strings in to a random place in the memory and let the c/d pointer point to it. It's risky as you can lose the data if you make the pointer point to something else by accident.
What happen when constant string assigned to constant character pointer(or character pointer)? ex:
const char* p="String";
how and where the compiler take this array .. heap memory ?
and what different from it and :
char* p="String";
thanks.
What happen when constant string assigned to constant character pointer(or character pointer)?
Nothing happens to the const string itself: a pointer to it is assigned to p, that's all.
how and where the compiler take this array .. heap memory?
It does not take it anywhere. String's data remains where it was, which is a compiler-specific thing.
and what different from it and : char* p="String";
The compiler is going to reject a program with the assignment of a literal to non-const, or warn you of a deprecated conversion, depending on the C++ version and/or compiler settings.
If you try to modify p[...]'s content using the const declaration, the compiler is going to stop you. If you try doing the same without const, the program may compile, bit it would cause undefined behavior at runtime.
The string literal "String" is a static array of const char somewhere in your program, probably placed into a read-only part of the address space when the executable is set up by your OS.
When you assign const char *p = "String", then p is initialized with a pointer to that array of const char. So *p is 'S' and p[1] is 't', etc.
When you assign char *p = "String", then your compiler should reject that (perhaps you have insufficient diagnostic level set?). If you tell the compiler to accept it regardless, then you have a pointer to (modifiable) char pointing at the string literal. If you subsequently attempt to write through this pointer, you'll get no compiler error, and instead you are likely to see one of two problems runtime:
(If the compiler/linker has placed the string literal into read-only memory) a signal is raised indicating memory access violation (SIGSEGV on Unix-like systems).
(If the string literal is in writeable memory) other uses of the same string literal get modified, because the compiler is permitted to point them all at the same storage.
I have few doubts about string literals in c++.
char *strPtr ="Hello" ;
char strArray[] ="Hello";
Now strPtr and strArray are considered to be string literals.
As per my understanding string literals are stored in read only memory so we cannot modify their values.
We cannot do
strPtr[2] ='a';
and strArray[2]='a';
Both the above statements should be illegal.
compiler should throw errors in both cases.
Compiler keeps string literals in read only memory , so if we try to modify them compiler throws errors.
Also const data is also considered as readonly.
Is it that both string literals and const data are treated same way ?
Can I remove constantness using const_cast from string literal can change its value?
Where exactly do string literals are stored ? (in data section of program)
Now strPtr and strArray are considered to be string literals.
No, they aren't. String literals are the things you see in your code. For example, the "Hello". strPtr is a pointer to the literal (which is now compiled in the executable). Note that it should be const char *; you cannot legally remove the const per the C standard and expect defined behavior when using it. strArray is an array containing a copy of the literal (compiled in the execuable).
Both the above statements should be illegal. compiler should throw errors in both cases.
No, it shouldn't. The two statements are completely legal. Due to circumstance, the first one is undefined. It would be an error if they were pointers to const chars, though.
As far as I know, string literals may be defined the same way as other literals and constants. However, there are differences:
// These copy from ROM to RAM at run-time:
char myString[] = "hello";
const int myInt = 42;
float myFloats[] = { 3.1, 4.1, 5.9 };
// These copy a pointer to some data in ROM at run-time:
const char *myString2 = "hello";
const float *myFloats2 = { 3.1, 4.1, 5.9 };
char *myString3 = "hello"; // Legal, but...
myString3[0] = 'j'; // Undefined behavior! (Most likely segfaults.)
My use of ROM and RAM here are general. If the platform is only RAM (e.g. most Nintendo DS programs) then const data may be in RAM. Writes are still undefined, though. The location of const data shouldn't matter for a normal C++ programmer.
char *strPtr ="Hello" ;
Defines strPtr a pointer to char pointing to a string literal "Hello" -- the effective type of this pointer is const char *. No modification allowed through strPtr to the pointee (invokes UB if you try to do so). This is a backward compatibility feature for older C code. This convention is deprecated in C++0x. See Annex C:
Change: String literals made const
The type of a string literal is changed from “array of char” to “array of const char.” [...]
Rationale: This avoids calling an inappropriate overloaded function, which might expect to be able to modify its argument.
Effect on original feature: Change to semantics of well-defined feature. Difficulty of converting: Simple syntactic transformation, because string literals can be converted to char*; (4.2). The most common cases are handled by a new but deprecated standard conversion:
char* p = "abc"; // valid in C, deprecated in C++
char* q = expr ? "abc" : "de"; // valid in C, invalid in C++
How widely used: Programs that have a legitimate reason to treat string literals as pointers to potentially modifiable memory are probably rare.
char strArray[] ="Hello";
The declared type of strPtr is -- it is an array of characters of unspecified size containing the string Hello including the null terminator i.e. 6 characters. However, the initialization makes it a complete type and it's type is array of 6 characters. Modification via strPtr is okay.
Where exactly do string literals are stored ?
Implementation defined.
The older C and C++ compilers were purely based on low level coding where higher standards of data protection were not available, and they can not even be enforced, typically in C and C++ you can write anything you want..
You can even write a code to access and modify your const pointers as well, if you know how to play with the addresses.
Although C++ does enforce some compile level protection, but there is no protection on runtime. You can certainly access your own stack, and use its values to manipulate any data that came in const pointer as well.
That is the reason C# was invented where little higher level standards are enforced because whatever you access is reference, it is a fixed structure governing all rules of data protection and it has hidden pointer which can not be accessed and nor modified.
The major difference is, C++ can only give you compile time protection, but C# will give you protection even at runtime.
My understanding is that in C and C++, creating a character array by calling:
char *s = "hello";
actually creates two objects: a read-only character array that is created in static space, meaning that it lives for the entire duration of the program, and a pointer to that memory. The pointer is a local variable to its scope then dies.
My question is what happens to the array when the pointer dies? If I execute the code above inside a function, does this mean I have a memory leak after I exit the function?
it lives for the entire duration of the program
Exactly, formally it has static storage duration.
what happens to the array when the pointer dies?
Nothing.
If I execute the code above inside a function, does this mean I have a memory leak after I exit the function?
No, because of (1). (The array is only "freed" when the program exits.)
No, there is no leak.
The literal string is stored in the program's data section, which is typically loaded into a read-only memory page. All equivalent string literals will typically point to the same memory location -- it's a singleton, of sorts.
char const *a = "hello";
char const *b = "hello";
printf("%p %p\n", a, b);
This should display identical values for the two pointers, and successive calls to the same function should print the same values too.
(Note that you should declare such variables as char const * -- pointer to constant character -- since the data is shared. Modifying a string literal via a pointer is undefined behavior. At best you will crash your program if the memory page is read-only, and at worst you will change the value of every occurrence of that string literal in the entire program.)
const char* s = "Hello"; is part of the code (program) - hence a constant never altered (unless you have some nasty mechanism altering code at runtime)
My question is what happens to the array when the pointer dies? If I
execute the code above inside a function, does this mean I have a
memory leak after I exit the function?
No there will be no memory leak and nothing happens to the array when the pointer dies.
A memory leak could be possible only with dynamic allocation, via malloc(). When you're malloc() something, you have to free() it later. If you don't, there will be a memory leak.
In your case, it's a "static allocation": the allocation and free of this memory space will be freed automatically and you don't have to handle that.
does this mean I have a memory leak after I exit the function?
No, there is no memory leak, string literals have static duration and will be freed when the program is done. Quote from the C++ draft standard section 2.14.5 String literals subsection 8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration
Section 3.7.1 Static storage duration says:
[...] The storage for these entities shall last for the duration of the program
Note in C++, this line:
char *s = "hello";
uses a deprecated conversion see C++ warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings] for more details.
The correct way would be as follows:
const char *s = "hello";
you only have to free if you use malloc or new
EDIT:
char* string = "a string"; memory allocation is static, and not good practice (if it will be constant the declaration should be a const char*)
because this is in the stack when the function ends it should be destroyed along with the rest of the local variables and arguments.
you need to use specific malloc/free and new/delete when you allocate the memory for your variable like:
char *string = new char[64]; --> delete string;
char *string = malloc(sizeof(char) * 64); --> free(string); //this is not best practice unless you have to use C
And where are literals in memory exactly? (see examples below)
I cannot modify a literal, so it would supposedly be a const char*, although the compiler let me use a char* for it, I have no warnings even with most of the compiler flags.
Whereas an implicit cast of a const char* type to a char* type gives me a warning, see below (tested on GCC, but it behaves similarly on VC++2010).
Also, if I modify the value of a const char (with a trick below where GCC would better give me a warning for), it gives no error and I can even modify and display it on GCC (even though I guess it is still an undefined behavior, I wonder why it did not do the same with the literal). That is why I am asking where those literal are stored, and where are more common const supposedly stored?
const char* a = "test";
char* b = a; /* warning: initialization discards qualifiers
from pointer target type (on gcc), error on VC++2k10 */
char *c = "test"; // no compile errors
c[0] = 'p'; /* bus error when execution (we are not supposed to
modify const anyway, so why can I and with no errors? And where is the
literal stored for I have a "bus error"?
I have 'access violation writing' on VC++2010 */
const char d = 'a';
*(char*)&d = 'b'; // no warnings (why not?)
printf("%c", d); /* displays 'b' (why doesn't it do the same
behavior as modifying a literal? It displays 'a' on VC++2010 */
The C standard does not forbid the modification of string literals. It just says that the behaviour is undefined if the attempt is made. According to the C99 rationale, there were people in the committee who wanted string literals to be modifiable, so the standard does not explicitly forbid it.
Note that the situation is different in C++. In C++, string literals are arrays of const char. However, C++ allows conversions from const char * to char *. That feature has been deprecated, though.
I'm not certain about what C/C++ standards stand for about strings. But I can tell exactly what actually happens with string literals in MSVC. And, I believe, other compilers behave similarly.
String literals reside in a const data section. Their memory is mapped into the process address space. However the memory pages they're stored in are ead-only (unless explicitly modified during the run).
But there's something more you should know. Not all the C/C++ expressions containing quotes have the same meaning. Let's clarify everything.
const char* a = "test";
The above statement makes the compiler create a string literal "test". The linker makes sure it'll be in the executable file.
In the function body the compiler generates a code that declares a variable a on the stack, which gets initialized by the address of the string literal "test.
char* b = a;
Here you declare another variable b on the stack which gets the value of a. Since a pointed to a read-only address - so would b. The even fact b has no const semantics doesn't mean you may modify what it points on.
char *c = "test"; // no compile errors
c[0] = 'p';
The above generates an access violation. Again, the lack of const doesn't mean anything at the machine level
const char d = 'a';
*(char*)&d = 'b';
First of all - the above is not related to string literals. 'a' is not a string. It's a character. It's just a number. It's like writing the following:
const int d = 55;
*(int*)&d = 56;
The above code makes a fool out of compiler. You say the variable is const, however you manage to modify it. But this is not related to the processor exception, since d resides in the read/write memory nevertheless.
I'd like to add one more case:
char b[] = "test";
b[2] = 'o';
The above declares an array on the stack, and initializes it with the string "test". It resides in the read/write memory, and can be modified. There's no problem here.
Mostly historical reasons. But keep in mind that they are somewhat justified: String literals don't have type char *, but char [N] where N denotes the size of the buffer (otherwise, sizeof wouldn't work as expected on string literals) and can be used to initialize non-const arrays. You can only assign them to const pointers because of the implicit conversions of arrays to pointers and non-const to const.
It would be more consistent if string literals exhibited the same behaviour as compound literals, but as these are a C99 construct and backwards-compatibility had to be maintained, this wasn't an option, so string literals stay an exceptional case.
And where are literals in memory exactly? (see examples below)
Initialized data segment. On Linux it is either .data or .rodata.
I cannot modify a literal, so it would supposedly be a const char*, although the compiler let me use a char* for it, I have no warnings even with most of the compiler flags.
Historical as it was already explained by others. Most compilers allow you tell whether the string literals should be read-only or modifiable with a command line option.
The reason it is generally desired to have string literals read-only is that the segment with read-only data in memory can be (and normally is) shared between all the processes started from the executable. That obviously frees some RAM from being wasted to keep redundant copies of the same information.
I have no warnings even with most of the compiler flags
Really? When I compile the following code snippet:
int main()
{
char* p = "some literal";
}
on g++ 4.5.0 even without any flags, I get the following warning:
warning: deprecated conversion from string constant to 'char*'
You can write to c because you didn't make it const. Defining c as const would be correct practice since the right hand side has type const char*.
It generates an error at runtime because the "test" value is probably allocated to the code segment which is read-only. See here and here.