String literals inside functions: automatic variables or allocated in heap? - c++

Are the string literals we use inside functions automatic variables? Or are they allocated in heap which we have to free manually?
I have a situation like the code shown below, in which I assign a string literal to a private field of the class (marked ONE in the code), then retrieve it much later in my program and use it (marked TWO). Am I assigning a stack variable to a field at ONE? Could the code be referencing a dangling pointer, which only worked in this case because the program was small enough?
I've compiled and run it and it worked fine, but I'm having a strange crash in my actual program, where I assign string literals to class fields like this, and I suspect the case I mentioned above.
#include <iostream>
using namespace std;
class MemoryLeak
{
private:
char *s;
public:
MemoryLeak() {}
void store()
{
s = "Storing a string"; // ONE
}
char *retrieve()
{
return s;
}
};
int main()
{
MemoryLeak *obj = new MemoryLeak();
obj->store();
cout << obj->retrieve() << endl; // TWO
delete obj;
return 0;
}
Should I be declaring the variable "s" as a char array instead of a pointer? I'm planning to use std::string, but I'm just curious about this.
Any pointers or help is, as always, much appreciated :) Thanks.

String literals will be placed in the initialized data or text (code) segment of your binary by the compiler, rather than residing on the heap (runtime-allocated memory) or the stack. So you should be using a pointer, since you're going to be referencing the string literal that the compiler has already produced for you. Note that modifying the literal (which would typically require changing memory protection) is undefined behaviour and may affect every use of that literal.

It is undefined behaviour to modify a string literal, and that is most likely the cause of the crash in your program (ISO C++: 2.13.4/2). The standard allows a conversion from a string literal to char* for backwards compatibility with C (the conversion was deprecated in C++03 and removed in C++11), and you should only have that conversion in your code if you absolutely need it.
If you wish to treat the string literal as a constant, then you can change the type of your member to a const char *.
If your design requires that s can be modified, then I would recommend changing its type to std::string.
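A minimal sketch of both suggestions, assuming two hypothetical class names (`LiteralHolder` and `TextHolder`, neither of which appears in the question): the first only points at the literal through a `const char *`, the second owns a modifiable copy in a `std::string`.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant 1: the member only points at the literal.
// const turns an accidental write into a compile-time error.
class LiteralHolder {
public:
    LiteralHolder() : s("") {}
    void store() { s = "Storing a string"; }    // nothing to free
    const char *retrieve() const { return s; }
private:
    const char *s;
};

// Hypothetical variant 2: the member owns its own copy of the text.
class TextHolder {
public:
    void store() { s = "Storing a string"; }    // copies the literal
    std::string &retrieve() { return s; }       // callers may modify the copy
private:
    std::string s;
};

void demo() {
    LiteralHolder a;
    a.store();
    assert(std::string(a.retrieve()) == "Storing a string");

    TextHolder b;
    b.store();
    b.retrieve()[0] = 's';                      // fine: this changes the copy
    assert(b.retrieve() == "storing a string");
}
```

Neither variant needs a destructor: the literal is never freed, and the `std::string` releases its own storage.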

Thank you Cody and Richard.
I found the cause of the bug. It was because I was doing a delete on an object which was already delete'd. I was doing:
if (obj != NULL) delete obj;
I changed it to:
if (obj != NULL)
{
delete obj;
obj = NULL;
}
Learning C++ is definitely fun :)

Maybe the cause of the crash is that you did not 0-terminate the string?

Let's have a look at your options.
There are also a couple of things you should do:
// Should initialize s to NULL or a valid string in the constructor.
MemoryLeak()
{
store();
}
void store()
{
// This does not need to be freed because it is a string literal
// generated by the compiler.
s = "Storing a string"; // ONE
// Note this is allowed for backward compatibility, but the literal is
// really an array of const char and thus unmodifiable. If somebody
// retrieves this C-string and tries to change any of its contents, the
// code could crash, as this is UNDEFINED behavior.
// The following does need to be free'd,
// but given that the type of s is char*, it is more correct.
s = strdup("Storing a string");
// This makes a copy of the string on the heap.
// Because you allocated the memory it is modifiable by anybody
// retrieving it but you also need to explicitly de-allocate it
// with free()
}
What you are doing is using C-strings. These should not be confused with C++ std::string. A C++ std::string is automatically initialized to the empty string. Any memory it allocates is de-allocated correctly. It can easily be returned in both modifiable and non-modifiable form. It is also easy to manipulate (i.e. grow, shrink, change). If you grow a C-string you need to re-allocate memory and copy the string to the new memory, etc. (which is very time-consuming and error-prone).
To cope with dynamically allocating your object I would learn about smart pointers.
See this article for more details on smart pointers.
Smart Pointers or who owns you Baby
std::auto_ptr<MemoryLeak> obj(new MemoryLeak());
obj->store();
std::cout << obj->retrieve() << std::endl; // TWO
// No need to delete. When obj goes out of scope it automatically deletes the memory.
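std::auto_ptr was deprecated in C++11 and removed in C++17. Here is a sketch of the same idea with std::unique_ptr, assuming a C++11 compiler. `Holder` is a hypothetical stand-in for the MemoryLeak class, and its `alive` counter is added only to make the automatic deletion observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the MemoryLeak class from the question.
struct Holder {
    static int alive;                        // number of live Holder objects
    Holder() : s("") { ++alive; }
    ~Holder() { --alive; }
    void store() { s = "Storing a string"; }
    const char *retrieve() const { return s; }
    const char *s;
};
int Holder::alive = 0;

void demo() {
    {
        std::unique_ptr<Holder> obj(new Holder());
        obj->store();
        assert(Holder::alive == 1);
    }   // obj goes out of scope here and deletes the Holder
    assert(Holder::alive == 0);
}
```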

Related

Deleting Strings on the Heap Created from Char[] on the Heap in C++

I have a fairly simple question that I cannot seem to find an answer for relating to C++ std::string and how it is instantiated with new. Now, I am well aware that any pointer returned from new should be subsequently deleted to prevent memory leak. My question comes from what happens when an existing pointer is subsequently used to instantiate a new string object. Please consider the following simplified example:
char* foo() {
char* ptr;
ptr = new char[ARBITRARY_VALUE];
...
strncpy(ptr, "some null terminated string", ARBITRARY_VALUE);
...
return ptr;
}
int main() {
char* buf;
std::string* myStr;
buf = foo();
myStr = new std::string(buf);
...do stuff
delete myStr;
delete[] buf; // Is this necessary?
return 0;
}
My question is simple: does deleting myStr also free the underlying memory used by buf or does buf need to be freed manually as well? If buf has to be freed manually, what happens in the case of anonymous parameters? As in:
myStr = new std::string(foo());
My suspicion is that the underlying implementation of std::string only maintains a pointer to the character buffer and, upon destruction, frees that pointer but I am not certain and my C++ is rusty at best.
Bonus question: How would this change if the class in question were something other than std::string? I assume that for any user created class, an explicit destructor must be provided by the implementer but what about the various other standard classes? Is it safe to assume that deletion of the parent object will always be sufficient to fully destruct an object (I try to pick my words carefully here; I know there are cases where it is desirable to not free the memory pointed to by an object, but that is beyond the scope of this question)?
std::string may be initialized from a C-style null-terminated string (const char *), and it copies the characters into storage that it manages itself. It has no way of knowing whether that const char * needs to be free()d, delete[]d, or neither, so it never frees the pointer you pass in; buf remains yours to release.
Use smart pointers to automatically delete dynamically allocated objects. There are several kinds, each specialized for a particular purpose. Have a look at scoped_ptr, auto_ptr and shared_ptr. Your project will probably have constraints on which smart pointers you get to use.
In the context of C++ there is never a reason to hold strings in manually declared char arrays; std::string is much safer to use.
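A small sketch of the ownership rule, assuming a simplified, hypothetical body for foo() (the question elides the real one): the std::string takes a copy, so the buffer still has to be released with delete[] by whoever holds the pointer.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical version of foo(): returns a buffer allocated with new[].
char *foo() {
    const char text[] = "some null terminated string";
    char *ptr = new char[sizeof text];
    std::memcpy(ptr, text, sizeof text);     // sizeof includes the '\0'
    return ptr;
}

std::string demo() {
    char *buf = foo();
    std::string myStr(buf);   // copies the characters; does not adopt buf
    buf[0] = 'S';             // changing buf does not affect myStr
    assert(myStr[0] == 's');
    delete[] buf;             // still the caller's job, and it must be delete[]
    return myStr;             // stays valid after buf is gone
}
```

Under the same assumption, writing `std::string(foo())` would lose the only copy of the pointer, so the buffer could never be released.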

C++ std::string* s; Memory reclaimed?

Given a function foo with a statement in it:
void foo() {
std::string * s;
}
Is memory reclaimed after this function returns?
I am assuming yes because this pointer isn't pointing to anything but some people are saying no - that it is a dangling pointer.
std::string* s is just an uninitialized pointer to a string. The pointer will be destroyed when function foo returns (because the pointer itself is a local variable allocated on the stack). No std::string was ever created, hence you won't have any memory leak.
If you say
void foo() {
std::string * s = new std::string;
}
Then you will have a memory leak.
This code is typical when people learn about strings a-la C, and then start using C++ through C idioms.
C++ classes (in particular standard library classes) treat objects as values and manage the memory they need themselves.
std::string, in this sense, is no different from an int. If you need a "text container", just declare a std::string (not a std::string*) and initialize it accordingly (a default-constructed std::string is empty by definition), then use it to form expressions using methods, operators and related functions, as you would with other simple types.
std::string* itself is a symptom of a badly designed environment.
Explicit dynamic memory in C++ is typically used in two situations:
You don't know at compile time the size of an object (typical with unknown size arrays, like C strings are)
You don't know at compile time the runtime-type of an object (since its class will be decided on execution, based on some other input)
Now, std::string manages the first point itself and does not support the second (it has no virtual methods), so allocating it dynamically adds no value: it just adds the complication of managing, yourself, the memory for a string object that is itself a manager of other memory holding its actual text.
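As a rough illustration of treating std::string as a value, here is a sketch with a hypothetical function `greet` (not from the question). No new or delete appears anywhere, and the string grows on its own.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a greeting entirely with value semantics.
std::string greet(const std::string &name) {
    std::string text;         // default-constructed, therefore empty
    assert(text.empty());
    text = "Hello, ";
    text += name;             // grows as needed; no manual allocation
    return text;              // returned by value, just like an int
}
```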
This code only declares a pointer that could point to a string somewhere in memory; it does not allocate a new string.
Memory is allocated only for the pointer value itself, and after the function returns that pointer is no longer valid.

c++ pointer scope

What happens when you have the following code:
void makeItHappen()
{
char* text = "Hello, world";
}
Does text go out of scope and get deleted automatically or does it stay in the memory?
And what about the following example:
class SomeClass
{
public:
SomeClass();
~SomeClass();
};
SomeClass::SomeClass() { }
SomeClass::~SomeClass()
{
std::cout << "Destroyed?" << std::endl;
}
int main()
{
SomeClass* someClass = new SomeClass();
return 0;
} // What happened to someClass?
Does the same thing occur here?
Thanks!
char* text = "Hello, world";
Here an automatic variable (a pointer) is created on the stack and set to point to a value in constant memory, which means:
the string literal in "" exists through the whole program execution.
you are not responsible for "allocating" or "freeing" it
you may not change it. If you want to change it, then you have to allocate some "non-constant memory" and copy it there.
When the pointer goes out of scope, the memory for the pointer itself (typically 4 or 8 bytes) is freed, and the string stays where it was, in constant memory.
For the latter:
SomeClass* someClass = new SomeClass();
Then the someClass pointer will also be freed when it goes out of scope (since the pointer itself is on the stack too, just as in the first example)... but not the object!
The keyword new basically means that you allocate some memory for the object on free store - and you're responsible for calling delete sometime in order to release that memory.
Does text go out of scope
Yes! It is local to the function makeItHappen(), and when the function returns it goes out of scope. However, the pointed-to string literal "Hello, world" has static storage duration and is typically stored in a read-only section of memory.
And what about the following example:
......
Does the same thing occur here?
Your second code sample leaks memory.
SomeClass* someClass = new SomeClass();
someClass is local to main(), so when main returns, it, being an automatic variable, gets destroyed. However, the pointed-to object remains in memory and there is no way to free it after the function returns. You need to explicitly write delete someClass to properly deallocate the memory.
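The difference between the pointer's lifetime and the object's lifetime can be made visible with a counter. This is only a sketch: `Tracked`, its `alive` counter and `make()` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>

// Hypothetical class that counts how many of its objects currently exist.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

Tracked *make() {
    Tracked *p = new Tracked();   // p is automatic, the object is not
    return p;                     // without this return the address would be lost
}

void demo() {
    Tracked *t = make();
    assert(Tracked::alive == 1);  // the object outlived the local pointer p
    delete t;                     // the destructor runs only here
    assert(Tracked::alive == 0);
}
```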
The variable text does go out of scope (however the string literal is not deleted).
For objects that you allocate with new (like your SomeClass), you need to explicitly delete them. If you want objects allocated like this to be deleted automatically, take a look at the Boost smart pointers (or std::unique_ptr if your compiler is C++0x aware).
A smart pointer will automatically delete the allocated object when the smart pointer goes out of scope.
Your code would then look like this:
int main(int argc, char **argv)
{
boost::scoped_ptr<SomeClass> ptr(new SomeClass);
// the object is automatically deleted
return 0;
}
Note: In this particular example, you could also use std::auto_ptr (but it is deprecated as of C++0x/C++11 and was removed in C++17).
Note 2: As was pointed out in the comments by Kos, it is in this case more appropriate to use boost::scoped_ptr or std::unique_ptr (c++0x). My answer first used boost::shared_ptr, which is more appropriate if you need to share ownership of a pointer between several classes for instance.
In the first example the string literal is stored in the data segment of your executable.
In the second case you do not strictly have to call delete (your example program just terminates), since on most operating systems the heap is reclaimed when the process terminates. Note, however, that the destructor never runs, so "Destroyed?" is never printed.
Note, though, that there are operating systems (so I have read) on which you have to release the heap explicitly even when the program terminates, because it is not cleaned up for you.
Of course, the programmer is responsible for memory management in C++, and objects you create on the heap should be deleted once they are no longer needed.

How to return a string literal from a function

I am always confused about returning a string literal or a string from a function. I was told that there might be a memory leak because you don't know when the memory will be deleted.
For example, in the code below, how should foo() be implemented so that the output of the code is "Hello World"?
void foo ( ) // you can add parameters here.
{
}
int main ()
{
char *c;
foo ( );
printf ("%s",c);
return 0;
}
Also, if the return type of foo() is not void, but you can return char*, what should it be?
I'm assuming we cannot modify main. To get your program working without a leak, you need something to have static storage:
void foo(char*& pC) // reference
{
static char theString[] = "thingadongdong";
pC = theString;
}
But really, this isn't very conventional C++ code. You'd be using std::string and std::cout, so you don't have to worry about memory:
std::string foo(void)
{
return "better thingadongdong";
}
int main(void)
{
// memory management is done
std::cout << foo() << std::endl;
}
If you're wondering if something needs to be manually deallocated, it's being done wrong.
Since the old use of char* is being deprecated, can you not simply use a string?
const char* func1 () {return "string literal";}
string func2 () {return "another string literal";}
Both of these work fine, with no compiler warnings.
However
char* func3 () {return "yet another string literal";}
is ill-formed in C++11 and later (compilers reject it or at least warn about it). Nor will this compile:
char* func4 () {return &"a ref to a string literal?";}
Stroustrup says in "The C++ Programming Language" (Third Edition):
"A string literal is statically allocated so that it is safe to return one from a function.
const char* error_message (int i)
{
//...
return "range error";
}
The memory holding range error will not go away after a call of error_message()."
So every string literal in a program is allocated in its own little piece of memory that lasts for the duration of the program (i.e. is statically allocated).
Putting const in front of the char* tells the compiler that you do not intend to (and cannot) alter that string literal's piece of memory, which could be dangerous, so the compiler accepts the return without relying on the deprecated conversion from string literal to char*.
Returning a string instead copies the string literal into an object of type string, whose memory the caller is responsible for.
Either way there are no memory leaks: every string literal gets its own piece of memory that is cleaned up on program termination; return to const char* returns a pointer to a literal's piece of memory (knowing you cannot alter it); and return to a string makes a copy into a string object existing in the caller's code which is cleaned up by the caller.
Though it seems a little ugly notation-wise, I'm guessing they left the const char* to keep the cheap alternative (involving no copies).
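The two safe variants can be compared side by side. This sketch reuses func1 and func2 from above; `demo` is a hypothetical driver added for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

const char *func1() { return "string literal"; }           // no copy is made
std::string func2() { return "another string literal"; }   // copies into a string

void demo() {
    const char *p = func1();
    assert(std::strcmp(p, "string literal") == 0);  // still valid after the call

    std::string s = func2();
    s[0] = 'A';                                     // the copy is modifiable
    assert(s == "Another string literal");
    assert(func2() == "another string literal");    // the literal is unchanged
}
```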
I am always confused about return a string literal or a string from a function.
Immutable, literal string
As I understand it, you are safe to return a string literal directly if the return type is declared const, to declare that the string is not intended to be altered. This means you needn't worry about the lifespan of the string / memory leaks.
Mutable, non-literal string
However, if you need a string that you can change in-place, you need to consider the lifespan of the string and the size of the memory allocation in which it is stored.
This becomes an issue, since you can no longer blithely return the same memory containing string for each invocation of the function, since a previous use could have altered the contents of that memory, and/or may still be in use. Hence a new piece of memory must be allocated to hold the string returned.
This is where the potential for a leak occurs, and where the choice needs to be made about where the allocation and de-allocation should occur. You could have the function itself allocate the memory and state in the documentation that this happens and stipulate therein that the caller has to free the memory when it is no longer required (preventing a leak). This means the function can simply return a char *.
The other option is to pass in some memory to the function that was allocated by the caller, and have the function place the string inside that memory. In this case, the caller both allocates and is responsible for freeing that memory.
Finally, I mentioned that the size of the memory and string need to be managed when using a mutable string. The allocation needs to be large enough both for the string initially set by the function and for any changes that are made after the function returns, before the memory is freed. Failing to do this correctly can cause a buffer overflow by writing a string too long to fit in the memory initially allocated; this is extremely dangerous to the health and security of your program. It can cause bugs and security holes that are extremely hard to spot (since the source of the error, the overflow, can be far removed from the symptoms seen when the program fails).
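The caller-allocates option, together with the size check, might look roughly like this. It is a sketch assuming C++11 (for nullptr); `fillGreeting` is a hypothetical helper and not part of the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes the greeting into a caller-owned buffer.
// Returns false (and writes nothing) if the buffer is too small.
bool fillGreeting(char *buffer, std::size_t size) {
    const char text[] = "Hello World";
    if (buffer == nullptr || size < sizeof text) return false;
    std::memcpy(buffer, text, sizeof text);   // sizeof includes the '\0'
    return true;
}

void demo() {
    char big[32];
    bool ok = fillGreeting(big, sizeof big);
    assert(ok);
    assert(std::strcmp(big, "Hello World") == 0);

    char tiny[4];
    bool refused = !fillGreeting(tiny, sizeof tiny);
    assert(refused);                          // rejected instead of overflowing
}
```

Because the caller owns both buffers, nothing has to be freed here; with a heap buffer the caller would release it with the matching deallocation.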
Something like this:
void foo(char ** pChar)
{
// Make sure the string is shorter
// than the buffer
*pChar = new char[256];
strcpy(*pChar,"Hello World!");
}
Then call it like this:
foo(&c);
As mentioned in the comment, be careful the string you are storing is smaller than the buffer or you will get a... stack overflow! (Pun intended)

Difference in behavior when returning a local reference or pointer

#include <iostream>
using namespace std;
class A {
public:
virtual char* func()=0;
};
class B :public A {
public:
void show() {
char * a;
a = func();
cout << "The First Character of string is " << *a;
}
char * func();
};
char* B::func() {
cout << "In B" << endl;
char x[] = "String";
return x;
}
int main() {
B b;
b.show();
}
The problem here is that I am returning a pointer/reference to a local variable.
Currently it is char x[]="String". When I use a pointer, char *x="String", the result is "S", but with the array the output comes out as (i).
When you do something like:
const char *f(){ return "static string"; }
You're returning the address of a string literal, but that string literal is not local to the function. Rather, it is statically allocated, so returning it gives well-defined results (i.e. the string continues to exist after the function exits, so it works).
When you (attempt to) return the address of an array of char like this:
char *f() {
char x[] = "automatically allocated space";
return x;
}
The compiler allocates space for x on the stack, then initializes it from a string literal to which you don't have direct access. What you're returning is the address of the memory in the stack, not the string literal itself -- so as soon as the function exits, that array ceases to exist, and you have no idea what else might be put at that address. Trying to use that address causes undefined behavior, which means anything can happen.
That is because, when B::func() returns, the memory allocated for x[] is released, so if you try to access that memory location afterwards you will get garbage values. But when you write char *x="String", the memory for the string "String" is most probably allocated only once, in the read-only section of your process memory. This address is guaranteed to remain valid until your program finishes executing. In that case, accessing it through the pointer works correctly. BTW, as a side note, you need to declare a virtual destructor in the base class.
First off, figure out how to post code blocks. It's not hard, just indent them by four spaces.
Second, you should never return a pointer to a function-local variable, because such variables are allocated on the stack and all bets are off as to whether they are still around after the function returns.
EDIT: This was written when the provided code was butchered and incomplete. My point doesn't apply to this particular case, but it's still important to know.
To try to be more precise: B::func() is returning a pointer to a hunk of memory that it used for a temporary array. There are absolutely no guarantees what is in that memory once the function returns. Either (1, simplest good practice ) the calling function can allocate memory and pass in a string buffer for the called function to use or (2) the called function needs to allocate a buffer and the calling function needs to later release it (but this requires a very disciplined approach by the developer of the calling function, and I'd never want to rely on that) or (3, probably really the best practice) the calling function can pass in some sort of smart pointer that will release its memory when it goes out of scope, and have the called function use that smart pointer to point to the memory it allocates.
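One further way to sidestep the problem, not listed among the options above, is to return a std::string by value so that no pointer to local storage ever escapes. This is a sketch of the question's classes rewritten under that assumption; the `first()` member and `demo` function are hypothetical additions.

```cpp
#include <cassert>
#include <string>

class A {
public:
    virtual ~A() {}                   // virtual destructor for a polymorphic base
    virtual std::string func() = 0;
};

class B : public A {
public:
    // Returned by value: the caller gets its own copy of the text.
    std::string func() { return "String"; }
    // Hypothetical helper mirroring show(): first character of the string.
    char first() { return func()[0]; }
};

void demo() {
    B b;
    assert(b.first() == 'S');
}
```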