How does one differentiate between pointers and references at runtime? For example, if I wanted to free a pointer of a data type without knowing whether it were a pointer or not how would I do so? Is there any method to tell if a variable has been allocated on the stack or through malloc()?
void destInt(int* var)
{
free(var);
}
int num = 3;
int &numRef = num;
int* numPtr = (int*)malloc(sizeof(int));
*numPtr = num;
destInt(&numRef); //Syntactically correct but generates invalid pointer()
destInt(numPtr); //Fine syntactically and logically
No, not in the general case and not in a portable manner. If you know where in memory the heap is, you can make an educated guess, but not in any reliable way.
EDIT: also note that C does not have references. The &-operator in C is used to take the address of a variable.
If it's ANSI C, then there's no such thing as a reference, so you might want to rephrase your question to talk about pointers to heap allocated or pointers to stack allocated objects.
Often the address of the heap is 'small' and grows up, and the stack is 'big' and grows down, but that's only a heuristic and non-portable.
In C++, the information differentiating whether it is a reference or a pointer is part of the type information at compile-time. In C, this is an irrelevant distinction in semantics.
If you need to use & to get the address of something, then you cannot delete or free it. Otherwise, if you're passing a pointer around, you need to document which functions have the authority to delete or free it. The easiest way to do this in C++ is to use a smart pointer class like a shared_ptr or scoped_ptr.
Whatever you're trying to accomplish.... don't do it this way.
You can usually obtain the bounds of the stack, but this would normally be a pretty compiler/platform specific process. Same with the heap. If you've hooked new and delete with your own versions you probably know where the heap starts and ends. Otherwise you don't.
However, the tree you're barking up is not a good one. If you're convinced you really need to do it this way, pass the information around with the pointer. Wrap it in a struct that also has a bool called needsFree or something. But otherwise, the fact that you're running into this problem often indicates that the problem you're trying to solve could be solved in a cleaner way.
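A minimal sketch of the "pass the information around with the pointer" idea, assuming a malloc-based int as in the question. The names IntHandle, borrowInt and allocInt are hypothetical and only illustrate the needsFree flag; this is not a recommendation over smart pointers.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical wrapper: the pointer travels together with a flag that records
// whether it came from malloc() and therefore must be passed to free().
struct IntHandle {
    int* ptr;
    bool needsFree;
};

// Wraps an existing object (stack or static); the handle does not own it.
IntHandle borrowInt(int& value) {
    return IntHandle{&value, false};
}

// Allocates with malloc(); the handle owns the memory.
IntHandle allocInt(int value) {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    assert(p != nullptr);  // a real program would handle failure explicitly
    *p = value;
    return IntHandle{p, true};
}

// Frees the memory only when the flag says it is ours to free.
// Returns true if free() was actually called.
bool destInt(IntHandle& h) {
    bool freed = false;
    if (h.needsFree && h.ptr != nullptr) {
        std::free(h.ptr);
        freed = true;
    }
    h.ptr = nullptr;       // the handle is unusable after destruction
    h.needsFree = false;
    return freed;
}
```

With this, destInt on a handle made by borrowInt does nothing to the stack variable, while destInt on a handle made by allocInt frees the block exactly once.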
When using malloc, memory is NOT allocated on the stack, but on the heap.
Is there any way to distinguish two following situations at run time:
double *ptr = new double(3.14159);
double variable = 3.14159;
double * testPtr_1 = ptr;
double * testPtr_2 = &variable;
delete testPtr_1; // fine...
delete testPtr_2; // BIG RUN TIME ERROR !!!
I have found myself in a situation in which I need to call the delete operator on some unknown pointer. The pointer can point anywhere (to a "local" variable or to a dynamically allocated variable).
How can I find out where my "unknown" pointer points, and therefore choose when to and when not to call operator delete on it?
EDIT:
OK, I see that everyone is pointing me to smart pointers, but what if I am trying to write my own set of smart pointers (that is the reason behind my question)?
There is no way to test if a pointer is pointing to a memory area that would be valid to delete. Moreover,
There is no way to tell between pointers that must be freed with delete vs. delete[],
There is no way to tell between the pointers that have been freed and pointers that have not been freed,
There is no way to tell among pointers to an automatic variable, pointers to static variable, and pointers to dynamically allocated blocks.
The approach that you should take is tracking allocations/deallocations by some other means, such as storing flags along with your pointers. However, this is rather tedious: a much better practice is to switch to smart pointers, which would track resources for you.
You need to set some better coding practices for yourself (or for your project).
Especially since most platforms have, at the very least, a C++11-compliant compiler, there's no reason not to be using the following paradigm:
Raw Pointers (T*) should ONLY be used as non-owning pointers. If you receive a T* as the input for a function or constructor, you should assume you have no responsibility for deleting it. If you have an instance or local variable that is a T*, you should assume you have no responsibility for deleting it.
Unique Pointers (std::unique_ptr<T>) should be used as single-ownership pointers, and in general, these should be your default go-to choice for any situation where you need to dynamically allocate memory. std::make_unique<T>() (available since C++14) should be preferred for creating any kind of Unique Pointer, as this prevents you from ever seeing the raw pointer in use, and it prevents issues like the one you described in your original post.
Shared Pointers (std::shared_ptr<T> and std::weak_ptr<T>) should ONLY be used in situations where it is logically correct to have multiple owners of an object. These situations occur less often than you think, by the way! std::make_shared<T>() is the preferred method of creating Shared Pointers, for the same reasons as std::make_unique, and also because std::make_shared can perform some optimizations on the allocations, improving performance.
Vectors (std::vector<T>) should be used in situations where you need to allocate multiple objects into heap space, the same as if you called new T[size]. There's no reason to use pointers at all except in very exotic situations.
It should go without saying that you need to take my rules of "ONLY do 'x'" with a grain of salt: Occasionally, you will have to break those rules, and you might be in a situation where you need a different set of rules. But for 99% of use-cases, those rules are correct and will best convey the semantics you need to prevent memory leaks and properly reason about the behavior of your code.
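The four rules above could look roughly like this in code. This is only a sketch: Sensor and the function names are made up for illustration, and std::make_unique needs a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Sensor {            // hypothetical type used only for illustration
    int id;
    explicit Sensor(int i) : id(i) {}
};

// Raw pointer as a non-owning view: read the object, never delete it.
int readId(const Sensor* s) {
    return s != nullptr ? s->id : -1;
}

// Single ownership: the caller receives the one and only owner.
std::unique_ptr<Sensor> makeSensor(int id) {
    return std::make_unique<Sensor>(id);
}

// Shared ownership: every copy of the shared_ptr keeps the object alive.
long ownersAfterCopy(int id) {
    std::shared_ptr<Sensor> first = std::make_shared<Sensor>(id);
    std::shared_ptr<Sensor> second = first;   // second owner of the same object
    assert(first.get() == second.get());
    return first.use_count();                 // 2 while both owners exist
}

// Several objects on the heap: std::vector instead of new T[size].
std::vector<int> makeZeros(std::size_t n) {
    return std::vector<int>(n, 0);
}
```

Note that nothing here calls delete: the unique_ptr, the last shared_ptr and the vector each release their memory when they go out of scope.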
You cannot.
Avoid raw pointers and use smart pointers, particularly std::unique_ptr. It conveys clearly who is responsible for deleting the object, and the object will be deleted when the std::unique_ptr goes out of scope.
When creating objects, avoid using new. Wrap them in a smart pointer directly and do not take addresses of anything to wrap it in a smart pointer. This way, all raw pointers will never need freeing and all smart pointers will get cleaned up properly when their time has come.
Okay, some things you can distinguish in a very platform-specific, implementation-defined manner. I won’t go into details here, because it’s essentially insane to do (and, again, depends on the platform and implementation), but you are asking for it.
Distinguish local, global and heap variables. This is actually possible on many modern architectures, simply because those three are different ranges of the address space. Global variables live in the data section (as defined by the linker and run-time loader), local variables on the stack (usually at the end of the address space) and heap variables live in memory obtained during run-time (usually not at the end of the address space and of course not overlapping the data and code sections, a.k.a. "mostly everything else"). The memory allocator knows which range that is and can tell you details about the blocks in there, see below.
Detect already-freed variables: you can ask the memory allocator that, possibly by inspecting its state. You can even find out whether a pointer points into an allocated region and then find the block to which it belongs. This is, however, probably computationally expensive to do.
Distinguishing heap and stack is a bit tricky. If your stack grows large and your program is running long and some piece of heap has been returned to the OS, it is possible that an address which formerly belonged to the heap now belongs to the stack (and the opposite may be possible too). So as I mentioned, it is insane to do this.
You can't, reliably. This is why owning raw pointers are dangerous: they do not couple the lifetime to the pointer but instead leave it up to you, the programmer, to know all the things that could happen and prepare for them all.
This is why we have smart pointers now. These pointers couple the lifetime to the pointer, which means the object is only deleted once it is no longer in use anywhere. This makes dealing with pointers much more manageable.
The C++ Core Guidelines suggest that a raw pointer should never be deleted, as it is just a view. You are just using it like a reference, and its lifetime is managed by something else.
OK, I see that everyone is pointing me to smart pointers, but what if I am trying to write my own set of smart pointers (that is the reason behind my question)?
In that case, do as the standard smart pointers do and take a deleter, which you default to just using delete. That way, if the user of the class wants to pass in a pointer to a stack object, they can specify a do-nothing deleter and your smart pointer will use that and, well, do nothing. This puts the onus on the person using the smart pointer to tell the pointer how to delete what it points to. Normally they will never need to use something other than the default, but if they happen to use a custom allocator and need a custom deallocator, they can supply one this way.
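Here is a rough sketch of that deleter idea for a home-grown smart pointer. SimplePtr, DeleteWithDelete and DoNothing are hypothetical names; a real implementation (such as std::unique_ptr) handles many more cases, for example move assignment and arrays.

```cpp
#include <cassert>
#include <utility>

// Default policy: the pointer came from `new`, so release it with `delete`.
template <typename T>
struct DeleteWithDelete {
    void operator()(T* p) const { delete p; }
};

// Policy for objects the smart pointer does not own (e.g. stack objects).
template <typename T>
struct DoNothing {
    void operator()(T*) const {}
};

// Hypothetical minimal smart pointer: single owner, move-only.
template <typename T, typename Deleter = DeleteWithDelete<T>>
class SimplePtr {
public:
    explicit SimplePtr(T* p = nullptr, Deleter d = Deleter())
        : ptr_(p), del_(d) {}

    // The deleter, not the class, decides what "clean up" means.
    ~SimplePtr() { if (ptr_ != nullptr) del_(ptr_); }

    SimplePtr(const SimplePtr&) = delete;             // no shared ownership
    SimplePtr& operator=(const SimplePtr&) = delete;

    SimplePtr(SimplePtr&& other) noexcept
        : ptr_(other.ptr_), del_(std::move(other.del_)) {
        other.ptr_ = nullptr;                         // the source gives up ownership
    }

    T& operator*() const {
        assert(ptr_ != nullptr);
        return *ptr_;
    }
    T* get() const { return ptr_; }

private:
    T* ptr_;
    Deleter del_;
};
```

SimplePtr<double> p(new double(3.14159)) then deletes on scope exit, while SimplePtr<double, DoNothing<double>> q(&variable) leaves the stack variable alone. The class never has to guess where the pointer came from, because the caller states it through the deleter type.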
Actually you can. But memory overhead occurs.
You overload the new and delete operators, keep track of the allocations, and store the addresses somewhere (as void *):
#include <iostream>
#include <algorithm>
#include <cstdlib> // malloc, free
using namespace std;
void** memoryTrack=(void **)malloc(sizeof(void *)*100); //This will store address of newly allocated memory by new operator(really malloc)
int cnt=0;//just to count
//New operator overloaded
void *operator new( size_t stAllocateBlock ) {
cout<<"in new";
void *ptr = malloc(stAllocateBlock); //Allocate memory using malloc
memoryTrack[cnt] = ptr;//Store it in our memoryTrack
cnt++; //Increment counter
return ptr; //return address generated by malloc
}
void display()
{
for(int i=0;i<cnt;i++)
cout<<memoryTrack[i]<<endl;
}
int main()
{
double *ptr = new double(3.14159);
double variable = 3.14159;
double * testPtr_1 = ptr;
double * testPtr_2 = &variable;
delete testPtr_1; // fine...
delete testPtr_2;
return 0;
}
Now the most important function(You will have to work on this because it is not complete)
void operator delete( void *pvMem )
{
//Just printing the address to be searched in our memoryTrack
cout<<pvMem<<endl;
//If found free the memory
if(find(memoryTrack,memoryTrack+cnt,pvMem)!=memoryTrack+cnt)
{
//cout<<*(find(memoryTrack,memoryTrack+cnt,pvMem));
cout<<"Can be deleted\n";
free (pvMem);
//After this make that location of memoryTrack as NULL
//Also keep track of indices that are NULL
//So that you can insert next address there
//Or better yet implement linked list(Sorry was too lazy to do)
}
else
cout<<"Don't delete memory that was not allocated by you\n";
}
Output
in new
0xde1360
0xde1360
Can be deleted
0xde1360
0x7ffe4fa33f08
Don't delete memory that was not allocated by you
0xde1360
Important Note
This is just basics and just code to get you started
Open for others to edit and make necessary changes/optimization
The STL cannot be used here, because it uses the new operator itself (if someone can make it work with the STL, please do; it would help to reduce and optimize the code)
I would like to return some material in the struct referenced by a pointer and then delete the struct.
In Java, just return the value and the garbage collection system will delete the struct automatically.
But in C++, the only way I can imagine is not very clean: using a temporary variable to store the value to return, deleting the pointer, and then returning the stored value.
I tried another, trickier way using a comma expression, "return ptr->value, delete ptr", but there is a compile error that says "void value not ignored as it ought to be".
Is there any possible way to achieve that more elegantly?
Thanks a lot.
Updated
Thanks a lot for suggestions from everybody. In fact the original motivation of my problem is about the comma expression which I would like to use for some shorter code. And I found the discussion is more about the usage of pointers in C++. It's also another very interesting topic.
I have used C for years, so I am more familiar with raw pointers and have little experience with smart pointers. At first thought, there are two basic situations where we need pointers. One is referring to a large piece of memory allocated on the heap, and the other is dynamic allocation such as linked-list nodes or tree nodes (e.g. my original problem came up while writing a BST-like struct).
So in C++ programming, is smart pointer the best choice for both cases? If we consider the efficiency, such as working on some low level library, is it possible to encapsulate the raw pointer inside the class completely for less memory leak risk?
Sure. Don't use pointers, and if you must, use smart pointers (std::shared_ptr, std::unique_ptr).
In your case, it could be as simple as
//...
return obj.value; //no pointer needed
//automatic memory management
or
//...
return smartPtr->value; //smart pointer automatically cleans up after itself
You probably can't imagine the clean way of doing it in C++ because C++ is taught as C, with pointers and memory management issues. Proper C++ uses RAII and doesn't suffer from that.
No, in your case, because the function is supposed to return some value, but comma operator evaluates to the right most operand, which in your case is delete expression, which is just void.
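If the pointer really has to be deleted inside the function, one way to avoid both the temporary and the comma trick is to hand the pointer to a std::unique_ptr first. A small sketch, where Node and takeValue are made-up names standing in for your struct and function:

```cpp
#include <cassert>
#include <memory>

struct Node {                     // hypothetical struct for illustration
    static int alive;             // counts live nodes, for demonstration only
    int value;
    explicit Node(int v) : value(v) { ++alive; }
    Node(const Node&) = delete;   // keeps the live count honest
    ~Node() { --alive; }
};
int Node::alive = 0;

// Takes ownership of `raw`, returns its value, and deletes the node.
// The unique_ptr destructor runs after the return value has been copied,
// so no temporary variable is needed.
int takeValue(Node* raw) {
    assert(raw != nullptr);
    std::unique_ptr<Node> owner(raw);
    return owner->value;
}
```

After takeValue(new Node(7)) returns 7, Node::alive is back to 0, i.e. the node was deleted on the way out.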
The usual solution is, first, to not use pointers, so that there
is nothing to delete, and second, when there are other resources
to be cleaned up, to do the cleanup in a destructor, which will
automatically be called after the return statement has copied
the return value to wherever it has to copy it.
If you can return it, you can copy it, so it shouldn't be
allocated dynamically. The most notable exception is where
polymorphic objects are involved; polymorphism requires pointers
or references to work, and most of the time (although there are
exceptions), the polymorphic object will be dynamically
allocated. In this case, if the actual lifetime of the
polymorphic object corresponds to local scope, you can use
std::auto_ptr (or std::unique_ptr in the unlikely case that
you can count on C++11).
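A sketch of the polymorphic case, using std::unique_ptr (std::auto_ptr was deprecated in C++11 and removed in C++17). Shape, Circle, Square, makeShape and describe are hypothetical names used only to show a dynamically allocated polymorphic object whose lifetime matches a local scope.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() {}                      // needed: deleted through a base pointer
    virtual std::string name() const = 0;
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
struct Square : Shape {
    std::string name() const override { return "square"; }
};

// The concrete type is only known at run time, hence dynamic allocation.
std::unique_ptr<Shape> makeShape(bool round) {
    if (round) return std::unique_ptr<Shape>(new Circle);
    return std::unique_ptr<Shape>(new Square);
}

// The object lives exactly as long as this scope; its destructor runs
// after the return value has been constructed.
std::string describe(bool round) {
    std::unique_ptr<Shape> shape = makeShape(round);
    assert(shape != nullptr);
    return shape->name();
}
```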
If you dynamically allocate an object inside a function, what you can do is the following.
Suppose you want to return an int variable:
int foo()
{
MyStruct *obj;
obj=new MyStruct;
.....
int x=obj->value;
delete obj;
return x;
}
Or you can also do:
int foo()
{
MyStruct obj; // automatic storage: no new, so no delete is needed
.....
return obj.value;
}
This way you won't have to worry about memory leaks.
This is the easiest method. Sure you can use smart pointers. But at your level I would say stick with this method.
I have seen at least 5 C++ tutorial sites that return pointers this way:
int* somefunction(){
int x = 5;
int *result = &x;
return result;
}
Is that not a very, VERY bad idea? My logic tells me that the returned pointer's value can be overwritten at any time. I would rather think that this would be the right solution:
int* somefunction(){
int x = 5;
int* result = new int;
*(result) = x;
return result;
}
And then leave the calling function to delete the pointer. Or am I wrong?
Your instinct about the problem is correct: UB is the result. However, your proposed solution is also bad. "Leave the caller to delete it" is hideously error-prone and inadvisable. Instead, return it in some owning class that properly encapsulates its intended usage, preferably std::unique_ptr<int>.
Yes, the first option will return a dangling pointer, and leads to undefined behavior. Your second option is correct, although you could just write:
int* somefunction(){
return new int(5);
}
or having a static variable inside the method and returning its address.
Yes, the first example is not good as you'll be returning a pointer to memory that the system may re-purpose for something else. The second example is better, but still risks leaking memory as it's not clear to the caller of somefunction that it's their responsibility to delete the memory that's pointed at.
Something like this might be better:
std::unique_ptr<int> somefunction(){
int x = 5;
std::unique_ptr<int> result( new int );
*result = x;
return result;
}
This way, the unique_ptr will take care of deleting the memory that you allocated with new, and will help to eliminate potential memory leaks.
Your question mixes several different issues into one. In reality, the main question here is whether you really need that mixture.
There's no such thing as "returning a pointer" by itself. You don't "return a pointer" just because you want to "return a pointer". Returning pointers is done for some specific reason and that reason will dictate how it is done and what needs to be done in order to ensure that it works properly.
Your original example does not really illustrate that since in your original example there's simply no meaningful reason to return a pointer. It looks like you can simply return an int.
For example, in many cases you'll want to return a pointer because it is a pointer to a dynamically allocated object, i.e. an object whose lifetime is not subject to the scoping rules of the language. Note that the causal relationship in this case works in the opposite direction: you need a dynamic object -> you have to return a pointer. That way, not the other way around. In your example you seem to use it backwards: I want to return a pointer -> I have to allocate the object dynamically. That latter reasoning is fundamentally flawed, although one might see it used more often than one would expect.
If you really need a dynamically allocated object (for which, as I said above, the main reason is to override the scope-based lifetime rules of the language), then a matter of memory ownership becomes an issue. In order to know when this memory can/has to be deallocated and who has to deallocate it, you have to implement either exclusive (one designated owner at any moment) or shared (like reference counting) ownership scheme. It can be done with raw pointers, but a better idea would be to use various smart pointer classes provided by the libraries.
But in many situations you can also return pointers to non-dynamic objects (static or automatic), which is perfectly fine assuming that the lifetime of these pointers is the same or shorter than the lifetime of the objects they point to.
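Two hedged illustrations of returning a pointer to a non-dynamic object. The function names are hypothetical; the point is only that nobody has to delete anything as long as the pointed-to object outlives the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a non-owning pointer to the first element equal to `wanted`, or
// nullptr when there is none. The vector belongs to the caller, so the
// pointer is valid only while that vector is alive and not reallocated.
const int* findValue(const std::vector<int>& values, int wanted) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == wanted) return &values[i];
    }
    return nullptr;
}

// Returns a pointer to an object with static storage duration. It lives
// until the program ends, so this pointer never dangles.
const int* defaultValue() {
    static const int fallback = 5;
    return &fallback;
}

// Uses both: neither pointer is ever deleted, because neither is owned.
int readOrDefault(const std::vector<int>& values, int wanted) {
    const int* p = findValue(values, wanted);
    if (p == nullptr) p = defaultValue();
    assert(p != nullptr);
    return *p;
}
```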
In other words, the reasoning behind the decision to return a pointer is not really different between C and C++. It is more design/intent-related than language-related. It is just that C++ provides you with more tools to make your life easier once you already decided to return a pointer. (Which sometimes works as incentive for C++ programmers to overuse concealed pointers).
In any case, again, this is an issue of what functionality you are trying to implement. Once you know that, you can make a good decision about whether you should "return a pointer" or not. And if you finally decided to return a pointer, it will help you to choose the proper return method. That's how it works. Trying to think about it backwards ("I just want to return a pointer, but I don't have a real reason for it yet") will only produce academically useless answers, each of which can be shown to be "wrong" in some specific circumstances.
As you suspected and others clarified, the first method is clearly wrong. Although C++ is a systems language and there are some circumstances where you might want to do this (it would return a particular, relative location on the stack, on most systems), it's ALMOST NEVER right. The second method is much more sane.
However, neither method should be encouraged in C++. One of the main points of C++ is that you now have references and exceptions rather than just the pointers of C. So what you do is return a reference, and allow new to throw an exception up the stack, if the memory allocation fails.
Don't forget to delete the pointer after invoking your method, which is correct.
T *p = new T();
For the pointer on heap, there can be disastrous operations such as,
p++; // (1) scope missed
p = new T(); // (2) re-assignment
Either would result in memory leaks or crashes due to a wrong delete. Apart from using smart pointers, is it advisable to always make a heap pointer const?
T* const p = new T(); // now "p" is not modifiable
This question is in regards of maintaining good programming practice and coding style.
Well, about the only time I use raw heap pointers is when writing my own data structures, and if you used a const pointer for them, your data structure immediately becomes unassignable, which may or may not be what you want.
I hesitate to say always, but what you propose seems reasonable for many/most cases. Const correctness is something most C++ folks pay a fair bit of attention to in function parameters, but not so much in local (or even member) variables. We might be better off to do so.
There is one major potential problem I see with this:
after delete you should set the pointer to NULL to (help) prevent it being used in other parts of the code.
Const will not allow you to do this.
I believe it is unusual to use a pointer if it is not going to change. Perhaps because I hardly ever use new in normal code, but rely on container classes.
Otherwise it is generally a good idea to preserve const-correctness by making things const whenever possible.
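To make the trade-off concrete, here is a small sketch comparing a const raw pointer with a const std::unique_ptr. Widget and the two function names are hypothetical; the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>
#include <memory>

struct Widget { int x = 0; };   // hypothetical placeholder type

int constRawPointer() {
    Widget* const p = new Widget();   // p itself cannot change; *p still can
    assert(p->x == 0);                // value-initialized
    p->x = 1;
    // p++;                 // would not compile: p is const
    // p = new Widget();    // would not compile: p is const
    int result = p->x;
    delete p;               // still your job
    // p = nullptr;         // would not compile: const also forbids this
    return result;
}

int constUniquePointer() {
    // Cannot be reset, reassigned, or moved from.
    const std::unique_ptr<Widget> p(new Widget());
    p->x = 2;
    // p.reset();           // would not compile: reset() is a non-const member
    return p->x;            // the Widget is deleted when p goes out of scope
}
```

The const raw pointer stops the accidental p++ and re-assignment, but the delete is still manual and the pointer cannot be nulled afterwards. The const unique_ptr gives the same protection and handles the delete as well.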
Is this valid? An acceptable practice?
typedef vector<int> intArray;
intArray& createArray()
{
intArray *arr = new intArray(10000, 0);
return(*arr);
}
int main(int argc, char *argv[])
{
intArray& array = createArray();
//..........
delete &array;
return 0;
}
The behavior of the code will be your intended behavior. Now, the problem is that while you might consider that programming is about writing something for the compiler to process, it is just as much about writing something that other programmers (or you in the future) will understand and be able to maintain. The code you provided in many cases will be equivalent to using pointers for the compiler, but for other programmers, it will just be a potential source of errors.
References are meant to be aliases to objects that are managed somewhere else, somehow else. In general people will be surprised when they encounter delete &ref, and in most cases programmers won't expect having to perform a delete on the address of a reference, so chances are that in the future someone is going to call the function and forget about deleting, and you will have a memory leak.
In most cases, memory can be better managed by the use of smart pointers (if you cannot use other high level constructs like std::vectors). By hiding the pointer away behind the reference you are making it harder to use smart pointers on the returned reference, and thus you are not helping but making it harder for users to work with your interface.
Finally, the good thing about references is that when you read them in code, you know that the lifetime of the object is managed somewhere else and you need not to worry about it. By using a reference instead of a pointer you are basically going back to the single solution (previously in C only pointers) and suddenly extra care must be taken with all references to figure out whether memory must be managed there or not. That means more effort, more time to think about memory management, and less time to worry about the actual problem being solved -- with the extra strain of unusual code, people grow used to look for memory leaks with pointers and expect none out of references.
In a few words: having memory held by reference hides from the user the requirement to handle the memory and makes it harder to do so correctly.
Yes, I think it will work. But if I saw something like this in any code I worked on, I would rip it out and refactor right away.
If you intend to return an allocated object, use a pointer. Please!
It's valid... but I don't see why you'd ever want to do it. It's not exception safe, and std::vector is going to manage the memory for you anyway. Why new it?
EDIT: If you are returning new'd memory from a function, you should return the pointer, lest users of your function's heads explode.
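For comparison, a sketch of the version without new at all, where the vector is simply returned by value. The count parameter is added here only for illustration. In C++11 and later, returning a local vector moves it or elides the copy entirely, so the 10000 elements are not copied one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> intArray;

// The vector owns its heap buffer; returning it by value hands that buffer
// to the caller, and it is released when the caller's vector is destroyed.
intArray createArray(std::size_t count) {
    intArray arr(count, 0);
    assert(arr.size() == count);
    return arr;
}
```

The caller writes intArray array = createArray(10000); and there is nothing to delete.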
Is this valid?
Yes.
An acceptable practice?
No.
This code has several problems:
The guideline of designing for least surprising behavior is broken: you return something that "looks like" an object but must be deleted by the client code (that should mean a pointer - a reference should be something that always points to a valid object).
your allocation can fail. Even if you check the result in the allocating function, what will you return? An invalid reference? Do you rely on the allocation throwing an exception for such a case?
As a design principle, consider either creating a RAII object that is responsible for managing the lifetime of your object (in this case a smart pointer) or deleting the pointer at the same abstraction level that you created it:
typedef vector<int> intArray;
intArray& createArray()
{
intArray *arr = new intArray(10000, 0);
return(*arr);
}
void deleteArray(intArray& object)
{
delete &object;
}
int main(int argc, char *argv[])
{
intArray& array = createArray();
//..........
deleteArray(array);
return 0;
}
This design improves coding style consistency (allocation and deallocation are hidden and implemented at the same abstraction level) but it would still make more sense to work through a pointer than a reference (unless the fact that your object is dynamically allocated must remain an implementation detail for some design reason).
It will work but I'm afraid it's flat-out unacceptable practise. There's a strong convention in the C++ world that memory management is done with pointers. Your code violates this convention, and is liable to trip up just about anyone who uses it.
It seems like you're going out of your way to avoid returning a raw pointer from this function. If your concern is having to check repeatedly for a valid pointer in main, you can use a reference for the processing of your array. But have createArray return a pointer, and make sure that the code which deletes the array takes it as a pointer too. Or, if it's really as simple as this, simply declare the array on the stack in main and forego the function altogether. (Initialization code in that case could take a reference to the array object to be initialized, and the caller could pass its stack object to the init code.)
It is valid in the sense that it compiles and runs successfully. However, this kind of coding practice makes the code harder for readers and maintainers because of:
Manual memory management
Vague ownership transfer to client side
But there is a subtle point in this question: the efficiency requirement. Sometimes we cannot return by value because the object might be too big, as in this example (10000 * sizeof(int)). For that reason, we should use pointers if we need to transfer objects to different parts of our code. But this doesn't mean the above implementation is acceptable, because for this kind of requirement there is a very useful tool: smart pointers. So the design decision is up to the programmer, but for this kind of specific implementation detail, the programmer should use accepted patterns such as smart pointers.
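A minimal sketch of the smart-pointer variant of the code from the question. The function useArray is a hypothetical caller added for illustration; createArray keeps the original name but returns an owning std::unique_ptr instead of a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

typedef std::vector<int> intArray;

// Ownership is visible in the return type: the caller owns the array.
std::unique_ptr<intArray> createArray() {
    return std::unique_ptr<intArray>(new intArray(10000, 0));
}

std::size_t useArray() {
    std::unique_ptr<intArray> array = createArray();
    assert(array != nullptr);
    intArray& ref = *array;   // a reference is fine for non-owning access
    ref[0] = 1;
    return ref.size();
}   // array goes out of scope here and the vector is deleted automatically
```

Only the pointer is transferred, so the cost is the same as with the raw pointer, but there is no delete &array for anyone to forget.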