I am currently working on a program that simulates incremental garbage collection. The user enters the amount of physical memory to be assigned, and then enters commands such as x = alloc(10MB), which means I have to allocate 10MB of that physical memory to the object "x". So I will need a start pointer and an end pointer for the memory block.
My question: What would be the best way to reserve this "physical memory"? I came across malloc and calloc, but many people recommend not using them unless necessary. There is also the new operator. Is there a better way? This physical memory stays fixed in size for the whole run of the program.
Any help is appreciated.
In C++, instead of allocating raw memory or even using raw pointers, it's generally encouraged to use the collection classes of the C++ standard library, such as std::vector. If you're writing a simulation for garbage collection, I can imagine that you design a class for unified access to GC-managed memory, and then you allocate a vector of them.
If, however, this is not the case, you can (although in C++, you really shouldn't) use malloc(), or calloc() if you need zero-initialized memory.
Also, is there a way I can make another pointer point to the end address of the block?
Sure, use pointer arithmetic.
T *ptr = static_cast<T *>(malloc(sizeof(*ptr) * N_ELEMENTS)); // C++ needs the cast from void*
T *endPastOne = ptr + N_ELEMENTS; // one past the last element
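For the simulation described in the question, one possible sketch keeps the fixed "physical memory" in a std::vector of bytes and hands out regions by advancing an offset. The names Arena, alloc, begin, end and arena_demo are hypothetical, and the sketch only allocates; it does not collect anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size arena standing in for the simulated "physical memory".
class Arena {
public:
    explicit Arena(std::size_t bytes) : storage_(bytes), used_(0) {}

    // Returns a pointer to `bytes` bytes inside the arena, or nullptr if it does not fit.
    std::uint8_t* alloc(std::size_t bytes) {
        if (bytes > storage_.size() - used_) return nullptr;
        std::uint8_t* p = storage_.data() + used_;
        used_ += bytes;
        return p;
    }

    std::uint8_t* begin() { return storage_.data(); }                   // start pointer
    std::uint8_t* end() { return storage_.data() + storage_.size(); }   // one past the last byte
    std::size_t used() const { return used_; }

private:
    std::vector<std::uint8_t> storage_; // never resized after construction
    std::size_t used_;                  // bytes handed out so far
};

// Example use: x = alloc(10) followed by y = alloc(20) in a 1024-byte arena.
void arena_demo() {
    Arena mem(1024);
    std::uint8_t* x = mem.alloc(10);
    std::uint8_t* y = mem.alloc(20);
    assert(x == mem.begin());
    assert(y == x + 10);
    assert(mem.end() - mem.begin() == 1024);
}
```

Because the vector is sized once and never resized, the start and end pointers stay valid for the lifetime of the Arena object.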
I understand pointer allocation of memory fully, but deallocation of memory only on a higher level. What I'm most curious about is how C++ keeps track of what memory has already been deallocated?
int* ptr = new int;
cout << ptr;
delete ptr;
cout << ptr;
// still pointing to the same place however it knows you can't access it or delete it again
*ptr; // BAD
delete ptr; // BAD
How does C++ know I deallocated that memory? If it just turns it to arbitrary garbage binary numbers, wouldn't I just be reading in that garbage number when I dereference the pointer?
Instead, of course, C++ knows that these are segfaults somehow.
C++ does not track memory for you. It doesn't know, it doesn't care. It is up to you: the programmer. (De)allocation is a request to the underlying OS. Or more precisely it is a call to libc++ (or possibly some other lib) which may or may not access the OS, that is an implementation detail. Either way the OS (or some other library) tracks what parts of memory are available to you.
When you try to access memory that the OS did not assign to you, the OS will issue a segfault (technically it is raised by the CPU, assuming it supports memory protection; it's a bit complicated). And this is a good situation. That way the OS tells you: hey, you have a bug in your code. Note that the OS doesn't care whether you use C++, C, Rust or anything else. From the OS's perspective everything is machine code.
However, what is worse is that even after delete the memory may still be owned by your process (remember those libs that track memory?). So accessing memory through such a pointer is undefined behaviour: anything can happen, including correct execution of the code (that's why such bugs are often hard to find).
If it just turns it to arbitrary garbage binary numbers, wouldn't I just be reading in that garbage number when I dereference the pointer?
Who says it turns into garbage? What really happens to the underlying memory (whether the OS reclaims it, or it is filled with zeros or some garbage, or maybe nothing) is none of your concern. Everything you need to know is that after delete it is no longer safe to use the pointer. Even (or especially) when it looks ok.
How does C++ know I deallocated that memory.
When you use a delete expression, "C++ knows" that you deallocated that memory.
If it just turns it to arbitrary garbage binary numbers
C++ doesn't "turn [deallocated memory] to arbitrary garbage binary numbers". C++ merely makes the memory available for other allocations. Changing the state of that memory may be a side effect of some other part of the program using that memory - which it is now free to do.
wouldn't I just be reading in that garbage number when I dereference the pointer?
When you indirect through the pointer, the behaviour of the program is undefined.
Instead, of course, C++ knows that these are segfaults somehow.
This is where your operating system helpfully stepped in. You did something that did not make sense, and the operating system killed the misbehaving process. This is one of the many things that may but might not happen when the behaviour of the program is undefined.
I take it that you wonder what delete actually does. Here it is:
First of all, it destructs the object. If the object has a destructor, it is called, and does whatever it is programmed to do.
delete then proceeds to deallocate the memory itself. This means that the deallocator function (::operator delete() in most cases in C++) typically takes the memory object, and adds it to its own, internal data structures. I.e. it makes sure that the next call to ::operator new() can find the deallocated memory slab. The next new might then reuse that memory slab for other purposes.
The entire management of memory happens by using data structures that you do not see, or need to know that they exist. How an implementation of ::operator new() and ::operator delete() organizes its internal data is strictly and fully up to the implementation. It doesn't concern you.
What concerns you is that the language standard makes any access to a memory object undefined behavior after you have passed it to the delete operator. Undefined behavior does not mean that the memory needs to vanish magically, or that it becomes inaccessible, or that it is filled with garbage. Usually none of these happens immediately, because making the memory inaccessible or filling it with garbage would require explicit action from the CPU, so implementations generally don't touch what's written in the memory. You are just forbidden to make further accesses, because it's now up to the system to use the memory for any other purpose it likes.
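To make the bookkeeping described above more concrete, here is a toy sketch of a fixed-block pool with a free list. It is not how any particular ::operator new() is implemented; ToyPool, its block size and its block count are assumptions chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical pool of four equally sized blocks managed through a free list.
class ToyPool {
public:
    ToyPool() : free_list_(nullptr) {
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < kBlocks; ++i) {
            blocks_[i].next = free_list_;
            free_list_ = &blocks_[i];
        }
    }

    void* allocate() {
        if (free_list_ == nullptr) return nullptr; // pool exhausted
        Block* b = free_list_;
        free_list_ = b->next;                      // pop the first free block
        return b;
    }

    void deallocate(void* p) {
        // Only the link at the start of the block is written;
        // the remaining bytes keep whatever the user stored there.
        Block* b = static_cast<Block*>(p);
        b->next = free_list_;                      // push the block back
        free_list_ = b;
    }

private:
    static const std::size_t kBlocks = 4;
    union Block {
        Block* next;                                         // used while the block is free
        alignas(std::max_align_t) unsigned char payload[32]; // used while it is allocated
    };
    Block blocks_[kBlocks];
    Block* free_list_;
};

// A freed block is handed out again by the next allocation.
void pool_demo() {
    ToyPool pool;
    void* a = pool.allocate();
    pool.deallocate(a);
    void* b = pool.allocate();
    assert(a == b);
}
```

In this sketch a stale pointer to a freed block still points at perfectly accessible memory, which is one reason use-after-free bugs can go unnoticed.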
C++ still has a strong C heritage when it comes to memory addressing. And C was invented to build an OS (the first version of Unix), where it makes sense to use well-known register addresses or to perform other low-level operations. That means that when you address memory through a pointer, you as the programmer are supposed to know what lies there, and the language just trusts you.
On common implementations, the language runtime requests chunks of memory from the OS for new dynamic objects, and keeps track of used and unused memory blocks. The goal is to reuse free blocks for new dynamic objects instead of asking the OS for each and every allocation and deallocation.
Still on common implementations, allocating or deallocating a block changes nothing in the block itself; only the pointers that maintain the list of free blocks change. AFAIK few implementations return memory to the OS before the end of the process. But a freed block can later be reused. That is why, when a careless programmer accesses a block that used to contain pointers and has since been reused, a SEGFAULT is not far away: the program may try to use arbitrary addresses that are not mapped for the process.
BTW, the only thing required by the standard is that accessing an object past the end of its lifetime (here, using the pointer after the delete statement) invokes Undefined Behaviour. Said differently, anything can happen, from an immediate crash to normal results, including a later crash or abnormal results in unrelated places of the program.
Getting straight to the point: What is the reason for needing to allocate memory in c++?
I understand some programming languages do it automatically, but in C/C++: what is the reason for having to allocate memory. For example:
When declaring PROCESSENTRY32, why do we need to ZeroMemory() it? When making a buffer for a sockets program, why do we need to ZeroMemory() it? Why don't you need to allocate memory when you declare an int data type?
Your question doesn't really make sense. ZeroMemory doesn't allocate memory; it just, well, sets bytes to 0. You can easily ZeroMemory an int, if you want. It's just that i = 0; is shorter to write.
In all cases ZeroMemory only works on memory that already exists; i.e. something else must have allocated it before.
As for actual allocation, C distinguishes three kinds of storage for objects:
1. Static storage. These objects are allocated when the program starts and live for as long as the program runs. Example: Global variables.
2. Automatic storage. These objects are allocated when execution reaches their scope and deallocated when execution leaves their containing scope. Example: Local variables.
3. Dynamic storage. This is what you manage manually by calling malloc / calloc / realloc / free.
The only case where you really have to allocate memory yourself is case #3. If your program only uses automatic storage, you don't have to do anything special.
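As a small illustration of the three kinds of storage, here is a sketch written so that it compiles as C++; the names g_counter and storage_demo are made up for the example.

```cpp
#include <cassert>
#include <cstdlib>

int g_counter = 0; // static storage: exists for the whole run of the program

int storage_demo() {
    int local = 1; // automatic storage: released when the function returns

    // Dynamic storage: exists until free() is called on it.
    int* heap = static_cast<int*>(std::malloc(sizeof(int)));
    assert(heap != nullptr); // sketch only: treat allocation failure as fatal
    *heap = 2;

    g_counter += local + *heap;
    std::free(heap); // the only one of the three that needs manual cleanup
    return g_counter;
}
```

Each call adds 3 to g_counter, because the static variable keeps its value between calls while local and the heap object are created afresh every time.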
In languages like Java, you still have to allocate memory by calling new. Python doesn't have new, but e.g. whenever you execute something like [...] or {...}, it creates a new list/dictionary, which allocates memory.
The crucial part is really that you don't have to deallocate memory.
Languages like Java or Python include a garbage collector: You create objects, but the language takes care of cleaning up behind you. When an object is no longer needed [1], it is deallocated automatically.
C doesn't do that. The reasons lie in its history: C was invented as a replacement for assembler code, in order to make porting Unix to a new computer easier. Automatic garbage collection requires a runtime system, which adds complexity and can have performance issues (even modern garbage collectors sometimes pause the whole program in order to reclaim memory, which is undesirable, and C was created back in 1972).
Not having a garbage collector makes C:
- easier to implement
- easier to predict
- potentially more efficient
- able to run on very limited hardware
C++ was meant to be a "better C", targeting the same kind of audience. That's why C++ kept nearly all of C's features, even those that are very unfriendly to automatic garbage collection.
[1] Not strictly true. Memory is reclaimed when it is no longer reachable. If the program can still reach an object somehow, it will be kept alive even if it's not really needed anymore (see also: Space leak).
C chooses to be a relatively low-level language where language constructs more or less directly map to at most a few machine instructions.
Block level allocations such as in
int main()
{
    int a, b, c; // a very cheap allocation on the stack
    // ... do something with a, b, and c
}
fall within this category, as all block-level allocations together in a function will normally translate to just a single subtraction from the stack pointer.
The downside of these allocations is that they're very limited -- you shouldn't allocate big objects or multiple objects like this (or you risk stack overflow) and they're not very persistent either--they're effectively undone at the end of the scope.
As for generic allocations from main memory, the machine doesn't really offer you much apart from a big array of char (i.e., your RAM) and possibly some virtual memory mapping facilities (i.e., mapping real memory into smaller arrays of char). There are multiple ways for slicing up these arrays and for using and reusing the pieces, so C leaves this to the libraries. C++ takes after C.
Is there any way to distinguish two following situations at run time:
double * ptr = new double(3.14159);
double variable = 3.14159;
double * testPtr_1 = ptr;
double * testPtr_2 = &variable;
delete testPtr_1; // fine...
delete testPtr_2; // BIG RUN TIME ERROR !!!
I have found myself in a situation in which I need to call the delete operator on some unknown pointer. The pointer can point anywhere (to a "local" variable or to a dynamically allocated variable).
How can I find out where my "unknown" pointer points, and therefore choose when to and when not to call operator delete on it?
EDIT:
OK, I see that everyone is pointing me to smart pointers, but what if I am trying to write my own set of smart pointers (that is the reason behind my question)?
There is no way to test if a pointer is pointing to a memory area that would be valid to delete. Moreover:
- There is no way to tell between pointers that must be freed with delete vs. delete[],
- There is no way to tell between the pointers that have been freed and pointers that have not been freed,
- There is no way to tell among pointers to an automatic variable, pointers to a static variable, and pointers to dynamically allocated blocks.
The approach that you should take is tracking allocations/deallocations by some other means, such as storing flags along with your pointers. However, this is rather tedious: a much better practice is to switch to smart pointers, which would track resources for you.
You need to set some better coding practices for yourself (or for your project).
Especially since most platforms have, at the very least, a C++11-compliant compiler, there's no reason not to be using the following paradigm:
Raw Pointers (T*) should ONLY be used as non-owning pointers. If you receive a T* as the input for a function or constructor, you should assume you have no responsibility for deleting it. If you have an instance or local variable that is a T*, you should assume you have no responsibility for deleting it.
Unique Pointers (std::unique_ptr<T>) should be used as single-ownership pointers, and in general, these should be your default go-to choice for any situation where you need to dynamically allocate memory. std::make_unique<T>() should be preferred for creating any kind of Unique Pointer, as this prevents you from ever seeing the raw pointer in use, and it prevents issues like you described in your original post.
Shared Pointers (std::shared_ptr<T> and std::weak_ptr<T>) should ONLY be used in situations where it is logically correct to have multiple owners of an object. These situations occur less often than you think, by the way! std::make_shared<T>() is the preferred method of creating Shared Pointers, for the same reasons as std::make_unique, and also because std::make_shared can perform some optimizations on the allocations, improving performance.
Vectors (std::vector<T>) should be used in situations where you need to allocate multiple objects into heap space, the same as if you called new T[size]. There's no reason to use pointers at all except in very exotic situations.
It should go without saying that you need to take my rules of "ONLY do 'x'" with a grain of salt: Occasionally, you will have to break those rules, and you might be in a situation where you need a different set of rules. But for 99% of use-cases, those rules are correct and will best convey the semantics you need to prevent memory leaks and properly reason about the behavior of your code.
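The rules above could look roughly like this in code. This is only a sketch: Widget, read_value and ownership_demo are invented names, and std::make_unique assumes a C++14 (or later) compiler.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Widget { int value; }; // hypothetical type used for illustration

// Raw pointer as a non-owning view: the callee never deletes it.
int read_value(const Widget* w) { return w->value; }

void ownership_demo() {
    // Single owner: released automatically when `owner` goes out of scope.
    std::unique_ptr<Widget> owner = std::make_unique<Widget>();
    owner->value = 1;
    assert(read_value(owner.get()) == 1);

    // Shared ownership: released when the last shared_ptr goes away.
    std::shared_ptr<Widget> a = std::make_shared<Widget>();
    std::shared_ptr<Widget> b = a;
    assert(a.use_count() == 2);

    // Several objects on the heap: a vector instead of new Widget[3].
    std::vector<Widget> many(3);
    many[0].value = 3;
    assert(many.size() == 3);
} // everything is released here without a single explicit delete
```

read_value also works for a Widget with automatic storage, for example by passing &w for a local Widget w, precisely because it never tries to delete what it receives.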
You cannot.
Avoid raw pointers and use smart pointers, particularly std::unique_ptr. It conveys clearly who is responsible for deleting the object, and the object will be deleted when the std::unique_ptr goes out of scope.
When creating objects, avoid using new. Wrap them in a smart pointer directly and do not take addresses of anything to wrap it in a smart pointer. This way, all raw pointers will never need freeing and all smart pointers will get cleaned up properly when their time has come.
Okay, some things you can distinguish in a very platform-specific, implementation-defined manner. I won’t go into details here, because it’s essentially insane to do (and, again, depends on the platform and implementation), but you are asking for it.
Distinguish local, global and heap variables. This is actually possible on many modern architectures, simply because those three are different ranges of the address space. Global variables live in the data section (as defined by the linker and run-time loader), local variables on the stack (usually at the end of the address space) and heap variables live in memory obtained during run-time (usually not at the end of the address space and of course not overlapping the data and code sections, a.k.a. "mostly everything else"). The memory allocator knows which range that is and can tell you details about the blocks in there, see below.
Detect already-freed variables: you can ask the memory allocator that, possibly by inspecting its state. You can even find out when a pointer points into an allocated region and then find out the block to which it belongs. This is, however, probably computationally expensive to do.
Distinguishing heap and stack is a bit tricky. If your stack grows large and your program is running long and some piece of heap has been returned to the OS, it is possible that an address which formerly belonged to the heap now belongs to the stack (and the opposite may be possible too). So as I mentioned, it is insane to do this.
You can't, at least not reliably. This is why owning raw pointers are dangerous: they do not couple the lifetime of the object to the pointer, but instead leave it up to you, the programmer, to know all the things that could happen and to prepare for them all.
This is why we have smart pointers now. These pointers couple the lifetime to the pointer, which means the object is only deleted once it is no longer in use anywhere. This makes dealing with pointers much more manageable.
The C++ Core Guidelines suggest that a raw pointer should never be deleted, as it is just a view. You are just using it like a reference, and its lifetime is managed by something else.
OK, I see that everyone is pointing me to smart pointers, but what if I am trying to write my own set of smart pointers (that is the reason behind my question)?
In that case, do as the standard smart pointers do and take a deleter, which you default to just using delete. That way, if the user of the class wants to pass in a pointer to a stack object, they can specify a do-nothing deleter and your smart pointer will use that and, well, do nothing. This puts the onus on the person using the smart pointer to tell the pointer how to delete what it points to. Normally they will never need anything other than the default, but if they happen to use a custom allocator and need a custom deallocator, they can supply one using this method.
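A minimal sketch of that idea follows, assuming a hand-written class called OwningPtr and a deleter called NoOpDeleter (both names are made up here). The default deleter is std::default_delete, which simply calls delete.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal smart pointer that takes its deleter as a template parameter.
template <typename T, typename Deleter = std::default_delete<T>>
class OwningPtr {
public:
    explicit OwningPtr(T* p, Deleter d = Deleter()) : ptr_(p), deleter_(d) {}
    ~OwningPtr() { if (ptr_) deleter_(ptr_); }

    // Non-copyable, so the pointee is never released twice.
    OwningPtr(const OwningPtr&) = delete;
    OwningPtr& operator=(const OwningPtr&) = delete;

    T& operator*() const { return *ptr_; }

private:
    T* ptr_;
    Deleter deleter_;
};

// Deleter for objects the smart pointer must not destroy, e.g. local variables.
struct NoOpDeleter {
    template <typename T>
    void operator()(T*) const {}
};

double deleter_demo() {
    double variable = 3.14159;
    OwningPtr<double> heap(new double(3.14159));      // default deleter calls delete
    OwningPtr<double, NoOpDeleter> stack(&variable);  // destructor does nothing
    assert(*stack == variable);
    return *heap + *stack;
}
```

The smart pointer still cannot detect where a pointer came from; it only does what the caller told it to do through the deleter type.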
Actually, you can, but at the cost of some memory overhead.
You overload the new and delete operators, keep track of every allocation, and store the addresses somewhere (as void *).
#include <algorithm>
#include <cstdlib>
#include <iostream>
using namespace std;

// Stores the addresses handed out by the overloaded operator new (really malloc).
// Capacity is fixed at 100 entries and is not checked.
void** memoryTrack = (void**)malloc(sizeof(void*) * 100);
int cnt = 0; // number of tracked allocations

// Overloaded operator new
void* operator new(size_t stAllocateBlock) {
    cout << "in new";
    void* ptr = malloc(stAllocateBlock); // allocate memory using malloc
    memoryTrack[cnt] = ptr;              // store it in memoryTrack
    cnt++;                               // increment counter
    return ptr;                          // return address generated by malloc
}

// Prints every tracked address
void display()
{
    for (int i = 0; i < cnt; i++)
        cout << memoryTrack[i] << endl;
}

int main()
{
    double* ptr = new double(3.14159);
    double variable = 3.14159;
    double* testPtr_1 = ptr;
    double* testPtr_2 = &variable;
    delete testPtr_1; // fine...
    delete testPtr_2; // rejected by the overloaded operator delete
    return 0;
}
Now the most important function (you will have to work on this because it is not complete):
void operator delete(void* pvMem)
{
    // Just printing the address to be searched in our memoryTrack
    cout << pvMem << endl;
    // If found, free the memory
    if (find(memoryTrack, memoryTrack + cnt, pvMem) != memoryTrack + cnt)
    {
        // cout << *(find(memoryTrack, memoryTrack + cnt, pvMem));
        cout << "Can be deleted\n";
        free(pvMem);
        // After this make that location of memoryTrack as NULL
        // Also keep track of indices that are NULL
        // So that you can insert next address there
        // Or better yet implement linked list (Sorry was too lazy to do)
    }
    else
        cout << "Don't delete memory that was not allocated by you\n";
}
Output
in new
0xde1360
0xde1360
Can be deleted
0xde1360
0x7ffe4fa33f08
Dont delete memory that was not allocated by you
0xde1360
Important Note
This is just the basics, and just enough code to get you started.
Open for others to edit and make necessary changes/optimizations.
STL containers cannot be used here, because they use the new operator themselves (if someone can make them work, please do; it would help to shorten and optimize the code).
Hello
I want to create my own dynamic array (vector) class, but I don't know how to allocate memory at the address I point to. In the function add I added a line like:
int * object = new (this->beginning + this->lenght) int (paramValue); But Visual Studio shows the error message "operator new cannot be called with the given arguments". How can I make this work? Which arguments should I pass to the new operator?
(I am not sure to understand your question, but....)
You might want to use the placement new operator (but to implement a <vector> like thing you don't need that). Then you'll need to #include <new>
But you probably don't need that. Just call plain new from your constructor, and plain delete from your destructor. Something like int*arr = new int[length]; (in constructor) and later delete[] arr; (in destructor).
(It looks like you are misunderstanding something; I recommend spending several days reading a good C++ programming book.)
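A minimal sketch of the plain new[] / delete[] approach suggested above; the class name IntArray, its initial capacity of 4 and the doubling growth policy are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical growable array of int built on plain new[] / delete[].
class IntArray {
public:
    IntArray() : data_(new int[4]), size_(0), capacity_(4) {}
    ~IntArray() { delete[] data_; }

    // Copying is disabled to keep the sketch short and free of double deletes.
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;

    void add(int value) {
        if (size_ == capacity_) grow();
        data_[size_++] = value;
    }

    int operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    // Allocates a block twice as large, copies the elements, frees the old block.
    void grow() {
        std::size_t new_capacity = capacity_ * 2;
        int* bigger = new int[new_capacity];
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
        delete[] data_;
        data_ = bigger;
        capacity_ = new_capacity;
    }

    int* data_;
    std::size_t size_;
    std::size_t capacity_;
};

void int_array_demo() {
    IntArray arr;
    for (int i = 0; i < 10; ++i) arr.add(i * i); // forces two reallocations
    assert(arr.size() == 10);
    assert(arr[9] == 81);
}
```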
how to allocate memory on address whom I point to
Insufficient information -- what kind of system? custom hardware? OS?
On a desktop, you could use 2 steps. You allocate a block of bytes using something like:
uint8_t* myMemoryBlock = new uint8_t[1000]; // 1000 byte block
Then you might contemplate using placement new at the address "you point to" using 'myMemoryBlock', with a cast.
On a desktop, the dynamic memory system can be used this way...
But if you are planning to create a user defined type any way, just new that type, and let the dynamic memory fall where it may, as opposed to positioning it on myMemoryBlock.
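For reference, placement new on a pre-allocated block might look like the sketch below. The error from the question is typical of a missing #include <new>, although that is a guess, since the full code is not shown. The buffer size and the name placement_demo are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new> // declares the placement form of operator new

int placement_demo() {
    // Raw storage for 4 ints; alignas keeps it suitably aligned for int.
    alignas(int) unsigned char block[4 * sizeof(int)];

    int* first = new (block) int(10);                // construct at the start of the block
    int* second = new (block + sizeof(int)) int(20); // construct directly behind it

    assert(static_cast<void*>(first) == static_cast<void*>(block));

    // int has a trivial destructor, so no explicit destructor call is needed;
    // for class types you would call the destructor yourself before reusing the storage.
    return *first + *second;
}
```

Placement new does not allocate anything: it only constructs an object in storage that already exists, so there is no matching delete for first or second.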
On a desktop, there is (generally) no other memory your user-privilege-level executable can access beyond what it obtains through mechanisms such as 'new'. All other memory is protected.
mmap on Linux maps devices or files into your executable's memory range. I am unfamiliar with such devices, but I have used mmap with files.
update 2017/03/19
Note 1 - user-privilege level tasks are typically blocked from accessing other / special memory.
Note 2 - memory addresses, such as 'myMemoryBlock' above, are virtual, not physical. This includes code addresses, automatic memory addresses and dynamic memory addresses. If your processor has memory management hardware support, your code needs special effort to access physical addresses, in memory or otherwise.
On a single-board-computer (SBC), (with or without an OS) I would expect that the address you wish to 'allocate' will not be within the 'dynamic' memory set up by the board support package (BSP).
On this kind of embedded system (on a SBC), someone (an architect) has 'mapped' this 'special' memory to an address range not in use for other purposes (i.e. not part of dynamic memory). Here, you simply find out what the address is, and use it by casting the uintXX_t value to a pointer of appropriate type. Something like:
myDataType* p = reinterpret_cast<myDataType*>(premappedAddress);
For more info, you should seek out other sites discussing embedded systems.
I heard that Python has automated "garbage collection" , but C++ does not. What does that mean?
Try reading up on it.
That means that a Python user doesn't need to clean up dynamically created objects, as you are obliged to do in C/C++.
Example in C++:
char *ch = new char[100];
ch[0] = 'a';
ch[1] = 'b';
//....
// somewhere else in your program you need to release the allocated memory.
delete[] ch;
// use "delete ch;" instead if ch was initialized with "new char;"
in python:
def fun():
    a = [1, 2]  # dynamic allocation
    a.append(3)
    return a[0]
Python takes care of the "a" object by itself.
From Wikipedia http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29:
...
Garbage collection frees the programmer from manually dealing with memory allocation and deallocation. As a result, certain categories of bugs are eliminated or substantially reduced:
Dangling pointer bugs, which occur when a piece of memory is freed while there are still pointers to it, and one of those pointers is used.
Double free bugs, which occur when the program attempts to free a region of memory that is already free.
Certain kinds of memory leaks, in which a program fails to free memory that is no longer referenced by any variable, leading, over time, to memory exhaustion.
...
The basic principles of garbage collection are:
Find data objects in a program that cannot be accessed in the future
Reclaim the resources used by those objects
Others already answered the main question, but I'd like to add that garbage collection is possible in C++. It's not as automatic as Python's, but it's doable.
Smart pointers are probably the simplest form of C++ garbage collection: std::auto_ptr, boost::scoped_ptr and boost::scoped_array release the memory they own when they are destroyed. (std::auto_ptr was deprecated in C++11 and removed in C++17 in favour of std::unique_ptr.) There's an example in one of the earlier answers that could be rewritten as:
boost::scoped_array<char> ch(new char[100]);
ch[0] = 'a';
ch[1] = 'b';
// ...
// boost::scoped_array will be destroyed when out of scope, or during unwind
// (i.e. when exception is thrown), releasing the array's memory
There are also boost::shared_ptr, boost::shared_array that implement reference counting (like Python). And there are full-blown garbage collectors that are meant to replace standard memory allocators, e.g. Boehm gc.
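The reference counting mentioned above can be observed directly with std::shared_ptr, the standard-library counterpart of boost::shared_ptr. In this sketch, Tracked and refcount_demo are made-up names; the destructor sets a flag so the moment of destruction becomes visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records its own destruction through a flag.
struct Tracked {
    explicit Tracked(bool* flag) : destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
    bool* destroyed;
};

bool refcount_demo() {
    bool destroyed = false;
    {
        std::shared_ptr<Tracked> first = std::make_shared<Tracked>(&destroyed);
        {
            std::shared_ptr<Tracked> second = first; // second owner, count goes to 2
            assert(first.use_count() == 2);
        }                                            // second is gone, count back to 1
        assert(first.use_count() == 1);
        assert(!destroyed);                          // object is still alive
    }                                                // last owner gone, object destroyed
    return destroyed;
}
```

Unlike a tracing collector, plain reference counting does not reclaim objects that refer to each other in a cycle; std::weak_ptr exists to break such cycles.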
It basically refers to how each language handles memory resources. When you need memory, you usually ask the OS for it and later give it back.
With Python you don't need to worry about giving it back. With C++ you need to keep track of what you asked for and return it. One is easier, the other more performant; you choose your tool.
Now that you have your answer, it's also worth knowing the downsides of automated garbage collection:
it can require a significant amount of extra memory, and it is usually not suitable for applications with hard real-time deadlines.