How operator new knows that memory is allocated [duplicate] - c++

This question already has answers here:
How do malloc() and free() work?
(13 answers)
Closed 7 years ago.
In C++, how may operator new save information that a piece of memory is allocated? AFAIK, it does not work for constant time and have to search for free memory in heap. Or, maybe, it is not about C++, but about OS?
P.S. I do not know whether it is specified by standard or not, whether it is managed by OS or by C++, but how may it in fact be implemented?

There's no simple, standard answer. Most implementations of operator
new/operator delete ultimately forward to malloc/free, but there are a
lot of different algorithms which can be used for those. The only thing that's
more or less universal is that allocation will typically allocate a little bit
more than requested, and use the extra memory (normally in front of the address
actually returned) to maintain some additional information: either the actual
size allocated or a pointer to the end of the block (or the start of the next
block). Except that some algorithms will keep allocations of the same size
together, and be able to determine the size from the address. There is no
single answer to your question.

new is oftentimes implemented on basis of malloc/free.
How does malloc/free implement it? The answer is: It depends on the implementation. Surprisingly: Malloc oftentimes does not keep track of the allocated blocks at all! The only thing, malloc is doing most of the time, is adding a little bit of information containing the size of the block "before" the allocated block. Meaning, that when you allocate 40 bytes, it will allocate 44 bytes (on 32bit machines) and writes the size in the first 4 bytes. It will return the address of this chunk+4 to you.
Malloc/free keeps track of a freelist. A freelist is a list of freed memory chunks that is not (yet) be given back to the operating system. Malloc searches the freelist, when a new block is needed and when a fitting block is available uses that.
But a more exhausting answer about malloc/free, I have given here:
How do malloc() and free() work?
One additional information:
One implication of the fact, that many allocators don't track allocated blocks: When you return memory by free or delete and pass a pointer in, that was not allocated before, you will corrupt your heap, since the system is not able to check if the pointer is valid. The really ugly thing about it is, that in such a case, your program will not dump immediately, but any time after the error-cause occured ... and thus this error would be really ugly to find. That is one reason, memory handling in C/C++ is so hard!

new maintains a data structure to keep track of individually allocated blocks. There are plenty of ways for doing that. Usually, some kind of linked list is used.
Here a small article to illustrate this.

Related

Realloc only if the memory address doesn't change [duplicate]

This question already has answers here:
realloc without freeing old memory
(7 answers)
Closed 2 years ago.
Is it possible to reallocate more space only if the address stays the same? Like a type of realloc that fails if it cannot do that and would have to return a new address.
While putting final optimizing touches on my specialized pod container, using realloc does yield a reasonable performance boost in my testing but I cannot invalidate pointers to the data during the lifetime of the container and thus cannot leave this up to chance and good luck.
This is generally not feasible in a dynamic memory allocation scheme. An allocator could guarantee memory at the same address would be available only by reserving as much memory as might ever be required at that address to start with. If it does not reserve that much memory for allocation A, then some subsequent allocation B may be placed at some place after A earlier than that maximum potential reservation, and then A could never be enlarged beyond B, since B would be in the way.
One possible way this might be implemented is in a huge address space where each allocation could be given all the virtual address space it might ever need but have physical memory mapped only for the space that has been currently requested. An implementation-dependent custom allocator could be implemented for that.

When delete [] pointer works, why can't you get the size of the array pointed to?

A common way to use a heap-allocated array is:
SomeType * arr = new SomeType[15454];
//... somewhere else
delete [] arr;
In order to do delete [] arr the C runtime has to know the length of the memory buffer associated with the pointer. Am I right?
So in principle it should be possible to access the information somehow? Could it be accessed using some library? I'm just wondering. I understand that it is not a core part of the language so it would be platform dependent.
You get it right. The information is there. But there is no standard way of obtaining it.
If you are using windows, there is an _msize() method, which might give you the size of the memory block, though it may not necessarily be accurate. (The reported memory block size may be rounded up to the closest larger alignment point.) See MSDN -
_msize
If this is something that you really must have, you can try your luck with overriding new, allocating a slightly larger memory block, storing its size in the beginning, and returning a pointer to the byte after the size. Then you can write your own msize() which returns that size. Of course you will need to also override delete. But it is too much hassle, and it is best to avoid it if you can. If that way you go, only pain will you find.
The information exists. Unfortunately, the standart does not specify how dynamic memory should be allocated, nor how the size of the allocated block could be extracted.
That mean that each implementation can do what it wants. Classical ways are:
an allocation table storing all allocated/free blocks with their begin and size - simple to implement except for searches in the table
reserved zones before and after dynamically allocated memory zones - the implementation actually allocates zones consisting in: preamble - dynamic_memory - postamble. The preamble/postamble contains linking informations to other zones, size and status. At deallocation time, the preamble/postamble integrity can be controlled to optionnaly emit a warning for probable memory overwrite. The preamble is the memory preceding the dynamic memory presented to the program.
But as nothing is specified, you will have to dig in the internals of your implementation. Normally reading the source of malloc/free is the best source of information.
The truth is delete[] does not know the exact size of an array you have allocated, but it knows how much memory was allocated with the corresponding call to new[]. Often no excess memory is allocated, so the two numbers match. However, you cannot rely on it. This is not part of the standard because there is no reliable way of knowing the size of a dynamically allocated array.

Find which heap an address belongs to?

I'm creating a memory management system and i need a way to find in which heap an allocation I make is.
for example i use HeapAlloc and use the heap returned by GetProcessHeap() as the heap to allocate to I would expect it to allocate to that heap, but appears as though it doesn't.
When I use GetProcessHeaps to run through the heaps i find that the process heap is at something like 0x00670000 and my allocated address is at like 0x0243a385 or something. (in other words nowhere near it)
And sometimes it can actually be before it (so like 0x004335ab or something)
So, i'd like to know if there is a way I can reliably get the starting address of the heap (and the end address if at all possible!?) that i made the allocation in.
Your understanding of heaps is wrong. In general, modern heaps do not rely on allocating a large chunk of data and then parcelling it up with each allocation as you assume (although they may use this as one of their strategies). This means there is no well defined 'start' or 'end' of a heap. As an example, by default, with Windows heaps large allocations always go direct to the operating system via VirtualAlloc(...) which means that allocations from one heap may interleave with allocations from another.
If you really need to work out which heap an allocation comes from, there is a way, although its really slow so you shouldn't rely on it except for debugging or logging or similar. For actual, normal, code you should really know where allocations came from either via deduced context or by actually storing it.
Warnings aside, you can use HeapWalk to enumerate all allocations from each heap looking for the one you want.

delete & new in c++

This may be very simple question,But please help me.
i wanted to know what exactly happens when i call new & delete , For example in below code
char * ptr=new char [10];
delete [] ptr;
call to new returns me memory address. Does it allocate exact 10 bytes on heap, Where information about size is stored.When i call delete on same pointer,i see in debugger that there are a lot of byte get changed before and after the 10 Bytes.
Is there any header for each new which contain information about number of byte allocated by new.
Thanks a lot
Do it allocate exact 10 bytes
That's implementation dependant. The guarantee is "at least 10 chars".
Where information about size is stored?
That's implementation dependant.
Is there any header for each new which contain information about number of byte allocated by new?
That's implementation dependant.
By "that's implementation dependant" I mean it's not defined in the standard.
That's all up to the compiler and your runtime library. It's only exactly defined what effects new and delete have on your program, but how exactly these are acieved is not specified.
In your case it seems like a little more memory than requested is allocated and it will probably store management information like the size of the current chunk of memory, information about adjacent areas of free space or information to help the debugger try to detect buffer overflows and similar problems.
It is completely implementation-dependent. In general case you have to store the number of elements elsewhere. The implementation must allocate enough space for at least the number of elements specified, but it can allocate more.
Is there any header for each new which contain information about number of byte allocated by new.
That's platform dependent but yes, on many platforms there are.
Precisely, according to the standard, new char[10] will alloc at least 10 bytes in the heap.
The internals of new and delete are implementation dependent. So it will vary from compiler to compiler, and platform to platform. Additionally, you can find a variety of allocator algorithms (e.g: TCMalloc).
I'll give you an overview of how it could work internally, but don't take it as absolute truth. It's written for the solely purpose of this explanation.
In short, the new operator internally invokes malloc. The malloc uses a really long linked list of available memory blocks, aka free chain. When malloc is invoked, it lookups this list for the first block that's big enough to hold the requested size. After that, it splits the block in two parts, one with the size you requested, and the other with the rest, which is then added back to the free chain. Finally, it returns the block with the request size.
The inverse occurs in a free call, which is invoked by delete/delete[]. In short, it puts the provided block back to the free chain.
There could be fancy tricks during the processes I described above, like sorting the free chain, rounding the requested size to the next power of two to reduce memory fragmentation, and so on.
char * ptr=new char [10];
You are creating an array of 10 character's in heap and storing the address of 0th element in a pointer.this is similar to doing an malloc in C
delete [] ptr;
You are deleting(freeing the memory) the heap memory which was allocated by the earlier statement.this is similar to doing a free in c.
It is implementation dependent, but mostly the metadata for a block of memory is usually stored in the area before the memory address returned. The change that you observed before the 10 bytes was likely metadata being updated for this block (likely the size of the block being written into the meta data), and after the 10 bytes were metadata being updated for the next block (still unallocated, likely the pointer to the next chunk on the free list).
It is not a good idea to mess with the heap as it is not portable. However, if you want to do such heap magic, I suggest you implement your own memory pools (just get a large chunk of memory from the heap and manage it yourself). A possible place to start would be to look at libmm.
While the specifics are implementation dependent, one piece of information the implementation will need to store is the number of elements in the array. Or if it does not store it directly, it will need to accurately derive it from the block size allocated.
The reason for this because if an array of objects is allocated with new[], when they are deleted with delete[], the destructor of each object in the array will need to be called. delete[] will need to know how many objects to destruct. This is why it is necessary to match new with delete and new[] with delete[].

C and C++: Freeing PART of an allocated pointer

Let's say I have a pointer allocated to hold 4096 bytes. How would one deallocate the last 1024 bytes in C? What about in C++? What if, instead, I wanted to deallocate the first 1024 bytes, and keep the rest (in both languages)? What about deallocating from the middle (it seems to me that this would require splitting it into two pointers, before and after the deallocated region).
Don't try and second-guess memory management. It's usually cleverer than you ;-)
The only thing you can achieve is the first scenario to 'deallocate' the last 1K
char * foo = malloc(4096);
foo = realloc(foo, 4096-1024);
However, even in this case, there is NO GUARANTEE that "foo" will be unchanged. Your entire 4K may be freed, and realloc() may move your memory elsewhere, thus invalidating any pointers to it that you may hold.
This is valid for both C and C++ - however, use of malloc() in C++ is a bad code smell, and most folk would expect you to use new() to allocate storage. And memory allocated with new() cannot be realloc()ed - or at least, not in any kind of portable way. STL vectors would be a much better approach in C++
If you have n bytes of mallocated memory, you can realloc m bytes (where m < n) and thus throw away the last n-m bytes.
To throw away from the beginning, you can malloc a new, smaller buffer and memcpy the bytes you want and then free the original.
The latter option is also available using C++ new and delete. It can also emulate the first realloc case.
You don't have "a pointer allocated to hold 4096 bytes", you have a pointer to an allocated block of 4096 bytes.
If your block was allocated with malloc(), realloc() will allow you to reduce or increase the size of the block. The start address of the block won't necessarily stay the same, though.
You can't change the start address of a malloc'd memory block, which is really what your second scenario is asking. There's also no way to split a malloc'd block.
This is a limitation of the malloc/calloc/realloc/free API -- and implementations may rely on these limitations (for example, keeping bookkeeping information about the allocation immediately before the start address, which would make moving the start address difficult.)
Now, malloc isn't the only allocator out there -- your platform or libraries might provide other ones, or you could write your own (which gets memory from the system via malloc, mmap, VirtualAlloc or some other mechanism) and then hands it out to your program in whatever fashion you desire.
For C++, if you allocate memory with std::malloc, the information above applies. If you're using new and delete, you're allocating storage for and constructing objects, and so changing the size of an allocated block doesn't make sense -- objects in C++ are a fixed size.
You can make it shorter with realloc(). I don't think the rest is possible.
You can use realloc() to apparently make the memory shorter. Note that for some implementations such a call will actually do nothing. You can't free the first bit of the block and retain the last bit.
If you find yourself needing this kind of functionality, you should consider using a more complex data structure. An array is not the correct answer to every programming problem.
http://en.wikipedia.org/wiki/New_(C%2B%2B)
SUMMARY:In contrast to C's realloc, it
is not possible to directly reallocate
memory allocated with new[]. To extend
or reduce the size of a block, one
must allocate a new block of adequate
size, copy over the old memory, and
delete the old block. The C++ standard
library provides a dynamic array that
can be extended or reduced in its
std::vector template.