Does malloc and new know about each other? - c++

Assume I have a 10 kB heap and mix C and C++ code like this:
char* block1 = (char*)malloc(5*1024); // allocate 5 kB
char* block2 = new char[4*1024]; // allocate 4 kB
Is there a C heap and a C++ heap, or just a single heap common to both, so that "new" knows that the first 5 kB of the heap are already allocated?

There may or may not be separate C and C++ heaps. You can't write a conforming C++ program that can tell the difference, so it's entirely up to the implementation.
The standard describes the first step in the default behavior of operator new like this:
Executes a loop: Within the loop, the function first attempts to
allocate the requested storage. Whether the attempt involves a call to
the C standard library functions malloc or aligned_alloc is
unspecified. [new.delete.single]/4.1.
And for malloc itself, the standard says: "[aligned_alloc, calloc, malloc, and realloc] do not attempt to allocate storage by calling ::operator new()" [c.malloc]/3.
So the intention is that it's okay to call malloc from operator new, but it's not required.
In practice, operator new calls malloc.
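To make "operator new calls malloc" concrete, here is a minimal sketch of a user-provided replacement for the global allocation function that forwards to malloc and follows the loop the standard describes. The counter g_new_calls, the sink g_sink, and the helper calls_for_one_new are hypothetical names added only so the effect can be observed; a real standard library's operator new may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter: lets us observe that the replacement is really used.
static std::size_t g_new_calls = 0;
// Hypothetical volatile sink: makes the pointer escape so the compiler
// cannot optimise the new/delete pair away.
static int* volatile g_sink = nullptr;

// Replacement for the global allocation function; it forwards to malloc.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (size == 0) size = 1;  // a zero-size request must still yield a non-null pointer
    for (;;) {
        if (void* p = std::malloc(size)) return p;
        // Allocation failed: give the installed new-handler a chance to free memory.
        std::new_handler handler = std::get_new_handler();
        if (handler == nullptr) throw std::bad_alloc();
        handler();
    }
}

// Matching deallocation functions; they forward to free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times the replacement ran for a single new-expression.
std::size_t calls_for_one_new() {
    const std::size_t before = g_new_calls;
    int* p = new int(42);
    g_sink = p;
    assert(*p == 42);
    delete p;
    g_sink = nullptr;
    return g_new_calls - before;
}
```

Because both sides of the pair are replaced together, new/delete stay consistent with each other even though the memory ultimately comes from malloc. Code outside the allocator still must not call free() on a pointer obtained from new.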

Memory allocation works like this: the userspace program first requests one or more memory pages from the operating system by means of a syscall (brk/sbrk or mmap on *nix).
This is usually done by the malloc-implementation (there are several) that is included in your "C-library". This malloc-implementation then manages all pages that it (successfully) requested. This is done in userspace.
Returning to your question: Most implementations of ::operator new will just relay to malloc. But you can always use a different allocator-implementation and even mix several in your program (see: memory pools).*
This is the reason why the standard requires you to not mix malloc/free and new/delete.
*) Many malloc-implementations have problems with lots of small objects (which is pretty common in C++), this is a good reason for changing the allocator.
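As a rough illustration of the memory-pool idea mentioned above, the following sketch obtains one slab from malloc and hands out fixed-size slots from a free list. FixedPool and its member names are hypothetical, and the class is deliberately simplified (single-threaded, one slot size, no growth).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical fixed-size pool: one malloc'd slab carved into equal slots.
class FixedPool {
public:
    FixedPool(std::size_t slot_size, std::size_t slot_count)
        : slot_size_(round_up(slot_size)),
          slab_(static_cast<unsigned char*>(std::malloc(slot_size_ * slot_count))) {
        assert(slab_ != nullptr);
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < slot_count; ++i) {
            release(slab_ + i * slot_size_);
        }
    }
    ~FixedPool() { std::free(slab_); }  // the only place memory goes back to malloc
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Pops a slot off the free list; returns nullptr when the pool is exhausted.
    void* acquire() {
        if (free_list_ == nullptr) return nullptr;
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Pushes a slot back onto the free list; free() is not called here.
    void release(void* slot) {
        free_list_ = new (slot) Node{free_list_};
    }

private:
    struct Node { Node* next; };

    // Slots must be able to hold a Node and stay suitably aligned.
    static std::size_t round_up(std::size_t requested) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t n = requested < sizeof(Node) ? sizeof(Node) : requested;
        return (n + align - 1) / align * align;
    }

    std::size_t slot_size_;
    unsigned char* slab_;
    Node* free_list_ = nullptr;
};
```

A pointer returned by acquire() points into the middle of a larger malloc block, which is one reason why handing it to free() would be wrong.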

An efficient implementation could use the system calls brk()/mmap() directly rather than going via malloc().
Looking at the GNU implementation, though, that isn't the case: it uses malloc() internally.

Related

Does C++ new operator use malloc() underneath?

In other words, does it make a malloc() call every time it is called, or not? (Maybe by allocating a large chunk in advance.)
Before C++14 the standard prohibited the implementation from combining allocations. Therefore each new expression corresponded one-to-one with a call to some allocation function (possibly malloc).
C++14 relaxed this restriction in some cases. It's now possible for the implementation to combine allocations if the lifetime of one is strictly within the lifetime of the other. This is a fairly narrow allowance though, so I expect allocations don't actually get combined all that often.
In other words, does it make a malloc() call every time it is called, or not?
It's actually implementation dependent. But usually implementations of new will make use of malloc() from the C library.
(Maybe by allocating a large chunk in advance.)
Yes, you have to consider that as a drawback. Frequently calling something like
char* newChar = new char();
may clutter your dynamic storage space unnecessarily, because each allocation can occupy a larger chunk than a single char would need.
If you want to override that behavior for some more efficient memory management, you can always use placement new.
As others have said, this is implementation defined. However, I would think that a high-performance C++ implementation would probably not use malloc(), but would use OS-specific memory allocation APIs or system calls (which malloc() must itself use). After all, why add an extra function call to every memory allocation? But I have no hard evidence for this.

Is it safe to free() memory allocated by new? [duplicate]

I'm working on a C++ library, one of whose functions returns a (freshly allocated) pointer to an array of doubles. The API states that it is the responsibility of the caller to deallocate the memory.
However, that C++ library used to be implemented in C and the function in question allocates the memory with malloc(). It also assumes that the caller will deallocate that memory with free().
Can I safely replace the call to malloc() with a call to new? Will the existing client code (that uses free()) break if I do so? All I could find so far was the official documentation of free(), which states that
If ptr does not point to a block of memory allocated with [malloc, calloc or realloc], it causes undefined behavior.
But I believe this was written before C++ came along with its own allocation operators.
You MUST match calls to malloc with free and new with delete. Mixing/matching them is not an option.
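A short sketch of what "matching" means in practice; matched_pairs_demo is a hypothetical function name, and the counter exists only so the example has something to return.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Each allocation below is released by its own counterpart and nothing else.
int matched_pairs_demo() {
    int released = 0;

    // malloc pairs with free.
    double* c_block = static_cast<double*>(std::malloc(8 * sizeof(double)));
    assert(c_block != nullptr);
    std::free(c_block);
    ++released;

    // new pairs with delete (constructor and destructor both run).
    std::string* one = new std::string("hello");
    delete one;
    ++released;

    // new[] pairs with delete[] (every element is destroyed).
    std::string* many = new std::string[4];
    delete[] many;
    ++released;

    return released;
}
```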
You are not allowed to mix and match malloc and free with new and delete. The draft C++ standard refers back to the C99 standard for this, and if we go to the draft C++ standard section 20.6.13 C library, it says (emphasis mine going forward):
The contents are the same as the Standard C library header stdlib.h, with the following changes:
and:
The functions calloc(), malloc(), and realloc() do not attempt to allocate storage by calling ::operator new() (18.6).
and:
The function free() does not attempt to deallocate storage by calling ::operator delete().
See also: ISO C Clause 7.11.2.
and includes other changes, none of which state that we can use free on contents allocated with new. So section 7.20.3.2 The free function from the draft C99 standard is still the proper reference and it says:
Otherwise, if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
As you've heard now, you can't mix them.
Keep in mind that in C++ it's common to have lots of relatively small temporary objects dynamically allocated (for instance, it's easy to write code like my_string + ' ' + your_string + '\n'), while in C memory allocation's typically more deliberate, often with a larger average allocation size and longer lifetime (much more likely someone would directly malloc(strlen(my_string) + strlen(your_string) + 3) for the result without any temporary buffers). For that reason, some C++ libraries will optimise for large numbers of small transient objects. They might, for instance, use malloc() to get three 16k blocks, then use each for fixed-size requests of up to 16, 32 and 64 bytes respectively. If you call delete in such a situation, it doesn't free anything - it just returns the particular entry in the 16k buffer to a C++-library free list. If you called free() and the pointer happened to be to the first element in the 16k buffer, you'd accidentally deallocate all the elements; if it wasn't to the first you have undefined behaviour (but some implementations like Visual C++ apparently still free blocks given a pointer anywhere inside them).
So - really, really don't do it.
Even if it ostensibly works on your current system, it's a bomb waiting to go off. Different runtime behaviour (based on different inputs, thread race conditions etc.) could cause a later failure. Compilation with different optimisation flags, compiler version, OS etc. could all break it at any time.
The library should really provide a deallocation function that forwards to the correct function.
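One way such a paired API could look is sketched below. The names lib_make_values, lib_free_values, and sum_via_library are hypothetical; the point is only that the library owns both the allocation and the deallocation, so callers never need to know whether new[] or malloc() is used inside.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical library entry point: allocates and fills an array of doubles.
// The nothrow form reports allocation failure as nullptr instead of throwing.
extern "C" double* lib_make_values(std::size_t n) {
    double* values = new (std::nothrow) double[n];
    if (values == nullptr) return nullptr;
    for (std::size_t i = 0; i < n; ++i) values[i] = static_cast<double>(i);
    return values;
}

// Hypothetical matching release function: the only correct way to free the array.
extern "C" void lib_free_values(double* values) {
    delete[] values;  // matches the new[] in lib_make_values
}

// Caller-side usage: allocate, use, and release through the library's own API.
double sum_via_library(std::size_t n) {
    double* values = lib_make_values(n);
    assert(values != nullptr);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += values[i];
    lib_free_values(values);
    return sum;
}
```

With this shape, switching the library from malloc() to new[] (or back) only requires changing the two library functions together; client code keeps calling lib_free_values.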
In addition to what the others already said (no compatibility guarantee), there is also the possibility that the library is linked to a different C library than your program, and so invoking free() on a pointer received from them would pass it to the wrong deallocation function even if the function names are correct.
You must use delete operator to deallocate memory when it is allocated by new operator.
malloc() allocates memory and returns the address of the first byte of the block to the pointer it is assigned to; new likewise allocates memory and returns its address. The convention is that memory obtained with malloc() (or calloc() or realloc()) is released with free(), and memory obtained with new is released with delete.

Why can't we free() memory that was allocated by new?

I know free() won't call the destructor, but what else will this cause besides that the member variable won't be destructed properly?
Also, what if we delete a pointer that is allocated by malloc?
It is implementation defined whether new uses malloc under the hood. Mixing new with free and malloc with delete could cause a catastrophic failure at runtime if the code was ported to a new machine, a new compiler, or even a new version of the same compiler.
I know free() won't call the destructor
And that is reason enough not to do it.
In addition, there's no requirement for a C++ implementation to even use the same memory areas for malloc and new so it may be that you're trying to free memory from a totally different arena, something which will almost certainly be fatal.
Many points:
It's undefined behaviour, and hence inherently risky and subject to change or breakage at any time and for no reason at all.
(As you know) delete calls the destructor and free doesn't... you may have some POD type and not care, but it's easy for someone else to add say a string to that type without realising there are weird limitations on its content.
If you malloc and forget to use placement new to construct an object in it, then invoke a member function as if the object existed (including delete which calls the destructor), the member function may attempt operations using pointers with garbage values
new and malloc may get memory from different heaps.
Even if new calls malloc to get its memory, there may not be a 1:1 correspondence between the new/delete and underlying malloc/free behaviour.
e.g. new may have extra logic such as small-object optimisations that have proven beneficial to typical C++ programs but harmful to typical C programs.
Someone may overload new, or link in a debug version of malloc/realloc/free, either of which could break if you're not using the functions properly.
Tools like Valgrind, Purify and Insure won't be able to differentiate between the deliberately dubious and the accidental.
In the case of arrays, delete[] invokes all the destructors and free() won't, but also the heap memory typically has a counter of the array size (for 32-bit VC++2005 Release builds, for example, the array size is in the 4 bytes immediately before the pointer value visibly returned by new[]). This extra value may or may not be there for POD types (not for VC++2005), but if it is, free() certainly won't expect it. Not all heap implementations allow you to free a pointer that's been shifted from the value returned by malloc().
An important difference is that new and delete also call the constructor and destructor of the object. Thus, you may get unexpected behavior. That is the most important thing, I think.
Because it might not be the same allocator, which could lead to weird, unpredictable behaviour. Plus, you shouldn't be using malloc/free at all, and avoid using new/delete where it's not necessary.
It totally depends on the implementation -- it's possible to write an implementation where this actually works fine. But there's no guarantee that the pool of memory new allocates from is the same pool that free() wants to return the memory to. Imagine that both malloc() and new use a few bytes of extra memory at the beginning of each allocated block to specify how large the block is. Further, imagine that malloc() and new use different formats for this info -- for example, malloc() uses the number of bytes, but new uses the number of 4-byte long words (just an example). Now, if you allocate with malloc() and free with delete, the info delete expects won't be valid, and you'll end up with a corrupted heap.

How does sbrk() work in C++?

Where can I read about sbrk() in some detail?
How exactly does it work?
In what situations would I want to use sbrk() instead of the cumbersome malloc() and new()?
btw, what is the expansion for sbrk()?
Have a look at the specification for brk/sbrk.
The call basically asks the OS to allocate some more memory for the application by incrementing the previous "break value" by a certain amount. This amount (the first parameter) is the amount of extra memory your application then gets.
Most rudimentary malloc implementations build upon the sbrk system call to get blocks of memory that they split up and track. The mmap function is generally accepted as a better choice (which is why mallocs like dlmalloc support both with an #ifdef).
As for "how it works", an sbrk at its simplest could look something like this:
#include <stdint.h>

uintptr_t current_break; // Some global variable for your application.
                         // This would probably be properly tracked by the OS for the process.

// Move the break by incr bytes and return the previous break.
void *sbrk(intptr_t incr)
{
    uintptr_t old_break = current_break;
    current_break += incr;
    return (void*) old_break;
}
Modern operating systems would do far more, such as map pages into the address space and add tracking information for each block of memory allocated.
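For experimenting without touching the real program break, the same idea can be sketched over a fixed in-process buffer. Everything in the toy namespace (arena, break_offset, kArenaSize, and this sbrk) is hypothetical and only mimics the interface: it returns the old break on success and (void*)-1 on failure, as the real call is documented to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace toy {

constexpr std::size_t kArenaSize = 4096;

// Stand-in for the process data segment: a fixed buffer inside the program.
alignas(std::max_align_t) unsigned char arena[kArenaSize];

// Offset of the simulated "break" from the start of the arena.
std::size_t break_offset = 0;

// Moves the simulated break by incr bytes and returns the previous break,
// or (void*)-1 if the request would leave the arena.
void* sbrk(std::intptr_t incr) {
    const std::intptr_t next = static_cast<std::intptr_t>(break_offset) + incr;
    if (next < 0 || static_cast<std::size_t>(next) > kArenaSize) {
        return reinterpret_cast<void*>(-1);
    }
    void* old_break = arena + break_offset;
    break_offset = static_cast<std::size_t>(next);
    return old_break;
}

}  // namespace toy
```

Calling toy::sbrk(0) reports the current simulated break without moving it, and a negative increment gives memory back, which mirrors how a simple malloc could grow and shrink its heap.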
sbrk is pretty much obsolete; these days you'd use mmap to map some pages out of /dev/zero. It certainly isn't something you use instead of malloc and friends, it's more a way of implementing those. Also, of course, it exists only on POSIX-based operating systems that care about backwards compatibility with ancient code.
If you find malloc and new too cumbersome, you should look into garbage collection instead... but beware, there is a potential performance cost to that, so you need to understand what you are doing.
You never want to use sbrk instead of malloc or free. It is non-portable and is typically used only by implementers of the standard C library or in cases where it's not available. It's described pretty well in its man page:
Description
brk() sets the end of the data segment to the value specified by end_data_segment, when that value is reasonable, the system does have enough memory and the process does not exceed its max data size (see setrlimit(2)).
sbrk() increments the program's data space by increment bytes. sbrk() isn't a system call, it is just a C library wrapper. Calling sbrk() with an increment of 0 can be used to find the current location of the program break.
Return Value
On success, brk() returns zero, and sbrk() returns a pointer to the start of the new area. On error, -1 is returned, and errno is set to ENOMEM.
Finally, malloc and free are not cumbersome - they are the standard way to allocate and release memory in C. Even if you want to implement your own memory allocator, it's best to just use malloc and free as the basis - a common approach is to allocate a large chunk at a time with malloc and provide memory allocation from it (this is what suballocators, or pools, usually implement).
Re the origin of the name sbrk (or its cousin brk), it may have something to do with the fact that the end of the heap is marked by a pointer known as the "break". The heap starts right after the BSS segments and typically grows up towards the stack.
You've tagged this C++ so why would you use 'cumbersome' malloc() rather than new? I am not sure what is cumbersome about malloc in any case; internally maybe so, but why would you care? And if you did care (for reasons of determinism for example), you could allocate a large pool and implement your own allocator for that pool. In C++ of course you can overload the new operator to do that.
sbrk is used to glue the C library to the underlying system's OS memory management. So make OS calls rather than using sbrk(). As to how it works, that is system dependent. If for example you are using the Newlib C library (commonly used on 'bare-metal' embedded systems with the GNU compiler), you have to implement sbrk yourself, so how it works in those circumstances is up to you so long as it achieves its required behaviour of extending the heap or failing.
As you can see from the link, it does not do much and would be extremely cumbersome to use directly - you'd probably end up wrapping it in all the functionality that malloc and new provide in any case.
This depends on what you mean by malloc being "cumbersome". sbrk is typically not used directly anymore, unless you're implementing your own memory allocator, i.e. overriding operator new. Even then I'd possibly use malloc to give me my initial memory.
If you'd like to see how to implement malloc() on top of sbrk(), check out http://web.ics.purdue.edu/~cs354/labs/lab6/ which is an exercise going through that.
On a modern system you shouldn't touch this interface, though. Since you're calling malloc and new cumbersome, I suspect you don't have all the requisite experience to safely and properly use sbrk for your code.

When to use Malloc instead of New [duplicate]

Duplicate of: In what cases do I use malloc vs new?
Just re-reading this question:
What is the difference between "new" and "malloc" and "calloc" in C++?
I checked the answers but nobody answered the question:
When would I use malloc instead of new?
There are a couple of reasons (I can think of two).
Let the best float to the top.
A couple that spring to mind:
When you need code to be portable between C++ and C.
When you are allocating memory in a library that may be called from C, and the C code has to free the allocation.
From the Stroustrup FAQ on new/malloc I posted on that thread:
Whenever you use malloc() you must consider initialization and conversion of the return pointer to a proper type. You will also have to consider if you got the number of bytes right for your use. There is no performance difference between malloc() and new when you take initialization into account.
This should answer your question.
The best reason I can think of to use malloc in C++ is when interacting with a pure C API. Some C APIs I've worked with take ownership of the memory of certain parameters. As such they are responsible for freeing the memory, and hence the memory must be free-able via free. malloc will work for this purpose, but not necessarily new.
In C++, just about never. new is usually a wrapper around malloc that calls constructors (if applicable.)
However, at least with Visual C++ 2005 or later, using malloc can actually result in security vulnerabilities compared with new.
Consider this code:
MyStruct* p1 = new MyStruct[count];
MyStruct* p2 = (MyStruct*)malloc(count * sizeof(MyStruct));
They look equivalent. However, the codegen for the first actually checks for an integer overflow in count * sizeof(MyStruct). If count comes from an untrusted source, the multiplication in the malloc version can overflow, resulting in a small amount of memory being allocated, and then when you use count you overrun the buffer.
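If malloc has to be used with an untrusted count, the overflow check can be written by hand. The sketch below is one way to do it; checked_array_malloc and kSizeMax are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Largest value a std::size_t can hold.
constexpr std::size_t kSizeMax = std::numeric_limits<std::size_t>::max();

// Allocates room for count elements of elem_size bytes each, or returns
// nullptr if count * elem_size would overflow std::size_t.
void* checked_array_malloc(std::size_t count, std::size_t elem_size) {
    if (elem_size != 0 && count > kSizeMax / elem_size) {
        return nullptr;  // the multiplication would wrap around
    }
    return std::malloc(count * elem_size);
}
```

The division-based test avoids performing the multiplication until it is known to be safe. calloc(count, size) is another option, since it takes the two factors separately.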
Everybody has mentioned (using slightly different words) when using a C library that is going to use free() and there are a lot of those around.
The other situation I see is:
When writing your own memory management (because for some reason that you have discovered through modeling, the default is not good enough). You could allocate a memory block with malloc and then initialize the objects within the pool using placement new.
One reason is that in C++, you can overload the new operator.
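The malloc-plus-placement-new combination might look roughly like the following. Widget and build_and_destroy are hypothetical; the sketch assumes a small count (so count * sizeof(Widget) cannot overflow) and shows that objects built this way need an explicit destructor call before the raw block goes back to free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical type with a non-trivial constructor and destructor.
struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) {}
};

// Builds count Widgets in one malloc'd block, reads them, destroys them,
// and returns the total length of their names.
std::size_t build_and_destroy(std::size_t count) {
    void* raw = std::malloc(count * sizeof(Widget));  // raw storage only, no objects yet
    if (raw == nullptr) return 0;
    Widget* widgets = static_cast<Widget*>(raw);

    // Placement new runs the constructor in storage we already own.
    for (std::size_t i = 0; i < count; ++i) {
        new (widgets + i) Widget("w" + std::to_string(i));
    }

    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) total += widgets[i].name.size();

    // Destructors must be called explicitly; free() will not run them.
    for (std::size_t i = 0; i < count; ++i) widgets[i].~Widget();
    std::free(raw);  // storage came from malloc, so it goes back through free
    return total;
}
```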
If you wanted to be sure to use the system library memory allocation in your code, you could use malloc.
A C++ programmer should rarely if ever need to call malloc. The only reason to do so that I can think of would be a poorly constructed API which expected you to pass in malloc'd memory because it would be doing the free. In your own code, new should always be the equal of malloc.
If the memory is to be released by free() (in your or someone else's code), it's pretty darn required to use malloc.
Otherwise I'm not sure. One contrived case is when you don't want destructor(s) to be run on exit, but in that case you should probably have objects that have a no-op dtor anyway.
You can use malloc when you don't want to have to worry about catching exceptions (or use a non-throwing version of new).