The title says it.
I have tried:
new char[nSize];
but it can return uninitialized memory.
whereas calloc guarantees zero-initialization.
I could call memset, etc., but isn't there a more direct way?
What's the most suitable C++ replacement for calloc?
For most purposes, std::vector. Or std::string if you intend to represent a character string. It will automatically delete whatever memory it allocates.
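For example, a minimal sketch (the buffer size is illustrative); the elements are value-initialized, so you get the same zero guarantee as calloc:

#include <cstddef>
#include <vector>

int main() {
    std::size_t nSize = 1024;           // size is illustrative
    std::vector<char> buffer(nSize);    // every element value-initialized to '\0'
    return buffer[0];                   // 0; the vector frees itself on scope exit
}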
For data structures that contain many arrays that are not mutually contiguous, you might want to avoid the slightly-larger-than-pointer size of std::vector, and instead opt for a std::unique_ptr:
auto ptr = std::make_unique<char[]>(nSize);
You can use value initialisation with a new expression as well. This is what std::make_unique does internally:
new char[nSize]();
But I would not recommend allocations without a RAII container.
As mentioned by geza, calloc may be optimised (on some systems) such that it may elide setting the memory to zero when allocating a large block. If such optimisation applies to your case, and is measurably significant, then there may be an argument for using std::calloc in C++.
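If you do go that route, you can still keep RAII by pairing std::calloc with a deleter that calls std::free; a minimal sketch (the size is illustrative):

#include <cstdlib>
#include <memory>

int main() {
    std::size_t nSize = 1 << 20;    // size is illustrative

    // calloc'd memory must be released with free(), so pair it with that deleter.
    std::unique_ptr<char[], decltype(&std::free)> buffer(
        static_cast<char*>(std::calloc(nSize, sizeof(char))), &std::free);

    return buffer ? buffer[0] : 1;  // 0 on success; the block is already zeroed
}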
Related
From what is written here, new allocates in the free store while malloc uses the heap, and the two terms often mean the same thing.
From what is written here, realloc may move the memory block to a new location. If free store and heap are two different memory spaces, does it mean any problem then?
Specifically I'd like to know if it is safe to use
int* data = new int[3];
// ...
int* mydata = (int*)realloc(data,6*sizeof(int));
If not, is there any other way to realloc memory allocated with new safely? I could allocate new area and memcpy the contents, but from what I understand realloc may use the same area if possible.
You can only realloc that which has been allocated via malloc (or family, like calloc).
That's because the underlying data structures that keep track of free and used areas of memory can be quite different.
It's likely but by no means guaranteed that C++ new and C malloc use the same underlying allocator, in which case realloc could work for both. But formally that's in UB-land. And in practice it's just needlessly risky.
C++ does not offer functionality corresponding to realloc.
The closest is the automatic reallocation of (the internal buffers of) containers like std::vector.
The C++ containers suffer from being designed in a way that excludes use of realloc.
Instead of the presented code
int* data = new int[3];
//...
int* mydata = (int*)realloc(data,6*sizeof(int));
… do this:
vector<int> data( 3 );
//...
data.resize( 6 );
However, if you absolutely need the general efficiency of realloc, and if you have to accept new for the original allocation, then your only recourse for efficiency is to use compiler-specific means, i.e. knowledge that realloc is safe with that particular compiler.
Otherwise, if you absolutely need the general efficiency of realloc but are not forced to accept new, then you can use malloc and realloc. Using smart pointers then lets you get much of the same safety as with C++ containers.
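A rough sketch of that approach (the helper names are invented for illustration):

#include <cstdlib>
#include <memory>
#include <new>

// Sketch only: a unique_ptr that owns malloc'd ints and releases them with free().
using malloc_ints = std::unique_ptr<int[], decltype(&std::free)>;

malloc_ints make_ints(std::size_t n) {
    auto* p = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (!p) throw std::bad_alloc{};
    return malloc_ints(p, &std::free);
}

void grow(malloc_ints& ints, std::size_t newCount) {
    auto* p = static_cast<int*>(std::realloc(ints.get(), newCount * sizeof(int)));
    if (!p) throw std::bad_alloc{};          // old block is still owned and valid
    ints.release();                          // realloc already freed or reused the old block
    ints = malloc_ints(p, &std::free);       // take ownership of the (possibly moved) block
}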
The only possibly relevant restriction C++ adds to realloc is that C++'s malloc/calloc/realloc must not be implemented in terms of ::operator new, and its free must not be implemented in terms of ::operator delete (per C++14 [c.malloc]p3-4).
This means the guarantee you are looking for does not exist in C++. It also means, however, that you can implement ::operator new in terms of malloc. And if you do that, then in theory, ::operator new's result can be passed to realloc.
In practice, you should be concerned about the possibility that new's result does not match ::operator new's result. C++ compilers may e.g. combine multiple new expressions to use one single ::operator new call. This is something compilers already did when the standard didn't allow it, IIRC, and the standard now does allow it (per C++14 [expr.new]p10). That means that even if you go this route, you still don't have a guarantee that passing your new pointers to realloc does anything meaningful, even if it's no longer undefined behaviour.
In general, don't do that. If you are using user-defined types with non-trivial initialization, then in the reallocate-copy-free case the destructors of your objects won't get called by realloc, and neither will the copy constructors when copying. This may lead to undefined behavior due to incorrect use of object lifetime (see C++ Standard §3.8 Object lifetime, [basic.life]).
1 The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —end note ]
The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
And later (emphasis mine):
3 The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime.
So, you really don't want to use an object out of its lifetime.
It is not safe, and it's not elegant.
It might be possible to override new/delete to support the reallocation, but then you may as well consider using the containers.
In general, no.
There are a slew of things which must hold to make it safe:
Bitwise copying the type and abandoning the source must be safe.
The destructor must be trivial, or you must in-place-destruct the elements you want to deallocate.
Either the constructor is trivial, or you must in-place-construct the new elements.
Trivial types satisfy the above requirements.
In addition:
The new[] function must pass the request on to malloc without any change and must not do any bookkeeping on the side. You can force this by replacing global new[] and delete[], or the ones in the respective classes (see the sketch after this list).
The compiler must not ask for more memory in order to save the number of elements allocated, or anything else.
There is no way to force that, though a compiler shouldn't save such information if the type has a trivial destructor as a matter of Quality of Implementation.
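As a sketch of the class-level replacement mentioned above (the type name is illustrative):

#include <cstdlib>
#include <new>

// Sketch: class-scope array allocation functions that forward straight to
// malloc/free. Note that the compiler may still request extra bytes for its
// own bookkeeping, which this cannot prevent.
struct Sample {
    int value;

    static void* operator new[](std::size_t size) {
        if (void* p = std::malloc(size))
            return p;
        throw std::bad_alloc{};
    }

    static void operator delete[](void* p) noexcept {
        std::free(p);
    }
};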
Yes - if new actually called malloc in the first place (for example, this is how VC++ new works).
Otherwise, no. Do note that once you decide to reallocate the memory (because new called malloc), your code is compiler-specific and no longer portable between compilers.
(I know this answer may upset many developers, but my answer depends on real facts, not just idiom.)
That is not safe. Firstly the pointer you pass to realloc must have been obtained from malloc or realloc: http://en.cppreference.com/w/cpp/memory/c/realloc.
Secondly the result of new int [3] need not be the same as the result of the allocation function - extra space may be allocated to store the count of elements.
(And for more complex types than int, realloc wouldn't be safe since it doesn't call copy or move constructors.)
You may be able to (not in all cases), but you shouldn't. If you need to resize your data table, you should use std::vector instead.
Details on how to use it are listed in another SO question.
These functions are mostly used in C.
memset sets the bytes in a block of memory to a specific value.
malloc allocates a block of memory.
calloc is the same as malloc, except that it initializes the bytes to zero.
In C++ the preferred method to allocate memory is to use new.
C: int* intArray = (int*) malloc(10 * sizeof(int));
C++: int* intArray = new int[10];
C: int* intArray = (int*) calloc(10, sizeof(int));
C++: int* intArray = new int[10]();
I understand the benefits of using new against malloc in C++. But for specific cases such as primitive data types (non array) - int, float etc., is it faster to use malloc than new?
Although it is always advisable to use new even for primitives if we are allocating an array, so that we can use delete[].
But for non-array allocation, I think there wouldn't be any constructor call for int? Since the new operator allocates memory, checks if it was allocated, and then calls the constructor. But just for non-array heap allocation of primitives, is it better to use malloc than new?
Please advise.
Never use malloc in C++. Never use new unless you are implementing a low-level memory management primitive.
The recommendation is:
Ask yourself: "do I need dynamic memory allocation?". A lot of times you might not need it - prefer values to pointers and try to use the stack.
If you do need dynamic memory allocation, ask yourself "who will own the allocated memory/object?".
If you only need a single owner (which is very likely), you should use std::unique_ptr. It is a zero-cost abstraction over new/delete. (A different deallocator can be specified.)
If you need shared ownership, you should use std::shared_ptr. This is not a zero cost abstraction, as it uses atomic operations and an extra "control block" to keep track of all the owners.
If you are dealing with arrays in particular, the Standard Library provides two powerful and safe abstractions that do not require any manual memory management:
std::array<T, N>: a fixed array of N elements of type T.
std::vector<T>: a resizable array of elements of type T.
std::array and std::vector should cover 99% of your "array needs".
One more important thing: the Standard Library provides std::make_unique and std::make_shared, which should always be used to create smart pointer instances. There are a few good reasons:
Shorter - no need to repeat the T (e.g. std::unique_ptr<T>{new T}), no need to use new.
More exception safe. They prevent a potential memory leak caused by the lack of a well-defined order of evaluation in function calls. E.g.
f(std::shared_ptr<int>(new int(42)), g())
Could be evaluated in this order:
new int(42)
g()
...
If g() throws, the int is leaked (a safe variant is sketched after this list).
More efficient (in terms of run-time speed). This only applies to std::make_shared - using it instead of std::shared_ptr directly allows the implementation to perform a single allocation both for the object and for the control block.
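For comparison, a sketch of the safe variant of the call above, assuming the same hypothetical f and g:

#include <memory>

// Hypothetical f and g matching the example above.
void f(std::shared_ptr<int>, int) {}
int g() { return 0; }

int main() {
    // Allocation and ownership transfer happen in a single step,
    // so nothing can leak even if g() throws.
    f(std::make_shared<int>(42), g());
}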
You can find more information in this question.
It can still be necessary to use malloc and free in C++ when you are interacting with APIs specified using plain C, because it is not guaranteed to be safe to use free to deallocate memory allocated with operator new (which is ultimately what all of the managed memory classes use), nor to use operator delete to deallocate memory allocated with malloc.
A typical example is POSIX getline (not to be confused with std::getline): it takes a pointer to a char * variable; that variable must point to a block of memory allocated with malloc (or it can be NULL, in which case getline will call malloc for you); when you are done calling getline you are expected to call free on that variable.
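A minimal sketch of that pattern on a POSIX system:

#include <stdio.h>    // POSIX getline lives here (it is not standard C++)
#include <stdlib.h>

int main() {
    char* line = nullptr;   // getline will malloc (and later realloc) this buffer
    size_t capacity = 0;

    while (getline(&line, &capacity, stdin) != -1) {
        fputs(line, stdout);
    }

    free(line);             // must be free(), not delete or delete[]
}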
Similarly, if you are writing a library, it can make sense to use C++ internally but define an extern "C" API for your external callers, because that gives you better binary interface stability and cross-language interoperability. And if you return heap-allocated POD objects to your callers, you might want to let them deallocate those objects with free; they can't necessarily use delete, and making them call YourLibraryFree when there are no destructor-type operations needed is unergonomic.
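A sketch of such a boundary (the type and function names are invented for illustration):

#include <cstdlib>

// Hypothetical library boundary: callers (possibly plain C) own the result
// and release it with free().
extern "C" {
    struct Point { double x; double y; };

    Point* my_lib_make_points(std::size_t n) {
        return static_cast<Point*>(std::malloc(n * sizeof(Point)));
    }
}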
It can also still be necessary to use malloc when implementing resizable container objects, because there is no equivalent of realloc for operator new.
But as the other answers say, when you don't have this kind of interface constraint tying your hands, use one of the managed memory classes instead.
It's always better to use new. If you use malloc you still have to check manually whether the allocation succeeded.
In modern C++ you can use smart pointers. With make_unique and make_shared you never call new explicitly. std::unique_ptr is not bigger than the underlying pointer and the overhead of using it is minimal.
The answer to "should I use new or malloc" is the single responsibility principle.
Resource management should be done by a type that has that as its sole purpose.
Those classes already exist, such as unique_ptr, vector, etc.
Directly using either malloc or new is a cardinal sin.
zwol's answer already gives the correct correctness answer: Use malloc()/free() when interacting with C interfaces only.
I'm not going to repeat those details, I'm going to answer the performance question.
The truth is that the performance of malloc() and new can, and does, differ. When you perform an allocation with new, the memory will generally be allocated via a call to the global operator new() function, which is distinct from malloc(). It is trivial to implement operator new() by calling through to malloc(), but this is not necessarily done.
As a matter of fact, I've seen a system where an operator new() that calls through to malloc() would outperform the standard implementation of operator new() by roughly 100 CPU cycles per call. That's definitely a measurable difference, and a clear indication that the standard implementation does something very different from malloc().
So, if you are worried about performance, there are three things to do:
Measure your performance.
Write replacement implementations for the global operator new() function and its friends (a sketch follows below).
Measure your performance and compare.
The gains/losses may or may not be significant.
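For the second step, a minimal sketch of a forwarding replacement (a complete replacement would also provide the nothrow, array, and aligned overloads, and would respect the new handler):

#include <cstdlib>
#include <new>

// Minimal forwarding replacements; only the basic pair is shown here.
void* operator new(std::size_t size) {
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc{};
}

void operator delete(void* p) noexcept {
    std::free(p);
}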
In other words, does it or does it not make a malloc() call every time it is called? (Maybe by allocating a large chunk in advance.)
Before C++14 the standard prohibited the implementation from combining allocations. Therefore each new expression corresponded one-to-one with a call to some system allocation function (possibly malloc).
C++14 relaxed this restriction in some cases. It's now possible for the implementation to combine allocations if the lifetime of one is strictly within the lifetime of the other. This is a fairly narrow restriction though, so I expect allocations don't actually get combined all that often.
In other words, does it or does it not make a malloc() call every time it is called?
It's actually implementation-dependent, but implementations of new will usually make use of malloc()/C-library calls.
(maybe by allocating a large chunk in advance)
Yes, you have to consider that as a drawback. Frequently calling something like
char* newChar = new char();
may clutter your dynamic storage unnecessarily with chunks larger than a single char would need.
If you want to override that behavior for some more efficient memory management, you can always use placement new.
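A small sketch of placement new into pre-obtained storage (the type and buffer are illustrative):

#include <new>

int main() {
    // Pre-obtained raw storage; placement new constructs into it without
    // touching the heap.
    alignas(int) unsigned char storage[sizeof(int)];

    int* p = new (storage) int(42);

    // int is trivially destructible, so no explicit destructor call is needed.
    return *p == 42 ? 0 : 1;
}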
As others have said, this is implementation defined. However, I would think that a high-performance C++ implementation would probably not use malloc(), but would use OS-specific memory allocation APIs or system calls (which malloc() must itself use). After all, why add an extra function call to every memory allocation? But I have no hard evidence for this.
I'm working with a code base that is poorly written and has a lot of memory leaks.
It uses a lot of structs that contains raw pointers, which are mostly used as dynamic arrays.
Although the structs are often passed between functions, the allocation and deallocation of those pointers are placed at random places and cannot be easily tracked, reasoned about, or understood.
I changed some of them to classes, with those pointers RAII-managed by the classes themselves. They work well and don't look very ugly, except that I banned copy construction and copy assignment of those classes simply because I didn't want to spend time implementing them.
Now I'm thinking, am I re-inventing the wheel? Why don't I replace the C-style arrays with std::array or std::valarray?
I would prefer std::valarray because it uses heap memory and is RAII-managed. And std::array is not (yet) available in my development environment.
Edit 1: Another plus of std::valarray is that the majority of those dynamic arrays are POD (mostly int16_t, int32_t, and float) arrays, and its numeric API can possibly make life easier.
Is there anything that I need to be aware of before I start?
One I can think of is that there might not be an easy way to convert std::valarray or std::array back to C-style arrays, and part of our code does use pointer arithmetic and needs the data to be presented as plain C-style arrays.
Anything else?
EDIT 2
I came across this question recently. A VERY BAD thing about std::valarray is that it's not safely copy-assignable until C++11.
As is quoted in that answer, in C++03 and earlier, it's UB if source and destination are of different sizes.
The standard replacement for a C-style array would be std::vector. std::valarray is some "weird" math-vector for doing number-calculation-like stuff. It is not really designed to store an array of arbitrary objects.
That being said, using std::vector is most likely a very good idea. It would fix your leaks, use the heap, is resizable, has great exception-safety and so on.
It also guarantees that the data is stored in one contiguous block of memory. You can get a pointer to said block with the data() member function or, if you are pre-C++11, with &v[0] for a non-empty vector v. You can then do your pointer business with it as usual.
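For example, a minimal sketch:

#include <cstring>
#include <vector>

int main() {
    std::vector<int> v(16);                           // contiguous and zero-initialized
    std::memset(v.data(), 0, v.size() * sizeof(int)); // plain C-style pointer access
    int* p = v.data();                                // or &v[0] pre-C++11 (v must be non-empty)
    return p[0];
}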
std::unique_ptr<int[]> is close to a drop-in replacement for an owning int*. It has the nice property that it will not implicitly copy itself, but it will implicitly move.
An operation that copies will generate compile time errors, instead of run time inefficiency.
It also has next to no run time overhead over that owning int* other than a null-check at destruction. It uses no more space than an int*.
std::vector<int> stores 3 pointers and implicitly copies (which can be expensive, and does not match your existing code behavior).
I would start with std::unique_ptr<int[]> as a first pass and get it working. I might transition some code over to std::vector<int> after I decide that intelligent buffer management is worth it.
Actually, as a first pass, I'd look for memcpy and memset and similar functions and make sure they aren't operating on the structures in question before I start adding RAII members.
A std::unique_ptr<int[]> means that the default created destructor for a struct will do the RAII cleanup for you without having to write any new code.
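A sketch of what such a struct might look like (the names and element type are illustrative):

#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical struct standing in for the ones described in the question.
struct Samples {
    std::size_t count = 0;
    std::unique_ptr<std::int16_t[]> data;   // owning pointer, freed automatically

    // Copying is implicitly disabled (unique_ptr is move-only); moving works.
};

Samples make_samples(std::size_t n) {
    Samples s;
    s.count = n;
    s.data = std::make_unique<std::int16_t[]>(n);  // elements value-initialized to zero
    return s;                                      // moves out; accidental copies won't compile
}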
I would prefer std::vector as the replacement for C-style arrays. You can get direct access to the underlying data (something like bare pointers) via .data():
Returns pointer to the underlying array serving as element storage.
I'm using auto_ptr<> with an array of class-pointer type, so how do I assign a value to it?
e.g.
auto_ptr<class*> arr[10];
How can I assign a value to the arr array?
You cannot use auto_ptr with array, because it calls delete p, not delete [] p.
You want boost::scoped_array or some other boost::smart_array :)
If you have C++0x (e.g. MSVC10, GCC >= 4.3), I'd strongly advise using either a std::vector<T> or a std::array<T, n> as your base object type (depending on whether the size is fixed or variable), and if you allocate this guy on the heap and need to pass it around, put it in a std::shared_ptr:
typedef std::array<T, n> mybox_t;
typedef std::shared_ptr<mybox_t> mybox_p;
mybox_p makeBox() { auto bp = std::make_shared<mybox_t>(...); ...; return bp; }
Arrays and auto_ptr<> don't mix.
From the GotW site:
Every delete must match the form of its new. If you use single-object new, you must use single-object delete; if you use the array form of new, you must use the array form of delete. Doing otherwise yields undefined behaviour.
I'm not going to copy the GotW site verbatim; however, I will summarize your options to solve your problem:
1. Roll your own auto array:
1a. Derive from auto_ptr. Few advantages, too difficult.
1b. Clone the auto_ptr code. Easy to implement, no significant space/time overhead; hard to maintain.
2. Use the Adapter Pattern. Easy to implement; hard to use, maintain, and understand; takes more time/overhead.
3. Replace auto_ptr with hand-coded EH logic. Easy to use, no significant space/time overhead; hard to implement and read, brittle.
4. Use a vector<> instead of an array. Easy to implement, easy to read, less brittle, no significant space/time overhead; syntactic changes needed and sometimes usability changes.
So the bottom line is to use a vector<> instead of C-style arrays.
As everyone has said here, don't mix arrays with auto_ptr. auto_ptr should be used only when you have multiple return paths that make it difficult to release memory, or when you receive an allocated pointer from somewhere else and have the responsibility to clean it up before exiting the function.
The other thing is that auto_ptr's destructor calls the delete operator on the stored pointer. Now what you're passing is a single element of an array. The memory manager will try to find and free a block allocated at the address you pass, which probably isn't the start of a real allocation in its bookkeeping. You may experience undefined behavior such as a crash or memory corruption.