Why use std::vector instead of realloc? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
Here, in this question, it's stated that there is no realloc-like operator or function in C++. If you wish to resize an array, just use std::vector instead. I've even seen Stroustrup saying the same thing here.
I believe it's not hard to implement one. There should be a reason for not implementing one. The answers only say to use std::vector but not why it's not implemented.
What is the reason for not implementing realloc-like operator or function and preferring to use std::vector instead?

What is the reason for not implementing realloc-like operator or function and preferring to use std::vector instead?
Save time. Don't chase bugs in your own code for a problem that has long been solved. Idiomatic C++ and readability. Get answers to your questions easily and quickly. Customize the realloc part by an allocator.
I believe it's not hard to implement one
That heavily depends on what you need from the template you intend to write. For a general-purpose std::vector-like one, have a look at the source code (libcxx's 3400 line vector header is here). I bet you will revise your initial assumption about the low complexity of such a construct.

There are several advantages.
Vector keeps track of its size and capacity, which means you don't have to do this yourself.
Because the current size is part of the vector object itself, you can pass a vector (by reference or by value) without needing an additional size parameter. This is especially useful when returning a vector as the caller doesn't need to receive the size through some side-channel.
When reallocating, vector will add more capacity than is needed to add just the element(s) requested to be added. This sounds wasteful but saves time as fewer reallocations are needed.
Vector manages its own memory; using vector lets you focus on the more interesting parts of your program instead of the details of managing memory, which are relatively uninteresting and tricky to get exactly right.
Vector supports many operations that arrays don't natively support, such as removing elements from the middle and making copies of an entire vector.
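A minimal sketch of two of the points above: the vector carries its own size when returned, and it grows its capacity in larger steps than strictly required. The helper names make_squares and count_reallocations are made up for this illustration, and the exact growth factor is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The returned vector knows its own size, so the caller needs no
// separate length parameter or side channel.
std::vector<int> make_squares(int n) {
    assert(n >= 0);
    std::vector<int> out;
    for (int i = 0; i < n; ++i) out.push_back(i * i);
    return out;
}

// Counts how often push_back had to switch to a larger buffer.
// Because capacity grows by more than one element at a time, this
// count stays far below the number of insertions.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000) typically reports a count in the low tens, although the precise number depends on the standard library in use.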

realloc's expectation that there might be sufficient free space after the current allocation just does not fit well with modern allocators and modern programs.
(There are many more allocations going on, many allocation sizes go to a dedicated pool for that size, and the heap is shared between all the threads in a program.)
In most cases, realloc will have to move content to a completely new allocation, just like vector does. But unlike vector<T>, realloc does not know how to move elements of type T, it only knows how to copy plain data.
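The difference can be checked with a type trait. The sketch below assumes C++11 or later; grow_strings is a hypothetical helper used only to show that a vector relocates non-trivial elements correctly when it grows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// A raw byte copy, which is all realloc can do, is only valid for
// trivially copyable types.
static_assert(std::is_trivially_copyable<int>::value,
              "int may be relocated byte by byte");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must be relocated through its constructors");

// std::vector relocates elements through their move or copy
// constructors, so growing a vector of strings is safe.
std::vector<std::string> grow_strings(std::size_t n) {
    std::vector<std::string> v;
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("element number " + std::to_string(i));
    assert(v.size() == n);
    return v;
}
```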

Well, as the other answers have explained nicely why you should use vectors, I will simply elaborate on why realloc was not implemented. For this, you need to look at what realloc actually does. It changes the size of a memory block, in effect by using malloc() and free(). You need to understand that although it seems to simply grow the block, it often cannot do so in place; instead it allocates another block of the required size, copies the old contents over, and frees the old block (that explains the name realloc).
Take a look at the following lines:
int* iarr = (int*)malloc(sizeof(*iarr) * 5);
iarr = (int*)realloc(iarr, sizeof(*iarr) * 6); // this is completely discouraged: if realloc fails, the old pointer is lost
// what you're supposed to do here instead is:
int* iarr2 = (int*)realloc(iarr, sizeof(*iarr) * 6); // copies the old data to the new block implicitly
// this not only preserves the previous state, but also allows you to check whether realloc succeeded
In C++, this can be achieved (if you must) by writing:
int* iarr = new int[5];
int* iarr2 = new int[6];
for(int i = 0; i < 5; i++) {
    iarr2[i] = iarr[i];
}
delete[] iarr;
The only use of realloc was to increase the memory capacity; since C arrays do not grow automatically, C had to provide a mechanism to do so. Most C++ containers implement that mechanism internally, which makes the original reason for having realloc moot.
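For comparison, here is a sketch of the same step, growing from five to six elements, written with std::vector. The function name grow_by_one is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Grows a sequence by one element. std::vector performs the
// allocate / copy / free sequence internally when it needs to.
std::vector<int> grow_by_one(std::vector<int> values) {
    const std::vector<int>::size_type old_size = values.size();
    values.resize(old_size + 1);  // the new element is value-initialized to 0
    assert(values.size() == old_size + 1);
    return values;
}
```

The existing elements keep their values, and the old buffer is released without an explicit delete[].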

Related

Using realloc to increase size vs creating bigger dynamic array

I am asking this question for the sake of learning; normally I would use vector or linked list for this problem.
If the size of a dynamic array is changing throughout the main code, which is more efficient or logical to use: creating a new dynamic array which is half size bigger than the previous one and copying previous elements to it, or using realloc to make the dynamic array bigger? And if one of them is more efficient or logical, why?

realloc could extend the existing memory block in place if there's room, avoiding the whole allocate + copy + free process entirely. Using new[] doesn't allow for that possibility.
If you're writing idiomatic C++ you should use std::vector, which does the same thing under the hood. But for the sake of learning, if you don't have std::vector then use realloc.
Note that realloc is not object-aware. It won't call constructors and destructors. If you're going to use it in C++ you'd better know exactly what you're doing!
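If realloc is used from C++ anyway, one way to guard against the object-awareness problem is to restrict it to trivially copyable types at compile time. The wrapper below is a sketch under that assumption; resize_buffer is a made-up name, not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Resizes a buffer owned by malloc/realloc so it can hold new_count
// elements. Returns false and leaves `buffer` untouched on failure.
template <typename T>
bool resize_buffer(T*& buffer, std::size_t new_count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "realloc copies raw bytes and never runs constructors or destructors");
    assert(new_count > 0);  // a zero-size realloc is best avoided
    void* moved = std::realloc(buffer, new_count * sizeof(T));
    if (moved == nullptr) return false;  // the old block is still valid
    buffer = static_cast<T*>(moved);
    return true;
}
```

Instantiating resize_buffer with a type such as std::string fails to compile, which is the intended behavior here. The buffer must eventually be released with std::free, never with delete[].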

Proof that shrink_to_fit or swap guarantees to release vector's memory [duplicate]

This question already has answers here:
C++ delete vector, objects, free memory
(7 answers)
Closed 5 years ago.
Can anyone provide a proof that one of the following approaches provides a guarantee to free the vector's memory in a platform-independent manner?
vector<double> vec;
//populating vec here
Cleaning up:
1- shrink to fit approach
vec.clear();
vec.shrink_to_fit();
2- swap approach
vector<double>().swap(vec);
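The two approaches can be compared by inspecting capacity() afterwards. This is only a sketch with hypothetical helper names: shrink_to_fit is a non-binding request, and capacity() reports what the vector holds on to, not what the underlying allocator has returned to the operating system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Approach 1: clear() followed by shrink_to_fit(). The request is
// non-binding, so the resulting capacity is commonly 0 but may not be.
std::size_t capacity_after_shrink(std::vector<double> vec) {
    vec.clear();
    vec.shrink_to_fit();
    return vec.capacity();
}

// Approach 2: swap with an empty temporary. vec takes over the
// temporary's (empty) storage, and the old buffer is freed when the
// temporary is destroyed at the end of the statement.
std::size_t capacity_after_swap(std::vector<double> vec) {
    std::vector<double>().swap(vec);
    assert(vec.empty());
    return vec.capacity();
}
```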

Creating a vector using new is unlikely to do what I think you'd want to do either.
std::vector implementations usually only "guarantee" that they will allocate enough memory to hold the requested number of elements at minimum. The latter is important because asking the OS or runtime for more memory when you need to grow the vector is an expensive operation that may also trigger a copy of the elements. For that reason, a lot of implementations use heuristics to determine how big the allocation is going to be when the vector has to grow to a certain size. For example, one implementation I'm familiar with doubles the size of the allocation every time new memory is required, giving you a 2^x allocation scheme.
With an allocation scheme like that, trying to shrink a vector from, say, 90 to 70 elements is pretty much guaranteed to keep the allocated memory size the same, so as to reserve room for further growth.
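The 90-to-70 case can be observed directly. In this sketch, CapacityReport and shrink_from_90_to_70 are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CapacityReport {
    std::size_t before;
    std::size_t after;
};

// Shrinks a vector from 90 to 70 elements and records the capacity
// on both sides of the resize.
CapacityReport shrink_from_90_to_70() {
    std::vector<int> v(90);
    CapacityReport report;
    report.before = v.capacity();
    v.resize(70);  // destroys the last 20 elements but keeps the buffer
    report.after = v.capacity();
    assert(v.size() == 70);
    return report;
}
```

The capacity stays where it was, so the memory for the 20 removed elements remains reserved by the vector.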
If you need exact memory allocation sizes for whatever reason, you'll pretty much either have to use std::array if you know the sizes at compile time, or manage an array yourself.

Should I use std::vector instead of array [duplicate]

This question already has answers here:
When to use vectors and when to use arrays in C++?
(2 answers)
Closed 5 years ago.
The way I see this, they both have the same function except std::vector seems more flexible, so when would I need to use array, and could I use std::vector only?
This is not a new question, the original questions didn't have the answers I was looking for

One interesting thing to note is that while many vector operations invalidate iterators, that is not the case with arrays. Note: after std::swap on a std::array, an iterator will still point to the same spot.
See more:
http://en.cppreference.com/w/cpp/container/array
Good summary of advantages of arrays:
https://stackoverflow.com/a/4004027/7537900
This point seemed most interesting:
"fixed-size arrays can be embedded directly into a struct or object, which can improve memory locality and reducing the number of heap allocations needed"
Not having tested that, I'm not sure it's actually true though.
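The embedding part of that claim can be checked by looking at where the elements live relative to the owning object. This sketch says nothing about actual performance; WithArray, WithVector and stores_elements_inline are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct WithArray  { std::array<int, 4> values; };
struct WithVector { std::vector<int> values; };

// True if the element storage lies inside the bytes of the owning
// object itself rather than in a separate allocation.
template <typename Owner>
bool stores_elements_inline(const Owner& owner) {
    const auto first = reinterpret_cast<std::uintptr_t>(owner.values.data());
    const auto begin = reinterpret_cast<std::uintptr_t>(&owner);
    return first >= begin && first < begin + sizeof(Owner);
}
```

For WithArray the elements sit inside the struct, while WithVector only holds the vector's bookkeeping and points to a separate heap block.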
Here is a discussion of 2D vectors vs. arrays in the context of competitive programming on CodeChef:
https://discuss.codechef.com/questions/49278/whether-to-use-arrays-or-vectors-in-c
Apparently a 2D vector (a vector of vectors) is contiguous only within each row, not across rows, whereas a 2D array is contiguous in both dimensions.
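A sketch of the layout difference. The first function checks that rows of a built-in 2D array sit back to back. FlatGrid is a hypothetical class showing a common workaround for vectors, namely one flat vector indexed as row * cols + col; it is not something taken from the linked discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 4;

// In a built-in 2D array, row 1 starts exactly kCols ints after row 0.
bool builtin_rows_are_adjacent() {
    int grid[kRows][kCols] = {};
    const char* row0 = reinterpret_cast<const char*>(&grid[0][0]);
    const char* row1 = reinterpret_cast<const char*>(&grid[1][0]);
    return row1 - row0 == static_cast<std::ptrdiff_t>(kCols * sizeof(int));
}

// A vector<vector<int>> gives each row its own heap block. Storing all
// cells in one vector keeps the whole grid contiguous.
class FlatGrid {
public:
    FlatGrid(std::size_t rows, std::size_t cols)
        : cols_(cols), cells_(rows * cols) {}

    int& at(std::size_t row, std::size_t col) {
        assert(col < cols_ && row * cols_ + col < cells_.size());
        return cells_[row * cols_ + col];
    }

    const int* data() const { return cells_.data(); }

private:
    std::size_t cols_;
    std::vector<int> cells_;
};
```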

As a rule of thumb, you should use:
a std::array if the size is fixed at compile time
a std::vector if the size is not fixed at compile time
a pointer to the address of their first element if you need low-level access
a raw array if you are implementing a (non-standard) container
Standard containers know their size even when you pass them to another function, which raw arrays don't, and they offer enough conveniences that you should never use raw arrays in C++ code without a specific reason. One such reason could be a bottleneck that requires low-level optimization, but only after profiling has identified the bottleneck. And you should benchmark under real conditions whether the standard containers actually add any overhead.
The only good reason I can think of is if you implement a special container. As standard containers are not meant to be derived from, you have only two choices: either have your class contain a standard container and end up with a container containing a container with delegations everywhere, or mimic a standard container (by copying code from a well-known implementation) and specialize it. In that case, you will find yourself directly managing raw arrays.

When using std::vector, the only performance hit comes when the capacity is reached, as the elements must be relocated to accommodate a larger number of objects in contiguous memory on the heap.
Thus here is a summary of both with regard to flexibility and performance:
std::array: Reallocation is not possible, so no performance hit can occur due to relocation of memory on the heap.
std::vector: Performance is only affected if the capacity is exceeded and reallocation occurs. You can use reserve(size) to provide a rough estimate of the maximum number of objects you'll need. This allows greater flexibility than std::array, but the vector will of course have to reallocate if the reserved space is exceeded.
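A sketch of the reserve point: once enough capacity is reserved up front, filling the vector does not move its buffer. The helper name fill_without_reallocation is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if filling the vector up to `expected` elements never
// moved its buffer.
bool fill_without_reallocation(std::size_t expected) {
    assert(expected > 0);
    std::vector<int> v;
    v.reserve(expected);            // one allocation up front
    const int* buffer = v.data();
    for (std::size_t i = 0; i < expected; ++i)
        v.push_back(static_cast<int>(i));
    return v.data() == buffer;      // same buffer: no reallocation happened
}
```

Pushing even one element beyond the reserved capacity would trigger a reallocation and invalidate the saved pointer.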

c++ why isn't there something like length(array)? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Well, I don't think it's really important, but since the program has to store the length anyway because of delete[], why can't we get at this "stored information"?

The implementation only needs to store the length, and typically only does, if the type is not trivially destructible (i.e., it needs to generate calls to a destructor) and the array was created with the new[] operator.
Since that property of the arrayed type bears no relation to the size of the array, it is more elegant simply to call the length "cookie" a private implementation detail.
To get the length of a complete array object (not a mere pointer), you can use std::extent< decltype( arr ) >::value or std::end( arr ) - std::begin( arr ).
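Both expressions can be tried on a small array. In this sketch, samples and length_via_iterators are names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

int samples[7] = {};

// std::extent reads the bound from the array type at compile time.
static_assert(std::extent<decltype(samples)>::value == 7,
              "the bound is part of the array type");

// The iterator difference yields the same number at run time.
std::size_t length_via_iterators() {
    const std::size_t n =
        static_cast<std::size_t>(std::end(samples) - std::begin(samples));
    assert(n == std::extent<decltype(samples)>::value);
    return n;
}
```

Neither expression works once the array has decayed to an int*, because the bound is no longer part of the type.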
Using new[] with a class with a destructor is a code smell. Consider std::vector instead. The overhead vs raw new[] (considering all bytes that need to be allocated, wherever they are) is one pointer's worth of bytes, and the benefits are innumerable.

Consider the case:
char* a = new char[100];
Now a needs to point to a buffer that's at least 100 chars big, but the system might have allocated a bigger buffer to fulfill this.
With this in mind, we can see that the system is free to immediately forget the size the program asked for, as long as it can still deallocate the buffer properly later. (Either by remembering the size of the allocated buffer or doing the memory allocation with some smart data structure where only the pointer to the start is required)
So, in the general case, the information you are looking for is not, in fact, stored anywhere.

Not all arrays are allocated by new.
void f(int* arr, size_t n)
{
    length(arr); // ??? only a pointer is visible here
}
int main()
{
    int a[5];
    f(a, 5);
}
It's trivial to write though, just call (std::end(arr) - std::begin(arr)), although it only works for arrays, not pointers that point to the start of arrays.
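One way to write such a length function is a template that deduces the bound from the array type. This is a sketch; the names length and sum are chosen here for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Deduces N from the array type. Passing an int* does not compile,
// because a pointer carries no bound to deduce.
template <typename T, std::size_t N>
constexpr std::size_t length(T (&)[N]) noexcept {
    return N;
}

// Inside this function only a pointer is visible, so the length has
// to come along as a separate parameter.
int sum(const int* arr, std::size_t n) {
    assert(arr != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += arr[i];
    return total;
}
```

A caller that still has the real array can write sum(a, length(a)); after the call boundary the information is gone.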

My understanding is that the philosophy of C++ is to not force on people any feature that has a potential cost unless it is unavoidable.
There may be additional costs in storing this information, and the user may not want to pay that cost if they don't need the information. And as it's trivial to store the length yourself if you want it there is no reason to provide a language feature that has a cost to everyone using an array.

For proper arrays, that is, int a[length], you already have that facility: just do
#define length(A) (sizeof(A) / sizeof(*(A)))
and you are done with it.
If you are talking about getting the length of the array pointed to by a pointer, well, pointers and arrays are two different concepts and coupling them makes no sense, even if proper arrays "decay" to pointers when needed and you access arrays through pointer arithmetic.
But even if we set that aside and talk about the technical aspects, the C++ runtime may not know the length of your array, since new could rely on malloc, which stores the size of the block in its own way, understood only by free. The C++ runtime stores extra information only when you have non-trivial destructors. A pretty messy picture, huh?

Because it's up to the implementation where it stores this information, there is no general way to do a length(array).

The length is most certainly not stored on all implementations. C++, for example, allows for garbage collection (e.g. boehmgc), and many collectors do not need to know the length. In traditional allocators, the length stored will often be larger than the actual length, i.e. the length allocated, not the length used.

But, what exactly is the length of the array? Is it the number of bytes in the array or the number of elements in the array?
For example, for some class A that is 100 bytes in size,
A* myArray = new A[100];
should length(myArray) return 100 or 100 * 100? Somebody might want 100, somebody might want 10000. So, there is no compelling argument for either one.

The C++ type that works most like the "arrays" in languages that support length(array) is std::vector<>, and it does have std::vector<>::size().
The size of plain [] arrays is known to the compiler in scopes where the size is explicit, of course, but it is possible to pass them to scopes where the size is not known to the compiler. This gives C++ more ways to handle array-like data than languages that must support a length interrogative (because they have to ensure that the size is always passed).

Efficient Array Reallocation in C++

How would I efficiently resize an array allocated using some standards-conforming C++ allocator? I know that no facilities for reallocation are provided in the C++ allocator interface, but did the C++11 revision enable us to work with them more easily? Suppose that I have a class foo with a copy-assignment operator foo& operator=(const foo& x) defined. If x.size() > this->size(), I'm forced to
1. Call allocator.destroy() on all elements in the internal storage of foo.
2. Call allocator.deallocate() on the internal storage of foo.
3. Allocate a new buffer with enough room for x.size() elements.
4. Use std::uninitialized_copy to populate the storage.
Is there some way that I can more easily reallocate the internal storage of foo without having to go through all of this? I could provide an actual code sample if you think that it would be useful, but I feel that it would be unnecessary here.
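The four steps listed in the question could be sketched as a free function on top of std::allocator_traits. This is an illustration, not the asker's code: reallocate_and_copy is a hypothetical name, and the sketch assumes the buffer's capacity always equals its size and that src does not point into the existing storage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Replaces the storage behind `data` with a copy of src[0 .. src_size).
// Assumption: capacity == size, so `size` is also the allocated count.
template <typename T, typename Alloc>
void reallocate_and_copy(Alloc& alloc, T*& data, std::size_t& size,
                         const T* src, std::size_t src_size) {
    using traits = std::allocator_traits<Alloc>;
    assert(src_size == 0 || src != data);  // src must not alias the old storage

    // Step 1: destroy the existing elements.
    for (std::size_t i = 0; i < size; ++i) traits::destroy(alloc, data + i);
    // Step 2: hand the old storage back to the allocator.
    if (data != nullptr) traits::deallocate(alloc, data, size);
    data = nullptr;
    size = 0;
    if (src_size == 0) return;

    // Step 3: allocate a buffer with room for all source elements.
    T* fresh = traits::allocate(alloc, src_size);
    // Step 4: copy-construct the elements into the raw storage.
    try {
        std::uninitialized_copy(src, src + src_size, fresh);
    } catch (...) {
        traits::deallocate(alloc, fresh, src_size);
        throw;  // the caller is left with an empty but valid state
    }
    data = fresh;
    size = src_size;
}
```

Performing the steps in this order means the old contents are gone if allocation or copying throws. Allocating and copying first, then releasing the old buffer, would give a stronger exception guarantee at the cost of holding both buffers at once.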

Based on a previous question, the approach that I took for handling large arrays that could grow and shrink with reasonable efficiency was to write a container similar to a deque that broke the array down into multiple pages of smaller arrays. So for example, say we have an array of n elements, we select a page size p, and create 1 + n/p arrays (pages) of p elements. When we want to re-allocate and grow, we simply leave the existing pages where they are, and allocate the new pages. When we want to shrink, we free the totally empty pages.
The downside is that array access is slightly slower: given an index i, you need the page i / p and the offset into the page i % p to get the element. I find this is still very fast, however, and it provides a good solution. Theoretically, std::deque should do something very similar, but for the cases I tried with large arrays it was very slow. See comments and notes on the linked question for more details.
There is also a memory inefficiency in that, given n elements, we are always holding p - n % p elements in reserve, i.e. we only ever allocate or deallocate complete pages. This was the best solution I could come up with in the context of large arrays with the requirement for resizing and fast access. I don't doubt there are better solutions, and I'd love to see them.
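A minimal sketch of the paged scheme described above. PagedArray is a hypothetical class, not the answerer's implementation, and it allocates ceil(n / p) pages rather than 1 + n/p, which is an assumption made here to keep the page arithmetic simple.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Elements live in fixed-size pages that never move once allocated.
template <typename T>
class PagedArray {
public:
    explicit PagedArray(std::size_t page_size) : page_size_(page_size) {
        assert(page_size > 0);
    }

    std::size_t size() const { return size_; }
    std::size_t page_count() const { return pages_.size(); }

    // Growing appends whole pages; shrinking frees pages that became
    // completely unused. Existing pages stay where they are.
    void resize(std::size_t n) {
        // Reset slots that fall out of range so regrowing exposes fresh values.
        for (std::size_t i = n; i < size_; ++i)
            pages_[i / page_size_][i % page_size_] = T();
        const std::size_t pages_needed = (n + page_size_ - 1) / page_size_;
        while (pages_.size() > pages_needed) pages_.pop_back();
        while (pages_.size() < pages_needed)
            pages_.push_back(std::unique_ptr<T[]>(new T[page_size_]()));
        size_ = n;
    }

    // Element i lives in page i / p at offset i % p.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return pages_[i / page_size_][i % page_size_];
    }

private:
    std::size_t page_size_;
    std::size_t size_ = 0;
    std::vector<std::unique_ptr<T[]>> pages_;
};
```

Only the small vector of page pointers is ever reallocated, so references to existing elements stay valid while the array grows.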

A similar problem also arises if x.size() > this->size() in foo& operator=(foo&& x).
No, it doesn't. You just swap.
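A sketch of what "just swap" means for move assignment. The class below is a simplified stand-in for the asker's foo, holding a plain int buffer instead of allocator-managed storage.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Simplified buffer-owning class; only the move operations matter here.
class foo {
public:
    explicit foo(std::size_t n)
        : data_(n ? new int[n]() : nullptr), size_(n) {}
    ~foo() { delete[] data_; }

    foo(const foo&) = delete;
    foo& operator=(const foo&) = delete;

    foo(foo&& x) noexcept : data_(x.data_), size_(x.size_) {
        x.data_ = nullptr;
        x.size_ = 0;
    }

    // No allocation and no element copies: the two objects trade
    // buffers, and x's destructor later frees our old one.
    foo& operator=(foo&& x) noexcept {
        std::swap(data_, x.data_);
        std::swap(size_, x.size_);
        return *this;
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};
```

The relative sizes of the two objects do not matter, since no elements are copied in either direction.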

There is no function that will resize in place or return 0 on failure (to resize). I don't know of any operating system that supports that kind of functionality beyond telling you how big a particular allocation actually is.
All operating systems do, however, support implementing realloc, which does a copy if it cannot resize in place.
So, you can't have it because the C++ language would not be implementable on most current operating systems if you had to add a standard function to do it.

There are the C++11 rvalue reference and move constructors.
There's a great video talk on them.

Even if a reallocate function existed, you could only avoid step 2 from your question in a copy constructor. However, in the case where the internal buffer grows, reallocation could save all four operations.
Is the internal buffer of your array contiguous? If so, see the answer in your link.
If not, a hashed array tree or an array list may be your choice to avoid reallocation.

Interestingly, the default allocator for g++ is smart enough to use the same address for consecutive deallocations and allocations of larger sizes, as long as there is enough unused space after the end of the initially-allocated buffer. While I haven't tested what I'm about to claim, I doubt that there is much of a time difference between malloc/realloc and allocate/deallocate/allocate.
This leads to a potentially very dangerous, nonstandard shortcut that may work if you know that there is enough room after the current buffer so that a reallocation would not result in a new address. (1) Deallocate the current buffer without calling alloc.destroy() (2) Allocate a new, larger buffer and check the returned address (3) If the new address equals the old address, proceed happily; otherwise, you lost your data (4) Call allocator.construct() for elements in the newly-allocated space.
I wouldn't advocate using this for anything other than satisfying your own curiosity, but it does work on g++ 4.6.
