assign space for vector of pointer to struct - c++

For some reason, I have a vector of pointers to a struct, and I would like to allocate new space for every element of the vector. But I don't want to do it in a loop over every element, as that may slow the whole process down. Is there a faster way to do it?
Could anyone provide me a solution with code?
This is what I am currently doing ("pool" is the struct name):
vector<pool*> poolPointer(vectorSize);
for (int i = 0; i < poolPointer.size(); i++) {
    poolPointer.at(i) = new pool;
}
I think this is very slow, so I would like to find a faster way to allocate space and assign a pointer to a struct to each individual element of the vector.

This has nothing to do with vector. What you are looking for is custom memory allocation, and placement operator new in particular. You can allocate a single chunk of memory for all your pool instances and then construct the instances in that memory chunk.
EDIT:
As #JSF commented, you can allocate many instances all together as an array of "values", not pointers. You can then use a vector of pointers if you wish, or you can use a vector of values and not bother with pointers at all. I'd start with a vector of values, and only if profiling showed that frequent removal from the vector is a bottleneck would I consider a vector of pointers as an optimisation.
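For illustration, a minimal sketch of the single-chunk placement-new approach described above (the struct contents and vectorSize are placeholders); in many cases a plain vector of pool values gives you the same single allocation with far less bookkeeping:

#include <cstddef>
#include <new>      // placement new, ::operator new/delete
#include <vector>

struct pool { int data[16]; };   // placeholder contents

int main() {
    const std::size_t vectorSize = 1000;   // placeholder size

    // One allocation big enough for all pool instances.
    char* chunk = static_cast<char*>(::operator new(vectorSize * sizeof(pool)));

    std::vector<pool*> poolPointer(vectorSize);
    for (std::size_t i = 0; i < vectorSize; ++i)
        poolPointer[i] = new (chunk + i * sizeof(pool)) pool(); // construct in place

    // ... use poolPointer ...

    // Destroy the objects, then release the single chunk.
    for (std::size_t i = 0; i < vectorSize; ++i)
        poolPointer[i]->~pool();
    ::operator delete(chunk);
}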

Related

Is it possible to extract data from std::vector without copying it? (and make the vector forget it)

I have a std::vector<byte> object and I want to extract data from it without copying.
It may contain megabytes of data. So, if I copy data I would lose performance.
Is it possible to extract the data from the vector and make it forget about the data, so that it doesn't free the memory for the data on destruction?
Hope for your help!
Thanks in advance!
P.S.: extract in this case means just getting a raw pointer to the data and making the vector forget about it (i.e. not freeing the memory on destruction)
No, it is not possible to extract part of the data from a vector as far as I know.
It is not compatible with the structure of a vector, which provides its data in a contiguous block of memory. std::vector memory is contiguous, so if it were possible to move part of its memory to another place, you would need to shift the remainder of the memory to keep it contiguous. That would be a huge burden in itself.
I personally suggest passing the main vector by pointer/reference and using the required parts directly as needed.
If you need to move the whole contents of a std::vector to another place, you can just use std::move() to do so. You can even use std::swap() to swap the contents of 2 vectors.
I have a std::vector object and I want to extract data from it without copying
You can move the entire contents of a vector ... into a different vector.
Or you can swap (the contents of) two vectors.
std::vector<byte> v = get_a_big_vector();
std::vector<byte> w = std::move(v); // now w owns the large allocation
std::vector<byte> x;
std::swap(x, w); // now x owns the large allocation, and w is empty
That's it. You can't ask a vector to release its storage, and you can't somehow "take" just a portion of a contiguous allocation without affecting the rest.
You can move-assign some sub-range of elements, but that's only different from copying if the elements are some kind of object with state stored outside the instance (e.g., a long std::string).
If you really need to take just a sub-range and let the rest be deallocated, then a vector isn't really the right data type. Something like a rope is designed for this, or you can just split your single contiguous vector into a vector of 1Mb (or whatever) chunk indirections. This is actually something like a deque (although you can't steal chunks from std::deque either).
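To illustrate the move-assignment point, a small sketch that moves a sub-range into a separate vector (only worthwhile for elements like std::string that own external state; the sizes are made up):

#include <iterator>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> v(1000, std::string(100, 'x'));

    // Move elements [100, 200) into a new vector. The string buffers are
    // transferred, but v keeps its own (now partly emptied) elements and
    // its full allocation.
    std::vector<std::string> part(std::make_move_iterator(v.begin() + 100),
                                  std::make_move_iterator(v.begin() + 200));
}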
I think the best way is to use an object-oriented approach. You can wrap the byte data inside a class together with other information, like a flag that marks it to be skipped or forgotten:
class Data
{
public:
    Data(byte d)
    {
        data = d;
        forget = false;
    }
    byte data;
    bool forget;
};
Then just add pointers to Data objects to the vector, like:
vector<Data*> data;
data.push_back(new Data(1));
data.push_back(new Data(2));
// and so on
You can extract data without copying by just getting the pointer to a specific element of the vector:
Data *d = data[index];
d->forget = true;
You can use the forget flag to mark an element as forgotten. Of course, you have to manage the forget flag yourself when searching the vector. You can use std::find_if with a lambda expression for this purpose.
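A minimal sketch of such a search (the class is simplified and byte is replaced with unsigned char for the example):

#include <algorithm>
#include <vector>

struct Data {
    unsigned char data;
    bool forget = false;
};

int main() {
    std::vector<Data*> items{ new Data{1}, new Data{2} };
    items[0]->forget = true;

    // Find the first element that has not been marked as forgotten.
    auto it = std::find_if(items.begin(), items.end(),
                           [](const Data* d) { return !d->forget; });
    if (it != items.end()) {
        // (*it)->data is the first "live" byte
    }

    for (Data* d : items)   // free the memory when done
        delete d;
}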
Keep in mind you have to free memory when data is not used any more.

C++: Vector of pointers vs Fixed-size array performance

Performance-wise, which is faster?
A vector of object pointers allocated by the new operator?
std::vector<Object *> array;
Or an array allocated with new in the constructor?
Object[] objects;
objects = new objects[64];
The idea is that in every frame, the program loops through each element reading/writing values for each element.
Edit:
The second snippet was pulled from an XNA book. I am not using XNA to write my framework, and I'm trying to figure out the best method for using containers in an application that requires speed.
Definitely the second one.
With a vector of pointers, each individual element of that vector can be allocated anywhere on the heap.
With an array of objects, all elements are stored sequentially. This means the processor can cache chunks of memory more effectively as you iterate through the array.
The concept is called cache locality, referring to how well organised your data is with respect to memory access patterns and caching.
As pointed out in the comments, neither of your examples are correct. I assume you meant something like this:
std::vector<Object*> vector_of_pointers(size);
Object *array_of_objects = new Object[size];
However, I fear you may not have phrased your question the way you intended. You're not comparing two similar things. A vector is basically just an array that can grow if necessary. It makes all the same guarantees as an array, and so if it's storing the same data type, you shouldn't notice any difference between the two.
// Bad cache locality:
Object **A = new Object*[size];
std::vector<Object*> B(size);
// Good cache locality:
Object *C = new Object[size];
std::vector<Object> D(size);
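For instance, the per-frame read/write loop from the question walks memory sequentially with the contiguous containers; a sketch (the Object member is made up for the example):

#include <cstddef>
#include <vector>

struct Object { int value = 0; };   // placeholder member

int main() {
    const std::size_t size = 64;
    std::vector<Object> D(size);    // contiguous storage, good cache locality

    // Each frame: touch every element in order, so the CPU can prefetch
    // whole cache lines of Objects at a time.
    for (Object& o : D)
        o.value += 1;
}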

How to reserve memory for vector of vector

Assume that
vector<vector<shared_ptr<Base>>> vec;
vec.reserve(100);
vec[0].reserve(20); // Error: vector subscript out of range
I am trying to reserve memory for both outer vector and inner vector.
I know that vec is empty, so I cannot reserve memory for the inner vectors. I could only resize() or shrink_to_fit() afterward. However, using resize() or shrink_to_fit() is useless because that is not what I want to do.
The intention of reserving memory for the inner vectors is to lay the memory out well for faster access to the inner elements afterward. I am just wondering whether, if I do not reserve the memory, the allocations will be expensive and scattered.
I would like to ask :
Is there any way to reserve memory for the inner vectors?
Is my assumption correct that bad memory allocation will result if I do not reserve memory for the vectors?
Sorry for my poor English; I am using VC++ 2010.
You can't reserve memory for both inner and outer vectors... the inner vectors don't get constructed if you've only reserved space in the outer vector. You can resize the outer vector then do a reserve for each element thereof, or you can do the reserving on the inner vectors as they're added.
If you're sure you need to do this at all, I would probably resize the outer vector, then reserve space in each inner vector.
If 100 elements is even close to accurate, the space for your outer vector is almost irrelevant anyway (typically going to be something like 1200 bytes on a 32-bit system or 2400 bytes on a 64-bit system).
That may be a little less convenient (may force you to track how many items are created vs. really in use) but if you want to reserve space in your inner vectors, you don't really have a lot of choices.
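A minimal sketch of that resize-then-reserve approach (Base and the sizes are placeholders):

#include <memory>
#include <vector>

struct Base { virtual ~Base() = default; };

int main() {
    std::vector<std::vector<std::shared_ptr<Base>>> vec;

    vec.resize(100);            // constructs 100 empty inner vectors
    for (auto& inner : vec)
        inner.reserve(20);      // each inner vector now exists, so reserve is legal
}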
I'd start with how you're going to interface with the final container and what you know about its content in advance. Once you have settled on a convenient interface, you can implement the code behind it. For example, you could make sure that every new inner vector get created with a capacity of 100 elements. Or, you could use a map from an x/y pair to a shared pointer, which can make sense in a sparsely populated container. Or how about allocating the 100x100 elements statically and just not reallocating at all? The important point is that all these alternatives can be implemented without changing the interface to the final container, so this gives you the freedom to experiment with different approaches.
BTW: check out make_shared, which avoids the extra allocation overhead of shared_ptr, I believe. Alternatively, Boost also has an intrusive_ptr, which uses an internal reference counter; such pointers are also only half the size of a shared_ptr. However, you need benchmarks to actually prove which way is fastest. Anything else is just more or less vague speculation and guesswork.
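For reference, a one-line make_shared sketch; it performs a single allocation that holds both the object and the control block:

#include <memory>

struct Base { virtual ~Base() = default; };   // placeholder type

int main() {
    // One allocation for the Base object and the reference-count block together,
    // instead of one allocation for each.
    auto p = std::make_shared<Base>();
}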

Data structure in C/C++ for multiple variable size arrays

This is the problem at hand:
I have several 10000s of arrays. Each array could be anywhere between 2-15 units in length.
The total length of all the elements in all the arrays and the number of arrays can be computed using some very low cost calculations. But the exact number in each array is not known until some fairly expensive computation is completed.
Since I know the total length of all the elements in all the arrays, I would like to just allocate data for it using just one new/malloc and set pointers within this allocation. In my current implementation I use memmove to move the data after a certain item is inserted and update all pointers accordingly.
Is there a better way of doing this?
Thanks,
Sid
It's not clear what you mean by a better way. If you are looking for something that works faster and can afford some extra memory, then you can keep two arrays: one with the data, and the other with the index of the array each element belongs to. After you have added all the data, you sort by the index and you have all your data split by arrays; finally, you sweep the arrays and record the pointer to where each array begins.
Regarding memory consumption, depending on how many arrays you have and how big your data is, you can squeeze the index into the top bits of your data, if the data is bounded by some number. This way, you only need to sort the numbers, and when you sweep to retrieve the pointer where each array begins, you can clear those top bits.
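A small sketch of the data-plus-index idea (the values and array ids are made up):

#include <algorithm>
#include <utility>
#include <vector>

int main() {
    // Flat data plus, for each element, the id of the array it belongs to.
    std::vector<int> data  = {7, 3, 9, 1, 5};
    std::vector<int> index = {2, 0, 1, 2, 0};

    // Tag each value with its array id and sort by id (stable_sort keeps
    // the original order of elements within the same array).
    std::vector<std::pair<int, int>> tagged;
    for (std::size_t i = 0; i < data.size(); ++i)
        tagged.push_back(std::make_pair(index[i], data[i]));
    std::stable_sort(tagged.begin(), tagged.end(),
                     [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                         return a.first < b.first;
                     });

    // One final sweep records where each array starts.
    std::vector<std::size_t> start;
    for (std::size_t i = 0; i < tagged.size(); ++i)
        if (i == 0 || tagged[i].first != tagged[i - 1].first)
            start.push_back(i);
}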
Since I know the total length of all the elements in all the arrays, I would like to just allocate data for it using just one new/malloc and just set pointers within this allocation.
You can use one large vector. You'll need to manually calculate the offset of each sub-array yourself.
Vectors guarantee that their data is stored in contiguous memory, but be careful about keeping references or pointers to individual elements if the vector is used in a way that may make it reallocate. That shouldn't be a problem here, since you're not adding anything beyond the initial size.
int main() {
    std::vector<T> vec;                 // T = your element type
    vec.resize(calc_total_size());      // create all elements up front
    // now you'll need to manually translate the offset of
    // a given "array" and then add the offset of the element to that
    T someElem = vec[array_offset + element_offset];
}
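One way to compute those offsets, sketched with made-up lengths:

#include <cstddef>
#include <numeric>
#include <vector>

int main() {
    // Lengths of each sub-array, known after the expensive computation.
    std::vector<std::size_t> lengths = {3, 5, 2};

    // offsets[i] = index of the first element of sub-array i in the flat vector.
    std::vector<std::size_t> offsets(lengths.size(), 0);
    std::partial_sum(lengths.begin(), lengths.end() - 1, offsets.begin() + 1);

    std::vector<int> flat(offsets.back() + lengths.back());
    int someElem = flat[offsets[1] + 4];   // element 4 of sub-array 1
    (void)someElem;
}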
Yes, there is a better way:
std::vector<std::vector<Item>> array;
array.resize(cheap_calc());
for(int i = 0; i < array.size(); ++i) {
array[i].resize(expensive_calc(i));
for(int j = 0; j < array[i].size(); j++) {
array[i][j] = Item(some_other_calc());
}
}
No pointers, no muss, no fuss.
Are you looking for memory efficiency, speed efficiency, or simplicity?
You can always write or download a dead-simple pool allocator, then pass that as the allocator to the appropriate data structures. Because you know the total size in advance, and never need to resize vectors or add new ones, this can be even simpler than a typical pool allocator. Just malloc all of the storage in one big block, and keep a single pointer to the next free byte. To allocate n bytes: char *ret = nextBlock; nextBlock += n; return ret;. If your objects are trivial and don't need destruction, you can even just do one big free at the end.
This means you can use any data structure you want, or compare and contrast different ones. A vector of vectors? A giant vector of cells plus a vector of offsets? Something else you came up with that sounds crazy but just might work? You can compare their readability, usability, and performance without worrying about the memory allocation side of things.
(Of course if your goal is speed, packing things this way may not be the best answer. You can often gain a lot of speed by wasting a little space to improve your cache and/or page alignment. You could write a fancy allocator that, e.g., allocates vector space in a transposed way to improve the performance of your algorithm that does column-major where it should do row-major and vice-versa, but at that point, it's probably easier to tweak your algorithms than your allocator.)
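A sketch of the dead-simple pool described above (no alignment handling, no per-allocation freeing; all names are made up):

#include <cstddef>
#include <cstdlib>

// One big block, handed out in order; everything is released with one free.
struct BumpPool {
    char* block;
    char* next;

    explicit BumpPool(std::size_t totalBytes)
        : block(static_cast<char*>(std::malloc(totalBytes))), next(block) {}
    ~BumpPool() { std::free(block); }     // one big free at the end

    void* allocate(std::size_t n) {       // n bytes; no alignment handling
        void* ret = next;
        next += n;
        return ret;
    }
};

int main() {
    BumpPool pool(1 << 20);               // total size known up front
    int* a = static_cast<int*>(pool.allocate(100 * sizeof(int)));
    int* b = static_cast<int*>(pool.allocate(50 * sizeof(int)));
    a[0] = 1;
    b[0] = 2;
}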

Dynamic memory allocation, C++

I need to write a function that can read a file, and add all of the unique words to a dynamically allocated array. I know how to create a dynamically allocated array if, for instance, you are asking for the number of entries in the array:
int value;
cin >> value;
int *number;
number = new int[value];
My problem is that I don't know ahead of time how many unique words are going to be in the file, so I can't initially just read the value or ask for it. Also, I need to make this work with arrays, and not vectors. Is there a way to do something similar to a push_back using a dynamically allocated array?
Right now, the only thing I can come up with is to first create an array that stores ALL of the words in the file (1000), then pass through it and find the number of unique words. Then use that value to create a dynamically allocated array, which I would then fill by passing through the first array again and storing all the unique words. Obviously, that solution sounds pretty overboard for something that should have a more effective solution.
Can someone point me in the right direction, as to whether or not there is a better way? I feel like this would be rather easy to do with vectors, so I think it's kind of silly to require it to be an array (unless there's some important thing that I need to learn about dynamically allocated arrays in this homework assignment).
EDIT: Here's another question. I know there are going to be 1000 words in the file, but I don't know how many unique words there will be. Here's an idea: I could create a 1000-element array, write all of the unique words into that array while keeping track of how many I've added. Once I've finished, I could dynamically allocate a new array of that size and then just copy the words from the initial array to the second. Not sure if that's the most efficient, but since we can't use vectors, I don't think efficiency is a huge concern in this assignment.
A vector really is a better fit for this than an array. Really.
But if you must use an array, you can at least make it behave like a vector :-).
Here's how: allocate the array with some capacity. Store the allocated capacity in a "capacity" variable. Each time you add to the array, increment a separate "length" variable. When you go to add something to the array and discover it's not big enough (length == capacity), allocate a second, longer array, then copy the original's contents to the new one, then finally deallocate the original.
This gives you the effect of being able to grow the array. If performance becomes a concern, grow it by more than one element at a time.
Congrats, after following these easy steps you have implemented a small subset of std::vector functionality atop an array!
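A sketch of that capacity/length bookkeeping for an array of words (the function and variable names are made up):

#include <cstddef>
#include <string>

// Append a word, growing the array when length hits capacity.
void append(std::string*& arr, std::size_t& length, std::size_t& capacity,
            const std::string& word) {
    if (length == capacity) {                       // not big enough: grow
        std::size_t newCapacity = capacity == 0 ? 8 : capacity * 2;
        std::string* bigger = new std::string[newCapacity];
        for (std::size_t i = 0; i < length; ++i)
            bigger[i] = arr[i];                     // copy the old contents
        delete[] arr;
        arr = bigger;
        capacity = newCapacity;
    }
    arr[length++] = word;
}

int main() {
    std::string* words = nullptr;
    std::size_t length = 0, capacity = 0;
    append(words, length, capacity, "hello");
    append(words, length, capacity, "world");
    delete[] words;
}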
As you have rightly pointed out this is trivial with a Vector.
However, given that you are limited to using an array, you will likely need to do one of the following:
Initialize the array with a suitably large size and live with poor memory utilization
Write your own code to dynamically increase the size of the array at run time (basically the internals of a Vector)
If you were permitted to do so, some sort of hash map or linked list would also be a good solution.
If I had to use an array, I'd just allocate one with some initial size, then keep doubling that size when I fill it to accommodate any new values that won't fit in an array with the previous sizes.
Since this question regards C++, memory allocation would be done with the new keyword. But what would be nice is if one could use the realloc() function, which resizes the memory and retains the values in the previously allocated memory. That way one wouldn't need to copy the values from the old array to the new array. Note, though, that realloc() only works with memory obtained from malloc()/calloc(); it must not be used on memory allocated with new.
You can "resize" array like this (N is size of currentArray, T is type of its elements):
// create new array
T *newArray = new T[N * 2];
// Copy the data
for (int i = 0; i < N; i++)
    newArray[i] = currentArray[i];
// Change the size to match
N *= 2;
// Destroy the old array
delete[] currentArray;
// set currentArray to newArray
currentArray = newArray;
Using this solution you have to copy the data. There might be a solution that does not require it.
But I think it would be more convenient for you to use std::vectors. You can just push_back into them and they will resize automatically for you.
You can cheat a bit:
use std::set to get all the unique words then copy the set into a dynamically allocated array (or preferably vector).
#include <algorithm>
#include <iterator>
#include <set>
#include <iostream>
#include <string>

int main() {
    // Copy into a set
    // this will make sure they are all unique
    std::set<std::string> data;
    std::copy(std::istream_iterator<std::string>(std::cin),
              std::istream_iterator<std::string>(),
              std::inserter(data, data.end()));

    // Copy the data into your array (or vector).
    std::string* words = new std::string[data.size()];
    std::copy(data.begin(), data.end(), &words[0]);
}
This could be going a bit overboard, but you could implement a linked list in C++... it would actually allow you to use a vector-like implementation without actually using vectors (which are actually the best solution).
The implementation is fairly easy: just a pointer to the next and previous nodes, and the "head" node stored somewhere you can easily access. Then just looping through the list lets you check which words are already in it and which are not. You could even add a counter and count the number of times a word is repeated throughout the text.
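A minimal sketch of such a node type and of prepending to the list (the optional counter is included):

#include <string>

// Doubly linked list node storing one word.
struct Node {
    std::string word;
    int count = 1;          // times the word has been seen
    Node* prev = nullptr;
    Node* next = nullptr;
};

int main() {
    Node* head = nullptr;

    // Prepend a word (a duplicate check would be a walk from head first).
    Node* n = new Node{"hello"};
    n->next = head;
    if (head) head->prev = n;
    head = n;

    // Free the whole list when done.
    while (head) {
        Node* nextNode = head->next;
        delete head;
        head = nextNode;
    }
}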