If I have a vector in C++, I know I can safely pass it as an array (pointer to the contained type):
void some_function(size_t size, int array[])
{
// impl here...
}
// ...
std::vector<int> test;
some_function(test.size(), &test[0]);
Is it safe to do this with a nested vector?
void some_function(size_t x, size_t y, size_t z, int* multi_dimensional_array)
{
// impl here...
}
// ...
std::vector<std::vector<std::vector<int> > > test;
// initialize with non-jagged dimensions, ensure they're not empty, then...
some_function(test.size(), test[0].size(), test[0][0].size(), &test[0][0][0]);
Edit:
If it is not safe, what are some alternatives, both if I can change the signature of some_function, and if I can't?
Short answer is "no".
Elements here std::vector<std::vector<std::vector<int> > > test; are not placed in a contiguous memory area.
You can only expect multi_dimensional_array to point to a contiguous memory block of size test[0][0].size() * sizeof(int). But that is probably not what you want.
It is risky to take the address of a location in a vector and hold on to it. It might seem to work, but don't count on it once the vector is allowed to grow.
The reason why is closely tied to why a vector is a vector, and not an array. We want a vector to grow dynamically, unlike an array. We want appending to a vector to have a constant (amortized) cost that does not depend on the size of the vector, as with an array that has not yet reached its allocated size.
So how does the magic work? When there is no more internal space to add the next element to the vector, a new, larger space is allocated (typically a constant multiple of the old size, such as twice). The old contents are copied to the new space and the old space is no longer needed, or valid, which leaves any pointer to the old space dangling. Growing by a constant factor is what keeps the average cost of insertion at the end of the vector constant.
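To make the growth behaviour concrete, here is a minimal sketch that counts how often the capacity changes while elements are appended. The function name count_reallocations is made up for this illustration, and the exact growth factor is implementation-defined, so the count will differ between standard libraries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many times the capacity of a vector
// changes (i.e. how many reallocations happen) while n ints are appended.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            // The storage moved: every pointer into the old block is now dangling.
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

For 1000 appended elements the count should be far smaller than 1000, which is why appending is cheap on average, but each of those reallocations invalidates any pointer taken earlier.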
Is it safe to do this with a nested vector?
Yes, IF you want to access the inner-most vector only, and as long as you know the number of elements it contains, and you don't try accessing more than that.
But seeing your function signature, it seems that you want to access all three dimensions. In that case, no, that isn't valid.
The alternative is that you can call the function some_function(size_t size, int array[]) for each inner-most vector (if that solves your problem); and for that you can do this trick (or something similar):
void some_function(std::vector<int> & v1int)
{
//the final call to some_function(size_t size, int array[])
//which actually process the inner-most vectors
some_function(v1int.size(), &v1int[0]);
}
void some_function(std::vector<std::vector<int> > & v2int)
{
//call some_function(std::vector<int> & v1int) for each element!
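// a plain loop: passing the overloaded name some_function to std::for_each would be ambiguous
for (std::size_t i = 0; i < v2int.size(); ++i) some_function(v2int[i]);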
}
//call some_function(std::vector<std::vector<int> > & v2int) for each element!
for (std::size_t i = 0; i < test.size(); ++i) some_function(test[i]);
A very simple solution would be to simply copy the contents of the nested vector into one vector and pass it to that function. But this depends on how much overhead you are willing to take.
That being said: nested vectors aren't good practice. A matrix class that stores everything in contiguous memory and manages access is more efficient and less ugly, and could offer something like T* matrix::get_raw(), though the ordering of the contents would still be an implementation detail.
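As a rough sketch of what such a class could look like for the three-dimensional case in the question: the class name Array3D, the member names, and the row-major ordering below are all assumptions made for illustration, not an established API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 3-D array of int kept in ONE contiguous std::vector<int>.
// Row-major ordering is an arbitrary choice made for this sketch.
class Array3D
{
public:
    Array3D(std::size_t x, std::size_t y, std::size_t z)
        : x_(x), y_(y), z_(z), data_(x * y * z) {}

    // Element (i, j, k) lives at offset (i * y + j) * z + k.
    int& at(std::size_t i, std::size_t j, std::size_t k)
    {
        return data_[(i * y_ + j) * z_ + k];
    }

    // Pointer to all x * y * z ints, suitable for a C-style int* parameter.
    int* get_raw() { return data_.data(); }

    std::size_t size_x() const { return x_; }
    std::size_t size_y() const { return y_; }
    std::size_t size_z() const { return z_; }

private:
    std::size_t x_, y_, z_;
    std::vector<int> data_;
};
```

With something like this, a call such as some_function(a.size_x(), a.size_y(), a.size_z(), a.get_raw()) hands over one contiguous block, provided the callee assumes the same ordering.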
Simple answer - no, it is not. Did you try compiling this? And why not just pass the whole 3D vector as a reference? If you are trying to access old C code in this manner, then you cannot.
It would be much safer to pass the vector, or a reference to it:
void some_function(std::vector<std::vector<std::vector<int>>> & vector);
You can then get the size and items within the function, leaving less risk for mistakes. You can copy the vector or pass a pointer/reference, depending on expected size and use.
If you need to pass across modules, then it becomes slightly more complicated.
Trying to use &top_level_vector[0] and pass that to a C-style function that expects an int* isn't safe.
To support correct C-style access to a multi-dimensional array, all the bytes of the whole hierarchy of arrays would have to be contiguous. In a C++ std::vector, this is true for the items contained by a vector, but not for the vector itself. If you try to take the address of the top-level vector's first element, à la &top_level_vector[0], you're going to get an array of vectors, not an array of int.
The vector structure isn't simply an array of the contained type. It is implemented as a structure containing a pointer, as well as size and capacity book-keeping data. Therefore the question's std::vector<std::vector<std::vector<int> > > is more or less a hierarchical tree of structures, stitched together with pointers. Only the final leaf nodes in that tree are blocks of contiguous int values. And each of those blocks of memory is not necessarily contiguous with any other block.
In order to interface with C, you can only pass the contents of a single vector. So you'll have to create a single std::vector<int> of size x * y * z. Or you could decide to re-structure your C code to handle a single 1-dimensional stripe of data at a time. Then you could keep the hierarchy, and only pass in the contents of leaf vectors.
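A minimal sketch of the first option, copying the nested vector into a single std::vector<int>. The function name flatten is invented here, and the code assumes the dimensions are non-jagged, as the question states.

```cpp
#include <vector>

// Hypothetical helper: copies every leaf block of a nested vector, in order,
// into one contiguous vector. Assumes all inner vectors have matching sizes.
std::vector<int> flatten(const std::vector<std::vector<std::vector<int> > >& nested)
{
    std::vector<int> flat;
    for (const auto& plane : nested) {
        for (const auto& row : plane) {
            // Each row is itself contiguous, so it can be appended in one go.
            flat.insert(flat.end(), row.begin(), row.end());
        }
    }
    return flat;
}
```

The result could then be passed as some_function(test.size(), test[0].size(), test[0][0].size(), flat.data()). Any changes the C function makes would land in the copy, so they would have to be copied back if they matter.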
Related
I have an array of a struct, lets say
struct cell{
int pos;
int id;
};
std::vector<cell> myArray;
I want an array of the id element. I can't just iterate over my array as it would take too long.
I have to provide std::vector<int> to a function.
My thought process was: Since arrays are usually just a pointer to the first element and then an offset I thought of creating an array where i can provide the offset, such as it would point to the id element of the next cell in std::vector<cell> myArray.
One solution I can think of is having an array of pointers to that element, for example:
The final solution might be something like:
struct cell{
int pos;
int id;
};
std::vector<cell> myArray;
std::vector<int*> pointersToIds;
// Creating an array of int from an array of int*
std::vector<int> idsArray = std::something(pointersToIds);
myFunc(idsArray);
Since the std library has tons of stuff I supposed there would be a way to do this.
Is there a way to convert the array of pointers to an actual array of elements in a very optimized way? The pointers approach was the only one I could think of, but it doesn't necessarily have to be it.
Thank you all in advance :)
I tried iterating over the array of pointers and creating an array of elements, but it would take too much time.
TLDR Get array of an element from array of struct
I suppose this might be an instance of The XY Problem, since it's not clear what you are actually trying to solve, do you:
Want to find a fast way to pass the list of struct to a function
Want a way to extract all the members from a list of struct into a list of members
First off, shoo away from your mind the idea of manually creating an array of addresses and then fiddling around with the offsets yourself. This is certainly doable, but probably hard to do in a safe and portable way due to struct alignment, which differs from machine to machine.
Besides, accessing cell.id already does that in a portable way by itself!
Problem 1.
If you want to pass a vector (or any object really) to a function in a fast way, you can use a reference, it would look something like this:
void foo(std::vector<cell>& in_vec);
Notice the & in the declaration: it says that in_vec must be passed by reference. Internally this passes the address of in_vec and avoids copying the values one by one. C++ does all this by itself, so you can treat in_vec normally in the function without a care in the world, and it's blazing fast.
Problem 2.
If your point is that you want to extract all the IDs before passing them to a function, then first off, I still suggest you pass the cells. That way it is clear that foo is supposed to operate on cell IDs and not random integers. Once again, the cost of unpacking the structs outside the function (which requires an iteration) is equal to, if not worse than, the cost of doing it inside (where you might not even need to access all cells, depending on foo's nature).
If you must carry through, it's as easy as a for loop:
std::vector<int> ids;
for(auto const& cell : myArray)
{
ids.push_back(cell.id);
}
Or, if you want an elegant and modern solution, using lambdas and algorithm:
#include <algorithm>
#include <iterator> // for std::back_inserter
std::vector<int> ids;
std::transform(myArray.begin(), myArray.end(),
std::back_inserter(ids), [](cell const& c) {
return c.id;
});
Or something to this effect.
Say I have a simple contiguous array or vector containing some elements of type T
std::vector<T> someVector;
I have several raw pointers to the insides of the vector distributed around the application.
T* pointerOne = &someVector[5];
T* another = &someVector[42];
T* evenMore = &someVector[55];
However, the elements in the vector sometimes move around in the application, which can invalidate pointers (as in: doesn't point to what it's supposed to point at anymore):
std::swap(someVector[4],someVector[5]); //Oops! pointerOne now essentially points to whatever was in someVector[4], and the correct object that was in someVector[5] has moved places
What's an efficient (in terms of performance and memory footprint [although that probably goes hand in hand]) system for keeping these pointers updated when the contents of the array move around?
Some notes:
elements switch their positions very infrequently. num(location changes) << num(accesses to elements). This means that I'd like to keep pointers which are updated instead of introducing some other system that abstracts this problem away, because dereferencing a pointer is as fast as I can get in the application, and performance is very important here.
all of the Ts will always be inside a contiguous array. It won't at some point in development change to become some other container type, like a map.
I do know (and can modify) the code parts where the Ts are moved around inside the array. In fact that happens inside a single function. I.e. the system doesn't need to monitor the memory and somehow automatically detect at runtime if the contents of the array changes.
How about holding a reverse map to the pointers. This could be an array (or vector) in the length of your original array that holds pointers to the pointers you created. For instance at index 5 of this reverse map, you will have pointers to all the pointers that point at element 5 in the original array. Now if element 5 is swapped with say element 6, just go over all the pointers in the reverse map at index 5, set them to point at element 6 in the original array and also move all these pointers to index 6 of the reverse map. You can do this work from the single point in your code that moves stuff around.
An idea similar to user3678664's suggestion optimised for the case when there are many elements in the vector and few pointers:
Create a function that uses a
multimap<int, T**> ptrmap;
to save the addresses of the pointers.
void RegisterPointer(int pos, T** ptr)
{
ptrmap.insert(pair<int, T**>(pos, ptr));
}
Then write a function void UpdateSwappedPtrs(int pos, int newpos) that updates the swapped pointers by iterating through all the pointers in the ptrmap at pos and newpos.
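One possible way to flesh this out is sketched below. It wraps the multimap and the vector in a small class so the swap and the pointer update happen in the same place. The class name PointerRegistry and the member SwapElements are invented for this illustration, and size_t is used for positions instead of int. It also assumes the vector is never reallocated and that registered pointers outlive the registry entries.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical registry that keeps raw pointers up to date when two
// elements of the underlying vector trade places.
template <typename T>
class PointerRegistry
{
public:
    explicit PointerRegistry(std::vector<T>& storage) : storage_(storage) {}

    // Point *ptr at storage_[pos] and remember where the pointer lives.
    void RegisterPointer(std::size_t pos, T** ptr)
    {
        *ptr = &storage_[pos];
        ptrmap_.emplace(pos, ptr);
    }

    // Swap two elements, then repoint every pointer registered at either position.
    void SwapElements(std::size_t a, std::size_t b)
    {
        if (a == b) return;
        std::swap(storage_[a], storage_[b]);
        std::vector<T**> at_a = Extract(a);
        std::vector<T**> at_b = Extract(b);
        Reinsert(at_a, b);  // what was at a now lives at b
        Reinsert(at_b, a);  // and vice versa
    }

private:
    // Remove and return all pointers registered at pos.
    std::vector<T**> Extract(std::size_t pos)
    {
        std::vector<T**> result;
        auto range = ptrmap_.equal_range(pos);
        for (auto it = range.first; it != range.second; ++it)
            result.push_back(it->second);
        ptrmap_.erase(range.first, range.second);
        return result;
    }

    // Register the given pointers at pos and update their targets.
    void Reinsert(const std::vector<T**>& ptrs, std::size_t pos)
    {
        for (T** p : ptrs) {
            *p = &storage_[pos];
            ptrmap_.emplace(pos, p);
        }
    }

    std::vector<T>& storage_;
    std::multimap<std::size_t, T**> ptrmap_;
};
```

Dereferencing stays a plain pointer access. Only the rare swap pays the cost of the two multimap lookups.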
In C++ I understand that in order to create a dynamic array you need to use vectors. However I have a problem when I need to find information I put in the vector.
For example:
Let's say I have a simple vector that stores the name of a person and a small message they wrote. In the vector, how do I find where Bill is located?
I was also trying to understand how to do this in PHP when I posted this question.
Indeed you seem confused. Let me try to help you.
One thing that is maybe confusing you: std::vector is not a geometric vector. It's only a sequence of data of the same type that is contiguous in memory. So it's like an array.
a) Determine the size of a vector based on a variable. For example if
I was using an array it would look something like array [x][y] ( I
know it's not possible to do this). How would I do this with a vector
std::vector is basically an automatically managed dynamic array.
It means that it IS an array inside, but it's managed by code that will make sure that array grows (gets bigger) when you try to add more data than its current capacity can hold.
Actually, std::vector is a class template. It means that it's not a real class, it's code that the compiler will use to generate itself a real class. If I say
std::vector<int> my_ints; // this is a vector of ints
This vector can only hold ints. And then:
std::vector<std::string> name_list;
This one holds std::string objects.
As I was saying, inside, it's only code to manage an array dynamically. You can think the previous examples as if it was like that:
class
{
unsigned long size; // count of elements contained in this container
unsigned long capacity; // count of elements that the memory allocated by the array can hold
int* array; // array containing the values, created using new, destroyed using delete
}
my_ints;
This is an oversimplified view of how it is inside, so don't assume it's exactly like that, but it might be useful.
Now, when you add values, the value is copied into the memory of the array, either into an element that is not used yet (through push_back() for example) or over an element that already exists (using operator[] for example).
If you add a value and the capacity of the vector is not enough to hold all values, then the vector will automatically grow: it will create a bigger array, copy its current values inside, copy the additional value too, then delete the array it had before.
It's important to understand this: if a vector grows, then you can't assume that its data is always at the same address in memory, so pointers to its data can't be trusted.
b) second how would I using the push back command to store the value
of a variable inside a specific spot. Again if I was using an array
it'd be like array[x][y] += q. Where x and y are the spot in the array
and q is the value.
You don't use push_back() to add a value between two values, you use insert().
The syntax array[x][y] += q will certainly not do what you describe. It will add q to the value at the position array[x][y].
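To show the difference between the two operations on a one-dimensional vector, here is a small sketch. The function name insert_then_add and the values are made up for the example.

```cpp
#include <vector>

// Hypothetical demo: insert() adds a new element between existing ones,
// while += on an indexed element only changes a value that is already there.
std::vector<int> insert_then_add()
{
    std::vector<int> v = {1, 2, 4};
    v.insert(v.begin() + 2, 3);  // v is now {1, 2, 3, 4}; size grew by one
    v[0] += 10;                  // v is now {11, 2, 3, 4}; size unchanged
    return v;
}
```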
Arrays are different to std::vector because they are of a fixed size. All elements of the array exist while the array exists. When you create a std::vector with its default constructor, it is empty. It contains no elements, so you cannot index any elements.
However, std::vector does have a constructor that takes the initial size. If you pass a single integer argument to the std::vector constructor, it will value-initialise that many elements (for int, that means zero). For example:
std::vector<int> v(10); // Will have 10 ints
If you want the equivalent of a 2D array, then you'll need a std::vector<std::vector<T>>. If you want to construct it with a specific size, you will need to specify the size of the outer std::vector as above, and pass it the std::vector that each element should be initialised to. For example, if you want a 10x20 vector:
// This will have 10x20 ints
std::vector<std::vector<int>> v(10, std::vector<int>(20));
Once these elements exist, you can index them just as you would an array:
int value = v[x][y];
It's worth noting that C++11 introduces std::array which has a compile-time fixed size. You could use it like this:
std::array<std::array<int, 20>, 10> arr;
However, you cannot use this if you want your array size to be determined by a variable. The dimensions must be compile-time constants.
Edit: The below question was answered by this. I have a new updated question, is it any more efficient to use: (my friend said it is inefficient to put a vector of a vector because it uses sequential memory and to realloc when you push_back means it takes more time to find the location where a chunk of memory for the entire large vector can be placed)
(where Picture is a vector of lines, Line is a vector of points)
std::vector<Point> *LineVec;
std::vector<Line> PictureVec;
versus
std::vector<Point> LineVec;
std::vector<Line> PictureVec;
struct Point{
int x;
int y;
};
I'm trying to get a vector of a vector and my friend told me that it's inefficient to put a vector of a vector because it uses sequential memory and vector of a vector will require huge amounts of space. So what he suggested was a using a vector of a pointer vector. Therefore the inner vector looks like this. Clearly I'm very new to C++ and would appreciate any insight.
struct Shape{
int c;
int d;
};
std::vector<Shape> *intvec;
When I want to push back into this, how would I do so? Something like this?
Shape s;
s.c=1;
s.d=1;
intvec->push_back(s);
Also, I wrote an iterator to go through, however it does not seem to work, hence why I believe the above code does not work. Finally my last concern is, while the above code works, it gives really weird values for my output. Large numbers that are 7 digits long and definitely not the values I put in for s.c and s.d
for(std::vector<Shape>::iterator it=Shapes->begin();it<Shapes->end();it++){
Shape s = (*it);
std::cout << s.c << s.d << std::endl;
}
Using a vector of pointers to vectors is not more efficient than a vector of vectors. It's less efficient, because it introduces an extra level of indirection. It also does not cause all elements of the resulting 2-d array to be allocated contiguously.
The reason is that a vector is practically a pointer to an array, in the sense that a vector<T> is implemented roughly as
template <typename T>
class vector
{
T *p; // pointer to array of elements
size_t nelems, capacity;
public:
// interface
};
so that a vector of vectors behaves, performance-wise, like a dynamic array of pointers to arrays.
[Note: I can't quote the C++ standard chapter and verse, but I'm pretty sure it constrains std::vector's operations and complexity in such a way that the above is the only practical way of implementing it.]
As to your updated question about whether or not it is more efficient to use a pointer to a vector over a vector itself: in some cases it is more efficient to use a pointer (or reference) to a vector rather than the actual vector itself. A specific example would be using a vector as a parameter for a function.
EX:
void somefunction(std::vector<int> hello)
In this case the copy constructor for std::vector is invoked any time this function is called (which copies the vector completely, INCLUDING the elements contained in the vector). Passing by reference gets rid of this extra copy.
As for whether push_back itself is more efficient when using a pointer to a vector: no, it's not more efficient to use a pointer (they should be roughly equivalent time-wise).
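The parameter-passing point can be seen in a small sketch. The two function names below are invented for the comparison.

```cpp
#include <vector>

// Hypothetical pair of functions. The first receives its own copy of the
// vector (copy constructor runs, all elements are duplicated); the second
// works directly on the caller's vector with no copy.
void append_by_value(std::vector<int> v)
{
    v.push_back(1);  // modifies the local copy only
}

void append_by_reference(std::vector<int>& v)
{
    v.push_back(1);  // modifies the caller's vector
}
```

If the function only needs to read the vector, a const std::vector<int>& parameter avoids the copy as well and documents that nothing is modified.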
If I want to declare a vector of unknown size, then assign values to index 5, index 10, index 1, index 100, in that order. Is it easily doable in a vector?
It seems there's no easy way, because if I initialize a vector without a size, then I can't access index 5 without first allocating memory for it by doing resize() or six push_back()'s. But resize clears previously stored values in a vector. I can construct the vector by giving it a size to begin with, but I don't know how big the vector should be.
So how can I not have to declare a fixed size, and still access non-continuous indices in a vector?
(I doubt an array would be easier for this task).
Would an std::map between integer keys and values not be an easier solution here? Vectors will require a contiguous allocation of memory, so if you're only using the occasional index, you'll "waste" a lot of memory.
Resize doesn't clear the vector. You can easily do something like:
if (v.size() <= n)
v.resize(n+1);
v[n] = 42;
This will preserve all values in the vector and add just enough default initialized values so that index n becomes accessible.
That said, if you don't need all indexes or contiguous memory, you might consider a different data structure.
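Wrapped in a function and applied to the indices from the question, the resize approach could look like this. The name set_at is made up for the sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: grows the vector just enough for index n to exist,
// then assigns. Existing elements are kept; new ones are value-initialised (0).
void set_at(std::vector<int>& v, std::size_t n, int value)
{
    if (v.size() <= n)
        v.resize(n + 1);
    v[n] = value;
}
```

Calling it for index 5, then 10, then 1 leaves a vector of size 11 in which the earlier assignments are still intact and the untouched slots hold 0.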
resize() doesn't clear previously stored values in a vector.
see this documentation
I would also argue that if this is what you need to do, then it's possible that vector is not the container for you. Did you consider using a map, maybe?
Data structures which do not contain a contiguous set of values are known as sparse or compressed data structures. It seems that this is what you are looking for.
If this is the case, you want a sparse vector. There is one implemented in Boost.
Sparse structures are typically used to conserve memory. It is possible from your problem description that you don't actually care about memory use, but about addressing elements that don't yet exist (you want an auto-resizing container). In this case a simple solution with no external dependencies is as follows:
Create a template class that holds a vector and forwards all vector methods to it. Change your operator[] to resize the vector if the index is out of bounds.
// A vector that resizes on dereference if the index is out of bounds.
template<typename T>
struct resize_vector
{
typedef typename std::vector<T>::size_type size_type;
typedef typename std::vector<T>::value_type value_type;
// ... Repeat for iterator typedefs etc
size_type size() const { return m_impl.size(); }
// ... Repeat for all other vector methods you want
value_type& operator[](size_type i)
{
if (i >= size())
m_impl.resize(i + 1); // Grow so that index i exists
return m_impl[i];
}
// You may want a const overload of operator[] that throws
// instead of resizing (or make m_impl mutable, but thats ugly).
private:
std::vector<T> m_impl;
};
As noted in other answers, elements aren't cleared when a vector is resized. Instead, when new elements are added by a resize, their default constructor is called. You therefore need to know when using this class that operator[] may return you a default constructed object reference. Your default constructor for <T> should therefore set the object to a sensible value for this purpose. You may use a sentinel value for example, if you need to know whether the element has previously been assigned a value.
The suggestion to use a std::map<size_t, T> also has merit as a solution, provided you don't mind the extra memory use, non-contiguous element storage and O(logN) lookup rather than O(1) for the vector. This all boils down to whether you want a sparse representation or automatic resizing; hopefully this answer covers both.
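For comparison, a minimal sketch of the std::map route using the indices from the question. The function name make_sparse and the stored values are invented for the example.

```cpp
#include <cstddef>
#include <map>

// Hypothetical demo: only the indices that are actually assigned take up
// space; operator[] creates an entry on first use.
std::map<std::size_t, int> make_sparse()
{
    std::map<std::size_t, int> values;
    values[5] = 50;
    values[10] = 100;
    values[1] = 10;
    values[100] = 1000;
    return values;
}
```

Here the map holds four entries rather than 101 slots. Use find() or count() to test whether an index was ever assigned, because reading through operator[] would silently insert a default value.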