std::vector<float> to std::vector<glm::vecX> without copying - c++

I am writing a small toy game engine using Tinyobjloader for loading .obj files. I store the vertex data and everything using glm::vecX to make things easier.
Tinyobjloader gives me an std::vector<float>, when I want an std::vector<glm::vecX>. How would I do this without copying?
To be clear, a glm::vecX is a simple struct containing, for example, the float members x, y, z.
I was thinking that since structs can behave a bit like arrays, that std::move would work, but no luck.
Thanks!
Edit:
I know I wasn't clear about this, sorry. I would like to either move the std::vector<float> into an std::vector<glm::vecX> or pass it as a std::vector<glm::vecX>&.
Copying the data using std::memcpy works fine, but it copies the data, which I would like to avoid.

It may be possible to directly interpret the contents of the vector as instances of the struct, without having to copy the data. If you can guarantee the representation is compatible, that is. The contents of a vector<float> are laid out in memory as a sequence of float values directly following each other (an array) with no extra padding, while the contents of a vector<glm::vecX> are laid out as a sequence of vecX. Thus, you need to ensure the following conditions hold:
That glm::vecX is exactly the size of X floats, with no padding. Depending on the declaration of the struct, this may be platform-dependant.
That the contents of the vector<float> are in the correct sequence, i.e. as [x1,y1,z1, x2,y2,z2, ...] for a vec3 instead of [x1,x2,...,xN,y1,y2...].
In that case, you can safely reinterpret the data pointer of the float vector as pointer to an array of vecX as in this example:
std::vector<float> myObjData = ...;
auto nVecs = myObjData.size() / 3; // You should check that there are no remainders!
glm::vec3* vecs = reinterpret_cast<glm::vec3*>(myObjData.data());
std::cout << vecs[0]; // Use vecs[0..nVecs-1]
You cannot, however, safely reinterpret the vector itself as a vector of glm::vecX, not even as a const vector, because the number of elements stored in the vector might not be consistent after the reinterpretation. It depends on whether the vector<T> code stores the number of elements directly, or the number of allocated bytes (and then size() divides that by sizeof(T)):
// Don't do this, the result of .size() and .end() may be wrong!
const std::vector<glm::vec3>& bad = *reinterpret_cast<std::vector<glm::vec3>*>(&myObjData);
bad[bad.size()-1].z = 0; // Potential BOOM!
Most of the time, however, you don't need to pass an actual vector, since most functions in the standard library accept a container range, which is easy to give for arrays like the one in the first example. So, if you wanted to sort your vec3 array based on z position, and then print it out you would do:
// nVecs and vecs from the first example
std::sort(vecs, vecs+nVecs, // Sort by Z position
[](const glm::vec3& a, const glm::vec3& b) { return a.z < b.z; });
std::copy(vecs, vecs+nVecs, std::ostream_iterator<glm::vec3>(std::cout, "\n"));

In short: It is - to the best of my knowledge - not possible without copying.
And in my opinion, std::memcpy has no business being used with std::vector.

Related

Zero overhead subscript operator for a set of values

Assume we have a function with the following signature (the signature may not be changed, since this function is part of a legacy API):
void Foo(const std::string& s, float v0, float v1, float v2)
{ ... }
How can one access the last three arguments by index using the subscript operator [] without actually copying the data into some sort of container?
Regularly when I come across this kind of issue I put the values in a container, like const std::array<float,3> args{v0,v1,v2}; and access these values using args[0], which unfortunately needs to copy the values.
Another idea would be to access the arguments using a parameter pack, which in turn involves the creation of a templated function which seems to be overkill for this task.
I'm aware that the version using the std::array<> might be suitable since the compiler probably will optimize this kind of stuff, however, this question is kind of academically motivated.
You can't. Not in a way that guarantees zero overhead, or overhead similar to that of array subscripting.
You could, of course, do something like float* vs[]{&v0, &v1, &v2};, and then dereference the result of vs[i]. For that matter, you could make a utility class to act as a transparent reference (to try to get around arrays of references being illegal), though the result is inevitably limited.
The ultimate problem, though, is that nothing in the standard guarantees (or even suggests) that function arguments be stored in any particular memory ordering. On most platforms, at least one of those floats is going to be in a register, meaning that there's just no way to natively subscript it.
If a group of objects does not start out as an array, it's not possible to treat them as an array.
Another idea would be to access the arguments using a parameter pack, which in turn involves the creation of a templated function which seems to be overkill for this task.
Not necessarily. One thing you can do is use std::tie to build a std::tuple of references to the function parameters and then access that tuple via std::get. That should optimize out, but let you refer to the parameters as if they are part of a single collection. That would look like
void Foo(const std::string& s, float v0, float v1, float v2)
{
auto args = std::tie(v0, v1, v2);
std::cout << std::get<1>(args);
}
It's not using operator [], and requires your indices be compile time constants, but you can now pass them to something else as one object.
Danger Wil Robinson! Danger!
This is going to be horribly implementation dependent, and an all around bad idea! This relies on undefined behavior. Less awful with set hardware and tools, but less as in "we're only going to eat 5 babies, not a full dozen".
Those three floats are on the stack next to each other. I don't know if there are any packing rules for the stack. I don't know which order on the the stack they'll be ("v0 v1 v2" vs "v2 v1 v0"). Hell, some optimized build might even put them in a different order just to optimize some oddball case that doesn't actually come up in real life. I dunno. But I suspect something like this will work.
void Foo(const std::string& s, float v0, float v1, float v2)
{
float* vp = &v2;
for (int i = 0; i < 3; ++i)
{
printf("%f\n", vp[i]);
}
}
void main(void)
{
Foo("", 1.0f, 2.0f, 3.0f);
}
3.0000
2.0000
1.0000
So it is possible. It's also ugly, vile, evil, and probably both fattening and carcinogenic.
On GodBolt.org, using gcc x86-64 9.3, the above code worked fine. In VS2017 intel/64, I had to use float* vp = &v0 and for (int i = 0; i < 5; i += 2). Different alignment, different order, and different output (1, 2, 3, not 3, 2 1).
I'm pretty sure I just consigned my soul to the Nth circle of hell.

Copying vector elements to a vector pair

In my C++ code,
vector <string> strVector = GetStringVector();
vector <int> intVector = GetIntVector();
So I combined these two vectors into a single one,
void combineVectors(vector<string>& strVector, vector <int>& intVector, vector < pair <string, int>>& pairVector)
{
for (int i = 0; i < strVector.size() || i < intVector.size(); ++i )
{
pairVector.push_back(pair<string, int> (strVector.at(i), intVector.at(i)));
}
}
Now this function is called like this,
vector <string> strVector = GetStringVector();
vector <int> intVector = GetIntVector();
vector < pair <string, int>> pairVector
combineVectors(strVector, intVector, pairVector);
//rest of the implementation
The combineVectors function uses a loop to add the elements of other 2 vectors to the vector pair. I doubt this is a efficient way as this function gets called hundrands of times passing different data. This might cause a performance issue because everytime it goes through the loop.
My goal is to copy both the vectors in "one go" to the vector pair. i.e., without using a loop. Am not sure whether that's even possible.
Is there a better way of achieving this without compromising the performance?
You have clarified that the arrays will always be of equal size. That's a prerequisite condition.
So, your situation is as follows. You have vector A over here, and vector B over there. You have no guarantees whether the actual memory that vector A uses and the actual memory that vector B uses are next to each other. They could be anywhere.
Now you're combining the two vectors into a third vector, C. Again, no guarantees where vector C's memory is.
So, you have really very little to work with, in terms of optimizations. You have no additional guarantees whatsoever. This is pretty much fundamental: you have two chunks of bytes, and those two chunks need to be copied somewhere else. That's it. That's what has to be done, that's what it all comes down to, and there is no other way to get it done, other than doing exactly that.
But there is one thing that can be done to make things a little bit faster. A vector will typically allocate memory for its values in incremental steps, reserving some extra space, initially, and as values get added to the vector, one by one, and eventually reach the vector's reserved size, the vector has to now grab a new larger block of memory, copy everything in the vector to the larger memory block, then delete the older block, and only then add the next value to the vector. Then the cycle begins again.
But you know, in advance, how many values you are about to add to the vector, so you simply instruct the vector to reserve() enough size in advance, so it doesn't have to repeatedly grow itself, as you add values to it. Before your existing for loop, simply:
pairVector.reserve(pairVector.size()+strVector.size());
Now, the for loop will proceed and insert new values into pairVector which is guaranteed to have enough space.
A couple of other things are possible. Since you have stated that both vectors will always have the same size, you only need to check the size of one of them:
for (int i = 0; i < strVector.size(); ++i )
Next step: at() performs bounds checking. This loop ensures that i will never be out of bounds, so at()'s bound checking is also some overhead you can get rid of safely:
pairVector.push_back(pair<string, int> (strVector[i], intVector[i]));
Next: with a modern C++ compiler, the compiler should be able to optimize away, automatically, several redundant temporaries, and temporary copies here. It's possible you may need to help the compiler, a little bit, and use emplace_back() instead of push_back() (assuming C++11, or later):
pairVector.emplace_back(strVector[i], intVector[i]);
Going back to the loop condition, strVector.size() gets evaluated on each iteration of the loop. It's very likely that a modern C++ compiler will optimize it away, but just in case you can also help your compiler check the vector's size() only once:
int i=strVector.size();
for (int i = 0; i < n; ++i )
This is really a stretch, but it might eke out a few extra quantums of execution time. And that pretty much all obvious optimizations here. Realistically, the most to be gained here is by using reserve(). The other optimizations might help things a little bit more, but it all boils down to moving a certain number of bytes from one area in memory to another area. There aren't really special ways of doing that, that's faster than other ways.
We can use std:generate() to achieve this:
#include <bits/stdc++.h>
using namespace std;
vector <string> strVector{ "hello", "world" };
vector <int> intVector{ 2, 3 };
pair<string, int> f()
{
static int i = -1;
++i;
return make_pair(strVector[i], intVector[i]);
}
int main() {
int min_Size = min(strVector.size(), intVector.size());
vector< pair<string,int> > pairVector(min_Size);
generate(pairVector.begin(), pairVector.end(), f);
for( int i = 0 ; i < 2 ; i++ )
cout << pairVector[i].first <<" " << pairVector[i].second << endl;
}
I'll try and summarize what you want with some possible answers depending on your situation. You say you want a new vector that is essentially a zipped version of two other vectors which contain two heterogeneous types. Where you can access the two types as some sort of pair?
If you want to make this more efficient, you need to think about what you are using the new vector for? I can see three scenarios with what you are doing.
The new vector is a copy of your data so you can do stuff with it without affecting the original vectors. (ei you still need the original two vectors)
The new vector is now the storage mechanism for your data. (ei you
no longer need the original two vectors)
You are simply coupling the vectors together to make use and representation easier. (ei where they are stored doesn't actually matter)
1) Not much you can do aside from copying the data into your new vector. Explained more in Sam Varshavchik's answer.
3) You do something like Shakil's answer or here or some type of customized iterator.
2) Here you make some optimisations here where you do zero coping of the data with the use of a wrapper class. Note: A wrapper class works if you don't need to use the actual std::vector < std::pair > class. You can make a class where you move the data into it and create access operators for it. If you can do this, it also allows you to decompose the wrapper back into the original two vectors without copying. Something like this might suffice.
class StringIntContainer {
public:
StringIntContaint(std::vector<std::string>& _string_vec, std::vector<int>& _int_vec)
: string_vec_(std::move(_string_vec)), int_vec_(std::move(_int_vec))
{
assert(string_vec_.size() == int_vec_.size());
}
std::pair<std::string, int> operator[] (std::size_t _i) const
{
return std::make_pair(string_vec_[_i], int_vec_[_i]);
}
/* You may want methods that return reference to data so you can edit it*/
std::pair<std::vector<std::string>, std::vector<int>> Decompose()
{
return std::make_pair(std::move(string_vec_), std::move(int_vec_[_i])));
}
private:
std::vector<std::string> _string_vec_;
std::vector<int> int_vec_;
};

How to construct a new vector/set of pointer from another vector/set of object?

Background
I wanted to manipulate the copy of a vector, however doing a vector copy operation on each of its element is normally expensive operation.
There are concept called shallow copy which I read somewhere is the default copy constructor behavior. However I'm not sure why it doesn't work or at least I tried to do the copy of vector object and the result looks like a deep copy.
struct Vertex{
int label;
Vertex(int label):label(label){ }
};
int main(){
vector<Vertex> vertices { Vertex(0), Vertex(1) };
// I Couldn't force this to be vector<Vertex*>
vector<Vertex> myvertices(vertices);
myvertices[1].label = 123;
std::cout << vertices[1].label << endl;
// OUTPUT: 1 (meaning object is deeply copied)
return 0;
}
Naive Solution: for pointer copy.
int main(){
vector<Vertex> vertices { Vertex(0), Vertex(1) };
vector<Vertex*> myvertices;
for (auto it = vertices.begin(); it != vertices.end(); ++it){
myvertices.push_back(&*it);
}
myvertices[1].label = 123;
std::cout << vertices[1].label << endl;
// OUTPUT: 123 (meaning object is not copied, just the pointer)
return 0;
}
Improvement
Is there any other better approach or std::vector API to construct a new vector containing just the pointer of each of the elements in the original vector?
One way you could transform a vector of elements to a vector of pointers that point to the elements of the original vector that is better in terms of efficiency compared to your example, due to the fact that it preallocates the buffer of the vector of pointers, and IMHO more elegant is via using std::transform as follows:
std::vector<Vertex*> myvertices(vertices.size());
std::transform(vertices.begin(), vertices.end(), myvertices.begin(), [](Vertex &v) { return &v; });
Live Demo
Or if you don't want to use a lambda for the unary operator:
std::vector<Vertex*> myvertices(vertices.size());
std::transform(vertices.begin(), vertices.end(), myvertices.begin(), std::addressof<Vertex>);
Live Demo
Caution: If you alter the original vector then you invalidate the pointers in the pointers' vector.
Thanks for #kfsone for noticing on the main problem that it is very uncommon people wanted to keep track of pointer from another vector of object without utilizing the core idea behind it. He provided an alternative approach that solve similar problem by using bit masking. It may not be obvious for me at first until he mentioned that.
When we are trying to store just the pointers of another vector, we are most probably wanted to do some tracking, house keeping (keeping track) of another object. Which later to be performed on the pointer itself without touching the original data. For my case, I'm solving a minimum vertex cover problem via bruteforce approach. Whereby I will need to generate all permutation of vertices (e.g. 20 vertices will generate 2**20=1million++ permutation), then I trim down all irrelevant permutation by slowly iterating each of the vertices in the vertex cover and remove edges that are covered by the vertices. In doing so, my first intuition is to copy all pointers to ensure efficiency and later i could just remove the pointer one by one.
However, another way of looking into this problem is not to use vector/set at all, but rather just keep track each of those pointer as a bit pattern. I won't go in the detail but feel free to learn from others.
The performance difference is very significant such that in bitwise, you can achieve O(1) constant time without much problem, whereas using a specific container, you tend to have to iterate each of the elements which bound your algorithm to O(n). To make it worst, if you are bruteforcing NP hard problem, you need to keep the constant factor as low as possible, and from O(1) to O(N) is a huge difference in such scenario.

Retuning multiple vectors from a function in c++?

I want to return multiple vectors from a function.
I am not sure either tuple can work or not. I tried but is not working.
xxx myfunction (vector<vector<float>> matrix1 , vector<vector<float>> matrix2) {
// some functional code: e.g.
// vector<vector<float>> matrix3 = matrix1 + matrix2;
// vector<vector<float>> matrix4 = matrix1 - matrix2;
return matrix3, matrix4;
If these matrices are very small then this approach might be OK, but generally I would not do it this way. First, regardless of their size, you should pass them in by const reference.
Also, std::vector<std::vector<T>> is not a very good "matrix" implementation - much better to store the data in a contiguous block and implement element-wise operations over the entire block. Also, if you are going to return the matrices (via a pair or other class) then you'll want to look into move semantics as you don't want extra copies.
If you are not using C++11 then I'd pass in matrices by reference and fill them in the function; e.g.
using Matrix = std::vector<std::vector<float>>; // or preferably something better
void myfunction(const Matrix &m1, const Matrix &m2, Matrix &diff, Matrix &sum)
{
// sum/diff clear / resize / whatever is appropriate for your use case
// sum = m1 + m2
// diff = m1 - m2
}
The main issue with functional style code, e.g. returning std::tuple<Matrix,Matrix> is avoiding copies. There are clever things one can here to avoid extra copies but sometimes it is just simpler, IMO, to go with a less "pure" style of coding.
For Matrices, I normally create a Struct or Class for it that has these vectors, and send objects of that class in to the function. It would also help to encapsulate Matrix related operations inside that Class.
If you still want to use vector of vector, here is my opinion. You could use InOut parameters using references/pointers : Meaning, if the parameters can be updated to hold results of calculation, you would be sending the arguments in, and you would not have to return anything in that case.
If the parameters need to be const and cannot be changed, then I normally send In parameters as const references, and separate Out parameters in the function argument list itself.
Hope this helps a bit.

How to get dimensions of a multidimensional vector in C++

all
I am using multidimensional STL vector to store my data in C++. What I have is a 3D vector
vector<vector<vector<double>>> vec;
What I want to retrieve from it is :
&vec[][1][]; // I need a pointer that points to a 2D matrix located at column 1 in vec
Anyone has any idea to do so? I would be extremly appreciate any help!
Regards
Long
It is best to consider vec just as a vector whose elements happen to be
vectors-of-vectors-of-double, rather than as a multi-dimensional structure.
You probably know, but just in case you don't I'll mention it,
that this datatype does not necessarily represent a rectangular cuboid.
vec will only have that "shape" if you ensure that all the vectors are
the same size at each level. The datatype is quite happy for the vector vec[j]
to be a different size from the one at vec[k] and likewise for vec[j][n]
to be a vector of different size from vec[j][m], so that your structure is "jagged".
So you want to get a pointer to the vector<vector<double>> that is at
index 1 in vec. You can do that by:
vector<vector<double>> * pmatrix = &vec[1];
However this pointer will be an unnecessarily awkward means of accessing that
vector<vector<double>>. You certainly won't be able to write the
like of:
double d = pmatrix[j][k];
and expect to get a double at coordinates (j,k) in the "matrix addressed
by a pmatrix". Because pmatrix is a pointer-to-a-vector-of-vector-of-double;
so what pmatrix[j] refers to is the vector-of-vector-of-double (not vector-of-double)
at index j from pmatrix, where the index goes in steps of
sizeof(vector<vector<double>>). The statement will reference who-knows-what
memory and very likely crash your program.
Instead, you must write the like of:
double d = (*pmatrix)[j][k];
where (*pmatrix) gives you the vector-of-vector-of-double addressed by pmatrix,
or equivalently but more confusingly:
double d = pmatrix[0][j][k];
Much simpler - and therefore, the natural C++ way - is to take a reference,
rather than pointer, to the vector<vector<double>> at index 1 in vec. You
do that simply by:
vector<vector<double>> & matrix = vec[1];
Now matrix is simply another name for the vector<vector<double>> at index 1 in vec,
and you can handle it matrix-wise just as you'd expect (always assuming
you have made sure it is a matrix, and not a jagged array).
Another thing to consider was raised in a comment by manu343726. Do you
want the code that receives this reference to vec[1] to be able to
use it to modify the contents of vec[1] - which would include changing its
size or the size of any of the vector<double>s within it?
If you allow modification, that's fine. If you don't then you want to get
a const reference. You can do that by:
vector<vector<double> > const & matrix = vec[1];
Possibly, you want the receiving code to be able to modify the doubles
but not the sizes of the vectors that contain them? In that case, std::vector
is the wrong container type for your application. If that's your position I
can update this answer to offer alternative containers.
Consider using matrix from some linear algebra library. There are some directions here