I am learning C++ and I have trouble with pointers to structures stored in vector. The problem is that I need to keep the structure Student sorted twice. Once by student's id and another time by student's name, so it is easy to search the values in it. Because of this I created two vectors of pointers:
vector<Student *> sortedByID;
vector<Student *> sortedByName;
The structure looks like this and I keep it in vector as well (even though it is probably not a good idea):
struct Student {
int id;
string name;
};
vector <Student> students;
I am creating the new struct with push_back and filling it with parameters from a function (yes, I have a constructor). To keep the vector of pointers sorted I am using lower_bound as shown below:
students.push_back(Student(id, name));
it = lower_bound(sortedByID.begin(), sortedByID.end(), id, cmp());
sortedByID.insert(it, &(students.back()));
//the same for name
The problem is that every time I add a structure with push_back, the vector may reallocate its storage and move the existing objects to new addresses, so the pointers in sortedByID end up pointing to invalid memory. I think it would be the same with an array of structs, because once the array is full, there is no other way (as far as I know) to resize it than to create a new array and copy all the data from the previous one (so the addresses will change again).
Is there any clever way to solve this problem? Please note that I am only allowed to use vector and not any other containers from STL.
There are three options to solve this issue using only vectors and no other containers:
1) Avoid the reallocation. This is only possible if you know the maximum number M of elements that will ever be inserted into the vector. In that case, call students.reserve(M); before inserting anything.
2) Forget the pointers in sortedByID and sortedByName. Store the index of each student in students (as an integer, or better, a size_t) instead of a pointer. This assumes, of course, that the order of items in students never changes.
3) Do not store the students themselves in a vector. Instead, make students an (unsorted) vector of pointers to students that are allocated from the free store. If this alternative meets all your criteria, I'd suggest going for shared_ptr<Student> instead of raw pointers.
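Option 2 can be sketched roughly as follows. This is only an illustration under assumptions: the wrapper class StudentIndex and its member names add and findById are hypothetical, and the comparators are written as lambdas instead of the cmp functor from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Student {
    int id;
    std::string name;
};

// Hypothetical wrapper (not from the question): the two sorted vectors hold
// indices into `students`, so they stay valid when `students` reallocates.
class StudentIndex {
public:
    void add(int id, const std::string& name) {
        students.push_back(Student{id, name});
        const std::size_t pos = students.size() - 1;

        // Compare the student behind each stored index with the new key.
        auto byId = std::lower_bound(sortedByID.begin(), sortedByID.end(), id,
            [this](std::size_t i, int key) { return students[i].id < key; });
        sortedByID.insert(byId, pos);

        auto byName = std::lower_bound(sortedByName.begin(), sortedByName.end(), name,
            [this](std::size_t i, const std::string& key) { return students[i].name < key; });
        sortedByName.insert(byName, pos);

        assert(sortedByID.size() == students.size());
    }

    // Binary search by id. Returns nullptr if there is no such student.
    // The returned pointer is only meant for immediate use: it can be
    // invalidated by the next call to add().
    const Student* findById(int id) const {
        auto it = std::lower_bound(sortedByID.begin(), sortedByID.end(), id,
            [this](std::size_t i, int key) { return students[i].id < key; });
        if (it == sortedByID.end() || students[*it].id != id) return nullptr;
        return &students[*it];
    }

private:
    std::vector<Student> students;          // insertion order, never reordered
    std::vector<std::size_t> sortedByID;    // indices into students, ordered by id
    std::vector<std::size_t> sortedByName;  // indices into students, ordered by name
};
```

A lookup by name would follow the same pattern as findById, using sortedByName and a string comparison.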
I have an array of a struct, lets say
struct cell{
int pos;
int id;
};
std::vector<cell> myArray;
I want an array of the id element. I can't just iterate over my array as it would take too long.
I have to provide std::vector<int> to a function.
My thought process was: since arrays are usually just a pointer to the first element plus an offset, I thought of creating an array where I can provide the offset, so that each step would point to the id element of the next cell in std::vector<cell> myArray.
One solution I can think of is having an array of pointers to that element. The final solution might be something like:
struct cell{
int pos;
int id;
};
std::vector<cell> myArray;
std::vector<int*> pointersToIds;
// Creating an array of int from an array of int*
std::vector<int> idsArray = std::something(pointersToIds);
myFunc(idsArray);
Since the std library has tons of stuff I supposed there would be a way to do this.
Is there a way to convert the array of pointers to an actual array of elements in a very optimized way? The pointers approach was the only one I could think of, but it doesn't necessarily have to be that.
Thank you all in advance :)
I tried iterating over the array of pointers and creating an array of elements, but it would take too much time.
TLDR Get array of an element from array of struct
I suppose this might be an instance of The XY Problem, since it's not clear what you are actually trying to solve, do you:
Want to find a fast way to pass the list of struct to a function
Want a way to extract all the members from a list of struct into a list of members
First off, shoo away from your mind the idea of manually creating an array of addresses and then fiddling around with the offsets yourself. This is certainly doable, but it is hard to do in a safe and portable way because of struct alignment and padding, which differ from platform to platform.
Besides, accessing cell.id already does exactly that, in a portable way!
Problem 1.
If you want to pass a vector (or any object, really) to a function in a fast way, you can use a reference. It would look something like this:
void foo(std::vector<cell>& in_vec);
Notice the & in the parameter declaration: it declares that in_vec is passed by reference. Internally this passes the address of in_vec, which avoids copying the values one by one. C++ does all this by itself, so you can treat in_vec like a normal object inside the function, and it's fast. If foo only reads the vector, declare the parameter as const std::vector<cell>& instead.
Problem 2.
If your point is that you want to extract all the IDs before passing them to a function: first off, I still suggest you pass the cells. That way it is clear that foo is supposed to operate on cell IDs and not on arbitrary integers. Unpacking the structs outside the function requires an iteration over all of them, whereas inside it you might not even need to access every cell, depending on foo's nature, so extracting first costs at least as much and possibly more.
If you must carry through, it's as easy as a for loop:
std::vector<int> ids;
for(auto const& cell : myArray)
{
ids.push_back(cell.id);
}
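Or, if you want an elegant and modern solution, using lambdas and <algorithm>:
#include <algorithm>
#include <iterator> // std::back_inserter
std::vector<int> ids;
ids.reserve(myArray.size()); // one allocation up front instead of repeated growth
std::transform(myArray.begin(), myArray.end(),
    std::back_inserter(ids), [](cell const& c) {
        return c.id;
    });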
Or something to this effect.
For my latest assignment, I need to create a hash table that houses stocks, which are encapsulated in a class. To resolve collisions, I need to use linear probing. The problem I've run into, however, is that I can't test whether or not an element of the array (which is the hash table) is empty.
Here's some code aggregated from several files, but this is just to give you an idea of what's going on.
class Stock{
    friend class HashMap;
};
class HashMap{
    bool get(); //this function is used for putting new stocks into the table
private:
    struct Slot {
        Stock slotStock;
    };
    Slot *slots;
};
Within the get() function
while(slots[index] != NULL)
This gives an error: no operator "!=" matches these operands HashMap::Slot != int
What alternative way would there be for me to check whether or not a slot is empty?
The array is allocated dynamically.
EDIT: When I initialize the array, does it use the default constructor to create an object for each element of the array, or does it leave the elements empty?
If you have an array of objects of type X, none of the slots are "empty". They all contain an object of type X. To represent an empty object, it needs to be a possible state of the type which is stored in the array. You could, for example, have boost::optional<Slot>, or std::unique_ptr<Slot>. Otherwise, you can encode the state directly into your Slot class (with a bool member, for example).
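The last suggestion (encoding the state in Slot with a bool member) might look like the sketch below. Several things here are assumptions and not taken from the question: Stock is reduced to a stand-in with a symbol and a price, the table is keyed on the symbol, the member functions are called put and find, and a std::vector<Slot> replaces the manually allocated Slot*. Removal is left out, since linear probing would then also need a "deleted" marker.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Stand-in for the question's Stock class; its real members are not shown there.
struct Stock {
    std::string symbol;
    double price;
};

class HashMap {
public:
    explicit HashMap(std::size_t capacity) : slots(capacity) {
        assert(capacity > 0);
    }

    // Inserts the stock, or overwrites one with the same symbol.
    // Returns false only when every slot is taken by other symbols.
    bool put(const Stock& s) {
        std::size_t index = std::hash<std::string>{}(s.symbol) % slots.size();
        for (std::size_t probes = 0; probes < slots.size(); ++probes) {
            Slot& slot = slots[index];
            if (!slot.occupied || slot.slotStock.symbol == s.symbol) {
                slot.slotStock = s;
                slot.occupied = true;
                return true;
            }
            index = (index + 1) % slots.size();  // linear probing: try the next slot
        }
        return false;
    }

    // Returns nullptr if the symbol is not in the table.
    const Stock* find(const std::string& symbol) const {
        std::size_t index = std::hash<std::string>{}(symbol) % slots.size();
        for (std::size_t probes = 0; probes < slots.size(); ++probes) {
            const Slot& slot = slots[index];
            if (!slot.occupied) return nullptr;  // an empty slot ends the probe chain
            if (slot.slotStock.symbol == symbol) return &slot.slotStock;
            index = (index + 1) % slots.size();
        }
        return nullptr;
    }

private:
    struct Slot {
        Stock slotStock;
        bool occupied = false;  // the "empty" state lives in the slot itself
    };
    std::vector<Slot> slots;    // every element is a Slot object from the start
};
```

This also answers the EDIT in the question: each element of the array is a fully constructed Slot, and the occupied flag is what distinguishes a used slot from an unused one.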
What you want to do is store an array of Stock pointers, whereas you are currently storing an array of Slot objects. To make things even easier on yourself, you can use a vector to store the pointers.
Your backing data structure would look like this:
std::vector< Stock* > vecStocks;
Each item in the vector is a "slot", and you do not need your Slot class unless you intend to store some metadata about the stock.
To check whether or not you have a stock in any slot of the vector, you compare the vector item to NULL like this:
if (vecStocks[index] == NULL)
This approach has the positive side effect of not needing to allocate X number of Stock objects up front, where X is the size of your hashmap (potentially a very large number, depending on how often you like to collide).
Say I have a simple contiguous array or vector containing some elements of type T
std::vector<T> someVector;
I have several raw pointers to the insides of the vector distributed around the application.
T* pointerOne = &someVector[5];
T* another = &someVector[42];
T* evenMore = &someVector[55];
However, the elements in the vector sometimes move around in the application, which can invalidate pointers (as in: doesn't point to what it's supposed to point at anymore):
std::swap(someVector[4],someVector[5]); //Oops! pointerOne now essentially points to whatever was in someVector[4], and the correct object that was in someVector[5] has moved places
What's an efficient (in terms of performance and memory footprint [although that probably goes hand in hand]) system for keeping these pointers updated when the contents of the array move around?
Some notes:
elements switch their positions very infrequently. num(location changes) << num(accesses to elements). This means that I'd like to keep pointers which are updated instead of introducing some other system that abstracts this problem away, because dereferencing a pointer is as fast as I can get in the application, and performance is very important here.
all of the Ts will always be inside a contiguous array. It won't at some point in development change to become some other container type, like a map.
I do know (and can modify) the code parts where the Ts are moved around inside the array. In fact that happens inside a single function. I.e. the system doesn't need to monitor the memory and somehow automatically detect at runtime if the contents of the array changes.
How about holding a reverse map to the pointers? This could be an array (or vector) with the same length as your original array, where each entry holds pointers to the pointers you created. For instance, at index 5 of this reverse map you will have pointers to all the pointers that point at element 5 in the original array. Now if element 5 is swapped with, say, element 6, just go over all the pointers in the reverse map at index 5, set them to point at element 6 in the original array, and move them to index 6 of the reverse map (and do the same in the other direction for the pointers registered at index 6). You can do this work from the single point in your code that moves stuff around.
An idea similar to user3678664's suggestion optimised for the case when there are many elements in the vector and few pointers:
Create a function that uses a
multimap<int, T**> ptrmap;
to save the addresses of the pointers.
void RegisterPointer(int pos, T** ptr)
{
ptrmap.insert(pair<int, T**>(pos, ptr));
}
Then write a function void UpdateSwappedPtrs(int pos, int newpos) that updates the swapped pointers by iterating through all the pointers in the ptrmap at pos and newpos.
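A possible shape for UpdateSwappedPtrs is sketched below. It is an illustration only: the wrapper class TrackedVector and the helpers Swap and Collect are made-up names, positions are size_t instead of int, and the sketch assumes that the vector never changes size and that every registered pointer variable outlives the container.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical wrapper: owns the elements and knows where the outside
// pointer variables live, so it can retarget them after a swap.
template <typename T>
class TrackedVector {
public:
    explicit TrackedVector(std::vector<T> init) : data(std::move(init)) {}

    // Point *ptr at data[pos] and remember the address of the pointer variable.
    void RegisterPointer(std::size_t pos, T** ptr) {
        assert(pos < data.size());
        *ptr = &data[pos];
        ptrmap.insert(std::make_pair(pos, ptr));
    }

    // The single place where elements move: swap, then fix up the pointers.
    void Swap(std::size_t a, std::size_t b) {
        if (a == b) return;
        std::swap(data[a], data[b]);
        UpdateSwappedPtrs(a, b);
    }

private:
    typedef std::vector<std::pair<std::size_t, T**> > Entries;

    void UpdateSwappedPtrs(std::size_t pos, std::size_t newpos) {
        Entries moved;
        Collect(pos, newpos, moved);   // pointers that followed pos now belong to newpos
        Collect(newpos, pos, moved);   // and the other way round
        for (const auto& entry : moved) {
            *entry.second = &data[entry.first];  // retarget the outside pointer
            ptrmap.insert(entry);                // re-register under the new position
        }
    }

    // Remove all entries registered at `from` and queue them for position `to`.
    void Collect(std::size_t from, std::size_t to, Entries& out) {
        auto range = ptrmap.equal_range(from);
        for (auto it = range.first; it != range.second; ++it)
            out.push_back(std::make_pair(to, it->second));
        ptrmap.erase(range.first, range.second);
    }

    std::vector<T> data;
    std::multimap<std::size_t, T**> ptrmap;
};
```

Dereferencing the registered pointers stays as cheap as before; only the rare swap pays for the two multimap lookups.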
In C++ I understand that in order to create a dynamic array you need to use vectors. However I have a problem when I need to find information I put in the vector.
For example:
Let's say I have a simple vector that stores the name of a person and a small message they wrote. In the vector, how do I find where Bill is located?
I was also trying to understand how to do this in PHP when I posted this question.
Indeed, you seem confused. Let me try to help you.
One thing that is maybe confusing you: std::vector is not a geometric vector. It's only a sequence of data of the same type that is contiguous in memory. So it's like an array.
a) Determine the size of a vector based on a variable. For example if
I was using an array it would look something like array [x][y] ( I
know it's not possible to do this). How would I do this with a vector
std::vector is basically an automatically managed dynamic array.
It means that it IS an array inside, but it's managed by code that makes sure the array grows (gets bigger) when you try to add more data than its current capacity can hold.
Actually, std::vector is a class template. It means that it's not a real class; it's code that the compiler uses to generate a real class. If I say
std::vector<int> my_ints; // this is a vector of ints
This vector can only hold ints. And then:
std::vector<std::string> name_list;
this one holds std::string objects.
As I was saying, inside, it's only code to manage an array dynamically. You can think of the first example as if it were written like this:
class
{
unsigned long size; // count of elements contained in this container
unsigned long capacity; // count of elements that the memory allocated by the array can hold
int* array; // array containing the values, created using new, destroyed using delete
}
my_ints;
This is an oversimplified view of how it is inside, so don't assume it's exactly like that, but it might be useful.
Now, when you add a value, it is copied into the memory of the array, into an element that is not used yet (through push_back(), for example). insert() also adds a new element, but at a position you choose, shifting the later elements along. To write over an element that already exists, assign to it through operator[] or at().
If you add a value and the capacity of the vector is not enough to hold all values, then the vector will automatically grow: it will create a bigger array, copy its current values into it, copy the additional value too, then delete the array it had before.
It's important to understand this: if a vector grows, then you can't assume that its data is always at the same address in memory, so pointers to its data can't be trusted.
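To see the growth happen, one can watch capacity() while appending. The helper below is a made-up illustration (countReallocations is not a standard function); it relies on the fact that a change in capacity means the vector moved its elements to a new block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often the storage was reallocated while n values were appended
// to a vector that had `reserved` elements reserved up front.
std::size_t countReallocations(std::size_t n, std::size_t reserved) {
    std::vector<int> v;
    v.reserve(reserved);
    std::size_t moves = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {  // capacity changed: elements were moved
            ++moves;
            lastCapacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return moves;
}
```

With enough capacity reserved in advance the count is zero, which is why reserve() keeps pointers valid as long as the size never exceeds the reserved capacity. Without it, the exact count depends on the standard library's growth factor.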
b) second how would I using the push back command to store the value
of a variable inside a specific spot. Again if I was using an array
it'd be like array[x][y] += q. Where x and y are the spot in the array
and q is the value.
You don't use push_back() to add a value between two values; you use insert().
The syntax array[x][y] += q will certainly not do what you describe: it adds q to the value already stored at array[x][y]; it does not store a new element there.
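A small sketch of the difference between the three operations, with made-up helper names (buildSequence and addAt are only for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends, inserts in the middle, then modifies an existing element.
std::vector<int> buildSequence() {
    std::vector<int> v;
    v.push_back(1);               // appends at the end
    v.push_back(3);               // appends at the end
    v.insert(v.begin() + 1, 2);   // new element before index 1; later elements shift
    v[0] += 10;                   // changes an existing element; the size is unchanged
    return v;
}

// Adds q to the element at (x, y) of a 2D vector. The element must already exist.
void addAt(std::vector<std::vector<int> >& grid, std::size_t x, std::size_t y, int q) {
    assert(x < grid.size() && y < grid[x].size());
    grid[x][y] += q;
}
```

Note that addAt only works on a grid whose rows and columns were created beforehand; indexing an element that does not exist is undefined behaviour.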
Arrays are different to std::vector because they are of a fixed size. All elements of the array exist while the array exists. When you create a std::vector with its default constructor, it is empty. It contains no elements, so you cannot index any elements.
However, std::vector does have a constructor that takes the initial size. If you pass a single integer argument to the std::vector constructor, it will value-initialise that many elements (for int, that means they are all zero). For example:
std::vector<int> v(10); // Will have 10 ints
If you want the equivalent of a 2D array, then you'll need a std::vector<std::vector<T>>. If you want to construct it with a specific size, you will need to specify the size of the outer std::vector as above, and pass it the std::vector that each element should be initialised to. For example, if you want a 10x20 vector:
// This will have 10x20 ints
std::vector<std::vector<int>> v(10, std::vector<int>(20));
Once these elements exist, you can index them just as you would an array:
int value = v[x][y];
It's worth noting that C++11 introduces std::array which has a compile-time fixed size. You could use it like this:
std::array<std::array<int, 20>, 10> arr;
However, you cannot use this if you want your array size to be determined by a variable. The dimensions must be compile-time constants.
If I have a vector in C++, I know I can safely pass it as an array (pointer to the contained type):
void some_function(size_t size, int array[])
{
// impl here...
}
// ...
std::vector<int> test;
some_function(test.size(), &test[0]);
Is it safe to do this with a nested vector?
void some_function(size_t x, size_t y, size_t z, int* multi_dimensional_array)
{
// impl here...
}
// ...
std::vector<std::vector<std::vector<int> > > test;
// initialize with non-jagged dimensions, ensure they're not empty, then...
some_function(test.size(), test[0].size(), test[0][0].size(), &test[0][0][0]);
Edit:
If it is not safe, what are some alternatives, both if I can change the signature of some_function, and if I can't?
Short answer is "no".
The elements of std::vector<std::vector<std::vector<int> > > test; are not placed in one contiguous memory area.
You can only expect multi_dimensional_array to point to a contiguous memory block of test[0][0].size() ints (that is, test[0][0].size() * sizeof(int) bytes). But that is probably not what you want.
It is risky to take the address of a location in a vector and hold on to it. Passing it for the duration of a single call, as with &test[0] above, is fine, but the pointer stops being valid as soon as the vector reallocates, so don't count on it beyond that.
The reason why is closely tied to why a vector is a vector, and not an array. We want a vector to grow dynamically, unlike an array. We want insertions at the end of a vector to have a constant amortized cost that does not depend on the size of the vector, just as with an array that has not yet reached its allocated size.
So how does the magic work? When there is no internal space left for the next element, a new, larger block is allocated (typically 1.5 to 2 times the size of the old one; the exact factor is implementation-defined). The old contents are copied or moved to the new block, and the old block is released, which leaves any pointer into it dangling. Growing geometrically like this is what makes the average cost of an insertion constant.
Is it safe to do this with a nested vector?
Yes, IF you want to access the inner-most vector only, and as long as you know the number of elements it contains and you don't try accessing more than that.
But judging by your function signature, it seems that you want to access all three dimensions. In that case, no, that isn't valid.
The alternative is that you can call the function some_function(size_t size, int array[]) for each inner-most vector (if that solves your problem); and for that you can do this trick (or something similar):
void some_function(std::vector<int> & v1int)
{
    //the final call to some_function(size_t size, int array[])
    //which actually processes the inner-most vectors
    if (!v1int.empty())
        some_function(v1int.size(), &v1int[0]);
}
void some_function(std::vector<std::vector<int> > & v2int)
{
    //call some_function(std::vector<int> & v1int) for each element!
    //(a plain loop, because std::for_each cannot deduce which overload of some_function is meant)
    for (std::size_t i = 0; i < v2int.size(); ++i)
        some_function(v2int[i]);
}
//call some_function(std::vector<std::vector<int> > & v2int) for each element!
for (std::size_t i = 0; i < test.size(); ++i)
    some_function(test[i]);
A very simple solution would be to simply copy the contents of the nested vector into one vector and pass it to that function. But this depends on how much overhead you are willing to take.
That being said: nested vectors aren't good practice. A matrix class that stores everything in contiguous memory and manages access is more efficient and less ugly, and would possibly allow something like T* matrix::get_raw(), but the ordering of the contents would still be an implementation detail.
Simple answer - no, it is not. Did you try compiling this? And why not just pass the whole 3D vector as a reference? If you are trying to access old C code in this manner, then you cannot.
It would be much safer to pass the vector, or a reference to it:
void some_function(std::vector<std::vector<std::vector<int>>> & vector);
You can then get the size and items within the function, leaving less risk for mistakes. You can copy the vector or pass a pointer/reference, depending on expected size and use.
If you need to pass across modules, then it becomes slightly more complicated.
Trying to use &top_level_vector[0] and pass that to a C-style function that expects an int* isn't safe.
To support correct C-style access to a multi-dimensional array, all the bytes of the whole hierarchy of arrays would have to be contiguous. In a C++ std::vector, this is true for the items contained by a vector, but not for the vector itself. If you try to take the address of the top-level vector's first element, à la &top_level_vector[0], you're going to get an array of vectors, not an array of int.
The vector structure isn't simply an array of the contained type. It is implemented as a structure containing a pointer, as well as size and capacity book-keeping data. Therefore the question's std::vector<std::vector<std::vector<int> > > is more or less a hierarchical tree of structures, stitched together with pointers. Only the final leaf nodes in that tree are blocks of contiguous int values. And each of those blocks of memory is not necessarily contiguous with any other block.
In order to interface with C, you can only pass the contents of a single vector. So you'll have to create a single std::vector<int> of size x * y * z. Or you could decide to re-structure your C code to handle a single 1-dimensional stripe of data at a time. Then you could keep the hierarchy, and only pass in the contents of leaf vectors.
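The single-vector approach could be sketched like this. Grid3D, at, raw and sum_cells are hypothetical names chosen for the example; sum_cells merely stands in for the C-style function, whose real behaviour the question does not show, and the row-major layout is one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a dense x*y*z grid stored in ONE contiguous vector.
class Grid3D {
public:
    Grid3D(std::size_t x, std::size_t y, std::size_t z)
        : nx(x), ny(y), nz(z), cells(x * y * z) {}  // all ints start at zero

    // Row-major indexing: k varies fastest, then j, then i.
    int& at(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < nx && j < ny && k < nz);
        return cells[(i * ny + j) * nz + k];
    }

    // Pointer to all x*y*z ints, suitable for a C-style interface.
    int* raw() { return cells.data(); }

private:
    std::size_t nx, ny, nz;
    std::vector<int> cells;
};

// Stand-in for a C-style function taking the dimensions and a flat block.
long sum_cells(std::size_t x, std::size_t y, std::size_t z, const int* a) {
    long total = 0;
    for (std::size_t n = 0; n < x * y * z; ++n)
        total += a[n];
    return total;
}
```

A call then looks like sum_cells(2, 3, 4, grid.raw()) for a Grid3D grid(2, 3, 4). The C side has to agree on the same index order, here (i * y + j) * z + k.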