Working with structure objects - c++

I have a logic that looks like the below (Not the actual code):
StructureElement x;
For i in 1 to 1000
do
x.Elem1 = 20;
x.Elem2 = 30;
push(x into a std::vector)
end
My understanding is that x will be allocated memory only once and that its existing values will be overwritten on every iteration.
Also, the 'x' pushed into the vector will not be affected by subsequent iterations of pushing a modified 'x'.
Am I right in my observations?
Is the above optimal? I would want to keep memory consumption minimal and would not prefer using new. Am I missing anything by not using new?
Also, I pass this vector to another method, which receives it by reference.
And, if I were to read the vector elements back, is this right?
Structure element xx = mYvector.begin()
print xx.Elem1
print xx.Elem2
Any optimizations or different ideas would be welcome.

Am I right in my observations?
Yes, if the vector is std::vector<StructureElement>, in which case it keeps its own copies of what is pushed in.
Is the above optimal?
It is sub-optimal because it results in many re-allocations of the vector's underlying data buffer, plus unnecessary assignments and copies. The compiler may optimize some of the assignments and copies away, but there is no reason, for example, to re-set the elements of x in the loop.
You can simplify it like this:
std::vector<StructureElement> v(1000, StructureElement{20, 30});
This creates a size-1000 vector containing copies of StructureElement with the desired values, which is what you seem to be trying in your pseudo-code.
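If the elements really do have to be pushed one at a time, as in the question's loop, the re-allocation cost can be avoided by reserving the buffer first. The following is only a sketch: StructureElement is a hypothetical two-int struct standing in for the real type, and make_elements is a name made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's StructureElement.
struct StructureElement {
    int Elem1;
    int Elem2;
};

// Builds n elements one push at a time, but reserves the buffer first so the
// vector does not reallocate while it grows.
std::vector<StructureElement> make_elements(std::size_t n) {
    std::vector<StructureElement> v;
    v.reserve(n);                // one allocation up front
    StructureElement x{20, 30};  // set once, outside the loop
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(x);          // the vector stores its own copy of x
    }
    x.Elem1 = -1;                // later changes to x do not affect the copies
    return v;
}
```

Calling make_elements(1000) gives the same contents as the one-line constructor; the loop form is only worth it when each element needs different values.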
To read the elements back, you have options. A range based for-loop if you want to iterate over all elements:
for (const auto& e : v)
std::cout << e.Elem1 << " " << e.Elem2 << std::endl;
Using iterators,
for (auto it = begin(v); it != end(v); ++it)
std::cout << it->Elem1 << " " << it->Elem2 << std::endl;
Or, pass ranges in to algorithms
std::transform(begin(v), end(v), ....);
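As one possible way to use an algorithm here, std::transform can project each element into a new vector. This is a sketch under assumptions: StructureElement is again a hypothetical two-int struct, and the function name sums and the choice of Elem1 + Elem2 as the projection are made up for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the question's StructureElement.
struct StructureElement {
    int Elem1;
    int Elem2;
};

// Collects Elem1 + Elem2 of every element into a new vector.
std::vector<int> sums(const std::vector<StructureElement>& v) {
    std::vector<int> out;
    out.reserve(v.size());
    std::transform(v.begin(), v.end(), std::back_inserter(out),
                   [](const StructureElement& e) { return e.Elem1 + e.Elem2; });
    return out;
}
```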

Related

How are elements in an std::unordered_set stored in memory in C++?

While messing around with type-punning iterators, I came across the ability to do
std::vector<int> vec{ 3, 7, 1, 8, 4 };
int* begin_i = (int*)(void*)&*vec.begin();
std::cout << "1st: " << begin_i << " = " << *begin_i << std::endl;
begin_i++;
std::cout << "2nd: " << begin_i << " = " << *begin_i << std::endl;
Then I tried to do the same kind of thing with an std::unordered_set:
std::unordered_set<int> set{ 3, 7, 1, 8, 4 };
for (auto& el : set)
{ // Display the order the set is currently in
std::cout << el << ", ";
}
std::cout << '\n' <<std::endl;
int* begin_i = (int*)(void*)&*set.begin();
std::cout << "1st: " << begin_i << " = " << *begin_i << std::endl;
begin_i++;
std::cout << "2nd: " << begin_i << " = " << *begin_i << std::endl;
But the output I got was:
4, 8, 1, 7, 3,
1st: [address] = 4
2nd: [address] = 0
I'm supposing this is because an unordered set's elements are located in different parts of memory? I was confused here considering that I also printed the order the elements were stored in using a range-based loop.
My question is how does an std::unordered_set store its elements in memory? What happens when an element is added to the set? Where does it go in memory and how is that kept track of if it's not stored in an array-like container where the elements are one-right-after-the-other?
An unordered_set is implemented as a hash table using external chaining.
That basically means you have an array of linked lists (which are usually called "buckets"). So, to add an item to an unordered_set you start by hashing the new item you're going to insert. You then take that hash and reduce it to the range of the current size of the array (which can/will expand as you add more items). You then add the new item at the tail of that linked list.
So, depending on the value produced by the hash, two consecutively inserted items may (and often will) be inserted in the linked lists at completely different parts of the table. Then the node in the linked list will typically be dynamically allocated, so even two consecutive items in the same linked list may be at completely unrelated addresses.
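The bucket layout can be inspected through the container's own bucket interface (bucket_count, bucket, and the per-bucket begin(n)/end(n) local iterators). A small sketch, where bucket_contents is a helper name made up for the illustration:

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Returns, for each bucket index, the elements currently stored in that bucket.
std::vector<std::vector<int>> bucket_contents(const std::unordered_set<int>& s) {
    std::vector<std::vector<int>> result(s.bucket_count());
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        // begin(b)/end(b) are local iterators over a single bucket.
        for (auto it = s.begin(b); it != s.end(b); ++it) {
            result[b].push_back(*it);
        }
    }
    return result;
}
```

Which element ends up in which bucket depends on the hash function and the current bucket count, so the exact layout will differ between standard library implementations.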
As I noted in an earlier answer, however, quite a bit more about this is actually specified in the standard than most people seem to realize. As I outlined there, it might be (barely) possible to violate the expectation and still (sort of) meet the requirements in the standard, but even at best, doing so would be quite difficult. For most practical purposes, you can assume it's something quite a bit like a vector of linked lists.
Most of the same things apply to an unordered_multiset--the only fundamental difference is that you can have multiple items with the same key instead of only one item with a particular key.
Likewise, there are also unordered_map and unordered_multimap, which are pretty similar again, except that they separate the things being stored into a key and a value associated with that key, and when they do hashing, they only look at the key part, not the value part.
Rather than directly answer the question, I would like to address the "type-punning" trick. (I put that in quotes because the provided code does not demonstrate type-punning. Perhaps the code was appropriately simplified for this question. In any event, *vec.begin() gives an int, so &*vec.begin() is an int*. Further casting to void* then back to int* is a net no-op.)
The property your code takes advantage of is
*(begin_i + 1) == *(vec.begin() + 1) // Using the initial value of begin_i
*(&*vec.begin() + 1) == *(vec.begin() + 1) // Without using an intermediary
This is a property of a contiguous iterator, which is associated with a contiguous container. These are the containers that store their elements in adjacent memory locations. The contiguous containers in the standard library are string, array, and vector; these are the only standard containers for which your trick is guaranteed to work. Trying it on a deque will probably seem to work at first, but the attempt will fail if enough is added to &*begin(). Other containers tend to dynamically allocate elements individually, so there need not be any relation between the addresses of elements; elements are linked together by pointers rather than by position/index.
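The contiguity property can be checked directly for a vector. A minimal sketch, with stored_contiguously being a made-up helper name:

```cpp
#include <cstddef>
#include <vector>

// True if element i lives exactly i slots after the first element,
// which is what the pointer-increment trick relies on.
bool stored_contiguously(const std::vector<int>& v) {
    const int* first = v.data();
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != first + i) return false;
    }
    return true;
}
```

For a std::vector this always returns true. The same check cannot even be written portably for an unordered_set, because incrementing a pointer to one element is not guaranteed to land on another element.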
So that I'm not ignoring the asked question:
An unordered set is merely required to organize elements into buckets. There are no requirements on how this is done, other than requiring that all elements with the same hash value are placed in the same bucket. (This does not imply that all elements in the same bucket have the same hash value.) In practice, each bucket is probably implemented as a list, and the container of buckets is probably a vector, simply because re-using code is cool. At the same time, this is an implementation detail, so it can vary from compiler to compiler, and even from compiler version to compiler version. There are no guarantees.
The way std::unordered_set stores its elements in memory is implementation defined. The standard doesn't care as long as the requirements are satisfied.
In the Visual Studio implementation, the elements are stored inside a std::list (fast access is provided by creating and managing additional data), so each element also carries pointers to the previous and next elements and is allocated via new (at least that's what I remember from std::list).

C++ Insert result of permutations into a vector

I am encountering the issue that the first result of the permutation is being entered into the vector, but on the next for_each loop iteration the size of the vector resets itself to {size = 0}, instead of increasing its size and inserting the second permutation, and so on. How do I get around this? I've tried using a while loop but I couldn't work out what the condition for it should be.
I also wanted to ask, as later on I will need to compare the values in this vector to a vector containing a dictionary, would the current code (when working correctly) allow me to do so.
This is my code so far:
for_each(permutations.begin(), permutations.end(), [](string stringPermutations)
{
vector<string> permutations;
permutations.push_back(stringPermutations);
cout << stringPermutations << endl;
});
So apparently the lambda creates a new, local vector each time it's called. If I place vector<string> permutations; outside of the lambda I get an error with permutations.push_back(stringPermutations);. So how do I go about getting the stringPermutations out of the lambda and into a publicly accessible vector?
Thanks for the help and feedback.
Declare the vector outside the lambda and use lambda capture to capture this vector:
vector<string> permutation_v;
for_each(permutations.begin(), permutations.end(), [&](string stringPermutations)
// ^
{
permutation_v.push_back(stringPermutations);
cout << stringPermutations << endl;
});
But if I were you, I would directly construct this vector as
vector<string> permutation_v{permutations.begin(), permutations.end()};
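Both variants can be put side by side as small functions. This is a sketch that assumes permutations is a std::unordered_set<std::string>; the function names are made up for the example.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Variant 1: fill a vector declared outside the lambda, captured by reference.
std::vector<std::string> collect_with_for_each(
        const std::unordered_set<std::string>& permutations) {
    std::vector<std::string> permutation_v;
    std::for_each(permutations.begin(), permutations.end(),
                  [&permutation_v](const std::string& s) {
                      permutation_v.push_back(s);
                  });
    return permutation_v;
}

// Variant 2: construct the vector directly from the iterator range.
std::vector<std::string> collect_with_range_ctor(
        const std::unordered_set<std::string>& permutations) {
    return std::vector<std::string>(permutations.begin(), permutations.end());
}
```

Both return the same elements in the set's iteration order, which for an unordered_set is unspecified.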
It is unclear what you want to achieve with your code, but it just seems you want to print the contents of permutations.
Then just look at the elements in the vector.
for (auto &permutation : permutations) std::cout << permutation << '\n';
The question is: why do you use a std::unordered_set<std::string> and not a std::vector<std::string> in the first place? Then you would not need to copy the elements into a new vector.

How to maintain reference/pointer/link to vector element after change to vector?

I have a std::vector of a custom class (using int in sample for simplicity). I would like to keep a reference/pointer/link/other to a member of the vector. However, the vector frequently has elements removed and added.
To illustrate my point, in the sample below I take either a reference or a pointer to the second element of the vector. I use the reference/pointer to increase the value of the chosen element. I then erase the first element, and use the ref/pointer to increment again.
Reference example:
std::vector<int> intVect = {1,1,1};
int& refI = intVect.at(1);
refI++;
intVect.erase(intVect.begin());
refI++;
Smart-Pointer example:
std::vector<int> intVect2 = {1,1,1};
std::shared_ptr<int> ptrI = std::make_shared<int>(intVect2.at(1)) ;
*ptrI = *ptrI +1;
intVect2.erase(intVect2.begin());
*ptrI = *ptrI +1;
What I would like to happen is to end up with the referenced element to have a value of 3, the final vector being composed of {3,1}. However, in the reference example, the final vector is {2,2}, and in the pointer example the final vector is {1,1}.
Understanding that the pointer is essentially a memory address, I can understand why this method might not be possible, but if it somehow is, let me know.
The more important question is then, what alternate approach or structure could be used that would allow for some form of ref/pointer/link/other to that element (be it a value or an object) that is viable after adding members to, or removing members from, the vector(or other structure) that contains it?
For extra credit:
The objects I am actually working with have a position property. I have a second structure that needs to keep track of the objects for quick lookup of which objects are at which positions. I am currently using a grid (vector of vectors) to represent possible positions, each holding indexes into the vector of objects for the objects currently at the position. However, when an object is deleted from the vector (which happens very frequently, up to hundreds of times per iteration), my current resort is to loop through every grid position and decrement any indexes greater than the deleted index, which is slow and clumsy. Additional thoughts in regards to this problem in context are much appreciated, but my key question concerns the above examples.
One possible option is to have the vector store std::shared_ptr objects, and issue std::weak_ptr or std::shared_ptr objects to refer to the object in question.
std::vector<std::shared_ptr<int>> ints;
for(size_t i = 0; i < 10000; i++) {
ints.emplace_back(std::make_shared<int>(int(i)));
}
std::weak_ptr<int> my_important_int = ints[6000];
{
auto lock = my_important_int.lock();
if(lock) std::cout << *lock << std::endl;
else std::cout << "index 6000 expired." << std::endl;
}
auto erase_it = std::remove_if(ints.begin(), ints.end(), [](auto & i) {return (*i) > 5000 && ((*i) % 4) != 0;});
ints.erase(erase_it, ints.end());
{
auto lock = my_important_int.lock();
if(lock) std::cout << *lock << std::endl;
else std::cout << "index 6000 expired." << std::endl;
}
ints.erase(ints.begin(), ints.end());
{
auto lock = my_important_int.lock();
if(lock) std::cout << *lock << std::endl;
else std::cout << "index 6000 expired." << std::endl;
}
Which should print out:
6000
6000
index 6000 expired.
A container that stores key/value pairs might work for you. For example, std::map or std::unordered_map.
When using these containers, you'd keep a reference to the desired object by storing the key. If you want to modify said object, just look it up in the container using the key. Now you can add/remove other objects as much as you want without affecting the object in question (assuming the added/removed objects have unique keys).
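A minimal sketch of the key-based approach, assuming each object gets a unique integer key. Item, Registry and increment are hypothetical names used only for this illustration.

```cpp
#include <unordered_map>

// Hypothetical object type; the real class would go here.
struct Item {
    int value;
};

using Registry = std::unordered_map<int, Item>;

// Increments the item stored under `key` if it still exists.
// Returns false if that item has been erased in the meantime.
bool increment(Registry& items, int key) {
    auto it = items.find(key);
    if (it == items.end()) return false;
    ++it->second.value;
    return true;
}
```

With three items of value 1 under keys 0, 1 and 2, incrementing key 1, erasing key 0 and incrementing key 1 again leaves key 1 at 3 and key 2 at 1, which is the result the question asks for.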
If there is a way for you to keep using a vector and change the way you manage your objects, then you won't get much more performance than what you have now.
Otherwise, you can use a stable vector (here's the boost version). It is essentially a vector of pointers, which grants it iterator and reference stability. This means that iterators (pointers) and references to the elements are not invalidated by any operation other than removing the element itself.
Of course, there are some big drawbacks to this, mainly in performance. The two main performance issues are the fact that you go through a pointer every time you want to access an element, and the fact that the elements are not stored contiguously (which of course impacts the speed of iteration).
However, it also has advantages over other pointer-heavy data types (lists, sets, maps). Mainly, it performs lookup and pushbacks in constant time, even though it's slower than a normal vector.
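The idea behind reference stability can be sketched without Boost by using a plain std::vector of std::unique_ptr. This is not boost::stable_vector itself, only an illustration of the same principle, and the function name is made up.

```cpp
#include <memory>
#include <vector>

// Reproduces the question's experiment with heap-allocated elements:
// take a pointer to the second element, increment, erase the first
// element, increment again.
std::vector<int> erase_front_keep_pointer() {
    std::vector<std::unique_ptr<int>> v;
    for (int i = 0; i < 3; ++i) {
        v.push_back(std::make_unique<int>(1));
    }
    int* p = v[1].get();  // points at the heap object, not at a vector slot
    ++*p;
    v.erase(v.begin());   // shifts the unique_ptrs; the int itself stays put
    ++*p;

    std::vector<int> result;
    for (const auto& e : v) result.push_back(*e);
    return result;        // {3, 1}
}
```

The pointer only stays valid as long as the element it refers to is not erased; erasing that element destroys the object and leaves p dangling.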
Then again, if you really need performance, you might want to keep your vector and rethink your design around it.

C++ Reverse sequence of elements in a vector

Hello, I'm still new to C++ and I am writing a program to reverse the elements in a vector. I don't get any errors running the program, but when I run it and enter the numbers, my program prints " Printing ... end of print" and then just closes on its own. I'm sure it is a simple mistake.
using namespace std;
vector<int> reverse_a(const vector<int>&veca)
{
vector<int> vecb;
//size_t as the index type
size_t i = veca.size();
while ( i > 0 )
vecb.push_back(veca[--i]);
return vecb;
}
void print(const vector<int> vec)
{
cout << "printing " << endl;
for (size_t i = 0; i < vec.size(); ++i)
cout << vec[i] << ",";
cout << "\n" << "\n end of print.\n";
}
int main(void)
{
vector<int>veca;
vector<int>vecb;
int input;
while(cin >> input)
veca.push_back(input);
reverse_a(veca);
print(vecb);
}
Sort of off topic, but can't be explained in a comment. Aderis's answer is correct and πάντα ῥεῖ brings up an alternative for OP.
As with most intro to programming problems, the standard Library has done all of the work already. There is no need for any function because it already exists, in a somewhat twisted form:
std::copy(veca.rbegin(), veca.rend(), std::back_inserter(vecb));
std::copy does just what it sounds like it does: it copies. You specify where to start, where to stop, and where to put the results.
In this case we want to copy from veca, but we want to copy backwards, so rather than calling begin like we normally would, we call rbegin to get one of those reverse iterator thingys πάντα ῥεῖ was talking about. To define the end, we use rend which, rather than tearing things limb from limb marks the end of the reverse range of veca. Typically this is one before the beginning, veca[-1], if such a thing existed.
std::back_inserter tells std::copy how to place the data from veca in vecb: at the back.
One could be tempted to skip all of this reverse nonsense and
std::copy(veca.begin(), veca.end(), std::front_inserter(vecb));
but no. For one thing, it would be hilariously slow. Consider veca = {1,2,3,4,5}. You'd insert 1 at the beginning of vecb, then copy it to the second slot to make room for 2. Then move 2 and 1 over one slot each to fit in 3. You'd get the nice reverse ordering, but the shuffling would be murderous. The second reason you can't do it is that vector does not implement the push_front function required to make this work, again because it would be brutally slow.
Caveat:
This approach is simple, but slow. The back_inserter may force resizing of the vector's internal array, but this can be mitigated by preallocating vecb's storage.
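The preallocation mentioned in the caveat can be sketched like this; reversed_copy is a made-up function name.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns a reversed copy of veca, reserving the destination's storage
// first so back_inserter never triggers a reallocation.
std::vector<int> reversed_copy(const std::vector<int>& veca) {
    std::vector<int> vecb;
    vecb.reserve(veca.size());
    std::copy(veca.rbegin(), veca.rend(), std::back_inserter(vecb));
    return vecb;
}
```

The same result can also be had by constructing the vector straight from the reverse range: std::vector<int> vecb(veca.rbegin(), veca.rend());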
It's just a simple mistake. You are forgetting to set vecb to the result of the reverse_a function in main. Instead of reverse_a(veca);, you should have vecb = reverse_a(veca);. The way you currently have it, vecb never gets set and therefore has a length of zero and nothing prints.

What is the best way to access deque's element in C++ STL

I have a deque:
deque<char> My_Deque;
My_Deque.push_front('a');
My_Deque.push_front('b');
My_Deque.push_front('c');
My_Deque.push_front('d');
My_Deque.push_front('e');
There are such ways to output it.
The first:
deque<char>::iterator It;
for ( It = My_Deque.begin(); It != My_Deque.end(); It++ )
cout << *It << " ";
The second:
for (size_t i = 0; i < My_Deque.size(); i++) {
cout << My_Deque[i] << " ";
}
What is the best way to access deque's element - through iterator or like this: My_Deque[i]?
Does a deque<...> keep an array of pointers to its elements for fast access to its data, or does it reach an arbitrary element by walking through the elements consecutively (as in the picture below)?
Since you asked for "the best way":
for (char c : My_Deque) { std::cout << c << " "; }
An STL deque is usually implemented as a dynamic array of fixed-size arrays, so indexed access is perfectly efficient.
The standard specifies that deque should support random access in constant time. So yeah, [i] should be reasonably fast.
But there still might, I think, be an advantage to using the iterators. It could (theoretically at least) be a constant multiple faster (or slower maybe!). Anyway, every use of [i] will involve looking up some table(s) and calculating offsets and so on. I would imagine that operator++ for deque::iterator is slightly more than just "find my offset; add 1 to it; lookup with the new offset"
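To make the "table lookup plus offset" idea concrete, here is a toy model of a block-based container. It is only a sketch of the general scheme, not how any particular standard library implements std::deque: ChunkedArray, the block size of 4, and the back-only growth are all assumptions made for the illustration.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model: a table of pointers to fixed-size blocks. It only grows at the
// back, whereas a real deque also grows at the front.
class ChunkedArray {
public:
    static constexpr std::size_t kBlockSize = 4;

    void push_back(char c) {
        if (size_ % kBlockSize == 0) {
            blocks_.push_back(std::make_unique<Block>());  // new block needed
        }
        (*blocks_[size_ / kBlockSize])[size_ % kBlockSize] = c;
        ++size_;
    }

    // Constant-time indexing: one division picks the block,
    // the remainder picks the slot inside that block.
    char operator[](std::size_t i) const {
        return (*blocks_[i / kBlockSize])[i % kBlockSize];
    }

    std::size_t size() const { return size_; }

private:
    using Block = std::array<char, kBlockSize>;
    std::vector<std::unique_ptr<Block>> blocks_;
    std::size_t size_ = 0;
};
```

Every operator[] call pays for the division and the extra pointer hop, whereas an iterator can stay inside the current block and only consult the table when it crosses a block boundary.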
Since you asked for "the best way":
std::copy(My_Deque.begin(), My_Deque.end(),
std::ostream_iterator<char>(std::cout, " "));
Admittedly, for formatting individual objects it won't make much of a difference, but using the algorithms on segmented data structures can make a major difference! There is an interesting optimization possible when an entire range is processed segment by segment. For example, if you have a large std::deque<char> you want to write verbatim to a file, something like
std::copy(deque.begin(), deque.end(), std::ostreambuf_iterator<char>(out));
which is copying from one segmented data structure to another segmented data structure (under the hood stream buffers use a buffer of characters which becomes their segment) can take substantially less time (depending somewhat on the speed the data can be written to the destination, though).
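A self-contained version of that copy, writing into a string stream instead of a file so it is easy to try out. The function name dump is made up, and whether the standard library actually applies the segment-wise optimization is an implementation matter.

```cpp
#include <algorithm>
#include <deque>
#include <iterator>
#include <sstream>
#include <string>

// Writes the deque's characters verbatim to a stream buffer and returns
// what was written.
std::string dump(const std::deque<char>& d) {
    std::ostringstream out;
    std::copy(d.begin(), d.end(), std::ostreambuf_iterator<char>(out));
    return out.str();
}
```

Unlike std::ostream_iterator, std::ostreambuf_iterator bypasses formatted output and writes straight to the stream buffer, so no separator is inserted between characters.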