Vector of pointers undefined behaviour - c++

I'm trying to make a vector of pointers whose elements are pointing to vector of int elements. (I'm solving a competitive programming-like problem, that's why it sounds kinda nonsense).
but here's the code:
#include <bits/stdc++.h>
using namespace std;
int ct = 0;
vector<int> vec;
vector<int*> rf;
void addRef(int n){
vec.push_back(n);
rf.push_back(&vec[ct]);
ct++;
}
int main(){
addRef(1);
addRef(2);
addRef(5);
for(int i = 0; i < ct; i++){
cout << *rf[i] << ' ';
}
cout << endl;
for(int i = 0; i < ct; i++){
cout << vec[i] << ' ';
}
}
When I execute the code, it's showing weird behaviour that I don't understand. The first element of rf (vector<int*>) seems not pointing to the vec's (vector<int>) element, where the rest of the elements are pointing to it.
here's the output when I run it on Dev-C++:
1579600 2 5
1 2 5
When I tried to run the code here, the output is even weirder:
1197743856 0 5
1 2 5
The code is intended to have same output between the first line and the second.
Can you guys explain why it happens? Is there any mistake in my implementation?
thanks

Adding elements to a std::vector with push_back or similar may invalidate all iterators and references to its elements. See https://en.cppreference.com/w/cpp/container/vector/push_back.
The idea is that in order to grow the vector, it may not have enough free memory to expand into, and thus may have to move the whole array to some other location in memory, freeing the old block. That means in particular that your pointers now point to memory that has been freed, or reused for something else.
If you want to keep this approach, you will need to resize() or reserve() a sufficient number of elements in vec before starting. Which of course defeats the whole purpose of a std::vector, and you might as well use an array instead.

The vector is changing sizes and the addresses you are saving might not be those you want. You can preallocate memory using reserve() and the vector will not resize.
vec.reserve(3);
addRef(1);
addRef(2);
addRef(5);

The problem occurs when you call vec.push_back(n) and vec’s internal array is already full. When that happens, the std::vector::push_back() method allocates a larger array, copies the contents of the full array over to the new array, then frees the old/full array and keeps the new one.
Usually that’s all you need, but your program is keeping pointers to elements of the old array inside (rf), and these pointers all become dangling/invalid when the reallocation occurs, hence the funny (undefined) behavior.
An easy fix would be to call vec.reserve(100) (or similar) at the top of your program (so that no further reallocations are necessary). Or alternatively you could postpone the adding of pointers to (rf) until after you’ve finished adding all the values to (vec).

Just do not take pointer from a vector that may change soon. vector will copy the elements to a new space when it enlarges its capacity.
Use an array to store the ints instead.

Related

does shrink_to_fit() function removes null pointers?

I wanted to ask you about the vector::shrink_to_fit() function.
Lets say i've got a vector of pointers to objects (or unique_ptr in my case)
and i want to resize it to the amount of objects that it stores.
At some point i remove some of the objects from the vector by choice using the release() function of unique_ptr
so there is a null pointer in that specific place in the vector as far as i know.
So i want to resize it and remove that null pointer in between the elements of the vector and i'm asking if i could do that with shrink_to_fit() function?
No, shrink_to_fit does not change the contents or size of the vector. All it might do is release some of its internal memory back to a lower level library or the OS, etc. behind the scenes. It may invalidate iterators, pointers, and references, but the only other change you might see would be a reduction of capacity(). It's also valid for shrink_to_fit to do absolutely nothing at all.
It sounds like you want the "Erase-remove" idiom:
vec.erase(std::remove(vec.begin(), vec.end(), nullptr), vec.end());
The std::remove shifts all the elements which don't compare equal to nullptr left, filling the "gaps". But it doesn't change the vector's size; instead it returns an iterator to the position in the vector just after the sequence of shifted elements; the rest of the elements still exist but have been moved from. Then the erase member function gets rid of those unnecessary end elements, reducing the vector's size.
Or as #chris notes, C++20 adds an erase overload to std::vector and a related erase_if, which makes things easier. They may already be supported in MSVC 2019. Using the new erase could just look like:
vec.erase(nullptr);
This quick test show that u can't do like this.
int x = 1;
vector<int*> a;
cout << a.capacity() << endl;
for (int i = 0; i < 10; ++i) {
a.push_back(&x);
}
cout << a.capacity() << endl;
a[9] = nullptr;
a.shrink_to_fit();
cout << a.capacity() << endl;
Result:
0
16
10
m_gates[index].release(); m_gates.shrink_to_fit();
Based on your comment, what you're looking for is simply to erase this single element from your vector right then and there. Replace both of these statements with:
m_gates.erase(m_gates.begin() + index);
Or a more generic version if swapping containers in the future is a possibility:
using std::begin;
m_gates.erase(std::next(begin(m_gates), index));
erase supports iterators rather than indices, so there's a conversion in there. This will remove the pointer from the vector while calling its destructor, which causes unique_ptr to properly clean up its memory.
Now erasing elements one by one could potentially be a performance concern. If it does end up being a concern, you can do what you were getting at in the question and null them out, then remove them all in one go later on:
m_gates[index].reset();
// At some point in the program's future:
std::erase(m_gates, nullptr);
What you have right now is highly likely to be a memory leak. release releases ownership of the managed memory, meaning you're now responsible for cleaning it up, which isn't what you were looking for. Both erase and reset (or equivalently, = {} or = nullptr) will actually call the destructor of unique_ptr while it still has ownership and properly clean up the memory. shrink_to_fit is for vector capacity, not size, and is unrelated.
at the end the solution that i found was simple:
void Controller::delete_allocated_memory(int index)
{
m_vec.erase(m_vec.begin() + index);
m_vec.shrink_to_fit();
}
it works fine even if the vector is made of unique_ptrs, as far as i know it doesn't even create the null pointer that i was talking about and it shifts left all existing objects in the vector.
what do you think?

Strange output from dereferencing pointers into a vector

I was writing a program in C++ where I need to have a 2d grid of pointers which point to objects which are stored in a vector. I tested out some part of the program and saw strange results in the output.
I changed the objects to integers and removed everything non-essential to cut it down to the code snippet below, but I still get a weird output.
vector<vector<int*>> lattice(10, vector<int*>(10));//grid of pointers
vector<int> relevant;//vector carrying actual values onto which pointers will point
for(int i = 0; i<10; i++){
int new_integer = i;
relevant.push_back(new_integer);//insert integer into vector
lattice[0][i] = &relevant[i];//let pointer point onto this value
}
//OUTPUT
for(int j = 0; j<10; j++){
cout<<*lattice[0][j]<<" ";
cout<<relevant[j]<<endl;
}
I get strange outputs like this:
19349144 0
19374040 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
Also, my output changes from run to run and depending on how big/small I make my grid.
I would expect all the values on the left and right to be equal, I guess there is something fundamental about pointers I haven't understood, so sorry if this is a very basic question.
Can someone please explain why I get strange outputs for some values of the grid?
I need to have a 2d grid of pointers which point to objects which are stored in a vector
That's impossible. Or rather, that's a guarantee for dereferencing invalid addresses. Why?
At any given time, an std::vector has enough space allocated for some limited number of elements. If you keep adding elements to it, it will eventually max out its storage. At some insertion, it will decide allocate a new stretch of memory to use for storing its data; move (or copy) existing data to the new storage area; free the old storage area; and then be able to add more elements.
When this happens, all existing pointers into objects in the vector become invalid. The memory they point to may continue to hold the previous values, but may also be used to store other data - there are no guarantees on that! In fact, dereferencing invalid pointers officially results in undefined behavior.
... and I do see that what I've described is exactly what you're doing with your code. Your older pointers become invalid.
Instead, consider keeping indices into the vector rather than pointers. Indices don't get invalidated by adding elements to the vector, and you can keep using them.
PS - I see you're using a vector-of-vectors. That's technically valid, but it is often unadvisable. Consider using a matrix class (e.g. from the Eigen library) or allocating a certain amount of memory using std::make_unique(), then using it to initialize a gsl::multi_span.
relevant.push_back invalidates all pointers/references/iterators to its elements (if the new size exceeds its current capacity).
Therefore you are dereferencing potentially invalid pointers when you do
*lattice[0][j]
later.
You can use a container that doesn't invalidate on insertion at the end, such as std::list or std::deque, instead of std::vector, for relevant.
(Or you can reserve a sufficient capacity with a call to .reserve first, so that the size on the .push_back operations will never exceed the capacity and will therefore never invalidate pointers, but that carries the risk of easily accidentally ignoring this requirement on later code changes, again causing UB.)
When a std::vector is inserted into, if its new size() will exceed its current capacity(), the vector has to reallocate its internal array to make room, which will invalidate any existing iterators and pointers to the old memory.
In your example, you can avoid that by reserve()'ing the capacity() ahead of time, eg:
vector<vector<int*>> lattice(10, vector<int*>(10));//grid of pointers
vector<int> relevant;//vector carrying actual values onto which pointers will point
//ADD THIS!
relevant.reserve(10);//pre-allocate the capacity
for(int i = 0; i<10; i++){
int new_integer = i;
relevant.push_back(new_integer);//insert integer into vector
lattice[0][i] = &relevant[i];//let pointer point onto this value
}
//OUTPUT
for(int j = 0; j<10; j++){
cout<<*lattice[0][j]<<" ";
cout<<relevant[j]<<endl;
}
Alternatively, you can pre-allocate the size() and then use operator[] instead of push_back(), eg:
vector<vector<int*>> lattice(10, vector<int*>(10));//grid of pointers
vector<int> relevant(10);//vector carrying actual values onto which pointers will point
for(int i = 0; i<10; i++){
int new_integer = i;
relevant[i] = new_integer;//insert integer into vector
lattice[0][i] = &relevant[i];//let pointer point onto this value
}
//OUTPUT
for(int j = 0; j<10; j++){
cout<<*lattice[0][j]<<" ";
cout<<relevant[j]<<endl;
}

C++ Iterate through an expanding container

What will happen if I expand a container while iterating through it, may I meet the new elements or just the old ones
std::vector<int> arr(0);
arr.push_back(1);
arr.push_back(2);
arr.push_back(3);
for (auto& ele : arr) {
if (4 == ele) std::cout << "meet new eles" << endl;
arr.push_back(4);
}
push_back may invalidate any iterator to the vector, what you have there is undefined behaviour.
std::vector internally allocates an array with its elements stored contiguously in memory. To do so it needs to preallocate an estimate of the size it may need, that's known as the capacity of the vector. At push_back if the capacity is reached it allocates a new bigger array, copies/moves all the contents of the previous array to the new one and then deletes the old one. That invalidates the iterators, that keep pointing to a deleted array.
Also worth mentioning that the allocation of the new array adds a considerable performance hit. If you know the size that your vector will have always use reserve() to preallocate memory.
Now, true is that if you reserve capacity before hand and never push back exceeding that capacity the iterators won't be invalidated (only the past-end iterator) and you can keep incrementing them since they point to an array of contiguous elements. But I wouldn't advice to do so, IMO the risk of reallocating the vector by mistake is not worth it.
If you are guaranteed to never remove elements the simplest solution would still be to use a regular for loop.
for(size_t i=0, n=arr.size(); i<n; ++i)
{
if(/* something */)
{
arr.push_back(/* ... */);
++n;
}
}
I know it's a trivial solution, but others maybe searching and will find this question.
Good note from #Long_GM. This will not work on every container. std::map for example cannot be iterated in this manner whether or not you insert any values.

Different addresses while filling a std::vector

Wouldn't you expect the addresses printed by the two loops to be the same? I was, and I cannot understand why (sometimes) they are different.
#include <iostream>
#include <vector>
using namespace std;
struct S {
void print_address() {
cout << this << endl;
}
};
int main(int argc,char *argv[]) {
vector<S> v;
for (size_t i = 0; i < 10; i++) {
v.push_back( S() );
v.back().print_address();
}
cout << endl;
for (size_t i = 0; i < v.size(); i++) {
v[i].print_address();
}
return 0;
}
I tested this code with many local and on-line compilers and the output I get looks like this (the last three figures are always the same):
0xaec010
0xaec031
0xaec012
0xaec013
0xaec034
0xaec035
0xaec036
0xaec037
0xaec018
0xaec019
0xaec010
0xaec011
0xaec012
0xaec013
0xaec014
0xaec015
0xaec016
0xaec017
0xaec018
0xaec019
I spotted this because making some initialization in the first loop I obtained uninitialized object in the subsequent part of the program. Am I missing something?
Because when vector capicity changes, it reallocates elements. If you std::vector::reserve enough capacity, no reallcation is needed, it will print same address.
vector<S> v;
v.reserve(10);
Note: properly use std::vector::reserve will increase application performance, because no unnecessary reallocation and objects copy.
The vector is performing re-allocations in order to grow as needed. Each time it does this, it allocates a larger buffer for the data and copies the elements across. You can see this clearly in the first loop, where each address jump is followed by a larger sequence of consecutive addresses. In the second loop, you just look at the addresses after the final reallocation.
0xaec010
0xaec031 <--
0xaec012 <--
0xaec013
0xaec034 <--
0xaec035
0xaec036
0xaec037
0xaec018 <--
0xaec019
The simplest way to instantiate a vector with 10 S objects would be
std::vector<S> v(10);
This would involve no re-allocations. See also std::vector::reserve.
Vector elements are stored contiguously; that is, they're all in a row in memory. Your vector object has to allocate space for this contiguous block of elements.
Your vector can't just keep having things added to it indefinitely. It has to grow the space it has allocated. The memory model typically doesn't allow us to expand a memory block — we have to create a new one instead. When the vector does this, it has to move all its elements to the new space. This is occurring several times within your first loop.
If you'd done:
vector<S> v;
v.reserve(10);
(which you can, since you know you'll end up with 10 elements), then no re-allocation would have been necessary, and the addresses would not have changed.
I'm not really surprised that they can change. As the vector initially has no size, it's likely to reallocate the vector once or twice during the initial loop. That'll change the base address of the vector. It's not impossible that after a resize, you'll end up using an address you used before (though I find that somewhat surprising. Are you sure about the first part of the addresses?)
If you want to ensure they don't change, you need to add a v.reserve() before you start pushing stuff on it.

c++ vectors and pointers

As i understand if i dont store pointers, everything in c++ gets copied, which can lead to bad performance (ignore the simplicity of my example). So i thought i store my objects as pointers instead of string object inside my vector, thats better for performance right? (assumining i have very long strings and lots of them).
The problem when i try to iterate over my vector of string pointers is i cant extract the actual value from them
string test = "my-name";
vector<string*> names(20);
names.push_back(&test);
vector<string*>::iterator iterator = names.begin();
while (iterator != names.end())
{
std::cout << (*iterator) << ":" << std::endl;
// std::cout << *(*iterator); // fails
iterator++;
}
See the commented line, i have no problem in receiving the string pointer. But when i try to get the string pointers value i get an error (i couldnt find what excatly the error is but the program just fails).
I also tried storing (iterator) in a new string variable and but it didnt help?
You've created the vector and initialized it to contain 20 items. Those items are being default initialized, which in the case of a pointer is a null pointer. The program is having trouble dereferencing those null pointers.
One piece of advice is to not worry about what's most efficient until you have a demonstrated problem. This code would certainly work much better with a vector<string> versus a vector<string*>.
No, no, a thousand times no.
Don't prematurely optimize. If the program is fast, there's no need to worry about performance. In this instance, the pointers clearly reduce performance by consuming memory and time, since each object is only the target of a single pointer!
Not to mention that manual pointer programming tends to introduce errors, especially for novices. Sacrificing correctness and stability for performance is a huge step backwards.
The advantage of C++ is that it simplifies the optimization process by providing encapsulated data structures and algorithms. So when you decide to optimize, you can usually do so by swapping in standard parts.
If you want to learn about optimizing data structures, read up on smart pointers.
This is probably the program you want:
vector<string> names(20, "my-name");
for ( vector<string>::iterator iterator = names.begin();
iterator != names.end();
++ iterator )
{
std::cout << *iterator << '\n';
}
Your code looks like you're storing a pointer to a stack-based variable into a vector. As soon as the function where your string is declared returns, that string becomes garbage and the pointer is invalid. If you're going to store pointers into a vector, you probably need to allocate your strings dynamically (using new).
Have a look at your initialization:
string test = "my-name";
vector<string*> names(20);
names.push_back(&test);
You first create a std::vector with 20 elements.
Then you use push_back to append a 21st element, which points to a valid string. That's fine, but this element is never reached in the loop: the first iteration crashes already since the first 20 pointers stored in the vector don't point to valid strings.
Dereferencing an invalid pointer causes a crash. If you make sure that you have a valid pointers in your vector, **iterator is just fine to access an element.
Try
if (*iterator)
{
std::cout << *(*iterator) << ":" << std::endl;
}
Mark Ransom explains why some of the pointers are now
string test = "my-name";
vector<string*> names(20);
The size of vector is 20, meaning it can hold 20 string pointers.
names.push_back(&test);
With the push_back operation, you are leaving out the first 20 elements and adding a new element to the vector which holds the address of test. First 20 elements of vector are uninitialized and might be pointing to garbage. And the while loop runs till the end of vector whose size is 21 and dereferencing uninitialized pointers is what causing the problem. Since the size of vector can be dynamically increased with a push_back operation, there is no need to explicitly mention the size.
vector<string*> names; // Not explicitly mentioning the size and the rest of
// the program should work as expected.