c++ vectors and pointers - c++

As i understand if i dont store pointers, everything in c++ gets copied, which can lead to bad performance (ignore the simplicity of my example). So i thought i store my objects as pointers instead of string object inside my vector, thats better for performance right? (assumining i have very long strings and lots of them).
The problem when i try to iterate over my vector of string pointers is i cant extract the actual value from them
string test = "my-name";
vector<string*> names(20);
names.push_back(&test);
vector<string*>::iterator iterator = names.begin();
while (iterator != names.end())
{
std::cout << (*iterator) << ":" << std::endl;
// std::cout << *(*iterator); // fails
iterator++;
}
See the commented line, i have no problem in receiving the string pointer. But when i try to get the string pointers value i get an error (i couldnt find what excatly the error is but the program just fails).
I also tried storing (iterator) in a new string variable and but it didnt help?

You've created the vector and initialized it to contain 20 items. Those items are being default initialized, which in the case of a pointer is a null pointer. The program is having trouble dereferencing those null pointers.
One piece of advice is to not worry about what's most efficient until you have a demonstrated problem. This code would certainly work much better with a vector<string> versus a vector<string*>.

No, no, a thousand times no.
Don't prematurely optimize. If the program is fast, there's no need to worry about performance. In this instance, the pointers clearly reduce performance by consuming memory and time, since each object is only the target of a single pointer!
Not to mention that manual pointer programming tends to introduce errors, especially for novices. Sacrificing correctness and stability for performance is a huge step backwards.
The advantage of C++ is that it simplifies the optimization process by providing encapsulated data structures and algorithms. So when you decide to optimize, you can usually do so by swapping in standard parts.
If you want to learn about optimizing data structures, read up on smart pointers.
This is probably the program you want:
vector<string> names(20, "my-name");
for ( vector<string>::iterator iterator = names.begin();
iterator != names.end();
++ iterator )
{
std::cout << *iterator << '\n';
}

Your code looks like you're storing a pointer to a stack-based variable into a vector. As soon as the function where your string is declared returns, that string becomes garbage and the pointer is invalid. If you're going to store pointers into a vector, you probably need to allocate your strings dynamically (using new).

Have a look at your initialization:
string test = "my-name";
vector<string*> names(20);
names.push_back(&test);
You first create a std::vector with 20 elements.
Then you use push_back to append a 21st element, which points to a valid string. That's fine, but this element is never reached in the loop: the first iteration crashes already since the first 20 pointers stored in the vector don't point to valid strings.
Dereferencing an invalid pointer causes a crash. If you make sure that you have a valid pointers in your vector, **iterator is just fine to access an element.

Try
if (*iterator)
{
std::cout << *(*iterator) << ":" << std::endl;
}
Mark Ransom explains why some of the pointers are now

string test = "my-name";
vector<string*> names(20);
The size of vector is 20, meaning it can hold 20 string pointers.
names.push_back(&test);
With the push_back operation, you are leaving out the first 20 elements and adding a new element to the vector which holds the address of test. First 20 elements of vector are uninitialized and might be pointing to garbage. And the while loop runs till the end of vector whose size is 21 and dereferencing uninitialized pointers is what causing the problem. Since the size of vector can be dynamically increased with a push_back operation, there is no need to explicitly mention the size.
vector<string*> names; // Not explicitly mentioning the size and the rest of
// the program should work as expected.

Related

Vector of pointers undefined behaviour

I'm trying to make a vector of pointers whose elements are pointing to vector of int elements. (I'm solving a competitive programming-like problem, that's why it sounds kinda nonsense).
but here's the code:
#include <bits/stdc++.h>
using namespace std;
int ct = 0;
vector<int> vec;
vector<int*> rf;
void addRef(int n){
vec.push_back(n);
rf.push_back(&vec[ct]);
ct++;
}
int main(){
addRef(1);
addRef(2);
addRef(5);
for(int i = 0; i < ct; i++){
cout << *rf[i] << ' ';
}
cout << endl;
for(int i = 0; i < ct; i++){
cout << vec[i] << ' ';
}
}
When I execute the code, it's showing weird behaviour that I don't understand. The first element of rf (vector<int*>) seems not pointing to the vec's (vector<int>) element, where the rest of the elements are pointing to it.
here's the output when I run it on Dev-C++:
1579600 2 5
1 2 5
When I tried to run the code here, the output is even weirder:
1197743856 0 5
1 2 5
The code is intended to have same output between the first line and the second.
Can you guys explain why it happens? Is there any mistake in my implementation?
thanks
Adding elements to a std::vector with push_back or similar may invalidate all iterators and references to its elements. See https://en.cppreference.com/w/cpp/container/vector/push_back.
The idea is that in order to grow the vector, it may not have enough free memory to expand into, and thus may have to move the whole array to some other location in memory, freeing the old block. That means in particular that your pointers now point to memory that has been freed, or reused for something else.
If you want to keep this approach, you will need to resize() or reserve() a sufficient number of elements in vec before starting. Which of course defeats the whole purpose of a std::vector, and you might as well use an array instead.
The vector is changing sizes and the addresses you are saving might not be those you want. You can preallocate memory using reserve() and the vector will not resize.
vec.reserve(3);
addRef(1);
addRef(2);
addRef(5);
The problem occurs when you call vec.push_back(n) and vec’s internal array is already full. When that happens, the std::vector::push_back() method allocates a larger array, copies the contents of the full array over to the new array, then frees the old/full array and keeps the new one.
Usually that’s all you need, but your program is keeping pointers to elements of the old array inside (rf), and these pointers all become dangling/invalid when the reallocation occurs, hence the funny (undefined) behavior.
An easy fix would be to call vec.reserve(100) (or similar) at the top of your program (so that no further reallocations are necessary). Or alternatively you could postpone the adding of pointers to (rf) until after you’ve finished adding all the values to (vec).
Just do not take pointer from a vector that may change soon. vector will copy the elements to a new space when it enlarges its capacity.
Use an array to store the ints instead.

does shrink_to_fit() function removes null pointers?

I wanted to ask you about the vector::shrink_to_fit() function.
Lets say i've got a vector of pointers to objects (or unique_ptr in my case)
and i want to resize it to the amount of objects that it stores.
At some point i remove some of the objects from the vector by choice using the release() function of unique_ptr
so there is a null pointer in that specific place in the vector as far as i know.
So i want to resize it and remove that null pointer in between the elements of the vector and i'm asking if i could do that with shrink_to_fit() function?
No, shrink_to_fit does not change the contents or size of the vector. All it might do is release some of its internal memory back to a lower level library or the OS, etc. behind the scenes. It may invalidate iterators, pointers, and references, but the only other change you might see would be a reduction of capacity(). It's also valid for shrink_to_fit to do absolutely nothing at all.
It sounds like you want the "Erase-remove" idiom:
vec.erase(std::remove(vec.begin(), vec.end(), nullptr), vec.end());
The std::remove shifts all the elements which don't compare equal to nullptr left, filling the "gaps". But it doesn't change the vector's size; instead it returns an iterator to the position in the vector just after the sequence of shifted elements; the rest of the elements still exist but have been moved from. Then the erase member function gets rid of those unnecessary end elements, reducing the vector's size.
Or as #chris notes, C++20 adds an erase overload to std::vector and a related erase_if, which makes things easier. They may already be supported in MSVC 2019. Using the new erase could just look like:
vec.erase(nullptr);
This quick test show that u can't do like this.
int x = 1;
vector<int*> a;
cout << a.capacity() << endl;
for (int i = 0; i < 10; ++i) {
a.push_back(&x);
}
cout << a.capacity() << endl;
a[9] = nullptr;
a.shrink_to_fit();
cout << a.capacity() << endl;
Result:
0
16
10
m_gates[index].release(); m_gates.shrink_to_fit();
Based on your comment, what you're looking for is simply to erase this single element from your vector right then and there. Replace both of these statements with:
m_gates.erase(m_gates.begin() + index);
Or a more generic version if swapping containers in the future is a possibility:
using std::begin;
m_gates.erase(std::next(begin(m_gates), index));
erase supports iterators rather than indices, so there's a conversion in there. This will remove the pointer from the vector while calling its destructor, which causes unique_ptr to properly clean up its memory.
Now erasing elements one by one could potentially be a performance concern. If it does end up being a concern, you can do what you were getting at in the question and null them out, then remove them all in one go later on:
m_gates[index].reset();
// At some point in the program's future:
std::erase(m_gates, nullptr);
What you have right now is highly likely to be a memory leak. release releases ownership of the managed memory, meaning you're now responsible for cleaning it up, which isn't what you were looking for. Both erase and reset (or equivalently, = {} or = nullptr) will actually call the destructor of unique_ptr while it still has ownership and properly clean up the memory. shrink_to_fit is for vector capacity, not size, and is unrelated.
at the end the solution that i found was simple:
void Controller::delete_allocated_memory(int index)
{
m_vec.erase(m_vec.begin() + index);
m_vec.shrink_to_fit();
}
it works fine even if the vector is made of unique_ptrs, as far as i know it doesn't even create the null pointer that i was talking about and it shifts left all existing objects in the vector.
what do you think?

C++: std::vector with space reserving for both ends

Just some thoughts:
I wanted to erase x elements from the beginning of a std::vector with size n > x. Since a vector uses an array internally, this could be as easy as setting the pointer for vector.begin() to the index of the first element which is kept. But instead, erase shifts all elements in the array to that the first element actually starts at index 0, making the erase operation take much more time than it could.
Furthermore, if the valid 'zone' of the internal array was really controlled by just some start and end indices / pointers of the vector structure, then there would be also the option to reserve space in front of the first element. E.g., a vector is initialized and 20 spaces are reserved at the end and 10 at the beginning. Internally, then an array of space 30 (or 32) is created, where the start index/pointer points to the 11th value of the internal array, allowing to include new elements to the front of the 'vector' in constant speed.
Well, my point is, I think such a data structure would be somewhat useful, at least for my purposes. I'm pretty sure someone already thought of this and already implemented it. So I want to ask: How is the data structure called that I'm describing? If it exists, I'd love to use it. I think this is not a double-linked list, since there, every element is kind of a struct containing the element value and additional pointers to the neighbors (to my knowledge).
EDIT: And yes, such a structure would probably use more memory than necessary, especially when erasing some elements from the beginning, because then, the internal array still has the initial size. But well, memory isn't a big issue anymore for most problems, and there could be a (time-expensive) 'memory-optimize' operation to create a new, smaller array, copying over all old values and deleting the old internal array to use the smallest possible size.
Expanding on #Kerrek SB's comment, boost::circular_buffer<> does I think what you need, for example:
#include <iostream>
#include <boost/circular_buffer.hpp>
int main()
{
boost::circular_buffer<int> cb(3);
cb.push_back(1);
cb.push_back(2);
cb.push_back(3);
for( auto i : cb) {
std::cout << i << std::endl;
}
// Increase to hold two more items
cb.set_capacity(5);
cb.push_back(4);
cb.push_back(5);
for( auto i : cb) {
std::cout << i << std::endl;
}
// Increase to hold two more items
cb.rset_capacity(7);
cb.push_front(0);
cb.push_front(-1);
for( auto i : cb) {
std::cout << i << std::endl;
}
}
TBH - I have not looked at the implementation, so cannot comment on whether it moves data around (I'd be highly surprised.) but if you pull down the source, take a quick peek to satisfy if performance is a concern...
EDIT: Quick look at the code reveals that the push_xxx operations does not indeed move data around, however the xxx_capacity operations do result in a move/copy - to avoid that, ensure the ring has enough capacity at the start and it will work as you wish...

Why doesn't vector::clear remove elements from a vector?

When I use clear() on a std::vector, it is supposed to destroy all the elements in the vector, but instead it doesn't.
Sample code:
vector<double> temp1(4);
cout << temp1.size() << std::endl;
temp1.clear();
cout << temp1.size() << std::endl;
temp1[2] = 343.5; // I should get segmentation fault here ....
cout << "Printing..... " << temp1[2] << endl;
cout << temp1.size() << std::endl;
Now, I should have gotten segmentation fault while trying to access the cleared vector, but instead it fills in the value there (which according to me is very buggy)
Result looks as follows:
4
0
Printing..... 343.5
0
Is this normal? This is a very hard bug to spot, which basically killed my code for months.
You have no right to get a segmentation fault. For that matter, a segmentation fault isn't even part of C++. Your program is removing all elements from the vector, and you're illegally accessing the container out of bounds. This is undefined behaviour, which means anything can happen. And indeed, something happened.
When you access outside of the bounds of a vector, you get Undefined Behavior. That means anything can happen. Anything.
So you could get the old value, garbage, or a seg-fault. You can't depend on anything.
If you want bounds checking, use the at() member function instead of operator []. It will throw an exception instead of invoking Undefined Behavior.
From cppreference:
void clear();
Removes all elements from the container. Invalidates any references, pointers, or iterators referring to contained elements. May invalidate any past-the-end iterators. Many implementations will not release allocated memory after a call to clear(), effectively leaving the capacity of the vector unchanged.
So the reason there is no apparent problem is because the vector still has the memory available in store. Of course this is merely implementation-specific, but not a bug. Besides, as the other answers point out, your program also does have Undefined Behavior for accessing the cleared contents in the first place, so technically anything can happen.
Let's imagine you're rich (perhaps you are or you aren't ... whatsoever)!
Since you're rich you buy a piece of land on Moorea (Windward Islands, French Polynesia).
You're very certain it is a nice property so you build a villa on that island and you live there.
Your villa has a pool, a tennis court, a big garage and even more nice stuff.
After some time you leave Moorea since you think it's getting really boring. A lot of sports but few people.
You sell your land and villa and decide to move somewhere else.
If you come back some time later you may encounter a lot of different things but you cannot be certain about even one of them.
Your villa may be gone, replaced by a club hotel.
Your villa may be still there.
The island may be sunken.
...
Who knows?
Eventhough the villa may not longer belong to you, you might even be able to jump in the pool or play tennis again.
There may also be another villa next to it where you can swim in an even bigger pool with nobody distracting you.
You have no guarantee of what you're gong to discover if you come back again and that's the same with your vector which contains three pointers in the implementations I've looked at:
(The names may be different but the function is mostly the same.)
begin points to the start of the allocated memory location (i.e. X)
end which points to the end of the allocated memory +1 (i.e. begin+4)
last which points to the last element in the container +1 (i.e. begin+4)
By calling clear the container may well destroy all elements and reset last = begin;. The function size() will most likely return last-begin; and so you'll observe a container size of 0.
Nevertheless, begin may still be valid and there may still be memory allocated (end may still be begin+4). You can even still observe values you set before clear().
std::vector<int> a(4);
a[2] = 12;
cout << "a cap " << a.capacity() << ", ptr is " << a.data() << ", val 2 is " << a[2] << endl;
a.clear();
cout << "a cap " << a.capacity() << ", ptr is " << a.data() << ", val 2 is " << a[2] << endl;
Prints:
a cap 4, ptr is 00746570, val 2 is 12
a cap 4, ptr is 00746570, val 2 is 12
Why don't you observe any errors? It is because std::vector<T>::operator[] does not perform any out-of-boundary checks (in contrast to std::vector<T>::at() which does).
Since C++ doesn't contain "segfaults" your program seems to operate properly.
Note: On MSVC 2012 operator[] performs boundary checks if compiled in the debug mode.
Welcome to the land of undefined behaviour! Things may or may not happen. You probably can't even be cartain about a single circumstance.
You can take a risk and be bold enough to take a look into it but that is probably not the way to produce reliable code.
The operator[] is efficient but comes at a price: it does not perform boundary checking.
There are safer yet efficient way to access a vector, like iterators and so on.
If you need a vector for random access (i.e. not always sequential), either be very careful on how you write your programs, or use the less efficient at(), which in the same conditions would have thrown an exception.
you can get seg fault but this is not for sure since accessing out of range elements of vector with operator[] after clear() called before is just undefined behavior. From your post it looks like you want to try if elements are destroyed so you can use at public function for this purpose:
The function automatically checks whether n is within the bounds of
valid elements in the vector, throwing an out_of_range exception if it
is not (i.e., if n is greater or equal than its size). This is in
contrast with member operator[], that does not check against bounds.
in addition, after clear():
All iterators, pointers and references related to this container are
invalidated.
http://www.cplusplus.com/reference/vector/vector/at/
try to access to an elements sup than 4 that you use for constructor may be you will get your segmentation fault
An other idea from cplusplus.com:
Clear content
Removes all elements from the vector (which are destroyed), leaving the container with a size of 0.
A reallocation is not guaranteed to happen, and the vector capacity is not guaranteed to change due to calling this function. A typical alternative that forces a reallocation is to use swap:
vector().swap(x); // clear x reallocating
If you use
temp1.at(2) = 343.5;
instead of
temp1[2] = 343.5;
you would find the problem. It is recommended to use the function of at(), and the operator[] doesn't check the boundary. You can avoid the bug without knowing the implementation of STL vector.
BTW, i run your code in my Ubuntu (12.04), it turns out like what you say. However, in Win7, it's reported "Assertion Failed".
Well, that reminds me of the type of stringstream. If define the sentence
stringstream str;
str << "3456";
If REUSE str, I was told to do like this
str.str("");
str.clear();
instead of just using the sentence
str.clear();
And I tried the resize(0) in Ubuntu, it turns out useless.
Yes this is normal. clear() doesn't guarantee a reallocation. Try using resize() after clear().
One important addition to the answers so far: If the class the vector is instanciated with provides a destructor, it will be called on clearing (and on resize(0), too).
Try this:
struct C
{
char* data;
C() { data = strdup("hello"); }
C(C const& c) { data = strdup(c.data); }
~C() { delete data; data = 0; };
};
int main(int argc, char** argv)
{
std::vector<C> v;
v.push_back(C());
puts(v[0].data);
v.clear();
char* data = v[0].data; // likely to survive
puts(data); // likely to crash
return 0;
}
This program most likely will crash with a segmentation fault - but (very likely) not at char* data = v[0].data;, but at the line puts(data); (use a debugger to see).
Typical vector implementations leave the memory allocated intact and leave it as is just after calling the destructors (however, no guarantee - remember, it is undefined behaviour!). Last thing that was done was setting data of the C instance to nullptr, and although not valid in sence of C++/vector, the memory is still there, so can access it (illegally) without segmentation fault. This will occur when dereferencing char* data pointer in puts, as being null...

C++ How can I iterate till the end of a dynamic array?

suppose I declare a dynamic array like
int *dynArray = new int [1];
which is initialized with an unknown amount of int values at some point.
How would I iterate till the end of my array of unknown size?
Also, if it read a blank space would its corresponding position in the array end up junked?
Copying Input From users post below:
Thing is:
a) I'm not allowed to use STL (means: no )
b) I want to decompose a string into its characters and store them. So far I wanted to use a function like this:
string breakLine (string line){
int lineSize = line.size();
const char *aux;
aux=line.data();
int index=0;
while (index<=lineSize){
mySynonyms[index]=aux[index];
index++;
}
I thought that the array aux would end up junked if there was a large blank space between the two numbers to be stored (apparently not). And I was wondering if there was a way to iterate till an undefined end in this type of array. Thanks for you answers.
You don't: wrap the array into a structure that remembers its length: std::vector.
std::vector v(1);
std::for_each( v.begin(), v.end(), ... );
No portable way of doing this. Either pass the size together with the array, or, better, use a standard container such as std::vector
Short answer is that you can't. If you have a pointer to the first element of an array, you can't know what the size of the array is. Why do you want to use a array in the first place. You would be much better off using a std::vector if your array can change size dynamically, or a boost::Array if it will be a fixed size.
I don't understand your second question.
Your code needs to keep to track of the array, so the size would never be unknown. (Or you would have to use some library with code that does this.)
I don't understand the last part of your quesiton. Could you elaborate?
You explained in your post below that you want to look at the guts of a std::string.
If you are expecting your stirng to be like a c-string (aka doesn't contain NULLs), then use line.c_str() instead of line.data(). This will guarantee that aux points to a null terminates c-style string.
After that you can iterate until aux[index] == '\0';
Otherwise, you can use line.data() and string.length/size to get it's size like in your example.
However, "decomposing a string into its characters" is pretty pointless, a string is an array of characters. Just make of copy of the string and store that. You are allowed to do:
char ch = line[index];
Better yet, use iterators on the original string!
for(std::string::const_iterator it = line.begin(); it != line.end(); ++it) {
const char ch = *it;
// do whatever with ch
}
a) I'm not allowed to use STL (means:
no )
What?? Who's moronic idea was that?
std::vector isn't part of the "STL" (which is a copyrighted product of HP), but is (and has been for nearly a decade) part of the C++ Language Standard.
If you're not allowed to use the STL (for whatever reason), the first thing you want to do is actually to implement your own version of it – at least the parts you need, with the level of customizability you need. For example, it's probably overkill to make your own vector class parametrizable with a custom allocator. But nevertheless do implement your own lightweight vector. Everything else will result in a bad, hardly maintainable solution.
This smells like homework, and the teacher's objective is to give you a feeling of what it takes to implement dynamic arrays. So far you're getting an F.
You need to realize that when you allocate memory like this
int *dynArray = new int [1];
you allocate precisely one integer, not an indefinite number of integers to be expanded by some unidentified magic. Most importantly, you can only say
dynArray[0] = 78;
but you cannot say
dynArray[1] = 8973;
The element at index 1 does not exist, you're stepping into memory that was not reserved for you. This particular violation will result in a crash later on, when you deallocate the array, because the memory where you stored 8973 belongs to the heap management data structures, and you corrupted your heap.
As many other responders mention, you must know how many elements you have in the array at all times. So, you have to do something along the lines of
int arraySize = 1;
int *dynArray = new int [arraySize];
arraySize goes together with the array, and is best combined with dynArray in one C++ object.
Now, before you assign to dynarray[1], you have to re-allocate the array:
if (index > arraySize) {
int newSize = index+1;
int *newArray = new int[newSize]
// don't forget to copy the data from old array to new
memcpy(newarray dynArray, sizeof *newArray * arraySize);
arraySize = newSize;
dynArray = newArray;
}
// now you're ready!
dynArray[index] = value;
Now, if you want to make it a bit more efficient, you allocate more than you need, so you don't have to allocate each time you add an element. I'll leave this as an exercise to the reader.
And after doing all this, you get to submit your homework and you get to appreciate the humble std::vector that does all of this for you, plus a lot more.
Use a vector, which has a vector.size() function that returns an integer and a vector.end() function that returns an iterator.
You could create a simple Vector class that has only the methods you need. I actually had to recreate the Vector class for a class that I took this year, it's not very difficult.
If there's a value that cannot be valid, you can use that as a sentinel, and make sure all of your arrays are terminated with that. Of course, it's error-prone and will cause hard-to-find bugs when you happen to miss doing it once, but that's what we used to do while reading files in FORTRAN (back in the all-caps days, and before END= became standard).
Yes, I'm dating myself.