Filling a vector with out-of-order data in C++ - c++

I'd like to fill a vector with a (known at runtime) quantity of data, but the elements arrive in (index, value) pairs rather than in the original order. These indices are guaranteed to be unique (each index from 0 to n-1 appears exactly once) so I'd like to store them as follows:
vector<Foo> myVector;
myVector.reserve(n); //total size of data is known
myVector[i_0] = v_0; //data v_0 goes at index i_0 (not necessarily 0)
...
myVector[i_n_minus_1] = v_n_minus_1;
This seems to work fine for the most part; at the end of the code, all n elements are in their proper places in the vector. However, some of the vector functions don't quite work as intended:
...
cout << myVector.size(); //prints 0, not n!
It's important to me that functions like size() still work--I may want to check for example, if all the elements were actually inserted successfully by checking if size() == n. Am I initializing the vector wrong, and if so, how should I approach this otherwise?

myVector.reserve(n) just tells the vector to allocate enough storage for n elements, so that when you push_back new elements into the vector, the vector won't have to continually reallocate more storage -- it may have to do this more than once, because it doesn't know in advance how many elements you will insert. In other words you're helping out the vector implementation by telling it something it wouldn't otherwise know, and allowing it to be more efficient.
But reserve doesn't actually make the vector be n long. The vector is empty, and in fact statements like myVector[0] = something are illegal, because the vector is of size 0: on my implementation I get an assertion failure, "vector subscript out of range". This is on Visual C++ 2012, but I think that gcc is similar.
To create a vector of the required length simply do
vector<Foo> myVector(n);
and forget about the reserve.
(As noted in the comment, you can also call resize to set the vector size, but in your case it's simpler to pass the size as the constructor parameter.)

You need to call myVector.resize(n) to set (change) the size of the vector. Calling reserve doesn't actually resize the vector; it just makes it so you can later resize without reallocating memory. Writing past the end of the vector (as you are doing here -- the vector size is still 0 when you write to it) is undefined behavior.
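A minimal sketch of the sized-vector approach described above, assuming int elements in place of Foo and a hypothetical helper name fillByIndex. It relies on the question's guarantee that every index from 0 to n-1 appears exactly once:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Build a vector of n elements from (index, value) pairs that arrive in any order.
// Assumes every index in [0, n) appears exactly once, as the question states.
std::vector<int> fillByIndex(std::size_t n,
                             const std::vector<std::pair<std::size_t, int>>& pairs)
{
    std::vector<int> result(n);   // size() == n, elements value-initialized to 0
    for (const auto& p : pairs) {
        assert(p.first < n);      // guard against an out-of-range index
        result[p.first] = p.second;
    }
    return result;
}
```

Because the vector is constructed with n elements, size() reports n from the start, and every operator[] write lands on an element that actually exists.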

Related

Memory management when using vector

I am making a game engine and need to use the std::vector container for all of the components and entities in the game.
In a script the user might need to hold a pointer to an entity or component, perhaps to continuously check some kind of state. If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
Considering this issue, I have a couple of possible solutions. After each push_back to the vector, would it be viable to check whether a stored capacity variable is exceeded by the actual capacity of the vector, and if so, fetch the new pointers and overwrite the old ones? Would this be guaranteed to "catch" every case that invalidates pointers when performing a push_back?
Another solution that I've found is to instead save an index to the element and access it that way, but I suspect that is bad for performance when you need to continuously check the state of that element (every 1/60 second).
I am aware that other containers do not have this issue but I'd really like to make it work with a vector. Also it might be worth noting that I do not know in advance how many entities / components there will be.
Any input is greatly appreciated.
You shouldn't worry about the performance of std::vector when you access its elements only 60 times per second. By the way, in Release compilation mode std::vector::operator[] typically compiles down to a single address computation (for example, one lea instruction on x86). In Debug mode some implementations add runtime range checks, though.
If the user is going to store pointers to the objects, why even contain them in a vector?
I don't feel like it is a good idea to create pointers that point to vector elements, i.e. my_ptr = &my_vec[n];. The whole point of a container is to reference the contents in the normal ways that the container supports, not to create outside pointers to elements of the container.
To answer your question about whether you can detect the allocations, yes you could, but it is still probably a bad idea to reference the contents of a vector by pointers to elements.
You could also reserve space in the vector when you create it, if you have some idea of what the maximum size might grow to. Then it would not reallocate as long as that size is never exceeded.
edit:
After reading other responses, and thinking about what you asked, another thought occurred. If your vector is a vector of pointers to objects, and you pass out the pointers to the objects to your clients, resizing the vector does not invalidate the pointers that the vector holds. The issue becomes keeping track of the life of the object (who owns it), which is why using shared_ptr would be useful.
For example:
vector<shared_ptr<Entity>> my_vec; // Entity stands in for your own object type
my_vec.push_back(stuff);
if you pass out the pointers contained in the vector to clients...
client_ptr = my_vec[3];
There will be no problem when the vector resizes. The contents of the vector will be preserved, and whatever was at my_vec[3] will still be there. The object pointed to by my_vec[3] will still be at the same address, and my_vec[3] will still contain that address. Whomever got a copy of the pointer at my_vec[3] will still have a valid pointer.
However, if you did this:
client_ptr = &my_vec[3];
And the client is dereferencing like this:
(*client_ptr)->whatever();
You have a problem. Once my_vec has been resized, &my_vec[3] is probably no longer valid, so client_ptr is left dangling.
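To make the first case concrete, here is a small sketch, assuming a hypothetical Entity type and a hypothetical helper handleSurvivesGrowth. It hands out a copy of the shared_ptr stored in the vector, forces the vector to reallocate, and checks that the object has not moved:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Entity { int hp; };   // hypothetical element type, for illustration only

// Returns true if a shared_ptr copied out of the vector still refers to the
// same live object after the vector has been forced to reallocate.
bool handleSurvivesGrowth()
{
    std::vector<std::shared_ptr<Entity>> entities;
    entities.push_back(std::make_shared<Entity>(Entity{42}));

    std::shared_ptr<Entity> client = entities[0];  // copy of the pointer, not &entities[0]
    assert(client.use_count() == 2);               // vector and client share ownership
    Entity* addressBefore = client.get();

    for (int i = 0; i < 1000; ++i)                 // enough pushes to trigger reallocation
        entities.push_back(std::make_shared<Entity>(Entity{i}));

    return entities[0].get() == addressBefore && client->hp == 42;
}
```

Only the small shared_ptr objects are moved when the vector grows; the Entity they point to stays where it was allocated.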
If something is added to the vector that the pointer points to and the
capacity is exceeded, it is my understanding that the vector will
allocate new memory and every pointer that points to any element in
the vector will become invalid.
I once wrote some code to analyze what happens when a vector's capacity is exceeded. (Have you done this yet?) What that code demonstrated on my Ubuntu system with g++ v5 was that the std::vector code simply a) doubles the capacity, b) moves all the elements from the old storage to the new storage, then c) cleans up the old. Perhaps your implementation is similar. I think the details of capacity expansion are implementation dependent.
And yes, any pointer into the vector would be invalidated when push_back() causes capacity to be exceeded.
1) I simply don't use pointers-into-the-vector (and neither should you). In this way the issue is completely eliminated, as it simply can not occur. (see also, dangling pointers) The proper way to access a std::vector (or a std::array) element is to use an index (via the operator[]() method).
After any capacity expansion, the indexes of all existing elements are still valid, because push_back() appends the new element at the end. The elements' memory location may have changed, but each element's index is still the same.
2) It is my practice that I simply don't exceed the capacity. Yes, by that I mean that I have been able to formulate all my problems such that I know the required maximum-capacity. I have never found this approach to be a problem.
3) If the vector contents can not be contained in system memory (my system's best upper limit capacity is roughly 3.5 GBytes), then perhaps a vector container (or any ram based container) is inappropriate. You will have to accomplish your goal using disk storage, perhaps with vector containers acting as a cache.
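Point 1 can be sketched roughly as follows, assuming int elements and a hypothetical helper name. The code keeps an index as the handle, pushes until the capacity changes, and then reads the element back through that index:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stores an index instead of a pointer, then grows the vector past its capacity.
// Returns the value read back through the index after the growth.
int readThroughIndexAfterGrowth()
{
    std::vector<int> values{7};
    const std::size_t handle = 0;              // index of the element we care about

    const std::size_t oldCapacity = values.capacity();
    while (values.capacity() == oldCapacity)   // push until a reallocation happens
        values.push_back(0);

    assert(handle < values.size());            // the index is still in range
    return values[handle];                     // still refers to the same logical element
}
```

A raw pointer taken before the loop would be dangling at this point, while the index keeps working because it is resolved against the vector's current storage on every access.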
update 2017-July-31
Some code to consider from my latest Game of Life.
Each Cell_t (on the 2-d gameboard) has 8 neighbors.
In my implementation, each Cell_t has a neighbor 'list' (either std::array or std::vector, I've tried both), and after the gameboard has been fully constructed, each Cell_t's init() method is run, filling its neighbor 'list'.
// see Cell_t data attributes
std::array<int, 8> m_neighbors;
// ...
void Cell_t::init()
{
int i = 0;
m_neighbors[i] = validCellIndx(m_row-1, m_col-1); // 1 - up left
m_neighbors[++i] = validCellIndx(m_row-1, m_col); // 2 - up
m_neighbors[++i] = validCellIndx(m_row-1, m_col+1); // 3 - up right
m_neighbors[++i] = validCellIndx(m_row, m_col+1); // 4 - right
m_neighbors[++i] = validCellIndx(m_row+1, m_col+1); // 5 - down right
m_neighbors[++i] = validCellIndx(m_row+1, m_col); // 6 - down
m_neighbors[++i] = validCellIndx(m_row+1, m_col-1); // 7 - down left
m_neighbors[++i] = validCellIndx(m_row, m_col-1); // 8 - left
// ^^^^^^^^^^^^^- returns info to quickly find cell
}
The int value in m_neighbors[i] is the index into the gameboard vector. To determine the next state of the cell, the code 'counts the neighbor's states.'
Note - Some cells are at the edge of the gameboard ... in this implementation, validCellIndx() can return a value indicating 'no-neighbor', (above top row, left of left edge, etc.)
// multiplier: for 100x200 cells,20,000 * m_generation => ~20,000,000 ops
void countNeighbors(int& aliveNeighbors, int& totalNeighbors)
{
{ /* ... initialize m_count[]s to 0 */ }
for(auto neighborIndx : m_neighbors ) { // each of 8 neighbors // 123
if(no_neighbor != neighborIndx) // 8-4
m_count[ gBoard[neighborIndx].m_state ] += 1; // 765
}
aliveNeighbors = m_count[ CellALIVE ]; // CellDEAD = 1, CellALIVE
totalNeighbors = aliveNeighbors + m_count [ CellDEAD ];
} // Cell_Arr_t::countNeighbors
init() pre-computes the indexes of this cell's neighbors. The m_neighbors array holds index integers, not pointers. It is trivial to have NO pointers into the gameboard vector.

Using the same vector without the resize() part

I have a question about std::vector -
vector<int> vec(1,0);
while(/* something_1 */)
{
while(/* something_2 */)
{
...
vec.push_back(var);
...
}
process(vec.size()); //every iteration- different size
vec.clear();
vec.resize(0,0);
}
In this case, on every vec.push_back(var) there is a reallocation of a new array that is one element bigger than the former array.
My question is: is there a way, using one vector, to make vec.push_back(var) start pushing from the first cell of vec again after the inner while(/* something_2 */) loop, instead of using vec.clear() and vec.resize(0,0)? That way I could save the resize part and the reallocation.
The size of the vector is important for the function process(vec.size())
Thanks.
You can call reserve once up front if you know beforehand approximately how much your vector could grow.
clear leaves the capacity() of the vector unchanged, which means that push_back and the other modifiers will reuse the same memory.
resize(0,0) should be removed.
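A rough sketch of that reuse pattern, with a hypothetical helper name and made-up loop bounds standing in for something_1 and something_2. It reserves once, then fills and clears the same vector repeatedly, checking that the capacity never changes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills and clears one vector repeatedly and reports whether its capacity
// ever changed after the initial reserve.
bool capacityStableAcrossClear(std::size_t rounds, std::size_t itemsPerRound)
{
    std::vector<int> vec;
    vec.reserve(itemsPerRound);                 // one allocation up front
    const std::size_t cap = vec.capacity();

    for (std::size_t r = 0; r < rounds; ++r) {
        for (std::size_t i = 0; i < itemsPerRound; ++i)
            vec.push_back(static_cast<int>(i)); // starts again at index 0 each round
        assert(vec.size() == itemsPerRound);    // size() is what process() would see
        vec.clear();                            // size becomes 0, storage is kept
        if (vec.capacity() != cap)
            return false;
    }
    return true;
}
```

After clear() the next push_back writes to the first cell again, so size() still reflects only the elements pushed in the current iteration.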

STL's vector resizing

I can't find this piece of information. I'm dealing with an odd situation here where I'm inside of a loop and I can get a random piece of information at any given time. This information has to be stored in a vector. Now each frame I have to resize this vector to ensure that I won't exceed the space (I'm writing values into random points in the vector using indexing).
Now assuming there's no way to change this piece of code, I want to know, does the vector "ignore" the resize() function if I send an argument that's exactly the size of the vector? Where can I find this information?
From the MSDN reference:
If the container's size is less than the requested size, _Newsize, elements are added to the vector until it reaches the requested size. If the container's size is larger than the requested size, the elements closest to the end of the container are deleted until the container reaches the size _Newsize. If the present size of the container is the same as the requested size, no action is taken.
The ISO C++ standard (page 485) specifies this behaviour for vector::resize:
void resize ( size_type sz , T c = T ());
if ( sz > size ())
insert ( end () , sz - size () , c );
else if ( sz < size ())
erase ( begin ()+ sz , end ());
else
; // Does nothing
So yes, the vector ignores it and you don't need to perform a check on your own.
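If you want to convince yourself, a quick check along these lines should do. The helper name is made up for illustration; it compares the buffer address and the contents before and after resizing to the current size:

```cpp
#include <cassert>
#include <vector>

// Calls resize() with the current size and reports whether anything changed.
bool resizeToSameSizeIsNoOp()
{
    std::vector<int> v{1, 2, 3};
    assert(v.size() == 3);
    const int* before = v.data();    // address of the current buffer
    const std::vector<int> copy = v; // snapshot of the contents

    v.resize(v.size());              // requested size equals current size

    return v.data() == before && v == copy;
}
```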
Kinda-sorta.
Simply resizing a vector with resize() can only result in more memory being used by the vector itself (although it does change how much is used by its elements). If there's not enough room in the reserved space, it will reallocate (and implementations sometimes pad the allocation a bit, so the capacity may end up larger than what you asked for). If there is already plenty of room for the requested size, it will not regrow.
When the specification says that the elements past the end of the size will be deleted, it means in place. Basically it will call _M_buff[i].~T() for each element it is deleting. Thus any memory your object allocates will be released, assuming a working destructor, but the space that the object itself occupies (its size) will not be. The vector will grow, and grow, and grow to the maximum size you ever tell it to and will not shrink again while it exists.
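Here is a small sketch of that behaviour, assuming a hypothetical Tracked type that counts its destructor calls. Shrinking with resize() runs the destructors of the removed elements but leaves the capacity alone:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Tracked {                     // hypothetical type, for illustration only
    static int destroyed;            // counts destructor calls
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Shrinks a vector with resize() and returns the capacity afterwards.
std::size_t capacityAfterShrink()
{
    std::vector<Tracked> v(100);
    Tracked::destroyed = 0;          // ignore any temporaries from construction
    v.resize(10);                    // destroys 90 elements in place
    assert(Tracked::destroyed == 90);
    return v.capacity();             // storage for at least 100 elements is still held
}
```

If you do want the memory back, shrink_to_fit() (C++11) is a non-binding request to release the unused capacity.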

inserting into the middle of an array

I have an array int *playerNum which stores the list of all the numbers of the players in the team. Each slot, e.g. playerNum[1], represents a position on the team. If I wanted to add a new player for a new position on the team, that is, insert a new element into the array somewhere near the middle, how would I go about doing this?
At the moment, I was thinking you memcpy up to the position you want to insert the player into a new array and then insert the new player and copy over the rest of it?
(I have to use an array)
If you're using C++, I would suggest not using memcpy or memmove but instead using the copy or copy_backward algorithms. These will work on any data type, not just plain old integers, and most implementations are optimized enough that they will compile down to memmove anyway. More importantly, they will work even if you change the underlying type of the elements in the array to something that needs a custom copy constructor or assignment operator.
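A minimal sketch of the copy_backward approach, assuming the caller has already made sure the array has room for one more element. The function name insertAt and its parameters are made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Inserts value at position pos in an array that currently holds count elements.
// The caller must guarantee room for count + 1 elements (capacity > count).
void insertAt(int* arr, std::size_t count, std::size_t pos, int value)
{
    assert(pos <= count);
    // Shift [pos, count) one slot to the right; copy_backward is safe when the
    // destination range overlaps and lies after the source.
    std::copy_backward(arr + pos, arr + count, arr + count + 1);
    arr[pos] = value;
}
```

For example, with int players[6] = {1, 2, 3, 4, 5, 0} and five players in use, insertAt(players, 5, 2, 99) leaves the array as 1, 2, 99, 3, 4, 5.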
If you have to use an array, after having made sure you have enough storage (using realloc if necessary), use memmove to shift the items from the insertion point to the end by one position, then save your new player at the desired location.
You can't use memcpy if the source and target areas overlap.
This will fail as soon as the objects in your array have non-trivial copy-constructors, and it's not idiomatic C++. Using one of the container classes is much safer (std::vector or std::list for instance).
Your solution using memcpy is correct (under a few assumptions mentioned by others).
However, since you are programming in C++, it is probably a better choice to use std::vector and its insert method.
vector<int> myvector (3,100);
myvector.insert ( myvector.begin() + 1 , 42 ); // insert takes an iterator, not an index
An array occupies a contiguous block of memory, and there is no function to insert an element in the middle. You can create a new array that is one element larger than the original, then copy the original array into the new one together with the new member:
int i;
for(i=0;i<arSize/2;i++)
{
newarray[i]=ar[i]; // copy the first half
}
newarray[i]=newelement; // place the new element in the middle
for(int j=i+1;j<newSize;j++,i++)
{
newarray[j]=ar[i]; // copy the rest, shifted by one
}
If you use the STL, things become easier: use list.
As you're talking about an array and "insert" I assume that it is a sorted array. You don't necessarily need a second array provided that the capacity N of your existing array is large enough to store more entries (N>n, where n is the number of current entries). You can move the entries from k to n-1 (zero-indexed) to k+1 to n, where k is the desired insert position. Insert the new element at index position k and increase n by one. If the array is not large enough in the beginning, you can follow your proposed approach or just reallocate a new array of larger capacity N' and copy the existing data before applying the actual insert operation described above.
BTW: As you're using C++, you could easily use std::vector.
While it is possible to use arrays for this, C++ has better solutions to offer. For starters, try std::vector, which is a decent enough general-purpose container, based on a dynamically-allocated array. It behaves exactly like an array in many cases.
Looking at your problem, however, there are two downsides to arrays or vectors:
Indices have to be 0-based and contiguous; you cannot remove elements from the middle without losing key/value associations for everything after the removed element; so if you remove the player on position 4, then the player from position 9 will move to position 8
Random insertion and deletion (that is, anywhere except the end) is expensive - O(n), that is, execution time grows linearly with array size. This is because every time you insert or delete, a part of the array needs to be moved.
If the key/value thing isn't important to you, and insertion/deletion isn't time critical, and your container is never going to be really large, then by all means, use a vector. If you need random insertion/deletion performance, but the key/value thing isn't important, look at std::list (although you won't get random access then, that is, the [] operator isn't defined, as implementing it would be very inefficient for linked lists; linked lists are also very memory hungry, with an overhead of two pointers per element). If you want to maintain key/value associations, std::map is your friend.
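As a rough illustration of the std::map option, assuming positions and player numbers are both plain ints (the helper name and the values are made up). Removing one position leaves every other position's key untouched:

```cpp
#include <cassert>
#include <map>

// Removes one position from a position -> player-number map and returns the
// player number still stored under another position.
int playerAfterRemoval()
{
    std::map<int, int> team;          // key: position, value: player number
    team[4] = 17;
    team[9] = 23;
    team.erase(4);                    // position 9 keeps its key
    assert(team.count(4) == 0);
    return team.at(9);
}
```

With a vector, erasing the element at position 4 would shift everything after it down by one, which is exactly the association loss described above.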
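Losing the tail:
#include <stdio.h>
#include <string.h> // memmove
#define s 10
int L[s];
// Insert v at position p; the last element of the array is dropped.
void insert(int v, int p, int *a)
{
memmove(a+p+1,a+p,(s-p-1)*sizeof(int)); // shift a[p..s-2] right by one
*(a+p) = v;
}
int main()
{
for(int i=0;i<s;i++) L[i] = i;
insert(11,6, L);
for(int i=0;i<s;i++) printf("%d %p\n", L[i], (void*)&L[i]);
return 0;
}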

Stumped at a simple segmentation fault. C++

Could somebody be kind to explain why in the world this gives me a segmentation fault error?
#include <vector>
#include <iostream>
using namespace std;
vector <double>freqnote;
int main(){
freqnote[0] = 16.35;
cout << freqnote[0];
return 0;
}
I had other vectors in the code and this is the only vector that seems to be giving me trouble.
I changed it to vector<int>freqnote; and changed the value to 16 and I STILL get the segmentation fault. What is going on?
I have other vector ints and they give me correct results.
Replace
freqnote[0] = 16.35;
with
freqnote.push_back(16.35);
and you'll be fine.
The error is due to that index being out of range. At the time you access the first element via [0], the vector has a size of 0, so there is no element to assign to. push_back(), on the other hand, grows the vector by one element (reallocating if necessary).
You can't initialise an element in a vector like that.
You have to go:
freqnote.push_back(16.35);
then access it as you would an array.
You're accessing the vector out of bounds. First you need to initialize the vector, specifying its size.
int main() {
vector<int> v(10);
v[0] = 10;
}
As has been said, the issue is accessing the vector with an out-of-range index.
A vector is a dynamically sized array: it begins with a size of 0 and you can then extend or shrink it to your heart's content.
There are 2 ways of accessing a vector element by index:
vector::operator[](size_t) (Experts only)
vector::at(size_t)
(I dispensed with the const overloads)
Both have the same semantics, however the second is "secured" in the sense that it will perform bounds checking and throw a std::out_of_range exception in case you're off bound.
I would warmly recommend performing ALL accesses using at.
The performance penalty can be shrugged off for most use cases. The operator[] should only be used by experts, after they have profiled the code and this spot proved to be a bottleneck.
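To illustrate the difference, here is a short sketch with a made-up helper name that repeats the question's mistake using at() instead of operator[]:

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Tries to write element 0 of an empty vector with at() and reports whether
// std::out_of_range was thrown.
bool atThrowsOnEmptyVector()
{
    std::vector<double> freqnote;    // size() == 0, as in the question
    assert(freqnote.empty());
    try {
        freqnote.at(0) = 16.35;      // bounds-checked: throws instead of corrupting memory
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The same write through operator[] is undefined behaviour, which is why the original program may crash with a segmentation fault rather than report a clear error.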
Now, for inserting new elements in the vector you have several alternatives:
push_back will append an element
insert will insert the element in front of the element pointed to by the iterator
Depending on the semantics you wish for, both are to be considered. And of course, both will make the vector grow appropriately.
Finally, you can also define the size explicitly:
vector(size_t n, T const& t = T()) is an overload of the constructor which lets you specify the size
resize(size_t n, T const& t = T()) allows you to resize the vector, appending new elements if it gets bigger than it was
Both methods allow you to supply an element to be copied (an exemplar) and default to copying a default-constructed object (0 if T is an int) if you don't supply the exemplar explicitly.
Besides using push_back() to store new elements, you can also call resize() once before you start using the vector to specify the number of elements it contains. This is very similar to allocating an array.
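Both sizing options can be sketched like this. The helper name makeSized and the value 1.5 are arbitrary choices for illustration:

```cpp
#include <cassert>
#include <vector>

// Builds a vector with the sizing constructor, then grows it with resize().
std::vector<double> makeSized()
{
    std::vector<double> v(3);        // three elements, each 0.0
    v[0] = 16.35;                    // legal: index 0 is within size()
    v.resize(5, 1.5);                // appends two copies of the exemplar 1.5
    assert(v.size() == 5);
    return v;
}
```

Once the vector has a non-zero size, indexing with [0] works just as the original code expected.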