Adding items outside of allocated array? - c++

I have a class Set:
class Set
{
public:
//Default constructor
Set ();
//Some more functions...
private:
int *p;
const int K = 10;
int numval = 0; //Number of ints in the array
//Other variables...
};
The default constructor:
Set::Set()
{
p = new int[K]; //Allocate memory for array with 10 ints
}
If some other function filled the array with 10 ints and then added another one, what would happen? The program doesn't crash and I'm able to print the 11th int. But since I haven't allocated memory for it, where is it stored?
Example:
Set1 += 5;
Would add 5 to the array using the following overloaded operator:
const Set& Set::operator+=(const int x)
{
p[numval] = x; //Add next int after the last int in the array
numval++; //Increment number of ints
return *this;
}

If some other function filled the array with 10 ints and then added another one, what would happen?
You'd write into whatever memory came after the end of the array, causing undefined behaviour: perhaps causing no obvious problems, perhaps corrupting some unrelated data (or the metadata used to manage the heap), or perhaps crashing if there was no writable memory there.
But since I haven't allocated memory for it, where is it stored?
It isn't stored anywhere, in the sense of having storage allocated for it. There's just nothing to stop you writing to arbitrary memory locations beyond the end of an array. Be careful not to do that.
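One way to avoid writing past the end is to track the capacity and grow the buffer before it overflows. The following is a minimal sketch under assumptions, not the asker's actual class: the members capacity, size() and at() are hypothetical additions, and copying is disabled only to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical variant of the Set class from the question: it tracks its
// capacity and grows the buffer instead of writing past the end.
class Set {
public:
    Set() : p(new int[K]), capacity(K), numval(0) {}
    ~Set() { delete[] p; }
    Set(const Set&) = delete;             // copying omitted to keep the sketch short
    Set& operator=(const Set&) = delete;

    Set& operator+=(int x) {
        if (numval == capacity) {         // array is full: allocate a larger one
            std::size_t newCapacity = capacity * 2;
            int* bigger = new int[newCapacity];
            std::copy(p, p + numval, bigger);
            delete[] p;
            p = bigger;
            capacity = newCapacity;
        }
        p[numval++] = x;                  // now guaranteed to be in bounds
        return *this;
    }

    std::size_t size() const { return numval; }
    int at(std::size_t i) const {
        assert(i < numval);               // reject out-of-range reads in debug builds
        return p[i];
    }

private:
    static constexpr std::size_t K = 10;  // initial capacity, as in the question
    int* p;
    std::size_t capacity;
    std::size_t numval;
};
```

With this check in place, adding an 11th element reallocates rather than overwriting neighbouring memory. In practice std::vector<int> already does this bookkeeping for you.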

Computer memory is linear: one huge row of cells (bytes). Every cell has two neighbours (except the first and the last, obviously). Allocating memory is just a way of saying "this part is mine". It's really nothing more than a promise: you promise not to write outside your plot, and in return you get a promise that no one else will write inside it.
So what happens when you write outside your allocated area? You break your promise. There may be someone else's plot right next to yours, or there may be unused space. Nothing happens at the moment you write outside your area. The real problem arises when the rightful owner comes back and tries to pick up what they left, and it turns out to be something else: something you put there. (Of course, it's possible that your plot lies next to something the system considers important. In that case the OS stations guards on the border, and they shoot to kill any trespassers on sight.)
It is your job as a programmer to make your program keep its promises. When processes break their promises, bad things may or may not happen, to them or to other processes.

Related

How to free memory for vector of vectors (C++)

I have a vector<vector<double>> elem and I want to deallocate its memory many times in my program.
I tried using
vector<vector<double>>().swap(elem);
Or even a for loop
for(int i=0; i<elem.size();i++)
vector<double>().swap(elem[i]);
vector<vector<double>>().swap(elem);
elem.resize(dim, vector<double>(0));
(I want the first dimension to be a certain number dim)
But when I call
cout<<elem[0].size();
numerous times in my program, the output keeps growing, even if I've just used the aforementioned method. This issue isn't present with the "main" size of the vector.
i.e.
cout<<elem.size();
always outputs dim
EDIT: I know about clear() but I want to deallocate the vector, shrink_to_fit() doesn't work either. Also this is implemented in a function out of the main one, as follows:
void arrayReset(vector<vector<double>> elem) {
for(int i=0; i<elem.size();i++)
vector<double>().swap(elem[i]);
vector<vector<double>>().swap(elem);
elem.resize(dim, vector<double>(0));
}
Your new function void arrayReset(vector<vector<double>> elem) receives a COPY of your vector and (possibly) cleans that copy; the calling function never sees the change.
If you pass the vector by reference, you manipulate the original vector.
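A minimal sketch of the by-reference version. It assumes dim is passed in as a parameter (in the question it is defined elsewhere) and uses the swap idiom the asker already tried.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Takes the vector by reference, so the caller's vector is the one modified.
// `dim` is a parameter here; in the question it comes from elsewhere.
void arrayReset(std::vector<std::vector<double>>& elem, std::size_t dim) {
    // Swapping with a temporary hands the old storage (including all inner
    // vectors) to the temporary, which is destroyed at the end of the statement.
    std::vector<std::vector<double>>(dim).swap(elem);
    assert(elem.size() == dim);   // elem now holds dim empty inner vectors
}
```

After arrayReset(elem, dim) returns, elem.size() is dim and every elem[i].size() is 0 in the caller as well.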
How to free memory for vector
The way is the same for all vectors regardless of the element type.
Step 1: Remove the elements of the vector. Simplest way is the clear member function. After this step, the size member function will return 0.
Step 2: Call shrink_to_fit member function which requests the memory to be deallocated. After this step, capacity may return 0.
Technically, shrink_to_fit is a request that is not required to be honoured by the language implementation. The only guaranteed way to deallocate the memory is to destroy the vector. Example:
{
std::vector<std::vector<double>> vector;
// use vector here
}
// memory has been deallocated
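The two steps, and the swap-with-a-temporary alternative, can be sketched as follows. The function names are hypothetical; only clear(), shrink_to_fit() and swap() are standard.

```cpp
#include <cassert>
#include <vector>

// Step 1 and step 2 from above: clear() removes the elements, then
// shrink_to_fit() makes a non-binding request to release unused capacity.
void releaseByShrink(std::vector<double>& v) {
    v.clear();
    v.shrink_to_fit();
    assert(v.empty());
}

// Alternative: swap with an empty temporary. The temporary takes over the
// old buffer and is destroyed at the end of the statement.
void releaseBySwap(std::vector<double>& v) {
    std::vector<double>().swap(v);
    assert(v.empty());
}
```

In both cases the vector is empty afterwards; how much capacity remains after shrink_to_fit() is up to the implementation.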
I want to deallocate its memory many times in my program.
Note that deallocating repeatedly is typically slower than keeping the memory and reusing it. I recommend checking that the deallocation actually gains you something.

Memory management when using vector

I am making a game engine and need to use the std::vector container for all of the components and entities in the game.
In a script the user might need to hold a pointer to an entity or component, perhaps to continuously check some kind of state. If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
Considering this issue I have a couple of possible solutions. After each push_back to the vector, would it be viable to compare a saved capacity value against the vector's actual capacity and, if it has changed, fetch the new pointers and overwrite the old ones? Would this be guaranteed to catch every case in which push_back invalidates pointers?
Another solution that I've found is to save an index to the element instead and access it that way, but I suspect that is bad for performance when you need to continuously check the state of that element (every 1/60 second).
I am aware that other containers do not have this issue, but I'd really like to make it work with a vector. It might also be worth noting that I do not know in advance how many entities / components there will be.
Any input is greatly appreciated.
You shouldn't worry about the performance of std::vector when you access its elements only 60 times per second. By the way, in an optimized (Release) build std::vector::operator[] typically compiles down to simple address arithmetic, often a single lea instruction. In Debug mode some implementations add runtime range checks, though.
If the user is going to store pointers to the objects, why even contain them in a vector?
I don't think it is a good idea to create pointers that point to vector elements, i.e. my_ptr = &my_vec[n];. The whole point of a container is to reference the contents in the normal ways that the container supports, not to create outside pointers to elements of the container.
To answer your question about whether you can detect the allocations, yes you could, but it is still probably a bad idea to reference the contents of a vector by pointers to elements.
You could also reserve space in the vector when you create it, if you have some idea of what the maximum size might grow to. As long as the size stays within the reserved capacity, the vector will not reallocate.
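A small sketch of the reserve() guarantee, with a hypothetical helper name: as long as the number of elements stays within the reserved capacity, push_back does not reallocate, so an element's address does not change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the first element stayed at the same address while
// `count` elements were appended after reserving room for them.
bool addressStableAfterReserve(std::size_t count) {
    assert(count >= 1);
    std::vector<int> v;
    v.reserve(count);                        // one allocation up front
    v.push_back(0);
    const int* first = &v[0];                // pointer into the vector
    for (std::size_t i = 1; i < count; ++i) {
        v.push_back(static_cast<int>(i));    // size never exceeds the reserved capacity
    }
    return first == &v[0];
}
```

This only helps when an upper bound is known; one push_back beyond the reserved capacity may reallocate and invalidate the pointer again.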
edit:
After reading other responses, and thinking about what you asked, another thought occurred. If your vector is a vector of pointers to objects, and you pass out the pointers to the objects to your clients, resizing the vector does not invalidate the pointers that the vector holds. The issue becomes keeping track of the lifetime of the object (who owns it), which is why using shared_ptr would be useful.
For example:
vector<shared_ptr<Thing>> my_vec; // Thing stands for whatever element type you use
my_vec.push_back(stuff);
If you pass out the pointers contained in the vector to clients...
client_ptr = my_vec[3];
There will be no problem when the vector resizes. The contents of the vector will be preserved, and whatever was at my_vec[3] will still be there. The object pointed to by my_vec[3] will still be at the same address, and my_vec[3] will still contain that address. Whomever got a copy of the pointer at my_vec[3] will still have a valid pointer.
However, if you did this:
client_ptr = &my_vec[3];
And the client is dereferencing like this:
(*client_ptr)->whatever();
You have a problem. When my_vec reallocates, &my_vec[3] is probably no longer valid, so client_ptr points to freed memory.
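The safe case can be sketched like this. Entity and the function name are hypothetical stand-ins; the point is that the client copies the shared_ptr itself rather than taking the address of the vector slot.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Entity {   // hypothetical element type
    int state;
};

// Returns true if a shared_ptr copied out of the vector still refers to the
// same live object after the vector has grown far past its old capacity.
bool handleSurvivesGrowth() {
    std::vector<std::shared_ptr<Entity>> entities;
    entities.push_back(std::make_shared<Entity>());
    std::shared_ptr<Entity> client = entities[0];   // copy of the pointer, not &entities[0]
    assert(client != nullptr);
    for (int i = 0; i < 1000; ++i) {
        entities.push_back(std::make_shared<Entity>());  // forces reallocations
    }
    client->state = 7;                               // the object itself never moved
    return client == entities[0] && entities[0]->state == 7;
}
```

The vector moves its shared_ptr slots around when it grows, but the Entity objects they point to stay where they are.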
If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
I once wrote some code to analyze what happens when a vector's capacity is exceeded. (Have you done this yet?) What that code demonstrated on my Ubuntu system with g++ 5 was that std::vector simply a) doubles the capacity, b) moves all the elements from the old storage to the new, then c) cleans up the old. Perhaps your implementation is similar. The details of capacity expansion are implementation-dependent.
And yes, any pointer into the vector would be invalidated when push_back() causes capacity to be exceeded.
1) I simply don't use pointers-into-the-vector (and neither should you). In this way the issue is completely eliminated, as it simply can not occur. (see also, dangling pointers) The proper way to access a std::vector (or a std::array) element is to use an index (via the operator[]() method).
After any capacity expansion, the indexes of all existing elements are still valid, because push_back() appends the new element at the end. An element's memory location may have changed, but its index is still the same.
2) It is my practice that I simply don't exceed the capacity. Yes, by that I mean that I have been able to formulate all my problems such that I know the required maximum-capacity. I have never found this approach to be a problem.
3) If the vector contents can not be contained in system memory (my system's best upper limit capacity is roughly 3.5 GBytes), then perhaps a vector container (or any ram based container) is inappropriate. You will have to accomplish your goal using disk storage, perhaps with vector containers acting as a cache.
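Point 1 can be sketched as a small store that hands out indexes instead of pointers. ComponentStore and Component are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Component {   // hypothetical element type
    int state;
};

// Hands out indexes as handles; an index stays valid however often the
// underlying vector reallocates, as long as elements are only appended.
class ComponentStore {
public:
    std::size_t add(int state) {
        items.push_back(Component{state});
        return items.size() - 1;          // index of the element just added
    }
    Component& get(std::size_t index) {
        assert(index < items.size());
        return items[index];              // looked up afresh on every access
    }
private:
    std::vector<Component> items;
};
```

A script would keep the returned index and call get(index) each frame. The sketch does not cover removal; erasing elements would shift later indexes.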
update 2017-July-31
Some code to consider from my latest Game of Life.
Each Cell_t (on the 2-d gameboard) has 8 neighbors.
In my implementation, each Cell_t has a neighbor 'list' (either std::array or std::vector, I've tried both), and after the gameboard has been fully constructed, each Cell_t's init() method is run, filling its neighbor 'list'.
// see Cell_t data attributes
std::array<int, 8> m_neighbors;
// ...
void Cell_t::init()
{
int i = 0;
m_neighbors[i] = validCellIndx(m_row-1, m_col-1); // 1 - up left
m_neighbors[++i] = validCellIndx(m_row-1, m_col); // 2 - up
m_neighbors[++i] = validCellIndx(m_row-1, m_col+1); // 3 - up right
m_neighbors[++i] = validCellIndx(m_row, m_col+1); // 4 - right
m_neighbors[++i] = validCellIndx(m_row+1, m_col+1); // 5 - down right
m_neighbors[++i] = validCellIndx(m_row+1, m_col); // 6 - down
m_neighbors[++i] = validCellIndx(m_row+1, m_col-1); // 7 - down left
m_neighbors[++i] = validCellIndx(m_row, m_col-1); // 8 - left
// ^^^^^^^^^^^^^- returns info to quickly find cell
}
The int value in m_neighbors[i] is the index into the gameboard vector. To determine the next state of the cell, the code 'counts the neighbor's states.'
Note - Some cells are at the edge of the gameboard ... in this implementation, validCellIndx() can return a value indicating 'no-neighbor', (above top row, left of left edge, etc.)
// multiplier: for 100x200 cells,20,000 * m_generation => ~20,000,000 ops
void countNeighbors(int& aliveNeighbors, int& totalNeighbors)
{
{ /* ... initialize m_count[]s to 0 */ }
for(auto neighborIndx : m_neighbors ) { // each of 8 neighbors // 123
if(no_neighbor != neighborIndx) // 8-4
m_count[ gBoard[neighborIndx].m_state ] += 1; // 765
}
aliveNeighbors = m_count[ CellALIVE ]; // CellDEAD = 1, CellALIVE
totalNeighbors = aliveNeighbors + m_count [ CellDEAD ];
} // Cell_Arr_t::countNeighbors
init() pre-computes the indexes of this cell's neighbors. The m_neighbors array holds index integers, not pointers. It is trivial to have NO pointers into the gameboard vector.

Pointers don't point to the right place in a vector of pointers to a class

I have two classes: spot and frame.
spot holds data about a spot detected by some image processor: it has only an id (int) which is unique, and x,y-coordinates (both double). I store all the spots in a vector I call spots.
frame holds, among other things, a vector of pointers to all the spots that belong to it:
class frame
{
int num ;
vector <spot *> spots_list ;
// other members and functions
};
I read the data from a file:
while (/* goes through a lot of rows */)
{
spot* S = new spot (ID, X, Y) ;
spots.push_back (*S) ;
frames[i].spots_list.push_back (&spots.back()) ;
delete S ;
}
So essentially, I'm creating a new instance S, then I add its data to the vector spots, and add a pointer to its address to the frame's spots_list.
(at least, this is what I want to do)
When I try to print all the points in a frame, some of them hold garbage data: e.g. id=423784237, id=-9431101 - and the rest have valid data.
But, when I check it against the vector spots directly, it's not pointing to the right place.
For example, id=37 is in cell 0x20f8288 in the frame's spot_list, but at 0x210d080 in the vector spots.
Since there's random garbage data and the addresses are not the same, I'm pretty sure I'm not doing this correctly - but I don't understand what I should do differently. I would appreciate any help.
spots.push_back(*S);
This call will sometimes need to reallocate the storage inside spots; when that happens, any previously-stored addresses become invalid, so the entries in the spots_list become bogus.
If you know how large the vector will be, you can pre-allocate the internals of spots with spots.reserve(size); otherwise, you can store indices into the vector instead of pointers.
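A minimal sketch of the index-based variant. Spot and Frame are simplified stand-ins for the asker's classes, and addSpot and spotOf are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Spot {                                // simplified stand-in for the spot class
    int id;
    double x;
    double y;
};

struct Frame {                               // simplified stand-in for the frame class
    std::vector<std::size_t> spot_indices;   // positions in the global spots vector
};

// Appends a spot by value (no new/delete needed) and records its index.
void addSpot(std::vector<Spot>& spots, Frame& frame, int id, double x, double y) {
    spots.push_back(Spot{id, x, y});
    frame.spot_indices.push_back(spots.size() - 1);
}

// Looks up the k-th spot of a frame through its stored index.
const Spot& spotOf(const std::vector<Spot>& spots, const Frame& frame, std::size_t k) {
    assert(k < frame.spot_indices.size());
    assert(frame.spot_indices[k] < spots.size());
    return spots[frame.spot_indices[k]];
}
```

Because the frame stores positions rather than addresses, it does not matter how often spots reallocates while the file is being read.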

Can't keep dynamically allocated memory consistent after return

I'm working on a chess-playing program. As part of it, I wrote a static method that is supposed to recursively operate on its input by calling itself with varied versions of the board, a Piece * board[8][8], and pass back the location of the "best" version of the board inside a std::unique_ptr, which is the return type of the method.
Node is defined as such:
class node
{
public:
node();
~node();
std::unique_ptr<node> l;
std::unique_ptr<node> r;
std::unique_ptr<node> m;
node * best;
int bestval;
Piece * (*board)[8];
};
The goal is to eventually have the result of the initial call to the recursive method contain a "best" value which links to the whole chain of best path choices through the chessboard. I would then draw the series of board states that result.
As part of this, the board must be preserved. Whichever board "wins" at each recursive step gets copied to dynamic memory, and the board pointer of the return ( Piece * (*board)[8] in the node declaration) is set to this dynamically allocated memory.
This is done like so:
std::unique_ptr<node> ret (new node);
Piece *** reboard = new Piece**[8];
for (int i = 0; i < 8; i++)
{
reboard[i] = new Piece*[8];
}
...code to copy values to reboard and set other ret property values...
ret->board = reboard;
return ret;
All the local values of the winning chess board are then copied to reboard. This all works fine. If I copy all the values of reboard to a global board at this stage, return, and directly draw that global board to the screen, it draws the correct result. Likewise, if I set ret->board to point to that global board and then copy values to the global board and return, it draws the correct values.
But, if I do what I've written above and try to draw ret->board, I get invalid memory access errors in my draw method, and I'm pulling my hair out trying to pin this problem down. It seems that immediately after return, the memory pointed to by reboard is reclaimed somehow. In this memory which should only be data, I see that entries into the array appear to point to code in msctf.dll, among other invalid data pointers. I thought it was being reclaimed by garbage collection, so I've even tried putting in some std::declare_reachable calls on any and every pointer I can see, but this has not helped.
Does anyone recognize what's going on here? Shouldn't that dynamically allocated memory stick around until I free it?
std::unique_ptr is a smart pointer that retains sole ownership of an object through a pointer and destroys that object when the unique_ptr goes out of scope. No two unique_ptr instances can manage the same object.
Source: cppreference.com
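In other words, the node is freed as soon as the unique_ptr that owns it is destroyed. Returning the unique_ptr by value moves ownership to the caller, so if the caller does not keep that unique_ptr alive (for example, it stores only the raw pointer), the memory is freed and what remains is garbage. Of course trying to dereference that memory is likely to cause an access violation.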
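The ownership rule in the quote can be observed with a counter. This is a sketch under assumptions: Node is a stand-in with a live-object counter, not the asker's node class, and the function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Node {                        // stand-in for the question's node class
    static int alive;                // number of Node objects currently alive
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

std::unique_ptr<Node> makeNode() {
    std::unique_ptr<Node> ret(new Node);
    return ret;                      // ownership moves to the caller
}

// Number of live nodes while the caller still holds the unique_ptr.
int aliveWhileOwned() {
    std::unique_ptr<Node> owner = makeNode();
    assert(owner != nullptr);
    return Node::alive;              // the node still exists here
}

// Number of live nodes after only the raw pointer was kept.
int aliveAfterKeepingRawPointer() {
    Node* raw = makeNode().get();    // the temporary unique_ptr dies at the ';'
    (void)raw;                       // raw now dangles and must not be dereferenced
    return Node::alive;
}
```

The object lives exactly as long as some unique_ptr owns it, so the caller has to store the returned unique_ptr itself rather than a raw pointer taken from it.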

Initializing and maintaining structs of structs

I’m writing C++ code to deal with a bunch of histograms that are populated from laboratory measurements. I’m running into problems when I try to organize things better, and I think my problems come from mishandling pointers and/or structs.
My original design looked something like this:
// the following are member variables
Histogram *MassHistograms[3];
Histogram *MomentumHistograms[3];
Histogram *PositionHistograms[3];
where element 0 of each array corresponded to one laboratory measurement, element 1 of each corresponded to another, etc. I could access the individual histograms via MassHistograms[0] or similar, and that worked okay. However, the organization didn't seem right to me—if I were to perform a new measurement, I’d have to add an element to each of the histogram arrays. Instead, I came up with
struct Measurement {
Histogram *MassHistogram;
Histogram *MomentumHistogram;
Histogram *PositionHistogram;
};
As an added layer of complexity, I further wanted to bundle these measurements according to the processing that has been done on their data, so I made
struct MeasurementSet {
Measurement SignalMeasurement;
Measurement BackgroundMeasurement;
};
I think this arrangement is much more logical and extensible—but it doesn’t work ;-) If I have code like
MeasurementSet ms;
Measurement m = ms.SignalMeasurement;
Histogram *h = m.MassHistogram;
and then try to do stuff with h, I get a segmentation fault. Since the analogous code was working fine before, I assume that I’m not properly handling the structs in my code. Specifically, do structs need to be initialized explicitly in any way? (The Histograms are provided by someone else’s library, and just declaring Histogram *SomeHistograms[4] sufficed to initialize them before.)
I appreciate the feedback. I’m decently familiar with Python and Clojure, but my limited knowledge of C++ doesn’t extend to [what seems like] the arcana of the care and feeding of structs :-)
What I ended up doing
I turned Measurement into a full-blown class:
class Measurement {
public:
Measurement() {
MassHistogram = new Histogram();
MomentumHistogram = new Histogram();
PositionHistogram = new Histogram();
}
~Measurement() {
delete MassHistogram;
delete MomentumHistogram;
delete PositionHistogram;
}
Histogram *MassHistogram;
Histogram *MomentumHistogram;
Histogram *PositionHistogram;
};
(The generic Histogram() constructor I call works fine.) The other problem I was having was solved by always passing Measurements by reference; otherwise, the destructor would be called at the end of any function that received a Measurement and the next attempt to do something with one of the histograms would segfault.
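The segfault after passing by value comes from the copy sharing the same raw pointers and deleting them in its own destructor. One alternative, sketched here with a stand-in Histogram type (the real one comes from a third-party library and is assumed to be default-constructible), is to let std::unique_ptr members own the histograms.

```cpp
#include <cassert>
#include <memory>

struct Histogram {                   // stand-in for the library's Histogram type
    int entries = 0;
};

struct Measurement {
    // Each member allocates its histogram on construction and frees it on
    // destruction; no hand-written constructor or destructor is needed.
    std::unique_ptr<Histogram> MassHistogram{new Histogram()};
    std::unique_ptr<Histogram> MomentumHistogram{new Histogram()};
    std::unique_ptr<Histogram> PositionHistogram{new Histogram()};
};

struct MeasurementSet {
    Measurement SignalMeasurement;
    Measurement BackgroundMeasurement;
};

// Passing by reference avoids a copy. Passing an existing Measurement by
// value would not compile, because unique_ptr members make it non-copyable.
void fill(Measurement& m, int value) {
    assert(m.MassHistogram != nullptr);
    m.MassHistogram->entries = value;
}
```

With this layout an accidental copy becomes a compile error instead of a double delete at run time.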
Thank you all for your answers!
Are you sure that Histogram *SomeHistograms[4] initialized the data? How do you populate the Histogram structs?
The problem here is not so much the structs as the pointers. When you write MeasurementSet ms; you declare an 'automatic variable' of type MeasurementSet, which means that all the memory for the MeasurementSet is allocated and ready to go. MeasurementSet, in turn, has two members of type Measurement that are also allocated and ready to go. Measurement, in turn, has three members of type Histogram * that are also allocated and ready to go... but wait! The type Histogram * is a pointer. That means it's an address: a 32- or 64-bit (or whatever) value that describes an actual memory location. And that's it. It's up to you to make it point to something, to put something at that location. Until you do, it holds an indeterminate value (garbage, zeroed-out data, some special debug pattern, or something like that). The point is that if you try to use it, you will likely get a segmentation fault, because you will be attempting to access memory your program isn't supposed to touch.
In C++, a struct is almost exactly the same thing as a class (which has a similar concept in Python), and you typically allocate one like so:
m.MassHistogram = new Histogram();
...after that, the histogram is ready to go. However, YMMV: can you allocate one yourself? Or can you only get one from some library, maybe from a device reading, etc.? Furthermore, although you can do what I wrote, it's not necessarily 'pretty'. A more idiomatic C++ solution would be to put the allocation in a constructor (like __init__ in Python) and the delete in a destructor.
When your struct contains a pointer, you have to initialize that variable yourself.
Example
struct foo
{
int *value;
};
foo bar;
// bar.value so far is not initialized and points to a random piece of data
bar.value = new int(0);
// bar.value now points to a int with the value 0
// remember that you have to delete everything that you new'd, once you're done with it:
delete bar.value;
First, always remember that structs and classes are almost exactly the same things. The only difference is that struct members are public by default, and a class member is private by default.
But all the rest is exactly the same.
Second, carefully differentiate between pointers and objects.
If I write
Histogram h;
space for the Histogram's data will be allocated, and its constructor will be called. (A constructor is a method with exactly the same name as the class, here Histogram().)
If I write
Histogram* h;
I'm declaring a variable of 32/64 bits that will be used as a pointer to memory. It is not initialized, so it holds an indeterminate value. Dangerous!
If I write
Histogram* h = new Histogram();
memory will be allocated for one Histogram's data members, and its constructor will be called. The address in memory will be stored in "h".
If I write
Histogram* copy = h;
I'm again declaring a 32/64 bit variable that points to exactly the same address in memory as h
If I write
Histogram* h = new Histogram;
Histogram* copy = h;
delete h;
the following happens:
Memory is allocated for a Histogram object.
The constructor of Histogram is called (even if you didn't write one, your compiler will generate it).
h contains the memory address of this object, and copy contains the same address.
The delete operator calls the destructor of Histogram (even if you didn't write one, your compiler will generate it).
The memory allocated for the Histogram object is deallocated.
copy still contains the memory address where the object used to be allocated, but you're not allowed to use it. It's called a "dangling pointer".
h is dangling in the same way; its value must not be used.
In short: the "m.MassHistogram" in your code refers to a random area in memory. Don't use it. Either allocate it first using operator "new", or declare it as "Histogram" (an object instead of a pointer).
Welcome to CPP :D
You are aware that your definition of Measurement does not allocate memory for actual Histograms? In your code, m.MassHistogram is an uninitialized pointer: it's not pointing to any measured Histogram, nor to any memory capable of storing a Histogram. As @Nari Rennlos posted just now, you need to point it to an existing (or newly allocated) Histogram.
What does your 3rd party library's interface look like? If it's at all possible, you should have a Measurement containing 3 Histograms (as opposed to 3 pointers to Histograms). That way when you create a Measurement or a MeasurementSet the corresponding Histograms will be created for you, and the same goes for destruction. If you still need a pointer, you can use the & operator:
struct Measurement2 {
Histogram MassHistogram;
Histogram MomentumHistogram;
Histogram PositionHistogram;
};
MeasurementSet2 ms;
Histogram *h = &ms.SignalMeasurement.MassHistogram; //h valid as long as ms lives
Also note that as long as you're not working with pointers (or references), objects will be copied and assigned by value:
MeasurementSet ms; //6 uninitialized pointers to Histograms
Measurement m = ms.SignalMeasurement; //3 more pointers, values taken from first 3 above
Histogram *h = m.MassHistogram; //one more pointer, same uninitialized value
Though if the pointers had been initialized, all 10 of them would be pointing to an actual Histogram at this point.
It gets worse if you have actual members instead of pointers:
MeasurementSet2 ms; //6 Histograms
Measurement2 m = ms.SignalMeasurement; //3 more Histograms, copies of first 3 above
Histogram h = m.MassHistogram; //one more Histogram
h.firstPoint = 42;
m.MassHistogram.firstPoint = 43;
ms.SignalMeasurement.MassHistogram.firstPoint = 44;
...now you have 3 slightly different mass signal histograms, 2 pairs of identical momentum and position signal histograms, and a triplet of background histograms.
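The difference between copies and aliases can be sketched as follows. Histogram is a stand-in with a single firstPoint field, MeasurementSet2 is assumed to hold two Measurement2 members, and the two function names are hypothetical.

```cpp
#include <cassert>

struct Histogram {                    // stand-in with one data field
    int firstPoint = 0;
};

struct Measurement2 {
    Histogram MassHistogram;
    Histogram MomentumHistogram;
    Histogram PositionHistogram;
};

struct MeasurementSet2 {              // assumed layout, mirroring MeasurementSet
    Measurement2 SignalMeasurement;
    Measurement2 BackgroundMeasurement;
};

// Works on copies: the histogram stored inside ms is left untouched.
int setThroughCopies(MeasurementSet2& ms) {
    Measurement2 m = ms.SignalMeasurement;    // copy of three histograms
    Histogram h = m.MassHistogram;            // copy of a copy
    h.firstPoint = 42;
    return ms.SignalMeasurement.MassHistogram.firstPoint;
}

// Works on references: every name refers to the histogram inside ms.
int setThroughReferences(MeasurementSet2& ms) {
    Measurement2& m = ms.SignalMeasurement;   // alias, no copy
    Histogram& h = m.MassHistogram;           // alias, no copy
    assert(&h == &ms.SignalMeasurement.MassHistogram);
    h.firstPoint = 42;
    return ms.SignalMeasurement.MassHistogram.firstPoint;
}
```

Declaring the intermediate variables as references (or pointers) keeps a single set of histograms; plain declarations silently duplicate them.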