How to use pointer array for segmented data - c++

So I am doing some graphics rendering and have gotten to a point where the data set is too large and loading/run time takes way too long. Completely my fault, as I am copying massive chunks (2+ GB) of data all over the place. Naturally I need to transition to pointers, and here is the problem I face.
We have main data "vector data" and I need to access random areas (xyz points) in it.
vector<float> data{1, 2, 3, ... , 101, 102, 103, ...};
float* point1 = &data[0]; // points to beginning of array (1, 2, 3, ...)
float* point2 = &data[100]; // points to middle of array (101, 102, 103, ...)
Now I need to make an output array that uses both pointers, but I'm not sure how to do this. In essence I want the following.
float* outputList = point1;
outputList+3 = point2;
Such that output List = {1,2,3,101,102,103};
This won't work because I am trying to reassign the actual pointer address in the second line. The second major issue is that the output list would go on after 103 and keep going till the end of the data vector. I know there are a few issues with this, but hopefully I got the idea across. Thank you for any advice.

Well, a pointer holds a memory address. When you assign outputList = point1; outputList points to the same element as point1. Adding a value to a pointer moves it by that many elements (here, floats). You cannot write outputList+3 = point2, because outputList+3 is just a computed address (point1 plus three elements), not a variable you can assign to.
If you would like to have another array with values from both point1 and point2 you would have to allocate new memory and assign the right values to it. You can do that by creating another vector.
However, if you only want to use the values from point1 and point2 you could just iterate over interesting parts of original array and temporarily use the original values (without creating a new array).
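A minimal sketch of the "iterate over the interesting parts" idea, assuming the output only needs to be read and not stored contiguously. Segment, forEachValue and sumSegments are hypothetical names invented for this illustration; each Segment is just a pointer plus a count, so nothing is copied.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a non-owning view of `count` consecutive floats.
struct Segment {
    const float* first;
    std::size_t count;
};

// Visit every value of every segment in order, without copying the data.
template <typename Fn>
void forEachValue(const std::vector<Segment>& segments, Fn fn) {
    for (const Segment& s : segments) {
        for (std::size_t i = 0; i < s.count; ++i) {
            fn(s.first[i]);
        }
    }
}

// Sum all values referenced by the segments (only an example consumer).
float sumSegments(const std::vector<Segment>& segments) {
    float total = 0.0f;
    forEachValue(segments, [&total](float v) { total += v; });
    return total;
}
```

With point1 and point2 from the question, the output list would be something like std::vector<Segment>{{point1, 3}, {point2, 3}}. The segments stay valid only as long as the data vector is neither destroyed nor reallocated.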

This should be a comment, but I don't have enough reputation.
Take a look at std::span.
Maybe you could use a std::vector<std::span<...>>?
std::span is basically a pointer and a size.
Note that if you do something like this you lose cache locality if the data is too far apart.

Related

Moving pointer to vector of vectors to specific position

I have a maze vector of vector of ints and a pointer line to that vector declared as follows. I also have a pointer to line that identifies each element in particular.
std::vector<std::vector<int>> maze;
auto * line = new std::vector<std::vector<int>>(maze); //pointer to maze to hold line position
auto * column = line; // pointer to element in line
maze is a vector of vectors of int that holds numbers. I'm supposed to follow a route starting from 1 and going to the next highest number(+1) until i find an exit or a dead end inside the labyrinth.
It is unclear to me how the pointer line works. My understanding is that it will hold the address of the first element in the first vector of maze. By writing line+1, the pointer will hold the address of the first element of the second vector etc. The compiler lets me write column = &line[1] and i assume column will point to the first element of line[1].
However, i didn't find a way to have line point to a specific vector in maze. I have tried the following and all cause errors:
line = &maze[i]; // make line point to i-th vector in maze
line = &(*maze.begin()); // as suggested in a stackoverflow topic, converting iterator to pointer
*line = maze[i];
The only way i didn't get an error was by writing line->begin = maze.begin(), but i want to move line to any position in maze.
What is the conceptual difference that doesn't allow me to assign line the same way i do column?
Thanks.
It is unclear to me how the pointer line works.
This line
auto * line = new std::vector<std::vector<int>>(maze);
creates a fresh new instance of std::vector<std::vector<int>>, copying the contents of maze into that new instance, and then stores a pointer to the new instance in line. line is not a pointer to maze! That would be this instead:
auto * line = &maze;
By writing line+1, the pointer will hold the address of the first element of the second vector etc.
No. line is not a pointer to an array, or to an element of an array, so you cannot increment it (well, to be precise, it can be considered as an array with one element, and so incrementing the pointer once is fine, but anything beyond that is not).
The only way i didn't get an error was by writing line->begin = maze.begin()
I'll skip the rest, because you are on the wrong track, and I have to admit that I do not completely understand what the second snippet is supposed to do.
However, i didn't find a way to have line point to a specific vector in maze
It is unclear why you want a pointer in the first place. An element of maze you get via
auto x = maze[i][j];
a reference via
auto& r = maze[i][j];
and a pointer via
auto* p = &maze[i][j];
It is not perfectly clear what you are trying to accomplish. What is clear is that your attempt with using line is flawed, because it is not a pointer to maze.
PS: If your intention was to use pointer arithmetic to navigate through the maze, then there is bad news: It won't work. At least not as easily as you hope. A std::vector<T> stores its elements in contiguous memory. However, that memory is not inside the vector object. Hence, the Ts in a std::vector<std::vector<T>> are stored in contiguous blocks, but they are not contiguous as a whole. The outer vector holds a contiguous block of vector<T>s, and each inner vector<T> holds a separate contiguous block of Ts. Collectively, the individual blocks of Ts are scattered around memory.
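To make the two kinds of pointer concrete, here is a small sketch. The alias Maze and the helpers linePtr and cellPtr are hypothetical names used only for this illustration. A pointer to a line can be stepped across the outer vector, and a pointer to a cell can be stepped within one line, but not from one line into the next.

```cpp
#include <cstddef>
#include <vector>

using Maze = std::vector<std::vector<int>>;

// Pointer to the i-th inner vector (a whole line of the maze).
std::vector<int>* linePtr(Maze& maze, std::size_t i) {
    return &maze[i];
}

// Pointer to one cell. Arithmetic on it is valid only within that line.
int* cellPtr(Maze& maze, std::size_t i, std::size_t j) {
    return &maze[i][j];
}
```

Both pointers become invalid if the vector they point into reallocates, so plain indices (i, j) are often the simpler choice for walking the maze.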

Pointers don't point to the right place in a vector of pointers to a class

I have two classes: spot and frame.
spot holds data about a spot detected by some image processor: it has only an id (int) which is unique, and x,y-coordinates (both double). I store all the spots in a vector I call spots.
frame holds, among other things, a vector of pointers to all the spots that belong to it:
class frame
{
int num ;
vector <spot *> spots_list ;
// other members and functions
};
I read the data from a file:
while (/* goes through a lot of rows */)
{
spot* S = new spot (ID, X, Y) ;
spots.push_back (*S) ;
frames[i].spots_list.push_back (&spots.back()) ;
delete S ;
}
so essentially, I'm creating a new instance S, and then I add its data to vector spots, and add a pointer to that element to the frame's spots_list.
(at least, this is what I want to do)
When I try to print all the points in a frame, some of them hold garbage data: e.g. id=423784237, id=-9431101 - and the rest have valid data.
But, when I check it against the vector spots directly, it's not pointing to the right place.
For example, id=37 is in cell 0x20f8288 in the frame's spots_list, but at 0x210d080 in the vector spots.
Since there's random garbage data and the addresses are not the same, I'm pretty sure I'm not doing this correctly - but I don't understand what I should do differently. I would appreciate any help.
spots.push_back(*S);
This call will sometimes need to reallocate the storage inside spots; when that happens, any previously-stored addresses become invalid, so the entries in the spots_list become bogus.
If you know how large the vector will be, you can pre-allocate the internals of spots with spots.reserve(size), so that no reallocation happens until that capacity is exceeded. Otherwise, you can store indices into the vector instead of pointers.
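A sketch of the index-based variant, under the assumption that all spots live in one shared vector. The names Spot, Frame, spotIndices and addSpot are illustrative and are not the asker's actual classes.

```cpp
#include <cstddef>
#include <vector>

struct Spot {
    int id;
    double x;
    double y;
};

struct Frame {
    int num;
    std::vector<std::size_t> spotIndices;  // positions in the shared spots vector
};

// Append a spot and record its index in the frame. An index stays meaningful
// after the vector reallocates; a stored address would not.
void addSpot(std::vector<Spot>& spots, Frame& frame, int id, double x, double y) {
    spots.push_back(Spot{id, x, y});
    frame.spotIndices.push_back(spots.size() - 1);
}
```

A spot is then looked up as spots[frame.spotIndices[k]]. This also removes the need for the new/delete pair in the reading loop, since the Spot is constructed directly into the vector.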

Misunderstanding of C++ array structure

I'm new to C++ and I learned with different tutorials, in one of them I found an example of code:
I have marked with numbers the lines that I do not understand at all:
Is this an array inside an array, or something like that?
I can understand the second call, but what is the first one doing? There is already "coordinates[blocks[num]]", isn't there? Why is blocks[i] needed again?
How could this part of the code be made simpler? Wouldn't a struct holding these arrays make it easier to get values out of them?
Thanks in advance!
// Global vars
struct Rect {
float left;
};
Rect *coordinates;
int *blocks;
coordinates = new Rect[25];
blocks = new int[25];
// in method storing values
const int currentBlock = 0; //var in cycle
coordinates[currentBlock].left = column;
blocks[currentBlock] = currentBlock;
//get element method
const Rect& classA::Coords(int num) const
{
return coordinates[blocks[num]]; //(2)
}
//and calling this method like
Coords(blocks[i]); //(3)
Coords(i); //(3)
// (4)
No, not really. Lots of people will think of them as arrays and even describe them as arrays, but they're actually not. coordinates and blocks are both pointers. They just store a single address of a Rect and an int respectively.
However, when you do coordinates = new Rect[25];, for example, you are allocating an array of 25 Rects and setting the pointer coordinates to point at the first element in that array. So, while coordinates itself is a pointer, it's pointing at the first element in an array.
You can index coordinates and blocks like you would an array. For example, coordinates[3] will access the 4th element of the array of Rects you allocated. The reason why this behaves the same as arrays is because it actually is the same. When you have an actual array arr, for example, and you do arr[4], the array first gets converted to a pointer to its first element and then the indexing occurs.
No, this is not an array of arrays. What it is doing is looking up a value in one array (blocks[num]) and using that to index the next array (coordinates[blocks[num]]). So one array is storing indices into the other array.
I'll ignore that this won't compile, but in both cases you are passing an int to the Coords function. The first case looks incorrect, but might not be. It is taking the value at blocks[i], passing that to the function then using that value to index blocks to get another value, then using that other value to index coordinates. In the second case, you are just passing i, which is being used to index blocks to give you a value with which you index coordinates.
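The difference between the two calls is easier to see when blocks is not an identity mapping. The following is a hedged sketch: Grid and store are hypothetical stand-ins for the class in the question, and std::vector replaces the raw new[] arrays.

```cpp
#include <cstddef>
#include <vector>

struct Rect {
    float left;
};

// Hypothetical stand-in for the class in the question.
class Grid {
public:
    explicit Grid(std::size_t count) : coordinates(count), blocks(count) {}

    // Make slot `slot` refer to rectangle `rectIndex` and set that rectangle.
    void store(std::size_t slot, int rectIndex, float left) {
        blocks[slot] = rectIndex;
        coordinates[rectIndex].left = left;
    }

    // One lookup in blocks, then one lookup in coordinates.
    const Rect& Coords(int num) const {
        return coordinates[blocks[num]];
    }

private:
    std::vector<Rect> coordinates;
    std::vector<int> blocks;  // blocks[n] is an index into coordinates
};
```

Here Coords(i) performs one indirection, while Coords(blocks[i]) performs two. In the question's code blocks[currentBlock] = currentBlock, so as long as that identity mapping holds both calls return the same element.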
That's a broad question that I don't think I can answer without knowing exactly what you want to simplify and without seeing some real valid code.

C++ Array Size Initialization

I am trying to define a class. This is what I have:
enum Tile {
GRASS, DIRT, TREE
};
class Board {
public:
int toShow;
int toStore;
Tile* shown;
Board (int tsh, int tst);
~Board();
};
Board::Board (int tsh, int tst) {
toShow = tsh;
toStore = tst;
shown = new Tile[toStore][toStore]; //ERROR!
}
Board::~Board () {
delete [] shown;
}
However, I get the following error on the indicated line -- Only the first dimension of an allocated array can have dynamic size.
What I want to be able to do is, rather than hard code it, pass the parameter toShow to the constructor and create a two-dimensional array which only contains the elements that I want to be shown.
However, my understanding is that when the constructor is called, and shown is initialized, its size will be initialized to the current value of toStore. Then even if toStore changes, the memory has already been allocated to the array shown and therefore the size should not change. However, the compiler doesn't like this.
Is there a genuine misconception in how I'm understanding this? Does anyone have a fix which will do what I want it to without having to hard code in the size of the array?
Use C++'s containers, that's what they're there for.
class Board {
public:
int toShow;
int toStore;
std::vector<std::vector<Tile> > shown;
Board (int tsh, int tst) :
toShow(tsh), toStore(tst),
shown(tst, std::vector<Tile>(tst))
{
};
};
...
Board board(4, 5);
board.shown[1][3] = DIRT;
You can use a one-dimensional array. Two-dimensional arrays are laid out in memory as a single contiguous one-dimensional array, so when you want a variable size you can use this pattern. For example:
int arr1[ 3 ][ 4 ] ;
int arr2[ 3 * 4 ] ;
They have the same memory layout, and corresponding elements can be accessed via different notations:
int x = arr1[ 1 ][ 2 ] ;
int y = arr2[ 1 * 4 + 2 ] ; // same position as arr1[ 1 ][ 2 ]
Of course arr1 can be seen as either a 3 rows x 4 cols matrix or a 3 cols x 4 rows matrix.
With this type of multi-dimensional array you can access the elements via a single pointer, but you have to know about the internal structure. They are one-dimensional arrays that are treated as 2- or 3-dimensional.
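Applied to the Board class from the question, the one-dimensional pattern could look like the sketch below. It assumes a square board; the accessor at and the member names side_ and shown_ are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Tile { GRASS, DIRT, TREE };

class Board {
public:
    // Allocates side * side tiles in one contiguous block, all GRASS.
    explicit Board(std::size_t side) : side_(side), shown_(side * side, GRASS) {}

    // Row-major indexing: element (row, col) lives at row * side + col.
    Tile& at(std::size_t row, std::size_t col) {
        assert(row < side_ && col < side_);
        return shown_[row * side_ + col];
    }

    std::size_t side() const { return side_; }

private:
    std::size_t side_;
    std::vector<Tile> shown_;
};
```

Both dimensions are chosen at run time, and no destructor is needed because std::vector releases the storage itself.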
Let me tell you about what I did when I needed a 3D array. It might be overkill, but it's rather cool and might help, although it's a whole different way of doing what you want.
I needed to represent a 3D box of cells. Only a part of the cells were marked and were of any interest. There were two options to do that. The first one, declare a static 3D array with the largest possible size, and use a portion of it if one or more of the dimensions of the box were smaller than the corresponding dimensions in the static array.
The second way was to allocate and deallocate the array dynamically. It's quite an effort with a 2D array, not to mention 3D.
The array solution defined a 3D array with the cells of interest having a special value. Most of the allocated memory was unnecessary.
I dumped both ways. Instead I turned to STL map.
I define a struct called Cell with 3 member variables, x, y, z which represented coordinates. The constructor Cell(x, y, z) was used to create such a Cell easily.
I defined the operator < upon it to make it orderable. Then I defined a map<Cell, Data>. Adding a marked cell with coordinates x, y, z to the map was done simply by
my_map[Cell(x, y, z)] = my_data;
This way I didn't need to maintain 3D array memory management, and also only the required cells were actually created.
Checking if a cell at coordinates x0, y0, z0 exists (i.e. is marked) was done by:
map<Cell, Data>::iterator it = my_map.find(Cell(x0, y0, z0));
if (it != my_map.end()) { ...
And referencing the cell's data at coordinates x0, y0, z0 was done by:
my_map[Cell(x0, y0, z0)]...
This method might seem odd, but it is robust, manages its own memory, and is safe - no boundary overrun.
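The answer does not show Cell or its operator <, so the following is only one possible reconstruction. Data is a placeholder type (a std::string here), and isMarked is a hypothetical helper that wraps the find() check.

```cpp
#include <map>
#include <string>

struct Cell {
    int x, y, z;
    Cell(int x_, int y_, int z_) : x(x_), y(y_), z(z_) {}
};

// Strict weak ordering required by std::map: compare x, then y, then z.
bool operator<(const Cell& a, const Cell& b) {
    if (a.x != b.x) return a.x < b.x;
    if (a.y != b.y) return a.y < b.y;
    return a.z < b.z;
}

typedef std::string Data;  // placeholder payload type
typedef std::map<Cell, Data> CellMap;

// True if a cell at the given coordinates has been marked.
bool isMarked(const CellMap& cells, int x, int y, int z) {
    return cells.find(Cell(x, y, z)) != cells.end();
}
```

Note that operator[] on a std::map inserts a default-constructed entry when the key is missing, so find() is the right tool for a pure existence check.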
First, if you want to refer to a 2D array, you have to declare a pointer to a pointer:
Tile **shown;
Then, have a look at the error message. It's proper, comprehensible English. It says what the error is. Only the first dimension of an allocated array can have dynamic size. means -- guess what, that only the first dimension of an allocated array can have dynamic size. That's it. If you want your matrix to have multiple dynamic dimensions, use the C-style malloc() to maintain the pointers to pointers, or, which is even better for C++, use vector, made exactly for this purpose.
It's good to understand a little of how memory allocation works in C and C++.
char x[10];
The compiler will allocate ten bytes and remember the starting address, perhaps it's at 0x12 (in real life probably a much larger number.)
x[3] = 'a';
Now the compiler looks up x[3] by taking the starting address of x, which is 0x12, and adding 3*sizeof(char), which brings to 0x15. So x[3] lives at 0x15.
This simple addition-arithmetic is how memory inside an array is accessed. For two dimensional arrays the math is only slightly trickier.
char xy[20][30];
Allocates 600 bytes starting at some place, maybe it's 0x2000. Now accessing
xy[4][3];
Requires some math... xy[0][0], xy[0][1], xy[0][2]... are going to occupy the first 30 bytes. Then xy[1][0], xy[1][1], ... are going to occupy bytes 31 to 60. It's multiplication: xy[a][b] will be located at the address of xy, plus a*30, plus b.
This is only possible if the compiler knows how long each row is, i.e. the second dimension - you'll notice the compiler needed to know the number "30" to do this math.
Now function calls. The compiler cares little whether you declare
foo(int* x);
or
foo(int x[]);
Because in either case you pass the starting address of an array, and the compiler can do the addition to find the place at which x[3] or whatever lives. But in the case of a two dimensional array, the compiler needs to know that magic number 30 in the above example. So
foo(int xy[][]) {
xy[3][4] = 5; //compiler has NO idea where this lives
//because it doesn't know the row length of xy!
}
But if you specify
foo(int xy[][30])
Compiler knows what to do. For reasons I can't remember it's often considered better practice to pass it as a double pointer, but this is what's going on at the technical level.
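The offset rule can be checked with a few lines of code. This is a sketch using the char xy[20][30] example from above; kCols, setCell and offsetOf are hypothetical names chosen for the illustration.

```cpp
#include <cstddef>

const std::size_t kCols = 30;  // row length, i.e. the second dimension

// The column count is part of the parameter type; the row count is not.
void setCell(char xy[][kCols], std::size_t row, std::size_t col, char value) {
    xy[row][col] = value;
}

// Byte offset of xy[row][col] from the start of a char array, computed by hand.
std::size_t offsetOf(std::size_t row, std::size_t col) {
    return row * kCols + col;
}
```

For element types larger than char, the same offset is additionally multiplied by sizeof of the element type.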

Initializing and maintaining structs of structs

I’m writing C++ code to deal with a bunch of histograms that are populated from laboratory measurements. I’m running into problems when I try to organize things better, and I think my problems come from mishandling pointers and/or structs.
My original design looked something like this:
// the following are member variables
Histogram *MassHistograms[3];
Histogram *MomentumHistograms[3];
Histogram *PositionHistograms[3];
where element 0 of each array corresponded to one laboratory measurement, element 1 of each corresponded to another, etc. I could access the individual histograms via MassHistograms[0] or similar, and that worked okay. However, the organization didn't seem right to me—if I were to perform a new measurement, I’d have to add an element to each of the histogram arrays. Instead, I came up with
struct Measurement {
Histogram *MassHistogram;
Histogram *MomentumHistogram;
Histogram *PositionHistogram;
};
As an added layer of complexity, I further wanted to bundle these measurements according to the processing that has been done on their data, so I made
struct MeasurementSet {
Measurement SignalMeasurement;
Measurement BackgroundMeasurement;
};
I think this arrangement is much more logical and extensible—but it doesn’t work ;-) If I have code like
MeasurementSet ms;
Measurement m = ms.SignalMeasurement;
Histogram *h = m.MassHistogram;
and then try to do stuff with h, I get a segmentation fault. Since the analogous code was working fine before, I assume that I’m not properly handling the structs in my code. Specifically, do structs need to be initialized explicitly in any way? (The Histograms are provided by someone else’s library, and just declaring Histogram *SomeHistograms[4] sufficed to initialize them before.)
I appreciate the feedback. I’m decently familiar with Python and Clojure, but my limited knowledge of C++ doesn’t extend to [what seems like] the arcana of the care and feeding of structs :-)
What I ended up doing
I turned Measurement into a full-blown class:
class Measurement {
public:
Measurement() {
MassHistogram = new Histogram();
MomentumHistogram = new Histogram();
PositionHistogram = new Histogram();
}
~Measurement() {
delete MassHistogram;
delete MomentumHistogram;
delete PositionHistogram;
}
Histogram *MassHistogram;
Histogram *MomentumHistogram;
Histogram *PositionHistogram;
};
(The generic Histogram() constructor I call works fine.) The other problem I was having was solved by always passing Measurements by reference; otherwise, the destructor would be called at the end of any function that received a Measurement and the next attempt to do something with one of the histograms would segfault.
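The segfault when passing by value comes from the copy sharing the same raw pointers and deleting them when it goes out of scope. One way to rule this out at compile time is std::unique_ptr, sketched below. This is an alternative technique and not what the asker did. Histogram here is a placeholder for the third-party type and is assumed to be default-constructible; entries and recordMass are invented for the example.

```cpp
#include <memory>

// Placeholder for the third-party Histogram type; assumed default-constructible.
struct Histogram {
    int entries = 0;
};

struct Measurement {
    std::unique_ptr<Histogram> MassHistogram;
    std::unique_ptr<Histogram> MomentumHistogram;
    std::unique_ptr<Histogram> PositionHistogram;

    Measurement()
        : MassHistogram(new Histogram()),
          MomentumHistogram(new Histogram()),
          PositionHistogram(new Histogram()) {}
    // No destructor needed: each unique_ptr deletes its Histogram.
};

struct MeasurementSet {
    Measurement SignalMeasurement;
    Measurement BackgroundMeasurement;
};

// Takes the set by reference; passing by value would not compile, because
// std::unique_ptr members make Measurement non-copyable.
void recordMass(MeasurementSet& ms) {
    ms.SignalMeasurement.MassHistogram->entries += 1;
}
```

With this layout an accidental copy is a compiler error instead of a double delete at run time.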
Thank you all for your answers!
Are you sure that Histogram *SomeHistograms[4] initialized the data? How do you populate the Histogram structs?
The problem here is not the structs so much as the pointers that are tripping you up. When you do this: MeasurementSet ms; it declares an 'automatic variable' of type MeasurementSet. What it means is that all the memory for MeasurementSet is 'allocated' and ready to go. MeasurementSet, in turn, has two variables of type Measurement that are also 'allocated' and 'ready to go'. Measurement, in turn, has 3 variables of type Histogram * that are also 'allocated' and 'ready to go'... but wait! The type 'Histogram *' is a 'pointer'. That means it's an address - a 32 or 64 bit (or whatever bit) value that describes an actual memory location. And that's it. It's up to you to make it point to something - to put something at that location. Before it points to anything, it will have literally random data in it (or 0'd out data, or some special debug data, or something like that) - the point is that if you try to do something with it, you'll get a segmentation fault, because you will likely be attempting to read a part of data your program isn't supposed to be reading.
In c++, a struct is almost exactly the same thing as a class (which has a similar concept in python), and you typically allocate one like so:
m.MassHistogram = new Histogram();
...after that, the histogram is ready-to-go. However, YMMV: can you allocate one yourself? Or can you only get one from some library, maybe from a device reading, etc? Furthermore, although you can do what I wrote, it's not necessarily 'pretty'. A more idiomatic C++ solution would be to put the allocation in a constructor (like __init__ in python) and delete in a destructor.
When your struct contains a pointer, you have to initialize that variable yourself.
Example
struct foo
{
int *value;
};
foo bar;
// bar.value so far is not initialized and points to a random piece of data
bar.value = new int(0);
// bar.value now points to a int with the value 0
// remember that you have to delete everything that you new'd, once you're done with it:
delete bar.value;
First, always remember that structs and classes are almost exactly the same things. The only difference is that struct members are public by default, and a class member is private by default.
But all the rest is exactly the same.
Second, carefully differentiate between pointers and objects.
If I write
Histogram h;
space for the Histogram's data will be allocated, and its constructor will be called. (A constructor is a method with exactly the same name as the class, here Histogram().)
If I write
Histogram* h;
I'm declaring a variable of 32/64 bits that will be used as a pointer to memory. It is not initialized, so it holds an arbitrary value. Dangerous!
If I write
Histogram* h = new Histogram();
memory will be allocated for one Histogram's data members, and its constructor will be called. The address in memory will be stored in "h".
If I write
Histogram* copy = h;
I'm again declaring a 32/64 bit variable that points to exactly the same address in memory as h
If I write
Histogram* h = new Histogram;
Histogram* copy = h;
delete h;
the following happens
memory is allocated for a Histogram object
The constructor of Histogram will be called (even if you didn't write it, your compiler will generate one).
h will contain the memory address of this object
the delete operator will call the destructor of Histogram (even if you didn't write it, your compiler will generate one).
the memory allocated for the Histogram object will be deallocated
copy will still contain the memory address where the object used to be allocated. But you're not allowed to use it. It's called a "dangling pointer"
h's contents will be undefined
In short: the "m.MassHistogram" in your code is referring to a random area in memory. Don't use it. Either allocate it first using operator "new", or declare it as "Histogram" (object instead of pointer).
Welcome to CPP :D
You are aware that your definition of Measurement does not allocate memory for actual Histograms? In your code, m.MassHistogram is a dangling (uninitialized) pointer, it's not pointing to any measured Histogram, nor to any memory capable of storing a Histogram. As @Nari Rennlos posted just now, you need to point it to an existing (or newly allocated) Histogram.
What does your 3rd party library's interface look like? If it's at all possible, you should have a Measurement containing 3 Histograms (as opposed to 3 pointers to Histograms). That way when you create a Measurement or a MeasurementSet the corresponding Histograms will be created for you, and the same goes for destruction. If you still need a pointer, you can use the & operator:
struct Measurement2 {
Histogram MassHistogram;
Histogram MomentumHistogram;
Histogram PositionHistogram;
};
MeasurementSet2 ms;
Histogram *h = &ms.SignalMeasurement.MassHistogram; //h valid as long as ms lives
Also note that as long as you're not working with pointers (or references), objects will be copied and assigned by value:
MeasurementSet ms; //6 uninitialized pointers to Histograms
Measurement m = ms.SignalMeasurement; //3 more pointers, values taken from first 3 above
Histogram *h = m.MassHistogram; //one more pointer, same uninitialized value
Though if the pointers had been initialized, all 10 of them would be pointing to an actual Histogram at this point.
It gets worse if you have actual members instead of pointers:
MeasurementSet2 ms; //6 Histograms
Measurement2 m = ms.SignalMeasurement; //3 more Histograms, copies of first 3 above
Histogram h = m.MassHistogram; //one more Histogram
h.firstPoint = 42;
m.MassHistogram.firstPoint = 43;
ms.SignalMeasurement.MassHistogram.firstPoint = 44;
...now you have 3 slightly different mass signal histograms, 2 pairs of identical momentum and position signal histograms, and a triplet of background histograms.
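A compact way to see the copy-versus-reference behaviour is to compare a function that takes a Histogram by value with one that takes it by reference. In this sketch Histogram is a placeholder with a single firstPoint member, as in the snippet above, and the two setter functions are hypothetical.

```cpp
// Placeholder histogram with a single data member.
struct Histogram {
    int firstPoint = 0;
};

struct Measurement2 {
    Histogram MassHistogram;
    Histogram MomentumHistogram;
    Histogram PositionHistogram;
};

struct MeasurementSet2 {
    Measurement2 SignalMeasurement;
    Measurement2 BackgroundMeasurement;
};

// Writes through a reference, so the caller's histogram changes.
void setFirstPoint(Histogram& h, int value) { h.firstPoint = value; }

// Receives a copy, so the caller's histogram is left untouched.
void setFirstPointOnCopy(Histogram h, int value) { h.firstPoint = value; }
```

If the intent is to modify the histograms stored inside ms, bind references (Histogram& h = ms.SignalMeasurement.MassHistogram;) or take pointers with &, instead of declaring new Histogram or Measurement2 objects.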