Using void pointers in calculations - c++

This is quite a long introduction to a simple question, but otherwise there will be questions of the type "Why do you want to handle void pointers in C++? You horrible person!". Which I'd rather (a)void. :)
I'm using a C library from which I initially retrieve a list of polygons which it will operate on. The function I use gives me an array of pointers (PolygonType**), from which I create a std::vector<MyPolyType> of my own polygon class MyPolyType. This is in turn used to create a boost::graph with node identifiers given by the index in the vector.
At a later time in execution, I want to calculate a path between two polygons. These polygons are given to me in form of two PolygonType*, but I want to find the corresponding nodes in my graph. I can find these if I know the index they had in the previous vector form.
Now, the question: The PolygonType struct has a void* to an "internal identifier" which it seems I cannot know the type of. I do know however that the pointer increases with a fixed step (120 bytes). And I know the value of the pointer, which would be the offset of the first object. Can I use this to calculate my index with (p-p0)/120, where p is the address of the Polygon I want to find, and p0 is the offset of the first polygon? I'd have to cast the addresses to ints. Is this portable? (The application can be used on Windows and Linux.)

You cannot subtract two void pointers. The compiler will complain that it doesn't know the size of the pointed-to type. You must first cast them to char pointers (char*), subtract those, and then divide the result by 120. If you are dead sure that your object's size is actually 120, then it is safe (though ugly), provided that p and p0 point to objects within the same array.
I still don't understand why p0 is the offset? I'd say p is the address of your Polygon, and p0 is the address of the first polygon... Am I misunderstanding something?
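A minimal sketch of the char* subtraction described in this answer. The helper name index_from_address is made up for illustration, and the stride is passed in rather than hard-coded, on the assumption that 120 is only what was observed, not something the library promises:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: index of the element at `p` in an array that starts
// at `first` and whose elements are `stride` bytes apart.
// Only meaningful if both pointers point into the same array.
std::ptrdiff_t index_from_address(const void* first, const void* p,
                                  std::ptrdiff_t stride) {
    const char* a = static_cast<const char*>(first);
    const char* b = static_cast<const char*>(p);
    const std::ptrdiff_t bytes = b - a;  // byte distance, no cast to int needed
    // Catch a wrong stride or a pointer that lies before the first element.
    assert(stride > 0 && bytes >= 0 && bytes % stride == 0);
    return bytes / stride;
}
```

Going through char* and std::ptrdiff_t avoids the cast to int, which would truncate addresses on 64-bit builds.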

Given that the pointer is pointing to an "internal identifier" I don't think you can make any assumptions about the actual values stored in it. If the pointer can point into the heap you may be just seeing one possible set of values and it will subtly (or obviously) break in the future.
Why not just create a one-time reverse mapping of PolygonType* -> index and use that?
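A reverse mapping like the one suggested here could be sketched as follows. PolygonType below is a stand-in with just the void* member mentioned in the question, and build_index_map is a hypothetical name; the real struct comes from the C library's header:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Stand-in for the library's struct; only its address matters here.
struct PolygonType { void* internal_id; };

// Hypothetical helper: build the lookup once, right after the library
// hands over its PolygonType** array of `count` pointers.
std::unordered_map<const PolygonType*, std::size_t>
build_index_map(PolygonType* const* polys, std::size_t count) {
    assert(polys != nullptr || count == 0);
    std::unordered_map<const PolygonType*, std::size_t> index_of;
    index_of.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        index_of.emplace(polys[i], i);  // pointer value -> vector index
    }
    return index_of;
}
```

Looking up a PolygonType* later is then a single find() or at() call, with no assumptions about how the library lays out its memory.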

Related

C++ Memory Allocator

I'm trying to figure out how I could make a linked list which links to a single byte array, so that each element I put into the byte array could be enqueued and dequeued. However, I need to figure out how to do this using pointer offsets and linked lists.
My question is:
How do I get an offset of a set amount from the start of a pointer? For example, let's say the beginning of my list is at one pointer. I would start by just checking if that space is empty, if not, get the next value in the list. How do I offset from a current pointer position and get a new pointer location that is basically just an offset of another pointer, forward or backwards, up or down, left and right, plus or minus.
Someone asked for an example:
byte myData[1024];
I have to store all of my data into this. This is for a class assignment. Essentially, I have to use this array to store any and all of my data to it, and basically create a queue, like the standard c++ queue. I have to create Enqueue() and Dequeue() functions and then dynamically allocate the memory for each. I have a general idea of what I'm doing. I'm stuck on trying to figure out how to take a pointer of my current position, and then set it to a new position, and then have that be my "next" in the list.
It sounds like what you really want is pointer arithmetic. It's simple enough.
std::int32_t foo[] = {42, 350};
std::int32_t* intPtr = foo; // foo decays to a pointer to its first element; we'll say foo is at address 0x005
++intPtr; // Or intPtr += 1, either way the value of intPtr is now 0x009
// *intPtr would now give you 350.
// Your program knows the type being pointed to, and bumps up the address
// accordingly. In this case a 4-byte integer
When doing pointer arithmetic on a C-array, it's important to have checks in place to stop you going out of bounds on either side. However, I don't even think pointer arithmetic is necessary. If you're storing an array privately, simply using index access and tracking what index your list ends at is a lot simpler. You still have to do checks, but they're easier checks.
You're also saying linked list, but describing an array list. They are two very different data structures. Your queue will be a lot easier to write if you write a separate array list class, and store an array list object in your queue instead of a raw array.
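As a rough sketch of the index-based approach, here is a fixed-capacity byte queue over a single array. The class name ByteQueue and its members are made up for illustration, and it stores single bytes only; the assignment may well require variable-sized elements:

```cpp
#include <cstddef>

// Hypothetical fixed-capacity byte queue backed by one array. Indices are
// used instead of raw pointer arithmetic.
class ByteQueue {
public:
    bool enqueue(unsigned char value) {
        if (count_ == kCapacity) return false;        // full
        data_[(head_ + count_) % kCapacity] = value;  // write at the tail
        ++count_;
        return true;
    }
    bool dequeue(unsigned char& out) {
        if (count_ == 0) return false;                // empty
        out = data_[head_];
        head_ = (head_ + 1) % kCapacity;              // wrap around
        --count_;
        return true;
    }
    std::size_t size() const { return count_; }

private:
    static constexpr std::size_t kCapacity = 1024;    // same size as myData[1024]
    unsigned char data_[kCapacity] = {};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
};
```

The two bounds checks (full and empty) are the "easier checks" mentioned above; there is no pointer that can run off either end of the array.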
How do I get an offset of a set amount from the start of a pointer?
Read the C++11 standard n3337 about pointer arithmetic. Notice the existence of offsetof in C++.
If you have two short*ptr1; and short*ptr2; pointers which contain a valid address, you might code ptr1 - ptr2 or ptr1 + 5 or ptr2 - 3 (however, ptr1+ptr2 is forbidden). The C++11 standard explains when that is valid (sometimes it is not, e.g. when ptr2 is the nullptr). Notice also that in general &ptr1[3] is the same as ptr1+3 and ptr2[-1] is exactly *(ptr2-1) (when that makes sense).
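The identities above can be checked with a few lines. The helper names and the Node struct are hypothetical, and the pointers passed in are assumed to point into the same array:

```cpp
#include <cstddef>

// Hypothetical record, used only to show offsetof.
struct Node {
    int value;
    short next;
};

// Number of elements between two pointers into the same array.
std::ptrdiff_t elements_between(const short* from, const short* to) {
    return to - from;
}

// &p[n] and p + n name the same address (valid while p + n stays in bounds).
bool same_address(const short* p, std::ptrdiff_t n) {
    return &p[n] == p + n;
}

// Byte offset of the `next` member from the start of a Node.
constexpr std::size_t kNextOffset = offsetof(Node, next);
```

Note that the difference of two pointers is counted in elements, not bytes, which is why ptr1 - ptr2 and ptr1 + 5 scale with sizeof(short) automatically.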
Beware of undefined behavior in your code, such as buffer overflows (and you will have one if you do pointer arithmetic carelessly: beware of segmentation faults).
Tools like address sanitizers, debuggers (such as GDB), valgrind should be helpful to understand the behavior of your code.
Don't forget to enable warnings and debug info in your C++ compiler. Once your C++ code compiles without warnings, read how to debug small programs. With GCC, compile with g++ -Wall -Wextra -g. Notice that GCC 10 adds some static analysis abilities. And you could use the Clang static analyzer or Frama-C (or develop your own GCC plugin).
The linked list wikipage has a nice figure. The wikipage on tries could help you also.
I recommend reading a good C++ programming book and then some introduction to algorithms.
On github or elsewhere you can find tons of examples of C++ code related to your question (whose terminology is confusing to non-native English speakers).
Academic papers about memory shape analysis (such as this one or that one) contain figures which would improve your understanding. Books or web resources about garbage collection are also relevant.

Visual Studio Variable Type Asterisk

Using Visual Studio, I am running into some trouble with an int type variable and a float type variable. They are both stored in their own arrays. When I go to print them out they come out as memory location gibberish. When I debug I noticed that the correct value is displayed next to the memory gibberish in the watch area. I also noticed that under type, the variable types have an * (asterisk) next to them. Could anybody offer information as to why this would happen? Thanks in advance.
Watch area looks like this...
Name Value Type
score 0x002ff5c8 {96.0000000} float *
studentID 0x002ff698 {9317} int *
I recommend reading into the tutorial above and perhaps an introductory book depending on how interested you are in pursuing learning c++. The type is a pointer type to int and float data. Here is a small example that answers the question (how to print out these values):
float* a = new float(5.8);
The pointer is established; it points to a memory location where a float with the value 5.8 is stored.
std::cout << *a;
The asterisk before a dereferences the pointer; this is how the data is accessed. You may want to check that you have a valid pointer first, or your program can crash.
delete a;
Delete the allocated memory once it will no longer be used (see EDIT 2). This frees the space that was given to store the float; failing to do so causes a memory leak.
EDIT 1:
Consider that the pointer may point to a contiguous array of floats or int (which is much more likely than just one), then you will have to know the size of the array you are reading to access the elements. In this case, you will use the operator [] to access the members, let's say we have an array b,
float* b = new float[2] {0.0,1.0};
to print its members you would have to access each element
std::cout << b[0] << ' ' << b[1];
the delete operator looks like this for arrays
delete[] b;
EDIT 2:
Whenever you use new to dynamically allocate memory, think about the scope of the variable, when the scope is over delete the pointer. User is correct, you do not want to delete a pointer which may be used later, nor is it necessary to call delete to pointers obtained from references.
First off you are using the debugger. This is awesome. The sheer number of SO questions that could be solved in five minutes with a debugger is staggering. You are already far, far ahead of the game than a lot of the time-wasting sad sacks who can't be bothered to use the expletive deleted tools that came with the compiler.
Second, some important reading because it explains part of what is going on: What is array decaying?
Now to break down what the debugger is showing you
score 0x002ff5c8 {96.0000000} float *
score: Obviously the variable's name
float *: score is a variable of type float *, a pointer to a float.
0x002ff5c8: This is the data value of score. Pointers are a reference to a location in memory. Rather than being data, they point to data. So a pointer is a variable that contains where to find, the address of, another variable. 002ff5c8 is the hexadecimal location in memory where you will find what score points to.
{96.0000000}: score points to a floating point value that has been set to 96 (possibly plus or minus some fuzziness because not all numbers can be exactly represented with floating point)
So the crazy number 0x002ff5c8 tells the program where to find score's data, and this data happens to be 96.
Note the debugger only shows you the first value in the array of data that is at score, which brings us back to array decaying. Odds are good that the program has knowledge of how much data is pointed at by score. Could be one float. Could be a million. You have to carry the length of a block of an array around with it once the array has decayed.
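A small sketch of carrying the length along with a decayed array. The function name join_scores is made up, and the values are written into a string only so the result is easy to inspect; the same loop works with std::cout:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: a decayed array is just a pointer, so the element
// count has to travel with it as a separate argument.
std::string join_scores(const float* scores, std::size_t count) {
    std::ostringstream out;
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0) out << ' ';
        out << scores[i];  // same as *(scores + i): the value, not the address
    }
    return out.str();
}
```

Streaming scores itself instead of scores[i] would print the address (the 0x002ff5c8 in the watch window), which is the "memory gibberish" from the question.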

Why am I being told that an array is a pointer? What is the relationship between arrays and pointers in C++?

My background is C++ and I'm currently about to start developing in C# so am doing some research. However, in the process I came across something that raised a question about C++.
This C# for C++ developers guide says that
In C++ an array is merely a pointer.
But this StackOverflow question has a highly-upvoted comment that says
Arrays are not pointers. Stop telling people that.
The cplusplus.com page on pointers says that arrays and pointers are related (and mentions implicit conversion, so they're obviously not the same).
The concept of arrays is related to that of pointers. In fact, arrays work very much like pointers to their first elements, and, actually, an array can always be implicitly converted to the pointer of the proper type.
I'm getting the impression that the Microsoft page wanted to simplify things in order to summarise the differences between C++ and C#, and in the process wrote something that was simpler but not 100% accurate.
But what have arrays got to do with pointers in the first place? Why is the relationship close enough for them to be summarised as the "same" even if they're not?
The cplusplus.com page says that arrays "work like" pointers to their first element. What does that mean, if they're not actually pointers to their first element?
There is a lot of bad writing out there. For example the statement:
In C++ an array is merely a pointer.
is simply false. How can such bad writing come about? We can only speculate, but one possible theory is that the author learned C++ by trial and error using a compiler, and formed a faulty mental model of C++ based on the results of his experiments. This is possibly because the syntax used by C++ for arrays is unconventional.
The next question is, how can a learner know if he/she is reading good material or bad material? Other than by reading my posts of course ;-) , participating in communities like Stack Overflow helps to bring you into contact with a lot of different presentations and descriptions, and then after a while you have enough information and experience to make your own decisions about which writing is good and which is bad.
Moving back to the array/pointer topic: my advice would be to first build up a correct mental model of how object storage works when we are working in C++. It's probably too much to write about just for this post, but here is how I would build up to it from scratch:
C and C++ are designed in terms of an abstract memory model, however in most cases this translates directly to the memory model provided by your system's OS or an even lower layer
The memory is divided up into basic units called bytes (usually 8 bits)
Memory can be allocated as storage for an object; e.g. when you write int x; it is decided that a particular block of adjacent bytes is set aside to store an integer value. An object is any region of allocated storage. (Yes this is a slightly circular definition!)
Each byte of allocated storage has an address which is a token (usually representable as a simple number) that can be used to find that byte in memory. The addresses of any bytes within an object must be sequential.
The name x only exists during the compilation stage of a program. At runtime there can be int objects allocated that never had a name; and there can be other int objects with one or more names during compilation.
All of this applies to objects of any other type, not just int
An array is an object which consists of many adjacent sub-objects of the same type
A pointer is an object which serves as a token identifying where another object can be found.
From here on in, C++ syntax comes into it. C++'s type system uses strong typing, which means that each object has a type. The type system extends to pointers. In almost all situations, the storage used to store a pointer only saves the address of the first byte of the object being pointed to; and the type system is used at compilation time to keep track of what is being pointed to. This is why we have different types of pointer (e.g. int *, float *) despite the fact that the storage may consist of the same sort of address in both cases.
Finally: the so-called "array-pointer equivalence" is not an equivalence of storage, if you understood my last two bullet points. It's an equivalence of syntax for looking up members of an array.
Since we know that a pointer can be used to find another object; and an array is a series of many adjacent objects; then we can work with the array by working with a pointer to that array's first element. The equivalence is that the same processing can be used for both of the following:
Find Nth element of an array
Find Nth object in memory after the one we're looking at
and furthermore, those concepts can be both expressed using the same syntax.
They are most definitely not the same thing at all, but in this case, confusion can be forgiven because the language semantics are ... flexible and intended for the maximum confusion.
Let's start by simply defining a pointer and an array.
A pointer (to a type T) points to a memory space which holds at least one T (assuming non-null).
An array is a memory space that holds multiple Ts.
A pointer points to memory, and an array is memory, so you can point inside or to an array. Since you can do this, pointers offer many array-like operations. Essentially, you can index any pointer on the presumption that it actually points to memory for more than one T.
Therefore, there's some semantic overlap between (pointer to) "Memory space for some Ts" and "Points to a memory space for some Ts". This is true in any language- including C#. The main difference is that they don't allow you to simply assume that your T reference actually refers to a space where more than one T lives, whereas C++ will allow you to do that.
Since all pointers to a T can be pointers to an array of T of arbitrary size, you can treat pointers to an array and pointers to a T interchangeably. The special case of a pointer to the first element is that the "some Ts" for the pointer and "some Ts" for the array are equal. That is, a pointer to the first element yields a pointer to N Ts (for an array of size N) and a pointer to the array yields ... a pointer to N Ts, where N is equal.
Normally, this is just interesting memory crapping-around that nobody sane would try to do. But the language actively encourages it by converting the array to the pointer to the first element at every opportunity, and in some cases where you ask for an array, it actually gives you a pointer instead. This is most confusing when you want to actually use the array like a value, for example, to assign to it or pass it around by value, when the language insists that you treat it as a pointer value.
Ultimately, all you really need to know about C++ (and C) native arrays is, don't use them, pointers to arrays have some symmetries with pointers to values at the most fundamental "memory as an array of bytes" kind of level, and the language exposes this in the most confusing, unintuitive and inconsistent way imaginable. So unless you're hot on learning implementation details nobody should have to know, then use std::array, which behaves in a totally consistent, very sane way and just like every other type in C++. C# gets this right by simply not exposing this symmetry to you (because nobody needs to use it, give or take).
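To illustrate the std::array point, here is a minimal sketch showing that it can be passed, returned and assigned by value like any other type. The function name doubled is hypothetical:

```cpp
#include <array>

// Takes the array by value: the caller's copy is left untouched,
// which a native array parameter cannot do.
std::array<int, 3> doubled(std::array<int, 3> values) {
    for (int& v : values) {
        v *= 2;
    }
    return values;  // returned by value as well
}
```

A caller can write b = doubled(a); and then compare a == b or ask a.size(), none of which works for a native int[3].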
Arrays and pointers in C and C++ can be used with the exact same semantics and syntax in the vast majority of cases.
That is achieved by one feature:
Arrays decay to pointers to their first element in nearly all contexts.
Exceptions in C: sizeof, _Alignof, address-of &
In C++, the difference can also be important for overload-resolution.
In addition, array notation for function arguments is deceptive, these function-declarations are equivalent:
int f(int* a);
int f(int a[]);
int f(int a[3]);
But not to this one:
int f(int (&a)[3]);
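The difference between those declarations can be checked at compile time. This is only a sketch; takes_array and element_count are hypothetical names:

```cpp
#include <cstddef>
#include <type_traits>

// Declared with array notation, but the parameter type is adjusted to int*.
int takes_array(int a[3]);
static_assert(std::is_same<decltype(takes_array), int(int*)>::value,
              "an array parameter is really a pointer parameter");

// A reference to an array keeps the bound, so it can even be deduced.
template <std::size_t N>
constexpr std::size_t element_count(const int (&)[N]) {
    return N;
}
```

Calling element_count with an int* fails to compile, which is exactly the information the pointer versions throw away.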
Besides what has already been told, there is one big difference:
pointers are variables to store memory addresses, and they can be incremented or decremented and the values they store can change (they can point to any other memory location). That's not the same for arrays; once they are allocated, you can't change the memory region they reference, e.g. you cannot assign other values to them:
int my_array[10];
int x = 2;
my_array = &x; // error: an array cannot be assigned to
my_array++; // error: an array cannot be incremented
Whereas you can do the same with a pointer:
int *p = my_array;
p++;
p = &x;
The meaning in this guide was simply that in C# an array is an object (perhaps like in STL that we can use in C++), while in C++ an array is basically a sequence of variables located & allocated one after the other, and that's why we can refer to them using a pointer (pointer++ will give us the next one etc.).
it's as simple as:
int arr[10];
int* arr_pointer1 = arr;
int* arr_pointer2 = &arr[0];
so, since arrays are contiguous in memory, writing
arr[1];
is the same as writing:
*(arr_pointer1+1)
pushing things a bit further, writing:
arr[2];
//resolves to
*(arr+2);
//note also that this is perfectly valid
2[arr];
//resolves to
*(2+arr);

How does GetGlyphOutline function work? (WinAPI)

Basically I want to get bezier control points from a ttf font and then draw them. I was basically wondering 2 things.
Does it return an array of points or is it more complex?
How can you tell the points of 1 contour from another ex: the letter O which has 2 contours?
Thanks
Found it:
The native buffer returned by GetGlyphOutline when GGO_NATIVE is specified is a glyph outline. A glyph outline is returned as a series of one or more contours defined by a TTPOLYGONHEADER structure followed by one or more curves. Each curve in the contour is defined by a TTPOLYCURVE structure followed by a number of POINTFX data points. POINTFX points are absolute positions, not relative moves. The starting point of a contour is given by the pfxStart member of the TTPOLYGONHEADER structure. The starting point of each curve is the last point of the previous curve or the starting point of the contour. The count of data points in a curve is stored in the cpfx member of TTPOLYCURVE structure. The size of each contour in the buffer, in bytes, is stored in the cb member of TTPOLYGONHEADER structure. Additional curve definitions are packed into the buffer following preceding curves and additional contours are packed into the buffer following preceding contours. The buffer contains as many contours as fit within the buffer returned by GetGlyphOutline.
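Based on that description, walking the buffer and telling one contour from the next could be sketched like this. The structs are stand-ins that mirror the layout of the Win32 FIXED, POINTFX, TTPOLYGONHEADER and TTPOLYCURVE types so the sketch builds anywhere; on Windows you would use the real ones from <windows.h>. points_per_contour is a hypothetical name:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins mirroring the Win32 layouts (assumption: no padding, as on Windows).
struct Fixed         { std::uint16_t fract; std::int16_t value; };
struct PointFx       { Fixed x; Fixed y; };
struct PolygonHeader { std::uint32_t cb; std::uint32_t dwType; PointFx pfxStart; };
struct CurveHeader   { std::uint16_t wType; std::uint16_t cpfx; };  // cpfx points follow

// Returns, for each contour, the number of points in it (start point included).
std::vector<std::size_t> points_per_contour(const unsigned char* buf, std::size_t size) {
    std::vector<std::size_t> counts;
    std::size_t pos = 0;
    while (pos + sizeof(PolygonHeader) <= size) {
        PolygonHeader header;
        std::memcpy(&header, buf + pos, sizeof header);
        if (header.cb < sizeof header || header.cb > size - pos) break;  // malformed
        const std::size_t contour_end = pos + header.cb;  // cb covers header and curves
        std::size_t points = 1;                            // pfxStart
        std::size_t cur = pos + sizeof header;
        while (cur + sizeof(CurveHeader) <= contour_end) {
            CurveHeader curve;
            std::memcpy(&curve, buf + cur, sizeof curve);
            points += curve.cpfx;
            cur += sizeof curve + curve.cpfx * sizeof(PointFx);  // skip to next curve
        }
        counts.push_back(points);
        pos = contour_end;  // the next contour starts right after this one
    }
    return counts;
}
```

For a letter like O this would return two entries, one per contour; the cb member is what separates the outer outline from the inner one.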
@Milo:
A void pointer is a pointer to any memory location regardless of that location's defined "type" (basically, it's similar to the "pointers" used in assembly). The problem is, it's usually not portable between different machine architectures, and if you want to do anything with arrays you have to a) manually implement boundaries between each member and keep track of byte sizes (which is usually even more architecture dependent) or b) cast it to a predefined struct or series of structs. Generally, if you have to do this, you could likely just use a pointer to the struct and avoid the use of void pointers entirely.
The reason I say usually is that there are certain circumstances in which the drawbacks of using void pointers don't apply. Also, I know assembly doesn't use "pointers" per se, but it is still very similar and I just used the term to help clarify my explanation. Remember, many things aren't set in stone.

What may cause losing object at the other end of a pointer in c++?

EDIT: I have found the error: I did not initialize an array with a size. question can be closed.
I have a class V, and another class N. An object of N will have an array of pointers to objects of class V (say V **vList). So, N has a function like
V **getList();
Now in some function of other classes or simply a driver function, if I say V **theList = (N)n.getList(); Q1: theList would be pointing at the 1st element of the array? Given that the size of array is known, can I loop through with index i and say V *oneV = *vList[i]? Please correct me if what I'm doing above is wrong.
I have been using the debugger to trace through the whole process of my program running. What I found was that after using V *oneV = vList[i], the values of the pointers in the array, vList, were the same as when they were created, but if I follow a pointer to where it is pointing, the object is gone. I'm guessing that might be the reason why I am getting seg fault or bus error. Could it be the case? WHY did I 'lose' the object at the other end of a pointer? What did I do wrong?
and yes, I am working on a school assignment, that's why I do not want to print out my codes, I want to finish it myself, but I need help finding a problem. I think I still need explanation on array of pointers. Thank you
Q1 is right. For the second part, V *oneV = vList[i] would be the correct syntax. In your syntax you are dereferencing one more time (treating an object of type V as a pointer to such an object) which obviously is crashing your code.
EDIT:
Since you are using the correct syntax, the reason of segfaults would depend on your memory management of the objects of type V. If you have inserted addresses of objects created on the stack (automatic vars, not by new or malloc) inside a function and are trying to access them outside of it, then the pointers would be dangling and your code will crash.
Class N has to manage the number of elements in a list somehow. The usual approaches are to make a public function which returns the number of elements in the array, or to provide an iterator function which loops over all the list's elements.
An array with N elements stores them at array[0] through array[N-1]. You're accessing one past the end of the array.
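A sketch of the points made in these answers: the objects are created with new so they outlive the function that made them, the class reports how many elements it holds, and the loop stops at size()-1. V and N below are stand-ins for the question's classes, and sum_ids is a hypothetical driver function:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct V { int id; };  // stand-in for the question's class V

class N {
public:
    explicit N(std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            // Heap allocation: the object is still alive after the constructor returns.
            owned_.push_back(std::unique_ptr<V>(new V{static_cast<int>(i)}));
            list_.push_back(owned_.back().get());
        }
    }
    V** getList() { return list_.data(); }             // points at the first V*
    std::size_t size() const { return list_.size(); }  // number of elements

private:
    std::vector<std::unique_ptr<V>> owned_;  // owns the V objects
    std::vector<V*> list_;                   // the V* array handed to callers
};

int sum_ids(N& n) {
    int total = 0;
    V** theList = n.getList();
    for (std::size_t i = 0; i < n.size(); ++i) {  // valid indices: 0 .. size()-1
        V* oneV = theList[i];                      // one dereference, not *theList[i]
        total += oneV->id;
    }
    return total;
}
```

Had the constructor stored the addresses of local V variables instead, every pointer in the list would dangle as soon as the constructor returned.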
First rule out the initial ones:
you are initializing correctly (new instead of automatic/local variables)
you are accessing the elements correctly (not like in the typo you posted in the question - based on your comment)
you are using the right size
If you go through all the normal ones and everything is OK, then make sure to pay special attention to your loops, size calculations, and anything else that could be causing you to write to unintended addresses.
It is possible to write garbage at unintended locations and then get the error in unexpected places. The worst I saw like that was some file descriptor variables being corrupted because of an array gone wrong right before those variables; it broke in file-related functions, which seemed very crazy.
theList would be pointing at the 1st element of the array? Given that the size of array is known, can I loop through with index i and say V *oneV = *vList[i]?
Yes, that is correct.
I'm guessing that might be the reason why I am getting seg fault or bus error. Could it be the case?
Yes, if you have an invalid pointer and try to dereference it you'll get a segfault.
WHY did I 'lose' the object at the other end of a pointer? What did I do wrong?
That is difficult to predict without seeing the actual code. Most probable causes are that either you are not filling the V** correctly or after putting a V* pointer inside V** array you are deleting that object from some other place. BTW, I am assuming that you are allocating memory using new, is this assumption correct?