Access struct variable value by pointer offset - c++

I have a struct which looks like:
#pragma pack(1)
typedef struct WHEATHER_STRUCT {
uint8_t packetID; // Value 9
uint16_t packetSize; // Value 7
float cloudLayerAltitude; // Value 25000
} Wheather_Struct;
This struct has been initialized correctly. Because of how an algorithm is designed, I need to read the values of these three attributes through a pointer offset. I thought about declaring an array that holds the size in bytes of each attribute, like this:
int sizeOfStructAttributes[] = {1, 2, 4};
And finally to access these values do something like:
pointer = (*this->wheather_struct->packetID)
for (i=0; i<sizeof(sizeOfStructAttributes); i++)
cout << &pointer << ' ';
pointer = pointer + sizeOfStructAttributes[i];
Expected result:
9
7
25000
Could you help me please?

Your code has several problems; I will try to go through them all:
1- Your structure may contain padding whose size depends on the architecture and compiler you are targeting. Without packing, a padding byte typically follows the first member (packetID) so that packetSize is properly aligned; #pragma pack(1) removes it.
2- You are initializing the pointer in a wrong way, it should be:
pointer = &(this->wheather_struct->packetID);
3- cout should be:
cout << *((datatype*)pointer) << ' ';
//datatype should be different in each loop iteration of course.
4- If you create an array of this structure, I am not sure whether you will run into padding problems. That happens only in rare cases, when packing and padding differ because your code is mixed with libraries compiled with different compiler directives, or that use #pragma to change the compiler's behavior at compile time.
Finally, I am sure there is no need at all to enumerate struct members with a pointer.
I encourage you to read about struct padding and packing, good place to start is this question on SO:
Structure padding and packing

One thing for sure, you won't be able to write these offsets manually. This is absolutely not a stable way of doing things, because your compiler might do optimizations such as aligning your struct members.
What you can do is this:
Wheather_Struct w;
long offsetsOfStructAttributes[3] = {0,
(char*)&w.packetSize - (char*)&w.packetID,
(char*)&w.cloudLayerAltitude - (char*)&w.packetID};
Note that these offsets are measured in bytes.
Having told you how to do that, I have to say like people said in the comments, please find another way of doing this. This is not safe, unless you absolutely know what you're doing.

Your mistake is that you've assumed that the class has no padding between the members. But there must be padding in order to meet the alignment requirements of the members. Thus the offsets are not what you assume.
To get the offset of a class member, you can use the offsetof macro provided by the standard library. That said, without knowing what you need it for, I remain skeptical about it being appropriate. Note that offsetof works only if your class is a standard layout class. Otherwise the behaviour will be undefined. Your example WHEATHER_STRUCT is standard layout.
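As a minimal sketch of the offsetof approach, assuming the packed struct from the question (the helper name member_offsets is hypothetical and not part of the original code):

```cpp
#include <array>
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>

#pragma pack(push, 1)   // same packing as the question's #pragma pack(1)
struct Wheather_Struct {
    std::uint8_t  packetID;
    std::uint16_t packetSize;
    float         cloudLayerAltitude;
};
#pragma pack(pop)

// Hypothetical helper: byte offsets of the three members, computed by the
// compiler instead of being written by hand.
inline std::array<std::size_t, 3> member_offsets() {
    return {offsetof(Wheather_Struct, packetID),
            offsetof(Wheather_Struct, packetSize),
            offsetof(Wheather_Struct, cloudLayerAltitude)};
}
```

With the packing shown here the offsets are 0, 1 and 3. Without the pragma they would typically differ, which is exactly why hand-written offsets are fragile.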
cout << &pointer << ' ';
Something like this can not possibly have the output that you expect. You take the address of the pointer, it cannot possibly give you the value of the pointed object that you wanted.
The way to get the pointed value is the indirection operator. But, indirection operator can only work correctly if the pointer is of correct type (float* for float members, uint16_t* for uint16_t members ...) but it cannot be of correct type since it has to be a pointer to a byte for the pointer arithmetic to work with the offsets.
Besides the offset, you also need to know the type of the variable in order to interpret the value. You could store the type in some structure. But you cannot cast the pointer to a type determined at runtime, so what you need is some runtime flow-structure such as a switch or a jump table for the conversion.
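The combination of offset, stored type and switch described above could look roughly like this. It is only a sketch: FieldType, FieldInfo, kFields, print_field and dump are hypothetical names, and the struct is assumed to be the packed one from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

#pragma pack(push, 1)
struct Wheather_Struct {
    std::uint8_t  packetID;
    std::uint16_t packetSize;
    float         cloudLayerAltitude;
};
#pragma pack(pop)

// Hypothetical type tag stored next to each offset.
enum class FieldType { U8, U16, F32 };

struct FieldInfo {
    std::size_t offset;
    FieldType   type;
};

const FieldInfo kFields[] = {
    {offsetof(Wheather_Struct, packetID),           FieldType::U8},
    {offsetof(Wheather_Struct, packetSize),         FieldType::U16},
    {offsetof(Wheather_Struct, cloudLayerAltitude), FieldType::F32},
};

// Copies the bytes at base + offset into a correctly typed variable
// (std::memcpy avoids misaligned or type-punned reads), then prints it.
void print_field(std::ostream& os, const unsigned char* base, const FieldInfo& f) {
    switch (f.type) {
    case FieldType::U8: {
        std::uint8_t v;
        std::memcpy(&v, base + f.offset, sizeof v);
        os << static_cast<int>(v);   // print as a number, not a character
        break;
    }
    case FieldType::U16: {
        std::uint16_t v;
        std::memcpy(&v, base + f.offset, sizeof v);
        os << v;
        break;
    }
    case FieldType::F32: {
        float v;
        std::memcpy(&v, base + f.offset, sizeof v);
        os << v;
        break;
    }
    }
}

// Walks all fields through a byte pointer and returns the printed values.
std::string dump(const Wheather_Struct& w) {
    std::ostringstream os;
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&w);
    for (const FieldInfo& f : kFields) {
        print_field(os, base, f);
        os << ' ';
    }
    return os.str();
}
```

The byte pointer is only used for arithmetic; the value is always read into a variable of the member's real type before it is printed.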

You had better not use pointer hacks: one day the underlying memory layout will change and your program may corrupt memory.
Try to simulate metadata instead.
#include <cstdint>
#include <iostream>
enum WheatherStructFields
{
wsfPacketID,
wsfPacketSize,
wsfCloudLayerAltitude,
wsfNone
};
typedef struct WHEATHER_STRUCT
{
uint8_t packetID;
uint16_t packetSize;
float cloudLayerAltitude;
void OutFieldValue(std::ostream& os, WheatherStructFields whatField)
{
switch (whatField)
{
case wsfPacketID:
os << (int)packetID;
break;
case wsfPacketSize:
os << packetSize;
break;
case wsfCloudLayerAltitude:
os << cloudLayerAltitude;
break;
default:
os << "Unsupported field: " << whatField;
}
}
} Wheather_Struct;
int main()
{
Wheather_Struct weather = { 9, 7, 25000 };
for (WheatherStructFields whatField = wsfPacketID; whatField < wsfNone;
whatField = (WheatherStructFields)((int)whatField + 1))
{
weather.OutFieldValue(std::cout, whatField);
std::cout << " ";
}
}

There are two problems with your approach:
Firstly, it requires you to get the sizes right. Use sizeof to do that. So your array would look like:
size_t sizeOfStructAttributes[] = {sizeof(Wheather_Struct::packetID),
sizeof(Wheather_Struct::packetSize),
sizeof(Wheather_Struct::cloudLayerAltitude) };
The second (more serious) problem is that you don't allow for padding in your structure. Almost all compilers will (unless specially instructed) insert a padding byte between packetID and packetSize so that everything is nicely aligned. Fortunately, there is a solution for that too: use the offsetof macro (defined in stddef.h):
size_t offsetsOfStructAttributes[] = {offsetof(Wheather_Struct, packetID),
offsetof(Wheather_Struct, packetSize),
offsetof(Wheather_Struct, cloudLayerAltitude) };
The code then becomes:
for (size_t offset : offsetsOfStructAttributes) {
// pointer addresses the first byte of the member; it still has to be
// converted to the member's type before the value itself can be printed
const uint8_t* pointer = &(this->wheather_struct->packetID) + offset;
cout << static_cast<const void*>(pointer) << ' ';
}
Actually: the above code fixes a third problem with your code: sizeof() returns the size in bytes, which is unlikely to be the element count.
Finally, your variables have a typo: meteorology is concerned with whether the weather will be fine or not. You have confused the two words and I am pretty sure you mean "weather".

Related

Why can a pointer avoid the -Warray-bounds warning?

For code like:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
test->ch[0] = 'a';
test->ch[1] = 'b';
test->ch[2] = 'c';
test->ch[3] = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
I need to suppress a compilation warning like
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
which is raised by the gcc 8.2 compiler:
g++ -O2 -Warray-bounds=2 main.cpp
One way to avoid this warning is to use a pointer to write the four characters, like:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
// Use pointer to avoid the warning
volatile char *ptr = test->ch;
*ptr = 'a';
*(ptr + 1) = 'b';
*(ptr + 2) = 'c';
*(ptr + 3) = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
But I cannot figure out why using a pointer instead of an array subscript works. Is it because the compiler does no bounds checking on what a pointer points to? Can anyone explain that?
Thanks.
Background:
Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow from memory view
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.
C++ does not work the way you think it does. You are triggering undefined behavior. When your code triggers undefined behavior, the C++ standard places no requirement on its behavior. A version of GCC attempts to start some video games when certain kind of undefined behavior is encountered. Anthony Williams also knows at least one case where a particular instance of undefined behavior caused someone's monitor to catch on fire. (C++ Concurrency in Action, page 106) Your code may appear to be working at this very time and situation, but that is just an instance of undefined behavior and you cannot count on it. See Undefined, unspecified and implementation-defined behavior.
The correct way to suppress this warning is to write correct C++ code with well-defined behavior. In your case, declaring ch as char ch[4]; solves the problem.
The standard specifies this as undefined behavior in [expr.add]/4:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]) (footnote 78), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
Otherwise, the behavior is undefined.
78) An object that is not an array element is considered to belong to a single-element array for this purpose; see [expr.unary.op]. A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n for this purpose; see [basic.compound].
I want to avoid the warning like
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
Well, it is probably better to fix the warning, not just avoid it.
The warning is actually telling you something: what you are doing is undefined behavior. Undefined behavior is really bad (it allows your program to do literally anything!) and should be fixed.
Let's look at your struct again:
struct A
{
int a;
char ch[1];
};
In C++, your array has only one element in it. The standard only guarantees array elements of 0 through N-1, where N is the size of the array:
[dcl.array]
...If the value of the constant expression is N, the array
has N elements numbered 0 to N-1...
So ch only has the elements 0 through 1-1, or elements 0 through 0, which is just element 0. That means accessing ch[1], ch[2] overruns the buffer, which is undefined behavior.
Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.
Umm, if you say so. The example you gave only allocated 1 A, so as far as we know, there is still only space for the 1 character. If you do allocate more than 1 A at a time in your real program, then I suppose this is possible. But that's still probably not a good thing to do. Especially since you might run into int a of the next A if you're not careful.
A solution to ignore this warning is to use pointer...But I can not figure out why that works. Is it because pointer do not have boundary checking for which it point?
Probably. That would be my guess too. Pointers can point to anything (including destroyed data or even nothing at all!), so the compiler probably won't check it for you. The compiler may not even have a way of knowing whether the memory you point to is valid or not (or may just not care), and, thus, may not even have a way to warn you, much less will warn you. Its only choice is to trust you, so I'm guessing that's why there's no warning.
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Side issue: actually std::string is probably a better choice here if you don't know how many characters you want to store in here ahead of time--assuming it's different for every instance of A. Anyway, moving on:
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
I'm not sure I understand your design principle completely, but it sounds like std::vector might be a better option. Then, size is kept track of automatically by the std::vector, and you know that everything is stored in ch. To access it, it would be something like:
myVec[i].ch[0]
I don't know all your constraints for your situation, but it sounds like a better solution instead of walking the line around undefined behavior. But that's just me.
Finally, if you are still determined to ignore our advice, you do have the option of turning off the warning, but again, I'd advise against that. It would be better to fix A if you can, or to find a better usage strategy if you can't.
There really is no way to work with this cleanly in C++ and iirc the type (a dynamically sized struct) isn't actually properly formed in C++. But you can work with it because compilers still try to preserve compatibility with C. So it works in practice.
You can't have a value of the struct, only references or pointers to it. And they must be allocated by malloc() and released by free(). You can't use new and delete. Below I show you a way that only allows you to allocate pointers to variable sized structs given the desired payload size. This is the tricky bit as sizeof(Buf) will be 16 (and not 8) because Buf::buf must have a unique address. So here we go:
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <new>
#include <iostream>
#include <memory>
struct Buf {
size_t size {0};
char buf[];
[[nodiscard]]
static Buf * alloc(size_t size) {
void *mem = malloc(offsetof(Buf, buf) + size);
if (!mem) throw std::bad_alloc();
return std::construct_at(reinterpret_cast<Buf*>(mem), AllocGuard{}, size);
}
private:
class AllocGuard {};
public:
Buf(AllocGuard, size_t size_) noexcept : size(size_) {}
};
int main() {
Buf *buf = Buf::alloc(13);
std::cout << "buffer has size " << buf->size << std::endl;
}
You should delete or implement the copy/move constructors and assignment operators as desired. Another good idea would be to use std::unique_ptr or std::shared_ptr with a deleter that calls free() instead of returning a naked pointer. But I leave that as an exercise for the reader.

Can you use a fixed size type and pointers together in the same union?

I'm investigating the possibility of defining a structure for a packet. I would like to set header variables in the packet and then set a pointer to the data part of the packet. My end goal is to be able to send this packet to a low level library that takes only a uint8_t*. I created this quick program to test the feasibility and it does not seem to work.
#include <iostream>
#include <cstdint>
#include <stdlib.h>
typedef union {
struct {
uint8_t header;
uint8_t* data;
};
uint8_t* packet;
} sometype;
int main() {
sometype s;
s.header = 3;
s.data = (uint8_t *) malloc(sizeof(uint8_t) * 2);
s.data[0] = 1;
s.data[1] = 2;
for (unsigned int i = 0; i < 3; i++) {
std::cout << s.packet[i] << std::endl;
}
std::cout << std::endl;
std::cout << s.header << std::endl;
std::cout << s.data[0] << std::endl;
std::cout << s.data[1] << std::endl;
}
My output is
�
�
�
Which makes me realize I have some type of error in my code (I've never worked with union before). However, when I debug the program I can see the data in the union. Looking at the packet, I can see that this method does not appear to be working. The data in the packet is not 3, 1, 2. It is 300, 221, 020 instead.
(gdb) print s
$1 = {{header = 3 '\003', data = 0x613c20 "\001\002"}, packet = 0x400903 <main()+125> "\300\211ƿ`\020`"}
Is this method that I am attempting valid? From google searches I saw someone say you can use whatever datatypes you want. Do I have to pack the structure using a pragma to get this to work or is this method not feasible?
The unusual output is because you attempt to use << to print a uint8_t.
Usually (although the C++ standard doesn't specify this), uint8_t triggers the character overload of <<, so you print out the glyph corresponding to that character code, instead of the integer. To avoid this hiccup you could do std::cout << static_cast<int>(s.header); etc.
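A small sketch of that fix, with hypothetical helper names, formatting into a string so that the difference is easy to check:

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Formats a byte as a number rather than as a character glyph.
std::string byte_as_number(std::uint8_t b) {
    std::ostringstream os;
    os << static_cast<int>(b);   // convert so the integer overload is chosen
    return os.str();
}

// For contrast: streams the byte directly, which on typical implementations
// (where uint8_t is unsigned char) selects the character overload.
std::string byte_as_char(std::uint8_t b) {
    std::ostringstream os;
    os << b;
    return os.str();
}
```

With header set to 3, byte_as_number yields "3", whereas byte_as_char typically yields an unprintable control character, which matches the strange output in the question.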
Note that in Standard C++ it is not permitted to write one member of a union and then read a different member, i.e. you may only read the same member that was last written. The technique you are trying to use is called union aliasing and is not allowed in Standard C++, although compilers may seem to support it as an extension.
However, even if you're on a compiler that does offer union aliasing, you still won't be able to do s.packet[i] with your current struct definition. This is because packet overlaps with header and data. The byte value of header should not be a part of the address packet is pointing to, but your code treats it like it is.
I guess you mentally had a model of a single pointer and you can interpret the memory being pointed to by the pointer as either a char array, or as a char followed by a char array. But your code doesn't reflect that (and in fact you can't do that at all, unless the lengths of the arrays are known at compile-time).
Since header and data[0] are not in contiguous memory, there's no way you are going to be able to have a single pointer that points to some fictitious memory block in which those two bytes are adjacent. I would recommend giving up on this entire line of enquiry; just have a single memory block, and you can make functions that access particular parts of it.
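The "single memory block plus accessor functions" idea might be sketched as follows. Packet and its member names are hypothetical, and the layout (one header byte followed by the payload) is an assumption taken from the question's example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet wrapper: one contiguous buffer in which byte 0 is the
// header and the remaining bytes are the payload.
class Packet {
public:
    explicit Packet(std::size_t payload_size) : bytes_(1 + payload_size) {}

    std::uint8_t& header() { return bytes_[0]; }
    std::uint8_t* data() { return bytes_.data() + 1; }
    std::size_t data_size() const { return bytes_.size() - 1; }

    // What a low-level API taking uint8_t* would receive.
    std::uint8_t* raw() { return bytes_.data(); }
    std::size_t raw_size() const { return bytes_.size(); }

private:
    std::vector<std::uint8_t> bytes_;
};
```

Setting header() to 3 and data()[0], data()[1] to 1 and 2 leaves raw() pointing at the three bytes 3, 1, 2 in contiguous memory, with no union involved.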
Yes, you have to use #pragma pack(1) to get the behavior most engineers expect. And yes, this is how most low-level communication software works.
Otherwise, compilers tend to align each element to its data size for performance and compatibility reasons.
#pragma pack() is widely supported across compilers; see the GCC documentation, for example.
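To illustrate what packing changes, here is a sketch with two hypothetical header structs that differ only in packing. It assumes a platform where float is 4 bytes, and it uses the push/pop form of the pragma, which GCC, Clang and MSVC accept.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler is free to insert padding for alignment.
struct NaturalHeader {
    std::uint8_t  id;
    std::uint16_t size;
    float         altitude;
};

#pragma pack(push, 1)   // no padding between members from here on
struct PackedHeader {
    std::uint8_t  id;
    std::uint16_t size;
    float         altitude;
};
#pragma pack(pop)       // restore the previous packing

static_assert(sizeof(PackedHeader) ==
                  sizeof(std::uint8_t) + sizeof(std::uint16_t) + sizeof(float),
              "a packed struct should contain no padding");
static_assert(offsetof(PackedHeader, size) == 1,
              "size directly follows id when packed");
```

Note that packing only controls the layout inside one struct; it does not make a pointer member and the memory it points to contiguous.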

How many levels of indirection can I have in C++? [duplicate]

How many pointers (*) are allowed in a single variable?
Let's consider the following example.
int a = 10;
int *p = &a;
Similarly we can have
int **q = &p;
int ***r = &q;
and so on.
For example,
int ****************zz;
The C standard specifies the lower limit:
5.2.4.1 Translation limits
The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits: [...]
— 12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or void type in a declaration
The upper limit is implementation specific.
Actually, C programs commonly make use of infinite pointer indirection. One or two static levels are common. Triple indirection is rare. But infinite is very common.
Infinite pointer indirection is achieved with the help of a struct, of course, not with a direct declarator, which would be impossible. And a struct is needed so that you can include other data in this structure at the different levels where this can terminate.
struct list { struct list *next; ... };
now you can have list->next->next->next->...->next. This is really just multiple pointer indirections: *(*(..(*(*(*list).next).next).next...).next).next. And the .next is basically a noop when it's the first member of the structure, so we can imagine this as ***..***ptr.
There is really no limit on this because the links can be traversed with a loop rather than a giant expression like this, and moreover, the structure can easily be made circular.
Thus, in other words, linked lists may be the ultimate example of adding another level of indirection to solve a problem, since you're doing it dynamically with every push operation. :)
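The loop-based traversal mentioned above can be sketched like this; the value member and the depth function are hypothetical additions to the struct list from the text.

```cpp
#include <cstddef>

// Node type mirroring `struct list` from the text, with a hypothetical
// payload member added.
struct list {
    list* next;
    int   value;
};

// Follows `next` until a null pointer is reached. Each step is one more
// level of indirection, but the depth is decided at run time.
std::size_t depth(const list* head) {
    std::size_t n = 0;
    for (const list* p = head; p != nullptr; p = p->next) {
        ++n;
    }
    return n;
}
```

On a circular list this loop would never terminate, which is the run-time counterpart of "unlimited" indirection.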
Theoretically:
You can have as many levels of indirections as you want.
Practically:
Of course, nothing that consumes memory can be indefinite, there will be limitations due to resources available on the host environment. So practically there is a maximum limit to what an implementation can support and the implementation shall document it appropriately. So in all such artifacts, the standard does not specify the maximum limit, but it does specify the lower limits.
Here's the reference:
C99 Standard 5.2.4.1 Translation limits:
— 12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or void type in a declaration.
This specifies the lower limit that every implementation must support. Note that in a footnote the standard further says:
18) Implementations should avoid imposing fixed translation limits whenever possible.
As people have said, no limit "in theory". However, out of interest I ran this with g++ 4.1.2, and it worked with size up to 20,000. Compile was pretty slow though, so I didn't try higher. So I'd guess g++ doesn't impose any limit either. (Try setting size = 10 and looking in ptr.cpp if it's not immediately obvious.)
g++ create.cpp -o create ; ./create > ptr.cpp ; g++ ptr.cpp -o ptr ; ./ptr
create.cpp
#include <iostream>
int main()
{
const int size = 200;
std::cout << "#include <iostream>\n\n";
std::cout << "int main()\n{\n";
std::cout << " int i0 = " << size << ";";
for (int i = 1; i < size; ++i)
{
std::cout << " int ";
for (int j = 0; j < i; ++j) std::cout << "*";
std::cout << " i" << i << " = &i" << i-1 << ";\n";
}
std::cout << " std::cout << ";
for (int i = 1; i < size; ++i) std::cout << "*";
std::cout << "i" << size-1 << " << \"\\n\";\n";
std::cout << " return 0;\n}\n";
return 0;
}
Sounds fun to check.
With Visual Studio 2010 (on Windows 7), you can have 1011 levels before getting this error:
fatal error C1026: parser stack overflow, program too complex
With gcc (Ubuntu), 100k+ * works without a crash! I guess the hardware is the limit here.
(tested with just a variable declaration)
There is no limit, check example at Pointers :: C Interview Questions and Answers.
The answer depends on what you mean by "levels of pointers." If you mean "How many levels of indirection can you have in a single declaration?" the answer is "At least 12."
int i = 0;
int *ip01 = & i;
int **ip02 = & ip01;
int ***ip03 = & ip02;
int ****ip04 = & ip03;
int *****ip05 = & ip04;
int ******ip06 = & ip05;
int *******ip07 = & ip06;
int ********ip08 = & ip07;
int *********ip09 = & ip08;
int **********ip10 = & ip09;
int ***********ip11 = & ip10;
int ************ip12 = & ip11;
************ip12 = 1; /* i = 1 */
If you mean "How many levels of pointer can you use before the program gets hard to read," that's a matter of taste, but there is a limit. Having two levels of indirection (a pointer to a pointer to something) is common. Any more than that gets a bit harder to think about easily; don't do it unless the alternative would be worse.
If you mean "How many levels of pointer indirection can you have at runtime," there's no limit. This point is particularly important for circular lists, in which each node points to the next. Your program can follow the pointers forever.
It's actually even funnier with pointer to functions.
#include <cstdio>
typedef void (*FuncType)();
static void Print() { std::printf("%s", "Hello, World!\n"); }
int main() {
FuncType const ft = &Print;
ft();
(*ft)();
(**ft)();
/* ... */
}
This prints:
Hello, World!
Hello, World!
Hello, World!
And it does not involve any runtime overhead, so you can probably stack them as much as you want... until your compiler chokes on the file.
There is no limit. A pointer is a chunk of memory whose contents are an address.
As you said
int a = 10;
int *p = &a;
A pointer to a pointer is also a variable which contains an address of another pointer.
int **q = &p;
Here q is pointer to pointer holding the address of p which is already holding the address of a.
There is nothing particularly special about a pointer to a pointer. So there is no limit on a chain of pointers, each holding the address of another pointer.
i.e.
int **************************************************************************z;
is allowed.
Every C++ developer should have heard of the (in)famous Three star programmer.
And there really seems to be some magic "pointer barrier" that has to be camouflaged.
Quote from C2:
Three Star Programmer
A rating system for C-programmers. The more indirect your pointers are (i.e. the more "*" before your variables), the higher your reputation will be. No-star C-programmers are virtually non-existent, as virtually all non-trivial programs require use of pointers. Most are one-star programmers. In the old times (well, I'm young, so these look like old times to me at least), one would occasionally find a piece of code done by a three-star programmer and shiver with awe.
Some people even claimed they'd seen three-star code with function pointers involved, on more than one level of indirection. Sounded as real as UFOs to me.
Note that there are two possible questions here: how many levels of pointer indirection we can achieve in a C type, and how many levels of pointer indirection we can stuff into a single declarator.
The C standard allows a maximum to be imposed on the former (and gives a minimum value for that). But that can be circumvented via multiple typedef declarations:
typedef int *type0;
typedef type0 *type1;
typedef type1 *type2; /* etc */
So ultimately, this is an implementation issue connected to the idea of how big/complex can a C program be made before it is rejected, which is very compiler specific.
I'd like to point out that producing a type with an arbitrary number of *'s is something that can happen with template metaprogramming. I forget what I was doing exactly, but it was suggested that I could produce new distinct types that have some kind of meta maneuvering between them by using recursive T* types.
Template metaprogramming is a slow descent into madness, so there is no need to make excuses when generating a type with several thousand levels of indirection. It's just a handy way to map Peano integers, for example, onto template expansion as a functional language.
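A recursive T* type of this kind might be generated as follows. This is only a sketch and add_pointers is a hypothetical name; the standard library offers std::add_pointer for a single level only.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical metafunction: add_pointers<T, N>::type is T followed by N stars.
template <typename T, std::size_t N>
struct add_pointers {
    using type = typename add_pointers<T, N - 1>::type*;
};

// Base case: zero levels of indirection is just T.
template <typename T>
struct add_pointers<T, 0> {
    using type = T;
};

static_assert(std::is_same<add_pointers<int, 0>::type, int>::value,
              "zero levels");
static_assert(std::is_same<add_pointers<int, 3>::type, int***>::value,
              "three levels");
```

How large N can get depends on the compiler's template recursion limit, which is implementation specific.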
Rule 17.5 of the 2004 MISRA C standard prohibits more than 2 levels of pointer indirection.
There is no real limit as such, but a practical limit exists. Pointers are variables that are usually stored on the stack, not the heap. The stack is usually small (its size can be changed at link time). So let's say you have a 4 MB stack, which is a fairly normal size, and a pointer that is 4 bytes in size (pointer sizes vary with architecture, target and compiler settings).
In this case 4 MB / 4 bytes = 1,048,576, so that would be the maximum possible number, but we shouldn't ignore the fact that other things are on the stack too.
Some compilers may impose a maximum length on a pointer chain, but otherwise the limit is the stack size. So if you set the stack size to infinity at link time, and have a machine with infinite memory running an OS that can handle that memory, you can have an unlimited pointer chain.
If you use int *ptr = new int; and put the pointee on the heap, which is less usual, the limit would be the heap size, not the stack.
EDIT Just realized that infinity / 2 = infinity. If a machine has more memory, the pointer size increases. So if memory is infinite and the size of a pointer is infinite, that is bad news... :)
It depends on where you store the pointers. If they are on the stack, the limit is quite low. If you store them on the heap, the limit is much, much higher.
Look at this program:
#include <iostream>
const int CBlockSize = 1048576;
int main()
{
int number = 0;
int** ptr = new int*[CBlockSize];
ptr[0] = &number;
for (int i = 1; i < CBlockSize; ++i)
ptr[i] = reinterpret_cast<int *> (&ptr[i - 1]);
for (int i = CBlockSize-1; i >= 0; --i)
std::cout << i << " " << static_cast<void*>(ptr[i]) << "->" << *ptr[i] << std::endl;
return 0;
}
It creates 1M pointers and then shows what points to what; it is easy to see that the chain leads back to the first variable, number.
BTW. It uses 92K of RAM so just imagine how deep you can go.

c++ function that takes any data type without using templates?

I have an assignment which asks one to write a function for any data type. The function is supposed to print the bytes of the structure and identify the total number of bytes the data structure uses, along with differentiating between bytes used for members and bytes used for padding.
My immediate reaction, along with most of the class's reaction, was to use templates. This allows you to write the function once and gather the run-time type of the objects passed into the function. Using memset and typeid, one can easily accomplish what has been asked. However, our prof. just saw our discussion about templates and damned templates to hell.
After seeing this I was thrown for a loop, and I'm looking for a little guidance on the best way to get around this. Some things I've looked into:
void pointers with explicit casting (this seems like it'd get messy)
base class with virtual functions only from which all data structures inherit from, seems a bit odd to do.
a base class with 'friendships' to each of our data structures.
rewriting a function for each data structure in our problem set (what I imagine is the worst possible solution).
Was hoping I overlooked a common c++ tool, does anyone have any ideas?
Treat the function as stupid as possible, in fact, treat it as if it doesn't know anything and all information must be passed to it.
Parameters to the function:
Structure address, as a uint8_t *. (Needed to print the bytes)
Structure size, in bytes. (Needed to print the bytes and to print the
total size)
A vector of member information: member length OR the sum of the bytes used by the members.
The vector is needed to fulfill the requirement of printing the bytes used by the members and the bytes used by padding. Optionally you could pass the sum of the members.
Example:
void Analyze_Structure(uint8_t const * p_structure,
size_t size_of_structure,
size_t size_occupied_by_members);
The trick of this assignment is to figure out how to have the calling function determine these items.
Hope this helps.
Edit 1:
struct Apple
{
char a;
int weight;
double protein_per_gram;
};
int main(void)
{
Apple granny_smith;
Analyze_Structure((uint8_t *) &granny_smith,
sizeof(Apple),
sizeof(granny_smith.a)
+ sizeof(granny_smith.weight)
+ sizeof(granny_smith.protein_per_gram));
return 0;
}
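The answer only declares Analyze_Structure. One possible body is sketched below under stated assumptions: the StructureReport return type and the extra std::ostream parameter are hypothetical additions that make the result easy to inspect, and are not part of the declaration above.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <iostream>

// Hypothetical summary returned to the caller.
struct StructureReport {
    std::size_t total_bytes;
    std::size_t member_bytes;
    std::size_t padding_bytes;
};

// Assumed implementation: prints each byte as two hex digits, then reports
// the total size, the bytes used by members, and the remainder as padding.
StructureReport Analyze_Structure(std::uint8_t const* p_structure,
                                  std::size_t size_of_structure,
                                  std::size_t size_occupied_by_members,
                                  std::ostream& os = std::cout) {
    const std::ios_base::fmtflags saved_flags = os.flags();
    const char saved_fill = os.fill('0');
    os << std::hex;
    for (std::size_t i = 0; i < size_of_structure; ++i) {
        os << std::setw(2) << static_cast<unsigned>(p_structure[i]) << ' ';
    }
    os.flags(saved_flags);   // back to the caller's formatting
    os.fill(saved_fill);

    // Whatever the members do not account for must be padding.
    StructureReport report{};
    report.total_bytes = size_of_structure;
    report.member_bytes = size_occupied_by_members;
    report.padding_bytes = size_of_structure > size_occupied_by_members
                               ? size_of_structure - size_occupied_by_members
                               : 0;
    os << "\ntotal: " << report.total_bytes
       << ", members: " << report.member_bytes
       << ", padding: " << report.padding_bytes << '\n';
    return report;
}
```

This version can tell how many padding bytes there are but not where they sit; locating them would require per-member offsets, as in the vector-of-member-information variant the answer mentions.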
I have assignment which asks one to write a function for any data type.
This means either templates (which your prof. dismissed), void*, or a variable number of arguments (similar to printf).
The function is supposed to print the bytes of the structure
void your_function(void* data, std::size_t size)
{
std::uint8_t* bytes = reinterpret_cast<std::uint8_t*>(data);
for(auto x = bytes; x != bytes + size; ++x)
std::clog << "0x" << std::hex << static_cast<std::uint32_t>(*x) << " ";
}
[...] and identify the total number of bytes the data structure uses along with differentiating between bytes used for members and bytes used for padding.
On this one, I'm lost: the bytes used for padding are (by definition) not part of the structure. Consider:
struct x { char c; char d; char e; }; // sizeof(x) == 3;
x instance{ 0, 0, 0 };
your_function(&instance, sizeof(x)); // passes 3, not 4 (4 for 32bits architecture)
Theoretically, you could also pass alignof(instance) to the function, but that won't tell you the alignment of the fields in memory (as far as I know it is not standardized, but I may be wrong).
There are a few possibilities here:
Your prof. learned "hacky" C++ that was considered good code 10 or 20 years ago and didn't update his knowledge (C-style code, pointers, direct memory access and "smart hacks" are all in here).
He didn't know how to express exactly what he wanted or the terminology to use ("write a function for any data type" is too vague: as a developer, if I got this assignment, the first thing to do would be to ask for details - like "how will it be used?" and "what is the expected function signature").
For example, this could be achieved - to a degree - with macros, but if he wants you to use macros in place of functions and templates, you should probably contemplate changing professors.
He meant that you should write some arbitrary data type (like my struct x above) and define your API around that (unlikely).
I am not sure that such a function can be built without a minimum of introspection: you need to know what the struct members are, otherwise you only have access to the size of the struct.
Anyway, here is my proposal for a solution that should work without introspection, provided the user of the code "cooperates".
Your functions will take as arguments void* and size_t for the address and sizeof of the struct.
0) let the user create a struct of the desired type.
1) let the user call a function of yours that sets all bytes to 0.
2) let the user assign a value to every field of the struct.
3) let the user call a function of yours that keeps a record of every byte that is still 0.
4) let the user call a function of yours that sets all bytes to 1.
5) let the user assign a value to every field of the struct again. (Same values as the first time!)
6) let the user call a function of yours and count the bytes that are still 1 AND were marked before. These are padding bytes.
The reason for testing with 0 and then 1 is that the values assigned by the user could themselves contain 0 bytes; but a byte can't be 0 and 1 at the same time, so one of the two tests will exclude it.
struct _S { int I; char C; } S;
Fill0(&S, sizeof(S));
// User cooperation
S.I= 0;
S.C= '\0';
Mark0(&S, sizeof(S)); // Has some form of static storage
Fill1(&S, sizeof(S));
// User cooperation
S.I= 0;
S.C= '\0';
DetectPadding(&S, sizeof(S));
You can pack all of this in a single function that takes a callback function argument that does the member assignments.
void Assign(void* pS) // User-written callback
{
struct _S& S= *(struct _S*)pS;
S.I= 0;
S.C= '\0';
}
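A minimal sketch of the single-function variant, under stated assumptions: the names AssignFn, Record, AssignRecord and CountPadding are hypothetical, and DetectPadding here takes the callback as a third argument. It also relies on the compiler leaving padding bytes untouched when a member is assigned, which is what compilers usually do but not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

using AssignFn = void (*)(void*);   // user-written callback type (assumed)

// Returns one flag per byte: true means the byte looked like padding.
std::vector<bool> DetectPadding(void* p, std::size_t size, AssignFn assign)
{
    assert(p != nullptr && assign != nullptr);
    unsigned char* bytes = static_cast<unsigned char*>(p);
    std::vector<bool> is_padding(size, false);

    std::memset(p, 0x00, size);     // steps 1-3: fill with 0, assign, mark
    assign(p);
    for (std::size_t i = 0; i < size; ++i)
        is_padding[i] = (bytes[i] == 0x00);

    std::memset(p, 0x01, size);     // steps 4-6: fill with 1, assign, confirm
    assign(p);
    for (std::size_t i = 0; i < size; ++i)
        is_padding[i] = is_padding[i] && (bytes[i] == 0x01);

    return is_padding;
}

std::size_t CountPadding(const std::vector<bool>& flags)
{
    std::size_t n = 0;
    for (bool f : flags) n += f ? 1 : 0;
    return n;
}

// Hypothetical user type and its cooperating callback.
struct Record { int I; char C; };

void AssignRecord(void* p)
{
    Record& r = *static_cast<Record*>(p);
    r.I = 0;
    r.C = '\0';
}
```

A caller would write Record r; followed by DetectPadding(&r, sizeof r, AssignRecord); and inspect the returned flags. A byte that the callback never writes is reported as padding, so the result is only as good as the callback is complete.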

How many levels of pointers can we have?

How many pointers (*) are allowed in a single variable?
Let's consider the following example.
int a = 10;
int *p = &a;
Similarly we can have
int **q = &p;
int ***r = &q;
and so on.
For example,
int ****************zz;
The C standard specifies the lower limit:
5.2.4.1 Translation limits
276 The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits: [...]
279 — 12 pointer, array, and function declarators (in any combinations) modifying an
arithmetic, structure, union, or void type in a declaration
The upper limit is implementation specific.
Actually, C programs commonly make use of infinite pointer indirection. One or two static levels are common. Triple indirection is rare. But infinite is very common.
Infinite pointer indirection is achieved with the help of a struct, of course, not with a direct declarator, which would be impossible. And a struct is needed so that you can include other data in this structure at the different levels where this can terminate.
struct list { struct list *next; ... };
now you can have list->next->next->next->...->next. This is really just multiple pointer indirections: *(*(..(*(*(*list).next).next).next...).next).next. And the .next is basically a noop when it's the first member of the structure, so we can imagine this as ***..***ptr.
There is really no limit on this because the links can be traversed with a loop rather than a giant expression like this, and moreover, the structure can easily be made circular.
Thus, in other words, linked lists may be the ultimate example of adding another level of indirection to solve a problem, since you're doing it dynamically with every push operation. :)
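As a small illustration of that point, here is a sketch in C++ (the value member and the Follow function are hypothetical additions) in which the number of indirections is chosen at run time by a loop:

```cpp
#include <cassert>
#include <cstddef>

struct list { list* next; int value; };

// Follow `next` n times. Each iteration is one more level of indirection,
// and n is only known at run time.
const list* Follow(const list* head, std::size_t n)
{
    const list* p = head;
    for (std::size_t i = 0; i < n; ++i) {
        assert(p != nullptr && "walked off the end of the list");
        p = p->next;
    }
    return p;
}
```

On a circular list the loop never runs out of nodes, so n can be as large as the counter type allows.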
Theoretically:
You can have as many levels of indirections as you want.
Practically:
Of course, nothing that consumes memory can be indefinite, there will be limitations due to resources available on the host environment. So practically there is a maximum limit to what an implementation can support and the implementation shall document it appropriately. So in all such artifacts, the standard does not specify the maximum limit, but it does specify the lower limits.
Here's the reference:
C99 Standard 5.2.4.1 Translation limits:
— 12 pointer, array, and function declarators (in any combinations) modifying an
arithmetic, structure, union, or void type in a declaration.
This specifies the lower limit that every implementation must support. Note that in a footnote the standard further says:
18) Implementations should avoid imposing fixed translation limits whenever possible.
As people have said, no limit "in theory". However, out of interest I ran this with g++ 4.1.2, and it worked with size up to 20,000. Compile was pretty slow though, so I didn't try higher. So I'd guess g++ doesn't impose any limit either. (Try setting size = 10 and looking in ptr.cpp if it's not immediately obvious.)
g++ create.cpp -o create ; ./create > ptr.cpp ; g++ ptr.cpp -o ptr ; ./ptr
create.cpp
#include <iostream>
int main()
{
const int size = 200;
std::cout << "#include <iostream>\n\n";
std::cout << "int main()\n{\n";
std::cout << " int i0 = " << size << ";\n";
for (int i = 1; i < size; ++i)
{
std::cout << " int ";
for (int j = 0; j < i; ++j) std::cout << "*";
std::cout << " i" << i << " = &i" << i-1 << ";\n";
}
std::cout << " std::cout << ";
for (int i = 1; i < size; ++i) std::cout << "*";
std::cout << "i" << size-1 << " << \"\\n\";\n";
std::cout << " return 0;\n}\n";
return 0;
}
Sounds fun to check.
Visual Studio 2010 (on Windows 7), you can have 1011 levels before getting this error:
fatal error C1026: parser stack overflow, program too complex
gcc (Ubuntu), 100k+ * without a crash ! I guess the hardware is the limit here.
(tested with just a variable declaration)
There is no limit, check example at Pointers :: C Interview Questions and Answers.
The answer depends on what you mean by "levels of pointers." If you mean "How many levels of indirection can you have in a single declaration?" the answer is "At least 12."
int i = 0;
int *ip01 = & i;
int **ip02 = & ip01;
int ***ip03 = & ip02;
int ****ip04 = & ip03;
int *****ip05 = & ip04;
int ******ip06 = & ip05;
int *******ip07 = & ip06;
int ********ip08 = & ip07;
int *********ip09 = & ip08;
int **********ip10 = & ip09;
int ***********ip11 = & ip10;
int ************ip12 = & ip11;
************ip12 = 1; /* i = 1 */
If you mean "How many levels of pointer can you use before the program gets hard to read," that's a matter of taste, but there is a limit. Having two levels of indirection (a pointer to a pointer to something) is common. Any more than that gets a bit harder to think about easily; don't do it unless the alternative would be worse.
If you mean "How many levels of pointer indirection can you have at runtime," there's no limit. This point is particularly important for circular lists, in which each node points to the next. Your program can follow the pointers forever.
It's actually even funnier with pointer to functions.
#include <cstdio>
typedef void (*FuncType)();
static void Print() { std::printf("%s", "Hello, World!\n"); }
int main() {
FuncType const ft = &Print;
ft();
(*ft)();
(**ft)();
/* ... */
}
As illustrated here this gives:
Hello, World!
Hello, World!
Hello, World!
And it does not involve any runtime overhead, so you can probably stack them as much as you want... until your compiler chokes on the file.
There is no limit. A pointer is a chunk of memory whose contents are an address.
As you said
int a = 10;
int *p = &a;
A pointer to a pointer is also a variable which contains an address of another pointer.
int **q = &p;
Here q is pointer to pointer holding the address of p which is already holding the address of a.
There is nothing particularly special about a pointer to a pointer. So there is no limit on a chain of pointers, each holding the address of another pointer.
ie.
int **************************************************************************z;
is allowed.
Every C++ developer should have heard of the (in)famous Three star programmer.
And there really seems to be some magic "pointer barrier" that has to be camouflaged.
Quote from C2:
Three Star Programmer
A rating system for C-programmers. The more indirect your pointers are (i.e. the more "*" before your variables), the higher your reputation will be. No-star C-programmers are virtually non-existent, as virtually all non-trivial programs require use of pointers. Most are one-star programmers. In the old times (well, I'm young, so these look like old times to me at least), one would occasionally find a piece of code done by a three-star programmer and shiver with awe.
Some people even claimed they'd seen three-star code with function pointers involved, on more than one level of indirection. Sounded as real as UFOs to me.
Note that there are two possible questions here: how many levels of pointer indirection we can achieve in a C type, and how many levels of pointer indirection we can stuff into a single declarator.
The C standard allows a maximum to be imposed on the former (and gives a minimum value for that). But that can be circumvented via multiple typedef declarations:
typedef int *type0;
typedef type0 *type1;
typedef type1 *type2; /* etc */
So ultimately, this is an implementation issue connected to the idea of how big/complex can a C program be made before it is rejected, which is very compiler specific.
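A quick way to check that the typedef chain really builds the same type as the stacked stars is a compile-time comparison. This sketch is compiled as C++ (std::is_same is not available in C), and Read_Through is a hypothetical helper added only to show the three dereferences:

```cpp
#include <cassert>
#include <type_traits>

typedef int *type0;
typedef type0 *type1;
typedef type1 *type2;

// Each declaration adds only one declarator, yet the result is int***.
static_assert(std::is_same<type2, int ***>::value,
              "type2 should be the same type as int***");

// Hypothetical helper: three dereferences reach the int.
int Read_Through(type2 p)
{
    assert(p != nullptr && *p != nullptr && **p != nullptr);
    return ***p;
}
```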
I'd like to point out that producing a type with an arbitrary number of *'s is something that can happen with template metaprogramming. I forget what I was doing exactly, but it was suggested that I could produce new distinct types that have some kind of meta maneuvering between them by using recursive T* types.
Template Metaprogramming is a slow descent into madness, so it is not necessary to make excuses when generating a type with several thousand level of indirection. It's just a handy way to map peano integers, for example, onto template expansion as a functional language.
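To make this concrete, here is a minimal sketch of how recursive T* types can be generated and counted. The names add_pointers, add_pointers_t and pointer_depth are invented for this example and are not standard library facilities:

```cpp
#include <cstddef>
#include <type_traits>

// add_pointers<T, N>::type is T followed by N stars.
template <typename T, std::size_t N>
struct add_pointers {
    using type = typename add_pointers<T, N - 1>::type*;
};

template <typename T>
struct add_pointers<T, 0> {
    using type = T;
};

template <typename T, std::size_t N>
using add_pointers_t = typename add_pointers<T, N>::type;

// pointer_depth<T>::value counts the stars again (the Peano-style mapping).
template <typename T>
struct pointer_depth : std::integral_constant<std::size_t, 0> {};

template <typename T>
struct pointer_depth<T*>
    : std::integral_constant<std::size_t, 1 + pointer_depth<T>::value> {};

static_assert(std::is_same<add_pointers_t<int, 3>, int***>::value,
              "three levels of indirection");
```

How large N can get depends on the compiler's template recursion limit, which is an implementation setting and not something the language fixes.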
Rule 17.5 of the 2004 MISRA C standard prohibits more than 2 levels of pointer indirection.
There isn't a hard limit as such, but a practical limit exists. Pointers are variables that usually live on the stack rather than the heap. The stack is usually small (its size can be changed at link time). So let's say you have a 4 MB stack, which is quite a normal size, and a pointer that is 4 bytes in size (pointer sizes vary with architecture, target and compiler settings).
In this case 4 MB / 4 bytes = 1,048,576, so that would be the maximum possible number, but we shouldn't ignore the fact that other things live on the stack too.
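The arithmetic can be checked at compile time. In this sketch Max_Pointers is a hypothetical helper, and the 4 MiB stack and the pointer sizes are the assumptions from the text, not measured values:

```cpp
#include <cstddef>

// Upper bound on how many pointers fit if nothing else used the stack.
constexpr std::size_t Max_Pointers(std::size_t stack_bytes,
                                   std::size_t pointer_bytes)
{
    return stack_bytes / pointer_bytes;
}

// 4 MiB stack with 4-byte pointers, as assumed above.
static_assert(Max_Pointers(4u * 1024u * 1024u, 4u) == 1048576u,
              "4 MiB / 4 bytes");

// The same stack with 8-byte pointers holds half as many.
static_assert(Max_Pointers(4u * 1024u * 1024u, 8u) == 524288u,
              "4 MiB / 8 bytes");
```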
However, some compilers may impose a maximum length on a pointer chain; otherwise the limit is the stack size. So if you could set the stack size to infinity at link time, on a machine with infinite memory running an OS that can handle that memory, you would have an unlimited pointer chain.
If you use int *ptr = new int; and put your pointers on the heap, which is less usual, the limit would be the heap size, not the stack size.
EDIT: I just realized that infinity / 2 = infinity. If a machine has more memory, the pointer size increases. So if memory is infinite and the size of a pointer is infinite too, that is bad news... :)
It depends on where you store the pointers. If they are on the stack, the limit is quite low. If you store them on the heap, the limit is much, much higher.
Look at this program:
#include <iostream>
const int CBlockSize = 1048576;
int main()
{
int number = 0;
int** ptr = new int*[CBlockSize];
ptr[0] = &number;
for (int i = 1; i < CBlockSize; ++i)
ptr[i] = reinterpret_cast<int *> (&ptr[i - 1]);
for (int i = CBlockSize-1; i >= 0; --i)
std::cout << i << " " << static_cast<void *>(ptr[i]) << "->" << *ptr[i] << std::endl;
return 0;
}
It creates 1M pointers and at the end shows what points to what; it is easy to see that the chain leads back to the first variable, number.
BTW. It uses 92K of RAM so just imagine how deep you can go.
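The program above stores addresses of int* objects in int* variables, so reading through them prints reinterpreted bytes and not meaningful values. A variant that actually walks the chain back to number can be sketched with void* links. Follow_Chain is a hypothetical name, and std::vector stands in for the new[] array:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a chain of `depth` heap-stored pointers ending at `target`,
// then follows it from the far end back to the int.
int Follow_Chain(int& target, std::size_t depth)
{
    assert(depth >= 1);
    std::vector<void*> chain(depth);

    chain[0] = &target;                       // last link points at the int
    for (std::size_t i = 1; i < depth; ++i)
        chain[i] = &chain[i - 1];             // each link points at the previous one

    void* p = chain[depth - 1];
    for (std::size_t i = depth - 1; i > 0; --i)
        p = *static_cast<void**>(p);          // one indirection per step

    return *static_cast<int*>(p);             // p is now &target
}
```

With depth set to 1048576 this follows about a million indirections at run time, and the memory needed is the pointer size times the depth.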