I have been working on a project that uses a dynamic array of structures. To avoid storing the number of structures in a separate variable, I have been using an array of pointers to the structures, terminated by NULL.
For example, let's say my structure type is defined as:
typedef struct structure_item{
/* ... Structure Variables Here ... */
} item_t;
Now let's say my code has item_t *allItems[] = { item_1, item_2, item_3, ..., item_n, NULL }; where every item_# is of type item_t *.
Using this setup, I then do not have to keep track of another variable which tells me the total number of items. Instead, I can determine the total number of items as needed by saying:
int numberOfStructures;
for( numberOfStructures = 0;
*(allItems + numberOfStructures) != NULL;
numberOfStructures++
);
When this code executes, it counts the total number of pointers before NULL.
By comparison, this scheme is similar to C-style strings, whereas tracking the total number of structures would be similar to a Pascal-style string (C uses a NUL-terminated array of characters, while Pascal tracks the length of its array of characters).
My question is rather simple: is an array of pointers (pointer to pointer to struct) really necessary, or could this be done with an array of structs (pointer to struct)? Can anybody provide better ways to handle this?
Note: it is important that the solution is compatible with both C and C++. This is being used in a wrapper library which is wrapping a C++ library for use in standard C.
Thank you all in advance!
What you need is a sentinel value, a recognizable valid value that means "nothing". For pointers, the standard sentinel value is NULL.
If you want to use your structs directly, you will need to decide on a sentinel value of type item_t, and check for that. Your call.
Yes, it is possible to have an array of structs in which (at least) one element is a defined sentinel, playing the same role as the '\0' at the end of a string or the NULL pointer in your case.
What you need to do, for your struct type, is reserve one or more possible values of that struct (composed of the set of values of its members) to indicate a sentinel.
For example, let's say we have a struct type
struct X {int a; char *p};
then define a function
int is_sentinel(struct X x)
{
return x.p == NULL;
}
This will mean any struct X for which the member p is NULL can be used as a sentinel (and the member a would not matter in this case).
Then just loop looking for a sentinel.
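A minimal sketch of that loop, reusing struct X from above. Here is_sentinel takes a pointer instead of a copy, and count_items is a hypothetical helper name introduced only for illustration. The code stays within the common subset of C and C++.

```cpp
#include <stddef.h>  /* size_t, NULL; this header is valid in both C and C++ */

struct X { int a; char *p; };

/* An element whose p member is NULL marks the end of the array. */
int is_sentinel(const struct X *x)
{
    return x->p == NULL;
}

/* Counts the elements that precede the sentinel (hypothetical helper). */
size_t count_items(const struct X *items)
{
    size_t n = 0;
    while (!is_sentinel(&items[n]))
        ++n;
    return n;
}
```

The array handed to count_items must end with an element such as { 0, NULL }; without it the loop reads past the end of the array, exactly as strlen does on an unterminated string.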
Note: to be compatible in both C and C++, the struct type needs to be compatible (e.g. POD).
I want to know whether a pointer is pointing to an array or to a single integer. I have a function which takes two pointers (int and char) as input and should tell whether each pointer is pointing to an array or to a single integer.
pointer=pointer+4;
pointer1=pointer1+4;
Is this a good idea?
Like others have said here, C doesn't know what a pointer is pointing to. However if you should choose to go down this path, you could put a sentinel value in the integer or first position in the array to indicate what it is...
#define ARRAY_SENTINEL (-1)
int x = 0;
int x_array[3] = {ARRAY_SENTINEL, 7, 11};
int *pointer = &x_array[0];
if (*pointer == ARRAY_SENTINEL)
{
// do some crazy stuff
}
pointer = &x;
if (*pointer != ARRAY_SENTINEL)
{
// do some more crazy stuff
}
That's not a good idea. Using just raw pointers there's no way to know if they point to an array or a single value.
A pointer that is being used as an array and a pointer to a single value are identical: they're both just a memory address, so there's no information to use to distinguish between them. If you post what you want to ultimately do, there might be a solution that doesn't rely on comparing pointers to arrays and single values.
Actually pointers point to a piece of memory, not to integers or arrays. It is not possible to tell whether an integer is a single variable or an element of an array; both look exactly the same in memory.
Can you use some C++ data structures, std::vector for example?
For C++ questions, the answer is simple. Do not use C-style dynamic arrays in C++. Whenever you need a C-style dynamic array, you should use std::vector.
This way you never have to guess what a pointer points to, because only std::vector will be holding an array.
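A small sketch of that convention, assuming a hypothetical function sum_all: a sequence is always passed as a std::vector, which carries its own element count, so the callee never has to infer anything from a raw pointer.

```cpp
#include <cstddef>
#include <vector>

// A sequence is passed as a std::vector<int>, which knows its own size.
// sum_all is a hypothetical name used only for illustration.
long sum_all(const std::vector<int>& values)
{
    long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        total += values[i];
    return total;
}
```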
The Multiboot Specification has structures like this:
struct multiboot_tag_mmap
{
multiboot_uint32_t type;
multiboot_uint32_t size;
multiboot_uint32_t entry_size;
multiboot_uint32_t entry_version;
struct multiboot_mmap_entry entries[0];
};
The intent seems to be that the array size can vary. The information is not known until passed along by the boot loader. In hosted C++, the suggested advice is to "use vector". Well, I can't do that. The alternative is to use dynamic allocation, but that would require implementing a significant chunk of the kernel (paging, MMU, etc.) before I even have the memory map information. A bit of a chicken or egg problem.
The "hack" is to just enable extensions with gnu++11. But I try to avoid using extensions as much as possible to avoid C-ish code or code that could potentially lead to undefined behavior. The more portable the code is, the less chance of bugs in my opinion.
Finally, you iterate over the memory map like this:
for (mmap = ((struct multiboot_tag_mmap *) tag)->entries;
(multiboot_uint8_t *) mmap
< (multiboot_uint8_t *) tag + tag->size;
mmap = (multiboot_memory_map_t *)
((unsigned long) mmap
+ ((struct multiboot_tag_mmap *) tag)->entry_size))
So the size of the structure is tag->size.
I can modify the multiboot header so long as the semantics are the same. The point is how it looks to the bootloader. What can I do?
Instead of a 0-sized array, you can use a 1-sized array:
struct multiboot_tag_mmap
{
...
struct multiboot_mmap_entry entries[1];
};
This changes only the result of sizeof(struct multiboot_tag_mmap), which should not be used in any case: the size of the allocated structure should be computed as
offsetof(struct multiboot_tag_mmap, entries) + <num-of-entries> * sizeof(struct multiboot_mmap_entry)
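As a sketch, that size formula can be wrapped in a helper. The two structs below are simplified stand-ins with an assumed layout, not the definitions from the real Multiboot header, and tag_mmap_bytes is a hypothetical name.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>

// Simplified stand-ins for the Multiboot types (assumed layout, illustration only).
struct mmap_entry {
    std::uint64_t addr;
    std::uint64_t len;
    std::uint32_t type;
    std::uint32_t reserved;
};

struct tag_mmap {
    std::uint32_t type;
    std::uint32_t size;
    std::uint32_t entry_size;
    std::uint32_t entry_version;
    mmap_entry entries[1];   // 1-sized placeholder for the variable part
};

// Bytes needed for a tag that holds `count` entries.
// Deliberately avoids sizeof(tag_mmap), which already includes one entry.
inline std::size_t tag_mmap_bytes(std::size_t count)
{
    return offsetof(tag_mmap, entries) + count * sizeof(mmap_entry);
}
```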
The alignment of the map structure does not depend on the number of elements in the entries array, only on the entry type.
Strictly conforming alternative:
If there is a known upper bound on the array size, one can use that bound in the type declaration:
struct multiboot_tag_mmap
{
...
struct multiboot_mmap_entry entries[<UPPER-BOUNDARY>];
};
With such a declaration, none of the issues described below apply.
NOTE about accessing elements:
To access elements beyond the first one in such a flexible array, one needs to declare a new pointer variable:
struct multiboot_mmap_entry* entries = tag->entries;
entries[index] = ...; // This is OK.
instead of using the entries field directly:
tag->entries[index] = ...; // WRONG! May spuriously fail!
The thing is that the compiler, knowing that only one element exists in the entries array, may optimize the latter case to:
tag->entries[0] = ...; // Compiler is in its rights to assume index to have the only allowed value
Issues of standard conformance: With the flexible array approach, there is no object of type struct multiboot_tag_mmap in memory (on the heap or on the stack). All we have is a pointer of this type, which is never dereferenced as a whole (e.g. to make a full copy of the object). Similarly, there is no object of the array type struct multiboot_mmap_entry[1] corresponding to the entries field of the structure; this field is used only for conversion to a generic pointer of type struct multiboot_mmap_entry*.
So the phrase in the C standard that denotes undefined behavior
An object is assigned to an inexactly overlapping object or to an exactly overlapping object with incompatible type
does not apply when the entries array field is accessed through a generic pointer: there is no overlapping object here.
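A different way to sidestep the array-member question, not proposed in the answer above, is to treat the tag as raw bytes and copy each piece out with memcpy, stepping by entry_size as the loop in the question does. The sketch below assumes simplified stand-in structs (not the real Multiboot definitions) and a hypothetical function total_length.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>   // std::memcpy

// Simplified stand-ins for the Multiboot types (assumed layout, illustration only).
struct mmap_entry {
    std::uint64_t addr;
    std::uint64_t len;
    std::uint32_t type;
    std::uint32_t reserved;
};

struct tag_header {
    std::uint32_t type;
    std::uint32_t size;          // size of the whole tag in bytes, header included
    std::uint32_t entry_size;    // stride between consecutive entries
    std::uint32_t entry_version;
};

// Sums the `len` field of every entry stored after the header.
// `tag` points at the first byte of the tag as handed over by the boot loader.
std::uint64_t total_length(const unsigned char* tag)
{
    tag_header header;
    std::memcpy(&header, tag, sizeof header);

    // Refuse a stride smaller than one entry, which would make reads overlap
    // or never advance, and a tag too small to hold its own header.
    if (header.entry_size < sizeof(mmap_entry) || header.size < sizeof header)
        return 0;

    std::uint64_t total = 0;
    std::size_t offset = sizeof header;
    while (offset + sizeof(mmap_entry) <= header.size) {
        mmap_entry entry;
        std::memcpy(&entry, tag + offset, sizeof entry);
        total += entry.len;
        offset += header.entry_size;
    }
    return total;
}
```

No struct in this version declares an array member at all, so neither the 0-sized extension nor the 1-sized placeholder is needed; the cost is one small copy per entry.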
I want to access continuously declared member arrays of the same type with a single pointer.
So for example say I have :
struct S
{
double m1[2];
double m2[2];
};
int main()
{
S obj;
double *sp = obj.m1;
// Code that may be unsafe!
for (int i(0); i < 4; ++i)
*(sp++) = i; // *
return 0;
}
Under what circumstances is line (*) problematic?
I know there is certainly a problem when virtual functions are present, but I need a more structured answer than my assumptions.
You can be sure that the members of the struct are stored in a contiguous block of bytes, in the order they appear. In addition, the elements of each array are contiguous. So it seems that everything is fine.
The problem is that there is no standard way of knowing whether there are padding bytes between consecutive members of the struct.
So it is unsafe to assume that there are no padding bytes at all.
If you can be sure, for some particular reason, that there are no padding bytes, then the 4 double elements will be contiguous, as you want.
The C++ standard makes certain guarantees about the layout of "plain old data" (or in C++11, standard layout) types. For the most part, these inherit from how C treated such data.
What follows only applies to "plain old data"/"standard layout" structures and data.
If you have two structs with the same initial order and type of members, casting a pointer to one to a pointer to the other and accessing their common initial prefix is valid, and will access the corresponding field. This is known as "layout compatible". This also applies if you have a structure X and a structure Y, and X is the first element of the structure Y: a pointer to Y can be cast to a pointer to X, and it will access the fields of the X substructure in Y.
Now, while it is a common assumption, I am unaware of a requirement in either C or C++ that a structure consisting of N fields of the same type be layout compatible with an array of N elements of that type.
Your case is somewhat similar, in that we have two arrays adjacent to each other in a structure, and you are treating them as one large array whose size is the sum of the two arrays' sizes. It is a relatively common assumption that this works, but I am unaware of a guarantee in the standard that it actually does.
With this kind of undefined behavior, you have to examine your particular compiler's guarantees (de facto or explicit) about the layout of plain old data/standard layout data, as the C++ standard does not guarantee that your code does what you want it to do.
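One way to make the padding assumption explicit, and to avoid relying on it for the fill itself, is sketched below. The static_assert checks and the helper fill_sequential are hypothetical additions, not part of the question's code. The checks only confirm the layout; they do not make it well defined to step a pointer from m1 into m2.

```cpp
#include <cstddef>   // offsetof

struct S {
    double m1[2];
    double m2[2];
};

// Fails to compile if this compiler inserts padding between or after the arrays.
static_assert(offsetof(S, m2) == sizeof(double) * 2, "padding between m1 and m2");
static_assert(sizeof(S) == sizeof(double) * 4, "trailing padding in S");

// Fills both arrays with 0, 1, 2, 3 without ever stepping a pointer
// past the end of m1.
inline void fill_sequential(S& s)
{
    double* parts[] = { s.m1, s.m2 };
    int value = 0;
    for (double* part : parts)
        for (int i = 0; i < 2; ++i)
            part[i] = value++;
}
```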
I was writing a struct to describe a constant value I needed, and noticed something strange.
namespace res{
namespace font{
struct Structure{
struct Glyph{
int x, y, width, height, easement, advance;
};
int glyphCount;
unsigned char asciiMap[]; // <-- always generates an error
Glyph glyphData[]; // <-- never generates an error
};
const Structure system = {95,
{
// mapping data
},
{
// glyph spacing data
}
}; // system constructor
} // namespace font
} // namespace res
The last two members of Structure, the unsized arrays, do not stop the compiler if either one is there by itself. But if both are included in the struct's definition, the compiler reports an error saying the "type is incomplete".
This stops being a problem if I give the first array a size. Which isn't a problem in this case, but I'm still curious...
My question is, why can I have one unsized array in my struct, but two cause a problem?
In standard C++, you can't do this at all, although some compilers support it as an extension.
In C, every member of a struct needs to have a fixed position within the struct. This means that the last member can have an unknown size; but nothing can come after it, so there is no way to have more than one member of unknown size.
If you do take advantage of your compiler's non-standard support for this hack in C++, then beware that things may go horribly wrong if any member of the struct is non-trivial. An object can only be "created" with a non-empty array at the end by allocating a block of raw memory and reinterpreting it as this type; if you do that, no constructors or destructors will be called.
You are using a non-standard Microsoft extension. C99 and later (note: C, not C++) allow the last member of a structure to be an unsized array (read: at most one such array):
A Microsoft extension allows the last member of a C or C++ structure or class to be a variable-sized array. These are called unsized arrays. The unsized array at the end of the structure allows you to append a variable-sized string or other array, thus avoiding the run-time execution cost of a pointer dereference.
// unsized_arrays_in_structures1.cpp
// compile with: /c
struct PERSON {
unsigned number;
char name[]; // Unsized array
};
If you apply the sizeof operator to this structure, the ending array size is considered to be 0. The size of this structure is therefore the size of the unsigned member (4 bytes with current Microsoft compilers). To get the true size of a variable of type PERSON, you would need to obtain the array size separately.
The size of the structure is added to the size of the array to get the total size to be allocated. After allocation, the array is copied to the array member of the structure, as shown below:
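A sketch of that allocation pattern follows. It is not the documentation's listing: to stay within standard C++ it keeps the fixed part in a hypothetical PersonHeader struct and places the characters directly after it, instead of declaring an unsized member. The helper names make_person and person_name are also assumptions.

```cpp
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::strlen, std::memcpy
#include <new>       // placement new

// Fixed-size part of the record (hypothetical stand-in for PERSON).
struct PersonHeader {
    unsigned number;
};

// Returns the name stored immediately after the header.
inline char* person_name(PersonHeader* p)
{
    return reinterpret_cast<char*>(p) + sizeof(PersonHeader);
}

// Allocates header and name in one block; release it with std::free.
inline PersonHeader* make_person(unsigned number, const char* name)
{
    const std::size_t length = std::strlen(name) + 1;        // include '\0'
    void* raw = std::malloc(sizeof(PersonHeader) + length);  // header size + array size
    if (raw == nullptr)
        return nullptr;
    PersonHeader* p = new (raw) PersonHeader;                // start the header's lifetime
    p->number = number;
    std::memcpy(person_name(p), name, length);               // copy the array into place
    return p;
}
```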
The compiler needs to be able to decide on the offset of every member within the struct. That's why you're not allowed to place any further members after an unsized array. It follows from this that you can't have two unsized arrays in a struct.
It is an extension from Microsoft, and sizeof(structure) == sizeof(structure_without_variable_size_array).
I guess they use the initializer to find the size of the array. If you have two variable-size arrays, you can't find it (this is equivalent to finding a unique solution to a system with 2 unknowns and only 1 equation...)
Arrays without a dimension are not allowed in a struct,
period, at least in C++. In C, the last member (and only the
last) may be declared without a dimension, and some compilers
allow this in C++, as an extension, but you shouldn't count on
it (and in strict mode, they should at least complain about it).
Other compilers have implemented the same semantics if the last
element had a dimension of 0 (also an extension, requiring
a diagnostic in strict mode).
The reason for limiting incomplete array types to the last
element is simple: what would be the offset of any following
elements? Even when it is the last element, there are
restrictions to the use of the resulting struct: it cannot be
a member of another struct or an array, for example, and
sizeof ignores this last element.
In C# I use the Length property built into the array whose size I'd like to get.
How do I do that in C++?
It really depends on what you mean by "array". Arrays in C++ have a size (meaning the "raw" byte size here) equal to N times the size of one item, so one can easily get the number of items using the sizeof operator. But this requires that you still have access to the type of that array. Once you pass it to a function, it is converted to a pointer, and then you are lost: the size can no longer be determined. You will have to construct some other way that relies on the values of the elements to calculate the size.
Here are some examples:
int a[5];
size_t size = (sizeof a / sizeof a[0]); // size == 5
int *pa = a;
If we now lose the name "a" (and therefore its type), for example by passing "pa" to a function that then has only the value of that pointer, we are out of luck. We can no longer retrieve the size. We would need to pass the size along with the pointer to that function.
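A minimal sketch of passing the size along with the pointer; sum_ints is a hypothetical function name.

```cpp
#include <cstddef>

// The pointer alone carries no length, so the caller supplies it.
long sum_ints(const int* data, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}
```

With the array a from above still in scope, a call would look like sum_ints(a, sizeof a / sizeof a[0]).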
The same restrictions apply when we get an array by using new. It returns a pointer pointing to that array's elements, and thus the size will be lost.
int *p = new int[5];
// can't get the size of the array p points to.
delete[] p;
It can't return a pointer that has the type of the array incorporated, because the size of the array created with new can be calculated at runtime. But types in C++ must be set at compile-time. Thus, new erases that array part, and returns a pointer to the elements instead. Note that you don't need to mess with new in C++. You can use the std::vector template, as recommended by another answer.
Arrays in C/C++ do not store their lengths in memory, so it is impossible to find their size given only a pointer to the first element. Any code using arrays in those languages relies on a constant known size, or on a separate variable being passed around that specifies their size.
A common solution to this, if it does present a problem, is to use the std::vector class from the standard library, which is much closer to a managed (C#) array, i.e. stores its length and additionally has a few useful member functions (for searching and manipulation).
Using std::vector, you can simply call vector.size() to get its size/length.
To count the number of elements in a static array, you can create a template function:
template < typename T, size_t N >
size_t countof( T const (&array)[ N ] )
{
return N;
}
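A short usage sketch of that template, with hypothetical array names. Because the parameter is a reference to an array, passing a plain pointer fails to compile instead of silently returning a wrong count.

```cpp
#include <cstddef>

template <typename T, std::size_t N>
std::size_t countof(T const (&)[N])
{
    return N;
}

// Adds up the element counts of two arrays of different type and length.
inline std::size_t total_elements()
{
    int numbers[5] = {};
    char text[] = "abc";   // 4 elements: 'a', 'b', 'c', '\0'
    // int* p = numbers; countof(p);   // would not compile: p is not an array
    return countof(numbers) + countof(text);
}
```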
For standard containers such as std::vector, the size() function is used. This pattern is also used with boost arrays, which are fixed-size arrays and claim performance no worse than static arrays. The code you have in a comment above should be:
for ( std::vector<T>::size_type i(0); i < entries.size(); ++i )
( where T is the element type of entries; this assumes the size changes in the loop, otherwise hoist it ) rather than treating size as a member variable.
In C/C++, arrays decay to a pointer to their first element when passed to a function, so the function has no way to recover the size or number of elements. You will have to pass an integer indicating the size of the array if you need to use it.
Strings may have their length determined, assuming they are null terminated, by using the strlen() function, but that simply counts until the \0 character.
As Nolrodin pointed out above, it is pretty much impossible to get the size of a plain array in C++ if you only have a pointer to its first element. However, if you have a fixed-size array there is a well-known C trick to work out the number of elements in the array at compile time, namely:
GrmblFx loadsOfElements[1027];
size_t GrmblFx_length = sizeof(loadsOfElements)/sizeof(GrmblFx);