What is Trailing Array Idiom ?
P.S : Googling this term gives The vectors are implemented using the trailing array idiom, thus they are not resizeable without changing the address of the vector object itself.
If you mean the trailing array idiom mentioned in the GCC source code (where your quote comes from), it seems to refer to the old C trick to implement a dynamic array:
typedef struct {
/* header */
size_t nelems;
/* actual array */
int a[1];
} IntVector;
where an array would be created with
IntVector *make_intvector(size_t n)
{
IntVector *v = malloc(sizeof(IntVector) + sizeof(int) * (n-1));
if (v != NULL)
v->nelems = n;
return v;
}
It seems to refer to arrays in structs, which may have a variable array-size. See:
http://blogs.msdn.com/b/oldnewthing/archive/2004/08/26/220873.aspx
and
http://sourceware.org/gdb/current/onlinedocs/gdbint/Support-Libraries.html
Another tip, if you google for an expression put the expression in "" like "trailing array" this will give you more specific results. Google knows about trailing arrays.
I think what is meant is:
struct foo {
... some data members, maybe the length of bar ...
char bar[]; /* last member of foo, char is just an example */
};
It is used by allocating with malloc(sizeof(struct foo)+LEN), where LEN is the desired length of bar. This way only one malloc is needed. The [] can only be used with the last struct member.
And, as fas as I understand the GCC doc, struct foo can also only be (reasonably) used as last member of another struct, because the storage size is not fixed -- or as pointer.
Related
I am reading a book called "Teach Yourself C in 21 Days" (I have already learned Java and C# so I am moving at a much faster pace). I was reading the chapter on pointers and the -> (arrow) operator came up without explanation. I think that it is used to call members and functions (like the equivalent of the . (dot) operator, but for pointers instead of members). But I am not entirely sure.
Could I please get an explanation and a code sample?
foo->bar is equivalent to (*foo).bar, i.e. it gets the member called bar from the struct that foo points to.
Yes, that's it.
It's just the dot version when you want to access elements of a struct/class that is a pointer instead of a reference.
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = malloc(sizeof(struct foo));
var.x = 5;
(&var)->y = 14.3;
pvar->y = 22.4;
(*pvar).x = 6;
That's it!
I'd just add to the answers the "why?".
. is standard member access operator that has a higher precedence than * pointer operator.
When you are trying to access a struct's internals and you wrote it as *foo.bar then the compiler would think to want a 'bar' element of 'foo' (which is an address in memory) and obviously that mere address does not have any members.
Thus you need to ask the compiler to first dereference whith (*foo) and then access the member element: (*foo).bar, which is a bit clumsy to write so the good folks have come up with a shorthand version: foo->bar which is sort of member access by pointer operator.
a->b is just short for (*a).b in every way (same for functions: a->b() is short for (*a).b()).
foo->bar is only shorthand for (*foo).bar. That's all there is to it.
Well I have to add something as well. Structure is a bit different than array because array is a pointer and structure is not. So be careful!
Lets say I write this useless piece of code:
#include <stdio.h>
typedef struct{
int km;
int kph;
int kg;
} car;
int main(void){
car audi = {12000, 230, 760};
car *ptr = &audi;
}
Here pointer ptr points to the address (!) of the structure variable audi but beside address structure also has a chunk of data (!)! The first member of the chunk of data has the same address than structure itself and you can get it's data by only dereferencing a pointer like this *ptr (no braces).
But If you want to acess any other member than the first one, you have to add a designator like .km, .kph, .kg which are nothing more than offsets to the base address of the chunk of data...
But because of the preceedence you can't write *ptr.kg as access operator . is evaluated before dereference operator * and you would get *(ptr.kg) which is not possible as pointer has no members! And compiler knows this and will therefore issue an error e.g.:
error: ‘ptr’ is a pointer; did you mean to use ‘->’?
printf("%d\n", *ptr.km);
Instead you use this (*ptr).kg and you force compiler to 1st dereference the pointer and enable acess to the chunk of data and 2nd you add an offset (designator) to choose the member.
Check this image I made:
But if you would have nested members this syntax would become unreadable and therefore -> was introduced. I think readability is the only justifiable reason for using it as this ptr->kg is much easier to write than (*ptr).kg.
Now let us write this differently so that you see the connection more clearly. (*ptr).kg ⟹ (*&audi).kg ⟹ audi.kg. Here I first used the fact that ptr is an "address of audi" i.e. &audi and fact that "reference" & and "dereference" * operators cancel eachother out.
struct Node {
int i;
int j;
};
struct Node a, *p = &a;
Here the to access the values of i and j we can use the variable a and the pointer p as follows: a.i, (*p).i and p->i are all the same.
Here . is a "Direct Selector" and -> is an "Indirect Selector".
I had to make a small change to Jack's program to get it to run. After declaring the struct pointer pvar, point it to the address of var. I found this solution on page 242 of Stephen Kochan's Programming in C.
#include <stdio.h>
int main()
{
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = &var;
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
Run this in vim with the following command:
:!gcc -o var var.c && ./var
Will output:
5 - 14.30
6 - 22.40
#include<stdio.h>
int main()
{
struct foo
{
int x;
float y;
} var1;
struct foo var;
struct foo* pvar;
pvar = &var1;
/* if pvar = &var; it directly
takes values stored in var, and if give
new > values like pvar->x = 6; pvar->y = 22.4;
it modifies the values of var
object..so better to give new reference. */
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
The -> operator makes the code more readable than the * operator in some situations.
Such as: (quoted from the EDK II project)
typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
IN EFI_BLOCK_IO_PROTOCOL *This,
IN UINT32 MediaId,
IN EFI_LBA Lba,
IN UINTN BufferSize,
OUT VOID *Buffer
);
struct _EFI_BLOCK_IO_PROTOCOL {
///
/// The revision to which the block IO interface adheres. All future
/// revisions must be backwards compatible. If a future version is not
/// back wards compatible, it is not the same GUID.
///
UINT64 Revision;
///
/// Pointer to the EFI_BLOCK_IO_MEDIA data for this device.
///
EFI_BLOCK_IO_MEDIA *Media;
EFI_BLOCK_RESET Reset;
EFI_BLOCK_READ ReadBlocks;
EFI_BLOCK_WRITE WriteBlocks;
EFI_BLOCK_FLUSH FlushBlocks;
};
The _EFI_BLOCK_IO_PROTOCOL struct contains 4 function pointer members.
Suppose you have a variable struct _EFI_BLOCK_IO_PROTOCOL * pStruct, and you want to use the good old * operator to call it's member function pointer. You will end up with code like this:
(*pStruct).ReadBlocks(...arguments...)
But with the -> operator, you can write like this:
pStruct->ReadBlocks(...arguments...).
Which looks better?
#include<stdio.h>
struct examp{
int number;
};
struct examp a,*b=&a;`enter code here`
main()
{
a.number=5;
/* a.number,b->number,(*b).number produces same output. b->number is mostly used in linked list*/
printf("%d \n %d \n %d",a.number,b->number,(*b).number);
}
output is 5
5 5
Dot is a dereference operator and used to connect the structure variable for a particular record of structure.
Eg :
struct student
{
int s.no;
Char name [];
int age;
} s1,s2;
main()
{
s1.name;
s2.name;
}
In such way we can use a dot operator to access the structure variable
The BITMAPINFO structure has the following declaration
typedef struct tagBITMAPINFO {
BITMAPINFOHEADER bmiHeader;
RGBQUAD bmiColors[1];
} BITMAPINFO;
Why is the RGBQUAD array static? Why is it not a pointer?
It is a standard trick to declare a variable sized struct. The color table never has just one entry, it has at least 2 for a monochrome bitmap, typically 256 for a 8bpp bitmap, etc. Indicated by the bmiHeader.biClrUsed member. So the actual size of the struct depends on the bitmap format.
Since the C language doesn't permit declaring such a data structure, this is the closest match. Creating the structure requires malloc() to allocate sufficient bytes to store the structure, calculated from biClrUsed. Then a simple cast to (BITMAPINFO*) makes it usable.
It doesn't matter than it is static or not. The thing is, you'd still have to allocate enough memory for the palette. It is a RGBQuad because it stores only R, G, B, A and nothing more..
example:
for(i = 0; i < 256; i++)
{
lpbmpinfo->bmiColors[i].rgbRed = some_r;
lpbmpinfo->bmiColors[i].rgbGreen = some_g;
lpbmpinfo->bmiColors[i].rgbBlue = some_b;
lpbmpinfo->bmiColors[i].rgbReserved = 0;
}
There's no static keyword in the declaration. It's a completely normal struct member. It's used to declare a variable-sized struct with a single variable-sized array at the end
The size of the array is only known at compile time, but since arrays of size 0 are forbidden in C and C++ so we'll use array[1] instead. See the detailed explanation from MS' Raymond Chen in Why do some structures end with an array of size 1?
On some compilers like GCC zero-length arrays are allowed as an extension so Linux and many other platforms usually use array[0] instead of array[1]
Declaring zero-length arrays is allowed in GNU C as an extension. A zero-length array can be useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
Arrays of Length Zero
In C99 a new feature called flexible array member was introduced. Since then it's better to use array[] for portability
struct vectord {
size_t len;
double arr[]; // the flexible array member must be last
};
See also
What is the purpose of a zero length array in a struct?
What's the need of array with zero elements?
Array of zero length
Is empty array in the end of the structure a C standard?
What is the advantage of using zero-length arrays in C?
I am trying to use XCode for my project and have this code in my .h:
class FileReader
{
private:
int numberOfNodes;
int startingNode;
int numberOfTerminalNodes;
int terminalNode[];
int numberOfTransitions;
int transitions[];
public:
FileReader();
~FileReader();
};
I get a "Field has incomplete type int[]" error on the terminalNode line... but not on the transitions line. What could be going on? I'm SURE that's the correct syntax?
Strictly speaking the size of an array is part of its type, and an array must have a (greater than zero) size.
There's an extension that allows an array of indeterminate size as the last element of a class. This is used to conveniently access a variable sized array as the last element of a struct.
struct S {
int size;
int data[];
};
S *make_s(int size) {
S *s = (S*)malloc(sizeof(S) + sizeof(int)*size);
s->size = size;
return s;
}
int main() {
S *s = make_s(4);
for (int i=0;i<s->size;++i)
s->data[i] = i;
free(s);
}
This code is unfortunately not valid C++, but it is valid C (C99 or C11). If you've inherited this from some C project, you may be surprised that this works there but not in C++. But the truth of the matter is that you can't have zero-length arrays (which is what the incomplete array int transitions[] is in this context) in C++.
Use a std::vector<int> instead. Or a std::unique_ptr<int[]>.
(Or, if you're really really really fussy about not having two separate memory allocations, you can write your own wrapper class which allocates one single piece of memory and in-place constructs both the preamble and the array. But that's excessive.)
The original C use would have been something like:
FileReader * p = malloc(sizeof(FileReader) + N * sizeof(int));
Then you could have used p->transitions[i], for i in [0, N).
Such a construction obviously doesn't make sense in the object model of C++ (think constructors and exceptions).
You can't put an unbound array length in a header -- there is no way for the compiler to know the class size, thus it can never be instantiated.
Its likely that the lack of error on the transitions line is a result of handling the first error. That is, if you comment out terminalNode, transitions should give the error.
It isn't. If you're inside a struct definition, the compiler needs to know the size of the struct, so it also needs to know the size of all its elements. Because int [] means an array of ints of any length, its size is unknown. Either use a fixed-size array (int field[128];) or a pointer that you'll use to malloc memory (int *field;).
I declare the following array:
char* array [2] = { "One", "Two"};
I pass this array to a function. How can I find the length of this array in the function?
You can't find the length of an array after you pass it to a function without extra effort. You'll need to:
Use a container that stores the size, such as vector (recommended).
Pass the size along with it. This will probably require the least modification to your existing code and be the quickest fix.
Use a sentinel value, like C strings do1. This makes finding the length of the array a linear time operation and if you forget the sentinel value your program will likely crash. This is the worst way to do it for most situations.
Use templating to deduct the size of the array as you pass it. You can read about it here: How does this Array Size Template Work?
1 In case you were wondering, most people regret the fact that C strings work this way.
When you pass an array there is NOT an easy way to determine the size within the function.
You can either pass the array size as a parameter
or
use std::vector<std::string>
If you are feeling particularly adventurous you can use some advanced template techniques
In a nutshell it looks something like
template <typename T, size_t N>
void YourFunction( T (&array)[N] )
{
size_t myarraysize = N;
}
C is doing some trickery behind your back.
void foo(int array[]) {
/* ... */
}
void bar(int *array) {
/* ... */
}
Both of these are identical:
6.3.2.1.3: Except when it is the operand of the sizeof operator or the unary & operator,
or is a string literal used to initialize an array, an expression that has type
‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’
that points to the initial element of the array object and is not an lvalue. If
the array object has register storage class, the behavior is undefined.
As a result, you don't know, inside foo() or bar(), if you were
called with an array, a portion of an array, or a pointer to a single
integer:
int a[10];
int b[10];
int c;
foo(a);
foo(&b[1]);
foo(&c);
Some people like to write their functions like: void foo(int *array)
just to remind themselves that they weren't really passed an array,
but rather a pointer to an integer and there may or may not be more
integers elsewhere nearby. Some people like to write their functions
like: void foo(int array[]), to better remind themselves of what the
function expects to be passed to it.
Regardless of which way you like to do it, if you want to know how long
your array is, you've got a few options:
Pass along a length paramenter too. (Think int main(int argc, char
*argv)).
Design your array so every element is non-NULL, except the last
element. (Think char *s="almost a string"; or execve(2).)
Design your function so it takes some other descriptor of the
arguments. (Think printf("%s%i", "hello", 10); -- the string describes
the other arguments. printf(3) uses stdarg(3) argument handling, but
it could just as easily be an array.)
Getting the array-size from the pointer isn't possible. You could just terminate the array with a NULL-pointer. That way your function can search for the NULL-pointer to know the size, or simply just stop processing input once it hits the NULL...
If you mean how long are all the strings added togather.
int n=2;
int size=0;
char* array [n] = { "One", "Two"};
for (int i=0;i<n;++i)
size += strlen(array[i];
Added:
yes thats what im currently doing but i wanted to remove that extra
paramater. oh well –
Probably going to get a bad response for this, but you could always use the first pointer to store the size, as long as you don't deference it or mistake it for actually being a pointer.
char* array [] = { (char*)2,"One", "Two"};
long size=(long)array[0];
for(int i=1; i<= size;++i)
printf("%s",array[i]);
Or you could NULL terminate your array
char* array [] = { "One", "Two", (char*)0 };
for(int i=0;array[i]!=0;++i)
{
printf("%s",array[i]);
}
Use the new C++11 std::array
http://www.cplusplus.com/reference/stl/array/
the standard array has the size method your looking for
If I have a typedef of a struct
typedef struct
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[];
} GPATH2;
and it contains an uninitialized array, how can I create an instance of this type so that is will hold, say, 4 values in Nodes[]?
Edit: This belongs to an API for a program written in Assembler. I guess as long as the underlying data in memory is the same, an answer changing the struct definition would work, but not if the underlying memory is different. The Assembly Language application is not using this definition .... but .... a C program using it can create GPATH2 elements that the Assembly Language application can "read".
Can I ever resize Nodes[] once I have created an instance of GPATH2?
Note: I would have placed this with a straight C tag, but there is only a C++ tag.
You could use a bastard mix of C and C++ if you really want to:
#include <new>
#include <cstdlib>
#include "definition_of_GPATH2.h"
using namespace std;
int main(void)
{
int i;
/* Allocate raw memory buffer */
void * raw_buffer = calloc(1, sizeof(GPATH2) + 4 * sizeof(GPOINT2));
/* Initialize struct with placement-new */
GPATH2 * path = new (raw_buffer) GPATH2;
path->Count = 4;
for ( i = 0 ; i < 4 ; i++ )
{
path->Nodes[i].x = rand();
path->Nodes[i].y = rand();
}
/* Resize raw buffer */
raw_buffer = realloc(raw_buffer, sizeof(GPATH2) + 8 * sizeof(GPOINT2));
/* 'path' still points to the old buffer that might have been free'd
* by realloc, so it has to be re-initialized
* realloc copies old memory contents, so I am not certain this would
* work with a proper object that actaully does something in the
* constructor
*/
path = new (raw_buffer) GPATH2;
/* now we can write more elements of array */
path->Count = 5;
path->Nodes[4].x = rand();
path->Nodes[4].y = rand();
/* Because this is allocated with malloc/realloc, free it with free
* rather than delete.
* If 'path' was a proper object rather than a struct, you should
* call the destructor manually first.
*/
free(raw_buffer);
return 0;
}
Granted, it's not idiomatic C++ as others have observed, but if the struct is part of legacy code it might be the most straightforward option.
Correctness of the above sample program has only been checked with valgrind using dummy definitions of the structs, your mileage may vary.
If it is fixed size write:
typedef struct
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[4];
} GPATH2;
if not fixed then change declaration to
GPOINT2* Nodes;
after creation or in constructor do
Nodes = new GPOINT2[size];
if you want to resize it you should use vector<GPOINT2>, because you can't resize array, only create new one. If you decide to do it, don't forget to delete previous one.
also typedef is not needed in c++, you can write
struct GPATH2
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[4];
};
This appears to be a C99 idiom known as the "struct hack". You cannot (in standard C99; some compilers have an extension that allows it) declare a variable with this type, but you can declare pointers to it. You have to allocate objects of this type with malloc, providing extra space for the appropriate number of array elements. If nothing holds a pointer to an array element, you can resize the array with realloc.
Code that needs to be backward compatible with C89 needs to use
GPOINT2 Nodes[1];
as the last member, and take note of this when allocating.
This is very much not idiomatic C++ -- note for instance that you would have to jump through several extra hoops to make new and delete usable -- although I have seen it done. Idiomatic C++ would use vector<GPOINT2> as the last member of the struct.
Arrays of unknown size are not valid as C++ data members. They are valid in C99, and your compiler may be mixing C99 support with C++.
What you can do in C++ is 1) give it a size, 2) use a vector or another container, or 3) ditch both automatic (local variable) and normal dynamic storage in order to control allocation explicitly. The third is particularly cumbersome in C++, especially with non-POD, but possible; example:
struct A {
int const size;
char data[1];
~A() {
// if data was of non-POD type, we'd destruct data[1] to data[size-1] here
}
static auto_ptr<A> create(int size) {
// because new is used, auto_ptr's use of delete is fine
// consider another smart pointer type that allows specifying a deleter
A *p = ::operator new(sizeof(A) + (size - 1) * sizeof(char));
try { // not necessary in our case, but is if A's ctor can throw
new(p) A(size);
}
catch (...) {
::operator delete(p);
throw;
}
return auto_ptr<A>(p);
}
private:
A(int size) : size (size) {
// if data was of non-POD type, we'd construct here, being very careful
// of exception safety
}
A(A const &other); // be careful if you define these,
A& operator=(A const &other); // but it likely makes sense to forbid them
void* operator new(size_t size); // doesn't prevent all erroneous uses,
void* operator new[](size_t size); // but this is a start
};
Note you cannot trust sizeof(A) any where else in the code, and using an array of size 1 guarantees alignment (matters when the type isn't char).
This type of structure is not trivially useable on the stack, you'll have to malloc it. the significant thing to know is that sizeof(GPATH2) doesn't include the trailing array. so to create one, you'd do something like this:
GPATH2 *somePath;
size_t numPoints;
numPoints = 4;
somePath = malloc(sizeof(GPATH2) + numPoints*sizeof(GPOINT2));
I'm guessing GPATH2.Count is the number of elements in the Nodes array, so if it's up to you to initialize that, be sure and set somePath->Count = numPoints; at some point. If I'm mistaken, and the convention used is to null terminate the array, then you would do things just a little different:
somePath = malloc(sizeof(GPATH2) + (numPoints+1)*sizeof(GPOINT2));
somePath->Nodes[numPoints] = Some_Sentinel_Value;
make darn sure you know which convention the library uses.
As other folks have mentioned, realloc() can be used to resize the struct, but it will invalidate old pointers to the struct, so make sure you aren't keeping extra copies of it (like passing it to the library).