I wrote the following code:
int tester(int n)
{
int arr[n];
// ...
}
This code compiled, no warnings, using g++.
My question is - how? The parameter n is known only at runtime, yet the array is statically allocated. How does gcc compile this?
This is an extension that GCC offers for C++, though variable-length arrays ("VLAs") are properly supported by C since C99.
The implementation isn't terribly hard; on a typical call-stack implementation, the function only needs to save the base of the stack frame and then advance the stack pointer by the dynamically specified amount. VLAs always come with the caveat that if the number is too large, you get undefined behaviour (typically manifesting as a stack overflow), which makes them much trickier to use correctly than, say, std::vector.
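For contrast, here is a minimal sketch of the heap-based alternative that caveat alludes to; std::vector reports allocation failure via an exception instead of silently smashing the stack:

#include <vector>

int tester(int n)
{
    std::vector<int> arr(n);  // storage lives on the heap; throws std::bad_alloc on failure
    // ...
    return 0;
}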
There had at some point been an effort to add a similar feature to C++, but it turned out to be surprisingly difficult in terms of the type system (e.g. what is the type of arr? How does it get deduced in function templates?). The problems are less visible in C, which has a much simpler type system and object model. (That said, you can still argue that C is worse off for having VLAs: a considerable part of the standard is spent on them, and the language would have been quite a bit simpler without them, and not necessarily poorer for it.)
The GNU C library provides a function to allocate memory on the stack - alloca(3). It simply decrements the stack pointer thus creating some scratch space on it. GCC uses alloca(3) to implement C99 variable-length arrays - it first decrements the stack pointer in the function prologue to create space for all automatic variables, whose size is known at compile time, and then uses alloca(3) to further decrement it and make space for arr with size as determined at run-time. The optimiser might actually fuse both decrements.
int tester(int n)
{
int arr[n];
return 0;
}
compiles into
;; Function tester (tester)
tester (int n)
{
int arr[0:D.1602] [value-expr: *arr.1];
int[0:D.1602] * arr.1;
long unsigned int D.1610;
int n.0;
...
<bb 2>:
n.0 = n;
...
D.1609 = (long unsigned int) n.0;
D.1610 = D.1609 * 4;
D.1612 = __builtin_alloca (D.1610); <----- arr is allocated here
arr.1 = (int[0:D.1602] *) D.1612;
...
That's equivalent to the following C code:
int tester(int n)
{
int *arr = __builtin_alloca(n * sizeof(int));
return 0;
}
__builtin_alloca() is GCC's internal implementation of alloca(3).
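For illustration, a hedged sketch of what direct use of alloca(3) looks like; alloca is non-standard (declared in <alloca.h> on glibc), and there is no way to check it for failure:

#include <alloca.h>
#include <string.h>

void demo(int n)
{
    // space is reclaimed automatically when demo returns, just like the VLA above
    int *scratch = (int *) alloca(n * sizeof(int));
    memset(scratch, 0, n * sizeof(int));
}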
The toy code is quite simple:
#include <stdio.h>
#include <math.h>
#define A 10
#define SIZE (int)(ceil(A / 2)) // needs computation but known at compile-time
struct node {
int keys[SIZE];
};
int main() {
node a_node;
for (int i = 0; i < SIZE; ++i) {
a_node.keys[i] = i;
printf("%d\n", a_node.keys[i]);
}
return 0;
}
When I compile it using g++ -std=c++11 -O3 test.cpp -o test, it compiles and runs as expected; but using the Intel compiler with icpc -std=c++11 -O3 test.cpp -o test, an error occurs:
test.cpp(8): error: function call must have a constant value in a constant expression
int keys[SIZE];
Why is it regarded as a function call? And if I use simple arithmetic in the macro instead (e.g. #define SIZE (A / 2)), the error disappears.
gcc version: 4.8.5; icc version: 16.0.3.
Any clue how to solve this issue? Thanks.
------------------ ADDED ------------------
If int keys[SIZE] is declared in the main function instead of the struct, i.e. like this:
#include <stdio.h>
#include <math.h>
#define A 10
#define SIZE (int)(ceil(A / 2))
/*struct node {
int keys[SIZE];
};*/
int main() {
//node a_node;
int keys[SIZE];
for (int i = 0; i < SIZE; ++i) {
keys[i] = i;
printf("%d\n", keys[i]);
}
return 0;
}
it will pass both the GNU and Intel compilers. Quite interesting and weird. I am not sure if I made any mistake that I am not aware of; any possible clue?
#define A 10
#define SIZE (int)(ceil(A / 2))
Your SIZE is not a compile-time constant, because ceil is a floating-point function from <math.h>. So the compiler treats the variable int keys[SIZE]; as a variable-length array (VLA), whose size is computed at runtime. Standard C++11 and C++14 do not have VLAs, but GCC accepts them as an extension to the language. Notice that if you declare your struct node then keys is not a variable but a field or member.
BTW, if you replace the define of SIZE with
#define SIZE ((A+1)/2)
it becomes a compile-time constant (e.g. a constexpr in C++ parlance) and has the same value as before.
BTW, your code is not genuine C++11, but looks like C11 code. I would suggest installing a newer version of GCC (in March 2017, use GCC 6, not some old 4.8). Then use constexpr and std::array, as sketched below.
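A sketch of what that looks like, assuming A stays a compile-time constant as in your toy code:

#include <array>
#include <cstdio>

constexpr int A = 10;
constexpr int SIZE = (A + 1) / 2;   // integer ceiling of A/2, computed at compile time

struct node {
    std::array<int, SIZE> keys;     // the size is now a genuine constant expression
};

int main() {
    node a_node;
    for (int i = 0; i < SIZE; ++i) {
        a_node.keys[i] = i;
        std::printf("%d\n", a_node.keys[i]);
    }
    return 0;
}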
VLAs cannot occur as fields in structs. C99 and C11 have flexible array members as the possible last member of a struct; C++11 does not have them.
Probably, your GCC compiler (and the <math.h> header) knows that ceil can be expanded as __builtin_ceil, which is handled specially by the GCC compiler (but not by others); so SIZE is expanded as (int)(ceil(10 / 2)) which, with GCC specifically, becomes a compile-time constant thanks to __builtin_ceil.
You should understand that C and C++ are different languages. If you choose to code in C++ (at least C++11, anything older is obsolete in 2017) you should use its standard containers (and you probably want std::vector since the size is runtime computed). If you choose to code in C (at least C99), consider using flexible array members and allocate your flexible structure in the heap (appropriately using malloc or friends). Notice that C++ does not have flexible array members.
Contrary to your title, your SIZE is not (according to the standard specifications) a "compile-time-known array size" (though it becomes one when optimized by gcc or g++).
An even better reason to use heap allocation (that is, either std::vector in C++, or, in C, a pointer to some struct ending with a flexible array member, or a pointer to an array) is that in practice you'll want a much bigger A (e.g. #define A 100000 ...), and it is not reasonable to allocate large data structures on the call stack (often limited to a megabyte or a few megabytes on current machines and operating systems).
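A minimal sketch of that heap-based C++ route, assuming a large A:

#include <vector>

int main() {
    const int A = 100000;
    const int SIZE = (A + 1) / 2;
    std::vector<int> keys(SIZE);     // lives on the heap, not the limited call stack
    for (int i = 0; i < SIZE; ++i)
        keys[i] = i;
    return 0;
}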
The code below generates a compiler warning:
void test()
{
byte buffer[100];
for (int i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
warning: comparison between signed and unsigned integer expressions
[-Wsign-compare]
This is because sizeof() returns a size_t, which is unsigned.
I have seen a number of suggestions for how to deal with this, but none with a preponderance of support and none with any convincing logic nor any references to support one approach as clearly "better." The most common suggestions seem to be:
ignore the warnings
turn off the warnings
use a loop variable of type size_t
use a loop variable of type size_t with tricks to avoid decrementing past zero
cast sizeof(buffer) to an int
some extremely convoluted suggestions that I did not have the patience to follow because they involved unreadable code, generally involving vectors and/or iterators
libraries that I cannot load in the AVR / ARM embedded environments I often use.
free functions returning a valid int or long representing the byte count of T
Don't use loops (gotta love that advice)
Is there a "correct" way to approach this?
-- Begin Edit --
The example I gave is, of course, trivial, and meant only to demonstrate the type mismatch warning that can occur in an indexing situation.
#3 is not necessarily the obviously correct answer, because size_t carries special risks in a decrementing loop such as
for (size_t i = myArray.size() - 1; i >= 0; --i)
(the array may someday have a size of zero, and in any case i >= 0 is always true for an unsigned i, so the loop never terminates).
#4 is a suggestion to deal with decrementing size_t indexes by including the checks necessary to avoid ever decrementing past zero. Since those checks make the code harder to read, there are some cute shortcuts for them that are not particularly readable either, hence my referring to them as "tricks."
#7 is a suggestion to use libraries that are not generalizable in the sense that they may not be available or appropriate in every setting.
#8 is a suggestion to keep the checks readable, but to hide them in a non-member method, sometimes referred to as a "free function."
#9 is a suggestion to use algorithms rather than loops. This was offered many times as a solution to the size_t indexing problem, and there were a lot of upvotes. I include it even though I can't use the stl library in most of my environments and would have to write the code myself.
-- End Edit--
I am hoping for evidence-based guidance or references as to best practices for handling something like this. Is there a "standard text" or a style guide somewhere that addresses the question? A defined approach that has been adopted/endorsed internally by a major tech company? An emulatable solution forthcoming in a new language release? If necessary, I would be satisfied with an unsupported public recommendation from a single widely recognized expert.
None of the options on offer seem very appealing. The warnings drown out other things I want to see, but I don't want to miss signed/unsigned comparisons in places where they might matter. Decrementing a loop variable of type size_t with the comparison >= 0 results in an infinite loop from unsigned integer wraparound, and even if we protect against that with something like for (size_t i = sizeof(buffer); i-- > 0 ;), there are other issues with incrementing/decrementing/comparing size_t variables. Computing size_t - 1 will yield a huge positive 'oops' number when the size_t is unexpectedly zero (e.g. strlen(myEmptyString)). And casting a size_t to an int risks losing the value, since size_t can be wider than an int.
Given that my arrays are of known sizes well below INT_MAX, it seems to me that casting size_t to a signed integer is the best of the bunch, but it makes me cringe a little bit. Especially if it has to be static_cast<int>. Easier to take if it's hidden in a function call with some size testing, but still...
Or perhaps there's a way to turn off the warnings, but just for loop comparisons?
I find any of the three following approaches equally good.
Use a variable of type int to store the size and compare the loop variable to it.
byte buffer[100];
int size = sizeof(buffer);
for (int i = 0; i < size; ++i)
{
buffer[i] = 0;
}
Use size_t as the type of the loop variable.
byte buffer[100];
for (size_t i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
Use a pointer.
byte buffer[100];
byte* end = buffer + sizeof(buffer);
for (byte* p = buffer; p < end; ++p)
{
*p = 0;
}
If you are able to use a C++11 compiler, you can also use a range-based for loop.
byte buffer[100];
for (byte& b : buffer)
{
b = 0;
}
The most appropriate solution will depend entirely on context. In the context of the code fragment in your question, the most appropriate action is perhaps to have type agreement - the third option in your bullet list. This is appropriate here because the usage of i throughout the code is only to index the array, so the use of int is inappropriate - or at least unnecessary.
On the other hand, if i were an arithmetic object involved in some arithmetic expression that was itself signed, then int might be appropriate and a cast would be in order.
I would suggest that as a guideline, a solution that involves the fewest number of necessary type casts (explicit or implicit) is appropriate - or, to look at it another way, the maximum possible type agreement. There is no one "authoritative" rule, because the purpose and usage of the variables involved is semantically rather than syntactically dependent. In this case also, as has been pointed out in other answers, newer language features supporting iteration may avoid this specific issue altogether.
To discuss the advice you say you have been given specifically:
ignore the warnings
Never a good idea - some will be genuine semantic errors or maintenance issues, and by the time you have several hundred warnings you are ignoring, how will you spot the one warning that is an issue?
turn off the warnings
An even worse idea; the compiler is helping you to improve your code quality and reliability. Why would you disable that?
use a loop variable of type size_t
In this precise example, that is exactly what you should do; exact type agreement should always be the aim.
use a loop variable of type size_t with tricks to avoid decrementing past zero
This advice is irrelevant for the trivial example given. Moreover, I presume that by "tricks" the adviser in fact means checks, or just correct code. There is no need for "tricks", and the term is entirely ambiguous - who knows what the adviser means? It suggests something unconventional and a bit "dirty", when there is no need for any solution with such attributes.
cast sizeof(buffer) to an int
This may be necessary if the usage of i warrants the use of int for correct semantics elsewhere in the code. The example in the question does not, so this would not be an appropriate solution in this case. Essentially, if making i a size_t here causes type-agreement warnings elsewhere that cannot themselves be resolved by universal type agreement for all operands in an expression, then a cast may be appropriate. The aim should be to achieve zero warnings with the minimum number of type casts.
some extremely convoluted suggestions that I did not have the patience to follow, generally involving vectors and/or iterators
If you are not prepared to elaborate or even consider such advice, you would have been better off omitting it from your question. In any case, the use of STL containers is not always appropriate for a large segment of embedded targets; excessive code-size increase and non-deterministic heap management are reasons to avoid them on many platforms and in many applications.
libraries that I cannot load in an embedded environment.
Not all embedded environments have equal constraints. The restriction is on your embedded environment, not by any means on all embedded environments. In any case, "loading libraries" to resolve or avoid type-agreement issues seems like a sledgehammer to crack a nut.
free functions returning a valid int or long representing the byte count of T
It is not clear what that means. What is a "free function"? Is that just a non-member function? Such a function would necessarily contain a type cast internally, so what have you achieved other than hiding the cast?
Don't use loops (gotta love that advice).
I doubt you needed to include that advice in your list. The problem is not limited to loops in any case; you get the warning not because you are using a loop, but because you have used < with mismatched types.
My favorite solution is to use C++11 or newer and skip the whole manual size bounding entirely like so:
// assuming byte is defined by something like using byte = std::uint8_t;
void test()
{
byte buffer[100];
for (auto&& b: buffer)
{
b = 0;
}
}
Alternatively, if I can't use the range-based for loop (but can still use C++11 or newer), my favorite syntax becomes:
void test()
{
byte buffer[100];
for (auto i = decltype(sizeof(buffer)){0}; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
Or for iterating backwards:
void test()
{
byte buffer[100];
// relies on the well-defined modular wraparound of unsigned integers
for (auto i = sizeof(buffer) - 1; i < sizeof(buffer); --i)
{
buffer[i] = 0;
}
}
The correct generic way is to use a loop iterator of type size_t, simply because that is the most correct type for describing an array size.
There is not much need for "tricks to avoid decrementing past zero", because the size of an object can never be negative.
If you find yourself needing negative numbers to describe a variable size, it is probably because you have some special case where you are iterating across an array backwards. If so, the "trick" to deal with it is this:
for(size_t i=0; i<sizeof(array); i++)
{
size_t index = sizeof(array)-1 - i;
array[index] = something;
}
However, size_t is often an inconvenient type to use in embedded systems, because it may end up as a larger type than what your MCU can handle with one instruction, resulting in needlessly inefficient code. It may then be better to use a fixed width integer such as uint16_t, if you know the maximum size of the array in advance.
Using plain int in an embedded system is almost certainly incorrect practice. Your variables must be of deterministic size and signedness - most variables in an embedded system are unsigned. Signed variables also lead to major problems whenever you need to use bitwise operators.
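A sketch of that fixed-width approach, assuming the array is known to stay below 65535 bytes; storing the size in a uint16_t first keeps both comparison operands promoting to int, so no sign-compare warning fires:

#include <stdint.h>

uint8_t buffer[100];

void clear_buffer(void)
{
    const uint16_t n = sizeof(buffer);  // fits: the array is far smaller than 65535 bytes
    for (uint16_t i = 0; i < n; i++)    // uint16_t vs uint16_t: both promote to int
    {
        buffer[i] = 0;
    }
}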
If you are able to use C++ 11, you could use decltype to obtain the actual type of what sizeof returns, for instance:
void test()
{
byte buffer[100];
// On macOS decltype(sizeof(buffer)) returns unsigned long, this passes
// the compiler without warnings.
for (decltype(sizeof(buffer)) i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
I'm using a C++ app named CppDroid on my S6 to make a quick program.
How do you use a size_t as the limiter for a "for loop" on its counter?
int c;
//... more codes here...
for (c=0; c < a.used; ++c)
//... more codes here...
a.used is the number of elements in use; it comes from a solution for making a dynamically sized array
The error is: comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
The for loop is one of the inner loops of a nest in the program, so I want to keep the variable c an "int" as much as possible.
I've seen examples about comparing an int with a size_t, but I'm not sure how they can help, since they are for an "if" condition.
Just use std::size_t c instead of int c.
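A minimal sketch of that fix, with a stand-in struct shaped like the asker's dynamic array (the real type is not shown in the question):

#include <cstddef>

struct dyn_array { int* data; std::size_t used; };  // hypothetical stand-in

void process(dyn_array& a)
{
    for (std::size_t c = 0; c < a.used; ++c)
    {
        a.data[c] = 0;  // ... more code here ...
    }
}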
As long as a.used doesn't change during the iteration, a common idiom is:
for(int c=0, n=a.used; c<n; ++c) {
...
}
In this way, the cast happens implicitly, and you also have the "total number of elements" variable n handy in the loop body. Also, when n comes from a method call (say, vec.size()), you evaluate it just once, which is slightly more efficient.[1]
[1] In theory the compiler may do this optimization by itself, but with "complicated" stuff like std::vector and a nontrivial loop body it's surprisingly difficult to prove that the size is a loop invariant, so it's often just recalculated at each iteration.
Regarding
comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
… this is a warning. It's not an error that prevents creation of an executable, unless you've asked the compiler to treat warnings as errors.
A very direct way to address it is to use a cast, int(a.used).
More generally I recommend defining a common function to do that, e.g. named n_items (C++17 will have a size function, unfortunately with unsigned result and conflating two or more logical functions, so that name's taken):
using My_array = ...; // Whatever
using Size = ptrdiff_t;
auto n_items( My_array const& a )
-> Size
{ return a.used; }
then for your loop:
for( int c = 0; c < n_items( a ); ++c )
By the way it's generally Not A Good Idea™ to reuse a variable, like c here. I'm assuming that that reuse was unintentional. The example above shows how to declare the loop variable in the for loop head.
Also, as Matteo Italia notes in his answer, it can sometimes be a good idea to manually optimize a loop like this, if measuring shows it to be a bottleneck. That's because the compiler can't easily prove that the result of the n_items call, or any other dynamic array size expression, is the same (is “invariant”) in all executions of the loop body.
Thus, if measuring tells you that the possibly repeated size expression evaluations are a bottleneck, you can do e.g.
for( int c = 0, n = n_items( a ); c < n; ++c )
It's worth noting that any manual optimization carries costs, which are not easy to measure, but which are severe enough that the usual advice is to defer optimization until measurements tell you that it's really needed.
I have two ways of constructing a 2D array:
int arr[NUM_ROWS][NUM_COLS];
//...
tmp = arr[i][j]
and flattened array
int arr[NUM_ROWS*NUM_COLS];
//...
tmp = arr[i*NUM_COLS+j];
I am doing image processing, so even a little improvement in access time matters. Which one is faster? I am thinking the first one, since the second one needs a calculation, but then the first one requires two address lookups, so I am not sure.
I don't think there is any performance difference. The system will allocate the same amount of contiguous memory in both cases. As for calculating i*NUM_COLS+j: either you do it yourself in the 1D case, or the system does it for you in the 2D case. The only concern is ease of use.
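A small sketch illustrating that equivalence: both layouts are one contiguous block, and the 2D subscript compiles down to the same index arithmetic you would write by hand (formally the pointer comparison crosses row bounds, but it shows the layout):

#include <cassert>

enum { NUM_ROWS = 4, NUM_COLS = 8 };

int main() {
    int arr2d[NUM_ROWS][NUM_COLS];
    int i = 2, j = 5;
    // &arr2d[i][j] is base + (i * NUM_COLS + j) elements: exactly the
    // flattened-array formula from the question
    assert(&arr2d[i][j] == &arr2d[0][0] + (i * NUM_COLS + j));
    return 0;
}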
You should trust the capabilities of your compiler in optimizing standard code, and you should trust modern CPUs to have fast numeric multiplication instructions. Don't bother with one or the other!
I once - decades ago - optimized some code greatly by using pointers instead of 2D-array index calculations, but this will a) only be useful if it is an option to store the pointer, e.g. in a loop, and b) have low impact, since I guess modern CPUs should do 2D array access in a single cycle? Worth measuring! It may be related to the array size.
In any case, pointers using ptr++ or ptr += NUM_COLS will for sure be a little bit faster, if applicable! (See the sketch below.)
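A sketch of that pointer-walking idea (names are illustrative): march one pointer across the buffer instead of recomputing i*NUM_COLS+j per element:

enum { NUM_ROWS = 4, NUM_COLS = 8 };

void fill(int arr[NUM_ROWS * NUM_COLS])
{
    int* p = arr;
    for (int i = 0; i < NUM_ROWS * NUM_COLS; ++i)
        *p++ = i;   // sequential access, no per-element multiplication
}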
The first method will almost always be faster. IN GENERAL (because there are always corner cases) processor and memory architecture as well as compilers may have optimizations built in to aid with 2d arrays or other similar data structures. For example, GPUs are optimized for matrix (2d array) math.
So, again in general, I would allow the compiler and hardware to optimize your memory and address arithmetic if possible.
...also, I agree with @Paul R: there are much bigger considerations when it comes to performance than your array allocation and address arithmetic.
There are two cases to consider: compile-time definition and run-time definition of the array size. There is a big difference in performance.
Static allocation, global or file scope, fixed size array:
The compiler knows the size of the array and tells the linker to allocate space in the data / memory section. This is the fastest method.
Example:
#define ROWS 5
#define COLUMNS 6
int array[ROWS][COLUMNS];
int buffer[ROWS * COLUMNS];
Run time allocation, function local scope, fixed size array:
The compiler knows the size of the array, and tells the code to allocate space in the local memory (a.k.a. stack) for the array. In general, this means adding a value to a stack register. Usually one or two instructions.
Example:
void my_function(void)
{
unsigned short my_array[ROWS][COLUMNS];
unsigned short buffer[ROWS * COLUMNS];
}
Run Time allocation, dynamic memory, fixed size array:
Again, the compiler has already calculated the amount of memory required for the array, since it was declared with a fixed size. The compiler emits code to call the memory allocation function with the required amount (usually passed as a parameter). A little slower because of the function call and the overhead required to find a suitable block of dynamic memory (the allocator's bookkeeping).
Example:
void another_function(void)
{
unsigned char * array = new unsigned char [ROWS * COLUMNS];
//...
delete[] array;
}
Run Time allocation, dynamic memory, variable size:
Regardless of the dimensions of the array, the compiler must emit code to calculate the amount of memory to allocate. This quantity is then passed to the memory allocation function. A little slower than above because of the code required to calculate the size.
Example:
int * create_board(unsigned int rows, unsigned int columns)
{
int * board = new int [rows * columns];
return board;
}
Since your goal is image processing, I would assume your images are too large for static arrays. The correct question to ask is about dynamically allocated arrays.
In C/C++ there are multiple ways to allocate a dynamic 2D array (see: How do I work with dynamic multi-dimensional arrays in C?). To make the code work in both C and C++, we can use malloc with a cast (in C++ only, you can use new).
Method 1:
int** arr1 = (int**)malloc(NUM_ROWS * sizeof(int*));
for(int i=0; i<NUM_ROWS; i++)
arr1[i] = (int*)malloc(NUM_COLS * sizeof(int));
Method 2:
int** arr2 = (int**)malloc(NUM_ROWS * sizeof(int*));
int* arrflat = (int*)malloc(NUM_ROWS * NUM_COLS * sizeof(int));
for (int i = 0; i < NUM_ROWS; i++)
arr2[i] = arrflat + (i*NUM_COLS);
Method 2 essentially creates a contiguous 2D array: i.e. arrflat[NUM_COLS*i+j] and arr2[i][j] should have identical performance. However, arrflat[NUM_COLS*i+j] and arr1[i][j] from method 1 should not be expected to have identical performance, since arr1 is not contiguous. Method 1, however, seems to be the method most commonly used for dynamic arrays.
In general, I use arrflat[NUM_COLS*i+j] so I don't have to think about how to allocate dynamic 2D arrays.
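One practical difference worth noting is cleanup; a sketch matching the two methods above:

#include <stdlib.h>

void free_method1(int** arr1, int num_rows)
{
    for (int i = 0; i < num_rows; i++)
        free(arr1[i]);  /* one free per row... */
    free(arr1);         /* ...plus one for the row-pointer table */
}

void free_method2(int** arr2)
{
    free(arr2[0]);      /* frees the whole contiguous block (arrflat) */
    free(arr2);         /* frees the row-pointer table */
}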
Can someone point me the to the implementation of sizeof operator in C++ and also some description about its implementation.
sizeof is one of the operators that cannot be overloaded.
So does that mean we cannot change its default behavior?
sizeof is not a real operator in C++. It is merely special syntax which inserts a constant equal to the size of the argument. sizeof doesn't need or have any runtime support.
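A minimal illustration of that point: sizeof yields an integral constant expression, so it can size another array or feed a static_assert, all before the program ever runs:

char buf[64];
char copy[sizeof buf];                    // array bound must be a constant
static_assert(sizeof buf == 64, "size");  // checked entirely at compile time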
Edit: do you want to know how to determine the size of a class/structure looking at its definition? The rules for this are part of the ABI, and compilers merely implement them. Basically the rules consist of
size and alignment definitions for primitive types;
structure, size and alignment of the various pointers;
rules for packing fields in structures;
rules about virtual table-related stuff (more esoteric).
However, ABIs are platform- and often vendor-specific, i.e. on x86 and (say) IA64 the size of A below will be different because IA64 does not permit unaligned data access.
struct A
{
char i ;
int j ;
} ;
assert (sizeof (A) == 5) ; // x86, MSVC #pragma pack(1)
assert (sizeof (A) == 8) ; // x86, MSVC default
assert (sizeof (A) == 16) ; // IA64
http://en.wikipedia.org/wiki/Sizeof
Basically, to quote Bjarne Stroustrup's C++ FAQ:
Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:
X a[10];
X* p = &a[3];
X* q = &a[3];
p++; // p points to a[4]
// thus the integer value of p must be
// sizeof(X) larger than the integer value of q
Thus, sizeof(X) could not be given a new and different meaning by the programmer without violating basic language rules.
No, you can't change it. What do you hope to learn from seeing an implementation of it?
What sizeof does can't be written in C++ using more basic operations. It's not a function, or part of a library header like e.g. printf or malloc. It's inside the compiler.
Edit: If the compiler is itself written in C or C++, then you can think of the implementation being something like this:
size_t calculate_sizeof(expression_or_type)
{
if (is_type(expression_or_type))
{
if (is_array_type(expression_or_type))
{
return array_size(expression_or_type) *
calculate_sizeof(underlying_type_of_array(expression_or_type));
}
else
{
switch (expression_or_type)
{
case int_type:
case unsigned_int_type:
return 4; //for example
case char_type:
case unsigned_char_type:
case signed_char_type:
return 1;
case pointer_type:
return 4; //for example
//etc., for all the built-in types
case class_or_struct_type:
{
int base_size = compiler_overhead(expression_or_type);
for (/*loop over each class member*/)
{
base_size += calculate_sizeof(class_member) +
padding(class_member);
}
return round_up_to_multiple(base_size,
alignment_of_type(expression_or_type));
}
case union_type:
{
int max_size = 0;
for (/*loop over each class member*/)
{
max_size = max(max_size,
calculate_sizeof(class_member));
}
return round_up_to_multiple(max_size,
alignment_of_type(expression_or_type));
}
}
}
}
else
{
return calculate_sizeof(type_of(expression_or_type));
}
}
Note that this is very much pseudo-code. There's a lot I haven't included, but it conveys the general idea. The compiler probably doesn't actually do this: it most likely calculates the size of a type (including a class) once and stores it, instead of recalculating it every time you write sizeof(X). It is also allowed to, for example, have pointers of different sizes depending on what they point to.
sizeof does what it does at compile time. Operator overloads are simply functions, and do what they do at run time. It is therefore not possible to overload sizeof, even if the C++ Standard allowed it.
sizeof is a compile-time operator, which means that it is evaluated at compile-time.
It cannot be overloaded, because it already has a meaning on all user-defined types - the sizeof() a class is the size that the object the class defines takes in memory, and the sizeof() a variable is the size that the object the variable names occupies in memory.
Unless you need to see how C++-specific sizes are calculated (such as allocation for the v-table), you can look at Plan9's C compiler. It's much simpler than trying to tackle g++.
Variable:
#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))
Type:
#define getsize_type(type) ( (char*)((type*)(1) + 1) - (char*)((type *)(1)))
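A quick usage sketch of the two macros; note that both rely on pointer-arithmetic tricks ((type*)(1) in particular is formally undefined behaviour, though common compilers accept it):

#include <assert.h>
#include <stddef.h>

int main(void)
{
    double d;
    assert((size_t) getsize_var(d) == sizeof d);
    assert((size_t) getsize_type(double) == sizeof(double));
    return 0;
}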
Take a look at the source for the GNU C++ compiler for a real-world look at how this is done.