Can someone point me to the implementation of the sizeof operator in C++, along with some description of how it is implemented?
sizeof is one of the operators that cannot be overloaded.
So it means we cannot change its default behavior?
sizeof is not an operator in the usual sense in C++: it never results in a function call. It is merely special syntax which the compiler replaces with a constant equal to the size of the argument. sizeof neither needs nor has any run-time support.
Edit: do you want to know how to determine the size of a class/structure looking at its definition? The rules for this are part of the ABI, and compilers merely implement them. Basically the rules consist of
size and alignment definitions for primitive types;
structure, size and alignment of the various pointers;
rules for packing fields in structures;
rules about virtual table-related stuff (more esoteric).
However, ABIs are platform- and often vendor-specific, i.e. on x86 and (say) IA64 the size of A below will be different because IA64 does not permit unaligned data access.
struct A
{
char i ;
int j ;
} ;
assert (sizeof (A) == 5) ; // x86, MSVC #pragma pack(1)
assert (sizeof (A) == 8) ; // x86, MSVC default
assert (sizeof (A) == 16) ; // IA64
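To see the effect of packing on your own toolchain, a minimal sketch along these lines may help. The names DefaultA, PackedA and padding_in_default_A are made up for illustration, and #pragma pack is a compiler extension (accepted by MSVC, GCC and Clang), not standard C++.

```cpp
#include <cassert>
#include <cstddef>

struct DefaultA {        // laid out by the platform's default ABI rules
    char i;
    int  j;
};

#pragma pack(push, 1)    // compiler extension: pack members on 1-byte boundaries
struct PackedA {
    char i;
    int  j;
};
#pragma pack(pop)

// Number of padding bytes the default layout adds compared with pack(1).
inline std::size_t padding_in_default_A() {
    assert(sizeof(DefaultA) >= sizeof(PackedA));
    return sizeof(DefaultA) - sizeof(PackedA);
}
```

With pack(1) the size is simply the sum of the member sizes; whatever padding_in_default_A() reports is what the default ABI rules added.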
http://en.wikipedia.org/wiki/Sizeof
Basically, to quote Bjarne Stroustrup's C++ FAQ:
Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:
X a[10];
X* p = &a[3];
X* q = &a[3];
p++; // p points to a[4]
// thus the integer value of p must be
// sizeof(X) larger than the integer value of q
Thus, sizeof(X) could not be given a
new and different meaning by the
programmer without violating basic
language rules.
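The relationship in the quote can be checked directly. This is only an illustrative sketch; the struct X and the function stride_in_bytes are hypothetical names, not part of the FAQ.

```cpp
#include <cassert>
#include <cstddef>

struct X {
    double d;
    char   c;
};

// Byte distance covered by incrementing a pointer into an array of X once.
inline std::ptrdiff_t stride_in_bytes() {
    X a[2] = {};
    X* p = &a[0];
    X* q = p;
    p++;  // p now points to a[1]
    std::ptrdiff_t bytes =
        reinterpret_cast<char*>(p) - reinterpret_cast<char*>(q);
    assert(bytes == static_cast<std::ptrdiff_t>(sizeof(X)));
    return bytes;
}
```

If sizeof(X) could be overloaded to return something else, the built-in p++ and the programmer's sizeof would disagree about this stride.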
No, you can't change it. What do you hope to learn from seeing an implementation of it?
What sizeof does can't be written in C++ using more basic operations. It's not a function, or part of a library header like e.g. printf or malloc. It's inside the compiler.
Edit: If the compiler is itself written in C or C++, then you can think of the implementation being something like this:
size_t calculate_sizeof(expression_or_type)
{
if (is_type(expression_or_type))
{
if (is_array_type(expression_or_type))
{
return array_size(expression_or_type) *
calculate_sizeof(underlying_type_of_array(expression_or_type));
}
else
{
switch (expression_or_type)
{
case int_type:
case unsigned_int_type:
return 4; //for example
case char_type:
case unsigned_char_type:
case signed_char_type:
return 1;
case pointer_type:
return 4; //for example
//etc., for all the built-in types
case class_or_struct_type:
{
int base_size = compiler_overhead(expression_or_type);
for (/*loop over each class member*/)
{
base_size += calculate_sizeof(class_member) +
padding(class_member);
}
return round_up_to_multiple(base_size,
alignment_of_type(expression_or_type));
}
case union_type:
{
int max_size = 0;
for (/*loop over each class member*/)
{
max_size = max(max_size,
calculate_sizeof(class_member));
}
return round_up_to_multiple(max_size,
alignment_of_type(expression_or_type));
}
}
}
}
else
{
return calculate_sizeof(type_of(expression_or_type));
}
}
Note that this is very much pseudo-code. There are lots of things I haven't included, but this is the general idea. The compiler probably doesn't actually do this: it probably calculates the size of a type (including a class) once and stores it, instead of recalculating every time you write sizeof(X). An implementation is also allowed to, e.g., have pointers of different sizes depending on what they point to.
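The class_or_struct_type case of the pseudo-code can be turned into a small runnable sketch. It assumes the common layout scheme (members in declaration order, each at the next suitably aligned offset) and ignores base classes, virtual table pointers and bit-fields. Member, round_up, struct_size and Example are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Member {             // hypothetical description of one data member
    std::size_t size;
    std::size_t alignment;  // assumed to be a power of two
};

// Round value up to the next multiple of a power-of-two alignment.
inline std::size_t round_up(std::size_t value, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Size of a struct whose members are placed in declaration order,
// each at the next offset that satisfies its alignment.
inline std::size_t struct_size(const std::vector<Member>& members) {
    std::size_t offset = 0;
    std::size_t struct_alignment = 1;
    for (const Member& m : members) {
        offset = round_up(offset, m.alignment) + m.size;
        if (m.alignment > struct_alignment) struct_alignment = m.alignment;
    }
    // Tail padding makes the size a multiple of the strictest alignment.
    return round_up(offset, struct_alignment);
}

// A real struct to compare the calculation against.
struct Example { char a; int b; char c; };
```

Feeding struct_size the sizeof and alignof of char, int and char should reproduce sizeof(Example) on mainstream ABIs, although nothing in the standard promises that.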
sizeof does what it does at compile time. Operator overloads are simply functions, and do what they do at run time. It is therefore not possible to overload sizeof, even if the C++ Standard allowed it.
sizeof is a compile-time operator, which means that it is evaluated at compile-time.
It cannot be overloaded, because it already has a meaning for all user-defined types: sizeof applied to a class is the size that an object of that class takes in memory, and sizeof applied to a variable is the size that the object named by the variable occupies in memory.
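A short sketch of what "evaluated at compile time" means in practice. Packet, PacketBuffer, bump and calls_made_by_sizeof are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>

struct Packet {
    int  id;
    char tag;
};

// sizeof yields a constant expression, so it can be used as an array bound...
struct PacketBuffer {
    unsigned char bytes[sizeof(Packet)];
};

// ...and inside a static_assert.
static_assert(sizeof(PacketBuffer) >= sizeof(Packet), "buffer can hold a Packet");

inline int bump(int& counter) { return ++counter; }

// The operand of sizeof is not evaluated, so bump() is never called here.
inline int calls_made_by_sizeof() {
    int counter = 0;
    std::size_t n = sizeof(bump(counter));  // only the type (int) matters
    assert(n == sizeof(int));
    return counter;
}
```

calls_made_by_sizeof() returns 0 because the compiler only looks at the type of the expression and never generates the call.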
Unless you need to see how C++-specific sizes are calculated (such as allocation for the v-table), you can look at Plan9's C compiler. It's much simpler than trying to tackle g++.
Variable:
#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))
Type:
#define getsize_type(type) ( (char*)((type*)(1) + 1) - (char*)((type *)(1)))
Take a look at the source for the GNU C++ compiler for a real-world look at how this is done.
There was a similar question here, but the user in that question seemed to have a much larger array, or vector. If I have:
bool boolArray[4];
And I want to check if all elements are false, I can check [ 0 ], [ 1 ] , [ 2 ] and [ 3 ] either separately, or I can loop through it. Since (as far as I know) false should have value 0 and anything other than 0 is true, I thought about simply doing:
if ( *(int*) boolArray) { }
This works, but I realize that it relies on bool being one byte and int being four bytes. If I cast to (std::uint32_t) would it be OK, or is it still a bad idea? I just happen to have 3 or 4 bools in an array and was wondering if this is safe, and if not if there is a better way to do it.
Also, in the case I end up with more than 4 bools but less than 8 can I do the same thing with a std::uint64_t or unsigned long long or something?
As πάντα ῥεῖ noticed in comments, std::bitset is probably the best way to deal with that in UB-free manner.
std::bitset<4> boolArray {};
if(boolArray.any()) {
//do the thing
}
If you want to stick to arrays, you could use std::any_of, but this requires a functor that simply returns its argument, which may look peculiar to readers:
bool boolArray[4] = {};
if(std::any_of(std::begin(boolArray), std::end(boolArray), [](bool b){return b;})) {
//do the thing
}
Type-punning 4 bools to int might be a bad idea - you cannot be sure of the size of each of the types. It probably will work on most architectures, but std::bitset is guaranteed to work everywhere, under any circumstances.
Several answers have already explained good alternatives, particularly std::bitset and std::any_of(). I am writing separately to point out that, unless you know something we don't, it is not safe to type pun between bool and int in this fashion, for several reasons:
int might not be four bytes, as multiple answers have pointed out.
M.M points out in the comments that bool might not be one byte. I'm not aware of any real-world architectures in which this has ever been the case, but it is nevertheless spec-legal. It (probably) can't be smaller than a byte unless the compiler is doing some very elaborate hide-the-ball chicanery with its memory model, and a multi-byte bool seems rather useless. Note however that a byte need not be 8 bits in the first place.
int can have trap representations. That is, it is legal for certain bit patterns to cause undefined behavior when they are cast to int. This is rare on modern architectures, but might arise on (for example) ia64, or any system with signed zeros.
Regardless of whether you have to worry about any of the above, your code violates the strict aliasing rule, so compilers are free to "optimize" it under the assumption that the bools and the int are entirely separate objects with non-overlapping lifetimes. For example, the compiler might decide that the code which initializes the bool array is a dead store and eliminate it, because the bools "must have" ceased to exist* at some point before you dereferenced the pointer. More complicated situations can also arise relating to register reuse and load/store reordering. All of these infelicities are expressly permitted by the C++ standard, which says the behavior is undefined when you engage in this kind of type punning.
You should use one of the alternative solutions provided by the other answers.
* It is legal (with some qualifications, particularly regarding alignment) to reuse the memory pointed to by boolArray by casting it to int and storing an integer, although if you actually want to do this, you must then pass boolArray through std::launder if you want to read the resulting int later. Regardless, the compiler is entitled to assume that you have done this once it sees the read, even if you don't call launder.
You can use std::bitset<N>::any:
Any returns true if any of the bits are set to true, otherwise false.
#include <iostream>
#include <bitset>
int main ()
{
std::bitset<4> foo;
// modify foo here
if (foo.any())
std::cout << foo << " has " << foo.count() << " bits set.\n";
else
std::cout << foo << " has no bits set.\n";
return 0;
}
If you want to test whether all or none of the bits are set, you can use std::bitset<N>::all or std::bitset<N>::none respectively.
The standard library has what you need in the form of the std::all_of, std::any_of, std::none_of algorithms.
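For the bool array in the question, a sketch using these algorithms might look like this. The wrapper names all_false and all_true are made up here; the array size is deduced from the argument.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// True when no element of the array is set.
template <std::size_t N>
bool all_false(const bool (&flags)[N]) {
    return std::none_of(std::begin(flags), std::end(flags),
                        [](bool b) { return b; });
}

// True when every element of the array is set.
template <std::size_t N>
bool all_true(const bool (&flags)[N]) {
    return std::all_of(std::begin(flags), std::end(flags),
                       [](bool b) { return b; });
}
```

Neither function makes any assumption about sizeof(bool) or sizeof(int), which is the point of preferring them over the cast.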
...And for the obligatory "roll your own" answer, we can provide a simple "or"-like function for any array bool[N], like so:
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) {
if (b) { return b; }
}
return false;
}
Or more concisely,
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) { if (b) { return b; } }
return false;
}
This also has the benefit of both short-circuiting like ||, and being optimised out entirely if calculable at compile time.
Apart from that, if you want to examine the original idea of type-punning bool[N] to some other type to simplify observation, I would very much recommend that you view it as char[N2] instead, where N2 == (sizeof(bool) * N). This would allow you to provide a simple representation viewer that can automatically scale to the viewed object's actual size, allow iteration over its individual bytes, and allow you to more easily determine whether the representation matches specific values (such as, e.g., zero or non-zero). I'm not entirely sure off the top of my head whether such examination would invoke any UB, but I can say for certain that any such type's construction cannot be a viable constant-expression, due to requiring a reinterpret_cast to char* or unsigned char* or similar (either explicitly, or in std::memcpy()), and thus couldn't as easily be optimised out.
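A byte-wise viewer of the kind described could be sketched as follows. all_bytes_zero is a hypothetical helper; it reads the object through unsigned char, and using it to test "all false" assumes that false is stored as all-zero bytes, which holds on mainstream implementations but is not spelled out by the standard.

```cpp
#include <cassert>
#include <cstddef>

// True when every byte of obj's object representation is zero.
// The size scales automatically with T, e.g. T = bool[4].
template <typename T>
bool all_bytes_zero(const T& obj) {
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0) return false;
    }
    return true;
}
```

Because of the reinterpret_cast this cannot be constexpr, unlike or_all above.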
I wrote the following code:
int tester(int n)
{
int arr[n];
// ...
}
This code compiled, no warnings, using g++.
My question is: how? The parameter n is known only at run time, yet the array looks statically allocated. How does gcc compile this?
This is an extension that GCC offers for C++, though variable-length arrays ("VLAs") are properly supported by C since C99.
The implementation isn't terribly hard; on a typical call-stack implementation, the function only needs to save the base of the stack frame and then advance the stack pointer by the dynamically specified amount. VLAs always come with the caveat that if the number is too large, you get undefined behaviour (manifesting as a stack overflow), which makes them much trickier to use correctly than, say, std::vector.
There had at some point been an effort to add a similar feature to C++, but this turns out surprisingly difficult in terms of the type system (e.g. what is the type of arr? How does it get deduced in function templates?). The problems are less visible in C which has a much simpler type system and object model (but that said, you can still argue that C is worse off for having VLAs, a considerable part of the standard is spent on them, and the language would have been quite a bit simpler without them, and not necessarily poorer for it).
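In portable C++ the same function can be written with std::vector. This is a sketch only; the body (filling the array with 1..n and summing it) is invented so that there is something to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

inline int tester(int n) {
    assert(n >= 0);
    // Storage comes from the heap and is sized at run time.
    std::vector<int> arr(static_cast<std::size_t>(n));
    std::iota(arr.begin(), arr.end(), 1);  // fill with 1, 2, ..., n
    return std::accumulate(arr.begin(), arr.end(), 0);
}
```

A size that is too large now surfaces as a std::bad_alloc exception rather than as undefined behaviour on the stack.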
The GNU C library provides a function to allocate memory on the stack - alloca(3). It simply decrements the stack pointer thus creating some scratch space on it. GCC uses alloca(3) to implement C99 variable-length arrays - it first decrements the stack pointer in the function prologue to create space for all automatic variables, whose size is known at compile time, and then uses alloca(3) to further decrement it and make space for arr with size as determined at run-time. The optimiser might actually fuse both decrements.
int tester(int n)
{
int arr[n];
return 0;
}
compiles into
;; Function tester (tester)
tester (int n)
{
int arr[0:D.1602] [value-expr: *arr.1];
int[0:D.1602] * arr.1;
long unsigned int D.1610;
int n.0;
...
<bb 2>:
n.0 = n;
...
D.1609 = (long unsigned int) n.0;
D.1610 = D.1609 * 4;
D.1612 = __builtin_alloca (D.1610); <----- arr is allocated here
arr.1 = (int[0:D.1602] *) D.1612;
...
That's equivalent to the following C code:
int tester(int n)
{
int *arr = __builtin_alloca(n * sizeof(int));
return 0;
}
__builtin_alloca() is GCC's internal implementation of alloca(3).
Okay, allow me to re-ask the question, as none of the answers got at what I was really interested in (apologies if wholesale editing of the question like this is a faux pas).
A few points:
This is offline analysis with a different compiler than the one I'm testing, so SIZEOF() or similar won't work for what I'm doing.
I know it's implementation-defined, but I happen to know the implementation that is of interest to me, which is below.
Let's make a function called pack, which takes as input an integer, called alignment, and a tuple of integers, called elements. It outputs another integer, called size.
The function works as follows:
int pack (int alignment, int[] elements)
{
total_size = 0;
foreach( element in elements )
{
while( total_size % min(alignment, element) != 0 ) { ++total_size; }
total_size += element;
}
while( total_size % alignment != 0 ) { ++total_size; }
return total_size;
}
I think what I want to ask is "what is the inverse of this function?", but I'm not sure whether inversion is the correct term--I don't remember ever dealing with inversions of functions with multiple inputs, so I could just be using a term that doesn't apply.
Something like what I want (sort of) exists; here I provide pseudo code for a function we'll call determine_align. The function is a little naive, though, as it just calls pack over and over again with different inputs until it gets an answer it expects (or fails).
int determine_align(int total_size, int[] elements)
{
for(packing = 1,2,4,...,64) // expected answers.
{
size_at_cur_packing = pack(packing, elements);
if(total_size == size_at_cur_packing)
{
return packing;
}
}
return unknown;
}
So the question is, is there a better implementation of determine_align?
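For reference, the two pseudo-code functions above can be transcribed into runnable C++ roughly as follows. This is a direct transcription rather than a better algorithm; the only changes are that padding is computed arithmetically instead of byte by byte, and that 0 is returned for "unknown" (an arbitrary choice made here). Element sizes are assumed to be at least 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Total size of a struct whose members have the given sizes, where each
// member is aligned to min(packing, its own size).
inline std::size_t pack(std::size_t packing,
                        const std::vector<std::size_t>& elements) {
    assert(packing > 0);
    std::size_t total_size = 0;
    for (std::size_t element : elements) {
        assert(element > 0);
        std::size_t align = std::min(packing, element);
        total_size += (align - total_size % align) % align;  // pad up
        total_size += element;
    }
    total_size += (packing - total_size % packing) % packing;  // tail pad
    return total_size;
}

// Tries packings 1, 2, 4, ..., 64 and returns the first one that
// reproduces actual_size, or 0 if none does.
inline std::size_t determine_align(std::size_t actual_size,
                                   const std::vector<std::size_t>& elements) {
    for (std::size_t packing = 1; packing <= 64; packing *= 2) {
        if (pack(packing, elements) == actual_size) return packing;
    }
    return 0;
}
```

For members of sizes 1 and 4, pack gives 5 at packing 1, 6 at packing 2 and 8 at packing 4, so determine_align(8, ...) reports 4. Note that packings 4 and 8 produce the same size here, so the search can only report the smallest packing consistent with the observed size.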
Thanks,
Alignment of struct members in C/C++ is entirely implementation-defined. There are a few guarantees there, but I don't see how they would help you.
Thus, there's no generic way to do what you want. In the context of a particular implementation, you should refer to the documentation of that implementation that covers this (if it is covered).
When choosing how to pack members into a struct an implementation doesn't have to follow the sort of scheme that you describe in your algorithm although it is a common one. (i.e. minimum of sizeof type being aligned and preferred machine alignment size.)
You don't have to compare overall size of a struct to determine the padding that has been applied to individual struct members, though. The standard macro offsetof will give the byte offset from the start of the struct of any individual struct member.
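As a sketch of how offsetof exposes per-member padding, consider the following. Sample, padding_before_value and tail_padding are hypothetical names, and the actual numbers depend on the implementation.

```cpp
#include <cassert>
#include <cstddef>

struct Sample {
    char   tag;
    double value;
    short  count;
};

// Padding bytes between the end of `tag` and the start of `value`.
inline std::size_t padding_before_value() {
    return offsetof(Sample, value) - (offsetof(Sample, tag) + sizeof(char));
}

// Padding bytes after `count`, up to the end of the struct.
inline std::size_t tail_padding() {
    return sizeof(Sample) - (offsetof(Sample, count) + sizeof(short));
}
```

Each gap is read off directly, with no need to guess a packing value from the overall size.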
I let the compiler do the alignment for me.
In gcc,
typedef struct _foo
{
u8 v1 __attribute__((aligned(4)));
u16 v2 __attribute__((aligned(4)));
u32 v3 __attribute__((aligned(8)));
u8 v4 __attribute__((aligned(4)));
} foo;
Edit: Note that sizeof(foo) will return the correct value including any padding.
Edit2: And offsetof(foo, v2) also works. Given these two functions/macros, you can figure out everything you need to know about the layout of the struct in memory.
I'm honestly not sure what you're trying to do, and I'm probably completely misunderstanding what you're looking for, but if you want to simply determine what the alignment requirement of a struct is, the following macro might be helpful:
#define ALIGNMENT_OF( t ) offsetof( struct { char x; t test; }, test )
To determine the alignment of your foo structure, you can do:
ALIGNMENT_OF( foo);
If this isn't what you're ultimately trying to do, the macro might still help in whatever algorithm you do come up with.
You need to pad based on the alignment of the next field and then pad the last element based on the maximum alignment you've seen in the struct. Note that the actual alignment of a field is the minimum of its natural alignment and the packing for that struct. I.e., if you have a struct packed at 4 bytes, a double will be aligned to 4 bytes, even though its natural alignment is 8.
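You can make your inner loop faster by computing the padding directly instead of incrementing one byte at a time: with a = min(packing, element.size), use total_size += (a - total_size % a) % a. If a is a power of two, this simplifies further to total_size = (total_size + a - 1) & ~(a - 1).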
If the problem is just that you want to guarantee a particular alignment, that is easy. For a particular alignment=2^n:
void* p = malloc( sizeof( _foo ) + alignment - 1 );
p = (void*) ( ( (uintptr_t)(p) + alignment - 1 ) & ~(uintptr_t)( alignment - 1 ) );
The arithmetic is done on uintptr_t (from <stdint.h>) because & cannot be applied to a pointer. I've neglected to save the original p returned from malloc. If you intend to free this memory, you need to save that pointer somewhere.
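A sketch of the over-allocate-and-round-up approach that also keeps the original pointer for freeing. AlignedBlock, allocate_aligned and free_aligned are names invented for this example, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct AlignedBlock {
    void* raw;      // pointer returned by malloc; pass this to free
    void* aligned;  // first address inside the block with the alignment
};

inline AlignedBlock allocate_aligned(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void* raw = std::malloc(size + alignment - 1);
    if (raw == nullptr) return {nullptr, nullptr};
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(raw);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t rounded = (address + mask) & ~mask;  // round up
    return {raw, reinterpret_cast<void*>(rounded)};
}

inline void free_aligned(AlignedBlock& block) {
    std::free(block.raw);
    block.raw = nullptr;
    block.aligned = nullptr;
}
```

The caller uses block.aligned for data and hands the whole block back to free_aligned when done.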
I'm not sure what you want to achieve here. As Pavel Minaev said, alignment is handled by a compiler which in turn is constrained by a platform's Application Binary Interface for data that is made accessible to code compiled by a different compiler. The following paper discusses the problem in the context of a compiler that needs to implement calling conventions:
Christian Lindig and Norman Ramsey. Declarative Composition of Stack Frames. In Evelyn Duesterwald, editor, Proc. of the 14th International Conference on Compiler Construction, Springer, LNCS 2985, 2004.
Is the possibility of writing both array[index] and index[array] a compiler feature or a language feature? How is the second one possible?
The compiler will turn
index[array]
into
*(index + array)
With the normal syntax it would turn
array[index]
into
*(array + index)
and thus you see that both expressions evaluate to the same value. This holds for both C and C++.
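A quick way to convince yourself is the following sketch; the function name subscripts_agree and the array contents are made up.

```cpp
#include <cassert>

inline bool subscripts_agree() {
    int array[5] = {10, 20, 30, 40, 50};
    int index = 3;
    // Both spellings designate the very same element.
    assert(&array[index] == &index[array]);
    assert(&array[index] == array + index);
    return array[index] == index[array] &&
           index[array] == *(index + array);
}
```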
From the earliest days of C, the expression a[i] was simply the address of a[0] added to i (scaled up by the size of a[0]) and then de-referenced. In fact, all these were equivalent:
a[i]
i[a]
*(a+i)
====
The only thing I'd be concerned about is the actual de-referencing. Whilst they all produce the same address, de-referencing may be a concern if the types of a and i are different.
For example:
int i = 4;
long a[9];
long x = a[i]; //get the long at memory location X.
long x = i[a]; //get the int at memory location X?
I haven't actually tested that behavior but it's something you may want to watch out for. If it does change what gets de-referenced, it's likely to cause all sorts of problems with arrays of objects as well.
====
Update:
You can probably safely ignore the bit above between the ===== lines. I've tested it under Cygwin with a short and a long and it seems okay, so I guess my fears were unfounded, at least for the basic cases. I still have no idea what happens with more complicated ones because it's not something I'm ever likely to want to do.
As Matthew Wilson discusses in Imperfect C++, this can be used to enforce type safety in C++, by preventing use of DIMENSION_OF()-like macros with instances of types that define the subscript operator, as in:
#define DIMENSION_OF_UNSAFE(x) (sizeof(x) / sizeof((x)[0]))
#define DIMENSION_OF_SAFER(x) (sizeof(x) / sizeof(0[(x)]))
int ints[4];
DIMENSION_OF_UNSAFE(ints); // 4
DIMENSION_OF_SAFER(ints); // 4
std::vector<int> v(4);
DIMENSION_OF_UNSAFE(v); // gives impl-defined value; very likely wrong
DIMENSION_OF_SAFER(v); // does not compile
There's more to this, for dealing with pointers, but that requires some additional template smarts. Check out the implementation of STLSOFT_NUM_ELEMENTS() in the STLSoft libraries, and read about it all in chapter 14 of Imperfect C++.
edit: some of the commenters suggest that the implementation does not reject pointers. It does (as well as user-defined types), as illustrated by the following program. You can verify this by uncommenting the two commented-out printf lines. (I just did this on Mac/GCC4, and it rejects both forms).
#include <stlsoft/stlsoft.h>
#include <vector>
#include <stdio.h>
int main()
{
int ar[1];
int* p = ar;
std::vector<int> v(1);
printf("ar: %lu\n", STLSOFT_NUM_ELEMENTS(ar));
// printf("p: %lu\n", STLSOFT_NUM_ELEMENTS(p));
// printf("v: %lu\n", STLSOFT_NUM_ELEMENTS(v));
return 0;
}
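A different technique that gives the same safety without macros is a function template taking a reference to an array. To be clear, this is not how STLSOFT_NUM_ELEMENTS is implemented; dimension_of is a hypothetical name used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Only real arrays can bind to a reference-to-array parameter, so
// pointers and class types such as std::vector are rejected at
// compile time.
template <typename T, std::size_t N>
constexpr std::size_t dimension_of(const T (&)[N]) noexcept {
    return N;
}
```

For int ints[4], dimension_of(ints) is 4, while passing an int* or a std::vector<int> fails to compile because template argument deduction finds no match.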
In C and C++ (with array being a pointer or array) it is a language feature: pointer arithmetic. The operation a[b], where either a or b is a pointer, is converted into pointer arithmetic: *(a + b). Because addition is commutative, reordering the operands does not change the meaning.
Now, there are differences for non-pointers. Given a type A with an overloaded operator[], a[4] is a valid member function call (it calls A::operator[]), but the reverse form 4[a] will not even compile.
Most experienced programmers know that data alignment is important for a program's performance. I have seen programmers allocate a buffer bigger than they need and use an aligned pointer inside it as the start. I am wondering whether I should do that in my program, since I do not know whether there is any guarantee about the alignment of the address returned by C++'s new operator. So I wrote a little program to test it:
for(size_t i = 0; i < 100; ++i) {
char *p = new char[123];
if(reinterpret_cast<size_t>(p) % 4) {
cout << "*";
system("pause");
}
cout << reinterpret_cast<void *>(p) << endl;
}
for(size_t i = 0; i < 100; ++i) {
short *p = new short[123];
if(reinterpret_cast<size_t>(p) % 4) {
cout << "*";
system("pause");
}
cout << reinterpret_cast<void *>(p) << endl;
}
for(size_t i = 0; i < 100; ++i) {
float *p = new float[123];
if(reinterpret_cast<size_t>(p) % 4) {
cout << "*";
system("pause");
}
cout << reinterpret_cast<void *>(p) << endl;
}
system("pause");
The compiler I am using is Visual C++ Express 2008. It seems that all addresses returned by new are aligned, but I am not sure. So my question is: is there any guarantee? If there is, I don't have to align the memory myself; if not, I do.
The alignment has the following guarantee from the standard (3.7.3.1/2):
The pointer returned shall be suitably aligned so that it can be converted to a
pointer of any complete object type and then used to access the object or array in the
storage allocated (until
the storage is explicitly deallocated by a call to a corresponding deallocation function).
EDIT: Thanks to timday for highlighting a bug in gcc/glibc where the guarantee does not hold.
EDIT 2: Ben's comment highlights an interesting edge case. The requirements on the allocation routines apply only to those provided by the standard library. If the application has its own version, then there's no such guarantee on the result.
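A small check of this guarantee, as a sketch: it assumes the default (non-replaced) allocation functions and a request at least as large as std::max_align_t, and the function name new_returns_max_aligned is invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reports whether a char buffer obtained from new[] is aligned for the
// most strictly aligned fundamental type.
inline bool new_returns_max_aligned(std::size_t bytes) {
    assert(bytes >= sizeof(std::max_align_t));
    char* p = new char[bytes];
    bool aligned =
        reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
    delete[] p;
    return aligned;
}
```

A single passing run does not prove the guarantee, of course; it only fails to contradict it on the platform at hand.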
This is a late answer but just to clarify the situation on Linux - on 64-bit systems
memory is always 16-byte aligned:
http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
The address of a block returned by malloc or realloc in the GNU system is always a
multiple of eight (or sixteen on 64-bit systems).
The new operator calls malloc internally
(see ./gcc/libstdc++-v3/libsupc++/new_op.cc)
so this applies to new as well.
The implementation of malloc in glibc defines MALLOC_ALIGNMENT
as 2*sizeof(size_t); size_t is 32 bits (4 bytes) on an x86-32 system
and 64 bits (8 bytes) on an x86-64 system.
$ cat ./glibc-2.14/malloc/malloc.c:
...
#ifndef INTERNAL_SIZE_T
#define INTERNAL_SIZE_T size_t
#endif
...
#define SIZE_SZ (sizeof(INTERNAL_SIZE_T))
...
#ifndef MALLOC_ALIGNMENT
#define MALLOC_ALIGNMENT (2 * SIZE_SZ)
#endif
C++17 changes the requirements on the new allocator, such that it is required to return a pointer aligned to at least the value of the macro __STDCPP_DEFAULT_NEW_ALIGNMENT__ (which is predefined by the implementation, not by including a header).
This is important because this size can be larger than alignof(std::max_align_t). In Visual C++ for example, the maximum regular alignment is 8-byte, but the default new always returns 16-byte aligned memory.
Also, note that if you override the default new with your own allocator, you are required to abide by the __STDCPP_DEFAULT_NEW_ALIGNMENT__ as well.
Incidentally the MS documentation mentions something about malloc/new returning addresses which are 16-byte aligned, but from experimentation this is not the case. I happened to need the 16-byte alignment for a project (to speed up memory copies with enhanced instruction set), in the end I resorted to writing my own allocator...
The platform's new/new[] operator will return pointers with sufficient alignment so that it performs well with basic datatypes (double, float, etc.). At least any sensible C++ compiler+runtime should do that.
If you have special alignment requirements like for SSE, then it's probably a good idea to use special aligned_malloc functions, or roll your own.
I worked on a system where they used the alignment to free up the odd bit for their own use!
They used the odd bit to implement a virtual memory system.
When a pointer had the odd bit set they used that to signify that it pointed (minus the odd
bit) to the information to get the data from the database, not the data itself.
I thought this a particularly nasty bit of coding which was far too clever for its own good!!
Tony