What does sizeof do? - c++

What is the main function of sizeof? (I am new to C++.) For instance, given
int k = 7;
char t = 'Z';
what do sizeof(k), sizeof(int), and sizeof(char) mean?

sizeof(x) returns the amount of memory (in bytes) that the variable or type x occupies. It has nothing to do with the value of the variable.
For example, if you have an array of some arbitrary type T then the distance between elements of that array is exactly sizeof(T).
int a[10];
// &a[1] is exactly sizeof(int) bytes past &a[0]:
assert(reinterpret_cast<char*>(&a[0]) + sizeof(int) == reinterpret_cast<char*>(&a[1]));
When used on a variable, it is equivalent to using it on the type of that variable:
T x;
assert(sizeof(T) == sizeof(x));
As a rule-of-thumb, it is best to use the variable name where possible, just in case the type changes:
int x;
std::cout << "x uses " << sizeof(x) << " bytes." << std::endl
// If x is changed to a char, then the statement doesn't need to be changed.
// If we used sizeof(int) instead, we would need to change 2 lines of code
// instead of one.
When used on user-defined types, sizeof still returns the amount of memory used by instances of that type, but it's worth pointing out that this does not necessarily equal the sum of the sizes of its members.
struct Foo { int a; char b; };
While sizeof(int) + sizeof(char) is typically 5, sizeof(Foo) may well be 8, because on many machines the compiler needs to pad out the structure so that it lies on 4-byte boundaries. This is not always the case, and it's quite possible that on your machine sizeof(Foo) will be 5, but you can't depend on it.
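As a minimal sketch of the padding at work (the 4-byte int and 4-byte alignment are typical assumptions, not guarantees):
#include <iostream>

struct Foo { int a; char b; };  // 4 + 1 bytes of members

int main()
{
    // On a typical platform with 4-byte int and 4-byte alignment,
    // this prints 8, not 5, because of trailing padding after b.
    std::cout << sizeof(Foo) << '\n';
}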

To add to Peter Alexander's answer: sizeof yields the size of a value or type in multiples of the size of a char, char being defined as the smallest unit of memory addressable (by C or C++) for a given architecture, and required (in C++ at least) to be at least 8 bits wide by the standard. This is what's generally meant by "bytes" (the smallest addressable unit for a given architecture), but it never hurts to clarify, and there are occasionally questions about the variability of sizeof(char), which is of course always 1.
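A short sketch of those guarantees; CHAR_BIT comes from <climits>, and while the standard only requires it to be at least 8, it is 8 on virtually all modern platforms:
#include <climits>   // CHAR_BIT
#include <iostream>

int main()
{
    static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
    std::cout << "bits per byte on this platform: " << CHAR_BIT << '\n';
}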

sizeof() returns the size of the argument passed to it.
See the cppreference page for sizeof: http://en.cppreference.com/w/cpp/language/sizeof

sizeof is a compile-time unary operator that returns the size of its operand's type.
For example:
sizeof(int)
will return the size of int in bytes.
Also remember that type sizes are platform-dependent.
Check this page for more details: sizeof in C/C++
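A quick sketch that prints a few of these platform-dependent sizes; the values in the comments are typical for a 64-bit desktop platform, not guarantees:
#include <iostream>

int main()
{
    std::cout << "char:   " << sizeof(char)   << '\n';  // always 1
    std::cout << "int:    " << sizeof(int)    << '\n';  // typically 4
    std::cout << "long:   " << sizeof(long)   << '\n';  // 4 or 8, platform-dependent
    std::cout << "double: " << sizeof(double) << '\n';  // typically 8
    std::cout << "void*:  " << sizeof(void*)  << '\n';  // 4 on 32-bit, 8 on 64-bit
}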

Related

Why is this pointer 8 bytes?

I am learning C++, and read that when an array is passed into a function it decays into a pointer. I wanted to play around with this and wrote the following function:
#include <iostream>
using namespace std;

void size_print(int a[]) {
    cout << sizeof(a) / sizeof(a[0]) << endl;
    cout << "a -> " << sizeof(a) << endl;
    cout << "a[0] -> " << sizeof(a[0]) << endl;
}
I tried inputting an array with three elements, let's say
int test_array[3] = {1, 2, 3};
With this input, I was expecting this function to print 1, as I thought a would be an integer pointer (4 bytes) and a[0] would also be 4 bytes. However, to my surprise the result is 2 and sizeof(a) = 8.
I cannot figure out why a takes up 8 bytes, but a[0] takes up 4. Shouldn't they be the same?
Shouldn't they be the same?
No. a is (meant to be) an array, but because it's a function parameter it has been adjusted to a pointer to the first element, and as such it has the size of a pointer. Your machine uses 64-bit addresses, and thus each address (and hence each pointer) is 64 bits (8 bytes) long.
a[0], on the other hand, is of the type that an element of that array has (int), and that type is 32 bits (4 bytes) on your machine.
A pointer is just the address of the memory location where the variable starts. On your machine that address is 8 bytes.
a[0] is the variable in the first slot of the array. It technically could be anything of whatever size. When you take a pointer to it, the pointer just contains a memory address (an integer) without knowing or caring what that address contains. (This is just to illustrate the concept; in the example in the question, a[] is an integer array, but the same logic works with anything.)
Note that the size of a pointer differs across architectures. This is where 32-bit, 64-bit, etc. come in. It can also depend on the compiler, but this is beyond the question.
The size of a pointer depends on the system and implementation. Yours uses 64 bits (8 bytes).
a[0] is an int, and the standard only mandates a minimum range it must be able to store, which requires at least 2 bytes. Most modern implementations use 32-bit (4-byte) ints.
sizeof(a)/sizeof(a[0]) will not work on a function parameter. The array argument decays to a pointer, so this division only tells you how many times larger a pointer is than an int, not the number of elements in the array the pointer refers to.
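For completeness, a sketch of two common ways to keep the element count across a call (size_print here is my reworking of the question's function; std::size is genuine C++17):
#include <cstddef>
#include <iostream>
#include <iterator>  // std::size (C++17)

// Take the array by reference so its length N stays part of the type.
template <typename T, std::size_t N>
void size_print(T (&a)[N])
{
    std::cout << N << '\n';             // 3 for test_array below
    std::cout << std::size(a) << '\n';  // same thing via the standard library
}

int main()
{
    int test_array[3] = {1, 2, 3};
    size_print(test_array);
}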

Why is this pointer null

In Visual Studio, it seems that pointers to member variables are 32-bit signed integers behind the scenes (even in 64-bit mode), and a null pointer is -1 in that context. So if I have a class like:
#include <iostream>
#include <climits>  // INT_MAX

struct Foo
{
    char arr1[INT_MAX];
    char arr2[INT_MAX];
    char ch1;
    char ch2;
};

int main()
{
    auto p = &Foo::ch2;
    std::cout << (p ? "Not null" : "null") << '\n';
}
It compiles, and prints "null". So, am I causing some kind of undefined behavior, or was the compiler supposed to reject this code, making this a compiler bug?
Edit:
It appears that I can keep the "2 INT_MAX arrays plus 2 chars" pattern, and only in that case does the compiler allow me to add as many members as I wish, with the second character always considered to be null. See demo. If I change the pattern slightly (like 1 or 3 chars instead of 2 at some point), it complains that the class is too large.
The size limit of an object is implementation-defined, per Annex B of the standard [1]. Your struct is of an absurd size.
If the struct is:
struct Foo
{
    char arr1[INT_MAX];
    //char arr2[INT_MAX];
    char ch1;
    char ch2;
};
... the size of your struct in a relatively recent version of 64-bit MSVC appears to be around 2147483649 bytes. If you then add in arr2, suddenly sizeof will tell you that Foo is of size 1.
The C++ standard (Annex B) states that the compiler must document its limitations, which MSVC does [2]. It states that it follows the recommended limits. Annex B provides a recommended limit of 262144 bytes for the size of an object. While it's clear that MSVC can handle more than that, it documents that it follows that minimum recommendation, so I'd assume you should take care when your object size is more than that.
[1] http://eel.is/c++draft/implimits
[2] https://learn.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019
It's clearly a collision between an optimization on pointer-to-member representation (use only 4 bytes of storage when no virtual bases are present) and the pigeonhole principle.
For a type X containing N subobjects of type char, there are N+1 possible valid pointer-to-members of type char X::*... one for each subobject, and one for null-pointer-to-member.
This works when there are at least N+1 distinct values in the pointer-to-member representation, which for a 4-byte representation implies that N+1 <= 2^32 and therefore the maximum object size is 2^32 - 1.
Unfortunately the compiler in question made the maximum object-type size (before it rejects the program) equal to 2^32, which is one too large and creates a pigeonhole problem -- at least one pair of pointer-to-members must be indistinguishable. It's not necessary that the null pointer-to-member be one half of this pair, but as you've observed, in this implementation it is.
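To make the counting concrete, a small sketch; the 4-byte figure is specific to MSVC's representation for classes without virtual bases, while gcc and clang on 64-bit targets commonly use 8 bytes:
#include <iostream>

struct S { char c; };

int main()
{
    // The number of distinct representable values limits how many
    // members (plus the null value) a pointer-to-member can address.
    std::cout << sizeof(char S::*) << '\n';  // 4 on MSVC x64, 8 on gcc/clang x64
}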
The expression &Foo::ch2 is of type char Foo::*, which is a pointer to member of class Foo. By the rules, a pointer to member converted to bool should evaluate to false ONLY if it is a null pointer, i.e. it had nullptr assigned to it.
The fault here appears to be an implementation flaw: an assigned pointer to member should evaluate as non-null unless nullptr was assigned to it, yet on gcc with -march=x86-64 the following code reproduces the problem:
#include <iostream>
#include <climits>   // LLONG_MAX, ULLONG_MAX
#include <cstddef>   // offsetof

struct foo
{
    char arr1[LLONG_MAX];
    char arr2[LLONG_MAX];
    char ch1;
    char ch2;
};

int main()
{
    char foo::* p1 = &foo::ch1;
    char foo::* p2 = &foo::ch2;
    std::cout << (p1 ? "Not null " : "null ") << '\n';
    std::cout << (p2 ? "Not null " : "null ") << '\n';
    std::cout << LLONG_MAX + LLONG_MAX << '\n';  // signed overflow: UB, happens to wrap to -2 here
    std::cout << ULLONG_MAX << '\n';
    std::cout << offsetof(foo, ch1) << '\n';
}
Output:
Not null
null
-2
18446744073709551615
18446744073709551614
Likely it's related to the fact that the class size exceeds platform limits, causing the offset of the member to wrap around past 0 (the internal value of nullptr). The compiler doesn't detect it because it becomes a victim of... integer overflow with signed values, and it's the programmer's fault for causing UB within the compiler by using signed literals as array sizes: LLONG_MAX + LLONG_MAX = -2 would be the "size" of the two arrays combined.
Essentially, the size of the first two members is calculated as negative, and the offset of ch1 is -2, represented as the unsigned value 18446744073709551614.
-2 is not the null value, therefore the pointer is not null. Another compiler might clamp the value to 0, producing a nullptr, or actually detect the problem, as clang does.
If the offset of ch1 is -2, is the offset of ch2 then -1? Let's print the offsets reinterpreted as signed values:
std::cout << static_cast<long long>(offsetof(foo, ch1)) << '\n';
std::cout << static_cast<long long>(offsetof(foo, ch2)) << '\n';
Additional output:
-2
-1
And the offset of the first member is obviously 0, so if pointers to member represent offsets, another value is needed to represent nullptr. It's logical to assume that this particular compiler considers only -1 to be the null value, which may or may not be the case for other implementations.
When I test the code, Visual Studio reports that the class Foo is too large.
When I add char arr3[INT_MAX], Visual Studio reports error C2089: 'Foo': 'struct' too large. Microsoft Docs explains it as "The specified structure or union exceeds the 4GB limit."

regarding sizeof()

I have a question regarding sizeof(). I know it gives the number of bytes used by an array. My question is: what if the array is not defined, but only declared?
example:
float array[3];
int p = sizeof(array);
The value yielded by sizeof depends solely on the type, not on anything that happens at run time [1].
That said, despite the name, float array(3); simply defines a single float with an initial value of 3, so int p = sizeof(array); is equivalent to int p = sizeof(float);.
Edit (to correspond to the edited question): yes, float array[3]; defines an array of 3 floats, so int p = sizeof(array); is equivalent to int p = 3 * sizeof(float);.
[1] In C++. As of C99, the situation in C is somewhat different (but irrelevant to the question at hand).
You are not declaring an array of floats when you use this code:
float array(3);
You have simply created a float variable called array with the value 3. Your call of sizeof on this variable just returns the size of float. Had you declared it properly,
float float_array[3];
and called sizeof(float_array), you would get the value you expect: 3 * sizeof(float).
sizeof() does not give the number of bytes used in an array; your definition is incomplete and partially incorrect.
http://en.cppreference.com/w/cpp/language/sizeof says:
"sizeof( type) --returns size in bytes of the object representation of type"
Also float array[3] is the correct way to declare an array of floats with 3 elements as other people have noted.
Finally, sizeof(array) would return 12, whereas it would return 16 if you declared the array with 4 elements, and 40 if you declared an array of 5 doubles instead of floats, at least on my system. Of course, the number of bytes used for data types may change from system to system.
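A small sketch reproducing those figures, assuming the typical 4-byte float and 8-byte double:
#include <iostream>

int main()
{
    float  f3[3];
    float  f4[4];
    double d5[5];
    std::cout << sizeof(f3) << '\n';  // 12 with 4-byte floats
    std::cout << sizeof(f4) << '\n';  // 16
    std::cout << sizeof(d5) << '\n';  // 40 with 8-byte doubles
}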

ARRAYSIZE C++ macro: how does it work?

OK, I'm not entirely a newbie, but I cannot say I understand the following macro. The most confusing part is the division by the value cast to size_t: what on earth does that accomplish? Especially since I see a negation operator, which, as far as I know, might result in a zero value. Doesn't that mean it can lead to a division-by-zero error? (By the way, the macro is correct and works beautifully.)
#define ARRAYSIZE(a) \
((sizeof(a) / sizeof(*(a))) / \
static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))
The first part (sizeof(a) / sizeof(*(a))) is fairly straightforward; it's dividing the size of the entire array (assuming you pass the macro an object of array type, and not a pointer), by the size of the first element. This gives the number of elements in the array.
The second part is not so straightforward. I think the potential division-by-zero is intentional; it will lead to a compile-time error if, for whatever reason, the size of the array is not an integer multiple of the size of one of its elements. In other words, it's a kind of compile-time sanity check.
However, I can't see under what circumstances this could occur... As people have suggested in comments below, it will catch some misuse (like using ARRAYSIZE() on a pointer). It won't catch all errors like this, though.
I wrote this version of this macro. Consider the older version:
#include <iostream>
#include <sys/stat.h>

#define ARRAYSIZE(a) (sizeof(a) / sizeof(*(a)))

void foo(struct stat stats[32]);  // the parameter is really a pointer

int main(int argc, char *argv[]) {
    struct stat stats[32];
    std::cout << "sizeof stats = " << (sizeof stats) << "\n";
    std::cout << "sizeof *stats = " << (sizeof *stats) << "\n";
    std::cout << "ARRAYSIZE=" << ARRAYSIZE(stats) << "\n";
    foo(stats);
}

void foo(struct stat stats[32]) {
    std::cout << "sizeof stats = " << (sizeof stats) << "\n";
    std::cout << "sizeof *stats = " << (sizeof *stats) << "\n";
    std::cout << "ARRAYSIZE=" << ARRAYSIZE(stats) << "\n";
}
On a 64-bit machine, this code produces this output:
sizeof stats = 4608
sizeof *stats = 144
ARRAYSIZE=32
sizeof stats = 8
sizeof *stats = 144
ARRAYSIZE=0
What's going on? How did ARRAYSIZE go from 32 to zero? Well, the problem is that the function parameter is actually a pointer, even though it looks like an array. So inside foo, sizeof(stats) is 8 bytes, and sizeof(*stats) is still 144; the integer division 8 / 144 yields 0.
With the new macro:
#define ARRAYSIZE(a) \
((sizeof(a) / sizeof(*(a))) / \
static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))
When sizeof(a) is not a multiple of sizeof(*(a)), the % is not zero, which the ! turns into zero, and the static_cast then makes the divisor zero, causing a compile-time division by zero. So, to the extent possible in a macro, this weird division catches the problem at compile time.
PS: in C++17, just use std::size, see http://en.cppreference.com/w/cpp/iterator/size
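A minimal sketch of that C++17 alternative:
#include <iterator>  // std::size (C++17)
#include <iostream>

int main()
{
    int a[42] = {};
    std::cout << std::size(a) << '\n';  // 42
    // int* p = a;
    // std::size(p);  // error: does not compile for pointers
}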
The division at the end seems to be an attempt at detecting a non-array argument (e.g. pointer).
It fails to detect that for e.g. char*, but would work for T* where sizeof(T) is greater than the size of a pointer.
In C++, one usually prefers the following function template:
typedef ptrdiff_t Size;
template< class Type, Size n >
Size countOf( Type (&)[n] ) { return n; }
This function template can't be instantiated with pointer argument, only array. In C++11 it can alternatively be expressed in terms of std::begin and std::end, which automagically lets it work also for standard containers with random access iterators.
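A sketch of that std::begin/std::end formulation (my own rendering, not the author's code):
#include <cstddef>   // std::ptrdiff_t
#include <iterator>  // std::begin, std::end

// Works for built-in arrays and for containers with random access iterators.
template< class Container >
std::ptrdiff_t countOf( Container const& c )
{
    return std::end(c) - std::begin(c);
}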
Limitations: doesn't work for array of local type in C++03, and doesn't yield compile time size.
For compile time size you can instead do like
template< Size n > struct Sizer { char elems[n]; };
template< class Type, Size n >
Sizer<n> countOf_( Type (&)[n] );
#define COUNT_OF( a ) sizeof( countOf_( a ).elems )
Disclaimer: all code untouched by compiler's hands.
But in general, just use the first function template, countOf.
Cheers & hth.
suppose we have
T arr[42];
ARRAYSIZE(arr) will expand to (roughly)
sizeof(arr) / sizeof(*arr) / !(sizeof(arr) % sizeof(*arr))
which in this case gives 42 / !0, which is 42.
If for some reason the size of the array is not divisible by the size of its element, division by zero will occur. When can that happen? For example, when you pass a dynamically allocated array (a pointer) instead of a static one!
It does lead to a division-by-zero error (intentionally). The way that this macro works is it divides the size of the array in bytes by the size of a single array element in bytes. So if you have an array of int values, where an int is 4 bytes (on most 32-bit machines), an array of 4 int values would be 16 bytes.
So when you call this macro on such an array, it does sizeof(array) / sizeof(*array). And since 16 / 4 = 4, it returns that there are 4 elements in the array.
Note: *array dereferences the first element of the array and is equivalent to array[0].
The second part performs a modulo operation (it takes the remainder of the division). Since any non-zero value is considered "true", applying the ! operator yields zero when the remainder is non-zero, causing a division by zero (and a harmless division by 1 otherwise).
The div-by-zero may be trying to catch alignment errors, whatever their cause. For example, if with some compiler settings the size of an array element were 3 but the compiler rounded it up to 4 for faster array access, an array of 4 entries would have size 16, and !(16 % 3) would evaluate to zero, giving a division by zero at compile time. Yet I don't know of any compiler that does that, and it may be against the C++ specification for sizeof to return a size that differs from the size the type occupies in an array.
Coming late to the party here...
Google's C++ codebase has IMHO the definitive C++ implementation of the arraysize() macro, which includes several wrinkles that aren't considered here.
I cannot improve upon the source, which has clear and complete comments.

Creating integer variable of a defined size

I want to define an integer variable in C/C++ such that my integer can store 10 bytes of data, or maybe x bytes of data as defined by me in the program.
For now, I tried this code:
int *ptr;
ptr = (int *)malloc(10);
Now if I take sizeof(ptr), it shows 4 and not 10. Why?
C and C++ compilers implement several sizes of integer (typically 1, 2, 4, and 8 bytes, i.e. 8, 16, 32, and 64 bits), but without some helper code to perform arithmetic operations you can't really make arbitrarily sized integers.
The declarations you did:
int *ptr;
ptr = (int *)malloc(10);
Made what is probably a broken array of integers. Broken because, unless you are on a system where 10 % sizeof(int) == 0, you have extra bytes at the end which can't be used to store an entire integer.
There are several big-number class libraries you should be able to locate for C++ which implement many of the operations you may want to perform on your 10-byte (80-bit) integers. With C you would have to do the operations as function calls, because the language lacks operator overloading.
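As an illustration of the kind of helper code involved, a toy sketch (my own, not from any library) that adds two 10-byte little-endian unsigned integers stored as byte arrays:
#include <cstdint>
#include <cstddef>
#include <cstdio>

constexpr std::size_t kBytes = 10;  // an 80-bit integer

// dst = a + b, little-endian, wrapping on overflow past 80 bits.
void add80(const std::uint8_t a[kBytes], const std::uint8_t b[kBytes],
           std::uint8_t dst[kBytes])
{
    unsigned carry = 0;
    for (std::size_t i = 0; i < kBytes; ++i) {
        unsigned sum = a[i] + b[i] + carry;
        dst[i] = static_cast<std::uint8_t>(sum & 0xFF);
        carry = sum >> 8;
    }
}

int main()
{
    std::uint8_t a[kBytes] = {0xFF, 0xFF};  // 65535
    std::uint8_t b[kBytes] = {0x01};        // 1
    std::uint8_t r[kBytes] = {};
    add80(a, b, r);
    std::printf("%02x %02x %02x\n", r[2], r[1], r[0]);  // 01 00 00, i.e. 65536
}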
Your sizeof(ptr) evaluated to 4 because you are using a machine with 4-byte pointers (a 32-bit system). sizeof tells you nothing about the size of the data that a pointer points to. The only place where this gets tricky is when you use sizeof on an array's name, which behaves differently from using it on a pointer. I mention this because array names and pointers share so many similarities.
Because on your machine the size of a pointer is 4 bytes. Note that the type of the variable ptr is int *. You cannot get the full allocated size with the sizeof operator when you malloc or new the memory, because sizeof is a compile-time operator: its value is computed at compile time.
It is showing 4 bytes because a pointer on your platform is 4 bytes. The block of memory the pointer addresses may be of any arbitrary size; in your case it is 10 bytes. You need to create a data structure if you want to track that:
struct VariableInteger
{
    int *ptr;
    size_t size;
};
Also, using int as the type of ptr doesn't mean the language will allow you to do arithmetic operations on anything of a size different from the size of int on your platform.
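A hypothetical usage sketch of the struct above; make_variable_integer is my own name, not an established API:
#include <cstdlib>   // std::malloc, std::free
#include <cstddef>   // size_t

VariableInteger make_variable_integer(size_t bytes)
{
    VariableInteger v;
    v.ptr  = static_cast<int*>(std::malloc(bytes));  // caller must std::free(v.ptr)
    v.size = bytes;  // sizeof(v.ptr) stays 4 (or 8); v.size remembers the 10
    return v;
}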
Because the size of the pointer is 4. Try something like:
#include <cstdio>

typedef struct
{
    int a[10];
} big_int_t;

int main()
{
    big_int_t x;
    printf("%zu\n", sizeof(x));  // %zu is the correct printf format for size_t
}
Note also that an int is typically not 1 byte in size, so this will probably print 20 or 40, depending on your platform.
Integers in C++ are of a fixed size. Do you mean an array of integers? As for sizeof, the way you are using it, it tells you that your pointer is four bytes in size. It doesn't tell you the size of a dynamically allocated block.
Few or no compilers support 10-byte integer arithmetic. If you want to use integers bigger than the values specified in <limits.h>, you'll need to either find a library with support for big integers or make your own class which defines the mathematical operators.
I believe what you're looking for is known as "Arbitrary-precision arithmetic". It allows you to have numbers of any size and any number of decimals. Instead of using fixed-size assembly level math functions, these libraries are coded to do math how one would do them on paper.
Here's a link to a list of arbitrary-precision arithmetic libraries in a few different languages, compliments of Wikipedia: link.