ARRAYSIZE C++ macro: how does it work?

OK, I'm not entirely a newbie, but I cannot say I understand the following macro. The most confusing part is the division with the value cast to size_t: what on earth does that accomplish? Especially since I see a negation operator, which, as far as I know, might result in a zero value. Doesn't this mean that it can lead to a division-by-zero error? (By the way, the macro is correct and works beautifully.)
#define ARRAYSIZE(a) \
((sizeof(a) / sizeof(*(a))) / \
static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))

The first part (sizeof(a) / sizeof(*(a))) is fairly straightforward; it's dividing the size of the entire array (assuming you pass the macro an object of array type, and not a pointer), by the size of the first element. This gives the number of elements in the array.
The second part is not so straightforward. I think the potential division-by-zero is intentional; it will lead to a compile-time error if, for whatever reason, the size of the array is not an integer multiple of one of its elements. In other words, it's some kind of compile-time sanity check.
However, I can't see under what circumstances this could occur... As people have suggested in comments below, it will catch some misuse (like using ARRAYSIZE() on a pointer). It won't catch all errors like this, though.
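For instance, here is a minimal sketch (the struct and the sizes are illustrative, assuming a 64-bit platform where pointers are 8 bytes) of one misuse the check catches and one it misses:
struct Big { char data[144]; }; // hypothetical element type
void demo(Big *p, char *q) {
// ARRAYSIZE(p): sizeof(p) == 8, sizeof(*p) == 144, so 8 % 144 == 8,
// !(...) is 0, and the division by zero fails to compile. Caught.
// ARRAYSIZE(q): sizeof(q) == 8, sizeof(*q) == 1, so 8 % 1 == 0,
// the check passes, and ARRAYSIZE(q) silently yields 8. Not caught.
}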

I wrote this version of this macro. Consider the older version:
#include <iostream>
#include <sys/stat.h>
#define ARRAYSIZE(a) (sizeof(a) / sizeof(*(a)))
void foo(struct stat stats[32]); // forward declaration; defined below
int main(int argc, char *argv[]) {
struct stat stats[32];
std::cout << "sizeof stats = " << (sizeof stats) << "\n";
std::cout << "sizeof *stats = " << (sizeof *stats) << "\n";
std::cout << "ARRAYSIZE=" << ARRAYSIZE(stats) << "\n";
foo(stats);
}
void foo(struct stat stats[32]) {
std::cout << "sizeof stats = " << (sizeof stats) << "\n";
std::cout << "sizeof *stats = " << (sizeof *stats) << "\n";
std::cout << "ARRAYSIZE=" << ARRAYSIZE(stats) << "\n";
}
On a 64-bit machine, this code produces this output:
sizeof stats = 4608
sizeof *stats = 144
ARRAYSIZE=32
sizeof stats = 8
sizeof *stats = 144
ARRAYSIZE=0
What's going on? How did the ARRAYSIZE go from 32 to zero? Well, the problem is the function parameter is actually a pointer, even though it looks like an array. So inside of foo, "sizeof(stats)" is 8 bytes, and "sizeof(*stats)" is still 144.
With the new macro:
#define ARRAYSIZE(a) \
((sizeof(a) / sizeof(*(a))) / \
static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))
When sizeof(a) is not a multiple of sizeof(* (a)), the % is not zero, which the ! inverts, and then the static_cast evaluates to zero, causing a compile-time division by zero. So to the extent possible in a macro, this weird division catches the problem at compile-time.
PS: in C++17, just use std::size, see http://en.cppreference.com/w/cpp/iterator/size
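A minimal sketch of that C++17 alternative (the array and vector here are illustrative):
#include <iterator> // std::size
#include <vector>
int main() {
int arr[32];
std::vector<int> v(10);
static_assert(std::size(arr) == 32); // compile-time constant for built-in arrays
auto n = std::size(v); // also works for containers with a size() member
(void)n;
}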

The division at the end seems to be an attempt at detecting a non-array argument (e.g. pointer).
It fails to detect that for, for example, char*, but would work for T* where sizeof(T) is greater than the size of a pointer.
In C++, one usually prefers the following function template:
typedef ptrdiff_t Size;
template< class Type, Size n >
Size countOf( Type (&)[n] ) { return n; }
This function template can't be instantiated with pointer argument, only array. In C++11 it can alternatively be expressed in terms of std::begin and std::end, which automagically lets it work also for standard containers with random access iterators.
Limitations: doesn't work for array of local type in C++03, and doesn't yield compile time size.
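A sketch of that C++11 std::begin/std::end formulation (the name countOfRange is mine, not from the original):
#include <cstddef>
#include <iterator>
typedef std::ptrdiff_t Size;
template< class Container >
Size countOfRange( Container const& c )
{ return std::end( c ) - std::begin( c ); }
As noted, this needs random access iterators (end minus begin), and it yields a runtime value rather than a compile-time constant.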
For compile time size you can instead do like
template< Size n > struct Sizer { char elems[n]; };
template< class Type, Size n >
Sizer<n> countOf_( Type (&)[n] );
#define COUNT_OF( a ) sizeof( countOf_( a ).elems )
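A usage sketch, assuming the pieces above compile as intended: since COUNT_OF(a) is a sizeof expression, it is a compile-time constant and can size another array:
int a[17];
int b[COUNT_OF( a )]; // b has 17 elements
// countOf_ is never called at runtime; it only appears inside sizeof.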
Disclaimer: all code untouched by compiler's hands.
But in general, just use the first function template, countOf.
Cheers & hth.

suppose we have
T arr[42];
ARRAYSIZE(arr) will expand to (roughly)
sizeof (arr) / sizeof(*arr) / !(sizeof(arr) % sizeof(*arr))
which in this case gives 42 / !0, i.e. 42 / 1, which is 42
If for some reason the sizeof the array is not divisible by the sizeof its element, a division by zero will occur. When can that happen? For example, when you pass a pointer (say, to a dynamically allocated array) instead of a real array - provided the element size does not happen to divide the pointer size!

It does lead to a division-by-zero error (intentionally). The way that this macro works is it divides the size of the array in bytes by the size of a single array element in bytes. So if you have an array of int values, where an int is 4 bytes (on most 32-bit machines), an array of 4 int values would be 16 bytes.
So when you call this macro on such an array, it does sizeof(array) / sizeof(*array). And since 16 / 4 = 4, it returns that there are 4 elements in the array.
Note: *array dereferences the first element of the array and is equivalent to array[0].
The second part computes the size modulo the element size (the remainder of the division). Since any non-zero value is considered "true", applying the ! operator yields 0 when the remainder is non-zero, causing a division by zero (and a division by 1 otherwise).

The div-by-zero may be trying to catch alignment errors caused by whatever reason. Like if, with some compiler settings, the size of an array element were 3 but the compiler rounded it to 4 for faster array access, then an array of 4 entries would have the size of 16 and !(16 % 3) would go to zero, giving division-by-zero at compile time. Yet, I don't know of any compiler that does that, and it may be against the C++ specification for sizeof to return a size that differs from the size of that type in an array.

Coming late to the party here...
Google's C++ codebase has IMHO the definitive C++ implementation of the arraysize() macro, which includes several wrinkles that aren't considered here.
I cannot improve upon the source, which has clear and complete comments.


Why is this pointer null

In Visual Studio, it seems like pointers to member variables are 32-bit signed integers behind the scenes (even in 64-bit mode), and a null pointer is -1 in that context. So if I have a class like:
#include <iostream>
#include <climits>
struct Foo
{
char arr1[INT_MAX];
char arr2[INT_MAX];
char ch1;
char ch2;
};
int main()
{
auto p = &Foo::ch2;
std::cout << (p?"Not null":"null") << '\n';
}
It compiles, and prints "null". So, am I causing some kind of undefined behavior, or was the compiler supposed to reject this code and this is a bug in the compiler?
Edit:
It appears that I can keep the "2 INT_MAX arrays plus 2 chars" pattern and only in that case the compiler allows me to add as many members as I wish and the second character is always considered to be null. See demo. If I changed the pattern slightly (like 1 or 3 chars instead of 2 at some point) it complains that the class is too large.
The size limit of an object is implementation defined, per Annex B of the standard [1]. Your struct is of an absurd size.
If the struct is:
struct Foo
{
char arr1[INT_MAX];
//char arr2[INT_MAX];
char ch1;
char ch2;
};
... the size of your struct in a relatively recent version of 64-bit MSVC appears to be around 2147483649 bytes. If you then add in arr2, suddenly sizeof will tell you that Foo is of size 1.
The C++ standard (Annex B) states that the compiler must document limitations, which MSVC does [2]. It states that it follows the recommended limit. Annex B, Section 2.17 provides a recommended limit of 262144(?) for the size of an object. While it's clear that MSVC can handle more than that, it documents that it follows that minimum recommendation so I'd assume you should take care when your object size is more than that.
[1] http://eel.is/c++draft/implimits
[2] https://learn.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019
It's clearly a collision between an optimization on pointer-to-member representation (use only 4 bytes of storage when no virtual bases are present) and the pigeonhole principle.
For a type X containing N subobjects of type char, there are N+1 possible valid pointer-to-members of type char X::*... one for each subobject, and one for null-pointer-to-member.
This works when there are at least N+1 distinct values in the pointer-to-member representation, which for a 4-byte representation implies that N+1 <= 2^32 and therefore the maximum object size is 2^32 - 1.
Unfortunately the compiler in question made the maximum object-type size (before it rejects the program) equal to 2^32 which is one too large and creates a pigeonhole problem -- at least one pair of pointer-to-members must be indistinguishable. It's not necessary that the null pointer-to-member be one half of this pair, but as you've observed in this implementation it is.
The expression &Foo::ch2 is of type char Foo::*, which is a pointer to member of class Foo. By the rules, a pointer to member converted to bool should evaluate as false ONLY if it is a null member pointer, i.e. it had nullptr assigned to it.
The fault here appears to be an implementation flaw, i.e. on gcc compilers with -march=x86-64 any assigned pointer to member evaluates to non-null unless it had nullptr assigned to it, as the following code shows:
#include <climits>  // LLONG_MAX, ULLONG_MAX
#include <cstddef>  // offsetof
#include <iostream>
struct foo
{
char arr1[LLONG_MAX];
char arr2[LLONG_MAX];
char ch1;
char ch2;
};
int main()
{
char foo::* p1 = &foo::ch1;
char foo::* p2 = &foo::ch2;
std::cout << (p1?"Not null ":"null ") << '\n';
std::cout << (p2?"Not null ":"null ") << '\n';
std::cout << LLONG_MAX + LLONG_MAX << '\n';
std::cout << ULLONG_MAX << '\n';
std::cout << offsetof(foo, ch1) << '\n';
}
Output:
Not null
null
-2
18446744073709551615
18446744073709551614
Likely it's related to the fact that the class size exceeds platform limitations, leading to the offset of the member wrapping around 0 (the internal value of nullptr). The compiler doesn't detect it because it becomes a victim of... integer overflow with signed values; it's the programmer's fault for causing UB within the compiler by using signed literals as array sizes: LLONG_MAX + LLONG_MAX = -2 would be the "size" of the two arrays combined.
Essentially the size of the first two members is calculated as negative, and the offset of ch1 is -2, represented as the unsigned value 18446744073709551614.
And since that is not the null representation, the pointer is not null. Another compiler may clamp the value to 0, producing a nullptr, or actually detect the existing problem, as clang does.
If offset of ch1 is -2, then offset of ch2 is -1? Let's add this:
std::cout << static_cast<long long>(offsetof(foo, ch1)) << '\n';
std::cout << static_cast<long long>(offsetof(foo, ch2)) << '\n';
Additional output:
-2
-1
And the offset of the first member is obviously 0, so if member pointers represent offsets, another value is needed to represent nullptr. It's logical to assume that this particular compiler considers only -1 to be a null value, which may or may not be the case for other implementations.
When I test the code, Visual Studio reports that the class 'Foo' is too large.
When I add char arr3[INT_MAX], Visual Studio will report Error C2089 'Foo': 'struct' too large. Microsoft Docs explains it as "The specified structure or union exceeds the 4GB limit."

What does size_t mean in c++?

I'm just wondering should I use std::size_t for loops and stuff instead of int?
For instance:
#include <cstddef>
int main()
{
for (std::size_t i = 0; i < 10; ++i) {
// std::size_t OK here? Or should I use, say, unsigned int instead?
}
}
In general, what is the best practice regarding when to use std::size_t?
A good rule of thumb is to use it for anything that you need to compare in the loop condition against something that is naturally a std::size_t itself.
std::size_t is the type of any sizeof expression and is guaranteed to be able to express the maximum size of any object (including any array) in C++. By extension it is also guaranteed to be big enough for any array index, so it is a natural type for a loop by index over an array.
If you are just counting up to a number then it may be more natural to use either the type of the variable that holds that number or an int or unsigned int (if large enough) as these should be a natural size for the machine.
size_t is the result type of the sizeof operator.
Use size_t for variables that model size or index in an array. size_t conveys semantics: you immediately know it represents a size in bytes or an index, rather than just another integer.
Also, using size_t to represent a size in bytes helps making the code portable.
The size_t type is meant to specify the size of something so it's natural to use it, for example, getting the length of a string and then processing each character:
for (size_t i = 0, max = strlen (str); i < max; i++)
doSomethingWith (str[i]);
You do have to watch out for boundary conditions of course, since it's an unsigned type. The boundary at the top end is not usually that important since the maximum is usually large (though it is possible to get there). Most people just use an int for that sort of thing because they rarely have structures or arrays that get big enough to exceed the capacity of that int.
But watch out for things like:
for (size_t i = strlen (str) - 1; i >= 0; i--)
which will cause an infinite loop due to the wrapping behaviour of unsigned values (although I've seen compilers warn against this). This can also be alleviated by the (slightly harder to understand but at least immune to wrapping problems):
for (size_t i = strlen (str); i-- > 0; )
By shifting the decrement into a post-check side-effect of the continuation condition, this does the check for continuation on the value before decrement, but still uses the decremented value inside the loop (which is why the loop runs from len .. 1 rather than len-1 .. 0).
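A complete sketch of that pattern (the string is illustrative):
#include <cstdio>
#include <cstring>
int main() {
const char *str = "abc";
// The condition tests the old value before the decrement, so i never
// wraps inside the loop; in the body, i is already the decremented value.
for (size_t i = strlen(str); i-- > 0; )
printf("str[%zu] = %c\n", i, str[i]);
}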
By definition, size_t is the result of the sizeof operator. size_t was created to refer to sizes.
The number of times you do something (10, in your example) is not about sizes, so why use size_t? int, or unsigned int, should be ok.
Of course it is also relevant what you do with i inside the loop. If you pass it to a function which takes an unsigned int, for example, pick unsigned int.
In any case, I recommend to avoid implicit type conversions. Make all type conversions explicit.
short answer:
Almost never. Use the signed version ptrdiff_t or the non-standard ssize_t. In C++20, use std::ssize instead of std::size.
long answer:
Whenever you need to have a vector of char bigger than 2 GB on a 32-bit system. In every other use case, using a signed type is much safer than using an unsigned type.
example:
std::vector<A> data;
[...]
// calculate the index that should be used;
size_t i = calc_index(param1, param2);
// doing calculations close to the underflow of an integer is already dangerous
// do some bounds checking
if( i - 1 < 0 ) {
// always false, because 0-1 on unsigned creates an underflow
return LEFT_BORDER;
} else if( i >= data.size() - 1 ) {
// if i already had an underflow, this becomes true
return RIGHT_BORDER;
}
// now you have a bug that is very hard to track, because you never
// get an exception or anything anymore, to detect that you actually
// return the false border case.
return calc_something(data[i-1], data[i], data[i+1]);
The signed equivalent of size_t is ptrdiff_t, not int. But using int is still much better in most cases than size_t. ptrdiff_t is typically the pointer-sized signed type on 32- and 64-bit systems.
This means that you always have to convert to and from size_t whenever you interact with the std:: containers, which is not very beautiful. But at a GoingNative conference, the authors of C++ mentioned that designing std::vector with an unsigned size_t was a mistake.
If your compiler gives you warnings on implicit conversions from ptrdiff_t to size_t, you can make it explicit with constructor syntax:
calc_something(data[size_t(i-1)], data[size_t(i)], data[size_t(i+1)]);
If you just want to iterate a collection, without bounds checking, use a range-based for:
for(const auto& d : data) {
[...]
}
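For index-based loops, here is a sketch of the signed style advocated above (std::ssize is C++20; the function and names are illustrative):
#include <cstddef>
#include <iterator> // std::ssize
#include <vector>
long long sumAdjacent(const std::vector<int>& data) {
long long sum = 0;
for (std::ptrdiff_t i = 0; i < std::ssize(data); ++i) {
if (i - 1 < 0) continue; // behaves as expected: no underflow
sum += data[std::size_t(i - 1)] + data[std::size_t(i)];
}
return sum;
}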
Here are some words from Bjarne Stroustrup (the creator of C++) at GoingNative:
For some people this signed/unsigned design error in the STL is reason enough not to use std::vector, but instead their own implementation.
size_t is a very readable way to specify the size dimension of an item - length of a string, amount of bytes a pointer takes, etc.
It's also portable across platforms - you'll find that 64-bit and 32-bit both behave nicely with system functions and size_t - something that unsigned int might not do (e.g. consider when you should use unsigned long).
Use std::size_t for indexing/counting C-style arrays.
For STL containers, you'll have (for example) vector<int>::size_type, which should be used for indexing and counting vector elements.
In practice, they are usually both unsigned ints, but it isn't guaranteed, especially when using custom allocators.
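A minimal sketch using the container's own size_type (the function is illustrative):
#include <vector>
void zeroAll(std::vector<int>& v) {
for (std::vector<int>::size_type i = 0; i < v.size(); ++i)
v[i] = 0;
}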
Soon most computers will be 64-bit architectures with 64-bit OSes running programs operating on containers of billions of elements. Then you must use size_t instead of int as the loop index, otherwise your index will wrap around at the 2^32nd element, on both 32- and 64-bit systems.
Prepare for the future!
size_t is returned by various libraries to indicate the size of a container. You use it when you get one back.
However, in the your example above looping on a size_t is a potential bug. Consider the following:
for (size_t i = thing.size(); i >= 0; --i) {
// this will never terminate because size_t is an unsigned type,
// which cannot be negative by definition,
// therefore i will always be >= 0
printf("the never ending story. la la la la");
}
the use of unsigned integers has the potential to create these types of subtle issues. Therefore imho I prefer to use size_t only when I interact with containers/types that require it.
When using size_t be careful with the following expression
size_t i = container.find("mytoken");
size_t x = 99;
if (i-x>-1 && i+x < container.size()) {
cout << container[i-x] << " " << container[i+x] << endl;
}
You will get false in the if expression regardless of what value you have for x: i-x is unsigned, so the -1 is converted to the largest unsigned value and i-x>-1 can never hold.
It took me several days to realize this (the code is so simple that I did not do unit tests), although it only took a few minutes to figure out the source of the problem. Not sure whether it is better to do a cast or use zero.
if ((int)(i-x) > -1)
The cast works; note that plain (i-x) >= 0 would not, since it is always true for an unsigned type. Here is my test run:
size_t i = 5;
cerr << "i-7=" << i-7 << " (int)(i-7)=" << (int)(i-7) << endl;
The output: i-7=18446744073709551614 (int)(i-7)=-2
I would like others' comments.
It is often better not to use size_t in a loop. For example,
vector<int> a = {1,2,3,4};
for (size_t i=0; i<a.size(); i++) {
std::cout << a[i] << std::endl;
}
size_t n = a.size();
for (size_t i=n-1; i>=0; i--) {
std::cout << a[i] << std::endl;
}
The first loop is ok. But for the second loop:
When i=0, i-- wraps around to ULLONG_MAX (assuming size_t = unsigned long long), so i>=0 still holds and the loop never terminates, which is not what you want.
Moreover, if a is empty then n=0 and n-1=ULLONG_MAX which is not good either.
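One way to sidestep both problems is to iterate backwards with reverse iterators; a minimal sketch:
#include <iostream>
#include <vector>
int main() {
std::vector<int> a = {1, 2, 3, 4};
// No unsigned arithmetic at all; an empty vector simply skips the loop.
for (auto it = a.rbegin(); it != a.rend(); ++it)
std::cout << *it << std::endl;
}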
size_t is an unsigned type that can hold the maximum size value for your architecture, so it is protected from overflows due to sign (signed int 0x7FFFFFFF incremented by 1 will give you -1) or short size (unsigned short int 0xFFFF incremented by 1 will give you 0).
It is mainly used in array indexing/loops/address arithmetic and so on. Functions like memset() and alike accept size_t only, because theoretically you may have a block of memory of size 2^32-1 (on 32bit platform).
For such simple loops don't bother and use just int.
I have been struggling myself to understand what it is and when to use it. But size_t is just an unsigned integral data type which is defined in various header files such as <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>, <time.h>, <wchar.h> etc.
It is used to represent the size of objects in bytes, hence it's used as the return type by the sizeof operator. The maximum permissible size is dependent on the compiler; on a typical 32-bit platform it is simply a typedef (alias) for unsigned int, while on a typical 64-bit platform it is a typedef for unsigned long long. The size_t data type is never negative (ssize_t is its signed counterpart).
Therefore many C library functions like malloc, memcpy and strlen declare their arguments and return type as size_t.
// Declaration of various standard library functions.
// Here argument of 'n' refers to maximum blocks that can be
// allocated which is guaranteed to be non-negative.
void *malloc(size_t n);
// While copying 'n' bytes from 's2' to 's1'
// n must be non-negative integer.
void *memcpy(void *s1, void const *s2, size_t n);
// the size of any string or `std::vector<char> st;` will always be at least 0.
size_t strlen(char const *s);
size_t or any unsigned type might be seen used as a loop variable, since loop variables are typically greater than or equal to 0.
size_t is an unsigned integral type that can represent the size of the largest object on your system.
Only use it if you need very large arrays, matrices, etc.
Some functions return a size_t, and your compiler will warn you if you try to do signed/unsigned comparisons.
Avoid that by using the appropriate signed/unsigned datatype, or simply typecast for a quick hack.
size_t is an unsigned integer type, so you can use it whenever you want an unsigned integer.
I use it when I want to specify the size of an array, a counter, etc.
void * operator new (size_t size); is a good use of it.

Size of std::array, std::vector and raw array

Let's say we have:
std::array <int,5> STDarr;
std::vector <int> VEC(5);
int RAWarr[5];
I tried to get size of them as,
std::cout << sizeof(STDarr) + sizeof(int) * STDarr.max_size() << std::endl;
std::cout << sizeof(VEC) + sizeof(int) * VEC.capacity() << std::endl;
std::cout << sizeof(RAWarr) << std::endl;
The outputs are,
40
20
40
Are these calculations correct? Considering I don't have enough memory for std::vector and no way of escaping dynamic allocation, what should I use? If I knew that std::array results in a lower memory requirement, I could change the program to make the array static.
These numbers are wrong. Moreover, I don't think they represent what you think they represent, either. Let me explain.
First the part about them being wrong. You, unfortunately, don't show the value of sizeof(int) so we must derive it. On the system you are using the size of an int can be computed as
size_t sizeof_int = sizeof(RAWarr) / 5; // => sizeof(int) == 8
because this is essentially the definition of sizeof(T): it is the number of bytes between the starts of two adjacent objects of type T in an array. This happens to be inconsistent with the number printed for STDarr: the class template std::array<T, n> is specified to have an array of n objects of type T embedded into it. Moreover, std::array<T, n>::max_size() is a constant expression yielding n. That is, we have:
40 // is identical to
sizeof(STDarr) + sizeof(int) * STDarr.max_size() // is bigger or equal to
sizeof(RAWarr) + sizeof_int * 5 // is identical to
40 + 40 // is identical to
80
That is, 40 >= 80 - a contradiction.
Similarly, the second computation is also inconsistent with the third computation: the std::vector<int> holds at least 5 elements and the capacity() has to be at least as big as the size(). Moreover, the std::vector<int>'s own size is non-zero. That is, the following always has to be true:
sizeof(RAWarr) < sizeof(VEC) + sizeof(int) * VEC.capacity()
Anyway, all this is pretty much irrelevant to what your actual question seems to be: What is the overhead of representing n objects of type T using a built-in array of T, an std::array<T, n>, and an std::vector<T>? The answer to this question is:
A built-in array T[n] uses sizeof(T) * n.
An std::array<T, n> uses the same size as a T[n].
A std::vector<T>(n) needs some control data (the size, the capacity, and possibly an allocator) plus at least n * sizeof(T) bytes to represent its actual data. It may choose to have a capacity() which is bigger than n.
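To make the distinction concrete, a minimal sketch (the printed numbers are illustrative and implementation-dependent):
#include <array>
#include <iostream>
#include <vector>
int main() {
std::cout << sizeof(int[5]) << '\n'; // e.g. 20
std::cout << sizeof(std::array<int, 5>) << '\n'; // same as int[5], e.g. 20
std::cout << sizeof(std::vector<int>) << '\n'; // control data only, e.g. 24;
// the 5 ints themselves live in a separate heap allocation
}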
In addition to these numbers, actually using any of these data structures may require addition memory:
All objects are aligned at an appropriate address. For this there may be additional bytes in front of the object.
When the object is allocated on the heap, the memory management system may include a couple of bytes in addition to the memory made available. This may be just a word with the size but it may be whatever the allocation mechanism fancies. Also, this memory may live somewhere else than the allocated memory, e.g. in a hash table somewhere.
OK, I hope this provided some insight. However, here comes the important message: if std::vector<T> isn't capable of holding the amount of data you have there are two situations:
You have extremely low memory and most of this discussion is futile because you need entirely different approaches to cope with the few bytes you have. This would be the case if you are working on extremely resource constrained embedded systems.
You have too much data and using T[n] or std::array<T, n> won't be of much help because the overhead we are talking of is typically less than 32 bytes.
Maybe you can describe what you are actually trying to do and why std::vector<T> is not an option.

Macro definition ARRAY_SIZE

I encountered the following macro definition when reading the globals.h in the Google V8 project.
// The expression ARRAY_SIZE(a) is a compile-time constant of type
// size_t which represents the number of elements of the given
// array. You should only use ARRAY_SIZE on statically allocated
// arrays.
#define ARRAY_SIZE(a) \
((sizeof(a) / sizeof(*(a))) / \
static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))
My question is the latter part: static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))). One thing in my mind is the following: since the latter part will always evaluate to 1, which is of type size_t, the whole expression will be promoted to size_t.
If this assumption is correct, then there comes another question: since the return type of sizeof operator is size_t, why is such a promotion necessary? What's the benefit of defining a macro in this way?
As explained, this is a feeble (*) attempt to secure the macro against use with pointers (rather than true arrays) where it would not correctly assess the size of the array. This of course stems from the fact that macros are pure text-based manipulations and have no notion of AST.
Since the question is also tagged C++, I would like to point out that C++ offers a type-safe alternative: templates.
#ifdef __cplusplus
template <size_t N> struct ArraySizeHelper { char _[N]; };
template <typename T, size_t N>
ArraySizeHelper<N> makeArraySizeHelper(T(&)[N]);
# define ARRAY_SIZE(a) sizeof(makeArraySizeHelper(a))
#else
# // C definition as shown in Google's code
#endif
Alternatively, one will soon be able to use constexpr:
template <typename T, size_t N>
constexpr size_t size(T (&)[N]) { return N; }
However my favorite compiler (Clang) still does not implement them :x
In both cases, because the function does not accept pointer parameters, you get a compile-time error if the type is not right.
(*) feeble in that it does not work for small objects where the size of the objects is a divisor of the size of a pointer.
Just a demonstration that it is a compile-time value:
#include <cstddef>
#include <iostream>
template <size_t N> void print() { std::cout << N << "\n"; }
int main() {
int a[5];
print<ARRAY_SIZE(a)>();
}
See it in action on IDEONE.
latter part will always evaluates to 1, which is of type size_t,
Ideally the latter part will evaluate to bool (i.e. true/false) and using static_cast<>, it's converted to size_t.
why such promotion is necessary? What's the benefit of defining a
macro in this way?
I don't know if this is the ideal way to define a macro. However, one inspiration I find is in the comments: // You should only use ARRAY_SIZE on statically allocated arrays.
Suppose someone passes a pointer; then it will fail for struct data types (if the struct is larger than a pointer):
struct S { int i, j, k, l; };
S *p = new S[10];
ARRAY_SIZE(p); // compile time failure !
[Note: This technique may not show any error for int*, char* as said.]
If sizeof(a) % sizeof(*a) has some remainder (i.e. sizeof(a) is not an integral multiple of sizeof(*a)) then the ! expression evaluates to 0 and the compiler gives you a division-by-zero error at compile time.
I can only assume the author of the macro was burned in the past by something that didn't pass that test.
In the Linux kernel, the macro is defined as (GCC specific):
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
where __must_be_array() is
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
and __same_type() is
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
The second part wants to ensure that sizeof( a ) is divisible by sizeof( *a ).
Thus the (sizeof(a) % sizeof(*(a))) part. If it's divisible, the expression will evaluate to 0. Here comes the ! part - !(0) will give true. That's why the cast is needed. Actually, this does not affect the calculation of the size, it just adds a compile-time check.
Since it's evaluated at compile time, in case (sizeof(a) % sizeof(*(a))) is not 0, you'll have a compile-time error for division by 0.

What does sizeof do?

What is the main function of sizeof (I am new to C++). For instance
int k=7;
char t='Z';
What do sizeof (k) or sizeof (int) and sizeof (char) mean?
sizeof(x) returns the amount of memory (in bytes) that the variable or type x occupies. It has nothing to do with the value of the variable.
For example, if you have an array of some arbitrary type T then the distance between elements of that array is exactly sizeof(T).
#include <cassert>
int a[10];
// Pointer arithmetic counts in elements, so compare byte addresses
// to show that the distance is exactly sizeof(int) bytes:
assert((char*)&(a[0]) + sizeof(int) == (char*)&(a[1]));
When used on a variable, it is equivalent to using it on the type of that variable:
T x;
assert(sizeof(T) == sizeof(x));
As a rule-of-thumb, it is best to use the variable name where possible, just in case the type changes:
int x;
std::cout << "x uses " << sizeof(x) << " bytes." << std::endl
// If x is changed to a char, then the statement doesn't need to be changed.
// If we used sizeof(int) instead, we would need to change 2 lines of code
// instead of one.
When used on user-defined types, sizeof still returns the amount of memory used by instances of that type, but it's worth pointing out that this does not necessary equal the sum of its members.
struct Foo { int a; char b; };
While sizeof(int) + sizeof(char) is typically 5, on many machines, sizeof(Foo) may be 8 because the compiler needs to pad out the structure so that it lies on 4 byte boundaries. This is not always the case, and it's quite possible that on your machine sizeof(Foo) will be 5, but you can't depend on it.
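A minimal sketch that makes the padding visible (the exact numbers are platform-dependent):
#include <cstddef> // offsetof
#include <iostream>
struct Foo { int a; char b; };
int main() {
// With a 4-byte int and 4-byte alignment this typically prints 4 1 8:
// three bytes of tail padding follow b.
std::cout << sizeof(int) << ' ' << sizeof(char) << ' ' << sizeof(Foo) << '\n';
std::cout << offsetof(Foo, b) << '\n'; // typically 4
}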
To add to Peter Alexander's answer: sizeof yields the size of a value or type in multiples of the size of a char---char being defined as the smallest unit of memory addressable (by C or C++) for a given architecture (and, in C++ at least, at least 8 bits in size according to the standard). This is what's generally meant by "bytes" (smallest addressable unit for a given architecture) but it never hurts to clarify, and there are occasionally questions about the variability of sizeof (char), which is of course always 1.
sizeof() returns the size of the argument passed to it.
sizeof() cpp reference
sizeof is a compile-time unary operator that returns the size of a data type.
For example:
sizeof(int)
will return the size of an int in bytes.
Also remember that type sizes are platform dependent.
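A quick sketch to check the sizes on your own platform (the values in the comment are typical 64-bit output; only sizeof(char) == 1 is guaranteed):
#include <iostream>
int main() {
// Typical 64-bit output: 1 4 8 8
std::cout << sizeof(char) << ' ' << sizeof(int) << ' '
<< sizeof(long long) << ' ' << sizeof(void*) << '\n';
}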
Check this page for more details: sizeof in C/C++