Generating BitCount LUT at compile time - c++

Let's say that I need to create a LUT containing precomputed bit count values (count of 1 bits in a number) for 0...255 values:
int CB_LUT[256] = {0, 1, 1, 2, ... 7, 8};
If I don't want to use hard-coded values, I can use the template solution from "How to count the number of set bits in a 32-bit integer?":
template <int BITS>
int CountBits(int val)
{
return (val & 0x1) + CountBits<BITS-1>(val >> 1);
}
template<>
int CountBits<1>(int val)
{
return val & 0x1;
}
int CB_LUT[256] = {CountBits<8>(0), CountBits<8>(1) ... CountBits<8>(255)};
This array is computed completely at compile time. Is there any way to avoid the long list and generate such an array using some kind of template or even macros (sorry!), something like:
Generate(CB_LUT, 0, 255); // array declaration
...
cout << CB_LUT[255]; // should print 8
Notes. This question is not about counting 1 bits in a number; that is used just as an example. I want to generate such an array completely in the code, without using external code generators. The array must be generated at compile time.
Edit.
To overcome compiler limits, I found the following solution, based on Bartek Banachewicz's code:
#define MACRO(z,n,text) CountBits<8>(n)
int CB_LUT[] = {
BOOST_PP_ENUM(128, MACRO, _)
};
#undef MACRO
#define MACRO(z,n,text) CountBits<8>(n+128)
int CB_LUT2[] = {
BOOST_PP_ENUM(128, MACRO, _)
};
#undef MACRO
for(int i = 0; i < 256; ++i) // use only CB_LUT
{
cout << CB_LUT[i] << endl;
}
I know that this is possibly UB...

It would be fairly easy with macros using (recently re-discovered by me for my code) Boost.Preprocessor - I am not sure if it falls under "without using external code generators".
PP_ENUM version
Thanks to @TemplateRex for BOOST_PP_ENUM; as I said, I am not very experienced with PP yet :)
#include <boost/preprocessor/repetition/enum.hpp>
// with ENUM we don't need a comma at the end
#define MACRO(z,n,text) CountBits<8>(n)
int CB_LUT[256] = {
BOOST_PP_ENUM(256, MACRO, _)
};
#undef MACRO
The main difference with PP_ENUM is that it automatically adds the comma after each element and strips the last one.
PP_REPEAT version
#include <boost/preprocessor/repetition/repeat.hpp>
#define MACRO(z,n,data) CountBits<8>(n),
int CB_LUT[256] = {
BOOST_PP_REPEAT(256, MACRO, _)
};
#undef MACRO
Remarks
It's actually very straightforward and easy to use, though it's up to you to decide whether you will accept macros. I've personally struggled a lot with Boost.MPL and template techniques, and I find PP solutions easy to read, short and powerful, especially for enumerations like these. An additional important advantage of PP over TMP is compilation time.
As for the comma at the end, all reasonable compilers should support it, but in case yours doesn't, simply change the number of repetitions to 255 and add last case by hand.
You might also want to rename MACRO to something meaningful to avoid possible redefinitions.

I like to do it like this:
#define MYLIB_PP_COUNT_BITS(z, i, data) \
CountBits< 8 >(i)
int CB_LUT[] = {
BOOST_PP_ENUM(256, MYLIB_PP_COUNT_BITS, ~)
};
#undef MYLIB_PP_COUNT_BITS
The difference from BOOST_PP_REPEAT is that BOOST_PP_ENUM generates a comma-separated sequence of values, so there is no need to worry about commas and last-case behavior.
Furthermore, it is recommended to make your macros really loud and obnoxious by using a NAMESPACE_PP_FUNCTION naming scheme.
A small configuration detail is to omit the [256] in favor of [] in the array size, so that you can more easily modify it later.
Finally, I would recommend making your CountBits function template constexpr so that you can also initialize const arrays.
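The constexpr suggestion can be taken one step further. As a minimal sketch, assuming a C++14 (or later) compiler, the whole table can be filled by a constexpr function containing an ordinary loop, with no macros and no Boost. The names BitCountTable, count_bits and make_bit_count_table are hypothetical and chosen only for this illustration; the table is wrapped in a struct because a function cannot return a raw array.

```cpp
#include <cstddef>

// Hypothetical wrapper type: lets a constexpr function return the table by value.
struct BitCountTable {
    int values[256];
};

// Counts the 1 bits of v with a plain loop (allowed in constexpr since C++14).
constexpr int count_bits(unsigned v) {
    int n = 0;
    while (v != 0) {
        n += static_cast<int>(v & 1u);
        v >>= 1;
    }
    return n;
}

// Builds all 256 entries during constant evaluation.
constexpr BitCountTable make_bit_count_table() {
    BitCountTable t{};
    for (std::size_t i = 0; i < 256; ++i) {
        t.values[i] = count_bits(static_cast<unsigned>(i));
    }
    return t;
}

// Declaring the object constexpr forces the initializer to be a constant expression.
constexpr BitCountTable CB_LUT = make_bit_count_table();

static_assert(CB_LUT.values[0] == 0, "0 has no bits set");
static_assert(CB_LUT.values[255] == 8, "255 has all eight bits set");
```

Lookups then read CB_LUT.values[i]. The static_assert lines only compile if the table really was produced at compile time.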

Related

How do I determine the size of my array in C?
That is, the number of elements the array can hold?
Executive summary:
int a[17];
size_t n = sizeof(a)/sizeof(a[0]);
Full answer:
To determine the size of your array in bytes, you can use the sizeof
operator:
int a[17];
size_t n = sizeof(a);
On my computer, ints are 4 bytes long, so n is 68.
To determine the number of elements in the array, we can divide
the total size of the array by the size of the array element.
You could do this with the type, like this:
int a[17];
size_t n = sizeof(a) / sizeof(int);
and get the proper answer (68 / 4 = 17), but if the type of
a changed you would have a nasty bug if you forgot to change
the sizeof(int) as well.
So the preferred divisor is sizeof(a[0]) or the equivalent sizeof(*a), the size of the first element of the array.
int a[17];
size_t n = sizeof(a) / sizeof(a[0]);
Another advantage is that you can now easily parameterize
the array name in a macro and get:
#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))
int a[17];
size_t n = NELEMS(a);
The sizeof way is the right way iff you are dealing with arrays not received as parameters. An array sent as a parameter to a function is treated as a pointer, so sizeof will return the pointer's size, instead of the array's.
Thus, inside functions this method does not work. Instead, always pass an additional parameter size_t size indicating the number of elements in the array.
Test:
#include <stdio.h>
#include <stdlib.h>
void printSizeOf(int intArray[]);
void printLength(int intArray[]);
int main(int argc, char* argv[])
{
int array[] = { 0, 1, 2, 3, 4, 5, 6 };
printf("sizeof of array: %d\n", (int) sizeof(array));
printSizeOf(array);
printf("Length of array: %d\n", (int)( sizeof(array) / sizeof(array[0]) ));
printLength(array);
}
void printSizeOf(int intArray[])
{
printf("sizeof of parameter: %d\n", (int) sizeof(intArray));
}
void printLength(int intArray[])
{
printf("Length of parameter: %d\n", (int)( sizeof(intArray) / sizeof(intArray[0]) ));
}
Output (in a 64-bit Linux OS):
sizeof of array: 28
sizeof of parameter: 8
Length of array: 7
Length of parameter: 2
Output (in a 32-bit windows OS):
sizeof of array: 28
sizeof of parameter: 4
Length of array: 7
Length of parameter: 1
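The advice to pass the element count explicitly can be sketched as follows. This is only an illustration, written so that it compiles as C++ while using the same sizeof idiom as the C code above; sum_elements and callee_sizeof are hypothetical helper names.

```cpp
#include <cstddef>

// Hypothetical helper: the element count travels alongside the pointer,
// because the callee has no way to recover it from `values` alone.
long sum_elements(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Hypothetical helper: shows what sizeof can measure inside a callee.
// `values` is a pointer here, so this is the size of a pointer,
// regardless of how large the caller's array is.
std::size_t callee_sizeof(const int* values) {
    return sizeof(values);
}
```

The caller, which still sees the real array, computes the count once, for example sum_elements(array, sizeof(array) / sizeof(array[0])), and hands it down.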
It is worth noting that sizeof doesn't help when dealing with an array value that has decayed to a pointer: even though it points to the start of an array, to the compiler it is the same as a pointer to a single element of that array. A pointer does not "remember" anything else about the array that was used to initialize it.
int a[10];
int* p = a;
assert(sizeof(a) / sizeof(a[0]) == 10);
assert(sizeof(p) == sizeof(int*));
assert(sizeof(*p) == sizeof(int));
The sizeof "trick" is the best way I know, with one small but (to me, this being a major pet peeve) important change in the use of parentheses.
As the Wikipedia entry makes clear, C's sizeof is not a function; it's an operator. Thus, it does not require parentheses around its argument, unless the argument is a type name. This is easy to remember, since it makes the argument look like a cast expression, which also uses parentheses.
So: If you have the following:
int myArray[10];
You can find the number of elements with code like this:
size_t n = sizeof myArray / sizeof *myArray;
That, to me, reads a lot easier than the alternative with parentheses. I also favor use of the asterisk in the right-hand part of the division, since it's more concise than indexing.
Of course, this is all compile-time too, so there's no need to worry about the division affecting the performance of the program. So use this form wherever you can.
It is always best to use sizeof on an actual object when you have one, rather than on a type, since then you don't need to worry about making an error and stating the wrong type.
For instance, say you have a function that outputs some data as a stream of bytes, for instance across a network. Let's call the function send(), and make it take as arguments a pointer to the object to send, and the number of bytes in the object. So, the prototype becomes:
void send(const void *object, size_t size);
And then you need to send an integer, so you code it up like this:
int foo = 4711;
send(&foo, sizeof (int));
Now, you've introduced a subtle way of shooting yourself in the foot, by specifying the type of foo in two places. If one changes but the other doesn't, the code breaks. Thus, always do it like this:
send(&foo, sizeof foo);
Now you're protected. Sure, you duplicate the name of the variable, but that has a high probability of breaking in a way the compiler can detect, if you change it.
int size = (&arr)[1] - arr;
Check out this link for explanation
I would advise never using sizeof directly (even where it can be used) to get either of the two different sizes of an array, in number of elements or in bytes, which are the last two cases I show here. For each of the two sizes, the macros shown below can be used to make it safer. The reason is to make the intention of the code obvious to maintainers, and to distinguish sizeof(ptr) from sizeof(arr) at first glance (which, written this way, isn't obvious), so that bugs are then obvious to everyone reading the code.
TL;DR:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + must_be_array(arr))
#define ARRAY_BYTES(arr) (sizeof(arr) + must_be_array(arr))
must_be_array(arr) (defined below) IS needed as -Wsizeof-pointer-div is buggy (as of april/2020):
#define is_same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define is_array(arr) (!is_same_type((arr), &(arr)[0]))
#define must_be(e) \
( \
0 * (int)sizeof( \
struct { \
static_assert(e); \
char ISO_C_forbids_a_struct_with_no_members__; \
} \
) \
)
#define must_be_array(arr) must_be(is_array(arr))
There have been important bugs regarding this topic: https://lkml.org/lkml/2015/9/3/428
I disagree with the solution that Linus provides, which is to never use array notation for parameters of functions.
I like array notation as documentation that a pointer is being used as an array. But that means that a fool-proof solution needs to be applied so that it is impossible to write buggy code.
From an array we have three sizes which we might want to know:
The size of the elements of the array
The number of elements in the array
The size in bytes that the array uses in memory
The size of the elements of the array
The first one is very simple, and it doesn't matter if we are dealing with an array or a pointer, because it's done the same way.
Example of usage:
void foo(size_t nmemb, int arr[nmemb])
{
qsort(arr, nmemb, sizeof(arr[0]), cmp);
}
qsort() needs this value as its third argument.
For the other two sizes, which are the topic of the question, we want to make sure that we're dealing with an array, and break the compilation if not, because if we're dealing with a pointer, we will get wrong values. When the compilation is broken, we will be able to easily see that we weren't dealing with an array, but with a pointer instead, and we will just have to write the code with a variable or a macro that stores the size of the array behind the pointer.
The number of elements in the array
This one is the most common, and many answers have provided you with the typical macro ARRAY_SIZE:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
Recent versions of compilers, such as GCC 8, will warn you when you apply this macro to a pointer, so it is safe (there are other methods to make it safe with older compilers).
It works by dividing the size in bytes of the whole array by the size of each element.
Examples of usage:
void foo(size_t nmemb)
{
char buf[nmemb];
fgets(buf, ARRAY_SIZE(buf), stdin);
}
void bar(size_t nmemb)
{
int arr[nmemb];
for (size_t i = 0; i < ARRAY_SIZE(arr); i++)
arr[i] = i;
}
If these functions didn't use arrays, but got them as parameters instead, the former code would not compile, so it would be impossible to have a bug (given that a recent compiler version is used, or that some other trick is used), and we need to replace the macro call by the value:
void foo(size_t nmemb, char buf[nmemb])
{
fgets(buf, nmemb, stdin);
}
void bar(size_t nmemb, int arr[nmemb])
{
for (size_t i = nmemb - 1; i < nmemb; i--)
arr[i] = i;
}
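If the code happens to be compiled as C++ rather than C, the same "refuse pointers at compile time" guarantee is available without compiler extensions. This is a different technique from the macros in this answer, shown only as a sketch: a function template that takes a reference to an array can only bind to a real array, so passing a pointer is a compile error. The names array_size and array_bytes are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper, C++ only: the parameter is a reference to an array of
// N elements, so N is deduced from the argument's type. A pointer argument
// does not match and the call fails to compile.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) noexcept {
    return N;
}

// Hypothetical helper: size of the whole array in bytes, with the same guarantee.
template <typename T, std::size_t N>
constexpr std::size_t array_bytes(const T (&)[N]) noexcept {
    return N * sizeof(T);
}

int sample[17];
static_assert(array_size(sample) == 17, "element count comes from the array type");
static_assert(array_bytes(sample) == 17 * sizeof(int), "bytes = count * element size");
```

Inside a function that received int arr[] as a parameter, array_size(arr) would not compile, which is the same outcome the must_be_array() construction aims for in C.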
The size in bytes that the array uses in memory
ARRAY_SIZE is commonly used as a solution to the previous case, but this case is rarely written safely, maybe because it's less common.
The common way to get this value is to use sizeof(arr). The problem: the same as with the previous one; if you have a pointer instead of an array, your program will go nuts.
The solution to the problem involves using the same macro as before, which we know to be safe (it breaks compilation if it is applied to a pointer):
#define ARRAY_BYTES(arr) (sizeof((arr)[0]) * ARRAY_SIZE(arr))
How it works is very simple: it undoes the division that ARRAY_SIZE does, so after mathematical cancellations you end up with just one sizeof(arr), but with the added safety of the ARRAY_SIZE construction.
Example of usage:
void foo(size_t nmemb)
{
int arr[nmemb];
memset(arr, 0, ARRAY_BYTES(arr));
}
memset() needs this value as its third argument.
As before, if the array is received as a parameter (a pointer), it won't compile, and we will have to replace the macro call by the value:
void foo(size_t nmemb, int arr[nmemb])
{
memset(arr, 0, sizeof(arr[0]) * nmemb);
}
Update (23/apr/2020): -Wsizeof-pointer-div is buggy:
Today I found out that the new warning in GCC only works if the macro is defined in a header that is not a system header. If you define the macro in a header that is installed in your system (usually /usr/local/include/ or /usr/include/) (#include <foo.h>), the compiler will NOT emit a warning (I tried GCC 9.3.0).
So we have #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0])) and want to make it safe. We will need C2X static_assert() and some GCC extensions: Statements and Declarations in Expressions, __builtin_types_compatible_p:
#include <assert.h>
#define is_same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define is_array(arr) (!is_same_type((arr), &(arr)[0]))
#define Static_assert_array(arr) static_assert(is_array(arr))
#define ARRAY_SIZE(arr) \
({ \
Static_assert_array(arr); \
sizeof(arr) / sizeof((arr)[0]); \
})
Now ARRAY_SIZE() is completely safe, and therefore all its derivatives will be safe.
Update: libbsd provides __arraycount():
Libbsd provides the macro __arraycount() in <sys/cdefs.h>, which is unsafe because it lacks a pair of parentheses, but we can add those parentheses ourselves, and therefore we don't even need to write the division in our header (why would we duplicate code that already exists?). That macro is defined in a system header, so if we use it we are forced to use the macros above.
#include <assert.h>
#include <stddef.h>
#include <sys/cdefs.h>
#include <sys/types.h>
#define is_same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define is_array(arr) (!is_same_type((arr), &(arr)[0]))
#define Static_assert_array(arr) static_assert(is_array(arr))
#define ARRAY_SIZE(arr) \
({ \
Static_assert_array(arr); \
__arraycount((arr)); \
})
#define ARRAY_BYTES(arr) (sizeof((arr)[0]) * ARRAY_SIZE(arr))
Some systems provide nitems() in <sys/param.h> instead, and some systems provide both. You should check your system, and use the one you have, and maybe use some preprocessor conditionals for portability and support both.
Update: Allow the macro to be used at file scope:
Unfortunately, the ({}) gcc extension cannot be used at file scope.
To be able to use the macro at file scope, the static assertion must be
inside sizeof(struct {}). Then, multiply it by 0 to not affect
the result. A cast to (int) might be good to simulate a function
that returns (int)0 (in this case it is not necessary, but then it
is reusable for other things).
Additionally, the definition of ARRAY_BYTES() can be simplified a bit.
#include <assert.h>
#include <stddef.h>
#include <sys/cdefs.h>
#include <sys/types.h>
#define is_same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define is_array(arr) (!is_same_type((arr), &(arr)[0]))
#define must_be(e) \
( \
0 * (int)sizeof( \
struct { \
static_assert(e); \
char ISO_C_forbids_a_struct_with_no_members__; \
} \
) \
)
#define must_be_array(arr) must_be(is_array(arr))
#define ARRAY_SIZE(arr) (__arraycount((arr)) + must_be_array(arr))
#define ARRAY_BYTES(arr) (sizeof(arr) + must_be_array(arr))
Notes:
This code makes use of the following extensions, which are necessary to achieve safety. If your compiler doesn't have them, or some similar ones, then you can't achieve this level of safety.
__builtin_types_compatible_p()
typeof()
I also make use of the following C2X feature. When compiling against an older standard, its absence can be worked around with some dirty tricks (see for example: What is “:-!!” in C code?). In C11 you also have static_assert(), but it requires a message.
static_assert()
You can use the sizeof operator, but it will not work inside a function that receives the array as a parameter, because there the parameter is a pointer and sizeof yields the size of the pointer.
You can do the following to find the length of an array:
len = sizeof(arr)/sizeof(arr[0])
The code was originally found here:
C program to find the number of elements in an array
If you know the data type of the array, you can use something like:
int arr[] = {23, 12, 423, 43, 21, 43, 65, 76, 22};
int noofele = sizeof(arr)/sizeof(int);
Or if you don't know the data type of array, you can use something like:
noofele = sizeof(arr)/sizeof(arr[0]);
Note: This thing only works if the array is not defined at run time (like malloc) and the array is not passed in a function. In both cases, arr (array name) is a pointer.
The macro ARRAYELEMENTCOUNT(x) that everyone is making use of evaluates incorrectly. This, realistically, is just a sensitive matter, because you can't have expressions that result in an 'array' type.
/* Compile as: CL /P "macro.c" */
# define ARRAYELEMENTCOUNT(x) (sizeof (x) / sizeof (x[0]))
ARRAYELEMENTCOUNT(p + 1);
Actually evaluates as:
(sizeof (p + 1) / sizeof (p + 1[0]));
Whereas
/* Compile as: CL /P "macro.c" */
# define ARRAYELEMENTCOUNT(x) (sizeof (x) / sizeof (x)[0])
ARRAYELEMENTCOUNT(p + 1);
It correctly evaluates to:
(sizeof (p + 1) / sizeof (p + 1)[0]);
This really doesn't have a lot to do with the size of arrays explicitly. I've just noticed a lot of errors from not truly observing how the C preprocessor works. You always wrap the macro parameter, not an expression it might be involved in.
This is correct; my example was a bad one. But that's actually exactly what should happen. As I previously mentioned p + 1 will end up as a pointer type and invalidate the entire macro (just like if you attempted to use the macro in a function with a pointer parameter).
At the end of the day, in this particular instance, the fault doesn't really matter (so I'm just wasting everyone's time; huzzah!), because you don't have expressions with a type of 'array'. But I think the point about the subtleties of preprocessor evaluation is an important one.
For multidimensional arrays it is a tad more complicated. People often define explicit macro constants, i.e.
#define g_rgDialogRows 2
#define g_rgDialogCols 7
static char const* g_rgDialog[g_rgDialogRows][g_rgDialogCols] =
{
{ " ", " ", " ", " 494", " 210", " Generic Sample Dialog", " " },
{ " 1", " 330", " 174", " 88", " ", " OK", " " },
};
But these constants can be evaluated at compile-time too with sizeof:
#define rows_of_array(name) \
(sizeof(name ) / sizeof(name[0][0]) / columns_of_array(name))
#define columns_of_array(name) \
(sizeof(name[0]) / sizeof(name[0][0]))
static char const* g_rgDialog[][7] = { /* ... */ };
assert( rows_of_array(g_rgDialog) == 2);
assert(columns_of_array(g_rgDialog) == 7);
Note that this code works in C and C++. For arrays with more than two dimensions use
sizeof(name[0][0][0])
sizeof(name[0][0][0][0])
etc., ad infinitum.
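As a rough sketch of how this extends to three dimensions: each extent is the size of one level divided by the size of the next level down. The macro names DIM0_OF, DIM1_OF and DIM2_OF and the array cube are hypothetical, introduced only for this example.

```cpp
#include <cstddef>

// Hypothetical macros: each one divides the size of one nesting level
// by the size of the level below it.
#define DIM0_OF(name) (sizeof(name) / sizeof((name)[0]))
#define DIM1_OF(name) (sizeof((name)[0]) / sizeof((name)[0][0]))
#define DIM2_OF(name) (sizeof((name)[0][0]) / sizeof((name)[0][0][0]))

int cube[2][3][4];

// sizeof never evaluates its operand, so these are compile-time checks.
static_assert(DIM0_OF(cube) == 2, "outermost extent");
static_assert(DIM1_OF(cube) == 3, "middle extent");
static_assert(DIM2_OF(cube) == 4, "innermost extent");
static_assert(sizeof(cube) == 2 * 3 * 4 * sizeof(int), "total size in bytes");
```

As with the one-dimensional macro, these only give meaningful results when applied to the array itself, not to a pointer it has decayed to.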
Size of an array in C:
int a[10];
size_t size_of_array = sizeof(a); // Size of array a
int n = sizeof (a) / sizeof (a[0]); // Number of elements in array a
size_t size_of_element = sizeof(a[0]); // Size of each element in array a
// Size of each element = size of type
sizeof(array) / sizeof(array[0])
#define SIZE_OF_ARRAY(_array) (sizeof(_array) / sizeof((_array)[0]))
If you really want to do this to pass around your array I suggest implementing a structure to store a pointer to the type you want an array of and an integer representing the size of the array. Then you can pass that around to your functions. Just assign the array variable value (pointer to first element) to that pointer. Then you can go Array.arr[i] to get the i-th element and use Array.size to get the number of elements in the array.
I included some code for you. It's not very useful but you could extend it with more features. To be honest though, if these are the things you want you should stop using C and use another language with these features built in.
/* Absolutely no one should use this...
By the time you're done implementing it you'll wish you just passed around
an array and size to your functions */
/* This is a static implementation. You can get a dynamic implementation and
cut out the array in main by using the stdlib memory allocation methods,
but it will work much slower since it will store your array on the heap */
#include <stdio.h>
#include <string.h>
/*
#include "MyTypeArray.h"
*/
/* MyTypeArray.h
#ifndef MYTYPE_ARRAY
#define MYTYPE_ARRAY
*/
typedef struct MyType
{
int age;
char name[20];
} MyType;
typedef struct MyTypeArray
{
int size;
MyType *arr;
} MyTypeArray;
MyType new_MyType(int age, char *name);
MyTypeArray new_MyTypeArray(int size, MyType *first);
/*
#endif
End MyTypeArray.h */
/* MyTypeArray.c */
MyType new_MyType(int age, char *name)
{
MyType d;
d.age = age;
strcpy(d.name, name);
return d;
}
MyTypeArray new_MyTypeArray(int size, MyType *first)
{
MyTypeArray d;
d.size = size;
d.arr = first;
return d;
}
/* End MyTypeArray.c */
void print_MyType_names(MyTypeArray d)
{
int i;
for (i = 0; i < d.size; i++)
{
printf("Name: %s, Age: %d\n", d.arr[i].name, d.arr[i].age);
}
}
int main()
{
/* First create an array on the stack to store our elements in.
Note we could create an empty array with a size instead and
set the elements later. */
MyType arr[] = {new_MyType(10, "Sam"), new_MyType(3, "Baxter")};
/* Now create a "MyTypeArray" which will use the array we just
created internally. Really it will just store the value of the pointer
"arr". Here we are manually setting the size. You can use the sizeof
trick here instead if you're sure it will work with your compiler. */
MyTypeArray array = new_MyTypeArray(2, arr);
/* MyTypeArray array = new_MyTypeArray(sizeof(arr)/sizeof(arr[0]), arr); */
print_MyType_names(array);
return 0;
}
The best way is you save this information, for example, in a structure:
typedef struct {
int *array;
int elements;
} list_s;
Implement all necessary functions such as create, destroy, check equality, and everything else you need. It is easier to pass as a parameter.
The sizeof operator returns the number of bytes your array occupies in memory. To calculate the number of elements in the array, divide that number by the size of the array's element type. For example, given int array[10]; on a machine where int is 32 bits (4 bytes), sizeof(array) is 40, and you get the number of elements like this:
int array[10];
size_t sizeOfArray = sizeof(array)/sizeof(int);
A more elegant solution will be
size_t size = sizeof(a) / sizeof(*a);
You can use the & operator. Here is the source code:
#include<stdio.h>
#include<stdlib.h>
int main(){
int a[10];
int *p;
printf("%p\n", (void *)a);
printf("%p\n", (void *)(&a+1));
printf("---- diff----\n");
printf("%zu\n", sizeof(a[0]));
printf("The size of array a is %zu\n", ((char *)(&a+1)-(char *)a)/(sizeof(a[0])));
return 0;
}
Here is the sample output
1549216672
1549216712
---- diff----
4
The size of array a is 10
The simplest answer:
#include <stdio.h>
int main(void) {
int a[] = {2,3,4,5,4,5,6,78,9,91,435,4,5,76,7,34}; // For example only
int size;
size = sizeof(a)/sizeof(a[0]); // Method
printf("size = %d", size);
return 0;
}
"you've introduced a subtle way of shooting yourself in the foot"
C 'native' arrays do not store their size. It is therefore recommended to save the length of the array in a separate variable/const, and pass it whenever you pass the array, that is:
#define MY_ARRAY_LENGTH 15
int myArray[MY_ARRAY_LENGTH];
If you are writing C++, you SHOULD always avoid native arrays anyway (unless you can't, in which case, mind your foot). If you are writing C++, use the STL's 'vector' container. "Compared to arrays, they provide almost the same performance", and they are far more useful!
// vector is a template, the <int> means it is a vector of ints
vector<int> numbers;
// push_back() puts a new value at the end (or back) of the vector
for (int i = 0; i < 10; i++)
numbers.push_back(i);
// Determine the size of the array
cout << numbers.size();
See:
http://www.cplusplus.com/reference/stl/vector/
Beside the answers already provided, I want to point out a special case by the use of
sizeof(a) / sizeof (a[0])
If a is an array of char, unsigned char or signed char, you do not need to use sizeof twice, since a sizeof expression with an operand of one of these types always yields 1.
Quote from C18, 6.5.3.4/4:
"When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1."
Thus, sizeof(a) / sizeof (a[0]) would be equivalent to NUMBER OF ARRAY ELEMENTS / 1 if a is an array of type char, unsigned char or signed char. The division by 1 is redundant.
In this case, you can simply abbreviate and do:
sizeof(a)
For example:
char a[10];
size_t length = sizeof(a);
If you want a proof, here is a link to GodBolt.
Nonetheless, the division maintains safety, if the type significantly changes (although these cases are rare).
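A small sketch of both points, assuming nothing beyond the language rules quoted above: for a char array the byte size already is the element count, while keeping the division protects the code if the element type is later widened. The array names text and wide are hypothetical.

```cpp
#include <cstddef>

char text[10];
wchar_t wide[10];

// sizeof(char) is 1 by definition, so for a char array the size in bytes
// is already the number of elements.
static_assert(sizeof(char) == 1, "guaranteed by the language");
static_assert(sizeof(text) == 10, "bytes == elements for char arrays");

// For a wider element type the two quantities differ, and only the
// division yields the element count.
static_assert(sizeof(wide) == 10 * sizeof(wchar_t), "size in bytes");
static_assert(sizeof(wide) / sizeof(wide[0]) == 10, "element count");
```

If text were changed from char to wchar_t, code that used a bare sizeof(text) as a length would silently become wrong, whereas the divided form would keep working.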
To know the size of a fixed array declared explicitly in code and referenced by its variable, you can use sizeof, for example:
int a[10];
int len = sizeof(a)/sizeof(int);
But this is usually useless, because you already know the answer.
But if you have a pointer you can't use sizeof; it's a matter of principle.
But... since arrays are presented to the user as linear memory, you can calculate the size if you know the address of the last element and the size of the type; then you can count how many elements it has. For example:
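#include <stdio.h>
int main(){
int a[10];
/* sizeof yields a size_t, which is printed with %zu */
printf("%zu\n", sizeof(a)/sizeof(int));
int *first = a;
int *last = &(a[9]);
/* a pointer difference is a ptrdiff_t, which is printed with %td */
printf("%td\n", (last-first) + 1);
}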
Output:
10
10
Also if you can't take advantage of compile time you can:
#include <stdio.h>
int main(){
int a[10];
printf("%zu\n", sizeof(a)/sizeof(int));
/* char * rather than void *: arithmetic on void * is a GNU extension */
char *first = (char *)a;
char *last = (char *)&(a[9]);
printf("%zu\n", (last-first)/sizeof(int) + 1);
}
Note: This one can give you undefined behaviour as pointed out by M.M in the comment.
int a[10];
int size = (*(&a+1)-a);
For more details, see here and also here.
For a predefined array:
int a[] = {1, 2, 3, 4, 5, 6};
Calculating number of elements in the array:
size_t element_count = sizeof(a) / sizeof(a[0]);

Using CRC32 algorithm to hash string at compile-time

Basically I want in my code to be able to do this:
Engine.getById(WSID('some-id'));
Which should get transformed into
Engine.getById('1a61bc96');
just before being compiled into asm. So at compile-time.
This is my try
constexpr int WSID(const char* str) {
boost::crc_32_type result;
result.process_bytes(str,sizeof(str));
return result.checksum();
}
But I get this when trying to compile with MSVC 18 (CTP November 2013)
error C3249: illegal statement or sub-expression for 'constexpr' function
How can I get the WSID function, using this way or any, as long as it is done during compile time?
Tried this: Compile time string hashing
warning C4592: 'crc32': 'constexpr' call evaluation failed; function will be called at run-time
EDIT:
I first heard about this technique in Game Engine Architecture by Jason Gregory. I contacted the author, who kindly replied with this:
What we do is to pass our source code through a custom little pre-processor that searches for text of the form SID('xxxxxx') and converts whatever is between the single quotes into its hashed equivalent as a hex literal (0xNNNNNNNN). [...]
You could conceivably do it via a macro and/or some template metaprogramming, too, although as you say it's tricky to get the compiler to do this kind of work for you. It's not impossible, but writing a custom tool is easier and much more flexible. [...]
Note also that we chose single quotes for SID('xxxx') literals. This was done so that we'd get some reasonable syntax highlighting in our code editors, yet if something went wrong and some un-preprocessed code ever made it thru to the compiler, it would throw a syntax error because single quotes are normally reserved for single-character literals.
Note also that it's crucial to have your little pre-processing tool cache the strings in a database of some sort, so that the original strings can be looked up given the hash code. When you are debugging your code and you inspect a StringId variable, the debugger will normally show you the rather unintelligible hash code. But with a SID database, you can write a plug-in that converts these hash codes back to their string equivalents. That way, you'll see SID('foo') in your watch window, not 0x75AE3080 [...]. Also, the game should be able to load this same database, so that it can print strings instead of hex hash codes on the screen for debugging purposes [...].
But while preprocessing has some major advantages, it means that I have to set up some kind of output system for the modified files (they will be stored elsewhere, and then MSVC needs to be told about them), so it might complicate the build. Is there a way to preprocess files, with Python for instance, without headaches? That is not the question, though: I'm still interested in using a compile-time function (for the cache I could use an ID index).
Here is a solution that works entirely at compile time, but may also be used at runtime. It is a mix of constexpr, templates and macros. You may want to change some of the names or put them in a separate file since they are quite short.
Note that I reused code from this answer for the CRC table generation and I based myself off of code from this page for the implementation.
I have not tested it on MSVC since I don't currently have it installed in my Windows VM, but I believe it should work, or at least be made to work with trivial changes.
Here is the code, you may use the crc32 function directly, or the WSID function that more closely matches your question :
#include <cstring>
#include <cstdint>
#include <iostream>
// Generate CRC lookup table
template <unsigned c, int k = 8>
struct f : f<((c & 1) ? 0xedb88320 : 0) ^ (c >> 1), k - 1> {};
template <unsigned c> struct f<c, 0>{enum {value = c};};
#define A(x) B(x) B(x + 128)
#define B(x) C(x) C(x + 64)
#define C(x) D(x) D(x + 32)
#define D(x) E(x) E(x + 16)
#define E(x) F(x) F(x + 8)
#define F(x) G(x) G(x + 4)
#define G(x) H(x) H(x + 2)
#define H(x) I(x) I(x + 1)
#define I(x) f<x>::value ,
constexpr unsigned crc_table[] = { A(0) };
// Constexpr implementation and helpers
constexpr uint32_t crc32_impl(const uint8_t* p, size_t len, uint32_t crc) {
return len ?
crc32_impl(p+1,len-1,(crc>>8)^crc_table[(crc&0xFF)^*p])
: crc;
}
constexpr uint32_t crc32(const uint8_t* data, size_t length) {
return ~crc32_impl(data, length, ~0);
}
constexpr size_t strlen_c(const char* str) {
return *str ? 1+strlen_c(str+1) : 0;
}
constexpr int WSID(const char* str) {
return crc32((uint8_t*)str, strlen_c(str));
}
// Example usage
using namespace std;
int main() {
cout << "The CRC32 is: " << hex << WSID("some-id") << endl;
}
The first part takes care of generating the table of constants, while crc32_impl is a standard CRC32 implementation converted to a recursive style that works with a C++11 constexpr.
Then crc32 and WSID are just simple wrappers for convenience.
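One caveat with the code above: WSID converts the string with a C-style pointer cast, and pointer reinterpretation is not permitted in a constant expression, so stricter compilers may refuse to evaluate it at compile time. As a hedged alternative sketch, assuming C++14 or later, the checksum can be computed bit by bit over the characters themselves, with no table and no pointer cast. The names crc32_bitwise and wsid are hypothetical; the static_assert uses the well-known CRC-32 check value for "123456789".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bitwise (table-free) CRC-32 using the reflected
// polynomial 0xEDB88320, the same polynomial as in the table code above.
constexpr std::uint32_t crc32_bitwise(const char* s, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        // Convert the character value, not the pointer, to an unsigned byte.
        crc ^= static_cast<std::uint8_t>(s[i]);
        for (int bit = 0; bit < 8; ++bit) {
            // mask is all ones when the low bit is set, otherwise zero
            const std::uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

// Hypothetical wrapper: taking the literal by reference gives its length
// (including the terminating '\0') as a template argument.
template <std::size_t N>
constexpr std::uint32_t wsid(const char (&literal)[N]) {
    return crc32_bitwise(literal, N - 1);
}

static_assert(wsid("123456789") == 0xCBF43926u, "standard CRC-32 check value");
```

Writing constexpr auto id = wsid("some-id"); forces evaluation at compile time. Without constexpr on the variable, the compiler is allowed to defer the call to run time.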
If anyone is interested, I coded up a CRC-32 table generator function and code generator function using C++14 style constexpr functions. The result is, in my opinion, much more maintainable code than many other attempts I have seen on the internet and it stays far, far away from the preprocessor.
Now, it does use a custom std::array 'clone' called cexp::array, because G++ does not seem to have added the constexpr keyword to the non-const reference index access/write operator.
However, it is quite lightweight, and hopefully the keyword will be added to std::array in the near future. But for now, the very simple array implementation is as follows:
namespace cexp
{
// Small implementation of std::array, needed until constexpr
// is added to the function 'reference operator[](size_type)'
template <typename T, std::size_t N>
struct array {
T m_data[N];
using value_type = T;
using reference = value_type &;
using const_reference = const value_type &;
using size_type = std::size_t;
// This is NOT constexpr in std::array until C++17
constexpr reference operator[](size_type i) noexcept {
return m_data[i];
}
constexpr const_reference operator[](size_type i) const noexcept {
return m_data[i];
}
constexpr size_type size() const noexcept {
return N;
}
};
}
Now, we need to generate the CRC-32 table. I based the algorithm off some Hacker's Delight code, and it can probably be extended to support the many other CRC algorithms out there. But alas, I only required the standard implementation, so here it is:
// Generates CRC-32 table, algorithm based from this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto gen_crc32_table() {
constexpr auto num_bytes = 256;
constexpr auto num_iterations = 8;
constexpr auto polynomial = 0xEDB88320;
auto crc32_table = cexp::array<uint32_t, num_bytes>{};
for (auto byte = 0u; byte < num_bytes; ++byte) {
auto crc = byte;
for (auto i = 0; i < num_iterations; ++i) {
auto mask = -(crc & 1);
crc = (crc >> 1) ^ (polynomial & mask);
}
crc32_table[byte] = crc;
}
return crc32_table;
}
Next, we store the table in a global and perform rudimentary static checking on it. This checking could most likely be improved, and it is not necessary to store it in a global.
// Stores CRC-32 table and softly validates it.
static constexpr auto crc32_table = gen_crc32_table();
static_assert(
crc32_table.size() == 256 &&
crc32_table[1] == 0x77073096 &&
crc32_table[255] == 0x2D02EF8D,
"gen_crc32_table generated unexpected result."
);
Now that the table is generated, it's time to generate the CRC-32 codes. I again based the algorithm off the Hacker's Delight link, and at the moment it only supports input from a c-string.
// Generates CRC-32 code from null-terminated, c-string,
// algorithm based from this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto crc32(const char *in) {
auto crc = 0xFFFFFFFFu;
for (auto i = 0u; auto c = in[i]; ++i) {
crc = crc32_table[(crc ^ c) & 0xFF] ^ (crc >> 8);
}
return ~crc;
}
For the sake of completeness, I generate one CRC-32 code below, statically check that it has the expected value, and then print it to the output stream.
int main() {
constexpr auto crc_code = crc32("some-id");
static_assert(crc_code == 0x1A61BC96, "crc32 generated unexpected result.");
std::cout << std::hex << crc_code << std::endl;
}
Hopefully this helps anyone else that was looking to achieve compile time generation of CRC-32, or even in general.
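One place where a compile-time CRC pays off is in switch labels, which must be constant expressions. The following is a minimal C++14 sketch of that idea, not the code from the answers above: the names crc32_ct and describe are hypothetical, and the CRC is computed bit by bit (no lookup table) only to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), written as a C++14
// constexpr function so that it can be evaluated at compile time.
constexpr std::uint32_t crc32_ct(const char* s) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; s[i] != '\0'; ++i) {
        crc ^= static_cast<unsigned char>(s[i]);
        for (int bit = 0; bit < 8; ++bit) {
            // (0u - lowbit) is all ones when the low bit is set, else zero
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// "123456789" -> 0xCBF43926 is the standard check value for CRC-32.
static_assert(crc32_ct("123456789") == 0xCBF43926u, "unexpected CRC-32");

// Hypothetical dispatcher: the case labels are computed by the compiler,
// so the ID strings are only hashed at compile time.
constexpr const char* describe(std::uint32_t id) {
    switch (id) {
        case crc32_ct("some-id"):  return "some-id";
        case crc32_ct("other-id"): return "other-id";
        default:                   return "unknown";
    }
}
```

If two IDs ever hashed to the same value, the duplicate case label would be rejected by the compiler, which is a useful side effect of this layout.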
@tux3's answer is pretty slick! Hard to maintain, though, because you are basically writing your own implementation of CRC32 in preprocessor commands.
Another way to solve your question is to go back and understand the need for the requirement first. If I understand you right, the concern seems to be performance. In that case, there is a second point of time you can call your function without performance impact: at program load time. In that case, you would be accessing a global variable instead of passing a constant. Performance-wise, after initialization both should be identical (a const fetches 32 bits from your code, a global variable fetches 32 bits from a regular memory location).
You could do something like this:
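#include <cstring>        // std::strlen
#include <boost/crc.hpp>  // boost::crc_32_type
static int myWSID = 0;
// don't call this directly
static int WSID(const char* str) {
boost::crc_32_type result;
// hash the characters of the string; sizeof(str) would only be the pointer size
result.process_bytes(str, std::strlen(str));
return result.checksum();
}
// Put this early into your program into the
// initialization code.
...
myWSID = WSID("some-id");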
Depending on your overall program, you may want to have an inline accessor to retrieve the value.
If a minor performance impact is acceptable, you could also write your function like this, basically using the singleton pattern.
// don't call this directly
int WSID(const char* str) {
boost::crc_32_type result;
result.process_bytes(str, std::strlen(str)); // string length, not sizeof(pointer); needs <cstring>
return result.checksum();
}
// call this instead. Note the hard-coded ID string.
// Create one such function for each ID you need to
// have available.
static int myWSID() {
// Note: not thread safe!
static int computedId = 0;
if (computedId == 0)
computedId = WSID("some-id");
return computedId;
}
Of course, if the reason for asking for compile-time evaluation is something different (such as, not wanting some-id to appear in the compiled code), these techniques won't help.
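The "not thread safe" caveat on the lazy accessor can be avoided with a function-local static, whose initialization is guaranteed to run exactly once since C++11. The sketch below assumes that guarantee and uses a plain bitwise CRC-32 as a stand-in for the Boost call so that it has no external dependencies; wsid_runtime and some_id_wsid are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for the Boost-based WSID(): a bitwise CRC-32 over the characters
// of a null-terminated string. Any hash function would do here.
inline std::uint32_t wsid_runtime(const char* str) {
    assert(str != nullptr);
    std::uint32_t crc = 0xFFFFFFFFu;
    const std::size_t len = std::strlen(str);  // text length, not sizeof(pointer)
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= static_cast<unsigned char>(str[i]);
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// One accessor per ID. The static is initialized on the first call only,
// and concurrent first calls are serialized by the language (C++11 and later).
inline std::uint32_t some_id_wsid() {
    static const std::uint32_t id = wsid_runtime("some-id");
    return id;
}
```

Compared with the `computedId == 0` check, this also behaves correctly if the hash of an ID happens to be zero.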
The other option is to use Jason Gregory's suggestion of a custom preprocessor. It can be done fairly cleanly if you collect all the IDs into a separate file. This file doesn't need to have C syntax. I'd give it an extension such as .wsid. The custom preprocessor generates a .h file from it.
Here is how this could look:
idcollection.wsid (before custom preprocessor):
some_id1
some_id2
some_id3
Your preprocessor would generate the following idcollection.h:
#define WSID_some_id1 0xabcdef12
#define WSID_some_id2 0xbcdef123
#define WSID_some_id3 0xcdef1234
And in your code, you'd call
Engine.getById(WSID_some_id1);
A few notes about this:
This assumes that all the original IDs can be converted into valid identifiers. If they contain special characters, your preprocessor may need to do additional munging.
I notice a mismatch in your original question. Your function returns an int, but Engine.getById seems to take a string. My proposed code would always use int (easy to change if you always want a string).
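The custom preprocessor itself can be very small. Here is a rough sketch of its core, assuming one ID per line and IDs that are already valid identifiers; generate_wsid_header and crc32_of are hypothetical names, and a real tool would read idcollection.wsid and write idcollection.h instead of working on strings.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Bitwise CRC-32 over a std::string (reflected polynomial 0xEDB88320).
inline std::uint32_t crc32_of(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : text) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Takes the contents of the .wsid file (one ID per line) and returns the
// text of the generated header: one "#define WSID_<id> 0x<crc>" per ID.
// Empty lines are skipped.
inline std::string generate_wsid_header(const std::string& id_list) {
    std::istringstream in(id_list);
    std::ostringstream out;
    std::string id;
    while (std::getline(in, id)) {
        if (id.empty()) continue;
        out << "#define WSID_" << id << " 0x"
            << std::hex << std::setw(8) << std::setfill('0')
            << crc32_of(id) << '\n';
    }
    return out.str();
}
```

Running the generator as a pre-build step keeps the header in sync with the ID list.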

How to paste a term and counter in C using ##?

In an embedded system I have the following defines:
#define Row1_PORT GPIOD
#define Row1_PIN GPIO_PIN_4
#define Row2_PORT GPIOD
#define Row2_PIN GPIO_PIN_7
#define Row3_PORT GPIOD
#define Row3_PIN GPIO_PIN_1
#define Row4_PORT GPIOD
#define Row4_PIN GPIO_PIN_3
//------------
#define Paste2(a,b) a ## b
#define Paste(a,b) Paste2(a,b)
#define NRows 4
I want to use the macros defined above in a loop like this:
for(i=1;i<=NRows;i++)
{
GPIO_Init(Paste(Paste(Row,i),_PORT),Paste(Paste(Row,i),_PIN),GPIO_MODE_IN_PU_NO_IT);
}
instead of
GPIO_Init(Row1_PORT,Row1_PIN);
GPIO_Init(Row2_PORT,Row2_PIN);
GPIO_Init(Row3_PORT,Row3_PIN);
GPIO_Init(Row4_PORT,Row4_PIN);
Is it possible?
I need something like __COUNTER__ in ANSI C or C++. My compiler is IAR.
The preprocessor runs at compile time and textually modifies the source code presented to the compiler. What you are seeking to do is not possible; the macro expansion would contain the letter i, not the value of the variable i at run time.
I would probably use something like:
static const int ports[] = { 0, Row1_PORT, Row2_PORT, Row3_PORT, Row4_PORT };
static const int pins[] = { 0, Row1_PIN, Row2_PIN, Row3_PIN, Row4_PIN };
for (int i = 1; i <= NRows; i++)
GPIO_Init(ports[i], pins[i]);
Or I'd write it out longhand (as you show in your 'instead of' option) — there is little penalty and possibly a small saving for just 4 entries. If you have 100 ports to initialize, the loop would be better, of course.
Also, if you're going to use the port and pin numbers again in future (in other portions of the code than just the initialization code), having the arrays available will allow for greater flexibility.
As chris said -- this information isn't available to you during preprocessing, so you're ending up with
GPIO_Init(Rowi_PORT,Rowi_PIN);
which errors, as expected.
I don't think that macros are the right tool for this. Why not save your ports and pins in an array? Something like:
int ports[] = {Row1_PORT, Row2_PORT, ...};
int pins[] = {Row1_PIN, Row2_PIN, ...};
for (int i = 0; i < NRows; i++) {
GPIO_Init(ports[i], pins[i]);
}
No less concise, but no macro hacks.
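If the goal is to keep the RowN_PORT / RowN_PIN names and still avoid writing each call by hand, an X-macro list is one preprocessor-only option: the row numbers are literal tokens, so ## can paste them. The sketch below is written as C++ so it can be checked on a desktop compiler, but the macro part works the same way in C. Everything in the first section (GpioPort, port_d, the enum values, the logging GPIO_Init) is a hypothetical stand-in for the vendor header, not the real API.

```cpp
#include <cassert>
#include <vector>

// --- Hypothetical stand-ins for the vendor header (NOT the real API) ---
struct GpioPort { int id; };
static GpioPort port_d{4};
#define GPIOD (&port_d)
enum { GPIO_PIN_1 = 0x02, GPIO_PIN_3 = 0x08, GPIO_PIN_4 = 0x10, GPIO_PIN_7 = 0x80 };
enum { GPIO_MODE_IN_PU_NO_IT = 0x40 };
struct InitCall { GpioPort* port; int pin; int mode; };
static std::vector<InitCall> init_log;  // records calls so the sketch can be checked
inline void GPIO_Init(GpioPort* port, int pin, int mode) {
    init_log.push_back({port, pin, mode});
}

// --- Definitions from the question ---
#define Row1_PORT GPIOD
#define Row1_PIN GPIO_PIN_4
#define Row2_PORT GPIOD
#define Row2_PIN GPIO_PIN_7
#define Row3_PORT GPIOD
#define Row3_PIN GPIO_PIN_1
#define Row4_PORT GPIOD
#define Row4_PIN GPIO_PIN_3

// --- X-macro: the list of row numbers is written exactly once ---
#define FOR_EACH_ROW(X) X(1) X(2) X(3) X(4)
// n is a literal digit here, so Row##n##_PORT becomes Row1_PORT, Row2_PORT, ...
#define INIT_ROW(n) GPIO_Init(Row##n##_PORT, Row##n##_PIN, GPIO_MODE_IN_PU_NO_IT);

inline void init_rows() {
    FOR_EACH_ROW(INIT_ROW)  // expands to four GPIO_Init(...) calls
}
```

This is still not a run-time loop: the preprocessor emits four separate calls, which is the same code as the longhand version.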

Representing big numbers in source code for readability?

Is there a more human-readable way for representing big numbers in the source code of an application written in C++ or C?
Let's take, for example, the number 2,345,879,444,641. In C or C++, if we wanted a program to return this number, we would write return 2345879444641.
But this is not really readable.
In PAWN (a scripting language) for example I can do return 2_345_879_444_641 or even return 2_34_58_79_44_46_41 and these both would return the number 2,345,879,444,641.
This is much more readable for the human-eye.
Is there a C or C++ equivalent for this?
With a current compiler (C++14 or newer), you can use apostrophes, like:
auto a = 1'234'567;
If you're still stuck with C++11, you could use a user-defined literal to support something like: int i = "1_000_000"_i. The code would look something like this:
#include <iostream>
#include <string>
#include <cstdlib>
int operator "" _i (char const *in, size_t len) {
std::string input(in, len);
std::string::size_type pos; // same type as std::string::npos
while (std::string::npos != (pos=input.find_first_of("_,")))
input.erase(pos, 1);
return std::strtol(input.c_str(), NULL, 10);
}
int main() {
std::cout << "1_000_000_000"_i;
}
As I've written it, this supports underscores or commas interchangeably, so you could use one or the other, or both. For example, "1,000_000" would turn out as 1000000.
Of course, Europeans would probably prefer "." instead of "," -- if so, feel free to modify as you see fit.
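The string-based literal above does its parsing at run time. With C++14's relaxed constexpr rules the same idea can be evaluated by the compiler instead. This is only a sketch under that assumption: the suffix _n is a hypothetical name, the result type is long long so that values beyond the int range fit, and there is no overflow check.

```cpp
#include <cstddef>
#include <stdexcept>

// Strips '_' and ',' separators from a string literal and converts the
// remaining decimal digits to a number. Usable in constant expressions.
constexpr long long operator""_n(const char* text, std::size_t len) {
    long long value = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const char c = text[i];
        if (c == '_' || c == ',') continue;  // skip separators
        if (c < '0' || c > '9') {
            // In a constant expression, reaching this line is a compile error.
            throw std::invalid_argument("digit or separator expected");
        }
        value = value * 10 + (c - '0');
    }
    return value;
}

static_assert("2_345_879_444_641"_n == 2345879444641LL, "separator handling");
```

Used in a constexpr or static_assert context, a malformed literal such as "12x"_n is rejected at compile time rather than silently truncated.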
With Boost.PP:
#define NUM(...) \
NUM_SEQ(BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__))
#define NUM_SEQ(seq) \
BOOST_PP_SEQ_FOLD_LEFT(NUM_FOLD, BOOST_PP_SEQ_HEAD(seq), BOOST_PP_SEQ_TAIL(seq))
#define NUM_FOLD(_, acc, x) \
BOOST_PP_CAT(acc, x)
Usage:
NUM(123, 456, 789) // Expands to 123456789
Another way is to write a UDL (user-defined literal). Left as an exercise (and also because it requires more code).
Here's a macro that would do it, tested on both MSVC and GCC. No reliance on Boost...
#define NUM(...) NUM_(__VA_ARGS__, , , , , , , , , , )
#define NUM_(...) NUM_MSVCHACK((__VA_ARGS__))
#define NUM_MSVCHACK(numlist_) NUM__ numlist_
#define NUM__(a1_, a2_, a3_, a4_, a5_, a6_, a7_, a8_, ...) a1_##a2_##a3_##a4_##a5_##a6_##a7_##a8_
Use it like:
int y = NUM(1,2,3,4,5,6,7,8);
int x = NUM(100,460,694);
Produces:
int y = 12345678;
int x = 100460694;
In C++1y you can now use a single quote (') as a digit separator, based on N3781: Single-Quotation-Mark as a Digit Separator, which has finally been accepted. Both gcc and clang support this feature as part of their C++1y implementation.
So the following program (see it live for clang):
#include <iostream>
int main(){
std::cout << 2'345'879'444'641 << std::endl ;
}
will output:
2345879444641
You could use a preprocessor macro
#define BILLION (1000LL*1000*1000)
then code e.g. (4*BILLION); the LL suffix keeps that product from overflowing a 32-bit int. If you care about large powers of two, just use 1<<30.
PS: Notice that 1e6 is a double literal (same as 1.0e6)
And you could also:
Patch the GCC lexer to accept the 1_234_567 notation for number literals (probably in the file libcpp/lex.c and/or gcc/c-family/c-lex.c and/or gcc/cpp/lex.c of the future GCC 4.8, i.e. the current trunk), and publish that patch in keeping with the GPLv3 and the free-software spirit.
Lobby the C and C++ standardization groups to get that accepted in future C or C++ standards.

Is compile-time "strlen()" effective?

Sometimes it is necessary to compare a string's length with a constant.
For example:
if ( line.length() > 2 )
{
// Do something...
}
But I am trying to avoid using "magic" constants in code.
Usually I use such code:
if ( line.length() > strlen("[]") )
{
// Do something...
}
It is more readable, but not efficient because of the function call.
I wrote template functions as follows:
template<size_t N>
size_t _lenof(const char (&)[N])
{
return N - 1;
}
template<size_t N>
size_t _lenof(const wchar_t (&)[N])
{
return N - 1;
}
// Using:
if ( line.length() > _lenof("[]") )
{
// Do something...
}
In a release build (VisualStudio 2008) it produces pretty good code:
cmp dword ptr [esp+27Ch],2
jbe 011D7FA5
And the good thing is that the compiler doesn't include the "[]" string in the binary output.
Is it a compiler specific optimisation or is it a common behavior?
Why not
sizeof "[]" - 1;
(Minus one for the trailing null. You could write sizeof "[]" - sizeof '\0', but sizeof '\0' is sizeof(int) in C, and "- 1" is perfectly readable.)
The capability to inline a function call is both a compiler-specific optimization and a common behavior. That is, many compilers can do it, but they aren't required to.
I think most compilers will optimize it away when optimizations are enabled. If they're disabled, it might slow your program down much more than necessary.
I would prefer your template functions, as they're guaranteed to not call strlen at runtime.
Of course, rather than writing separate functions for char and wchar_t, you could add another template argument, and get a function which works for any type:
template <typename Char_t, size_t N>
size_t static_strlen(const Char_t (&)[N]) {
// N is the element count, including the terminating null
return N - 1;
}
(As already mentioned in comments, this will give funny results if passed an array of ints, but are you likely to do that? It's meant for strings, after all)
A final note: the name _lenof is bad. All names beginning with an underscore are reserved to the implementation in the global namespace. You risk some nasty naming conflicts.
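With C++11 or later, marking the template constexpr removes the dependence on the optimizer, because the result can then be used wherever a constant expression is required. A minimal sketch, with lenof as a hypothetical name that avoids the leading underscore:

```cpp
#include <cstddef>

// Length of a string literal (or any character array) without its
// terminating null, computed from the array type alone.
template <typename CharT, std::size_t N>
constexpr std::size_t lenof(const CharT (&)[N]) noexcept {
    return N - 1;  // N counts the terminating null character
}

static_assert(lenof("[]") == 2, "narrow literal");
static_assert(lenof(L"abc") == 3, "wide literal");
```

Because the value is a constant expression, it can also be used as an array bound or a template argument, which a call to strlen cannot.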
By the way, why is "[]" less of a magic constant than 2 is?
In both cases, it is a literal that has to be changed if the format of the string it is compared to changes.
#define TWO 2
#define STRING_LENGTH 2
/* ... etc ... */
Seriously, why go through all this hassle just to avoid typing a 2? I honestly think you're making your code less readable, and other programmers are going to stare at you like you're snorting the used coffee from the filter.