Interesting stdext::hash_value() implementation - c++

I was trying some hashing algorithms with stdext::hash_value() on VS2010 and realized this:
#include <iostream>
#include <xhash>
using namespace std;
int main()
{
#ifdef _WIN64
std::cout << "x64" << std::endl;
#else
std::cout << "x32" << std::endl;
#endif
std::cout << stdext::hash_value(345.34533) << std::endl;
std::cout << stdext::hash_value(345.566) << std::endl;
return 0;
}
// Output is:
// x64
//3735928758
//3735928758
I tried several other pairs of doubles that have the same integer part but different fractional parts, such as 1.234 and 1.568. The hash values were always the same. So I took a look at the source of hash_value() and saw
#define _HASH_SEED (size_t)0xdeadbeef
template<class _Kty> inline
size_t hash_value(const _Kty& _Keyval)
{ // hash _Keyval to size_t value one-to-one
return ((size_t)_Keyval ^ _HASH_SEED);
}
_Keyval is cast to size_t, which does not make sense to me: the function simply ignores the fractional part of a double. What is the logic behind this implementation? Is it a bug or a feature?

stdext::hash_value isn't a hash function. It's the input to a hash function, you specialize it for your type so that it can be used as a key for the stdext hash classes. There doesn't seem to be any documentation for it however. The actual hash function is stdext::hash_compare.
But because no specialization of hash_value is provided for double, the generic convert-to-integer version is used, which ignores the fractional part.
There is an almost-identical bug for the standard std::tr1::hash/std::hash function up until vc10:
http://connect.microsoft.com/VisualStudio/feedback/details/361851/std-tr1-hash-float-or-hash-double-is-poorly-implemented
In VC10, std::hash gets a double specialization that hashes the bits. I guess stdext is considered obsolete now, so it was not fixed there even in VC10.
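The truncation can be reproduced without <xhash>. The following is a minimal sketch that mirrors the generic template quoted in the question; the name fallback_hash_value is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Portable stand-in for the generic fallback quoted in the question.
// fallback_hash_value is a hypothetical name, not a library function.
template <class Key>
std::size_t fallback_hash_value(const Key& key) {
    const std::size_t seed = static_cast<std::size_t>(0xdeadbeef);
    // Converting a double to size_t discards the fractional part.
    return static_cast<std::size_t>(key) ^ seed;
}

// Usage: both doubles truncate to 345, so their hashes collide.
inline void demo_truncation() {
    assert(fallback_hash_value(345.34533) == fallback_hash_value(345.566));
    assert(fallback_hash_value(345.34533) == fallback_hash_value(345));
}
```

Both calls reduce to 345 ^ 0xdeadbeef, which is the 3735928758 shown in the question's output.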

The function is written to work with any type of data. It makes no assumption about the size and hence is inefficient for certain types. You can override this behavior for doubles to make it more efficient via a template specialization
template<>
size_t hash_value<double>(const double& key) {
return // Some awesome double hashing algorithm
}
Putting this definition above your main function will cause the calls to stdext::hash_value(345.566) to bind to this definition instead of the generic one.
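As a rough idea of what the body of such a specialization could look like, here is a sketch that hashes the bit pattern of the double instead of its integer conversion. It is written as a free function so it compiles anywhere; hash_double_bits is a hypothetical name, and the sketch assumes a 64-bit double.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical helper: hash a double through its bit pattern.
inline std::size_t hash_double_bits(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits;
    // memcpy copies the object representation without aliasing problems.
    std::memcpy(&bits, &value, sizeof bits);
    return std::hash<std::uint64_t>()(bits);
}

// Usage: the same value always produces the same hash.
inline void demo_bits() {
    assert(hash_double_bits(345.566) == hash_double_bits(345.566));
}
```

Note that this simple version does not give +0.0 and -0.0 the same hash, even though they compare equal.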

That's old code - doesn't seem very good.
You should probably try std::hash instead.

This is, apparently, an attempt to provide a generic hash function for
integers (although I don't see what the xor adds). It quite clearly
won't work for most other types. Including floating point.
Providing a good hash function for a floating point value is difficult;
if I were trying to make a generic hash, I'd probably start by testing
for 0, NaN and Inf, and returning predefined hashes for those (or
rejecting NaN completely as it is not a valid hashable value), and then
simply using a standard string hash on the underlying bytes. This will
at least make hashing compatible with the == operator. But the problems
of precision mean that == itself may not be what is needed. Nor <, in
the case of std::map, since std::map uses < to define an equality
relationship, and depending on the source of the floats or doubles, such
an equality relationship might not be appropriate for a hash table.
At any rate, don't expect to find a standard hash function for the floating point types.
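The approach described above (special-case zero and NaN, then run a string hash over the bytes) might be sketched as follows. This is only an illustration under my own assumptions: hash_double_bytes is a made-up name, FNV-1a stands in for "a standard string hash", and NaN is rejected by throwing. Infinities need no special case here because each has a single bit pattern.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical helper, not a standard function.
inline std::uint64_t hash_double_bytes(double value) {
    if (std::isnan(value)) {
        // NaN != NaN, so it cannot be a meaningful key.
        throw std::invalid_argument("NaN is not a hashable key");
    }
    if (value == 0.0) {
        return 0;  // predefined hash shared by +0.0 and -0.0
    }
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &value, sizeof bytes);
    // FNV-1a over the bytes (64-bit offset basis and prime).
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char byte : bytes) {
        h ^= byte;
        h *= 1099511628211ULL;
    }
    return h;
}

// Usage: values that compare equal with == hash equally.
inline void demo_bytes() {
    assert(hash_double_bytes(0.0) == hash_double_bytes(-0.0));
    assert(hash_double_bytes(1.5) == hash_double_bytes(1.5));
}
```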

VC10 contains the C++0x Standard hashing mechanisms, so there's no need to use the stdext ones anymore, and std::hash does contain a mechanism for double which performs a bitwise conversion and then hashes. That code that you have for stdext is just a fallback mechanism, and it's not really intended for use with floating-point types. I guess it's a design oversight.
template<>
class hash<double>
: public unary_function<double, size_t>
{ // hash functor
public:
typedef double _Kty;
typedef _ULonglong _Inttype; // use first 2*32 bits
size_t operator()(const _Kty& _Keyval) const
{ // hash _Keyval to size_t value by pseudorandomizing transform
_Inttype _Bits = *(_Inttype *)&_Keyval;
return (hash<_Inttype>()(
(_Bits & (_ULLONG_MAX >> 1)) == 0 ? 0 : _Bits));
}
};

Related

Math Parser for Complex Numbers in C (ExprTk)

I have been using the ExprTk library quite frequently in the past in order to further process large output files generated with Mathematica (containing mathematical expressions) in C.
Until now, I exclusively used this library to process expressions that yield values of the type <double>, for which the library works flawlessly by defining the types
typedef exprtk::symbol_table<double> symbol_table_t;
typedef exprtk::expression<double> expression_t;
typedef exprtk::parser<double> parser_t;
and storing "everything" in a struct
struct readInExpression
{
double a, b;
symbol_table_t symbol_table;
expression_t expression;
};
Reading in a text file that contains the variables a and b as well as e.g. the user-defined function
double my_function(double a, double b) {
return a+b;
}
can be achieved by means of
void readInFromFile(readInExpression* f, parser_t* p) {
std::string file = "xyz.txt";
std::ifstream ifs(file);
std::string content( (std::istreambuf_iterator<char>(ifs) ),
(std::istreambuf_iterator<char>() ) );
f->symbol_table.add_variable("a",f->a);
f->symbol_table.add_variable("b",f->b);
f->symbol_table.add_function("my_function",my_function);
f->expression.register_symbol_table(f->symbol_table);
p->compile(content,f->expression);
}
One may then evaluate the read-in expression for arbitrary values of a and b by using
double evaluateFile(readInExpression* f, double a, double b) {
f->a = a;
f->b = b;
return f->expression.value();
}
Recently, I ran into problems when trying to process text files that contain complex numbers and functions that return complex values of the type std::complex<double>. More specifically, I have a .txt file that contains expressions of the form
2*m*A0(m*m) + Complex(1.0,2.0)*B0(M*M,0.0,m*m)
where A0(a) and B0(a,b,c) are the scalar loop integrals that arise from the Passarino-Veltman reduction of (tensor) loop integrals in high-energy physics.
These can be evaluated numerically in C using LoopTools, where it is to be noted that they take complex values for certain values of a, b, and c. Simply replacing <double> by std::complex<double> in the typedefs above throws tons of errors when compiling. I am not sure whether the ExprTk library is able to handle complex numbers at all -- I know that it cannot deal with custom classes, but from what I understand, it should be able to handle native datatypes (as I found here, ExprTk is able to at least deal with vectors, but given the complexity of the expressions I need to process, I do not think it will be possible to somehow rewrite everything in form of vectors, in particular due to the difference in doing algebra with complex numbers and vectors). Note that neither can I split the expressions into real and imaginary part because I have to evaluate the expressions for many different values of the variables.
Although I dealt with complex numbers and the mentioned functions A0(a) and B0(a,b,c) in text files before, I solved this by simply including the .txt files in C using #include "xyz.txt", implemented in a corresponding function, which, however, seems impossible given the size of the text files at hand (the compiler throws an error if I try to do so).
Does anybody know if and how ExprTk can deal with complex numbers? (A MWE would be highly appreciated.) If that is not the case, can anyone here suggest a different math parser that is user friendly and can deal with complex numbers of the type std::complex<double>, at the same time allowing to define custom functions that themselves return such complex values?
A MWE:
/************/
/* Includes */
/************/
#include <iostream> // Input and Output on the console
#include <fstream> // Input and Output to files
#include <string> // In order to work with strings -- needed to loop through the Mathematica output files
#include "exprtk.hpp" // Parser to evaluate string expressions as mathematical/arithmetic input
#include <math.h> // Simple Math Stuff
#include <gsl/gsl_math.h> // GSL math stuff
#include <complex> // Complex Numbers
/**********/
/* Parser */
/**********/
// Type definitions for the parser
typedef exprtk::symbol_table<double> symbol_table_t; // (%)
typedef exprtk::expression<double> expression_t; // (%)
typedef exprtk::parser<double> parser_t; // (%)
/* This struct is used to store certain information of the Mathematica files in
order to later evaluate them for different variables with the parser library. */
struct readInExpression
{
double a,b; // (%)
symbol_table_t symbol_table;
// Instantiate expression
expression_t expression;
};
/* Global variable where the read-in file/parser is stored. */
readInExpression file;
parser_t parser;
/*******************/
/* Custom function */
/*******************/
double my_function(double a, double b) {
return a+b;
}
/***********************************/
/* Converting Mathematica Notation */
/***********************************/
/* Mathematica prints complex numbers as Complex(x,y), so we need a function to convert to C++ standard. */
std::complex<double> Complex(double a, double b) { // (%)
std::complex<double> c(a,b);
return c;
}
/************************************/
/* Processing the Mathematica Files */
/************************************/
double evaluateFileDoubleValuedInclude(double a, double b) {
return
#include "xyz.txt"
;
}
std::complex<double> evaluateFileComplexValuedInclude(double a, double b) {
return
#include "xyzC.txt"
;
}
void readInFromFile(readInExpression* f, parser_t* p) {
std::string file = "xyz.txt"; // (%)
std::ifstream ifs(file);
std::string content( (std::istreambuf_iterator<char>(ifs) ),
(std::istreambuf_iterator<char>() ) );
// Register variables with the symbol_table
f->symbol_table.add_variable("a",f->a);
f->symbol_table.add_variable("b",f->b);
// Add custom functions to the evaluation list (see definition above)
f->symbol_table.add_function("my_function",my_function); // (%)
// f->symbol_table.add_function("Complex",Complex); // (%)
// Register symbol_table to instantiated expression
f->expression.register_symbol_table(f->symbol_table);
// Compile the expression with the instantiate parser
p->compile(content,f->expression);
}
std::complex<double> evaluateFile(readInExpression* f, double a, double b) { // (%)
// Set the values of the struct to the input values
f->a = a;
f->b = b;
// Evaluate the result for the upper values
return f->expression.value();
}
int main() {
exprtk::symbol_table<std::complex<double> > st1; // Works
exprtk::expression<std::complex<double> > e1; // Works
// exprtk::parser<std::complex<double> > p1; // Throws an error
double a = 2.0;
double b = 3.0;
std::cout << "Evaluating the text file containing only double-valued functions via the #include method: \n" << evaluateFileDoubleValuedInclude(a,b) << "\n \n";
std::cout << "Evaluating the text file containing complex-valued functions via the #include method: \n" << evaluateFileComplexValuedInclude(a,b) << "\n \n";
readInFromFile(&file,&parser);
std::cout<< "Evaluating either the double-valued or the complex-valued file [see the necessary changes tagged with (%)]:\n" << evaluateFile(&file,a,b) << "\n";
return 0;
}
xyz.txt
a + b * my_function(a,b)
xyzC.txt
2.0*Complex(a,b) + 3.0*a
To get the MWE to work, put the exprtk.hpp file in the same folder where you compile.
Note that the return type of the evaluateFile(...) function can be/is std::complex<double>, even though only double-valued types are returned. Lines tagged with // (%) are subject to change when trying out the complex-valued file xyzC.txt.
Instantiating exprtk::parser<std::complex<double> > throws (among others)
./exprtk.hpp:1587:10: error: no matching function for call to 'abs_impl'
exprtk_define_unary_function(abs )
while all other needed types seem to not complain about the type std::complex<double>.
I actually know next to nothing about ExprTk (just what I just read in its documentation and a bit of its code -- EDIT: now somewhat more of its code), but it seems to me unlikely that you'll be able to accomplish what you want without doing some major surgery to the package.
The basis of its API, as you demonstrate in your question, are template objects specialised on a single datatype. The documentation says that that type "…can be any floating point type. This includes… any custom type conforming to an interface comptaible (sic) with the standard floating point type." Unfortunately, it doesn't clarify what they consider the interface of the standard floating point type to be; if it includes every standard library function which could take a floating point argument, it's a very big interface indeed. However, the distribution includes the adaptor used to create a compatible interface for the MPFR package which gives some kind of idea what is necessary.
But the issue here is that I suspect you don't want an evaluator which can only handle complex numbers. It seems to me that you want to be able to work with both real and complex numbers. For example, there are expressions which are unambiguous for real numbers and somewhat arbitrary for complex numbers; these include a < b and max(a, b), neither of which are implemented for complex types in C++. However, they are quite commonly used in real-valued expressions, so just eliminating them from the evaluation language would seem a bit arbitrary. [Note 1] ExprTK does assume that the numeric type is ordered for some purposes, including its incorrect (imho) delta equality operator, and those comparisons are responsible for a lot of the error messages which you are receiving.
The ExprTK header does correctly figure out that std::complex<double> is a complex type, and tags it as such. However, no implementation is provided for any standard function called on complex numbers, even though C++ includes implementations for many of them. This absence is basically what triggers the rest of the errors, including the one you mention for abs, which is an example of a math function which C++ does implement for complex numbers. [Note 2]
That doesn't mean that there is no solution; just that the solution is going to involve a certain amount of work. So a first step might be to fill in the implementations for complex types, as per the MPFR adaptor linked to above (although not everything goes through the _impl methods, as noted above with respect to ordered comparison operators).
One option would be to write your own "real or complex" datatype; in a simple implementation, you could use a discriminated union like:
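template<typename T>
class RealOrComplex {
public:
    RealOrComplex(T a = 0)
        : is_complex_(false), value_(a) {}
    RealOrComplex(std::complex<T> a)
        : is_complex_(true), value_(a) {}
    // Operator implementations, omitted
private:
    bool is_complex_;
    // The union needs its own constructors because std::complex<T>
    // has a non-trivial default constructor.
    union Value {
        T real;
        std::complex<T> cmplx;
        Value(T a) : real(a) {}
        Value(std::complex<T> a) : cmplx(a) {}
    } value_;
};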
A possibly simpler but more problematic approach would be to let a real number simply be a complex number whose imaginary part is 0 and then write shims for any missing standard library math functions. But you might well need to shim a large part of the LoopTools library as well.
So while all that is doable, actually doing it is, I'm afraid, too much work for an SO answer.
Since there is some indication in the ExprTK source that the author is aware of the existence of complex numbers, you might want to contact them directly to ask about the possibility of a future implementation.
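To give a feel for the "real number is a complex number with zero imaginary part" approach, here is a small sketch of the kind of shims it would need. None of these names come from ExprTk; abs_shim, less_shim and max_shim are hypothetical, and ordering by real part only is an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using cplx = std::complex<double>;

// Hypothetical shims (names are ours, not ExprTk's).
// abs: C++ returns a double for complex input, so wrap it back up.
inline cplx abs_shim(const cplx& z) {
    return cplx(std::abs(z), 0.0);
}

// Ordered comparison: an arbitrary choice that looks only at the real part.
inline bool less_shim(const cplx& a, const cplx& b) {
    return a.real() < b.real();
}

inline cplx max_shim(const cplx& a, const cplx& b) {
    return less_shim(a, b) ? b : a;
}

// Usage example.
inline void demo_shims() {
    const cplx z(3.0, 4.0);
    assert(std::abs(abs_shim(z).real() - 5.0) < 1e-12);
    assert(abs_shim(z).imag() == 0.0);
    assert(less_shim(cplx(1.0, 9.0), cplx(2.0, 0.0)));
    assert(max_shim(cplx(1.0, 9.0), cplx(2.0, 0.0)) == cplx(2.0, 0.0));
}
```

Wiring such functions into the parser is the part that would require the surgery discussed above.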
Notes
1. It seems that Mathematica throws an error if these operations are attempted on complex arguments. On the other hand, MATLAB makes an arbitrary choice: ordered comparison looks only at the real part, but maximum is (inconsistently) handled by converting to polar coordinates and then comparing component-wise.
2. Forcing standard interfaces to be explicitly configured seems to me to be a curious implementation choice. Surely it would have been better to allow standard functions to be the default implementation, which would also avoid unnecessary shims like the explicit reimplementation of log1p and other standard math functions. (Perhaps these correspond to known inadequacies in certain math libraries, though.)
... can anyone here suggest a different math parser that is user friendly and can deal with complex numbers of the type std::complex<double>, at the same time allowing to define custom functions that themselves return such complex values?
I came across only one math parser that handled complex numbers, Foreval:
https://sourceforge.net/projects/foreval/
It is implemented as a DLL.
You can connect real and complex variables of the "double" and "extended" types to Foreval in any form by passing the addresses of the variables.
It has built-in standard functions for complex variables, and you can also connect external functions that take complex variables.
However, the complex type passed to and returned from functions is Foreval's own internal type, which differs from std::complex<double>.
There are examples for GCC in the source.
There are two disadvantages:
There is only a 32-bit version of Foreval.dll (so it can only be used from a 32-bit program).
It is available only for Windows.
But there are also advantages:
Foreval.dll is a compiler that generates machine code (fast calculations).
It offers an "extended" floating-point real type in addition to "double", also for complex numbers.

Can I check a small array of bools in one go?

There was a similar question here, but the user in that question seemed to have a much larger array, or vector. If I have:
bool boolArray[4];
And I want to check if all elements are false, I can check [ 0 ], [ 1 ] , [ 2 ] and [ 3 ] either separately, or I can loop through it. Since (as far as I know) false should have value 0 and anything other than 0 is true, I thought about simply doing:
if ( *(int*) boolArray) { }
This works, but I realize that it relies on bool being one byte and int being four bytes. If I cast to (std::uint32_t) would it be OK, or is it still a bad idea? I just happen to have 3 or 4 bools in an array and was wondering if this is safe, and if not if there is a better way to do it.
Also, in the case I end up with more than 4 bools but less than 8 can I do the same thing with a std::uint64_t or unsigned long long or something?
As πάντα ῥεῖ noticed in comments, std::bitset is probably the best way to deal with that in UB-free manner.
std::bitset<4> boolArray {};
if(boolArray.any()) {
//do the thing
}
If you want to stick to arrays, you could use std::any_of, but this requires a functor that simply returns its argument, which may look peculiar to readers:
bool boolArray[4] = {};
if (std::any_of(std::begin(boolArray), std::end(boolArray), [](bool b){ return b; })) {
    //do the thing
}
Type-punning 4 bools to int might be a bad idea - you cannot be sure of the size of each of the types. It probably will work on most architectures, but std::bitset is guaranteed to work everywhere, under any circumstances.
Several answers have already explained good alternatives, particularly std::bitset and std::any_of(). I am writing separately to point out that, unless you know something we don't, it is not safe to type pun between bool and int in this fashion, for several reasons:
int might not be four bytes, as multiple answers have pointed out.
M.M points out in the comments that bool might not be one byte. I'm not aware of any real-world architectures in which this has ever been the case, but it is nevertheless spec-legal. It (probably) can't be smaller than a byte unless the compiler is doing some very elaborate hide-the-ball chicanery with its memory model, and a multi-byte bool seems rather useless. Note however that a byte need not be 8 bits in the first place.
int can have trap representations. That is, it is legal for certain bit patterns to cause undefined behavior when they are cast to int. This is rare on modern architectures, but might arise on (for example) ia64, or any system with signed zeros.
Regardless of whether you have to worry about any of the above, your code violates the strict aliasing rule, so compilers are free to "optimize" it under the assumption that the bools and the int are entirely separate objects with non-overlapping lifetimes. For example, the compiler might decide that the code which initializes the bool array is a dead store and eliminate it, because the bools "must have" ceased to exist* at some point before you dereferenced the pointer. More complicated situations can also arise relating to register reuse and load/store reordering. All of these infelicities are expressly permitted by the C++ standard, which says the behavior is undefined when you engage in this kind of type punning.
You should use one of the alternative solutions provided by the other answers.
* It is legal (with some qualifications, particularly regarding alignment) to reuse the memory pointed to by boolArray by casting it to int and storing an integer, although if you actually want to do this, you must then pass boolArray through std::launder if you want to read the resulting int later. Regardless, the compiler is entitled to assume that you have done this once it sees the read, even if you don't call launder.
You can use std::bitset<N>::any:
Any returns true if any of the bits are set to true, otherwise false.
#include <iostream>
#include <bitset>
int main ()
{
std::bitset<4> foo;
// modify foo here
if (foo.any())
std::cout << foo << " has " << foo.count() << " bits set.\n";
else
std::cout << foo << " has no bits set.\n";
return 0;
}
If you want to check whether all or none of the bits are set, you can use std::bitset<N>::all or std::bitset<N>::none respectively.
The standard library has what you need in the form of the std::all_of, std::any_of, std::none_of algorithms.
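For the "are all elements false" check from the question, a minimal sketch with std::none_of could look like this. The helper name all_false is my own; the algorithm and the iterator helpers are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: true when every element of a bool array is false.
template <std::size_t N>
bool all_false(const bool (&flags)[N]) {
    return std::none_of(std::begin(flags), std::end(flags),
                        [](bool b) { return b; });
}

// Usage example.
inline void demo_all_false() {
    bool flags[4] = {};  // value-initialised: all false
    assert(all_false(flags));
    flags[2] = true;
    assert(!all_false(flags));
}
```

This makes no assumption about sizeof(bool) or sizeof(int), which is the point of preferring it over the cast.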
...And for the obligatory "roll your own" answer, we can provide a simple "or"-like function for any array bool[N], like so:
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) {
if (b) { return b; }
}
return false;
}
Or more concisely,
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) { if (b) { return b; } }
return false;
}
This also has the benefit of both short-circuiting like ||, and being optimised out entirely if calculable at compile time.
Apart from that, if you want to examine the original idea of type-punning bool[N] to some other type to simplify observation, I would very much recommend that you view it as char[N2] instead, where N2 == (sizeof(bool) * N). This would allow you to provide a simple representation viewer that can automatically scale to the viewed object's actual size, allow iteration over its individual bytes, and allow you to more easily determine whether the representation matches specific values (such as, e.g., zero or non-zero). I'm not entirely sure off the top of my head whether such examination would invoke any UB, but I can say for certain that any such type's construction cannot be a viable constant-expression, due to requiring a reinterpret cast to char* or unsigned char* or similar (either explicitly, or in std::memcpy()), and thus couldn't as easily be optimised out.

C++ enum flags vs bitset

What are the pros and cons of using bitsets over enum flags?
#include <bitset>
#include <iostream>
namespace Flag {
enum State {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
}
namespace Plain {
enum State {
Read,
Write,
Binary,
Count
};
}
int main()
{
{
unsigned int state = Flag::Read | Flag::Binary;
std::cout << state << std::endl;
state |= Flag::Write;
state &= ~(Flag::Read | Flag::Binary);
std::cout << state << std::endl;
} {
std::bitset<Plain::Count> state;
state.set(Plain::Read);
state.set(Plain::Binary);
std::cout << state.to_ulong() << std::endl;
state.flip();
std::cout << state.to_ulong() << std::endl;
}
return 0;
}
As far as I can see, bitsets have more convenient set/clear/flip functions to work with, but using enum flags is the more widespread approach.
What are the possible downsides of bitsets, and which should I use, and when, in my daily code?
Both std::bitset and c-style enum have important downsides for managing flags. First, let's consider the following example code :
namespace Flag {
enum State {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
}
namespace Plain {
enum State {
Read,
Write,
Binary,
Count
};
}
void f(int);
void g(int);
void g(Flag::State);
void h(std::bitset<sizeof(Flag::State)>);
namespace system1 {
Flag::State getFlags();
}
namespace system2 {
Plain::State getFlags();
}
int main()
{
f(Flag::Read); // Flag::Read is implicitly converted to `int`, losing type safety
f(Plain::Read); // Plain::Read is also implicitly converted to `int`
auto state = Flag::Read | Flag::Write; // type is not `Flag::State` as one could expect, it is `int` instead
g(state); // This function calls the `int` overload rather than the `Flag::State` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compiles properly, but semantics are broken: `Flag::State` and `Plain::State` are unrelated types
std::bitset<sizeof(Flag::State)> flagSet; // Notice that the type of bitset only indicates the amount of bits, there's no type safety here either
std::bitset<sizeof(Plain::State)> plainSet;
// f(flagSet); bitset doesn't implicitly convert to `int`, so this wouldn't compile which is slightly better than c-style `enum`
flagSet.set(Flag::Read); // No type safety, which means that bitset
flagSet.reset(Plain::Read); // is willing to accept values from any enumeration
h(flagSet); // Both kinds of sets can be
h(plainSet); // passed to the same function
}
Even though you may think those problems are easy to spot on simple examples, they end up creeping up in every code base that builds flags on top of c-style enum and std::bitset.
So what can you do for better type safety? First, C++11's scoped enumeration is an improvement for type safety. But it hinders convenience a lot. Part of the solution is to use template-generated bitwise operators for scoped enums. Here is a great blog post which explains how it works and also provides working code : https://www.justsoftwaresolutions.co.uk/cplusplus/using-enum-classes-as-bitfields.html
Now let's see what this would look like :
enum class FlagState {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
template<>
struct enable_bitmask_operators<FlagState>{
static const bool enable=true;
};
enum class PlainState {
Read,
Write,
Binary,
Count
};
void f(int);
void g(int);
void g(FlagState);
FlagState h();
namespace system1 {
FlagState getFlags();
}
namespace system2 {
PlainState getFlags();
}
int main()
{
f(FlagState::Read); // Compile error, FlagState is not an `int`
f(PlainState::Read); // Compile error, PlainState is not an `int`
auto state = FlagState::Read | FlagState::Write; // type is `FlagState` as one could expect
g(state); // This function calls the `FlagState` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compile error, there is no `operator==(FlagState, PlainState)`
auto someFlag = h();
if (someFlag == FlagState::Read) {} // This compiles fine, but this is another type of recurring bug
}
The last line of this example shows one problem that still cannot be caught at compile time. In some cases, comparing for equality may be what's really desired. But most of the time, what is really meant is if ((someFlag & FlagState::Read) == FlagState::Read).
In order to solve this problem, we must differentiate the type of an enumerator from the type of a bitmask. Here's an article which details an improvement on the partial solution I referred to earlier : https://dalzhim.github.io/2017/08/11/Improving-the-enum-class-bitmask/
Disclaimer : I'm the author of this later article.
When using the template-generated bitwise operators from the last article, you will get all of the benefits we demonstrated in the last piece of code, while also catching the mask == enumerator bug.
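For readers who do not want to follow the links, here is a minimal sketch of the opt-in operator technique, in the spirit of the first blog post. The trait name enable_bitmask_operators matches the code above; has_flag is a hypothetical helper added for the example, and only | and & are shown.

```cpp
#include <cassert>
#include <type_traits>

// Opt-in trait: specialise with enable = true for each flag enum.
template <typename E>
struct enable_bitmask_operators {
    static const bool enable = false;
};

// The operators only take part in overload resolution for opted-in enums.
template <typename E>
typename std::enable_if<enable_bitmask_operators<E>::enable, E>::type
operator|(E lhs, E rhs) {
    using U = typename std::underlying_type<E>::type;
    return static_cast<E>(static_cast<U>(lhs) | static_cast<U>(rhs));
}

template <typename E>
typename std::enable_if<enable_bitmask_operators<E>::enable, E>::type
operator&(E lhs, E rhs) {
    using U = typename std::underlying_type<E>::type;
    return static_cast<E>(static_cast<U>(lhs) & static_cast<U>(rhs));
}

enum class FlagState { Read = 1 << 0, Write = 1 << 1, Binary = 1 << 2 };

template <>
struct enable_bitmask_operators<FlagState> {
    static const bool enable = true;
};

// Hypothetical helper: the test that is usually meant instead of ==.
inline bool has_flag(FlagState mask, FlagState flag) {
    return (mask & flag) == flag;
}

// Usage example.
inline void demo_flags() {
    FlagState state = FlagState::Read | FlagState::Write;
    assert(has_flag(state, FlagState::Read));
    assert(!has_flag(state, FlagState::Binary));
    assert(state != FlagState::Read);  // plain equality misses the flag
}
```

This sketch keeps enumerators and masks in the same type, so it does not catch the mask == enumerator bug at compile time.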
Some observations:
std::bitset< N > supports an arbitrary number of bits (e.g., more than 64 bits), whereas underlying integral types of enums are restricted to 64 bits;
std::bitset< N > can implicitly (depending on the std implementation) use the underlying integral type with the minimal size fitting the requested number of bits, whereas underlying integral types for enums need to be explicitly declared (otherwise, int will be used as the default underlying integral type);
std::bitset< N > represents a generic sequence of N bits, whereas scoped enums provide type safety that can be exploited for method overloading;
If std::bitset< N > is used as a bit mask, a typical implementation depends on an additional enum type for indexing (!= masking) purposes;
Note that the latter two observations can be combined to define a strong std::bitset type for convenience:
template< typename E, std::size_t N >
class BitSet : public std::bitset< N >
{
...
[[nodiscard]]
constexpr bool operator[](E pos) const;
...
};
and if the code supports some reflection to obtain the number of explicit enum values, then the number of bits can be deduced directly from the enum type.
Scoped enum types do not have bitwise operator overloads (which can easily be defined once using SFINAE or concepts for all scoped and unscoped enum types, but need to be included before use), and unscoped enum types will decay to the underlying integral type;
Bitwise operator overloads for enum types require less boilerplate than std::bitset< N > (e.g., auto flags = Depth | Stencil;);
enum types support both signed and unsigned underlying integral types, whereas std::bitset< N > internally uses unsigned integral types (shift operators).
FWIW, in my own code I mostly use std::bitset (and eastl::bitvector) as private bit/bool containers for setting/getting single bits/bools. For masking operations, I prefer scoped enum types with explicitly defined underlying types and bitwise operator overloads.
Do you compile with optimization on? It is very unlikely that there is a 24x speed factor.
To me, bitset is superior, because it manages space for you:
can be extended as much as wanted. If you have a lot of flags, you may run out of space in the int/long long version.
may take less space, if you only use just several flags (it can fit in an unsigned char/unsigned short - I'm not sure that implementations apply this optimization, though)
(Ad mode on)
You can get both: a convenient interface and max performance. And type-safety as well. https://github.com/oliora/bitmask

Best practice for checking for no index for an unsigned integer

Say I have this function:
void doThings(uint8_t index) {
if (an index is given) { ... }
}
Usually, an invalid index is -1, so that if statement would be if (index != -1). What if I'm using an unsigned integer to represent the index? Would it be silly to change the function definition to a signed int, just so I can test for -1? Is there a universally accepted number representing 'no index' for unsigned ints?
Simply overload doThings, something like this:
void doThings(uint8_t index) {
// do things for a given index
}
void doThings() {
// do things for no index
}
Or, if you're simply passing on the result of a function, say findElement, use a std::pair, something like:
std::pair<std::uint8_t, bool> findElement(...);
void doThings(const std::pair<std::uint8_t, bool>& arg) {
    if (arg.second) {
        // do things for given element arg.first
    }
}
and call it with:
doThings(findElement(...));
If you must take into account the two situations in the same function, a better option may be to just provide a second parameter.
void doThings(uint8_t index, bool indexGiven) {
if (indexGiven) { ... }
}
However, using two entirely different functions, one for when the index is given and one for when it is not, may lead to a cleaner design.
It's not silly to change the function definition to use a signed integer so you can check against -1. This is a common practice.
However, if the function is part of a well-defined and documented API that is used by others, then you may not want to change the function to use a signed int. Instead, I would suggest using the maximum value of the type (UINT_MAX for an unsigned int, or 0xFF for a uint8_t) as a flag for an invalid index.
I would go with a Maybe type.
// include some optional type, e.g. experimental/optional or boost/optional.hpp
using maybe_index = optional<std::uint8_t>;
void doThings(maybe_index index) {
if (index) { ... }
else { ... }
}
if I needed to be able to represent an index plus a special invalid state. GCC has an implementation of the proposed std::optional in std::experimental, and a Boost version is available.
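Since C++17, std::optional is part of the standard library, so the same idea no longer needs the experimental or Boost header. A minimal sketch, assuming a C++17 compiler; the string return value is only there to make the branch taken visible:

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical variant of doThings that reports which branch ran.
std::string doThings(std::optional<std::uint8_t> index = std::nullopt) {
    if (index) {
        // *index is the contained value; every uint8_t value stays usable.
        return "index " + std::to_string(*index);
    }
    return "no index";
}
```

A caller can write doThings(7), doThings(std::nullopt), or just doThings().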
Ask yourself what values are valid for the index. Typically, you have an array and the length of the array defines the valid range. If that array happens to be 256 elements long, you use every possible value of a uint8_t as valid index, which means that you need more to represent an "invalid index". If the array is smaller, any index out of range of that array is invalid, customarily the highest value would be used for that (i.e. static_cast<uint8_t>(-1) or using functions from the <limits> header).
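As a sketch of the sentinel convention described above, assuming the array never holds more than 255 elements. kNoIndex and findFirst are hypothetical names:

```cpp
#include <cstdint>
#include <limits>

// Assumed convention: the largest uint8_t value (255) means "no index".
constexpr std::uint8_t kNoIndex = std::numeric_limits<std::uint8_t>::max();

// The cast spelling yields the same value.
static_assert(kNoIndex == static_cast<std::uint8_t>(-1),
              "both spellings give 255");

// Returns the index of the first element equal to `value`, or kNoIndex.
// `size` must be at most 255 so that kNoIndex is never a valid index.
std::uint8_t findFirst(const int* data, std::uint8_t size, int value) {
    for (std::uint8_t i = 0; i < size; ++i) {
        if (data[i] == value) return i;
    }
    return kNoIndex;
}
```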
There are a bunch of approaches here already, e.g. an additional flag, using optional<uint8_t> (which you should remember, as it is applicable in any place where you have similar requirements, not just for indices) or using an overload (which is probably not the way to go since this requires a compile-time decision).
Instead, I'd use a larger index type. The usual type used to represent an index is size_t, which is an unsigned integral type that typically has the size of a pointer (i.e. 32 or 64 bit on common computers). If you switch to that, you will be able to address even the largest of arrays that you will ever have in memory (not e.g. on disk!). With that, you can also use static_cast<size_t>(-1) as signal value to represent "invalid index".

What optimizations are enabled by non-type template parameters?

I found this example at cppreference.com, and it seems to be the de facto example used throughout StackOverflow:
template<int N>
struct S {
int a[N];
};
Surely, non-type templatization has more value than this example. What other optimizations does this syntax enable? Why was it created?
I am curious, because I have code that is dependent on the version of a separate library that is installed. I am working in an embedded environment, so optimization is important, but I would like to have readable code as well. That being said, I would like to use this style of templating to handle version differences (examples below). First, am I thinking of this correctly, and MOST IMPORTANTLY does it provide a benefit or drawback over using a #ifdef statement?
Attempt 1:
template<int VERSION = 500>
void print (char *s);
template<int VERSION>
void print (char *s) {
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
}
template<>
void print<500> (char *s) {
// print using 500 syntax
}
template<>
void print<600> (char *s) {
// print using 600 syntax
}
OR - Since the template is constant at compile time, could a compiler consider the other branches of the if statement dead code using syntax similar to:
Attempt 2:
template<int VERSION = 500>
void print (char *s) {
if (VERSION == 500) {
// print using 500 syntax
} else if (VERSION == 600) {
// print using 600 syntax
} else {
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
}
}
Would either attempt produce output comparable in size to this?
void print (char *s) {
#if VERSION == 500
// print using 500 syntax
#elif VERSION == 600
// print using 600 syntax
#else
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
#endif
}
If you can't tell I'm somewhat mystified by all this, and the deeper the explanation the better as far as I'm concerned.
Compilers find dead code elimination easy. That is the case where you have a chain of ifs depending (only) on a template parameter's value or type. All branches must contain valid code, but when compiled and optimized the dead branches evaporate.
A classic example is a per-pixel operation written with template parameters that control details of code flow. The body can be full of branches, yet the compiled output is branchless.
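A rough sketch of what such a per-pixel kernel might look like. The Rgb struct, the function name, and the two flags are invented for illustration, and whether the constant branches really disappear depends on the compiler and optimization level:

```cpp
#include <cstddef>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Invert and SwapRedBlue are compile-time constants in each instantiation,
// so the optimizer can drop the branches that test them.
template <bool Invert, bool SwapRedBlue>
void processPixels(Rgb* pixels, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        Rgb p = pixels[i];
        if (SwapRedBlue) {
            const std::uint8_t tmp = p.r;
            p.r = p.b;
            p.b = tmp;
        }
        if (Invert) {
            p.r = static_cast<std::uint8_t>(255 - p.r);
            p.g = static_cast<std::uint8_t>(255 - p.g);
            p.b = static_cast<std::uint8_t>(255 - p.b);
        }
        pixels[i] = p;
    }
}
```

Each combination of flags, such as processPixels<true, false>, becomes a separate function, which is where the code-size cost comes from.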
Similar techniques can be used to unroll loops (say, scanline loops). Take care to understand the code-size multiplication that can result, especially if your toolchain lacks ICF (identical code folding, a.k.a. COMDAT folding), which the gold linker and MSVC (among others) provide.
Fancier things can also be done, like manual jump tables.
You can do pure compile-time type checks with no runtime behaviour at all, for things like dimensional analysis, or to distinguish between points and vectors in n-space.
Enums can be used to name types or switches. Pointers to functions to enable efficient inlining. Pointers to data to allow 'global' state that is mockable, or siloable, or decoupled from implementation. Pointers to strings to allow efficient readable names in code. Lists of integral values for myriads of purposes, like the indexes trick to unpack tuples. Complex operations on static data, like compile time sorting of data in multiple indexes, or checking integrity of static data with complex invariants.
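The "indexes trick" mentioned above can be sketched as follows, assuming C++14 for std::index_sequence. applyTuple and applyImpl are hypothetical names (C++17 ships a similar facility as std::apply):

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// The pack Is... is 0, 1, ..., N-1; expanding std::get<Is>(t)... turns the
// tuple into a list of call arguments.
template <typename F, typename Tuple, std::size_t... Is>
auto applyImpl(F&& f, Tuple&& t, std::index_sequence<Is...>) {
    return std::forward<F>(f)(std::get<Is>(std::forward<Tuple>(t))...);
}

// Builds the index list from the tuple's size and forwards to applyImpl.
template <typename F, typename Tuple>
auto applyTuple(F&& f, Tuple&& t) {
    constexpr std::size_t size =
        std::tuple_size<typename std::decay<Tuple>::type>::value;
    return applyImpl(std::forward<F>(f), std::forward<Tuple>(t),
                     std::make_index_sequence<size>{});
}
```

Here the non-type parameters are the std::size_t values in the pack; they exist only at compile time.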
I am sure I missed some.
An obvious optimization is when using an integer, the compiler has a constant rather than a variable:
int foo(size_t); // definition not visible
// vs
template<size_t N>
size_t foo() {return N*N;}
With the template, there's nothing to compute at runtime, and the result may be used as a constant, which can aid other optimizations. You can take this example further by declaring it constexpr, as 5gon12eder mentioned below.
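A small sketch of the difference, assuming C++11 or later. squareT, square, and buffer are illustrative names:

```cpp
#include <cstddef>

// N is a template argument, so the result is a compile-time constant.
template <std::size_t N>
constexpr std::size_t squareT() { return N * N; }

// constexpr alternative: usable at compile time and at run time.
constexpr std::size_t square(std::size_t n) { return n * n; }

// Both results can be used where a constant expression is required.
static_assert(squareT<4>() == 16, "evaluated by the compiler");
static_assert(square(4) == 16, "evaluated by the compiler");

int buffer[squareT<3>()];  // array of 9 ints
```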
Next example:
int foo(double, size_t); // definition not visible
// vs
template<size_t N>
size_t foo(double p) {
double r(p);
for (size_t i(0); i < N; ++i) {
r *= p;
}
return r;
}
Ok. Now the number of iterations of the loop is known. The loop may be unrolled/optimized accordingly, which can be good for size, speed, and eliminating branches.
Also, basing off your example, std::array<> exists. std::array<> can be much better than std::vector<> in some contexts, because std::vector<> uses heap allocations and non-local memory.
There's also the possibility that some specializations will have different implementations. You can separate those and (potentially) reduce other referenced definitions.
Of course, templates can also work against you by unnecessarily duplicating code in your program.
Templates also require longer symbol names.
Getting back to your version example: Yes, it's certainly possible that if VERSION is known at compilation, the code which is never executed can be deleted and you may also be able to reduce referenced functions. The primary difference will be that void print (char *s) will have a shorter name than the template (whose symbol name includes all template parameters). For one function, that's counting bytes. For complex programs with many functions and templates, that cost can go up quickly.
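If a C++17 compiler is available (an assumption; the question does not say), Attempt 2 could be written with if constexpr, which makes the compile-time selection explicit: inside a template, the discarded branch is not instantiated at all. In this sketch, formatForVersion and the "v500: "/"v600: " prefixes are placeholders for the real version-specific code:

```cpp
#include <string>

template <int VERSION = 500>
std::string formatForVersion(const std::string& s) {
    // An unsupported version becomes a compile error instead of a
    // run-time message.
    static_assert(VERSION == 500 || VERSION == 600, "unsupported version");
    if constexpr (VERSION == 500) {
        return "v500: " + s;  // stand-in for the 500 syntax
    } else {
        return "v600: " + s;  // stand-in for the 600 syntax
    }
}
```

Unlike the #if version, both branches are still parsed, so a typo in the 600 branch is reported even in a build that only uses 500.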
There is an enormous range of potential applications of non-type template parameters. In his book The C++ Programming Language, Stroustrup gives an interesting example that sketches out a type-safe zero-overhead framework for dealing with physical quantities. Basically, the idea is that he writes a template that accepts integers denoting the powers of fundamental physical quantities such as length or mass and then defines arithmetic on them. In the resulting framework, you can add speed to speed or divide distance by time, but you cannot add mass to time. Have a look at Boost.Units for an industrial-strength implementation of this idea.
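To give a flavour of the idea, here is a heavily simplified sketch. It is not Stroustrup's code and not the Boost.Units API, and all names are made up. The exponents of mass, length, and time are carried as non-type template parameters:

```cpp
// Exponents of mass (M), length (L) and time (T) are part of the type.
template <int M, int L, int T>
struct Quantity {
    double value;
};

// Addition is only defined for operands with identical dimensions.
template <int M, int L, int T>
constexpr Quantity<M, L, T> operator+(Quantity<M, L, T> a,
                                      Quantity<M, L, T> b) {
    return {a.value + b.value};
}

// Division subtracts the exponents, giving a new dimension.
template <int M1, int L1, int T1, int M2, int L2, int T2>
constexpr Quantity<M1 - M2, L1 - L2, T1 - T2>
operator/(Quantity<M1, L1, T1> a, Quantity<M2, L2, T2> b) {
    return {a.value / b.value};
}

using Mass   = Quantity<1, 0, 0>;
using Length = Quantity<0, 1, 0>;
using Time   = Quantity<0, 0, 1>;
using Speed  = Quantity<0, 1, -1>;
```

Length{100.0} / Time{20.0} has type Speed, whereas Mass{1.0} + Time{1.0} fails to compile because no operator+ matches. At run time each Quantity holds just a double.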
As for your second question, any reasonable compiler should be able to produce exactly the same machine code for
#define FOO
#ifdef FOO
do_foo();
#else
do_bar();
#endif
and
#define FOO_P 1
if (FOO_P)
do_foo();
else
do_bar();
except that the second version is much more readable and the compiler can catch errors in both branches simultaneously. Using a template is a third way to generate the same code but I doubt that it will improve readability.