Cross-platform C++: determining maximum integer value (no headers)

Write C++ functions, structs, or classes (using metaprogramming) that determine the maximum value for signed and unsigned types, according to the compiler's architecture: one for signed numbers and one for unsigned numbers.
Requirements:
no header files
self adjusting to variable sizes (no stdint.h)
no compiler warnings about possible overflow
Clarification:
After the comments I am surprised by the reaction to a non-typical C++ problem. I've learned it's good to stress that the problem is not homework and not from the moon; it comes from a practical domain.
For everyone interested in the application of this stuff: first of all, it is not homework :). It is a practical, answerable question based on actual problems that I face, as the SO.FAQ suggests. Thank you for the tips about climits etc., but I am looking for a "smart piece of code". For sure climits and limits are well-tested and good pieces of code, but they are huge and not necessarily "smart, tricky". We are looking here for smart solutions (not "huge-any" solutions), aren't we? Even so, the climits suggestions are OK as a starting point. For those wondering in what areas including header files is not allowed and the size of the source code matters, there are a few: experiments with compilers, program transformations, preparing problem sets for programming contests, etc. Actually, three of them are relevant to problems I am currently struggling with. So I don't think it's (SO.FAQ) too localized, and I think it is, for sure, a question for (SO.FAQ) enthusiast programmers. If you think that, even with all of this, there is something wrong with this question, please let me know; I don't want to make the mistake again. If it's OK, please let me know what I could do better so it doesn't get downvoted.

Under reasonable assumptions for two's complement representation:
#include <iostream> // only needed for the std::cout demo in main(); maxval itself uses no headers

template<typename T> struct maxval;

template<> struct maxval<unsigned char>
{
    static const unsigned char value = (unsigned char) ~0;
};

template<> struct maxval<signed char>
{
    static const signed char value = ((unsigned char) ~0) >> 1;
};

template<> struct maxval<unsigned short>
{
    static const unsigned short value = (unsigned short) ~0;
};

template<> struct maxval<short>
{
    static const short value = ((unsigned short) ~0) >> 1;
};

int main()
{
    std::cout << (int)maxval<signed char>::value << std::endl;
}
Likewise for the rest of the types.
Need to distinguish between signed and unsigned types when determining the max value. The easy way is to enumerate all of them like in the above example.
Perhaps it can be done with a combination of enable_if and std::is_unsigned, but reimplementing them (no headers!) will still require enumerating all types.

For unsigned types, it's simple: T(-1) will always be the maximum for that type (-1 is reduced modulo the maximum to fit in the range, always giving the maximum for the type).
For signed integer types, the job is almost as easy, at least in practice: take the maximum unsigned value, shift right one bit, and cast to signed. For C99 or C++11 that will work because only three representations for integers (1's complement, signed magnitude and 2's complement) are allowed (and it gives the correct result for all three). In theory, for C89/90 and/or C++98/03, it might be possible to design a conforming signed type for which it would fail (e.g., a biased representation where the bias was not range/2).
For those, and for floating point types (which have no unsigned counterparts), the job is rather more difficult. There's a reason these are provided in a header instead of being left for you to compute on your own...
Edit: As far as how to implement this in C++, most of the difficulty is in specializing a template for an unsigned type. The most obvious way to do that is probably to use SFINAE, with an expression that is only legal for a signed type (or only for an unsigned type). The usual trick for that is an array whose size is something like T()-1>0. This yields false for a signed type, which converts to 0; since you can't create a zero-sized array, that attempted substitution will fail. For an unsigned type, the -1 will "wrap" to the maximum value, so it creates a size of 1, which is allowed.
Since this seems to be homework, I'm not going to show an actual, working implementation for that though.

This works for unsigned types:
template <typename t>
constexpr t max_val() { // constexpr c++11 thing, you can remove it for c++03
return ~(t(0));
}
The signed maximum can't be found portably, as you can't assume the number of bits or the encoding.

Signedness could be determined at compile-time if you wish to merge maxSigned with maxUnsigned.
#include <iostream>
#include <cstddef> // for ptrdiff_t
#include <cstdint> // for intptr_t, uintptr_t

template <typename T> static inline bool is_signed() {
    // cast back to T so the test also works for types narrower than int,
    // which would otherwise be promoted to int before the comparison
    return T(~T(0)) < T(1);
}

template <typename T> static inline T min_value() {
    // note: left-shifting a negative value is technically undefined before C++20,
    // but behaves as expected on common two's complement compilers
    return is_signed<T>() ? ~T(0) << (sizeof(T)*8-1) : T(0);
}

template <typename T> static inline T max_value() {
    return ~min_value<T>();
}

#define REPORT(type) do{ std::cout\
    << "<" #type "> is " << (is_signed<type>() ? "signed" : "unsigned")\
    << ", with lower limit " << min_value<type>()\
    << " and upper limit " << max_value<type>()\
    << std::endl; }while(false)

int main(int argc, char* argv[]) {
    REPORT(char); // char streams as a character, so min/max are not printed numerically
    REPORT(int);
    REPORT(unsigned);
    REPORT(long long);
    REPORT(unsigned long long);
    REPORT(ptrdiff_t);
    REPORT(size_t);
    REPORT(uintptr_t);
    REPORT(intptr_t);
}

Here is my own answer, since I wanted neither to enumerate all the types (as in Chill's answer) nor to rely on overflow (as I stated, I don't want compiler warnings), which some of the previous answers involve. Here is what I've found.
As Pubby has shown, the unsigned case is simple:
template <class T>
T maxUnsigned(){ return ~T(0); }
As Chill mentioned:
Under reasonable assumptions for two's complement representation
Here is my metaprogramming solution for the signed case (the metaprogramming is there to avoid overflow compiler warnings):
template<class T, int N> struct SignedMax {
    const static T value = (T(1)<<N) + SignedMax<T, N - 1>::value;
};

template<class T> struct SignedMax<T, 0> {
    const static T value = 1;
};

template<class T>
T maxSigned(){
    return SignedMax<T, sizeof(T)*8-2>::value;
}
And an example of the whole thing working:
#include <iostream>
using std::cout;
using std::endl;
//(...)
#define PSIGNED(T) std::cout << #T "\t" << maxSigned<T>() << std::endl

int main(){
    cout << maxSigned<short int>() << endl;
    cout << maxSigned<int>() << endl;
    cout << maxSigned<long long>() << endl;
    cout << maxUnsigned<unsigned short int>() << endl;
    cout << maxUnsigned<unsigned int>() << endl;
    cout << maxUnsigned<unsigned long long>() << endl;
    return 0;
}

C++ enum flags vs bitset

What are the pros and cons of using bitsets over enum flags?
#include <bitset>
#include <iostream>

namespace Flag {
    enum State {
        Read   = 1 << 0,
        Write  = 1 << 1,
        Binary = 1 << 2,
    };
}

namespace Plain {
    enum State {
        Read,
        Write,
        Binary,
        Count
    };
}

int main()
{
    {
        unsigned int state = Flag::Read | Flag::Binary;
        std::cout << state << std::endl;
        state |= Flag::Write;
        state &= ~(Flag::Read | Flag::Binary);
        std::cout << state << std::endl;
    } {
        std::bitset<Plain::Count> state;
        state.set(Plain::Read);
        state.set(Plain::Binary);
        std::cout << state.to_ulong() << std::endl;
        state.flip();
        std::cout << state.to_ulong() << std::endl;
    }
    return 0;
}
As far as I can see, bitsets have more convenient set/clear/flip functions to work with, but enum flags are the more widespread approach.
What are the possible downsides of bitsets, and which of the two should I use in my daily code, and when?
Both std::bitset and c-style enum have important downsides for managing flags. First, let's consider the following example code:
#include <bitset> // for std::bitset used below

namespace Flag {
enum State {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
}
namespace Plain {
enum State {
Read,
Write,
Binary,
Count
};
}
void f(int);
void g(int);
void g(Flag::State);
void h(std::bitset<sizeof(Flag::State)>);
namespace system1 {
Flag::State getFlags();
}
namespace system2 {
Plain::State getFlags();
}
int main()
{
f(Flag::Read); // Flag::Read is implicitly converted to `int`, losing type safety
f(Plain::Read); // Plain::Read is also implicitly converted to `int`
auto state = Flag::Read | Flag::Write; // type is not `Flag::State` as one could expect, it is `int` instead
g(state); // This function calls the `int` overload rather than the `Flag::State` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compiles properly, but semantics are broken: values from the unrelated `Flag::State` and `Plain::State` are compared as plain ints
std::bitset<sizeof(Flag::State)> flagSet; // Notice that the type of bitset only indicates the amount of bits, there's no type safety here either
std::bitset<sizeof(Plain::State)> plainSet;
// f(flagSet); bitset doesn't implicitly convert to `int`, so this wouldn't compile which is slightly better than c-style `enum`
flagSet.set(Flag::Read); // No type safety, which means that bitset
flagSet.reset(Plain::Read); // is willing to accept values from any enumeration
h(flagSet); // Both kinds of sets can be
h(plainSet); // passed to the same function
}
Even though you may think those problems are easy to spot in simple examples, they end up creeping into every code base that builds flags on top of c-style enum and std::bitset.
So what can you do for better type safety? C++11's scoped enumerations are an improvement for type safety, but on their own they hinder convenience a lot. Part of the solution is to use template-generated bitwise operators for scoped enums. Here is a great blog post which explains how it works and also provides working code: https://www.justsoftwaresolutions.co.uk/cplusplus/using-enum-classes-as-bitfields.html
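The core of that technique is an opt-in trait plus operators that are enabled only for opted-in enum types. A minimal sketch of the idea (assuming C++11; the linked post has the complete, polished set of operators):
#include <type_traits>

// Primary trait: bitmask operators are disabled by default.
// Opting in is done by specializing it, as shown for FlagState below.
template<typename E>
struct enable_bitmask_operators {
    static const bool enable = false;
};

// operator| participates in overload resolution only for enums that opted in;
// the other bitwise operators (&, ^, ~, |=, ...) follow the same pattern.
template<typename E>
typename std::enable_if<enable_bitmask_operators<E>::enable, E>::type
operator|(E lhs, E rhs)
{
    typedef typename std::underlying_type<E>::type underlying;
    return static_cast<E>(static_cast<underlying>(lhs) | static_cast<underlying>(rhs));
}
With this in place, the specialization of enable_bitmask_operators<FlagState> in the next snippet is what switches the operators on for FlagState.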
Now let's see what this would look like:
enum class FlagState {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
template<>
struct enable_bitmask_operators<FlagState>{
static const bool enable=true;
};
enum class PlainState {
Read,
Write,
Binary,
Count
};
void f(int);
void g(int);
void g(FlagState);
FlagState h();
namespace system1 {
FlagState getFlags();
}
namespace system2 {
PlainState getFlags();
}
int main()
{
f(FlagState::Read); // Compile error, FlagState is not an `int`
f(PlainState::Read); // Compile error, PlainState is not an `int`
auto state = FlagState::Read | FlagState::Write; // type is `FlagState` as one could expect
g(state); // This function calls the `FlagState` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compile error, there is no `operator==(FlagState, PlainState)`
auto someFlag = h();
if (someFlag == FlagState::Read) {} // This compiles fine, but this is another type of recurring bug
}
The last line of this example shows one problem that still cannot be caught at compile time. In some cases, comparing for equality may be what's really desired. But most of the time, what is really meant is if ((someFlag & FlagState::Read) == FlagState::Read).
In order to solve this problem, we must differentiate the type of an enumerator from the type of a bitmask. Here's an article which details an improvement on the partial solution I referred to earlier : https://dalzhim.github.io/2017/08/11/Improving-the-enum-class-bitmask/
Disclaimer: I'm the author of this latter article.
When using the template-generated bitwise operators from the last article, you will get all of the benefits we demonstrated in the last piece of code, while also catching the mask == enumerator bug.
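To give a rough idea of what that improvement looks like (this is only an illustrative sketch; the linked article's actual implementation differs in detail), the key move is a dedicated mask type, so that a mask can no longer be compared directly against a single enumerator:
#include <type_traits>

template<typename E>
class bitmask {
    typedef typename std::underlying_type<E>::type underlying;
    underlying bits;

    constexpr explicit bitmask(underlying b) : bits(b) {}

public:
    constexpr bitmask() : bits(0) {}
    constexpr bitmask(E e) : bits(static_cast<underlying>(e)) {}

    constexpr bitmask operator|(E e) const
    { return bitmask(static_cast<underlying>(bits | static_cast<underlying>(e))); }

    // Membership has to be asked for explicitly...
    constexpr bool test(E e) const
    { return (bits & static_cast<underlying>(e)) != 0; }
    // ...and there is deliberately no operator== against E, so a stray
    // "mask == FlagState::Read" fails to compile instead of silently lying.
};
With such a type in play, a comparison like someFlag == FlagState::Read can be turned into a compile-time error instead of a latent bug.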
Some observations:
std::bitset< N > supports an arbitrary number of bits (e.g., more than 64 bits), whereas underlying integral types of enums are restricted to 64 bits;
std::bitset< N > can implicitly (depending on the std implementation) use the underlying integral type with the minimal size fitting the requested number of bits, whereas underlying integral types for enums need to be explicitly declared (otherwise, int will be used as the default underlying integral type);
std::bitset< N > represents a generic sequence of N bits, whereas scoped enums provide type safety that can be exploited for method overloading;
If std::bitset< N > is used as a bit mask, a typical implementation depends on an additional enum type for indexing (!= masking) purposes;
Note that the latter two observations can be combined to define a strong std::bitset type for convenience:
template< typename E, std::size_t N >
class BitSet : public std::bitset< N >
{
    ...
    [[nodiscard]]
    constexpr bool operator[](E pos) const;
    ...
};
and if the code supports some reflection to obtain the number of explicit enum values, then the number of bits can be deduced directly from the enum type.
scoped enum types do not have bitwise operator overloads (which can easily be defined once using SFINAE or concepts for all scoped and unscoped enum types, but need to be included before use) and unscoped enum types will decay to the underlying integral type;
bitwise operator overloads for enum types, require less boilerplate than std::bitset< N > (e.g., auto flags = Depth | Stencil;);
enum types support both signed and unsigned underlying integral types, whereas std::bitset< N > internally uses unsigned integral types (shift operators).
FWIW, in my own code I mostly use std::bitset (and eastl::bitvector) as private bit/bool containers for setting/getting single bits/bools. For masking operations, I prefer scoped enum types with explicitly defined underlying types and bitwise operator overloads.
To me, bitset is superior, because it manages space for you:
can be extended as much as wanted. If you have a lot of flags, you may run out of space in the int/long long version.
may take less space, if you only use a few flags (it could fit in an unsigned char/unsigned short - I'm not sure that implementations apply this optimization, though)
(Ad mode on)
You can get both: a convenient interface and max performance. And type-safety as well. https://github.com/oliora/bitmask

std::cout deals with uint8_t as a character

If I run this code:
std::cout << static_cast<uint8_t>(65);
It will output:
A
Which is the ASCII equivalent of the number 65.
This is because uint8_t is simply defined as:
typedef unsigned char uint8_t;
Is this behavior standard?
Shouldn't there be a better way to define uint8_t that guarantees it is treated as a number, not a character?
I cannot understand the logic: if I want to print the value of a uint8_t variable, why will it be printed as a character?
P.S. I am using MSVS 2013.
Is this behavior standard?
The behavior is standard in the sense that, if uint8_t is a typedef of unsigned char, it will always print as a character: std::ostream has an overload for unsigned char and prints the contents of the variable as a character.
Shouldn't there be a better way to define uint8_t that guarantees it is treated as a number, not a character?
In order to do this, the C++ committee would have had to introduce a new fundamental type. Currently the only types that have a sizeof() equal to 1 are char, signed char, and unsigned char. They could possibly have used bool, but bool does not have to have a size of 1, and you would still be in the same boat, since
#include <iostream>

int main()
{
    bool foo = 42;
    std::cout << foo << '\n';
}
will print 1, not 42, as any non-zero value is true and true is printed as 1 by default.
I'm not saying it can't be done, but it is a lot of work for something that can be handled with a cast or a function.
C++17 introduces std::byte, which is defined as enum class byte : unsigned char {};. So it will be one byte wide, but it is not a character type. Unfortunately, since it is an enum class it comes with its own limitations. The bit-wise operators have been defined for it, but there are no built-in stream operators for it, so you would need to define your own to input and output it. That means you are still converting it, but at least you won't conflict with the built-in operators for unsigned char. That gives you something like
#include <cstddef>  // std::byte, std::to_integer
#include <iostream>

std::ostream& operator <<(std::ostream& os, std::byte b)
{
    return os << std::to_integer<unsigned int>(b);
}

std::istream& operator >>(std::istream& is, std::byte& b)
{
    unsigned int temp;
    is >> temp;
    b = static_cast<std::byte>(temp);
    return is;
}

int main()
{
    std::byte foo{10};
    std::cout << foo;
}
Posting an answer as there is some misinformation in comments.
uint8_t may or may not be a typedef for unsigned char. It is also possible for it to be an extended unsigned integer type (and so, not a character type).
Compilers may offer other integer types besides the minimum set required by the standard (short, int, long, etc). For example some compilers offer a 128-bit integer type.
This would not "conflict with C" either, since C and C++ both allow for extended integer types.
So, your code has to allow for both possibilities. The suggestion in comments of using unary + would work.
Personally I think it would make more sense if the standard required uint8_t to not be a character type, as the behaviour you have noticed is unintuitive.
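For example, a small sketch of the unary + suggestion above (the promotion to int makes the numeric overload kick in):
#include <cstdint>
#include <iostream>

int main()
{
    std::uint8_t v = 65;
    std::cout << v << '\n';   // character overload: prints 'A' (when uint8_t is unsigned char)
    std::cout << +v << '\n';  // unary + promotes to int: prints 65
}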
It's indirectly standard behavior, because ostream has an overload for unsigned char, and uint8_t is a typedef for that same type on your system.
§27.7.3.1 [output.streams.ostream] gives:
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char);
I couldn't find anywhere in the standard that explicitly stated that uint8_t and unsigned char had to be the same, though. It's just that it's reasonable that they both occupy 1 byte in nearly all implementations.
std::cout << std::boolalpha << std::is_same<uint8_t, unsigned char>::value << std::endl; // prints true
To get the value to print as an integer, you need a type that is not unsigned char (or one of the other character overloads). Probably a simple cast to uint16_t is adequate, because the standard doesn't list an overload for it:
uint8_t a = 65;
std::cout << static_cast<uint16_t>(a) << std::endl; // prints 65

C4244: '+=' : conversion from 'std::streamsize' to 'size_t', possible loss of data

I have migrated my VC++ project form VS2008 to VS2013 and got some warnings like:
C4244: '+=' : conversion from 'std::streamsize' to 'size_t', possible loss of data.
How can I resolve this type of warning?
As stated in the C++ reference, std::streamsize is defined as signed (emphasis mine):
The type std::streamsize is a signed integral type used to represent the number of characters transferred in an I/O operation or the size of an I/O buffer. It is used as a signed counterpart of std::size_t, similar to the POSIX type ssize_t.
Anyway, the exact implementation does not seem to be specified.
Usually a conversion from a signed to an unsigned type of the same width (e.g. long) shouldn't issue a warning about possible data loss (unless the sign of the value is actually meaningful).
It's probably a poor implementation in Visual Studio C++.
In MSVC 2013 std::streamsize is:
typedef _Longlong streamsize;
typedef _LONGLONG _Longlong;
#define _LONGLONG __int64
And size_t is:
typedef unsigned __int64 size_t;
Thus an easy repro case is:
__int64 b = 1;
unsigned __int64 a = b;
However this doesn't issue a warning - so perhaps you have redefined size_t somewhere to be 32 bits?
For clarity:
std::streamsize b = 1;
size_t a = 0;
a = b;
Also issues no warning.
It depends on the use. According to cppreference.com,
Except in the constructors of std::strstreambuf, negative values of std::streamsize are never used.
So you can safely cast your signed value.
std::streamsize i;
// ...
size_t u = static_cast<size_t>(i);
However, in the more general case (as opposed to what πάντα ῥεῖ wrote), I think the warning is valid (even though gcc doesn't spit out a similar one). When comparing signed and unsigned values, it's best to be explicit about what you mean.
You can force unsigned-ness, e.g. using a code snippet such as the following (thanks to this question for the general form):
#include <iostream>
#include <stdexcept>    // std::overflow_error
#include <type_traits>  // std::enable_if, std::is_signed, std::make_unsigned

template <typename T>
typename std::enable_if< std::is_signed<T>::value, typename std::make_unsigned<T>::type >::type
force_unsigned(T u) {
    if (u < 0) {
        throw std::overflow_error("Cannot use negative value as unsigned type");
    }
    return static_cast< typename std::make_unsigned<T>::type >(u);
}

template <typename T>
typename std::enable_if< !std::is_signed<T>::value, T >::type
force_unsigned(T u) {
    return u;
}

int main() {
    std::cout << force_unsigned((unsigned int)1) << std::endl;
    std::cout << force_unsigned((int)1) << std::endl;
    std::cout << force_unsigned((int)0) << std::endl;
    try {
        std::cout << force_unsigned((int)-1) << std::endl;
    } catch (const std::overflow_error& e) {
        std::cout << e.what() << std::endl;  // negative values are rejected at run time
    }
    return 0;
}

Type casting struct to integer and vice versa in C++

So, I've seen this thread Type casting struct to integer c++ about how to cast between integers and structs (bitfields), and undoubtedly, writing a proper conversion function or overloading the relevant casting operators is the way to go in any case where an operating system is involved.
However, when writing firmware for a small embedded system where only one flash image is run, the case might be different insofar as security isn't so much of a concern while performance is.
Since I can test whether the code works properly (meaning the bits of a bitfield are arranged the way I expect them to be) each time I compile the code, the answer might be different here.
So, my question is, whether there is a 'proper' way to convert between bitfield and unsigned int that does compile to no operations in g++ (maybe shifts will get optimised away when the compiler knows the bits are arranged correctly in memory).
This is an excerpt from the original question:
struct {
    int part1 : 10;
    int part2 : 6;
    int part3 : 16;
} word;
I can then set part2 to be equal to whatever value is requested, and set the other parts as 0.
word.part1 = 0;
word.part2 = 9;
word.part3 = 0;
I now want to take that struct, and convert it into a single 32 bit integer. I do have it compiling by forcing the casting, but it does not seem like a very elegant or secure way of converting the data.
int x = *reinterpret_cast<int*>(&word);
EDIT:
Now, quite some time later, I have learned some things:
1) Type punning (changing the interpretation of data) by means of pointer casting is undefined behaviour since C99 and C++98. These language changes introduced strict aliasing rules (they allow the compiler to assume that data is only accessed through pointers of compatible type) to allow for better optimisations. In effect, the compiler does not need to keep the ordering between accesses (or do the off-type access at all). In most cases, this does not seem to present an [immediate] problem, but when using higher optimisation settings (for gcc that is -O2 and above, which include -fstrict-aliasing) it will become a problem.
For examples see https://blog.regehr.org/archives/959
2) Using unions for type punning also seems to involve undefined behaviour in C++ but not C (See https://stackoverflow.com/a/25672839/4360539), however GCC (and probably others) does explicitly allow it: (See https://gcc.gnu.org/bugs/#nonbugs).
3) The only really reliable way of doing type punning in C++ seems to be using memcpy to copy the data to a new location and perform whatever is to be done and then to use another memcpy to return the changes. I did read somewhere on SO, that GCC (or most compilers probably) should be able to optimise the memcpy to a mere register copy for register-sized data types, but I cannot find it again.
So probably the best thing to do here is to use the union, if you can be sure the code is compiled only by compilers that support type punning through a union. For the other cases, further investigation is needed into how the compiler treats bigger data structures and memcpy; if that really involves copying back and forth, sticking with bitwise operations is probably the best idea.
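For illustration, a minimal sketch of the memcpy approach from point 3 (names like Word, to_u32 and from_u32 are just for the example; it assumes the bit-field struct and the integer have the same size):
#include <cstdint>
#include <cstring>

struct Word {
    int part1 : 10;
    int part2 : 6;
    int part3 : 16;
};

// Reinterpret the bit-field struct as a 32-bit integer without breaking
// strict aliasing; optimizing compilers typically turn this memcpy into
// a plain register move.
std::uint32_t to_u32(const Word& w)
{
    static_assert(sizeof(Word) == sizeof(std::uint32_t), "unexpected size");
    std::uint32_t out;
    std::memcpy(&out, &w, sizeof out);
    return out;
}

Word from_u32(std::uint32_t v)
{
    Word w;
    std::memcpy(&w, &v, sizeof w);
    return w;
}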
union {
    struct {
        int part1: 10;
        int part2: 6;
        int part3: 16;
    } parts;
    int whole;
} word;
Then just use word.whole.
I had the same problem. I am guessing this is not very relevant today. But this is how I solved it:
#include <iostream>
struct PACKED {
    int x:10;
    int y:10;
    int z:12;

    PACKED operator=(int num)
    {
        *( int* )this = num;
        return *this;
    }

    operator int()
    {
        int *x;
        x = (int*)this;
        return *x;
    }
} __attribute__((packed));

int main(void) {
    std::cout << "size: " << sizeof(PACKED) << std::endl;
    PACKED bf;
    bf = 0xFFF00000;
    std::cout << "Values ( x, y, z ) = " << bf.x << " " << bf.y << " " << bf.z << std::endl;
    int testint;
    testint = bf;
    std::cout << "As integer: " << testint << std::endl;
    return 0;
}
This now fits in an int and is assignable from plain ints. However, I do not know how portable this solution is. The output is then:
size: 4
Values ( x, y, z ) = 0 0 -1
As integer: -1048576

When do we need to mention/specify the type of integer for number literals?

I came across a code like below:
#define SOME_VALUE 0xFEDCBA9876543210ULL
This SOME_VALUE is assigned to some unsigned long long later.
Questions:
Is there a need to have a suffix like ULL in this case?
In what situations do we need to specify the type of the integer literal?
Do C and C++ behave differently in this case?
In C, a hexadecimal literal gets the first type of int, unsigned int, long, unsigned long, long long or unsigned long long that can represent its value if it has no suffix. I wouldn't be surprised if C++ has the same rules.
You would need a suffix if you want to give a literal a larger type than it would have by default, or if you want to force its signedness; consider for example
1 << 43;
Without a suffix, that is (almost certainly) undefined behaviour, but 1LL << 43; for example would be fine.
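A small check of the rule described above, assuming a typical platform where int is 32 bits wide (the exact types are implementation-dependent):
#include <type_traits>

int main()
{
    // 0xFFFFFFFF does not fit in a 32-bit int, so the hexadecimal literal
    // takes the next type in the list that can represent it: unsigned int.
    static_assert(std::is_same<decltype(0xFFFFFFFF), unsigned int>::value,
                  "hex literal picked unsigned int");
    // A suffix forces the type regardless of the value.
    static_assert(std::is_same<decltype(0xFFFFFFFFULL), unsigned long long>::value,
                  "ULL suffix forces unsigned long long");
}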
I think not, but maybe that was required for that compiler.
For example, printf("%ld", SOME_VALUE); if SOME_VALUE's integer type is not specified, this might end up with the wrong output.
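For instance, a short sketch of matching the format specifier to the suffixed type (the constant is the one from the question):
#include <cstdio>

#define SOME_VALUE 0xFEDCBA9876543210ULL

int main()
{
    // "%ld" expects a long; passing an unsigned long long through it is
    // undefined and easily prints garbage on platforms with a 32-bit long.
    // "%llu" matches the unsigned long long produced by the ULL suffix.
    std::printf("%llu\n", SOME_VALUE);
}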
A good example for the use of specifying a suffix in C++ is overloaded functions. Take the following for example:
#include <iostream>
void consumeInt(unsigned int x)
{
std::cout << "UINT" << std::endl;
}
void consumeInt(int x)
{
std::cout << "INT" << std::endl;
}
void consumeInt(unsigned long long x)
{
std::cout << "ULL" << std::endl;
}
int main(int argc, const char * argv[])
{
consumeInt(5);
consumeInt(5U);
consumeInt(5ULL);
return 0;
}
Results in:
INT
UINT
ULL
You do not need suffixes if your only intent is to get the right value of the number; C automatically chooses a type in which the value fits.
The suffixes are important if you want to force the type of the expression, e.g. for purposes of how it interacts in expressions. Making it long, or long long, may be needed when you're going to perform an arithmetic operation that would overflow a smaller type (for example, 1ULL<<n or x*10LL), and making it unsigned is useful when you want the expression as a whole to have unsigned semantics (for example, c-'0'<10U, or n%2U).
When you don't write any suffix, the compiler deduces the type of the integral literal itself (int, if the value fits). Since a large literal may not fit in an int, you add a suffix to tell the compiler to treat it as a different type. That is what you do when you write 0xFEDCBA9876543210ULL.
You can also use a suffix when you write a floating-point number: 1.2 is a double, while 1.2f is a float.
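Analogous to the integer overload example above, a quick sketch:
#include <iostream>

void consumeFloat(float)  { std::cout << "FLOAT" << std::endl; }
void consumeFloat(double) { std::cout << "DOUBLE" << std::endl; }

int main()
{
    consumeFloat(1.2f); // prints FLOAT
    consumeFloat(1.2);  // prints DOUBLE
    return 0;
}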