uintmax_t for vector indexes - c++

I am dealing with basically a bit-flag search mask and I'm using vectors. These indexes need to go up to the max integer on the machine (defined in stdint.h)
Basically the problem is
searchMask[ UINTMAX_MAX] = false; // or any value > unsigned int
results in the following warning
warning: C4244: 'argument' : conversion from 'uintmax_t' to 'unsigned int',
possible loss of data
I have considered just using something like an unsigned char* = "1110010..." and just flipping the bits that way, but dealing with C-strings is always a pain and I suspect accessing the char array index will have this same size problem?
Can I make the indexes of the vector go off the uintmax_t, or should I go the C string route, or what?

Practically all the STL containers will use size_t as their size types. So, depending on your system, size_t might be defined as an unsigned int, which will probably be a 32-bit integer in your case. That would explain why the compiler is complaining.
UINTMAX_MAX is defined as UINT64_MAX, so it won't fit in a 32-bit integer. Try using the UINT32_MAX macro, or be platform-independant and use std::numeric_limits<size_t>::max().
Also, try using std::bitset<N>.

Related

Converting Integer Types

How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Different compilers will warn for something like:
int i = get_int();
size_t s = i;
for loss of signedness or
size_t s = get_size();
int i = s;
for narrowing.
casting can remove the warnings but don't solve the safety issue.
Is there a proper way of doing this?
You can try boost::numeric_cast<>.
boost numeric_cast returns the result of converting a value of type Source to a value of type Target. If out-of-range is detected, an exception is thrown (see bad_numeric_cast, negative_overflow and positive_overflow ).
How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Control when conversion is needed. As able, only convert when there is no value change. Sometimes, then one must step back and code at a higher level. IOWs, was a lossy conversion needed or can code be re-worked to avoid conversion loss?
It is not hard to add an if(). The test just needs to be carefully formed.
Example where size_t n and int len need a compare. Note that positive values of int may exceed that of size_t - or visa-versa or the same. Note in this case, the conversion of int to unsigned only happens with non-negative values - thus no value change.
int len = snprintf(buf, n, ...);
if (len < 0 || (unsigned)len >= n) {
// Handle_error();
}
unsigned to int example when it is known that the unsigned value at this point of code is less than or equal to INT_MAX.
unsigned n = ...
int i = n & INT_MAX;
Good analysis tools see that n & INT_MAX always converts into int without loss.
There is no built-in safe narrowing conversion between int types in c++ and STL. You could implement it yourself using as an example Microsoft GSL.
Theoretically, if you want perfect safety, you shouldn't be mixing types like this at all. (And you definitely shouldn't be using explicit casts to silence warnings, as you know.) If you've got values of type size_t, it's best to always carry them around in variables of type size_t.
There is one case where I do sometimes decide I can accept less than 100.000% perfect type safety, and that is when I assign sizeof's return value, which is a size_t, to an int. For any machine I am ever going to use, the only time this conversion might lose information is when sizeof returns a value greater than 2147483647. But I am content to assume that no single object in any of my programs will ever be that big. (In particular, I will unhesitatingly write things like printf("sizeof(int) = %d\n", (int)sizeof(int)), explicit cast and all. There is no possible way that the size of a type like int will not fit in an int!)
[Footnote: Yes, it's true, on a 16-bit machine the assumption is the rather less satisfying threshold that sizeof won't return a value greater than 32767. It's more likely that a single object might have a size like that, but probably not in a program that's running on a 16-bitter.]

Detection of overflow

How to detect overflow of unsigned char variable in c++?
unsigned numbers always positive between 0 to 255.and obey the law 2^n(n = numbers of bit in type);if char 8 bit then unsigned char variables have values between 0 and 255 , while signed chars have values between -128 and 127.
unsigned char Test = 260;
Because 260 is an integer literal your compiler should emit a warning. How to handle that? Do not ignore compiler warnings (or use an alternative syntax to avoid automatic conversions or enable this warning as error). Also note that integer literals are always positive (unsigned): -1 is not an integer literal: it's i.l. 1 and unary operator -. For gcc I'd suggest to use -Wstrict-overflow=2 (or more, according to your code policies) and possibly enabling -Werror=strict-overflow. For MS VC++ you may enable warning C4307 with /we4307 and /W14307 if you keep warnings at level 1 (!!!) (you may also do it with #pragma warning directive).
How to detect overflow of unsigned char variable in c++?
At compile-time compiler warnings are your friends but at run-time?
There is not a portable way (like, for example, checked in C#) to do this and better technique depends on which type of operation you want to monitor. For a simple assignment (made with values known at run-time) you may write something like this:
int32_t bigNumber = 260;
uint8_t smallNumber = static_cast<uint8_t>(bigNumber);
if (static_cast<int32_t>(smallNumber) != bigNumber) {
// Overflow...
}
In alternative you may check before assigning:
int32_t bigNumber = 260;
if (bigNumber > UINT8_MAX) {
// Overflow
}
Note that you may also make compiler life easier writing (after assignment):
if (smallNumber != bigNumber) {
// Overflow
}
It works because automatic promotions will convert smallNumber to bigNumber type (unless you're performing a signed/unsigned comparison, in this case you should simply avoid this alternative).
If you need it often you may write a small helper function to perform this conversion. For some ideas and possible implementations, if you're using MS compiler, you may take a look to SafeInt family functions (note, however, that in this case assignment and casting won't throw).
You can use braces to initialize your value to force a compile-time error (assuming you use C++11 or later):
unsigned char Test{260};
Brace-initialization doesn't allow narrowing conversions.
Of course, that still wouldn't allow sticking the value 260 into an unsigned char but it would draw attention to the attempt. You'd need a bigger data type, e.g., unsigned short, to represent 260.

Are signed hexdecimal literals possible?

I have an array of bitmasks, the idea being to use them to clear a specified number of the least significant bits of an integer that is being used as a set of flags. It is defined as follows:
int clearLow[10]=
{
0xffffffff, 0xfffffffe, 0xfffffffc, 0xfffffff8, 0xfffffff0, 0xffffffe0, 0xffffffc0, 0xffffff80, 0xffffff00, 0xfffffe00
};
I recently switching to using gcc 4.8 I have found that this array starts throwing warnings,
warning: narrowing conversion of ‘4294967295u’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11
etc
etc
Clearly my hexadecimal literals are being taken as unsigned ints and the fix is easy as, honestly, I do not care if this array is int or unsigned int it just needs to have the appropriate bits set in each cell, but my question is this:
Are there any ways to set literals in hexadecimal, for the purposes of simply setting bits, without the compiler assuming them to be unsigned?
You describe that you just want to use the values as operands to bit operations. As that is the case, just always use unsigned datatypes. That's the simple solution.
It looks like you just want an array of unsigned int to use for your bit masking:
const unsigned clearLow[] = {
0xffffffff, 0xfffffffe, 0xfffffffc, 0xfffffff8, 0xfffffff0, 0xffffffe0, 0xffffffc0, 0xffffff80, 0xffffff00, 0xfffffe00
};

What is the downside of replacing size_t with unsigned long

The library I am working on need to be used on both 32 and 64 bit machines; I have lots of compiler warnings because on 64bit machines unsigned int != size_t.
Is there any downside in replacing all unsigned ints and size_ts by 'unsigned long'? I appreciate it does not look very elegant, but, in out case, the memory is not too much of an issue... I am wondering if there is a possibility of any bugs/unwanted behaviour etc. created by such replace all operation (could you give examples)? Thanks.
What warnings? The most obvious one I can think of is for a "narrowing conversion", that is to say you're assigning size_t to unsigned int, and getting a warning that information might be lost.
The main downside of replacing size_t with unsigned long is that unsigned long is not guaranteed to be large enough to contain every possible value of size_t, and on Windows 64 it is not large enough. So you might find that you still have warnings.
The proper fix is that if you assign a size_t to a variable (or data member), you should make sure that variable has a type large enough to contain any value of size_t. That's what the warning is all about. So you should not switch to unsigned long, you should switch those variables to size_t.
Conversely, if you have a variable that doesn't need to be big enough to hold any size, just big enough for unsigned int, then don't use size_t for it in the first place.
Both types (size_t and unsigned int) have valid uses, so any approach that indiscriminately replaces all use of them by some other type must be wrong :-) Actually, you could replace everything with size_t or uintmax_t and for most programs that would be OK. The exceptions are where the code relies on using an unsigned type of the same size as int, or whatever, such that a larger type breaks the code.
The standard makes little guarantees about the sizes of types like int and long. size_t is guaranteed to be large enough to hold any object, and all std containers operate on size_t.
It's perfectly possible for a platform to define long as smaller than size_t, or have the size of long subject to compilation options, for example. To be safe, it's best to stick to size_t.
Another criterion to consider is that size_t carries a meaning - "this thing is used to store a size or an index." It makes the code slightly more self-documenting.
If you are using size_t in places where you should get a size_t and replace it with unsigned long, you will introduce new warnings.
example:
size_t count = some_vector.size();
Replace size_t with unsigned long, and (to the degree they are different) you will have introduced a new warning (because some_vector.size() returns a size_t - actually a std:::vector<something>::size_type but in practice it should evaluate to the same).
It may be a problem of assuming it's unsigned long when long is 8 bytes. then (unsigned int) -1 != (unsigned long) -1, the following code may have assertion failure.
unsigned int i = string::npos;
assert(i == string::npos);

size_t vs int in C++ and/or C

Why is it that in C++ containers, it returns a size_type rather than an int? If we're creating our own structures, should we also be encouraged to use size_type?
In general, size_t should be used whenever you are measuring the size of something. It is really strange that size_t is only required to represent between 0 and SIZE_MAX bytes and SIZE_MAX is only required to be 65,535...
The other interesting constraints from the C++ and C Standards are:
the return type of sizeof() is size_t and it is an unsigned integer
operator new() takes the number of bytes to allocate as a size_t parameter
size_t is defined in <cstddef>
SIZE_MAX is defined in <limits.h> in C99 but not mentioned in C++98?!
size_t is not included in the list of fundamental integer types so I have always assumed that size_t is a type alias for one of the fundamental types: char, short int, int, and long int.
If you are counting bytes, then you should definitely be using size_t. If you are counting the number of elements, then you should probably use size_t since this seems to be what C++ has been using. In any case, you don't want to use int - at the very least use unsigned long or unsigned long long if you are using TR1. Or... even better... typedef whatever you end up using to size_type or just include <cstddef> and use std::size_t.
A few reasons might be:
The type (size_t) can be defined as the largest unsigned integer on that platform. For example, it might be defined as a 32 bit integer or a 64 bit integer or something else altogether that's capable of storing unsigned values of a great length
To make it clear when reading a program that the value is a size and not just a "regular" int
If you're writing an app that's just for you and/or throwaway, you're probably fine to use a basic int. If you're writing a library or something substantial, size_t is probably a better way to go.
Some of the answers are more complicated than necessary. A size_t is an unsigned integer type that is guaranteed to be big enough to store the size in bytes of any object in memory. In practice, it is always the same size as the pointer type. On 32 bit systems it is 32 bits. On 64 bit systems it is 64 bits.
All containers in the stl have various typedefs. For example, value_type is the element type, and size_type is the number stored type. In this way the containers are completely generic based on platform and implementation.
If you are creating your own containers, you should use size_type too. Typically this is done
typedef std::size_t size_type;
If you want a container's size, you should write
typedef vector<int> ints;
ints v;
v.push_back(4);
ints::size_type s = v.size();
What's nice is that if later you want to use a list, just change the typedef to
typedef list<int> ints;
And it will still work!
I assume you mean "size_t" -- this is a way of indicating an unsigned integer (an integer that can only be positive, never negative) -- it makes sense for containers' sizes since you can't have an array with a size of -7. I wouldn't say that you have to use size_t but it does indicate to others using your code "This number here is always positive." It also gives you a greater range of positive numbers, but that is likely to be unimportant unless you have some very big containers.
C++ is a language that could be implemented on different hardware architectures and platforms. As time has gone by it has supported 16-, 32-, and 64-bit architecture, and likely others in the future. size_type and other type aliases are ways for libraries to insulate the programmers/code from implementation details.
Assuming the size_type uses 32 bits on 32-bit machines and 64 bits on 64-bit machines, the same source code likely would work better if you've used size_type where needed. In most cases you could assume it would be the same as unsigned int, but it's not guaranteed.
size_type is used to express capacities of STL containers like std::vector whereas size_t is used to express byte size of an object in C/C++.
ints are not guaranteed to be 4 bytes in the specification, so they are not reliable. Yes, size_type would be preferred over ints
size_t is unsigned, so even if they're both 32 bits it doesn't mean quite the same thing as an unqualified int. I'm not sure why they added the type, but on many platforms today sizeof (size_t) == sizeof (int) == sizeof (long), so which type you choose is up to you. Note that those relations aren't guaranteed by the standard and are rapidly becoming out of date as 64 bit platforms move in.
For your own code, if you need to represent something that is a "size" conceptually and can never be negative, size_t would be a fine choice.
void f1(size_t n) {
if (n <= myVector.size()) { assert(false); }
size_t n1 = n - myVector.size(); // bug! myVector.size() can be > n
do_stuff_n_times(n1);
}
void f2(int n) {
int n1 = n - static_cast<int>(myVector.size());
assert(n1 >= 0);
do_stuff_n_times(n1);
}
f1() and f2() both have the same bug, but detecting the problem in f2() is easier. For more complex code, unsigned integer arithmetic bugs are not as easy to identify.
Personally I use signed int for all my sizes unless unsigned int should be used. I have never run into situation where my size won't fit into a 32 bit signed integer. I will probably use 64 bit signed integers before I use unsigned 32 bit integers.
The problem with using signed integers for size is a lot of static_cast from size_t to int in your code.