I have expressions like this in my code:
QByteArray idx0 = ...
unsigned short ushortIdx0;
if ( idx0.size() >= sizeof(ushortIdx0) ) {
    // Do something
}
But I'm getting the warning:
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if ( idx0.size() >= sizeof(ushortIdx0) ) {
~~~~~~~~~~~~^~~~~~~~~~
Why is size() of QByteArray returned as an int rather than an unsigned int? How can I get rid of this warning safely?
Some folk feel that the introduction of unsigned types into C all those years ago was a bad idea. Such types found themselves introduced into C++, where they are deeply embedded in the C++ Standard Library and operator return types.
Yes, sizeof must, by the standard, return an unsigned type.
The Qt developers adopt the modern thinking that the unsigned types were a bad idea, and favour instead making the return type of size a signed type. Personally I find it idiosyncratic.
To solve this, you could (i) live with the warning, (ii) switch it off for the duration of the function, or (iii) write something like
(std::size_t)idx0.size() >= sizeof(ushortIdx0)
at the expense of clarity.
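For option (ii), a minimal sketch of silencing the warning only around this comparison, assuming GCC or Clang (MSVC has an equivalent #pragma warning(push/disable/pop)):

// Hypothetical sketch: suppress -Wsign-compare just for this check.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wsign-compare"
if ( idx0.size() >= sizeof(ushortIdx0) ) {
    // Do something
}
#pragma GCC diagnostic pop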
Why is size() of QByteArray returned as an int rather than an unsigned int?
I literally have no idea why Qt chose a signed return type for size(). However, there are good reasons to use a signed type instead of an unsigned one.
One infamous example where an unsigned size() fails miserably is this quite innocent-looking loop:
for (int i = 0; i < some_container.size() - 1; ++i) {
    do_something(some_container[i], some_container[i + 1]);
}
It's not too uncommon for the loop body to operate on two adjacent elements, and in that case it seems a valid choice to iterate only up to some_container.size() - 1.
However, if the container is empty, some_container.size() - 1 will silently (unsigned wraparound is well defined) turn into the largest value of the unsigned type. Hence, instead of avoiding the out-of-bounds access, it leads to the worst out-of-bounds access you can get.
Note that there are easy fixes for this problem, but if size() does return a signed value, then there is no issue that needs to be fixed in the first place.
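For completeness, a minimal sketch of one such easy fix: rearrange the condition so the subtraction on an unsigned zero never happens.

// Sketch: compare i + 1 < size() instead of i < size() - 1,
// so an empty container simply skips the loop rather than wrapping around.
for (std::size_t i = 0; i + 1 < some_container.size(); ++i) {
    do_something(some_container[i], some_container[i + 1]);
}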
Because Qt containers (QByteArray, QVector, ...) have functions that can return a negative number (indexOf, lastIndexOf, ...) and functions that accept negative numbers (mid, ...), the developers use a signed type (int) to stay consistent within the class and across the framework.
You can use a standard C++ cast:
if ( static_cast<size_t>(idx0.size()) >= sizeof(ushortIdx0) )
The "why" part of the question is a duplicate, but the type mismatch is still a valid problem to solve. For comparisons of the kind you're doing, it'd probably be useful to factor them out, as they have a certain reusable meaning:
template <typename T> bool fitsIn(const QByteArray &a) {
    return static_cast<int>(sizeof(T)) <= a.size();
}

template <typename T> bool fitsIn(T, const QByteArray &a) {
    return fitsIn<T>(a);
}
if (fitsIn(ushortIdx0, idx0)) ...
Hopefully you'll have just a few kinds of such comparisons, and it makes most sense to DRY (don't repeat yourself): instead of a copy-paste of casts, use functions dedicated to the task, functions that also express the intent of the original comparison. It then becomes easy to centralize handling of any corner cases you might wish to handle, e.g. when sizeof(T) > INT_MAX.
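As a rough sketch of what centralizing that corner case could look like (the INT_MAX guard is illustrative, not part of the original suggestion):

#include <climits>
#include <cstddef>
#include <QByteArray>

template <typename T> bool fitsIn(const QByteArray &a) {
    // A type wider than INT_MAX can never fit in a container whose size()
    // is an int, and casting its size to int would overflow, so bail out early.
    if (sizeof(T) > static_cast<std::size_t>(INT_MAX))
        return false;
    return static_cast<int>(sizeof(T)) <= a.size();
}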
Another approach would be to define a new type to wrap size_t and adapt it to the types you need to use it with:
class size_of {
    size_t val;

    template <typename T>
    static typename std::enable_if<std::is_signed<T>::value, size_t>::type fromSigned(T sVal) {
        return (sVal > 0) ? static_cast<size_t>(sVal) : 0;
    }
public:
    template <typename T, typename U = typename std::enable_if<std::is_scalar<T>::value>::type>
    size_of(const T&) : val(sizeof(T)) {}
    size_of(const QByteArray &a) : val(fromSigned(a.size())) {}
    ...
    bool operator>=(size_of o) const { return val >= o.val; }
};
if (size_of(idx0) >= size_of(ushortIdx0)) ...
This would conceptually extend sizeof and specialize it for comparison(s) and nothing else.
Related
I have the following code:
#include<iostream>
#include<string>
int main()
{
std::string s = "458";
std::cout <<s.size()-4;
}
When I run this I get 42944935 or something like this. But when I run it with the following modification:
int main()
{
std::string s = "458";
int i = s.size();
std::cout << i -4;
}
I get -1, which I would have expected from the first code. Can someone explain what is happening here?
For historical reasons the return type of std::string::size() is size_t, which is an unsigned type sufficient for the largest possible array size.
You can work around that by defining a number of general size/length-functions, like
using Size = ptrdiff_t; // signed type
template< class Collection >
constexpr auto n_items( Collection const& c )
-> Size
{ return c.size(); }
// Raw array. Using size_t template param for g++ compatibility.
template< class Item, size_t n >
constexpr auto n_items( Item (&)[n] )
-> Size
{ return n; }
Here I used the name n_items because C++17 will define a general size function that, unfortunately, returns size_t (and conflates a number of notions of size, also unfortunate). One doesn't want a name conflict there.
Where you don't have such functions available an alternative is to express a size as the difference of std::end and std::begin, e.g. end(s) - begin(s). The difference type for raw pointers is ptrdiff_t (which is signed), and the default difference type for iterators like you get from std::string::begin(), is also ptrdiff_t, from std::iterator_traits.
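A small sketch of that fallback; the iterator difference is signed, so the subtraction behaves as expected:

#include <iostream>
#include <iterator>
#include <string>

int main()
{
    std::string s = "458";
    // end(s) - begin(s) has type ptrdiff_t (signed), so n - 4 is simply -1.
    auto n = std::end(s) - std::begin(s);
    std::cout << n - 4 << "\n";
}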
That happens because the return value of the size() function is unsigned.
When you subtract 4 from the returned value, the result would be negative, but since the arithmetic is unsigned it wraps around to a very large positive value. You need to tell cout to interpret the value as a signed value (for example, cast it like this: std::cout << int(s.size() - 4); or, as you have done, int i = s.size();), and then you'll get what you would expect. The reason behind that big integer is that if you interpret the binary representation of a negative two's-complement value as a positive value, you get a very big integer.
To learn more about two's complement binary arithmetic you can refer to this link.
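A tiny demonstration of the difference (the exact large value printed depends on the width of size_t on your platform):

#include <iostream>
#include <string>

int main()
{
    std::string s = "458";
    std::cout << s.size() - 4 << "\n";                   // unsigned arithmetic wraps to a huge value
    std::cout << static_cast<int>(s.size()) - 4 << "\n"; // converting to signed first prints -1
}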
It is possible to use itk::NumericTraits to get 0 and 1 of some type. Thus we can see this kind of code in the wild:
const PixelType ZERO = itk::NumericTraits<PixelType>::Zero;
const PixelType ONE = itk::NumericTraits<PixelType>::One;
This feels heavy and hard to read. As a programmer, I would prefer a more pragmatic version like:
const PixelType ZERO = 0;
const PixelType ONE = 1;
But is it entirely equivalent? I think the cast is done at compile time, so both versions should be identical in terms of speed. If that's the case, why would anyone want to use itk::NumericTraits to get 0 and 1? There must be an advantage I'm not seeing.
Traits are typically used (and useful) in the context of generic programming; they are heavily used in the STL.
Let's say your NumericTraits looks something like this:
template <typename PixelT>
struct NumericTraits {
    static const int ZERO = 0;
    static const int ONE = 1;
};
In addition to this, you can constrain your template instantiation to a particular kind of type, using enable_if et al.
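As an illustrative sketch of such a constraint (using a static_assert here rather than enable_if, so that the specialization shown below still works unchanged; the exact predicate is an assumption):

#include <type_traits>

template <typename PixelT>
struct NumericTraits {
    // Constrain instantiation to arithmetic pixel types (illustrative choice).
    static_assert(std::is_arithmetic<PixelT>::value,
                  "NumericTraits requires an arithmetic pixel type");
    static const int ZERO = 0;
    static const int ONE = 1;
};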
Now, suppose there is a particular pixel type that is special; how would you define ZERO and ONE for it? Just specialize NumericTraits:
template <>
struct NumericTraits<SpecialPixel> {
    static const int ZERO = 10;
    static const int ONE = 20;
};
Got the idea and the usefulness? Another benefit of this is converting a value to a type and then using it for tag dispatching:
void func(int some_val, std::true_type) {....}
void func(int some_val, std::false_type) {.....}
And call it like:
func(42, typename std::conditional<NumericTraits<PixelType>::ONE == 1, std::true_type, std::false_type>::type());
Which overload to call is decided at compile time here, relieving you from doing if/else checks and thereby probably improving performance :)
Disclaimer: this is a school assignment, but the problem is still interesting, I hope!
I have implemented a custom class called Vector<bool>, which stores the bool entries as bits in an array of numbers.
Everything has gone fine except for implementing this:
bool& operator[](std::size_t index) {
    validate_bounds(index);
    ???
}
The const implementation is quite straightforward, just reading out the value. Here, however, I can't really understand what to do, and since the course is a specialization course on C++, I'm guessing I should do some typedef-ing or something. The data is represented by an array of type unsigned int and should be dynamic (e.g. push_back(bool value) should be implemented).
I solved this by implementing a proxy class:
class BoolVectorProxy {
public:
    explicit BoolVectorProxy(unsigned int& reference, unsigned char index) {
        this->reference = &reference;
        this->index = index;
    }

    void operator=(const bool v) {
        if (v) *reference |= 1 << index;
        else   *reference &= ~(1 << index);
    }

    operator bool() const {
        return (*reference >> index) & 1;
    }

private:
    unsigned int* reference;
    unsigned char index;
};
And inside the main class:
BoolVectorProxy operator[](std::size_t index) {
    validate_bounds(index);
    return BoolVectorProxy(array[index / BLOCK_CAPACITY], index % BLOCK_CAPACITY);
}
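For the read-only path mentioned in the question, a possible const overload might look like this (a sketch assuming the same array and BLOCK_CAPACITY members):

// Hypothetical const counterpart: no proxy needed, just read the bit out.
bool operator[](std::size_t index) const {
    validate_bounds(index);
    return (array[index / BLOCK_CAPACITY] >> (index % BLOCK_CAPACITY)) & 1;
}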
I also use Catch as a testing library; the code passes this test:
TEST_CASE("access and assignment with brackets", "[Vector]") {
Vector<bool> a(10);
a[0] = true;
a[0] = false;
REQUIRE(!a[0]);
a[1] = true;
REQUIRE(a[1]);
const Vector<bool> &b = a;
REQUIRE(!b[0]);
REQUIRE(b[1]);
a[0] = true;
REQUIRE(a[0]);
REQUIRE(b[0]);
REQUIRE(b.size() == 10);
REQUIRE_THROWS(a[-1]);
REQUIRE_THROWS(a[10]);
REQUIRE_THROWS(b[-1]);
REQUIRE_THROWS(b[10]);
}
If anyone finds any issues or improvements that can be made, please comment, thanks!
Basically implementing operator[] is the same as implementing const operator[] as you might expect, it's just that one is writable (lvalue) and the other is read only (rvalue).
I think you've got an understanding of the problem: you can convert an unsigned int into a bool using bitwise operations, and you can also say "if the nth bool is modified in X, do a bitwise operation with X and it's done!". But this operator means: I want an lvalue of the bool so I can modify it whenever I want and have it affect the associated integer. In other words, you want a reference to a bool, or in your case a reference to a single bit, so you can modify that bit on the fly. Unfortunately you can't take a reference to a single bit; the smallest thing you can reference is a whole byte (a char), so you would have to take a chunk of at least 7 other booleans with you. That's not what you want.
That being said, I understand it may be required by your assignment, but packing bools into unsigned ints looks like a needless C-style optimization to me. You would be better off with a single C-style array of bools and handling the memory manually, because that is almost what you are doing anyway. Plus, with that method you would actually be able to reference a single boolean (and modify it) without touching the others. Is it mandatory to use an array of unsigned int for this assignment?
My actual question: is it really possible to compare the values contained in two void pointers, when you know that these values are of the same type, for example int?
void compVoids(void *firstVal, void *secondVal) {
    if (firstVal < secondVal) {
        cout << "This will not make any sense as this will compare addresses, not values" << endl;
    }
}
I actually need to compare two void pointer values, while outside the function it is known that the type is int. I do not want to use an int comparison inside the function.
So this will not work for me as well: if (*(int*)firstVal > *(int*)secondVal)
Any suggestions?
Thank you very much for help!
In order to compare the data pointed to by a void*, you must know what the type is. If you know what the type is, there is no need for a void*. If you want to write a function that can be used for multiple types, you use templates:
template<typename T>
bool compare(const T& firstVal, const T& secondVal)
{
    if (firstVal < secondVal)
    {
        // do something
    }
    return something;
}
To illustrate why attempting to compare void pointers blind is not feasible:
bool compare(void* firstVal, void* secondVal)
{
    if (*firstVal < *secondVal) // ERROR: cannot dereference a void*
    {
        // do something
    }
    return something;
}
So, you need to know the size to compare, which means you either need to pass in a std::size_t parameter, or you need to know the type (and really, in order to pass in the std::size_t parameter, you have to know the type):
bool compare(void* firstVal, void* secondVal, std::size_t size)
{
    if (0 > memcmp(firstVal, secondVal, size))
    {
        // do something
    }
    return something;
}
int a = 5;
int b = 6;
bool test = compare(&a, &b, sizeof(int)); // you know the type!
This was required in C as templates did not exist. C++ has templates, which make this type of function declaration unnecessary and inferior (templates allow for enforcement of type safety - void pointers do not, as I'll show below).
The problem comes in when you do something (silly) like this:
int a = 5;
short b = 6;
bool test = compare(&a, &b, sizeof(int)); // DOH! this will try to compare memory outside the bounds of the size of b
bool test = compare(&a, &b, sizeof(short)); // DOH! This will compare the first part of a with b. Endianess will be an issue.
As you can see, by doing this, you lose all type safety and have a whole host of other issues you have to deal with.
It is definitely possible, but since they are void pointers you must specify how much data is to be compared and how.
The memcmp function may be what you are looking for. It takes two void pointers and an argument for the number of bytes to be compared and returns a comparison. Some comparisons, however, are not contingent upon all of the data being equal. For example: comparing the direction of two vectors ignoring their length.
This question doesn't have a definite answer unless you specify how you want to compare the data.
You need to dereference them and cast, with
if (*(int*) firstVal < *(int*) secondVal)
Why do you not want to use the int comparison inside the function, if you know that the two values will be int and that you want to compare the int values that they're pointing to?
You mentioned a comparison function for comparing data on inserts; for a comparison function, I recommend this:
int
compareIntValues (void *first, void *second)
{
    return (*(int*) first - *(int*) second);
}
It follows the convention of negative if the first is smaller, 0 if they're equal, positive if the first is larger. Simply call this function when you want to compare the int data.
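A small usage sketch:

int a = 3, b = 7;
// Negative result: the first value is smaller.
// Note: the subtraction trick can overflow for operands of very different magnitude,
// so a comparison-based version is safer in general.
if (compareIntValues(&a, &b) < 0) {
    // a is smaller than b
}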
Yes. In fact, your code is correct if the type is unsigned int: casting int values to void pointers (storing the value in the pointer itself) is often done, even if not recommended.
Also you could cast the pointers but you have to cast them directly to the int type:
if ((int)firstVal < (int)secondVal)
Note: no * at all.
You may have address-model issues doing this, though, if you build for both 32-bit and 64-bit. Check the intptr_t type, which you could use to avoid that:
if ((intptr_t)firstVal < (intptr_t)secondVal)
I'm trying to create a compile-time bit mask using metaprogramming techniques; my idea is to create something like this:
unsigned int Mask2 = Mask<2>(); // value = 0x03 = b00000000000000000000000000000011
unsigned int Mask3 = Mask<3>(); // value = 0x07 = b00000000000000000000000000000111
unsigned int Mask7 = Mask<7>(); // value = 0x7F = b00000000000000000000000001111111
The code that I'm trying is this:
template <const unsigned int N> const unsigned int Mask()
{
    if (N <= 1)
    {
        return 1;
    }
    else
    {
        return ((1 << N) | Mask<N - 1>());
    }
}
But it results in tons of pairs of warnings:
warning C4554: '<<' : check operator precedence for possible error
warning C4293: '<<' : shift count negative or too big
And in the end, the compile error:
error C1202: recursive type or function dependency context too complex.
So I deduce that the recursion never ends and the compiler falls into an infinite loop, but I don't understand WHY.
As has already been pointed out, you're depending on a runtime check to stop a compile time recursion, which can't work. More importantly, perhaps, for what you want to do, is that you're defining a function, which has no value until you call it. So even after you stop the recursion with a specialization, you still have a nested sequence of functions, which will be called at runtime.
If you want full compile time evaluation, you must define a static data member of a class template, since that's the only way a compile time constant can appear in a template. Something like:
template <unsigned int N>
struct Mask
{
    static unsigned int const value = (1 << (N - 1)) | Mask<N - 1>::value;
};

template <>
struct Mask<0>
{
    static unsigned int const value = 0;
};
(I've also corrected the numerical values you got wrong.)
Of course, you don't need anything this complicated. The following should do the trick:
template <unsigned int N>
struct Mask
{
    static unsigned int const value = (1 << N) - 1;
};
template <>
struct Mask<0>
{
    static unsigned int const value = 0;
};
(With this formula the specialization for 0 isn't strictly necessary, since (1 << 0) - 1 already evaluates to 0, but it keeps the two versions symmetrical.)
Finally, of course: to access the value, you need to write something like Mask<3>::value. (You might want to wrap this in a macro.)
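A quick usage sketch showing that the value really is a compile-time constant (the static_assert assumes C++11):

static_assert(Mask<3>::value == 0x07, "three low bits set");
unsigned int const m = Mask<7>::value;   // 0x7F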
It doesn't need to be recursive. This should work just fine:
template <const unsigned int N> const unsigned int Mask()
{
    return ((1 << N) - 1);
}
It doesn't even need to be a template really. An (inlined) function is ok.
Note that if you want to support any value of N, specifically N >= sizeof(unsigned int) * CHAR_BIT, you probably want to treat those as a special case.
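A hedged sketch of handling that special case, assuming that a request for a full-width (or wider) mask should simply yield all bits set:

#include <climits>

template <unsigned int N> unsigned int Mask()
{
    // Width of unsigned int in bits.
    const unsigned int width = sizeof(unsigned int) * CHAR_BIT;
    // Shifting by >= the width of the type is undefined behaviour,
    // so treat those values explicitly.
    if (N >= width)
        return ~0u;
    return (1u << N) - 1u;
}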
A template is created at compile time, but you are relying on run time behavior to stop the recursion.
For example, if you instantiate Mask<2>, it is going to use Mask<1>, which is going to use Mask<0>, which is going to use Mask<-1> (really Mask<4294967295>, since N is unsigned), etc.
You have a runtime check for N being <= 1, but this doesn't help at compile time. It still creates an infinite sequence of function instantiations.
To stop the template instantiation recursion you need to introduce one explicit specialization:
template <> const unsigned int Mask<0>()
{
    return 1;
}
Your recursion never ends because the compiler tries to generate the template implementation for both branches of the if. So when it generates Mask<0>, it also generates Mask<0xffffffff>, and so on.
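As an aside, in C++17 this kind of branch can be written with if constexpr, which discards the untaken branch at instantiation time and so stops the recursion. A sketch of a corrected variant of the question's recursion (not part of the original answer):

// C++17 sketch: the false branch is not instantiated, so the recursion ends at 0,
// and Mask<N>() yields a value with the low N bits set.
template <unsigned int N> constexpr unsigned int Mask()
{
    if constexpr (N == 0)
        return 0u;
    else
        return (1u << (N - 1)) | Mask<N - 1>();
}

static_assert(Mask<3>() == 0x07);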
C++11 -- no recursion or templates:
constexpr unsigned mask(unsigned N) { return ~(~0u << N); }
So far the answers only addressed the second error (C1202), but you asked more than that.
Warning C4554 is caused by a Microsoft compiler bug involving template parameters and the << operator. So, (1 << N) generates a warning. If N were an ordinary parameter, there would be no warning of course.
The very simple workaround is to use (1 << (N)) instead of (1 << N), and C4554 goes away!