C++: Division by vector.size() gives strange results [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
int divided by unsigned int causing rollover
Hi I am doing the following:
#include <iostream>
#include <vector>
using std::cout;
using std::endl;

struct coord {
    int col;
};

int main(int argc, char* argv[]) {
    coord c;
    c.col = 0;
    std::vector<coord> v;
    for (int i = 0; i < 5; i++) {
        v.push_back(coord());
    }
    c.col += -13;
    cout << " c.col is " << c.col << endl;
    cout << " v size is " << v.size() << endl;
    c.col /= v.size();  // the division in question
    cout << c.col << endl;
}
and I get the following output:
c.col is -13
v size is 5
858993456
However, if I change the division line to c.col /= ((int)v.size()); I get the expected output:
c.col is -13
v size is 5
-2
Why is this?

This is a consequence of v.size() being unsigned.
See int divided by unsigned int causing rollover

The problem is that vector< ... >::size() returns size_t, which is a typedef for an unsigned integer type. The problem arises when you divide a signed integer by an unsigned one.

std::vector::size returns a size_t, which is an unsigned integer type, commonly 32 or 64 bits wide depending on the platform. When you perform an arithmetic operation with an int and an unsigned type of equal or greater rank, the int operand is converted to the unsigned type before the operation. Assuming a 32-bit unsigned type, -13 is converted to 4294967283 (FFFFFFF3 in hexadecimal), which is close to the maximum value 4294967295 (FFFFFFFF). That value is then divided by 5, giving 858993456.
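The conversion described above can be reproduced with fixed-width types. This is only a sketch: the helper names divide_as_unsigned and divide_as_signed are hypothetical, and it assumes int is 32 bits wide so that std::uint32_t behaves like unsigned int.

```cpp
#include <cstdint>

// Hypothetical helper: mimics what `int /= unsigned` does, with the
// implicit conversion written out explicitly.
std::int32_t divide_as_unsigned(std::int32_t value, std::uint32_t divisor)
{
    // The signed operand is converted to unsigned, wrapping modulo 2^32.
    std::uint32_t converted = static_cast<std::uint32_t>(value);
    // The division is then carried out on unsigned values.
    std::uint32_t quotient = converted / divisor;
    // The quotient is converted back to a signed value on assignment.
    return static_cast<std::int32_t>(quotient);
}

// Hypothetical helper: converts the divisor instead, so the division is signed.
// Assumes the divisor fits in std::int32_t.
std::int32_t divide_as_signed(std::int32_t value, std::uint32_t divisor)
{
    return value / static_cast<std::int32_t>(divisor);
}
```

With these definitions, divide_as_unsigned(-13, 5) gives 858993456 and divide_as_signed(-13, 5) gives -2, matching the two outputs in the question.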

As stated, the reason is that a signed / unsigned division is performed by first converting the signed value to unsigned.
So, you need to prevent this by manually converting the unsigned value to a signed type.
There's a risk that v.size() could be too big for an int. But since the dividend does fit in an int, the result of the division is fairly boring when the divisor is bigger than that. So assuming 2's complement and no padding bits:
if (v.size() <= INT_MAX) {
    c.col /= int(v.size());
} else if (c.col == INT_MIN && v.size() - 1 == INT_MAX) {
    c.col = -1;
} else {
    c.col = (-1 / 2);
}
In C++03, it's implementation-defined whether a negative value divided by a larger positive value is 0 or -1, hence the funny (-1 / 2). In C++11 you can just use 0.
To cover other representations than 2's complement you need to deal with the special cases differently.
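The branches above could be wrapped in a small function. This is a sketch rather than part of the original answer: the name divide_by_size is hypothetical, and it assumes C++11 or later (division truncates toward zero), two's complement, and a non-zero size.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: divides a signed int by a container size without
// letting the int be converted to an unsigned type.
// Assumes n != 0, C++11 truncation toward zero, and two's complement.
int divide_by_size(int value, std::size_t n)
{
    if (n <= static_cast<std::size_t>(INT_MAX)) {
        // The size fits in an int, so ordinary signed division applies.
        return value / static_cast<int>(n);
    }
    // From here on n is larger than INT_MAX.
    if (value == INT_MIN && n - 1 == static_cast<std::size_t>(INT_MAX)) {
        // INT_MIN divided by INT_MAX + 1 is exactly -1.
        return -1;
    }
    // The divisor exceeds the magnitude of the dividend, so the quotient is 0.
    return 0;
}
```

A call such as c.col = divide_by_size(c.col, v.size()); would then replace the division line in the question.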

Related

Unexpected boolean result with vector size and -1 [duplicate]

Please take a look at this simple program:
#include <iostream>
#include <vector>
using namespace std;
int main() {
    vector<int> a;
    std::cout << "vector size " << a.size() << std::endl;
    int b = -1;
    if (b < a.size())
        std::cout << "Less";
    else
        std::cout << "Greater";
    return 0;
}
I'm confused by the fact that it outputs "Greater" even though -1 is obviously less than 0. I understand that the size method returns an unsigned value, but the comparison is still applied to -1 and 0. So what's going on? Can anyone explain this?

Because the size of a vector is an unsigned integral type. You are comparing an unsigned type with a signed one, and the negative signed integer is being converted to unsigned (the conversion wraps modulo 2^N, where N is the width of the unsigned type). That corresponds to a large unsigned value.
This code sample shows the same behaviour that you are seeing:
#include <iostream>
int main()
{
    std::cout << std::boolalpha;
    unsigned int a = 0;
    int b = -1;
    std::cout << (b < a) << "\n";
}
output:
false

The signature for vector::size() is:
size_type size() const noexcept;
size_type is an unsigned integral type. When comparing an unsigned and a signed integer, the signed one is converted to unsigned. Here, -1 is negative so it wraps around, yielding the maximal representable value of size_type. Hence it will compare as greater than zero.

Interpreted as unsigned, -1 is a higher value than zero: in two's complement the high bit is set to indicate a negative number, but an unsigned type uses that bit to extend the range of representable values, so it no longer acts as a sign bit. The comparison is done as (unsigned int)-1 < 0, which is false.
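One way to get the mathematically expected answer is to handle the negative case before any conversion happens. The following is a minimal sketch; the helper name less_than_size is hypothetical and not part of the standard library.

```cpp
#include <cstddef>

// Hypothetical helper: true when the signed value i is mathematically
// less than the unsigned size n.
bool less_than_size(int i, std::size_t n)
{
    // A negative value is smaller than any size. Otherwise i is
    // non-negative, so converting it to std::size_t preserves its value.
    return i < 0 || static_cast<std::size_t>(i) < n;
}
```

In the program from the question, if (less_than_size(b, a.size())) would take the "Less" branch. C++20 also provides std::cmp_less in <utility> for this kind of mixed comparison.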

Sum signed 32-bit int with unsigned 64bit int

In my application, I receive two signed 32-bit ints and I have to store them. I have to create a sort of counter; I don't know when it will be reset, but I'll receive big values, and frequently. Because of that, I decided to store these values in two unsigned 64-bit ints.
The following could be a simple version of the counter.
struct Counter
{
    unsigned int elementNr;
    unsigned __int64 totalLen1;
    unsigned __int64 totalLen2;
    void UpdateCounter(int len1, int len2)
    {
        if(len1 > 0 && len2 > 0)
        {
            ++elementNr;
            totalLen1 += len1;
            totalLen2 += len2;
        }
    }
};
I know that if a smaller type is cast to a bigger one (e.g. int to long) there should be no issues. However, going from a 32-bit representation to a 64-bit representation and from signed to unsigned at the same time is something new for me.
Reading around, I understood that len1 should be expanded from 32 bits to 64 bits with sign extension applied. Because unsigned int and signed int have the same rank (Section 4.13), the latter should be converted.
If len1 stores a negative value, converting from signed to unsigned will give a wrong value, which is why I check that the values are positive at the beginning of the function. For positive values, however, there should be no issues, I think.
For clarity I could rewrite UpdateCounter(int len1, int len2) like this:
void UpdateCounter(int len1, int len2)
{
    if(len1 > 0 && len2 > 0)
    {
        ++elementNr;
        __int64 tmp = len1;
        totalLen1 += static_cast<unsigned __int64>(tmp);
        tmp = len2;
        totalLen2 += static_cast<unsigned __int64>(tmp);
    }
}
Might there be some side effects that I have not considered?
Is there a better and safer way to do that?

A little background, just for reference: binary operators such as arithmetic addition work on operands of the same type (the specific CPU instruction they are translated to depends on the number representation, which must be the same for both operands).
When you write something like this (using fixed width integer types to be explicit):
int32_t a = <some value>;
uint64_t sum = 0;
sum += a;
As you already know, this involves an implicit conversion, more specifically an integral conversion chosen according to integer conversion rank (part of the usual arithmetic conversions).
So the expression sum += a; is equivalent to sum += static_cast<uint64_t>(a);, that is, a is converted because it has the lesser rank.
Let's see what happens in this example:
int32_t a = 60;
uint64_t sum = 100;
sum += static_cast<uint64_t>(a);
std::cout << "a=" << static_cast<uint64_t>(a) << " sum=" << sum << '\n';
The output is:
a=60 sum=160
So all is OK, as expected. Let's see what happens when adding a negative number:
int32_t a = -60;
uint64_t sum = 100;
sum += static_cast<uint64_t>(a);
std::cout << "a=" << static_cast<uint64_t>(a) << " sum=" << sum << '\n';
The output is:
a=18446744073709551556 sum=40
The result is 40, as expected: converting a negative value to an unsigned type is defined modulo 2^64, which matches the two's complement bit pattern, and unsigned integer overflow wraps around rather than being undefined behaviour. All is OK, of course, as long as you ensure that the sum does not become negative.
Coming back to your question, you won't have any surprises if you always add positive numbers, or at least ensure that sum will never be negative... until you reach the maximum representable value std::numeric_limits<uint64_t>::max() (2^64-1 = 18446744073709551615 ~ 1.8E19).
If you continue to add numbers indefinitely, sooner or later you'll reach that limit (this also applies to your counter elementNr).
You'll overflow the 64-bit unsigned integer by adding 2^31-1 (2147483647) every millisecond for approximately three months, so in this case it may be advisable to check:
#include <limits>
//...
void UpdateCounter(const int32_t len1, const int32_t len2)
{
    if( len1>0 )
    {
        if( static_cast<decltype(totalLen1)>(len1) <= std::numeric_limits<decltype(totalLen1)>::max()-totalLen1 )
        {
            totalLen1 += len1;
        }
        else
        {// Would overflow!!
            // Do something
        }
    }
}
When I have to accumulate numbers and I don't have particular requirements about accuracy, I often use double, because the maximum representable value is incredibly high (std::numeric_limits<double>::max() is about 1.79769E+308): to reach overflow I would need to add 2^32-1 = 4294967295 every picosecond for about 1E+279 years.
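Putting the pieces of this answer together, a complete counter with the overflow check applied to both totals might look like the sketch below. It is an assumption-laden rewrite, not the original poster's code: it uses the fixed-width types from <cstdint> instead of __int64, the method name Update is hypothetical, and returning false on rejection is just one possible policy.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical rewrite of the counter from the question.
struct Counter
{
    std::uint32_t elementNr = 0;
    std::uint64_t totalLen1 = 0;
    std::uint64_t totalLen2 = 0;

    // Returns false and leaves the counter unchanged if an input is not
    // positive or if any member would overflow.
    bool Update(std::int32_t len1, std::int32_t len2)
    {
        if (len1 <= 0 || len2 <= 0) return false;

        // Both inputs are positive here, so the casts preserve their values.
        const std::uint64_t add1 = static_cast<std::uint64_t>(len1);
        const std::uint64_t add2 = static_cast<std::uint64_t>(len2);
        const std::uint64_t max64 = std::numeric_limits<std::uint64_t>::max();

        // Compare against the remaining headroom instead of adding first.
        if (add1 > max64 - totalLen1 || add2 > max64 - totalLen2) return false;
        if (elementNr == std::numeric_limits<std::uint32_t>::max()) return false;

        ++elementNr;
        totalLen1 += add1;
        totalLen2 += add2;
        return true;
    }
};
```

The check is written as add1 > max64 - totalLen1 so that the test itself cannot wrap around.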

Why does this stl function call result in an incorrect boolean evaluation? [duplicate]

This question already has answers here:
c++ vector size. why -1 is greater than zero
(3 answers)
Closed 4 years ago.
I was humbly coding away when I ran into a strange situation involving checking the size of a vector. An isolated version of the issue is listed below:
#include <iostream>
#include <string>
#include <vector>
int main() {
    std::vector<std::string> cw = {"org","app","tag"};
    int j = -1;
    int len = cw.size();
    bool a = j>=cw.size();
    bool b = j>=len;
    std::cout<<"cw.size(): "<<cw.size()<<std::endl;
    std::cout<<"len: "<<len<<std::endl;
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    return 0;
}
Compiling with both g++ and clang++ (with the -std=c++11 flag) and running results in the following output:
cw.size(): 3
len: 3
1
0
Why does j >= cw.size() evaluate to true? A little experimenting shows that any negative value for j results in this weird discrepancy.

The pitfall here is the integral conversion that applies when you compare a signed integral value with an unsigned one. In such a case, the signed value is converted to unsigned, and if the value val was negative, the result is UINT_MAX + 1 + val (for a 32-bit unsigned int, -1 becomes 4294967295). So -1 is converted to a very large number before the comparison.
However, when you assign an unsigned value to a signed one, as in int len = cw.size(), the unsigned value becomes a signed one, so (unsigned)10 becomes (signed)10, for example. A comparison between two signed ints does not convert either operand and works as expected.
You can simulate this rather easily:
#include <iostream>
using std::cout;
using std::endl;

int main() {
    int j = -1;
    bool a = j >= (unsigned int)10; // signed >= unsigned; converts j to unsigned int, yielding 4294967295
    bool b = j >= (signed int)10; // signed >= signed; does not convert j
    cout << a << endl << b << endl;
    unsigned int j_unsigned = j;
    cout << "unsigned_j: " << j_unsigned << endl;
}
Output:
1
0
unsigned_j: 4294967295
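The conversion rule can be checked against arithmetic done in a wider signed type, where nothing wraps. This is only an illustrative sketch: both helper names are hypothetical, and it assumes long long is wider than int.

```cpp
#include <climits>

// Hypothetical helper: the value an int is expected to have after
// conversion to unsigned int, computed in long long so that no
// wraparound is involved. Assumes long long is wider than int.
long long expected_unsigned_value(int val)
{
    if (val >= 0) return val;  // non-negative values are unchanged
    return static_cast<long long>(UINT_MAX) + 1 + val;
}

// Hypothetical helper: the conversion as the language performs it.
unsigned int as_unsigned(int val)
{
    return static_cast<unsigned int>(val);
}
```

For any int j, as_unsigned(j) and expected_unsigned_value(j) hold the same number. For j = -1 that number is UINT_MAX, which is why j >= 10u is true.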

Integer promotion unsigned in c++

#include <iostream>

int main() {
    unsigned i = 5;
    int j = -10;
    double d = i + j;
    long l = i + j;
    int k = i + j;
    std::cout << d << "\n"; //4.29497e+09
    std::cout << l << "\n"; //4294967291
    std::cout << k << "\n"; //-5
    std::cout << i + j << "\n"; //4294967291
}
I believe the signed int is converted to unsigned before the arithmetic operator is applied.
When -10 is converted to unsigned, unsigned integer underflow (is this the correct term?) occurs, and after the addition it prints 4294967291.
Why does this not happen in the case of int k, which prints -5?

The process of doing the arithmetic operator involves a conversion to make the two values have the same type. The name for this process is finding the common type, and for the case of int and unsigned int, the conversions are called usual arithmetic conversions. The term promotion is not used in this particular case.
In the case of i + j, the int is converted to unsigned int, by adding UINT_MAX + 1 to it. So the result of i + j is UINT_MAX - 4, which on your system is 4294967291.
You then store this value in various data types; the only output that needs further explanation is k. The value UINT_MAX - 4 cannot fit in int. This is an out-of-range conversion, and the resulting value is implementation-defined (since C++20 it is defined to wrap modulo 2^N). On your system it apparently assigns the int value which has the same representation as the unsigned int value.
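The common type can be checked at compile time, and the surprise avoided by doing the addition in a wider signed type. The sketch below uses hypothetical helper names (mixed_sum, signed_sum) and assumes long long is wider than unsigned int.

```cpp
#include <type_traits>

// Compile-time check: adding an unsigned int and an int yields unsigned int.
static_assert(std::is_same<decltype(0u + 0), unsigned int>::value,
              "int + unsigned int has type unsigned int");

// Hypothetical helper: the sum as the usual arithmetic conversions
// compute it. j is converted to unsigned int, so the result wraps.
unsigned int mixed_sum(unsigned int i, int j)
{
    return i + j;
}

// Hypothetical helper: widen first, then add, so the result is the
// mathematical sum. Assumes long long is wider than unsigned int.
long long signed_sum(unsigned int i, int j)
{
    return static_cast<long long>(i) + j;
}
```

With i = 5 and j = -10, mixed_sum returns UINT_MAX - 4 while signed_sum returns -5.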

j will be converted to unsigned int before the addition, and this happens in every one of your i + j expressions.
In the case of int k = i + j: on your implementation and mine, i + j produces 4294967291. Since 4294967291 is larger than std::numeric_limits<int>::max(), the result of the conversion is implementation-defined. Why not try a quick experiment and assign 4294967291 to an int?
#include <iostream>
int main(){
    int k = 4294967291;
    std::cout << k << std::endl;
}
Produces:
-5