While debugging an issue in our codebase, I stumbled upon a problem that is quite similar to the sample problem below:
#include <iostream>
#include <vector>
int main() {
std::vector<int> v;
int MAX = 100;
int result = (v.size() - 1) / MAX;
std::cout << result << std::endl;
return 0;
}
I would expect the output of the program should be 0 but it's -171798692.
Can someone help me understand this?
v.size() returns an unsigned value of type std::vector<int>::size_type. This is typically size_t.
Arithmetic on unsigned values wraps around, so (v.size() - 1) will be 0xffffffffffffffff (18446744073709551615) if your size_t is 64 bits wide.
Dividing this value by 100 yields 0x28F5C28F5C28F5C (184467440737095516).
Then, this result is converted to int. If your int is 32-bit long and the conversion is done by simple truncation, the value will be 0xF5C28F5C.
This value represents -171798692, which you got, in two's complement.
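The steps above can be reproduced one at a time. The following is a minimal sketch that assumes a 64-bit size_t and a 32-bit int, modelled here with fixed-width types; the helper names are made up for illustration and are not part of the question.

```cpp
#include <cassert>
#include <cstdint>

// Step 1: unsigned subtraction wraps modulo 2^64, so 0 - 1 becomes 2^64 - 1.
std::uint64_t size_minus_one(std::uint64_t size) {
    return size - 1;
}

// Step 3: keep only the low 32 bits and read them as a two's complement value.
std::int32_t truncate_to_int32(std::uint64_t value) {
    const std::uint32_t low = static_cast<std::uint32_t>(value);  // value mod 2^32
    if (low <= 0x7FFFFFFFu) {
        return static_cast<std::int32_t>(low);
    }
    // Top bit set: subtract 2^32 to get the negative two's complement value.
    return static_cast<std::int32_t>(static_cast<std::int64_t>(low) - 4294967296LL);
}

// The whole expression from the question: (v.size() - 1) / MAX stored in an int.
std::int32_t size_expression(std::uint64_t size, std::uint64_t max) {
    assert(max != 0);  // division by zero is not handled in this sketch
    const std::uint64_t quotient = size_minus_one(size) / max;  // step 2
    return truncate_to_int32(quotient);
}
```

Under these assumptions, size_expression(0, 100) yields -171798692, the value reported in the question, while a non-empty size such as 101 gives the expected 1.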
The problem is v.size() - 1.
The size() function returns an unsigned value. When you subtract 1 from unsigned 0, you don't get -1 but rather a very large value.
Then you convert this large unsigned value back into a signed integer type, which could turn it negative.
Not only that, but on a 64-bit system it's likely that size() returns a 64-bit value while int stays 32 bits, making you lose half the data.
Vector v is empty, so v.size() is 0. Also v.size() is unsigned, so strange things happen when you subtract from that 0.
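One way to get the 0 the asker expected is to convert the size to a signed type before subtracting. This is only a sketch; the function name is hypothetical, and it assumes the size fits in std::ptrdiff_t.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: evaluates (size - 1) / max in signed arithmetic.
std::ptrdiff_t size_minus_one_over(const std::vector<int>& v, std::ptrdiff_t max) {
    assert(max != 0);  // division by zero is not handled in this sketch
    // Convert to a signed type *before* subtracting, so 0 - 1 is really -1.
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(v.size());
    return (size - 1) / max;  // -1 / 100 truncates toward zero and gives 0
}
```

For an empty vector and max = 100 this returns 0; for a vector of 101 elements it returns 1.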
Related
#include<iostream>
#include<string>
#include<vector>
using namespace std;
int main()
{
std::string qaz{};
vector <size_t> index ;
cout <<"qaz: "<<qaz<<" length: "<<qaz.length()<<"\n";
for (size_t i{0}; i<= ( qaz.length()-2);i++ )
{ cout<<"Entered"<<i<<"\n";
cout<<"Exited"<<i<<"\n";}
return 0;
}
Here qaz is an empty string, so qaz.length() == 0 (so qaz.length()-2 == -2), and i is initialized to 0, so I expected that we would not enter the loop. But on running it I find that it goes into an infinite loop. Why? Please help me with it.
See docs for size_t:
std::size_t is the unsigned integer type of the result of the sizeof operator
(Emphasis mine.)
Furthermore, string::length returns a size_t too (see note 1).
But even if that were not the case, when comparing signed values to unsigned values, the signed value is converted to unsigned before the comparison, as explained in this answer.
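The conversion rule for mixed comparisons can be sketched as follows. The first function spells out explicitly what the implicit conversion does; the second is a hypothetical hand-written alternative, not a standard library function.

```cpp
#include <cassert>

// What happens for `a < b` with int a and unsigned b:
// a is converted to unsigned first, so a negative a becomes a huge value.
bool converted_less(int a, unsigned b) {
    return static_cast<unsigned>(a) < b;
}

// Hypothetical safe variant: a negative value is smaller than any unsigned value.
bool safe_less(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}

// Small self-check showing the difference.
void check_comparisons() {
    assert(!converted_less(-1, 1u));  // -1 turns into UINT_MAX, which is not < 1
    assert(safe_less(-1, 1u));        // mathematically, -1 < 1
    assert(converted_less(0, 1u));    // non-negative values behave as expected
}
```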
(size_t)0 - 2 will underflow, as size_t is unsigned and therefore its minimum value is zero, resulting in a large number which is usually (see note 2) either 2^32-2 or 2^64-2 depending on the processor architecture. Let's go with the latter; then you will get 18,446,744,073,709,551,614 as the result.
Now, looking at the result of 0 <= 18446744073709551614, you can see that zero is clearly less than or equal to 18.4 quintillion, so the loop condition is fulfilled. In fact the loop is not infinite; it will loop exactly 18,446,744,073,709,551,615 times, but it's true you will probably not want to wait for it to finally reach its finishing point.
The solution is to avoid the underflow by comparing i + y <= x instead of i <= x - y (see note 3), i.e. i + 2 <= qaz.length(). You will then have 2 <= 0, which is false.
1: Technically, it returns an std::allocator<char>::size_type but that is defined as std::size_t.
2: To be exact, it is SIZE_MAX - (2 - 1), i.e. SIZE_MAX - 1 (see limits). In terms of numeric value, it could also be 2^16-2 (such as on an ATmega328P microcontroller) or some other value, but on the architectures you get on desktop computers at the current point in time it's most likely one of the two I mentioned. It depends on the width of the std::size_t type. If it's X bits wide, you'd get 2^X-n for (size_t)0 - n for 0<n<2^X. Since C++11 it is however guaranteed that std::size_t is no less than 16 bits wide.
3: However, in the unlikely case that your length is very large, specifically at least the number calculated above with 2^X-2 or larger, this would result in an overflow instead. But in that case your whole logic would be flawed and you'd need a different approach. I think this can't be the case anyway, because std::ssize support means that string lengths would have to have one unused bit to be repurposed as sign bit, but I think this answer went down various rabbit holes far enough already.
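Both the wrap-around and the rearranged condition can be checked with a small sketch like the one below. The function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// The subtraction from the question: wraps when the string is shorter than 2.
std::size_t length_minus_two(const std::string& s) {
    return s.length() - 2;
}

// Counts how often a loop with the rearranged condition i + 2 <= length runs.
std::size_t count_iterations(const std::string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i + 2 <= s.length(); ++i) {
        ++count;
    }
    return count;
}

void check_loop_bounds() {
    assert(length_minus_two("") == SIZE_MAX - 1);  // 0 - 2 wrapped around
    assert(count_iterations("") == 0);             // 2 <= 0 is false right away
    assert(count_iterations("abcd") == 3);         // i = 0, 1, 2
}
```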
length() returns an unsigned value, which cannot be below zero. 0u - 2 wraps around and becomes a very large number.
Use i + 2 <= qaz.length() instead.
The issue is that size_t is unsigned. length() returns the string's size_type, which is unsigned and most likely also size_t. When the string's size is < 2, length() - 2 wraps around to yield a large unsigned value.
Since C++20 there is std::ssize, which returns a signed value. You also have to make i a signed type, so that a condition such as i <= -2 is evaluated in signed arithmetic and the loop runs the correct number of times:
#include<iostream>
#include<string>
#include<vector>
using namespace std;
int main()
{
std::string qaz{};
vector <size_t> index ;
cout <<"qaz: "<<qaz<<" length: "<<qaz.length()<<"\n";
for (int i{0}; i<= ( std::ssize(qaz)-2);i++ )
{
cout<<"Entered"<<i<<"\n";
cout<<"Exited"<<i<<"\n";
}
}
Alternatively stay with unsigneds and use i+2 <= qaz.length().
This code gives the meaningful output
#include <iostream>
int main() {
unsigned int ui = 100;
unsigned int negative_ui = -22u;
std::cout << ui + negative_ui << std::endl;
}
Output:
78
The variable negative_ui stores -22, but is an unsigned int.
My question is why does unsigned int negative_ui = -22u; work.
How can an unsigned int store a negative number? Is it safe to be used or does this yield undefined behaviour?
I use the intel compiler 18.0.3. With the option -Wall no warnings occurred.
Ps. I have read What happens if I assign a negative value to an unsigned variable? and Why unsigned int contained negative number
How can an unsigned int store a negative number?
It doesn't. Instead, it stores a representable number that is congruent with that negative number modulo the number of all representable values. The same is also true with results that are larger than the largest representable value.
Is it safe to be used or does this yield undefined behaviour?
There is no UB. Unsigned arithmetic overflow is well defined.
It is safe to rely on the result. However, it can be brittle. For example, if you add -22u and 100ull, then you get UINT_MAX + 79 (i.e. a large value, assuming unsigned long long is a wider type than unsigned). That value is congruent with 78 modulo UINT_MAX + 1, and it is representable in unsigned long long but not in unsigned.
Note that signed arithmetic overflow is undefined.
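The brittleness described above can be seen by adding the same operands in two different types. This is a sketch under the assumption that unsigned long long is wider than unsigned int; the function names are invented for the example.

```cpp
#include <cassert>
#include <limits>

// Sum computed in unsigned int: the result is reduced modulo UINT_MAX + 1.
unsigned add_as_unsigned(unsigned a, unsigned b) {
    return a + b;
}

// Same operands, but the sum is computed in the wider unsigned long long.
unsigned long long add_as_ull(unsigned long long a, unsigned b) {
    return a + b;  // b is converted to unsigned long long, value unchanged
}

void check_sums() {
    const unsigned long long uint_max = std::numeric_limits<unsigned>::max();
    assert(add_as_unsigned(100u, -22u) == 78u);
    // Assumes unsigned long long is wider than unsigned int.
    assert(add_as_ull(100ull, -22u) == uint_max + 79ull);
}
```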
Signed/unsigned is a convention. It concerns the most significant bit of the variable (for a 32-bit x86 int, bit 31). What you store in the variable takes the full bit length.
It's the calculations that follow that take the upper bit as a sign indicator or ignore it. Therefore, any "unsigned" variable can contain a signed value which will be converted to the unsigned form when the unsigned variable participates in a calculation.
unsigned int x = -1; // x is now 0xFFFFFFFF.
x -= 1; // x is now 0xFFFFFFFE.
if (x < 0) // false. x is compared as 0xFFFFFFFE.
int x = -1; // x stored as 0xFFFFFFFF
x -= 1; // x stored as 0xFFFFFFFE
if (x < 0) // true, x is compared as -2.
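The idea that the same bits are merely read differently can be sketched with fixed-width types. The helper below is hypothetical and uses std::memcpy to copy the bit pattern without any value conversion.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the 32 bits of an unsigned value as a signed two's complement value.
std::int32_t bits_as_signed(std::uint32_t bits) {
    std::int32_t result;
    std::memcpy(&result, &bits, sizeof result);  // copies the bit pattern unchanged
    return result;
}

void check_interpretations() {
    std::uint32_t x = 0xFFFFFFFFu;    // what `unsigned int x = -1;` stores (32-bit)
    x -= 1;                           // now 0xFFFFFFFE
    assert(x == 0xFFFFFFFEu);         // as unsigned: 4294967294, never below zero
    assert(bits_as_signed(x) == -2);  // the same bits read as signed
}
```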
Technically valid, bad programming.
If I use Nounce = 32766, it prints the output only once, but with 32767 it goes into an infinite loop. Why? The same thing happens when I use int.
#include<iostream>
using namespace std;
class Mining
{
public: // Nounce must be public so that main can assign to it
short int Nounce;
};
int main()
{
Mining Mine;
Mine.Nounce = 32767;
for (short int i = 0; i <= Mine.Nounce; i++)
{
if (i == Mine.Nounce)
{
cout << " Nounce is " << i << endl;
}
}
return 0;
}
When you use the largest possible positive value, every other value will be <= to it, so this loop goes on forever:
for(short int i=0;i<=Mine.Nounce;i++)
You can see that 32767 is the largest value for a short on your platform by using numeric_limits:
std::cout << std::numeric_limits<short>::max() << std::endl; //32767
When i reaches 32767, i++ will attempt to increment it. With int, this is undefined behavior because of signed overflow. With short, the addition is done in int and the result 32768 is converted back to short, which is implementation-defined before C++20 and wraps around since C++20. In practice most implementations (like your own, apparently) will simply roll over to the most negative value, and then i++ will happily increment up again.
Numeric types have a limit to the range of values they can represent. It seems like the maximum value a short int can store on your platform is 32767. So i <= 32767 is necessarily true; there exists no short int that is larger than 32767 on your platform. This is also why the compiler complains when you attempt to assign 100000 to Mine.Nounce: it cannot represent that value. See std::numeric_limits to find out what the limits are for your platform.
To increment a signed integer variable that already has the largest possible representable value is undefined behavior. Your loop will eventually try to execute i++ when i == 32767 which will lead to undefined behavior.
Consider using a larger integer type. int is at least 32 bit on the majority of platforms, which would allow it to represent values up to 2147483647. You could also consider using unsigned short which on your platform would likely be able to represent values up to 65535.
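The suggestion to use a larger type can be sketched like this. The function name is hypothetical; long long is chosen here only because it is guaranteed to be at least 64 bits, so the counter can step one past any short limit and end the loop.

```cpp
#include <cassert>
#include <limits>

// Counts the iterations of `for (i = 0; i <= limit; i++)` using a counter type
// that is wider than short, so i can go one past the limit and end the loop.
long long count_inclusive(short limit) {
    long long iterations = 0;
    for (long long i = 0; i <= limit; ++i) {
        ++iterations;
    }
    return iterations;
}

void check_counts() {
    const short max = std::numeric_limits<short>::max();
    assert(count_inclusive(max) == static_cast<long long>(max) + 1);  // terminates
    assert(count_inclusive(-1) == 0);                                 // empty range
}
```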
In your for loop, i will never be greater than the value of Mine.Nounce because of the way that shorts are represented in memory. Most implementations use 2 bytes for a short, with one bit serving as the sign bit. Therefore, the maximum value that can be represented by a signed short is 2^15 - 1 = 32767.
It goes to infinite loop because your program exhibits undefined behavior due to a signed integer overflow.
Variable i of type short overflows after it reaches the value of Mine.Nounce which is 32767 which is probably the max value short can hold on your implementation. You should change your condition to:
i < Mine.Nounce
which will keep the value of i at bay.
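Note that with i < Mine.Nounce the body no longer runs for i == Mine.Nounce. If that last value still has to be visited while keeping a short counter, one possible variant is to leave the loop before the increment. This is a sketch with a made-up function name, not the approach from the answer above.

```cpp
#include <cassert>
#include <limits>

// Visits 0..limit inclusive with a short counter, leaving the loop before the
// increment that would go past the limit. Returns the last value visited,
// or -1 if the range is empty (limit < 0).
short last_visited(short limit) {
    short last = -1;
    if (limit < 0) {
        return last;
    }
    for (short i = 0; ; ++i) {
        last = i;  // the loop body would go here
        if (i == limit) {
            break;  // stop before incrementing past the limit
        }
    }
    return last;
}

void check_last_visited() {
    const short max = std::numeric_limits<short>::max();
    assert(last_visited(max) == max);  // reaches the maximum and terminates
    assert(last_visited(-5) == -1);    // empty range, body never runs
}
```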
#include <iostream>
using namespace std;
int main()
{
unsigned n;
cin>>n;
for(int i=(1<<31);i>0;i/=2)
(i&n)?(cout<<1):(cout<<0);
}
I ran the following code with n=1 but it prints nothing on the console. Changing the type of variable i to unsigned did the trick and printed 00000000000000000000000000000001. Any idea why?
Assuming two's complement, 1 << 31 results in a negative value, so your test for i > 0 fails immediately with the first test. You most likely would have had more luck with i != 0 then.
But be aware that 1 << 31 overflows a 32-bit signed int, which was undefined behaviour before C++14 and is only fully defined since C++20! So you should do 1U << 31 instead, too. If you then assign this positive value to a signed int, which is not able to hold it, the result is implementation-defined before C++20 (and negative in practice), so i needs to be unsigned as well. The correct for loop would look like this:
for(unsigned int i = 1U << 31; i > 0; i /= 2)
Although i /= 2 for unsigned values is equivalent to a bit shift (and is likely to be compiled to one), I would prefer the explicit bit shift operation here (i >>= 1), as this is what you actually intend.
Given that int is 32 bits wide on your platform, int i with a value of (1<<31) is a negative number. So execution never enters the for loop, because the condition requires i>0.
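Putting the unsigned mask and the explicit shift together, a sketch of the bit-printing loop might look as follows. It returns a string instead of printing, purely to make the result easy to check, and the function name is invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns the 32 binary digits of n, most significant bit first.
std::string to_binary(std::uint32_t n) {
    std::string digits;
    // The mask is unsigned, so it starts at 2^31 and shifts down to 0.
    for (std::uint32_t mask = std::uint32_t{1} << 31; mask != 0; mask >>= 1) {
        digits += (n & mask) ? '1' : '0';
    }
    return digits;
}

void check_binary() {
    assert(to_binary(1u) == std::string(31, '0') + "1");
    assert(to_binary(0u) == std::string(32, '0'));
}
```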
I am going through C++ Primer 5th Edition and am currently doing the signed/unsigned section. A quick question I have is when there is a wrap-around, say, in this block of code:
unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl; // prints -84
std::cout << u + i << std::endl; // if 32-bit ints, prints 4294967264
I thought that the max range was 4294967295 with the 0 being counted, so I was wondering why the wrap-around seems to be done from 4294967296 in this problem.
Unsigned arithmetic is modulo (maximum value of the type plus 1).
If the maximum value of an unsigned is 4294967295 (2^32 - 1), the result will be mathematically equal to (10-42) modulo 4294967296, which equals 10-42+4294967296, i.e. 4294967264.
When an out-of-range value is converted to an unsigned type, the result is the remainder of it modulo the number of values the target unsigned type can hold. For instance, the result of n converted to unsigned char is n modulo 256 (the non-negative remainder, so -1 becomes 255), because unsigned char can hold values 0 to 255.
It's similar in your example, the wrap-around is done using 4294967296, the number of values that a 32-bit unsigned integer can hold.
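The unsigned char case can be checked directly. This sketch assumes 8-bit chars, and it also shows that the built-in % operator is not quite the same thing as the mathematical modulo for negative operands.

```cpp
#include <cassert>
#include <climits>

// Converts n to unsigned char; the result is n reduced modulo 2^CHAR_BIT.
unsigned char to_uchar(int n) {
    return static_cast<unsigned char>(n);
}

void check_uchar_conversion() {
    static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");
    assert(to_uchar(256) == 0);
    assert(to_uchar(300) == 44);  // 300 - 256
    assert(to_uchar(-1) == 255);  // -1 + 256
    assert(-1 % 256 == -1);       // the % operator keeps the sign of the dividend
}
```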
Given unsigned int that is 32 bits you're correct that the range is [0, 4294967295].
Therefore -1 is 4294967295, which is 4294967296 - 1; that should explain the behavior you're seeing.
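The u + i expression from the question can be modelled with 32-bit fixed-width types, which removes the "if 32-bit ints" assumption from the picture. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models `u + i` from the question with 32-bit types: i is converted to
// unsigned first, then the sum is reduced modulo 2^32.
std::uint32_t add_mixed(std::uint32_t u, std::int32_t i) {
    return u + static_cast<std::uint32_t>(i);
}

void check_wraparound() {
    assert(static_cast<std::uint32_t>(-1) == 4294967295u);  // 4294967296 - 1
    assert(add_mixed(10u, -42) == 4294967264u);             // 10 - 42 + 4294967296
}
```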