Why does this loop run? - c++

#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    std::string qaz{};
    vector<size_t> index;
    cout << "qaz: " << qaz << " length: " << qaz.length() << "\n";
    for (size_t i{0}; i <= (qaz.length() - 2); i++)
    {
        cout << "Entered" << i << "\n";
        cout << "Exited" << i << "\n";
    }
    return 0;
}
// Here qaz is an empty string, so qaz.length() == 0 (and, I assumed, qaz.length() - 2 == -2), and i is initialized to 0, so I expected that we would not enter the loop. But on running it I find that it goes into an infinite loop. Why? Please help me with it.

See docs for size_t:
std::size_t is the unsigned integer type of the result of the sizeof operator
(Emphasis mine.)
Furthermore, string::length returns a size_t too [1].
But even if that were not the case, when comparing signed values to unsigned values, the signed value is converted to unsigned before the comparison, as explained in this answer.
(size_t)0 - 2 wraps around, as size_t is unsigned and therefore its minimum value is zero, resulting in a large number which is usually [2] either 2^32 - 2 or 2^64 - 2 depending on the processor architecture. Let's go with the latter: then you get 18,446,744,073,709,551,614 as the result.
Now, looking at the condition 0 <= 18446744073709551614, you can see that zero is clearly less than or equal to about 18.4 quintillion, so the loop condition is fulfilled. In fact the loop is not infinite: it will run exactly 18,446,744,073,709,551,615 times, but it's true you will probably not want to wait for it to finally reach its finishing point.
The solution is to avoid the wrap-around by comparing i + y <= x instead of i <= x - y [3], i.e. i + 2 <= qaz.length(). You will then have 2 <= 0, which is false.
[1]: Technically, it returns an std::allocator<char>::size_type, but that is defined as std::size_t.
[2]: To be exact, it is SIZE_MAX - (2 - 1), i.e. SIZE_MAX - 1 (see limits). In terms of numeric value it could also be 2^16 - 2, such as on an ATmega328P microcontroller, or some other value, but on the architectures you get on desktop computers at the current point in time it's most likely one of the two I mentioned. It depends on the width of the std::size_t type: if it's X bits wide, you get 2^X - n for (size_t)0 - n, for 0 < n < 2^X. Since C++11 it is guaranteed that std::size_t is no less than 16 bits wide.
[3]: However, in the unlikely case that your length is very large, specifically 2^X - 2 or larger, this would wrap around (overflow) instead. But in that case your whole logic would be flawed and you'd need a different approach. I don't think this can happen anyway, because std::ssize support means that string lengths have to leave one bit unused so it can be repurposed as a sign bit; but this answer has gone down enough rabbit holes already.
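To make the wrap-around concrete, here is a minimal sketch (a demonstration added for illustration, not part of the original program; the printed value assumes a 64-bit size_t):
#include <cstddef>
#include <iostream>
int main()
{
    std::size_t len = 0;                  // like qaz.length() for an empty string
    std::cout << len - 2 << "\n";         // wraps around: 18446744073709551614 on 64-bit
    std::cout << (0 <= len - 2) << "\n";  // prints 1: the loop condition is fulfilled
}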

length() returns an unsigned value, which cannot go below zero. 0u - 2 wraps around and becomes a very large number.
Use i + 2 <= qaz.length() instead.

The issue is that size_t is unsigned. length() returns the string's size_type, which is unsigned and most likely also size_t. When the string's size is less than 2, length() - 2 wraps around and yields a large unsigned value.
Since C++20 there is std::ssize, which returns a signed value. Note that you also have to adjust the type of i so that you get the correct number of iterations even when the condition amounts to i <= -2:
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    std::string qaz{};
    vector<size_t> index;
    cout << "qaz: " << qaz << " length: " << qaz.length() << "\n";
    for (int i{0}; i <= (std::ssize(qaz) - 2); i++)
    {
        cout << "Entered" << i << "\n";
        cout << "Exited" << i << "\n";
    }
}
Alternatively, stay with unsigned types and use i + 2 <= qaz.length().

Related

Binary search doesn't work for large numbers

I am trying to solve the problem of finding the square root of a given number using binary search in C++. It works perfectly for small numbers, but for input >= 2000000000 it doesn't work at all.
code:
#include <iostream>
using namespace std;

int main() {
    int n; cin >> n;
    int l = 0, r = n + 1;
    while (r - l > 1) {
        int m = (r + l) / 2;
        if (m * m <= n) {
            l = m;
        } else {
            r = m;
        }
    }
    cout << l;
    return 0;
}
some tests (input followed by output):
1 -> 1
16 -> 4
but
2000000000000000 -> -3456735426738
can't understand why...
I tested the same code in Python and it works fine, so it's probably some C++ feature which I don't know.
A number n >= 2000000000 surely works, as long as it doesn't reach its maximum allowed value (more on that shortly).
Because it seems you're not familiar with data types and their sizes in C and C++, I'll keep it simple.
An int is normally 4 bytes (yes, I said "normally", as there are exceptions to this rule; that's a separate discussion about platforms and their architectures, so for now take the simple explanation that it's 4 bytes in most cases), meaning 32 bits. It can be signed or unsigned.
Minor caveat: when unsigned is not explicitly specified, then it's considered to be signed by default, so int x; would mean that x can take negative values as well
A signed int (signed meaning it covers both positive and negative numbers, so apart from zero and the most negative number you'd have each value "twice", once with + and once with -, hence the terminology) has the following range: -2147483648 to +2147483647.
To "increase" the maximum allowed value, you'd need an unsigned int. Its range is 0 to 4294967295.
There are "bigger" types in C and C++ but I think that discussion is slightly more advanced. Short version is this: for a 64bit integer, if you're using GCC you can use uint64_t, if you're using MSVS you can either use __int64, but you can also use uint64_t.
For even larger values, well... it gets really complicated. Python has native support for larger numbers, that is why it works there from the get-go.
You need to check the data types available in C and C++; preferably read up on C17 (the 2017 C standard, the newest released at the time of writing) and C++20 (the 2020 C++ standard). The roadmap says the next standard update for both would be in 2023 (so fingers crossed :) ).
Regarding your code, however, also keep in mind what molbdnilo and ALX23z said about overflow in their comments. Even if you cover sufficient data type ranges, there's still a risk of overflow due to mistakes in your code:
molbdnilo: m * m overflows
ALX23z: Instead of m * m <= n write m < n/m. And inspect better the case when m == n/m
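Putting those comments together, here is a hedged sketch of one possible fix (not the only one): it widens the variables to long long and tests m <= n / m instead of m * m <= n so the multiplication can't overflow:
#include <iostream>
using namespace std;

int main() {
    long long n; cin >> n;              // wide enough for inputs like 2000000000000000
    long long l = 0, r = n + 1;
    while (r - l > 1) {
        long long m = l + (r - l) / 2;  // same midpoint, written to avoid overflow in r + l
        if (m <= n / m) {               // equivalent to m * m <= n for positive m, without the multiply
            l = m;
        } else {
            r = m;
        }
    }
    cout << l;
    return 0;
}
Note that when n is 0 the loop body never runs, so the division by m (which is always at least 1 inside the loop) is safe.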

Can someone explain the output of this simple program in C++?

While debugging an issue in our codebase, I stumbled upon a problem which is quite similar to this sample problem below
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    int MAX = 100;
    int result = (v.size() - 1) / MAX;
    std::cout << result << std::endl;
    return 0;
}
I would expect the output of the program should be 0 but it's -171798692.
Can someone help me understand this?
v.size() returns an unsigned value of type std::vector::size_type. This is typically size_t.
Arithmetic on unsigned values wraps around, and (v.size() - 1) will be 0xFFFFFFFFFFFFFFFF (18446744073709551615) if your size_t is 64 bits wide.
Dividing this value by 100 yields 0x28F5C28F5C28F5C (184467440737095516).
Then this result is converted to int. If your int is 32 bits wide and the conversion is done by simple truncation, the value will be 0xF5C28F5C.
That bit pattern represents -171798692, which is what you got, in two's complement.
The problem is v.size() - 1.
The size() function returns an unsigned value. When you subtract 1 from unsigned 0 you don't get -1 but rather a very large value.
Then you convert this large unsigned value back into a signed integer type, which can turn it negative.
Not only that, but on a 64-bit system size() likely returns a 64-bit value while int stays 32 bits, making you lose half the data.
Vector v is empty, so v.size() is 0. Also v.size() is unsigned, so strange things happen when you subtract from that 0.
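A hedged sketch of the usual fixes: either do the subtraction in a signed type, or (since C++20) use std::ssize, which returns a signed value:
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    int MAX = 100;
    // Option 1: convert to a signed type before subtracting
    int result1 = (static_cast<int>(v.size()) - 1) / MAX;  // (0 - 1) / 100 == 0
    // Option 2 (C++20): std::ssize returns a signed size
    auto result2 = (std::ssize(v) - 1) / MAX;              // also 0
    std::cout << result1 << " " << result2 << std::endl;
}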

Why does the output go into an infinite loop?

If I use Nounce = 32766 it prints the output once, but for 32767 it goes into an infinite loop... why?? The same thing happens when I use int.
#include <iostream>
using namespace std;

class Mining
{
public:  // made public so main() can access it; the original snippet omitted the access specifier
    short int Nounce;
};

int main()
{
    Mining Mine;
    Mine.Nounce = 32767;
    for (short int i = 0; i <= Mine.Nounce; i++)
    {
        if (i == Mine.Nounce)
        {
            cout << " Nounce is " << i << endl;
        }
    }
    return 0;
}
When you use the largest possible positive value, every value of i will compare <= to it, so this loop goes on forever:
for (short int i = 0; i <= Mine.Nounce; i++)
You can see that 32767 is the largest value for a short on your platform by using numeric_limits (from the <limits> header):
std::cout << std::numeric_limits<short>::max() << std::endl; // 32767
When i reaches 32767, i++ will attempt to increment it. This is undefined behavior because of signed overflow; however, most implementations (like yours, apparently) will simply wrap around to the most negative value, and then i++ will happily count up from there again.
Numeric types have a limit to the range of values they can represent. It seems the maximum value a short int can store on your platform is 32767, so i <= 32767 is necessarily true: there exists no short int larger than 32767 on your platform. This is also why the compiler complains if you attempt to assign 100000 to Mine.Nounce; it cannot represent that value. See std::numeric_limits to find out what the limits are for your platform.
To increment a signed integer variable that already has the largest possible representable value is undefined behavior. Your loop will eventually try to execute i++ when i == 32767 which will lead to undefined behavior.
Consider using a larger integer type. int is at least 32 bits on the majority of platforms, which would allow it to represent values up to 2147483647. You could also consider using unsigned short, which on your platform would likely represent values up to 65535.
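A minimal sketch of that fix, reusing Mine and cout from the question: with an int counter, Mine.Nounce is promoted to int in the comparison, so i can reach 32768 and the loop terminates:
for (int i = 0; i <= Mine.Nounce; i++)  // i is an int, so it can go past 32767
{
    if (i == Mine.Nounce)
    {
        cout << " Nounce is " << i << endl;  // prints once, then the loop ends
    }
}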
In your for loop, i will never be greater than the value of Mine.Nounce because of the way shorts are represented in memory. Most implementations use 2 bytes for a short, with one bit as the sign bit. Therefore the maximum value that can be represented by a signed short is 2^15 - 1 = 32767.
It goes into an infinite loop because your program exhibits undefined behavior due to a signed integer overflow.
The variable i of type short overflows after it reaches the value of Mine.Nounce, which is 32767, probably the maximum value a short can hold on your implementation. You should change your condition to:
i < Mine.Nounce
which will keep the value of i at bay.

why declare "score[11] = {};" and "grade" as "unsigned" instead of "int'

I'm new to C++ and am trying to learn the concept of arrays. I saw this code snippet online. For the sample code below, does it make any difference to declare:
unsigned scores[11] = {};
unsigned grade;
as:
int scores[11] = {};
int grade;
I guess there must be a reason why scores[11] = {}; and grade are declared as unsigned, but what is the reason behind it?
#include <iostream>
using namespace std;

int main() {
    unsigned scores[11] = {};
    unsigned grade;
    while (cin >> grade) {
        if (0 <= grade <= 100) {
            ++scores[grade / 10];
        }
    }
    for (int i = 0; i < 11; i++) {
        cout << scores[i] << endl;
    }
}
unsigned means that the variable will not hold negative values (or, more accurately, that it does not care about the sign). It seems obvious that scores and grades are non-negative quantities (no one scores -25), so it is natural to use unsigned.
But note that if (0 <= grade <= 100) is redundant; if (grade <= 100) is enough, since an unsigned grade can never be negative anyway.
As Blastfurnace commented, if (0 <= grade <= 100) is not even correct. If you want to spell out both bounds, you should write it as:
if (0 <= grade && grade <= 100)
Unsigned variables
Declaring a variable as unsigned int instead of int has 2 consequences:
It can't be negative. That gives you a guarantee that it never will be, so you don't need to check for negative values and handle special cases when writing code that only works with positive integers.
Since you have a limited size, it allows you to represent bigger numbers: on 32 bits, the biggest unsigned int is 4294967295 (2^32 - 1) whereas the biggest int is 2147483647 (2^31 - 1).
One consequence of using unsigned int is that arithmetic is done in the set of unsigned int values. So 9 - 10 = 4294967295 instead of -1, as no negative number can be encoded in the unsigned int type. You will also have issues if you compare unsigned values against negative ints.
More info on how negative integer are encoded.
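A minimal sketch of that wrap-around (the printed value assumes a 32-bit unsigned int):
#include <iostream>
int main()
{
    unsigned int a = 9, b = 10;
    std::cout << a - b << "\n";  // prints 4294967295, not -1
}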
Array initialization
For the array definition, if you just write:
unsigned int scores[11];
Then you have 11 uninitialized unsigned ints whose values are indeterminate and potentially different from 0.
If you write:
unsigned int scores[11] = {};
Then all elements are initialized to their default value, which is 0.
Note that if you write:
unsigned int scores[11] = { 1, 2 };
You will have the first element initialized to 1, the second to 2, and all the others to 0.
You can easily play with all these syntaxes to gain a better understanding of them; a small sketch follows.
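For instance, here is a sketch exercising the three forms (reading the uninitialized array would be undefined behavior, so it is only declared):
#include <iostream>
int main()
{
    unsigned int a[11];              // uninitialized: values are indeterminate
    unsigned int b[11] = {};         // all 11 elements are 0
    unsigned int c[11] = { 1, 2 };   // 1, 2, then nine 0s
    for (unsigned int x : b) std::cout << x << ' ';  // 0 0 0 0 0 0 0 0 0 0 0
    std::cout << '\n';
    for (unsigned int x : c) std::cout << x << ' ';  // 1 2 0 0 0 0 0 0 0 0 0
    std::cout << '\n';
    (void)a;                         // silence unused-variable warnings; never read a
}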
Comparison
About the code:
if(0 <= grade <= 100)
as stated in the comments, this does not do what you expect. In fact, it will always evaluate to true and therefore execute the code in the if. This means that if you enter a grade of, say, 20000, you index far outside the array and may well get a core dump. The reason is that this:
0 <= grade <= 100
is equivalent to:
(0 <= grade) <= 100
And the first part is either true (implicitly converted to 1) or false (implicitly converted to 0). As both values are lower than 100, the second comparison is always true.
unsigned integers have some strange properties and you should avoid them unless you have a good reason. Gaining 1 extra bit of positive size, or expressing a constraint that a value may not be negative, are not good reasons.
unsigned integers implement arithmetic modulo UINT_MAX+1. By contrast, operations on signed integers represent the natural arithmetic that we are familiar with from school.
Overflow semantics
unsigned has well defined overflow; signed does not:
unsigned u = UINT_MAX;
u++; // u becomes 0
int i = INT_MAX;
i++; // undefined behaviour
This has the consequence that signed integer overflow can be caught during testing, while an unsigned overflow may silently do the wrong thing. So use unsigned only if you are sure you want to legalize overflow.
If you have a constraint that a value may not be negative, then you need a way to detect and reject negative values; int is perfect for this. An unsigned will accept a negative value and silently overflow it into a positive value.
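A hedged sketch of that detect-and-reject pattern, applied to the grade-counting example from the question (the error message is made up for illustration):
#include <iostream>
int main()
{
    unsigned scores[11] = {};
    int grade;                                     // signed on purpose, so bad input is visible
    while (std::cin >> grade) {
        if (grade < 0 || grade > 100) {
            std::cerr << "grade out of range, ignoring\n";
            continue;                              // a negative value is rejected, not wrapped
        }
        ++scores[grade / 10];
    }
    for (unsigned s : scores) std::cout << s << '\n';
}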
Bit shift semantics
Bit shift of an unsigned value by an amount less than the width of the type is always well defined. Until C++20, a left shift of a signed value was undefined if it shifted a 1 into or past the sign bit, and a right shift of a negative signed value was implementation-defined. Since C++20, signed right shift always preserves the sign, but signed left shift does not. So use unsigned for some kinds of bit twiddling operations.
Mixed sign operations
The built-in arithmetic operations always operate on operands of the same type. If they are supplied operands of different types, the "usual arithmetic conversions" coerce them into the same type, sometimes with surprising results:
unsigned u = 42;
std::cout << (u * -1); // 4294967254
std::cout << std::boolalpha << (u >= -1); // false
What's the difference?
Subtracting an unsigned from another unsigned yields an unsigned result, which means that the difference between 1 and 2 is 4294967295 rather than -1.
Double the max value
int uses one bit to represent the sign of the value. unsigned uses this bit as just another numerical bit. So typically, int has 31 numerical bits and unsigned has 32. This extra bit is often cited as a reason to use unsigned. But if 31 bits are insufficient for a particular purpose, then most likely 32 bits will also be insufficient, and you should be considering 64 bits or more.
Function overloading
The implicit conversion from int to unsigned has the same rank as the conversion from int to double, so the following example is ill formed:
void f(unsigned);
void f(double);
f(42); // error: ambiguous call to overloaded function
Interoperability
Many APIs (including the standard library) use unsigned types, often for misguided reasons. It is sensible to use unsigned to avoid mixed-sign operations when interacting with these APIs.
Appendix
The quoted snippet includes the expression 0 <= grade <= 100. This will first evaluate 0 <= grade, which is always true, because grade can't be negative. Then it will evaluate true <= 100, which is always true, because true is converted to the integer 1, and 1 <= 100 is true.
Yes, it does make a difference. In the first case you declare an array of 11 elements and a variable, both of type unsigned int. In the second case you declare them as ints.
When int is 32 bits wide you have values in the following ranges:
–2,147,483,648 to 2,147,483,647 for plain int
0 to 4,294,967,295 for unsigned int
You normally declare something unsigned when you don't need negative numbers and you need the extra range unsigned gives. In this case I assume that, by declaring the variables unsigned, the developer chose not to accept negative scores and grades. The program basically builds a statistic of how many grades entered at the command line fall into each ten-point bucket from 0 to 100. It looks like something that simulates a school grading system, so there are no negative grades. But this is my opinion after reading the code.
Take a look at this post which explains what unsigned is:
what is the unsigned datatype?
As the name suggests, signed integers can be negative and unsigned ones cannot. If we represent an integer with N bits, then as unsigned it ranges from 0 up to 2^N - 1, while as a signed integer it can take values from -2^(N-1) to 2^(N-1) - 1. This is because one bit is effectively used to represent the sign +/-.
Ex: signed 3-bit integer (yes there are such things)
000 = 0
001 = 1
010 = 2
011 = 3
100 = -4
101 = -3
110 = -2
111 = -1
But for unsigned, the same bit patterns just represent the values [0, 7]. In the signed example, the most significant bit (MSB) signifies a negative value: all values where the MSB is set are negative. Hence the apparent loss of a bit in the absolute values you can represent.
It also behaves as one might expect: if you increment -1 (111) you get (1 000), but since we don't have a fourth bit it simply "falls off the end" and we are left with 000, i.e. zero.
The same applies to subtracting 1 from 0. First take the two's complement of 1:
111 = twos_complement(001)
and add it to 000, which yields 111 = -1 (from the table), which is what one might expect. What happens when you increment 011 (= 3), yielding 100 (= -4), is perhaps not what one might expect and is at odds with our normal intuition. These overflows are troublesome with fixed-point arithmetic and have to be dealt with.
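To experiment with this, here is a small sketch that simulates 3-bit two's complement by masking to three bits and sign-extending by hand (signed3 is a made-up helper name):
#include <iostream>
// Interpret the low 3 bits of v as a 3-bit two's complement number.
int signed3(unsigned v)
{
    v &= 7;                              // keep 3 bits: raw values 0..7
    return v < 4 ? (int)v : (int)v - 8;  // patterns 100..111 are the negatives -4..-1
}
int main()
{
    std::cout << signed3(0b111 + 0b001) << "\n";  // -1 + 1: the carry falls off the end, prints 0
    std::cout << signed3(0b011 + 0b001) << "\n";  // 3 + 1 wraps to -4
    std::cout << signed3(0b000 - 0b001) << "\n";  // 0 - 1 wraps to -1
}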
One other thing worth pointing out is that a signed integer can take one more negative value than positive ones, which has consequences for rounding (when using integers to represent fixed-point numbers, for example), but that's better covered in the DSP or signal processing forums.

Can n %= m ever return negative value for very large nonnegative n and m?

This question is regarding the modulo operator %. We know that in general a % b returns the remainder when a is divided by b, and that for nonnegative operands the remainder is greater than or equal to zero and strictly less than b. But does the above hold when a and b are of magnitude 10^9?
I seem to be getting a negative output from the following code for this input:
74 41 28
However, changing the final output statement does the trick and the result becomes correct!
#include <iostream>
using namespace std;
#define m 1000000007

int main() {
    int n, k, d;
    cin >> n >> k >> d;
    if (d > n)
        cout << 0 << endl;
    else
    {
        long long *dp1 = new long long[n+1], *dp2 = new long long[n+1];
        // build dp1:
        dp1[0] = 1;
        dp1[1] = 1;
        for (int r = 2; r <= n; r++)
        {
            dp1[r] = (2 * dp1[r-1]) % m;
            if (r >= k+1) dp1[r] -= dp1[r-k-1];
            dp1[r] %= m;
        }
        // build dp2:
        for (int r = 0; r < d; r++) dp2[r] = 0;
        dp2[d] = 1;
        for (int r = d+1; r <= n; r++)
        {
            dp2[r] = ((2 * dp2[r-1]) - dp2[r-d] + dp1[r-d]) % m;
            if (r >= k+1) dp2[r] -= dp1[r-k-1];
            dp2[r] %= m;
        }
        cout << dp2[n] << endl;
    }
}
Changing the final output statement to:
if (dp2[n] < 0) cout << dp2[n] + m << endl;
else cout << dp2[n] << endl;
does the trick, but why was it required?
By the way, the code is actually my solution to this question
This is a limit imposed by the range of int.
int can only hold values from -2,147,483,648 to 2,147,483,647.
Consider using long long for your m, n, k, d & r variables. If possible, use unsigned long long if your calculations should never produce a negative value.
long long can hold values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807,
while unsigned long long can hold values from 0 to 18,446,744,073,709,551,615 (2^64 - 1).
The range of positive values is approximately halved in signed types compared to unsigned types, because the most significant bit is used for the sign. If you try to store a positive value that is too large for the signed type, the most significant bit may end up set and the value gets interpreted as negative.
Well, no, modulo with positive operands does not produce negative results.
However...
The int type is only guaranteed by the C standards to support values in the range -32767 to 32767, which means your macro m is not necessarily expanding to a literal of type int. It will fit in a long though (which is guaranteed to have a large enough range).
If that's happening (e.g. a compiler that has a 16-bit int type and a 32-bit long type), the results of your modulo operations will be computed as long and may have values that exceed what an int can represent. Converting such a value back to an int (as statements like dp1[r] %= m would require if the dp arrays were int rather than long long) would not preserve the value.
Mathematically, there is nothing special about big numbers, but computers only have a limited width in which to write numbers down, so when things get too big you get "overflow" errors. A common analogy is the mile counter on a car dashboard: eventually it shows all 9s and rolls round to 0. Because of the way negative numbers are handled, standard signed integers don't roll round to zero, but to a very large negative number.
You need to switch to wider variable types so that they overflow less quickly: long int or long long int instead of plain int, with the range doubling with each extra bit of width. You can also use unsigned types for a further doubling, since no range is spent on negatives.
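For reference, the asker's fix at the end of the modulo question generalizes to a small helper: in C++, the % operator truncates toward zero, so it can return a negative result when the left operand is negative, and adding m back normalizes it. A hedged sketch (posmod is a made-up name):
#include <iostream>
// Return a % m normalized into [0, m), assuming m > 0.
long long posmod(long long a, long long m)
{
    long long r = a % m;       // may be negative when a is negative
    return r < 0 ? r + m : r;
}
int main()
{
    std::cout << posmod(-5, 3) << "\n";  // prints 1
    std::cout << posmod(5, 3) << "\n";   // prints 2
}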