why does subtraction overflow with static_cast? - c++

I understand that s1.size() - s2.size() underflows when s2 is bigger, because it is a subtraction of unsigned values.
Why doesn't casting one of them to int result in signed integer subtraction?
Why does casting the whole thing give me the correct result? I expected it to evaluate what is inside the parentheses, underflow to a big number, and then the cast to int would make no difference. What am I missing?
#include <iostream>
#include <string>
using std::cout;
using std::cin;
using std::endl;
using std::string;
bool isShorter(const string &s1, const string &s2) {
return (static_cast<int>(s1.size()) - s2.size() < 0) ? true : false; // underflows
//return (static_cast<int>(s1.size() - s2.size()) < 0) ? true : false; // this works
}
int main() {
string s, t;
getline(cin, s);
getline(cin, t);
cout << "s: " << s << endl;
cout << "t: " << t << endl;
cout << "printing shorter string of the two..." << endl;
cout << ((isShorter(s, t)) ? s : t) << endl;
}

When you do
static_cast<int>(s1.size()) - s2.size()
You convert s1.size() to an int, and then, when you subtract s2.size() from it, that int is converted to the type of s2.size() before the subtraction is performed. This means you still have unsigned integer subtraction, and since that can never be negative, the result wraps around to a large number. It is no different from doing s1.size() - s2.size().
You have the same thing with
static_cast<int>(s1.size() - s2.size())
With the added complication that the wrapped-around value usually does not fit in an int, so the result of the conversion is implementation-defined before C++20 (since C++20 it is defined to wrap modulo 2^N). You are still doing unsigned integer subtraction, so if s1 is shorter than s2, you wrap around to a large number before the cast is applied.
What you need to do is convert both s1.size() and s2.size() to a signed integer type to get signed integer subtraction. That could look like
static_cast<ptrdiff_t>(s1.size()) - static_cast<ptrdiff_t>(s2.size())
And now you will actually get a negative number if s1.size() is less than s2.size().
It should be noted that all of this can be avoided by using the less-than operator. Your function can be rewritten to be
bool isShorter(const string &s1, const string &s2)
{
return s1.size() < s2.size();
}
which, IMHO, is much easier to read and understand.

Casting "one of them" to int leaves you with an arithmetic operation that mixes string::size_type and int. In this mix the unsigned type has the same rank as int or higher, which means that the unsigned type still "wins": your int is implicitly converted back to string::size_type and the calculation is performed in the domain of string::size_type. Your conversion to int is effectively ignored.
Meanwhile, casting the result to int means that you are attempting to convert a value that does not fit into int's range. The behavior in such cases is implementation defined. In real-life 2's-complement implementations it is not unusual to see a simple truncation of the representation, which produces the "correct" result. This is not a good approach though.
If you want to perform this subtraction as a signed one, you have to convert both operands to signed types, making sure that the target signed type can represent both values.
(Theoretically, you can get away with converting just one operand to signed type, but for that you'd need to choose a type that can represent the entire range of string::size_type.)
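To make the conversion rule concrete, here is a minimal sketch. The helper names signedLengthDiff and isShorterSigned are made up for illustration, and the static_assert assumes a mainstream platform where std::size_t is at least as wide as unsigned int.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Mixing int with std::size_t converts the int operand to the unsigned type.
// Assumption: std::size_t is at least as wide as unsigned int.
static_assert(std::is_same<decltype(0 - std::size_t{0}), std::size_t>::value,
              "int - size_t is evaluated as size_t");

// Hypothetical helper: length difference as a signed value.
// Assumes both lengths fit in std::ptrdiff_t.
std::ptrdiff_t signedLengthDiff(const std::string& s1, const std::string& s2) {
    return static_cast<std::ptrdiff_t>(s1.size()) -
           static_cast<std::ptrdiff_t>(s2.size());
}

// Hypothetical helper mirroring the question's isShorter,
// built on the signed difference instead of operator<.
bool isShorterSigned(const std::string& s1, const std::string& s2) {
    return signedLengthDiff(s1, s2) < 0;
}
```

With both operands converted, signedLengthDiff("ab", "abcd") is a genuine -2 rather than a huge unsigned value. Comparing the sizes directly with < remains the simpler choice when only the ordering matters.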


How to convert char(or string, other type) -> bits?

In C++,
I don't understand the following and need your help.
In this topic, the answers say to use to_string.
But both those answers and cppreference say that to_string converts a bitset to a string.
So I wonder how to convert some data (a char or a string, maybe ASCII; can Unicode be converted too?) to bits.
(That is, so that the value can be split into bits and each bit can be processed.)
The question was "How to convert char to bits?"
and the answers say "use to_string in bitset",
but I want to get each bit of my input.
Can I split values of various types into bits, analyze them, and process them? If so, how?
#include <iostream>
#include <bitset>
#include <string>
using namespace std;
int main() {
char letter;
cout << "letter: " << endl;
cin >> letter;
cout << bitset<8>(letter).to_string() << endl;
bitset<8> letterbit(letter);
int lettertest[8];
for (int i = 0; i < 8; ++i) {
lettertest[i] = letterbit.test(i);
}
cout << "letter bit: ";
for (int i = 0; i < 8; ++i) {
cout << lettertest[i];
}
cout << endl;
int test = letterbit.test(0);
}
When I run this code, I get the result I want.
But I don't understand to_string.
The important point is the use of to_string.
(to_string is a function that converts a bitset to a string, as the name says;
is there then a function that converts a string to a bitset?
Actually, my code passes a letter to the bitset constructor, which converts it to a bitset; at first, that is the result I want.)
Help me understand what is happening here.
Q: What is a bitset?
https://www.cplusplus.com/reference/bitset/bitset/
A bitset stores bits (elements with only two possible values: 0 or 1,
true or false, ...).
The class emulates an array of bool elements, but optimized for space
allocation: generally, each element occupies only one bit (which, on
most systems, is eight times less than the smallest elemental type:
char).
In other words, a "bitset" is a binary object (like an "int", a "char", a "double", etc.).
Q: What is bitset<>.to_string()?
Bitsets have the feature of being able to be constructed from and
converted to both integer values and binary strings (see its
constructor and members to_ulong and to_string). They can also be
directly inserted and extracted from streams in binary format (see
applicable operators).
In other words, to_string() allows you to convert the binary bitset to text.
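Q: How do I convert a char (or a string, or another type) to bits?
A: Per the above, construct a bitset from the value, as in bitset<8>(letter); the individual bits are then available through test() or operator[]. To go back to an integer, use bitset<>.to_ulong(). A bitset can also be constructed from a string of '0' and '1' characters, which is the inverse of to_string().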
Here is an example:
https://en.cppreference.com/w/cpp/utility/bitset/to_string
Code:
#include <iostream>
#include <bitset>
int main()
{
std::bitset<8> b(42);
std::cout << b.to_string() << '\n'
<< b.to_string('*') << '\n'
<< b.to_string('O', 'X') << '\n';
}
Output:
00101010
**1*1*1*
OOXOXOXO
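As a rough sketch of both directions for a whole string, the following uses two helpers whose names (toBitString, fromBitString) are invented here; it assumes the usual 8-bit char and an ASCII-compatible character set.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: the bits of every char in text, most significant
// bit first, with one space between the groups.
std::string toBitString(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i != 0) out += ' ';
        // Go through unsigned char so negative char values do not sign-extend.
        std::bitset<CHAR_BIT> bits(static_cast<unsigned char>(text[i]));
        out += bits.to_string();
    }
    return out;
}

// Hypothetical helper: parse a string of '0'/'1' characters back into a number.
// The bitset constructor throws std::invalid_argument on any other character.
unsigned long fromBitString(const std::string& bits) {
    return std::bitset<CHAR_BIT>(bits).to_ulong();
}
```

Under those assumptions, toBitString("A") gives "01000001" and fromBitString("00101010") gives 42. Note that to_string() prints the most significant bit first, while test(0) refers to the least significant bit, so a loop over test(i) from 0 upwards prints the bits in reverse order.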

C++ pointer Warning: Arithmetic overflow: Using operator '-' on a 4-byte value and then casting the result to an 8-byte value

When I run this function, it gives me two warnings, for setw(*torPtr - *harePtr) and setw(*harePtr - *torPtr).
It said:
Arithmetic overflow: Using operator '-' on a 4-byte value and then casting the result to an 8-byte value. Cast the value to the wider type before calling operator '-' to avoid overflow (io.2).
How can I fix this please?
void Posi(const int* const tPtr,const int* const hPtr)
{
if (*hPtr == *tPtr) {
cout <<setw(*hPtr) << "bang!" << '\a';
}
else if (*hPtr < *tPtr) {
cout << setw(*hPtr) << 'H' << setw(*tPtr - *hPtr) << 'A';
}
else {
cout << setw(*tPtr) << 'T' << setw(*hPtr - *tPtr) << 'B';
}
}
When using Visual Studio, I get this error as well.
After looking into setw in Visual Studio's <iomanip>, I found that it is declared there with a streamsize parameter, which is actually a long long. (The C++ standard itself declares the parameter as int.)
The problem seems to be that the difference of two int values (4 bytes each) is computed as an int and only then converted to a long long (8 bytes) to match the streamsize parameter of setw.
If the int subtraction overflowed, widening the result afterwards would not recover the correct value.
If you want to learn how an overflow is caused, you could look at the following web article: https://www.cplusplus.com/articles/DE18T05o/.
To fix the problem, you need to prevent the overflow from occurring, which can be achieved by casting the values to a larger data type before subtracting. For example:
const long long value_cast = static_cast<long long>(*tPtr) - static_cast<long long>(*hPtr);
cout << setw(*hPtr) << 'H' << setw(value_cast) << 'A';
I hope that this answers your question. :)
Edit:
I changed the cast from c-style to static. Thank you for your contribution anastaciu!
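The difference between widening before and after the subtraction can be sketched in isolation. The helper name widenedDifference is hypothetical, and the static_assert spells out the assumption that long long is wider than int.

```cpp
#include <limits>

// Assumption made explicit: long long has more value bits than int.
static_assert(std::numeric_limits<long long>::digits >
                  std::numeric_limits<int>::digits,
              "long long must be wider than int");

// Hypothetical helper: subtract two ints in long long arithmetic.
// Each operand is widened before the subtraction, as the warning suggests,
// so the result is correct even when it does not fit in int.
long long widenedDifference(int a, int b) {
    return static_cast<long long>(a) - static_cast<long long>(b);
}
```

For example, widenedDifference(std::numeric_limits<int>::max(), -1) yields one more than the largest int, whereas computing the difference in int first would overflow.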

Integer overflow in boolean expressions

I have the following c++ code:
#include <iostream>
using namespace std;
int main()
{
long long int currentDt = 467510400*1000000;
long long int addedDt = 467510400*1000000;
if(currentDt-addedDt >= 0 && currentDt-addedDt <= 30*24*3600*1000000)
{
cout << "1" << endl;
cout << currentDt-addedDt << endl;
}
if(currentDt-addedDt > 30*24*3600*1000000 && currentDt-addedDt <= 60*24*3600*1000000)
{
cout << "2" << endl;
cout << currentDt-addedDt << endl;
}
if(currentDt-addedDt > 60*24*3600*1000000 && currentDt-addedDt <= 90*24*3600*1000000)
{
cout << "3" << endl;
cout << currentDt-addedDt << endl;
}
return 0;
}
Firstly, I get a warning for integer overflow, which strikes me as odd because the number 467510400*1000000 falls well within the range of a long long int, does it not? Secondly, I get the following output:
1
0
3
0
If in both cases currentDt-addedDt evaluates to 0, how could the third if statement possibly evaluate to true?
467510400*1000000 is within the range of long long, but it's not within the range of int. Since both literals are of type int, the type of the product is also of type int - and that will overflow. Just because you're assigning the result to a long long doesn't change the value that gets assigned. For the same reason that in:
double d = 1 / 2;
d will hold 0.0 and not 0.5.
You need to explicitly cast one of the literals to be of a larger integral type. For example:
long long int addedDt = 467510400LL * 1000000;
long long int currentDt = 467510400ll*1000000ll;
long long int addedDt = 467510400ll*1000000ll;
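Note the two lowercase letter "l"s following the digits. These make your constants long long; an uppercase LL suffix is equivalent and easier to tell apart from the digit 1. C++ normally interprets strings of digits in source as plain ints, as long as the value fits.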
The problem you are having is that all of your integer literals are int. When you multiply them they overflow giving you the unexpected behavior. To correct this you can make them long long literals using 467510400ll * 1000000ll
It's because
60*24*3600*1000000 evaluates to -25526272
use
60LL*24LL*3600LL*1000000LL
instead (note the 'LL' suffix)
You have tagged this with C++.
My minimal change to your code would use the c++ static_cast to promote at least one of the literal numbers (of any overflow generating expression) to an int64_t (found in include file cstdint).
Example:
// 0 true
if(currentDt-addedDt >= 0
&& // true because vvvv
// 0 true
currentDt-addedDt <= 30*24*3600*static_cast<int64_t>(1000000))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
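(For test 1, the result of the if clause is true; for tests 2 and 3 it is false.)
Because multiplication groups left to right, 30*24*3600 is still computed in int, which is fine since it fits; only the final multiplication, by the int64_t operand, is carried out in 64 bits, so nothing overflows and the compiler no longer warns. If the leading factors could themselves overflow int, put the static_cast on the first operand instead.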
Yes, it adds a lot of chars for being, in some sense, 'minimal'.
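One way to keep the limits readable is to compute them in a small function whose first operand is already 64-bit. This is only a sketch; microsecondsInDays and ageBucket are names invented for it, and the bucket boundaries are taken from the question's three if statements.

```cpp
#include <cstdint>

// Hypothetical helper: the number of microseconds in the given number of days.
// Starting the chain with a 64-bit operand makes every multiplication 64-bit.
constexpr std::int64_t microsecondsInDays(std::int64_t days) {
    return days * 24 * 3600 * 1000000;
}

// Hypothetical helper: which 30-day bucket (1, 2 or 3) an elapsed time falls
// into, or 0 if it is negative or longer than 90 days.
int ageBucket(std::int64_t elapsedMicros) {
    if (elapsedMicros < 0) return 0;
    if (elapsedMicros <= microsecondsInDays(30)) return 1;
    if (elapsedMicros <= microsecondsInDays(60)) return 2;
    if (elapsedMicros <= microsecondsInDays(90)) return 3;
    return 0;
}
```

With this, ageBucket(currentDt - addedDt) returns 1 for a difference of 0, and the third bucket can no longer match by accident.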

Casting from size_t to char and around

I'm making the transition from C to C++11 now and am trying to learn more about casting.
At the end of this question you see a small program which asks for a number as input and then shows it as a number and as a character. Then it is cast to a char, and after that I cast it back to a size_t.
When I give 200 as input, the first cout prints 200, but the second cout prints 18446744073709551560.
How do I make it print 200 again? Am I using the wrong cast? I have already tried different casts such as dynamic_cast and reinterpret_cast.
#include<iostream>
using namespace std;
int main(){
size_t value;
cout << "Give a number between 32 and 255: ";
cin >> value;
cout << "unsigned value: " << value << ", as character: `" << static_cast<char>(value) << "\'\n";
char ch = static_cast<char>(value);
cout << "unsigned value: " << static_cast<size_t>(ch) << ", as character: `" << ch << "\'\n";
}
size_t is unsigned; the signedness of plain char is implementation-defined.
Casting 200 to a signed char will yield a negative result, as 200 is larger than CHAR_MAX, which is 127 in the most common situation, a signed 8-bit char. (Advanced note: this conversion is also implementation-defined, but for all practical purposes you can assume a negative result; in fact usually -56.)
Casting that negative value back to an unsigned (but wider) integer type will yield a rather large value, because conversion to an unsigned type wraps around modulo 2^N; -56 becomes 2^64 - 56 for a 64-bit size_t.
You might cast the value to an unsigned char first (yielding the expected small positive value), then cast that to the wider unsigned type.
Most compilers have a switch that lets you toggle plain char over to unsigned so you could experiment with that. When you come to write portable code, try to write code that will work correctly in both cases!
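The two-step cast can be sketched as a small function. The name roundTripThroughChar is made up for this example, and it assumes an 8-bit char and an input between 0 and 255.

```cpp
#include <cstddef>

// Hypothetical helper: narrow a value to char and widen it back
// without sign extension.
// Assumes an 8-bit char and an input in the range 0..255.
std::size_t roundTripThroughChar(std::size_t value) {
    char ch = static_cast<char>(value);
    // Converting to unsigned char first maps a negative char back into 0..255,
    // whether plain char is signed or unsigned on this implementation.
    return static_cast<std::size_t>(static_cast<unsigned char>(ch));
}
```

Here roundTripThroughChar(200) gives 200 again, because the intermediate unsigned char never holds a negative value that could be sign-extended.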

how to check if given c++ string or char* contains only digits?

Or, the other way around, find the first non-digit character.
Do the same functions apply for string and for char* ?
Of course, there are many ways to test a string for only numeric characters. Two possible methods are:
bool is_digits(const std::string &str)
{
return str.find_first_not_of("0123456789") == std::string::npos;
}
or
bool is_digits(const std::string &str)
{
return std::all_of(str.begin(), str.end(), ::isdigit); // C++11
}
Several people already mentioned using isdigit(). However, note that this isn't entirely trivial, because char can be signed, which would cause a negative value to be passed to isdigit(). This function only accepts values that are representable as unsigned char (or EOF). That is, you want something akin to this:
if (s.end() == std::find_if(s.begin(), s.end(),
[](unsigned char c)->bool { return !isdigit(c); })) {
std::cout << "string '" << s << "' contains only digits\n";
}
It seems the reasoning for the conversion to unsigned char isn't obvious. So, here are the relevant quotes from their respective standards:
According to ISO/IEC 9899:2011 (or ISO/IEC 9899:1999) 7.4 paragraph 1 the following applies to the arguments of the functions from <ctype.h>:
... In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro EOF. If the
argument has any other value, the behavior is undefined.
Unfortunately, the C++ standard doesn't specify that char is an unsigned type. Instead it specifies in ISO/IEC 14882:2011 3.9.1 [basic.fundamental] paragraph 1:
... It is implementation-defined whether a char object can hold negative values. ...
Clearly, a negative value cannot be represented as an unsigned char. That is, if char is using a signed type on an implementation (there are actually several which do, e.g., it is signed on MacOS using gcc or clang) there is the danger that calling any of the <ctype.h> function would cause undefined behavior.
Now, why does the conversion to unsigned char do the right thing?
According to 4.7 [conv.integral] paragraph 2:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
That is, the conversion from a [potentially] signed char to unsigned char is well-defined and causes the result to be in the permitted range for the <ctype.h> functions.
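Packaged as a reusable predicate, the same idea might look like the sketch below. The name isAllDigits is hypothetical, and treating the empty string as "all digits" is a choice of this sketch, not something the answers above prescribe.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if every character of s is a decimal digit.
// An empty string yields true, because std::all_of is true for an empty range.
bool isAllDigits(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](unsigned char c) {
        // c is already an unsigned char, so isdigit never sees a negative value.
        return std::isdigit(c) != 0;
    });
}
```

Taking the lambda parameter as unsigned char performs the well-defined conversion described above on every character before it reaches std::isdigit.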
isdigit(int) tells you if a character is a digit. If you are going to assume ASCII and base 10, you can also use:
size_t first_non_digit_offset = strspn(string, "0123456789");
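For the "find the first non-digit" variant of the question, the std::string and char* cases can be put side by side. Both function names below are invented for this sketch, and both assume plain decimal digits only.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: index of the first non-digit in a std::string,
// or std::string::npos if the string contains only digits.
std::size_t firstNonDigitInString(const std::string& s) {
    return s.find_first_not_of("0123456789");
}

// Hypothetical helper: the same for a null-terminated C string.
// std::strspn returns the length of the leading run of digits, so a result
// equal to std::strlen(s) means no non-digit was found.
std::size_t firstNonDigitInCString(const char* s) {
    return std::strspn(s, "0123456789");
}
```

The two differ in how "not found" is reported: npos for the std::string version, the string length for the C version.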
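In the same spirit as Misha's answer, but more correct: sscanf(buf, "%u%c", &value, &ch) == 1, where value is an unsigned int and ch is a char.
sscanf returns 0 if the %u digit extraction fails, and 2 if there is anything after the digits, captured by %c. Note that the variant with assignment suppression, "%*u%*c", cannot be used for this test: conversions suppressed with * are not counted in the return value, so it never returns 1. %u also accepts leading whitespace and a sign, so this is not a strict digits-only check.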
The cctype header file has a good number of character classifications functions which you can use on each character in the string. For numeric checks, that would be isdigit.
The following program shows how to check each character of a C or C++ string ( the process is pretty much identical in terms of checking the actual characters, the only real difference being how to get the length):
#include <iostream>
#include <cstring>
#include <cctype>
#include <string>
int main () {
    const char *xyzzy = "42x";
    std::cout << xyzzy << '\n';
    // std::size_t matches the unsigned return type of std::strlen.
    for (std::size_t i = 0; i < std::strlen (xyzzy); i++) {
        // Cast to unsigned char so std::isdigit never receives a negative value.
        if (! std::isdigit (static_cast<unsigned char> (xyzzy[i]))) {
            std::cout << xyzzy[i] << " is not numeric.\n";
        }
    }
    std::string plugh ("3141y59");
    std::cout << plugh << '\n';
    for (std::size_t i = 0; i < plugh.length(); i++) {
        if (! std::isdigit (static_cast<unsigned char> (plugh[i]))) {
            std::cout << plugh[i] << " is not numeric.\n";
        }
    }
    return 0;
}
#include <regex>
std::string string( "I only have 3 dollars!" );
std::cout << std::regex_search( string, std::regex( "\\d+" ) ); // true
and
std::string string( "I only have three dollars!" );
std::cout << std::regex_search( string, std::regex( "\\d+" ) ); // false
According to cplusplus.com, you can use the isdigit function as follows:
// isdigit example (C++)
#include <iostream> // std::cout
#include <string> // std::string
#include <locale> // std::locale, std::isdigit
#include <sstream> // std::stringstream
int main ()
{
std::locale loc;
std::string str="1776ad";
if (isdigit(str[0],loc))
{
int year;
std::stringstream(str) >> year;
std::cout << "The year that followed " << year << " was " << (year+1) << ".\n";
}
return 0;
}
Note: there are two versions of isdigit; the other one, declared in <cctype>, takes no locale argument.
If it's a strict requirement that you can find exactly where the first non-digit character is, then you'll have to check each character. If not, I'd use something like this:
unsigned safe_atoi(const std::string& a)
{
std::stringstream s(a);
unsigned b;
s >> b;
return b;
}