stringstream unsigned conversion broken?

Consider this program:
#include <iostream>
#include <string>
#include <sstream>
#include <cassert>
int main()
{
    std::istringstream stream( "-1" );
    unsigned short n = 0;
    stream >> n;
    assert( stream.fail() && n == 0 );
    std::cout << "can't convert -1 to unsigned short" << std::endl;
    return 0;
}
I tried this on gcc (version 4.0.1 Apple Inc. build 5490) on OS X 10.5.6 and the assertion is true; it fails to convert -1 to an unsigned short.
In Visual Studio 2005 (and 2008), however, the assertion fails and the resulting value of n is what you would expect from a compiler-generated implicit conversion, i.e. "-1" is 65535, "-2" is 65534, etc. But then it gets weird at "-32769", which converts to 32767.
Who's right and who's wrong here? (And what the hell's going on with -32769??)

The behaviour claimed by GCC in Max Lybbert's post is based on the tables in the C++ Standard that map iostream behaviour onto printf/scanf converters (or at least that's my reading). However, the scanf behaviour of g++ seems to differ from the istream behaviour:
#include <iostream>
#include <cstdio>
using namespace std;
int main()
{
    unsigned short n = 0;
    if ( ! sscanf( "-1", "%hu", &n ) ) {
        cout << "conversion failed\n";
    }
    else {
        cout << n << endl;
    }
}
actually prints 65535.
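This matches the underlying C library rule: strtoul and friends accept an optional minus sign and negate the converted value in the unsigned return type, so no error is reported. A minimal sketch:
#include <cstdlib>
#include <climits>
#include <iostream>

int main()
{
    // strtoul accepts an optional '-' and negates the converted value
    // in the return type (C99 7.20.1.4), so "-1" converts "successfully".
    char* end = 0;
    unsigned long v = std::strtoul( "-1", &end, 10 );
    std::cout << v << " == " << ULONG_MAX << std::endl;  // prints ULONG_MAX twice
}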

First, reading the string "-1" as a negative number is locale dependent (it would be possible for a locale to identify negative numbers by enclosing them in parentheses). The default is the "classic" C locale:
By far the dominant use of locales is implicitly, in stream I/O. Each istream and ostream has its own locale. The locale of a stream is by default the global locale at the time of the stream’s creation (page 6). ...
Initially, the global locale is the standard C locale, locale::classic() (page 11).
According to the GCC guys, numeric overflow is allowed to fail the stream input operation (talking about negative numbers that overflowed a signed int):
[T]he behaviour of libstdc++-v3 is strictly standard conforming. ... When the read is attempted it does not fit in a signed int i, and it fails.
Thanks to another answer, a bug was filed and this behavior changed:
Oops, apparently we never parsed correctly negative values for unsigned. The
fix is simple. ...
Fixed in mainline, will be fixed in 4.4.1 too.
Second, note that -32769 modulo 2^16 is exactly 32767, so Visual Studio's result is the same wraparound that turns "-1" into 65535, not a separate oddity. And although unsigned wraparound is well defined, signed integer overflow is officially undefined behavior, so even a stranger result would, I think, be allowed.
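As a side note, genuine overflow (a value that doesn't fit even after wrapping) is easy to demonstrate directly; under C++11 rules the stream sets failbit and stores the nearest representable value, while older C++03 libraries left the variable untouched. A sketch:
#include <iostream>
#include <sstream>

int main()
{
    std::istringstream stream( "70000" );  // does not fit in unsigned short
    unsigned short n = 0;
    stream >> n;
    // C++11: failbit is set and n == 65535 (USHRT_MAX);
    // C++03 libraries typically left n unmodified.
    std::cout << "failed: " << stream.fail() << ", n: " << n << std::endl;
}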

Try this code:
#include <iostream>
#include <string>
#include <sstream>
#include <cassert>
int main()
{
    std::istringstream stream( "-1" );
    std::cout << "flags: " << (unsigned long)stream.flags() << std::endl;
    return 0;
}
I tried this on my VS2005:
flags: 513
and on codepad.org (which I think uses g++) this gives:
flags: 4098
This shows the two libraries report different raw flag values. Note, though, that fmtflags is a bitmask type whose numeric values are implementation-defined, so 513 and 4098 may well encode the same set of flags with different bit patterns; to compare defaults meaningfully you have to test the individual flags. Since fmtflags control how conversions are performed, a genuine difference there could explain the different results.
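A sketch of such a portable check (both VS and g++ should print 1 for both flags, since skipws and dec are the required defaults):
#include <iostream>
#include <sstream>

int main()
{
    std::istringstream stream( "-1" );
    std::ios_base::fmtflags f = stream.flags();
    // Test individual flags rather than the raw bit pattern; the
    // standard requires streams to start with skipws | dec set.
    std::cout << "skipws: " << ( ( f & std::ios_base::skipws ) != 0 ) << std::endl;
    std::cout << "dec:    " << ( ( f & std::ios_base::dec )    != 0 ) << std::endl;
}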

Related

std::istringstream: error handling difficulties when reading negative values into unsigned variables [duplicate]

I want to read unsigned integers in base-10 (decimal) representation from a C++ iostream with at least rudimentary error detection. In my view, a minus sign would clearly be an error in this case, because unsigned integers have no sign. However, gcc is of a different opinion:
#include <iostream>
#include <sstream>
int main() {
    std::stringstream a("5"), b("-0"), c("-4");
    unsigned int i;
    a >> i; if ( a ) std::cout << i << std::endl; else std::cout << "Conversion failure" << std::endl;
    b >> i; if ( b ) std::cout << i << std::endl; else std::cout << "Conversion failure" << std::endl;
    c >> i; if ( c ) std::cout << i << std::endl; else std::cout << "Conversion failure" << std::endl;
    return 0;
}
gives me an output of
4294967292
for the last line, as if a signed integer -4 had been read and converted to unsigned int.
Apparently, the GCC people see this as a feature. Is there some standard that mandates this behaviour, and is there any way, short of writing my own parser, to get out of it, i.e. to detect "-4" (and maybe "-0") as a conversion error?
Consulting C++03, 22.2.2.1.2/11, the formats are inherited from scanf and friends, which in turn all say that the converted character sequence is "optionally signed", even for the ones with unsigned output. strtoul is the same.
So, I suppose you could say the standard that mandates the behavior is C89 for C++03, C99 for C++11.
Since the - happens to be allowed only as the first character, I suppose that the workaround is to check for it with peek before using operator>>.
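A sketch of that workaround (the helper name is made up; it skips leading whitespace the way operator>> would, then rejects a leading minus):
#include <iostream>
#include <sstream>

// Extract an unsigned int, but treat a leading '-' as a failure.
std::istream& read_unsigned( std::istream& is, unsigned int& out )
{
    is >> std::ws;                  // skip whitespace, as operator>> would
    if ( is.peek() == '-' )
        is.setstate( std::ios_base::failbit );
    else
        is >> out;
    return is;
}

int main()
{
    std::istringstream c( "-4" );
    unsigned int i = 0;
    if ( read_unsigned( c, i ) ) std::cout << i << std::endl;
    else                         std::cout << "Conversion failure" << std::endl;
}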
If I read 22.2.2.1.2/Table 5 correctly, it shows that extracting into an unsigned is equivalent to scanf with %u, which appears to also do the negative-to-positive conversion.
Would it really be different if you'd done this?
int i;
unsigned int u;
c >> i;
u = i;
std::cout << u;
It doesn't matter so much that operator>> tolerates the sign mismatch because the underlying C rules will allow a silent conversion in any case. You're not fundamentally adding any safety by "strengthening" the input operator.
That said, my gcc (4.3.5 on Solaris) says it's a conversion error.

C++ initialization of vector of structs

I am trying to make a keyword-recognizing subroutine under OSX Yosemite; see the listing below. I have run into a couple of strange things.
I am using the "playground" for making an MWE, and the project builds seemingly OK, but does not want to run:
"My Mac runs OS X 10.10.5, which is lower than String sort's minimum deployment target."
I do not even understand the message, and especially not what my code has to do with sorting.
Then I pasted the relevant code into my app, where the project was generated using CMake, and the same compiler and the same IDE, in the same configuration, produce the message
"Non-aggregate type 'vector<QKeys>' cannot be initialized with an initializer list"
in the vector<QKeys> QInstructions={..} construction.
When searching for similar error messages, I found several similar questions, and the suggested solutions use a default constructor, manual initialization, and the like. I wonder whether standard-conforming compact initialization is possible?
#include <iostream>
#include <string>
#include <cstring>
#include <vector>
using namespace std;

enum KeyCode {QNONE=-1,
              QKey1=100, QKey2
};

struct QKeys
{                        /** The command code */
    std::string Instr;   ///< The command string
    unsigned int Length; ///< The significant length
    KeyCode Code;        //
};

vector<QKeys> QInstructions={
    {"QKey1",6,QKey1},
    {"QKey2",5,QKey2}
};

KeyCode FindCode(string Key)
{
    unsigned index = (unsigned int)-1;
    for(unsigned int i=0; i<QInstructions.size(); i++)
        if(strncmp(Key.c_str(),QInstructions[i].Instr.c_str(),QInstructions[i].Length)==0)
        {
            index = i;
            cout << QInstructions[i].Instr << " " << QInstructions[i].Length << " " << QInstructions[i].Code << endl;
            return QInstructions[i].Code;
        }
    return QNONE;
}

int main(int argc, const char * argv[]) {
    string Key = "QKey2";
    cout << FindCode(Key);
}
In your code
vector<QKeys> QInstructions={
    ("QKey1",6,QKey1),
    {"QKey2",5,QKey2}
};
the first line of data uses parentheses "()". Replace them with braces "{}" and it will work.
Also, I see you have written unsigned index = (unsigned int)-1;. This is actually well-defined (converting -1 to an unsigned type wraps modulo 2^N), but it is poor style because you are using a C-style cast (see here). You should replace it with:
unsigned index = std::numeric_limits<unsigned int>::max();
Finally, I found the right solution in
Initialize a vector of customizable structs within a header file. Unfortunately, replacing the parentheses did not help.
Concerning setting an unsigned int to its highest possible value using -1, I find it overkill to use std::numeric_limits<unsigned int>::max() for such a case, a kind of over-standardization. The assignment is in fact guaranteed by the standard: conversion to an unsigned type is defined modulo 2^N, independent of the two's-complement representation. For example, at
http://www.cplusplus.com/reference/string/string/npos/
you may read:
static const size_t npos = -1;
...
npos is a static member constant value with the greatest possible
value for an element of type size_t.
...
This constant is defined with a value of -1, which because size_t is
an unsigned integral type, it is the largest possible representable
value for this type.
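A quick compile-time check of that guarantee (a sketch; needs C++11 for static_assert):
#include <limits>

int main()
{
    // Conversion of -1 to an unsigned type is defined modulo 2^N,
    // so it always yields the maximum representable value.
    static_assert( static_cast<unsigned int>(-1)
                       == std::numeric_limits<unsigned int>::max(),
                   "-1 wraps to the maximum unsigned value" );
}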

Efficiently store array of up to 2048 characters?

I am getting input from another source, which populates a string of up to 2048 characters.
What is the most efficient way of populating and comparing this string? I also want to be able to append to the string easily.
Here are three attempts of mine:
C-style version
#include <cstdio>
#include <cstring>
int main(void) {
    char foo[2048];
    foo[0]='a', foo[1]='b', foo[2]='c', foo[3]='\0'; // E.g.: taken from user-input
    puts(strcmp(foo, "bar")? "false": "true");
}
C++-style version 0
#include <iostream>
#include <string>
int main() {
    std::string foo;
    foo.reserve(2048);
    foo += "abc"; // E.g.: taken from user-input
    std::cout << std::boolalpha << (foo=="bar");
}
C++-style version 1
#include <iostream>
#include <string>
int main() {
    std::string foo;
    foo += "abc"; // E.g.: taken from user-input
    std::cout << std::boolalpha << (foo=="bar");
}
What is most efficient depends on what you optimize for.
Some common criteria:
1. Program Speed
2. Program Size
3. Working Set Size
4. Code Size
5. Programmer Time
6. Safety
The undoubted king for 1 and 2, and in your example probably also 3, is the C style.
For 4 and 5, it's C++ style 1.
Point 6 probably goes to the C++ styles, too.
Still, the proper mix of emphasis across these goals is what's called for, which imho favors C++ option 0.
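To illustrate why option 0 is a reasonable default: reserve(2048) pays for one allocation up front, after which appends stay within capacity (a sketch; the capacity check assumes a typical implementation):
#include <iostream>
#include <string>

int main() {
    std::string foo;
    foo.reserve(2048);                     // one up-front allocation
    std::string::size_type cap = foo.capacity();

    for (int i = 0; i < 2048; ++i)
        foo += 'x';                        // size never exceeds capacity,
                                           // so no reallocation is needed
    std::cout << std::boolalpha
              << (foo.capacity() == cap) << "\n"
              << (foo == std::string(2048, 'x')) << "\n";  // true, true
}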

Why compiler warns about implicit conversion in setprecision?

When I compile the following code, compiler gives me the warning:
"Implicit conversion loses integer precision: 'std::streamsize' (aka 'long') to 'int'".
I'm a little bit confused about this warning, since I am just trying to save the current precision so that I can restore the original value later.
#include <iomanip>
#include <iostream>
int main() {
    std::streamsize prec = std::cout.precision();
    std::cout << std::setprecision(prec);
}
What is the right way to save the precision value and set it back later in this case?
It looks like it's just an oversight in the standard specification.
ios_base::precision has two overloads, one that gets and one that sets the precision:
// returns current precision
streamsize precision() const;
// sets current precision and returns old value
streamsize precision(streamsize prec);
So this code will not give you warnings:
#include <iostream>
int main() {
    std::streamsize prec = std::cout.precision(); // gets
    std::cout.precision(prec);                    // sets
}
However, the setprecision() function simply takes a plain old int:
unspecified-type setprecision(int n);
and returns an unspecified functor, which when consumed by a stream str has the effect of:
str.precision(n);
In your case, streamsize is not an int (and does not have to be), hence the warning. The standard should probably be changed so that setprecision's parameter is not int, but streamsize.
You can either just call precision() yourself, as above, or assume int is sufficient and cast.
#include <iomanip>
#include <iostream>
int main() {
    std::streamsize prec = std::cout.precision();
    std::cout << std::setprecision(static_cast<int>(prec));
}
Edit: Apparently it was submitted to be fixed but reached no consensus (closed as not-a-defect).
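Since the original goal was save-and-restore, a small RAII guard sidesteps the cast entirely (a sketch; the class name is made up, and Boost's ios_state.hpp offers a ready-made ios_precision_saver along these lines):
#include <iostream>

// Saves a stream's precision and restores it on scope exit.
class precision_saver {
    std::ios_base&  stream_;
    std::streamsize saved_;
public:
    explicit precision_saver( std::ios_base& s )
        : stream_( s ), saved_( s.precision() ) {}
    ~precision_saver() { stream_.precision( saved_ ); }
};

int main() {
    {
        precision_saver guard( std::cout );
        std::cout.precision( 10 );              // takes a streamsize: no cast
        std::cout << 3.14159265358979 << "\n";  // printed with precision 10
    }                                           // precision restored here
    std::cout << 3.14159265358979 << "\n";      // back to the default (6)
}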

Issue with vector<bool> and printf

#include <vector>
#include <iostream>
#include <stdio.h>
using namespace std;
int main(int argc, const char *argv[])
{
    vector<bool> a;
    a.push_back(false);
    int t=a[0];
    printf("%d %d\n",a[0],t);
    return 0;
}
This code gives the output "5511088 1". I thought it would be "0 0".
Does anyone know why?
The %d format specifier is for arguments the size of an int, so printf expects two such arguments. However, you're providing it with one argument that isn't an int, but rather a special proxy object returned by vector<bool> that is convertible to bool.
This is basically causing the printf function to treat random bytes from the stack as part of the values, while in fact they aren't.
The solution is to cast the first argument to an int:
printf("%d %d\n", static_cast<int>(a[0]), t);
An even better solution would be to prefer streams over printf if at all possible, because unlike printf they are type-safe, which makes this kind of situation impossible:
cout << a[0] << " " << t << endl;
And if you're looking for a type-safe alternative for printf-like formatting, consider using the Boost Format library.
The %d format specifier is for the int type. So, try:
cout << a[0] << "\t" << t << endl;
The key to the answer is that vector<bool> isn't really a vector of bools. It's really a vector of proxy objects, which are translatable into ints and bools. This allows each bool to be stored as a single bit, for greater space efficiency (at the cost of speed efficiency), but causes a number of problems like the one seen here. This requirement was voted into the C++ Standard in a rash moment, and I believe most committee members now believe it was a mistake, but it's in the Standard and we're kind of stuck with it.
The problem is triggered by the vector specialization for bool.
The Standard Library defines a specialization of the vector template for bool. The description of this specialization indicates that the implementation should pack the elements so that every bool only uses one bit of memory. This is widely considered a mistake.
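The proxy behaviour is easy to observe directly (a sketch):
#include <vector>
#include <iostream>

int main()
{
    std::vector<bool> a( 1, false );

    // a[0] is not a bool&: it is a proxy object referring to one bit.
    // Writing through a copy of the proxy still writes into the vector.
    std::vector<bool>::reference r = a[0];
    r = true;
    std::cout << std::boolalpha << a[0] << std::endl;  // true

    bool b = a[0];  // converts the proxy to a real bool...
    b = false;      // ...so this copy is now independent of the vector
    std::cout << a[0] << std::endl;                    // still true
}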
Basically each element of a std::vector<bool> uses 1 bit instead of 1 byte, so passing one straight to printf is undefined behavior.
If you are really set on printf, you can work around the issue by redefining bool as char, so that each element is stored as a whole byte again, and printing it as an integer with %d (implicit conversion: 1 for true and 0 for false).
#include <vector>
#include <iostream>
#include <stdio.h>
#define bool char // blunt workaround: vector<bool> is now really vector<char>
using namespace std;
int main(int argc, const char *argv[])
{
    vector<bool> a;
    a.push_back(false);
    int t = a[0];
    printf("%d %d\n", a[0], t); // prints "0 0": char promotes to int
    return 0;
}
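If redefining a keyword with a macro feels too blunt (formally it isn't allowed once standard headers are involved), the same idea works without it: just store char explicitly (a sketch):
#include <vector>
#include <stdio.h>

int main(int argc, const char *argv[])
{
    std::vector<char> a;           // one whole byte per element, no proxy
    a.push_back( false );
    int t = a[0];
    printf( "%d %d\n", a[0], t );  // char promotes to int in varargs: "0 0"
    return 0;
}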