Improper printing of uint8_t variable [duplicate] - c++

This question already has answers here:
uint8_t can't be printed with cout
(8 answers)
Closed 6 years ago.
I am trying to read a small integer value (less than 10) to a uint8_t variable. I do it like this
uint8_t myID = atoi(argv[5]);
However when I do this
std::cout << "My ID is "<< myID <<std::endl;
It prints some non-alphanumeric character. There is no issue when myID is of type int. I tried casting explicitly by doing
uint8_t myID = (uint8_t)atoi(argv[5]);
But the results are the same. Could anyone explain why this is the case and if there is any possible solution?

uint8_t is not a separate data type. On systems that provide it, it is an alias for some standard data type, most commonly unsigned char.
Operator << provides an overload that takes unsigned char and prints it as a character. To print your uint8_t variable as a number, cast it to an int first:
std::cout << "My ID is "<< int(myID) <<std::endl;
// ^^^^^

That's because on your platform, uint8_t is a typedef for an unsigned char.
And the ostream overloaded << for an unsigned char outputs a character, rather than a number, since the clever C++ folk thought that to be sensible. It normally is.
You can fix this by casting to an int, which can always hold any uint8_t value.
(Note that prior to C++20, a signed char could use ones' complement or sign-magnitude representation rather than two's complement, so its bit pattern for negative values could differ from the corresponding uint8_t one.)

Related

printing the bits of float

I know. I know. This question has been answered before, but I have a slightly different and a bit more specific question.
My goal is, as the title suggests, to cout the 32-bit sequence of a float.
And the solution provided in the previous questions is to use a union.
union ufloat {
    float f;
    uint32_t u;
};
This is all good and well. And I have managed to print the bits of a floating number.
union ufloat uf;
uf.f = numeric_limits<float>::max();
cout << bitset<32>(uf.f) << "\n"; // This gives me 0x00000000, which is wrong.
cout << bitset<32>(uf.u) << "\n"; // This gives me the correct bit sequence.
My question is: why doesn't bitset<32>(uf.f) work, but bitset<32>(uf.u) does?
The green-ticked answer of this question
Obtaining bit representation of a float in C says something about "type punning", and I presume it's got something to do with that. But I am not sure how exactly.
Can someone please clarify? Thanks
The constructor of std::bitset you are calling is:
constexpr bitset( unsigned long long val ) noexcept;
When you do bitset<32>(uf.f), you are converting a float to an unsigned long long value. For numeric_limits<float>::max(), that's undefined behavior because it's outside the range of the destination type, and in your case it happened to produce zero.
When you do bitset<32>(uf.u), you are again relying on undefined behavior, and in this case it happens to do what you want: convert a uint32_t that contains the bits of a float to an unsigned long long value.
In C++, what you should do instead is use memcpy:
uint32_t u;
std::memcpy(&u, &uf.f, sizeof(u));
std::cout << std::bitset<32>(u) << "\n";
This question is tagged C++, but the linked one is about C.
Those are different languages, with different rules. In particular, as mentioned in the comments by Yksisarvinen:
Type punning is a way to read bytes of memory as if they were different type than they really are. This is possible to do through union in C, but in C++ reading anything but the last written to member of union is Undefined Behaviour.
Since C++20, we can use std::bit_cast.
auto f{ std::numeric_limits<float>::max() }; // -> 3.40282e+38
auto u{ std::bit_cast<unsigned>(f) }; // -> 7f7fffff
See e.g. https://godbolt.org/z/d9T3G6qGx.

Given a variable whose type is `uint16_t`, is there any difference between (int16_t) uVal and *(int16_t*)&uVal?

It seems that they are equivalent, but I can't figure out why.
Here is related code snippet:
#include <cstdint>
#include <iostream>

void foo(uint16_t uVal)
{
    int16_t auxVal1 = (int16_t) uVal;
    int16_t auxVal2 = *(int16_t*)&uVal;
    std::cout << auxVal1 << std::endl;
    std::cout << auxVal2 << std::endl;
    std::cout << (uint16_t)auxVal1 << std::endl;
    std::cout << *(uint16_t*)&auxVal2 << std::endl;
}

int main()
{
    foo(0xFFFF);
    std::cout << std::endl;
    foo(1);
}
Here is the output:
-1
-1
65535
65535
1
1
1
1
(int16_t)uVal converts the uint16_t value to an int16_t value. For 1 this works as expected; for 0xFFFF the result was implementation-defined before C++20, because 0xFFFF does not fit in the range of int16_t (https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_conversions). Since C++20, the conversion is defined to wrap modulo 2^16, so it produces the expected value (see below).
*(int16_t*)&uVal first casts the uint16_t* pointer to an int16_t* pointer, and then dereferences it. With the C-style pointer cast, the expression is equivalent to *reinterpret_cast<int16_t*>(&uVal) (https://en.cppreference.com/w/cpp/language/explicit_cast). static_cast is not possible because uint16_t and int16_t are unrelated types.
The two types are not "similar types", but dereferencing the resulting int16_t* pointer is nevertheless allowed by the type-aliasing rules, which make an explicit exception for the signed/unsigned counterpart of a type (https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing); this applies whenever int16_t is the signed type corresponding to uint16_t's underlying type, which is the usual case. If the two aliases named unrelated extended integer types instead, the access would be undefined behavior.
In practice, the first expression converts the uint16_t value into an int16_t, whereas the second expression accesses the raw uint16_t object as if it were an int16_t, without modifying the bits.
This results in the same value because of the way signed integers are stored in two's complement: values representable in both types have the same bit pattern in each. 0x0001 means 1 for both, but 0xFFFF (all one bits) means 65535 for a uint16_t and -1 for an int16_t.
So (int16_t)uVal (and likewise (uint16_t)sVal) does not need to modify the bit pattern at all for values in the range of both types, and on two's-complement hardware the out-of-range cases are handled the same way, by leaving the bits untouched; this is exactly the behavior that C++20 made mandatory.
A way to get the effect of the second expression (accessing the raw bits as another type) that sidesteps the aliasing rules entirely is to use memcpy: int16_t sVal; std::memcpy(&sVal, &uVal, sizeof sVal);.

std::byte's to_integer<uint8_t> interprets to char instead of integer

I was playing around with C++17 std::byte, and I came across some weird behaviour.
I don't know if it is intended or not, or if I am doing something wrong.
std::byte test_byte{80};
std::cout << "Test: " << std::to_integer<uint8_t>(test_byte) << " vs " << std::to_integer<uint16_t>(test_byte) << "\n";
This will print out:
Test: P vs 80
I then looked up the ASCII table and found that uppercase P's numerical value is 80.
My question is if this is intended, or if it's a bug, or maybe even OS/Compiler specific?
Running Windows 10 and compiling with VS Build Tools.
My question is if this is intended
Sort of.
if it's a bug
No. It's just a slight inconvenience in the API.
or maybe even OS/Compiler specific?
No.
This has nothing to do with std::to_integer. The issue is that the integer type that is 8 bits wide on your system happens to be (unsigned) char. And this integer type happens to also be a character type.
And integers that are character types are treated differently by character streams from integers that aren't. Specifically, the behaviour is to print the character encoded by that integer, rather than the textual representation of the value.
The solution is to convert the uint8_t into a wider integer type such as unsigned int before inserting it into a character stream.
Compare these two lines of code:
unsigned char c = 80;
std::cout << c << '\n';
std::cout << +c << '\n';
This should help you to understand what is going on!
In the first case, it will print the character whose code is that value; exactly which glyph appears (if any) depends on the platform and character encoding in use. Some values are control characters rather than printable ones: value 7 (BEL), for instance, may make the terminal beep instead of printing anything.
In the second case, it will print the actual value 80 to the console.
What is happening here is that the stream's operator<< treats unsigned char differently from conventional signed or unsigned integral types, because it is a character type. The same applies to char and signed char, which have character-printing overloads of their own.
The reason the value 80 is printed in the second case is the unary operator+() prefixed to the unsigned char. This causes integer promotion, so the operand is streamed as an int.
This side effect carries over to any type that is a typedef or alias of unsigned char.
In your case, you are seeing P being printed because uint8_t is exactly such an alias: converting to uint8_t is converting to unsigned char!

Are int8_t and uint8_t really integers? What are their use? [duplicate]

This question already has answers here:
uint8_t can't be printed with cout
(8 answers)
Closed 3 years ago.
What exactly is uint8_t made for, if it remains indistinguishable from unsigned char and cannot be used to overload functions separately?
I found many answers in this post : std::cout deal with uint8_t as a character
So I've re-edited the question entirely and re-titled it.
On all systems with 8-bit bytes, they are variants of char. This includes the type-aliases for e.g. uint8_t (which is a type-alias for unsigned char on such a system).
And no matter what type-alias you have, char (and unsigned char and signed char; yes, those are three distinct types) will be treated as characters by the stream output operator <<.
If you want to print the integer value of any char based type you need to cast it to e.g. int. As in
std::cout << static_cast<unsigned>(my_uint8_t_var) << '\n';
As a side-note: there are systems that don't have 8-bit bytes. On such systems type-aliases like uint8_t are not possible and do not exist. If you look at e.g. this fixed-width integer reference you will see that the exact fixed-width integer types are optional.
uint8_t is an 8-bit unsigned integer, for storing values from 0 to 255. It usually has the same range as unsigned char, but it is one of several explicitly sized integer types available in <cstdint>.

std::cout deal with uint8_t as a character

If I run this code:
std::cout << static_cast<uint8_t>(65);
It will output:
A
Which is the ASCII equivalent of the number 65.
This is because uint8_t is simply defined as:
typedef unsigned char uint8_t;
Is this behavior standard?
Shouldn't there be a better way to define uint8_t that guarantees it is dealt with as a number, not a character?
I cannot understand the logic that, if I want to print the value of a uint8_t variable, it gets printed as a character.
P.S. I am using MSVS 2013.
Is this behavior standard?
The behavior is standard in the sense that, if uint8_t is a typedef of unsigned char, it will always print as a character: std::ostream has an overload for unsigned char that prints the contents of the variable as a character.
Shouldn't there be a better way to define uint8_t that guarantees it is dealt with as a number, not a character?
In order to do this the C++ committee would have had to introduce a new fundamental type. Currently the only fundamental types with a sizeof() equal to 1 are char, signed char, and unsigned char. They could conceivably have used bool, but bool is not required to have a size of 1, and you would still be in the same boat, since
#include <iostream>

int main()
{
    bool foo = 42;
    std::cout << foo << '\n';
}
will print 1, not 42, because any non-zero value converts to true, and true is printed as 1 by default.
I'm not saying it can't be done, but it is a lot of work for something that can be handled with a cast or a function.
C++17 introduces std::byte, which is defined as enum class byte : unsigned char {};. So it is one byte wide, but it is not a character type. Since it is an enum class it comes with its own limitations: the bit-wise operators have been defined for it, but there are no built-in stream operators, so you need to define your own for input and output. That means you are still converting, but at least you won't conflict with the built-in operators for unsigned char. That gives you something like
#include <cstddef>
#include <iostream>

std::ostream& operator <<(std::ostream& os, std::byte b)
{
    // Stream the numeric value, not a character.
    return os << std::to_integer<unsigned int>(b);
}

std::istream& operator >>(std::istream& is, std::byte& b)
{
    unsigned int temp;
    is >> temp;
    b = static_cast<std::byte>(temp); // truncates to the low 8 bits
    return is;
}

int main()
{
    std::byte foo{10};
    std::cout << foo; // prints 10
}
Posting an answer as there is some misinformation in comments.
uint8_t may or may not be a typedef for unsigned char. It is also possible for it to be an extended integer type (and so, not a character type at all).
Compilers may offer other integer types besides the minimum set required by the standard (short, int, long, etc). For example some compilers offer a 128-bit integer type.
This would not "conflict with C" either, since C and C++ both allow for extended integer types.
So, your code has to allow for both possibilities. The suggestion in comments of using unary + would work.
Personally I think it would make more sense if the standard required uint8_t to not be a character type, as the behaviour you have noticed is unintuitive.
It's indirectly standard behavior, because ostream has an overload for unsigned char, and uint8_t is evidently a typedef for unsigned char on your system.
§27.7.3.1 [output.streams.ostream] gives:
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char);
I couldn't find anything in the standard that explicitly states that uint8_t and unsigned char have to be the same, though. It's just reasonable, since both occupy 1 byte in nearly all implementations.
std::cout << std::boolalpha << std::is_same<uint8_t, unsigned char>::value << std::endl; // prints true
To get the value to print as an integer, you need a type that is not unsigned char (or one of the other character types with stream overloads). Probably a simple cast to uint16_t is adequate, because the standard doesn't list an overload for it:
uint8_t a = 65;
std::cout << static_cast<uint16_t>(a) << std::endl; // prints 65