Why int8 works like char (C++) (Visual Studio 2015)

I'm trying to find the smallest int type in C++. I read that there are no types which use less than a byte, because a byte is the minimum addressable size. The smallest one I found is int8_t, or the Visual Studio __int8 type. I tried to use both, but the program stores the values as chars. Is there a way to use these types as integers? Or even better, is there a way to use smaller types (2 bits (signed) would be perfect XD, I need to store only -1, 0 and 1), or any other numeric byte type?
Thanks in advance

The standard gives you the following types:
signed char
unsigned char
(with char being equivalent to one of these)
And the fixed-width int8_t and uint8_t are, on a system with 8-bit bytes, simply aliases for these.
They are all integers already.
Your statement "but the program stores the values as chars" suggests a misunderstanding of what values with these types fundamentally are. This may be due to the following.
The only "problem" you may encounter is the special-case formatting for these types in the IOStreams sublibrary:
const char c = 40;
std::cout << c << '\n';
// Output: "("
Since int8_t is/may be signed char, this is literally the same:
const int8_t c = 40;
std::cout << c << '\n';
// Output: "("
But this is easily "fixable" with a little integral promotion:
const char c = 40;
std::cout << +c << '\n';
// Output: 40
Now the stream gets an int, and lexically-casts it as part of its formatting routine.
This has to do with the type system and the function overloads provided by the standard library; it has nothing to do with "how the values are stored". They are all integers.
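For completeness, a short hedged sketch of both workarounds, unary plus and an explicit cast (assuming <cstdint> and <iostream>); either one hands the stream an int, so the numeric value is printed:
#include <cstdint>
#include <iostream>

int main()
{
    const std::int8_t c = 40;
    std::cout << +c << '\n';                  // integral promotion: prints 40
    std::cout << static_cast<int>(c) << '\n'; // explicit cast: also prints 40
}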

I tried to use both, but the program stores the values as chars.
It doesn't (store the values as chars). In C++, a char value is an integer value. The difference from other integer types lies in two places:
the compiler converts char literals to integers transparently
the standard library treats the char type separately from other integer types (that is, it considers char to represent text characters, not numbers).
As far as the compiler is concerned, the code:
char x = 'A';
is equivalent to:
char x = 65; // set value to 65 (ASCII code for letter "A")
If you look in the debugger at the second definition of x, it is likely the debugger/IDE will tell you x is 'A'.
Or even, if there are a way to use smaller types (2 bits (signed) would be perfect XD, I need to store only -1, 0 and 1), or any other numeric byte type.
There is no integer type with a representation smaller than 8 bits.
That said, you can (and probably should) create a custom type for it:
enum class your_type : std::int8_t  // signed underlying type, since -1 must be representable
{
    negative = -1,
    neutral  = 0,
    positive = 1
};
Choose values that make sense in the context of your program (i.e. not "negative, neutral and positive").
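A brief hedged usage sketch (reusing the enum above with a signed underlying type; the to_int helper is made up here for illustration):
#include <cstdint>
#include <iostream>

enum class your_type : std::int8_t { negative = -1, neutral = 0, positive = 1 };

// hypothetical helper: recover the numeric value for printing or arithmetic
constexpr int to_int(your_type t) { return static_cast<int>(t); }

int main()
{
    your_type t = your_type::negative;
    std::cout << to_int(t) << '\n';          // prints -1
    std::cout << sizeof(your_type) << '\n';  // prints 1: one byte per value
}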

You can use a bit field with two bits.

With MSVC you could use #pragma pack(1) and structs to obtain this.
#include <iostream>

#pragma pack(1)
struct int2 {
    int value : 2;   // 2-bit signed bit field
};
#pragma pack()

int main()
{
    int2 a;
    for (int i = 0; i < 10; i++)
    {
        a.value = i;
        std::cout << a.value << std::endl;
    }
    return 0;
}
Output:
0
1
-2
-1
0
1
-2
-1
0
1
I guess building constructors and operator functions to access the value indirectly shouldn't be a problem.
EDIT: Improved version with operators:
#pragma pack(1)
template <typename INT_TYPE, int BIT_SIZE>
class int_x {
    INT_TYPE value : BIT_SIZE;   // bit field holding the actual value
public:
    int_x() : value(0) {}
    int_x(INT_TYPE v) : value(v) {}
    operator INT_TYPE() const {
        return value;
    }
    int_x& operator=(INT_TYPE v) {
        value = v; return *this;
    }
    int_x& operator+=(INT_TYPE v) {
        value += v; return *this;
    }
    int_x& operator-=(INT_TYPE v) {
        value -= v; return *this;
    }
    int_x& operator/=(INT_TYPE v) {
        value /= v; return *this;
    }
    int_x& operator*=(INT_TYPE v) {
        value *= v; return *this;
    }
};
typedef int_x<int, 1>          int1;  //Range [-1;0]
typedef int_x<unsigned int, 1> uint1; //Range [0;1]
typedef int_x<int, 2>          int2;  //Range [-2;1]
typedef int_x<unsigned int, 2> uint2; //Range [0;3]
typedef int_x<int, 3>          int3;  //Range [-4;3]
typedef int_x<unsigned int, 3> uint3; //Range [0;7]
#pragma pack()
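A hedged usage sketch of the int2 typedef above (it assumes the template and typedefs are in scope plus <iostream>; overflowing a signed bit field is implementation-defined, so the comments describe typical two's-complement behaviour):
#include <iostream>

int main()
{
    int2 v = 1;               // fits: the 2-bit signed range is [-2;1]
    v += 1;                   // 2 does not fit; the result is implementation-defined
    std::cout << v << '\n';   // commonly prints -2 (two's-complement wrap)
    std::cout << sizeof(int2) << '\n'; // at least 1; exact value depends on compiler and packing
}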

Related

Can an enum be reduced to its bit size in C++?

Given the following - can I get sizeof(A) to be 1? Right now I'm getting 8, but I'd like A to be equal in layout to Z - as the enum only has one bit of data.
#include <iostream>

enum BOOL { x, y };

struct A {
    BOOL b : 1;
    unsigned char c : 7;
};

struct Z {
    unsigned char r : 1;
    unsigned char c : 7;
};

int main()
{
    A b;
    b.b = x;
    std::cout << b.b << "," << sizeof(A) << "," << sizeof(Z) << std::endl;
    return 0;
}
The issue here is that BOOL will use an int as the underlying type by default. Since it uses an int, it is padding the struct out to have a size of 8 as that will keep the int part of the struct nicely aligned.
What you can do though is specify that you don't want an int, but instead want an unsigned char so that it can pack both bitfields in a single member. This isn't guaranteed, but makes it much more likely to happen. Using
enum BOOL : unsigned char { x , y};
makes A have a size of 1 in GCC, Clang, and MSVC.
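A minimal sketch of that fix (assuming C++11 for the fixed underlying type; the static_assert documents the expectation rather than a guarantee, since bit-field packing is implementation-defined):
#include <iostream>

enum BOOL : unsigned char { x, y };  // 1-byte underlying type

struct A {
    BOOL b : 1;
    unsigned char c : 7;
};

static_assert(sizeof(A) == 1, "expected both bit fields to share one byte");

int main()
{
    A a{};
    a.b = y;
    std::cout << a.b << "," << sizeof(A) << '\n';  // typically prints "1,1"
}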
You can use bool as the underlying type of the enum:
enum BOOL : bool { x , y};
Given this, on my system, sizeof(A) is 1. I don't think that is guaranteed given that much of bit field structure is implementation defined, and bool itself is technically not guaranteed to have size 1.
Using unsigned char is another alternative, which may pack better with the adjacent unsigned char bit-field member on some implementations. Unfortunately, GCC for example warns "'A::b' is too small to hold all values of 'enum BOOL'", which is technically a false positive, since one bit is sufficient to represent the values 0 and 1.

std::cout treats uint8_t as a character

If I run this code:
std::cout << static_cast<uint8_t>(65);
It will output:
A
Which is the ASCII equivalent of the number 65.
This is because uint8_t is simply defined as:
typedef unsigned char uint8_t;
Is this behavior standard?
Shouldn't there be a better way to define uint8_t that is guaranteed to be treated as a number, not a character?
I cannot understand the logic: if I want to print the value of a uint8_t variable, why is it printed as a character?
P.S. I am using MSVS 2013.
Is this behavior standard?
The behavior is standard in the sense that if uint8_t is a typedef of unsigned char, it will always print as a character, because std::ostream has an overload for unsigned char that prints the contents of the variable as a character.
Shouldn't there be a better way to define uint8_t that is guaranteed to be treated as a number, not a character?
In order to do this the C++ committee would have had to introduce a new fundamental type. Currently the only types with a sizeof() equal to 1 are char, signed char, and unsigned char. They could possibly have used bool, but bool does not have to have a size of 1, and you would still be in the same boat, since
#include <iostream>

int main()
{
    bool foo = 42;
    std::cout << foo << '\n';
}
will print 1, not 42, because any non-zero value is true and true is printed as 1 by default.
I'm not saying it can't be done, but it is a lot of work for something that can be handled with a cast or a small helper function.
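For illustration, a hedged sketch of such a helper (the name as_number is made up; it simply relies on integral promotion, so any character-sized integer prints as a number):
#include <cstdint>
#include <iostream>

// hypothetical helper: promote character-sized integers to int for printing
template <typename T>
constexpr auto as_number(T v) -> decltype(+v) { return +v; }

int main()
{
    std::uint8_t b = 65;
    std::cout << as_number(b) << '\n';  // prints 65, not 'A'
}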
C++17 introduces std::byte, which is defined as enum class byte : unsigned char {};. So it is one byte wide, but it is not a character type. Unfortunately, since it is an enum class it comes with its own limitations. The bitwise operators have been defined for it, but there are no built-in stream operators, so you would need to define your own to input and output it. That means you are still converting it, but at least you won't conflict with the built-in operators for unsigned char. That gives you something like:
#include <cstddef>
#include <iostream>

std::ostream& operator<<(std::ostream& os, std::byte b)
{
    return os << std::to_integer<unsigned int>(b);
}

std::istream& operator>>(std::istream& is, std::byte& b)
{
    unsigned int temp;
    is >> temp;
    b = static_cast<std::byte>(temp);  // convert the number that was read back into a std::byte
    return is;
}

int main()
{
    std::byte foo{10};
    std::cout << foo;   // prints 10
}
Posting an answer as there is some misinformation in comments.
The uint8_t may or may not be a typedef for char or unsigned char. It is also possible for it to be an extended integer type (and so, not a character type).
Compilers may offer other integer types besides the minimum set required by the standard (short, int, long, etc). For example some compilers offer a 128-bit integer type.
This would not "conflict with C" either, since C and C++ both allow for extended integer types.
So, your code has to allow for both possibilities. The suggestion in comments of using unary + would work.
Personally I think it would make more sense if the standard required uint8_t to not be a character type, as the behaviour you have noticed is unintuitive.
It's indirectly standard behavior, because ostream has an overload for unsigned char, and on your system uint8_t is a typedef for that same unsigned char type.
§27.7.3.1 [output.streams.ostream] gives:
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char);
I couldn't find anywhere in the standard that explicitly states that uint8_t and unsigned char have to be the same type, though. It's just reasonable that they are, since both occupy 1 byte on nearly all implementations.
std::cout << std::boolalpha << std::is_same<uint8_t, unsigned char>::value << std::endl; // prints true
To get the value to print as an integer, you need a type that is not unsigned char (or one of the other character overloads). Probably a simple cast to uint16_t is adequate, because the standard doesn't list an overload for it:
uint8_t a = 65;
std::cout << static_cast<uint16_t>(a) << std::endl; // prints 65

Division of integers in C++ not working as expected

I'm new here, so really sorry if this is too basic, but what am I missing here? This is just a dummy code example:
#include <iostream>

using namespace std;

int main() {
    unsigned int a, b, c;
    int d;

    a = 10E06;
    b = 25E06;
    c = 4096;
    d = (a - b) / c;

    std::cout << d << std::endl;
    return 0;
}
cout is printing 1044913 instead of -3662. If I cast a and b to long the problem is solved. Is there a problem of overflow or something?
That's because (a-b) itself is unsigned:
#include <iostream>

using namespace std;

int main() {
    unsigned int a, b, c;
    int d;

    a = 10E06;
    b = 25E06;
    c = 4096;
    d = (a - b) / c;

    std::cout << (a - b) << std::endl; // 4279967296
    std::cout << d << std::endl;       // 1044913
    return 0;
}
The conversion from unsigned to int happens when d is assigned to, not before.
So (a-b)/c must be unsigned since a,b,c are.
Operations between unsigned numbers yield unsigned numbers. It's up to you to make sure the operations make sense, or protect against the opposite.
If you have unsigned int a = 2, b = 3;, what do you think the value of a-b would be?
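A hedged illustration of that wrap-around (assuming a platform with 32-bit unsigned int):
#include <iostream>

int main()
{
    unsigned int a = 2, b = 3;
    // a - b wraps modulo 2^32 instead of producing -1
    std::cout << a - b << '\n';  // prints 4294967295 with a 32-bit unsigned int
}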
Since a, b, and c are all declared unsigned, the result of the computation (a - b) / c will be unsigned. Since the value you expect cannot be represented by an unsigned type, things get a little messy. The unsigned value is then assigned to d, and even though d is signed, the value is already garbled.
I will also note that the notation 10E06 denotes a floating-point number that is then implicitly converted to an unsigned int. Depending on the particular floating-point value, this may or may not convert as expected.
You want your result to be signed, so you should declare your variables as signed int, or just int. That will give the desired result. If you cast a and b to long, a - b will be long and hence signed. The following is a solution:
#include <iostream>

int main() {
    int a, b, c;
    int d;

    a = 10E06;
    b = 25E06;
    c = 4096;
    d = (a - b) / c;

    std::cout << d << std::endl;
    return 0;
}
If you also want fractional results you should use double or float (although for this particular case the only further change is that the fractional part is kept, printing roughly -3662.11):
#include <iostream>

int main() {
    double a, b, c;
    double d;

    a = 10E06;
    b = 25E06;
    c = 4096;
    d = (a - b) / c;

    std::cout << d << std::endl;
    return 0;
}
Because of the way C++ (and many other C-based languages) handle operators, when unsigned numbers are put into an expression, that expression yields an unsigned value; it is not held in some mysterious in-between state, as you might expect.
Step-by-step:
(a - b) subtracts 25E06 from 10E06, which would mathematically give -15E06; but since the operands are unsigned, the result wraps around to a large positive value.
This wrapped value is then divided by c, and since both operands are unsigned, the result is also unsigned.
Lastly, this is stored into a signed int, remaining 1044913.
"unsigned int" is a type just like float and bool, even though it requires two keywords. If you want it to turn into a signed int for that calculation, you must either make sure a, b, and c are all signed (remove the unsigned keyword), or cast them as such when putting them into the expression, like this: d = ((signed)a - (signed)b) / (signed)c;

Is there a way to make `enum` type to be unsigned?

Is there a way to make an enum type unsigned? The following code gives me a warning about signed/unsigned comparison.
#include <cstddef>
#include <iostream>

enum EEE {
    X1 = 1
};

int main()
{
    size_t x = 2;
    EEE t = X1;
    if (t < x) std::cout << "ok" << std::endl;
    return 0;
}
I've tried to force compiler to use unsigned underlying type for enum with the following:
enum EEE {
    X1 = 1,
    XN = 18446744073709551615LL
    // I've tried XN = UINT_MAX (in Visual Studio). Same warning.
};
But that still gives the warning.
Changing the constant to UINT_MAX makes it work in GNU C++, as it should according to the standard. It seems to be a bug in VS. Thanks to James for the hint.
You might try:
enum EEE {
    X1 = 1,
    XN = -1ULL
};
Without the U, the integer literal is signed.
(This of course assumes your implementation supports long long; I assume it does since the original question uses LL; otherwise, you can use UL for a long).
Not in the current version of C++. C++0x will provide strongly typed enums.
For the time being, you can use if ( static_cast<size_t>(t) < x ) to remove the warning.
You could also overload the operators if you want to compare it:
enum EEE {
    X1 = 1
};

bool operator<(EEE e, std::size_t u) {
    return (int)e < (int)u;
}
However, you have to do that dance for every integer type on the right side. Otherwise, if you write e < 2, the call would be ambiguous: the compiler could use your operator<, which matches the left side exactly but needs a conversion on the right side, or its built-in operator, which needs a promotion for the left side and matches the right side exactly.
So ultimately, I would provide the following versions:
/* everything "shorter" than "int" uses either int or unsigned */
bool operator<(EEE e, int u) {
return (int)e < (int)u;
}
bool operator<(EEE e, unsigned u) {
return (unsigned int)e < (unsigned int)u;
}
bool operator<(EEE e, long u) {
return (long)e < (long)u;
}
bool operator<(EEE e, unsigned long u) {
return (unsigned long)e < (unsigned long)u;
}
/* long long if your compiler has it, too */
Not very nice :) But at least the users of your enumeration have it easy. However, if you ultimately don't want to compare against an arbitrary integer but against some meaningful value, I would do what another answer proposed: add another enumerator with the value 2 and give it a name. That way, the warnings go away too.
According to Are C++ enums signed or unsigned?
your compiler gets to choose whether enum is signed or not, though there are some comments saying that in C++0x you will be able to specify that it is unsigned.
Per C++ Enumeration Declarations on MSDN:
enum EEE : unsigned {
    X1 = 1
};
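With that C++11 fixed underlying type, a hedged sketch of the original comparison, which should no longer trigger the signed/unsigned warning:
#include <cstddef>
#include <iostream>
#include <type_traits>

enum EEE : unsigned {
    X1 = 1
};

static_assert(std::is_same<std::underlying_type<EEE>::type, unsigned>::value,
              "EEE is backed by unsigned int");

int main()
{
    std::size_t x = 2;
    EEE t = X1;
    if (t < x) std::cout << "ok" << std::endl;  // unsigned vs. unsigned: no mismatch warning expected
    return 0;
}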
Why not
enum EEE {
    X1 = 1,
    x = 2 // pick more descriptive name, a'course
};
or
if ( size_t( t ) < x )

C++ binary constant/literal

I'm using a well-known template to allow binary constants:
template< unsigned long long N >
struct binary
{
    enum { value = (N % 10) + 2 * binary< N / 10 >::value };
};

template<>
struct binary< 0 >
{
    enum { value = 0 };
};
So you can do something like binary<101011011>::value. Unfortunately this has a limit of 20 digits for an unsigned long long.
Does anyone have a better solution?
Does this work if you have a leading zero on your binary value? A leading zero makes the constant octal rather than decimal.
Which leads to a way to squeeze a couple more digits out of this solution - always start your binary constant with a zero! Then replace the 10's in your template with 8's.
The approaches I've always used, though not as elegant as yours:
1/ Just use hex. After a while, you just get to know which hex digits represent which bit patterns.
2/ Use constants and OR or ADD them. For example (may need qualifiers on the bit patterns to make them unsigned or long):
#define b0 0x00000001
#define b1 0x00000002
: : :
#define b31 0x80000000
unsigned long x = b2 | b7;
3/ If performance isn't critical and readability is important, you can just do it at runtime with a function such as "x = fromBin("101011011");".
4/ As a sneaky solution, you could write a pre-pre-processor that goes through your *.cppme files and creates the *.cpp ones by replacing all "0b101011011"-type strings with their equivalent "0x15b" strings). I wouldn't do this lightly since there's all sorts of tricky combinations of syntax you may have to worry about. But it would allow you to write your string as you want to without having to worry about the vagaries of the compiler, and you could limit the syntax trickiness by careful coding.
Of course, the next step after that would be patching GCC to recognize "0b" constants but that may be an overkill :-)
C++0x has user-defined literals, which could be used to implement what you're talking about.
Otherwise, I don't know how to improve this template.
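For reference, a hedged sketch of what such a user-defined literal could look like (C++11 syntax; the suffix _b and the parse_binary helper are made up here, and digit validation is omitted):
// hypothetical raw literal operator: 101011011_b == 347
constexpr unsigned long long parse_binary(const char* s, unsigned long long acc = 0)
{
    return *s == '\0' ? acc : parse_binary(s + 1, acc * 2 + (*s - '0'));
}

constexpr unsigned long long operator"" _b(const char* digits)
{
    return parse_binary(digits);
}

static_assert(101011011_b == 347, "binary literal");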
template<unsigned int p, unsigned int i> struct BinaryDigit
{
    enum { value = p * 2 + i };
    typedef BinaryDigit<value, 0> O;
    typedef BinaryDigit<value, 1> I;
};

struct Bin
{
    typedef BinaryDigit<0, 0> O;
    typedef BinaryDigit<0, 1> I;
};
Allowing:
Bin::O::I::I::O::O::value
much more verbose, but no limits (until you hit the size of an unsigned int of course).
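A quick hedged check of how the digits accumulate, using a C++11 static_assert for illustration (each ::O or ::I shifts the value left and appends a bit):
// Bin::O::I::I::O::O spells binary 01100, i.e. 12
static_assert(Bin::O::I::I::O::O::value == 12, "binary 01100");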
You can add more non-type template parameters to "simulate" additional bits:
// Utility metafunction used by top_bit<N>.
template <unsigned long long N1, unsigned long long N2>
struct compare {
    enum { value = N1 > N2 ? N1 >> 1 : compare<N1 << 1, N2>::value };
};

// This is hit when N1 grows beyond the size representable
// in an unsigned long long. Its value is never actually used.
template <unsigned long long N2>
struct compare<0, N2> {
    enum { value = 42 };
};

// Determine the highest 1-bit in an integer. Returns 0 for N == 0.
template <unsigned long long N>
struct top_bit {
    enum { value = compare<1, N>::value };
};

template <unsigned long long N1, unsigned long long N2 = 0>
struct binary {
    enum {
        value =
            (top_bit<binary<N2>::value>::value << 1) * binary<N1>::value +
            binary<N2>::value
    };
};

template <unsigned long long N1>
struct binary<N1, 0> {
    enum { value = (N1 % 10) + 2 * binary<N1 / 10>::value };
};

template <>
struct binary<0> {
    enum { value = 0 };
};
You can use this as before, e.g.:
binary<1001101>::value
But you can also use the following equivalent forms:
binary<100,1101>::value
binary<1001,101>::value
binary<100110,1>::value
Basically, the extra parameter gives you another 20 bits to play with. You could add even more parameters if necessary.
Because the place value of the second number is used to figure out how far to the left the first number needs to be shifted, the second number must begin with a 1. (This is required anyway, since starting it with a 0 would cause the number to be interpreted as an octal number.)
Technically it is neither C nor C++; it is a GCC-specific extension, but GCC allows binary constants, as seen here:
The following statements are identical:
i = 42;
i = 0x2a;
i = 052;
i = 0b101010;
Hope that helps. Some Intel compilers and I am sure others, implement some of the GNU extensions. Maybe you are lucky.
A simple #define works very well:
#define HEX__(n) 0x##n##LU
#define B8__(x) ((x&0x0000000FLU)?1:0)\
+((x&0x000000F0LU)?2:0)\
+((x&0x00000F00LU)?4:0)\
+((x&0x0000F000LU)?8:0)\
+((x&0x000F0000LU)?16:0)\
+((x&0x00F00000LU)?32:0)\
+((x&0x0F000000LU)?64:0)\
+((x&0xF0000000LU)?128:0)
#define B8(d) ((unsigned char)B8__(HEX__(d)))
#define B16(dmsb,dlsb) (((unsigned short)B8(dmsb)<<8) + B8(dlsb))
#define B32(dmsb,db2,db3,dlsb) (((unsigned long)B8(dmsb)<<24) + ((unsigned long)B8(db2)<<16) + ((unsigned long)B8(db3)<<8) + B8(dlsb))
B8(01110011)
B16(10011011,10011011)
B32(10011011,10011011,10011011,10011011)
Not my invention, I saw it on a forum a long time ago.
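As a quick sanity check, a hedged sketch using a C++11 static_assert (it assumes the macros above are in scope):
static_assert(B8(01110011) == 0x73, "8 binary digits");
static_assert(B16(10000000, 00000001) == 0x8001, "16 binary digits");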