Is there a fixed-width bool type in standard C++? - c++

As far as I could find, the width of the bool type is implementation-defined. But are there any fixed-width boolean types, or should I stick to, e.g., a uint8_t to represent a fixed-width bool?
[EDIT]
I made a Python script that auto-generates a C++ class which can hold the variables I want to send between a microcontroller and my computer. It also keeps two arrays holding a pointer to each of these variables and the sizeof each of them, which gives me the information needed to easily serialize and deserialize each variable. For this to work, however, the sizeof, endianness, etc. of the variable types have to be the same on both sides, since I'm using the same generated code on both sides.
I don't know if this will be a problem yet, but I don't expect it to be. I have already worked with this (32-bit ARM) chip before and haven't had problems sending integer and float types in the past. However, it will be a few days until I'm back and can try booleans out on the chip. This might become a bigger issue, since this code might be reused on other chips later.
So my question is: is there a fixed-width bool type defined in the standard library, or should I just use a uint8_t to represent the boolean?

There is not. Just use uint8_t if you need to be sure of the size. Any integer type can easily be treated as boolean in C-related languages. See https://stackoverflow.com/a/4897859/1105015 for a lengthy discussion of how bool's size is not guaranteed by the standard to be any specific value.
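For the serialization use case in the question, a minimal sketch (the helper names here are illustrative, not from any library) is to keep bool in the C++ interface but pin the wire representation to uint8_t, so both the MCU and the PC agree on the size:

```cpp
#include <cstdint>

// uint8_t, when it exists, is exactly one byte on every platform.
static_assert(sizeof(std::uint8_t) == 1, "uint8_t is exactly one byte");

// Hypothetical conversion helpers for a fixed one-byte wire format.
inline std::uint8_t bool_to_wire(bool b) { return b ? 1 : 0; }
inline bool wire_to_bool(std::uint8_t w) { return w != 0; }
```

Treating any nonzero byte as true on the receiving side also guards against a sender whose bool representation stores values other than 0 and 1.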

Related

C++ UE4 - bool vs. uint8 : 1 vs. uint32 : 1 - pros and cons of each?

So, I'm familiar with the concept of packing a bunch of Boolean values into single bits inside an integer (bit masking, I think it's called): you conserve memory because a Boolean is a byte, and you can fit more than one Boolean into a byte-long integer. Thus, if you have enough Booleans, packing them together can make a big difference, and we see that the native Unreal source code uses this particular optimization quite heavily. What I'm not clear on, however, is what the downsides are. There are places where many regular Booleans are used instead. Also, why are uint32 used in some places and uint8 in others? I've read there may be some read/write-related inefficiencies or something?
The biggest problem is that there is no pointer to a "packed bool": if you have an int32 that packs 32 booleans, you cannot form a bool* or bool& that properly refers to any one of them. This is because the byte is the minimal addressable unit of memory.
In the STL they made std::vector<bool>, which saves space and has, semantically, the same interface as other vectors. To do so they had to create a special proxy class that is returned from operator[], so one can write things like boolVec[5] = true. Unfortunately, this over-complication resulted in many performance and usability problems with std::vector<bool>.
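The proxy behavior is easy to observe: indexing a std::vector<bool> does not yield a plain bool, a small sketch of which is:

```cpp
#include <vector>
#include <type_traits>

// Demonstrates that std::vector<bool>::operator[] returns a proxy object,
// not a bool& -- reading through it requires a conversion to bool.
bool read_sixth_bit() {
    std::vector<bool> v(8, false);
    v[5] = true;                     // write goes through the proxy

    auto x = v[5];                   // x is the proxy type, NOT plain bool
    static_assert(!std::is_same<decltype(x), bool>::value,
                  "operator[] returns a proxy reference, not bool&");

    return static_cast<bool>(v[5]);  // converting the proxy yields a real bool
}
```

This is exactly why generic code that expects `T&` from `operator[]` breaks for `T = bool`.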
Even simple operations on packed booleans tend to be composite and thus heavier than if each boolean were represented via bool and took a whole byte. Additionally, modifying a packed boolean can cause data races in a multi-threaded environment, because neighboring bits share the same word.
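The "composite" cost is visible in code: setting one packed bit is a read-modify-write of the whole word, whereas storing a byte-sized bool is a single store. A minimal sketch:

```cpp
#include <cstdint>

// Writing one packed boolean: load the word, mask, store it back.
// Two threads doing this to different bits of the same word can race.
inline void set_flag(std::uint32_t& flags, unsigned bit, bool value) {
    if (value)
        flags |=  (std::uint32_t{1} << bit);   // load, OR, store
    else
        flags &= ~(std::uint32_t{1} << bit);   // load, AND-NOT, store
}

inline bool get_flag(std::uint32_t flags, unsigned bit) {
    return (flags >> bit) & 1u;
}
```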
Basically, hardware simply doesn't support booleans too well.
Next, imagine the point of view of an OS designer creating a common interface for shared libraries (a.k.a. DLLs). How should booleans be treated? A byte is the minimal memory unit, so to pass a single boolean one would still need at least a whole byte. So why not simply forget about the existence of bool and pass it via a single byte? Then there is no need to implement this needless bool type at the interface level, which saves lots of time for compiler writers of all languages.
uint8 vs uint32: also note that Windows' COM (Component Object Model, not the serial port) uses a 16-bit integer for booleans. In general this is inherently unimportant: when passing values to a shared library function that does complex work, the width of a boolean parameter makes no noticeable difference, since you are already calling a much heavier function. Still, why is it so? I imagine they had some reasons long ago when they designed it, everybody has since forgotten why, and it stays unchanged because changing it would be a complete disaster for backwards compatibility.
In C99, _Bool was introduced for booleans; it is a distinct unsigned integer type (typically one byte), not merely another name for unsigned int. Before it existed, though, plain int was the conventional type for boolean flags in C, and I imagine that is where the use of uint32 for booleans originated. In general, int is supposedly the most efficient integer type in terms of performance (which is why its size is not strictly defined), so the supposedly most efficient type was used to represent booleans.

Is it necessary to take care of long data type in program that is written for Windows and Linux?

According to cppreference, on 64-bit systems:
LLP64 or 4/4/8 (int and long are 32-bit, pointer is 64-bit): Win64 API
LP64 or 4/8/8 (int is 32-bit, long and pointer are 64-bit): Unix and Unix-like systems (Linux, Mac OS X)
How, then, should the long data type be handled in code that is written for both Linux and Windows?
In C and C++, in portable code, you never know the exact size of a type like int or long int. If you move your code to a different compiler (or a different machine, or a different OS), the sizes of some of your types may change. This needn't be a problem; in fact it's only a problem if you want to make it a problem. (All of this has always been the case, and has nothing to do with someone's definitions of "LLP64" and "LP64" architecture families.)
On those (hopefully rare) occasions when you need a type of an exact size, one good way is to use types like int32_t and uint64_t from <cstdint> (or <stdint.h> in C).
But you really, really shouldn't need to specify the exact size of a type, most of the time. (There are those who say you need to specify the exact size of every type, but my advice is to ignore those people.)
Pretty much the only time you need to specify exact sizes is when trying to define a structure which you can read and write in "binary" fashion to conform to some externally-imposed storage layout. But there, specifying the exact sizes of data types isn't generally sufficient, because of issues like alignment, padding, and byte order. So you're better off writing explicit serialization and deserialization code anyway (or using "text" data formats instead, if you can get away with it).
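To make the "explicit serialization" advice concrete, here is a minimal sketch (helper names are illustrative): writing a 32-bit value byte by byte in a fixed little-endian order removes any dependence on the host's endianness, alignment, or struct padding.

```cpp
#include <cstdint>

// Serialize a 32-bit value as 4 bytes, least significant byte first.
inline void put_u32_le(std::uint8_t* out, std::uint32_t v) {
    out[0] = static_cast<std::uint8_t>(v);
    out[1] = static_cast<std::uint8_t>(v >> 8);
    out[2] = static_cast<std::uint8_t>(v >> 16);
    out[3] = static_cast<std::uint8_t>(v >> 24);
}

// Deserialize 4 little-endian bytes back into a 32-bit value.
inline std::uint32_t get_u32_le(const std::uint8_t* in) {
    return  static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}
```

Code like this produces the same byte stream on every platform, which is exactly what reading/writing a struct in "binary" fashion does not guarantee.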
My bottom line is that I rarely worry about the exact sizes of types.

INT_PTR in 64 bit conversion

I am converting a large project of multiple C++ applications from 32-bit MFC to 64-bit. These applications will be required to be compiled in both 32- and 64-bit for the foreseeable future.
I've run across these types INT_PTR and UINT_PTR. I have two questions.
Is it considered best practice to use these types as a "default" type for general integer purposes, such as loop counters, etc?
I understand that the size of these types is related to the pointer size of the environment you are compiling for, but it seems confusing to use them as general-purpose integers. For example, in for (INT_PTR i = 0; i < 10; i++) ...; i isn't a pointer or pointer-related, so the name of the type is confusing to me. Are there better predefines to be used in this situation, or should I make my own?
My compiler is VS2010.
Thanks
INT_PTR and similar have a very specific use-case: They model a data type that is large enough and with appropriate alignment requirements to hold an integer or a pointer type. It is used in situations where a method has parameters or a return type that is either an integral data type or a pointer (like GetWindowLongPtr).
For loops there is no generic advice other than this: Use the most appropriate type. If your loop index is used to index into a container, make it a size_t. If your loop index is an integer that runs across all values in between x and y, use the type of x and y, or one that is compatible with both. An INT_PTR is not really an appropriate loop index data type.
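A small sketch of that advice: match the index type to what it indexes, using size_t for container indices rather than INT_PTR.

```cpp
#include <cstddef>
#include <vector>

// size_t matches the return type of std::vector::size(), so there is no
// signed/unsigned mismatch and no dependence on pointer width.
long long sum(const std::vector<int>& xs) {
    long long total = 0;
    for (std::size_t i = 0; i < xs.size(); ++i)
        total += xs[i];
    return total;
}
```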
You might want to take a look at this: http://msdn.microsoft.com/en-us/library/windows/desktop/aa383751%28v=vs.85%29.aspx/ "Windows Data Types"
You might be a bit confused about the purpose of this type. It's there to ensure that if you cast a pointer to an integer (which you probably shouldn't do anyway) you have an integer type that fits. Note that on Windows, int remains 32 bits even when compiling for 64-bit targets, so a pointer does not fit in a plain int. If you need an int of a specific size I'd use the exact-width types (in stdint.h).
If your loop is not related to pointers (that is, it is not used as an index of an array or a vector or anything like that), use int whenever a generic integer type is required. It's a good thing to make sure your code doesn't depend on size of int (although for MS VS int is 32 bit for both platforms).
Use size_t, vector<...>::size_type or other appropriate type for index-related loops.
You may use the data type intptr_t. It's a platform-dependent type: it is 4 bytes wide on 32-bit platforms and 8 bytes wide on 64-bit platforms.
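The useful property of intptr_t is the round-trip guarantee rather than any particular width; a minimal sketch:

```cpp
#include <cstdint>

// A data pointer converted to intptr_t and back compares equal to
// the original pointer (the guarantee that defines this type).
bool round_trips(int* p) {
    std::intptr_t n = reinterpret_cast<std::intptr_t>(p);
    return reinterpret_cast<int*>(n) == p;
}
```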

Forcing types to a specific size

I've been learning C++, and one thing that I'm not really comfortable with is the fact that data type sizes are not consistent. Depending on what system something is deployed on, an int could be 16 bits or 32 bits, etc.
So I was thinking it might be a good idea to make my own header file with data types like byte, word, etc. that are defined to be a specific size and will maintain that size on any platform.
Two questions. First is this a good idea? Or is it going to create other problems I'm not aware of? Second, how do you define a type as being, say, 8 bits? I can't just say #define BYTE char, cause char would vary across platforms.
Fortunately, other people have noticed this same problem. In C99 and C++11 (so set your compiler to compatibility with one of those two modes, there should be a switch in your compiler settings), they added the header stdint.h (for C) and cstdint (for C++). If you #include <cstdint>, you get the types int8_t, int16_t, int32_t, int64_t, and the same prefixed with a u for unsigned versions. If your platform supports those types, they will be defined in the header, along with several others.
If your compiler does not yet support that standard (or you are forced by reasons out of your control to remain on C++03), then there is also Boost.
However, you should only use this if you care exactly about the size of the type. int and unsigned are fine for throw-away variables in most cases. size_t should be used for indexing std::vector, etc.
First you need to figure out if you really care what sizes things are. If you are using an int to count the number of lines in a file, do you really care if it's 32-bit or 64? You need BYTE, WORD, etc if you are working with packed binary data, but generally not for any other reason. So you may be worrying over something that doesn't really matter.
Better yet, use the already defined stuff in stdint.h. See here for more details. Similar question here.
Example:
int32_t is always 32 bits.
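These exact-width guarantees can be verified at compile time, as in this small sketch:

```cpp
#include <cstdint>
#include <climits>

// If <cstdint> provides these types at all, their widths are exact:
static_assert(sizeof(std::int8_t)   * CHAR_BIT == 8,  "int8_t is 8 bits");
static_assert(sizeof(std::int32_t)  * CHAR_BIT == 32, "int32_t is 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");
```

On a platform that cannot provide a type of the exact width, the typedef is simply absent and code using it fails to compile, which is safer than silently getting a different size.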
Many libraries have their own .h with lots of typedefs to provide fixed-size types. This is useful when writing portable code, and avoids relying on the headers of the platform you are currently working with.
If you only want to make sure the built-in data types have a minimum size, you can use std::numeric_limits from the <limits> header to check.
std::numeric_limits<int>::digits
will give you, for example, the number of bits of an int without the sign bit. And
std::numeric_limits<int>::max()
will give you the max value.

What is the uintptr_t data type?

What is uintptr_t and what can it be used for?
First thing, at the time the question was asked, uintptr_t was not in C++. It's in C99, in <stdint.h>, as an optional type. Many C++03 compilers do provide that file. It's also in C++11, in <cstdint>, where again it is optional, and which refers to C99 for the definition.
In C99, it is defined as "an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Take this to mean what it says. It doesn't say anything about size.
uintptr_t might be the same size as a void*. It might be larger. It could conceivably be smaller, although such a C++ implementation approaches perverse. For example on some hypothetical platform where void* is 32 bits, but only 24 bits of virtual address space are used, you could have a 24-bit uintptr_t which satisfies the requirement. I don't know why an implementation would do that, but the standard permits it.
uintptr_t is an unsigned integer type that is capable of storing a data pointer (whether it can hold a function pointer is unspecified). Which typically means that it's the same size as a pointer.
It is optionally defined in C++11 and later standards.
A common reason to want an integer type that can hold an architecture's pointer type is to perform integer-specific operations on a pointer, or to obscure the type of a pointer by providing it as an integer "handle".
It's an unsigned integer type exactly the size of a pointer. Whenever you need to do something unusual with a pointer - like for example invert all bits (don't ask why) you cast it to uintptr_t and manipulate it as a usual integer number, then cast back.
There are already many good answers to "what is uintptr_t data type?". I will try to address the "what it can be used for?" part in this post.
Primarily for bitwise operations on pointers. Remember that in C++ one cannot perform bitwise operations on pointers. For reasons see Why can't you do bitwise operations on pointer in C, and is there a way around this?
Thus in order to do bitwise operations on pointers one would need to cast pointers to type uintptr_t and then perform bitwise operations.
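A typical example of such a bitwise operation (assuming the common case where uintptr_t preserves the address bits) is testing a pointer's alignment:

```cpp
#include <cstdint>

// True if p is aligned to 'alignment', which must be a power of two.
// The low bits of an aligned address are all zero, so masking them
// against (alignment - 1) must yield zero.
bool is_aligned(const void* p, std::uintptr_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```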
Here is an example of a function I wrote to bitwise-XOR two pointers for storage in an XOR linked list, so that we can traverse in both directions like a doubly linked list, but without the penalty of storing two pointers in each node.
#include <cstdint>  // for uintptr_t

template <typename T>
T* xor_ptrs(T* t1, T* t2)
{
    return reinterpret_cast<T*>(reinterpret_cast<uintptr_t>(t1) ^
                                reinterpret_cast<uintptr_t>(t2));
}
Running the risk of getting another Necromancer badge, I would like to add one very good use for uintptr_t (or even intptr_t) and that is writing testable embedded code.
I write mostly embedded code targeted at various ARM and, currently, Tensilica processors. These have various native bus widths, and the Tensilica is actually a Harvard architecture with separate code and data buses that can be different widths.
I use a test driven development style for much of my code which means I do unit tests for all the code units I write. Unit testing on actual target hardware is a hassle so I typically write everything on an Intel based PC either in Windows or Linux using Ceedling and GCC.
That being said, a lot of embedded code involves bit twiddling and address manipulation. Most of my Intel machines are 64-bit. So if you are going to test address-manipulation code, you need a generalized object to do math on. Thus uintptr_t gives you a machine-independent way of debugging your code before you try deploying to target hardware.
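A small sketch of that style (the register and region names are purely illustrative, not from any real device header): doing the address arithmetic in uintptr_t means the same code compiles and is unit-testable on a 64-bit PC and a 32-bit MCU alike.

```cpp
#include <cstdint>

// Compute a hypothetical register address from a base and an offset.
inline std::uintptr_t reg_addr(std::uintptr_t base, std::uintptr_t offset) {
    return base + offset;
}

// Check whether an address falls inside a hypothetical peripheral region.
inline bool in_region(std::uintptr_t addr,
                      std::uintptr_t start, std::uintptr_t size) {
    return addr >= start && addr - start < size;
}
```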
Another issue is for the some machines or even memory models on some compilers, function pointers and data pointers are different widths. On those machines the compiler may not even allow casting between the two classes, but uintptr_t should be able to hold either.
-- Edit --
As pointed out by @chux, this is not part of the standard, and functions are not objects in C. However, it usually works, and since many people don't even know about these types I usually leave a comment explaining the trickery. Other searches on SO for uintptr_t will provide further explanation. Also, we do things in unit testing that we would never do in production, because breaking things is good.