Is it good to use int_fastN_t to replace intN_t? - C++

I just read this link: The difference of int8_t, int_least8_t and int_fast8_t? and now I know that int8_t is exactly 8 bits whereas int_fast8_t is the fastest int type that has at least 8 bits.
I'm a developer who builds backend processes with C++11 on Linux. Most of the time I don't need to worry about the memory footprint of my processes, but I always need to care about the sizes of the integers in my project. For example, if I want to use an int to store a user ID or a millisecond timepoint, I can't simply use int because it may overflow; I must use int32_t or int64_t.
So I'm wondering whether it's good to use int_fast8_t everywhere and stop using int8_t (and likewise int_fast32_t, int_fast64_t, uint_fast8_t, etc.).
Admittedly, using int_fastN_t may change nothing, because my program is always deployed on x86 or ARM64. But I still want to know whether there is any drawback to changing every intN_t into int_fastN_t. If there isn't any drawback, I think I'll start using int_fastN_t and stop using intN_t.

So I'm wondering whether it's good to use int_fast8_t everywhere
No. It's not good to use it everywhere.
I still want to know whether there is any drawback to changing every intN_t into int_fastN_t
The main drawback is that the size of the integer won't be exactly N bits. In some use cases, this is crucial.
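For example, here is a minimal sketch of a case where the exact width is crucial (the struct and field names are made up):
#include <cstdint>

// Hypothetical wire-format header: must be exactly 4 bytes everywhere.
struct Header {
    std::int8_t  version;  // exactly 1 byte
    std::int8_t  flags;    // exactly 1 byte
    std::int16_t length;   // exactly 2 bytes
};
static_assert(sizeof(Header) == 4, "layout must not change");
// With int_fast8_t/int_fast16_t this assertion could fail, because the
// "fast" aliases are allowed to be wider than 8/16 bits.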
Another potential drawback is that it may be slower. Yes, a "fast" type alias can be slower. The alias isn't magic; the efficiency depends on the use case.
Using the "fast" alias is fine if:
The exact size doesn't matter, but only the minimum.
You have the option of changing the type later (no need for backward compatibility).
You don't have time to measure which type is actually fastest (which is often reasonable).
You didn't ask for the drawbacks of using the fixed-width integers. But for balance, I'll mention them: they are not guaranteed to be provided on all systems, and of course they may be slower in some other use cases (which is probably less surprising, given the naming).

Related

Advantages/Disadvantages of using __int16 (or int16_t) over int

As far as I understand, the number of bytes used for int is system dependent. Usually, 2 or 4 bytes are used for int.
As per Microsoft's documentation, __int8, __int16, __int32 and __int64 are Microsoft-specific keywords. Furthermore, __int16 uses 16 bits (i.e. 2 bytes).
Question: What are advantage/disadvantage of using __int16 (or int16_t)? For example, if I am sure that the value of my integer variable will never need more than 16 bits then, will it be beneficial to declare the variable as __int16 var (or int16_t var)?
UPDATE: I see that several comments/answers suggest using int16_t instead of __int16, which is a good suggestion but not really an advantage/disadvantage of using __int16. Basically, my question is: what is the advantage/disadvantage of saving 2 bytes by using a 16-bit version of an integer instead of int?
Saving 2 bytes is almost never worth it. However, saving thousands of bytes is. If you have a large array containing integers, using a small integer type can save quite a lot of memory. This leads to faster code, because the less memory you use, the fewer cache misses you get (cache misses are a major source of performance loss).
TL;DR: this is beneficial to do in large arrays, but pointless for 1-off variables.
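As a rough illustration (the element count is arbitrary):
#include <cstdint>
#include <vector>

int main() {
    std::vector<std::int16_t> small(1000000); // ~2 MB of element data
    std::vector<std::int32_t> wide(1000000);  // ~4 MB of element data
    // Halving the element size lets twice as many values fit in each
    // cache line, which is where the speedup comes from.
}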
The second use of these is for dealing with binary files and messages. If you are reading a binary file that uses 16-bit integers, it's pretty convenient if you can represent that type exactly in your code.
BTW, don't use Microsoft's versions. Use the standard versions (std::int16_t).
It depends.
On x86, primitive types are generally aligned on their size. So 2-byte types would be aligned on a 2-byte boundary. This is useful when you have more than one of these short variables, because you will be saving 50% of space. That directly translates to better memory and cache utilization and thus theoretically, better performance.
On the other hand, doing arithmetic on shorter-than-int types usually involves a widening conversion to int. So if you do a lot of arithmetic on these types, using int might result in better performance (contrived example).
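A small sketch of that widening, assuming 16-bit shorts and a wider int:
#include <cstdint>

std::int16_t a = 100, b = 200;
// Both operands are promoted to int before the multiplication; the
// result is an int, which is then narrowed back on assignment.
std::int16_t c = static_cast<std::int16_t>(a * b);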
So if you care about performance of a critical section of code, profile it to find out for sure if using a certain data type is faster or slower.
A possible rule of thumb would be - if you're memory-bound (i.e. you have lots of variables and especially arrays), use as short a data types as possible. If not - don't worry about it and use int types.
If for some reason you just need a shorter integer type, the language already has one: short. Unless you know you need exactly 16 bits, there's really no good reason not to stick with the platform-agnostic short and int types. The broad idea is that these types should align well with the target architecture (see, for example, the machine word size).
That being said, there's no need to use the platform-specific type (__int16); you can just use the standard one:
int16_t
See https://en.cppreference.com/w/cpp/types/integer for more information on the standard types.
Even if you still insist on __int16, you probably want a typedef, something like:
using my_short = __int16;
Update
Your main question is:
What is the advantage/disadvantage of saving 2 bytes by using a 16-bit version of an integer instead of int?
If you have a lot of data (in the ballpark of at least 100,000-1,000,000 elements, as a rule of thumb), then there could be an overall performance gain from putting less pressure on the CPU cache. Overall there's no disadvantage to using a smaller type - except for the obvious one (the smaller range) - and the possible conversions explained in this answer.
The main reason for using these types is to be sure about the size of your variables across different architectures and compilers. We call this "code reusability" and "portability".
In higher-level modern languages, all of this is handled by the compiler/interpreter/virtual machine/etc., so you don't need to worry about it, but it comes with some performance and memory-usage costs.
When you have some kind of limitation you may need to optimize everything. The best example is embedded systems, which have a very limited amount of memory and run at low frequencies. On the other hand, there are lots of compilers out there with different implementations: some of them interpret int as a 16-bit value and some as a 32-bit one.
For example, if you receive a specific stream of values over a communication system and want to save them in a buffer or array, you want to make sure the input data is always interpreted as 16 bits, nothing else.
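A minimal sketch of that situation (the function name and file layout are hypothetical):
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

// Read n 16-bit records from a binary stream into a buffer whose
// element width is guaranteed, on every platform, to be 16 bits.
std::vector<std::uint16_t> readRecords(const char* path, std::size_t n) {
    std::vector<std::uint16_t> buf(n);
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(buf.data()),
            static_cast<std::streamsize>(n * sizeof(std::uint16_t)));
    return buf;  // note: byte order (endianness) is still your problem
}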

Why is std::ssize being forced to a minimum size for its signed size type?

In C++20, std::ssize is being introduced to obtain the signed size of a container for generic code. (And the reason for its addition is explained here.)
Somewhat peculiarly, the definition given there (combining common_type and ptrdiff_t) has the effect of forcing the return value to be "either ptrdiff_t or the signed form of the container's size() return value, whichever is larger".
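For reference, the container overload boils down to roughly this (a sketch of the standard wording, not a verbatim copy):
#include <cstddef>
#include <type_traits>

template <class C>
constexpr auto ssize(const C& c)
    -> std::common_type_t<std::ptrdiff_t,
                          std::make_signed_t<decltype(c.size())>>
{
    using R = std::common_type_t<std::ptrdiff_t,
                                 std::make_signed_t<decltype(c.size())>>;
    // The cast widens c.size() to whichever signed type is larger.
    return static_cast<R>(c.size());
}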
P1227R1 indirectly offers a justification for this ("it would be a disaster for std::ssize() to turn a size of 60,000 into a size of -5,536").
This seems to me like an odd way to try to "fix" that, however.
Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.
The same thing would occur for containers using a uint8_t size and 127 elements, respectively.
In desktop environments, you probably don't care; but this might be important for embedded or otherwise resource-constrained environments, especially if the resulting type is used for something more persistent than a stack variable.
Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.
If there still exist platforms for which ptrdiff_t is smaller than 32 bits, they will hit the same problem as well.
Wouldn't it be better to just use the signed type as-is (without extending its size) and to assert that a conversion error has not occurred (e.g. that the result is not negative)?
Am I missing something?
To expand on that last suggestion a bit (inspired by Nicol Bolas' answer): if it were implemented the way that I suggested, then this code would Just Work™:
void DoSomething(int16_t i, T const& item);

for (int16_t i = 0, len = std::ssize(rng); i < len; ++i)
{
    DoSomething(i, rng[i]);
}
With the current implementation, however, this produces warnings and/or errors unless static_casts are explicitly added to narrow the result of ssize, or unless int i is used instead and then narrowed in the function call (and the range indexing), neither of which seems like an improvement.
Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.
It's not like the container is storing the size as this type. The conversion happens via accessing the value.
As for embedded systems, embedded systems programmers already know about C++'s propensity to increase the size of small types. So if they expect a type to be an int16_t, they're going to spell that out in the code, because otherwise C++ might just promote it to an int.
Furthermore, there is no standard way to ask about what size a range is "known to never exceed". decltype(size(range)) is something you can ask for; sized ranges are not required to provide a max_size function. Without such an ability, the safest assumption is that a range whose size type is uint16_t can assume any size within that range. So the signed size should be big enough to store that entire range as a signed value.
Your suggestion is basically that any ssize call is potentially unsafe, since half of any size range cannot be validly stored in the return type of ssize.
Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.
Assuming that it is valid for ptrdiff_t to not be a signed 64-bit integer on such platforms, there isn't really a valid solution to that problem. So yes, there will be cases where ssize is potentially unsafe.
ssize currently is potentially unsafe in cases where it is not possible to be safe. Your proposal would make ssize potentially unsafe in all cases.
That's not an improvement.
And no, merely asserting/contract checking is not a viable solution. The point of ssize is to make for(int i = 0; i < std::ssize(rng); ++i) work without the compiler complaining about signed/unsigned mismatch. To get an assert because of a conversion failure that didn't need to happen (and BTW, cannot be corrected without using std::size, which we are trying to avoid), one which is ultimately irrelevant to your algorithm? That's a terrible idea.
if it were implemented the way that I suggested, then this code would Just Work™:
Let us ignore the question of how often it is that a user would write this code.
The reason your compiler will expect/require you to use a cast there is because you are asking for an inherently dangerous operation: you are potentially losing data. Your code only "Just Works™" if the current size fits into an int16_t; that makes the conversion statically dangerous. This is not something that should implicitly take place, so the compiler suggests/requires you to explicitly ask for it. And users looking at that code get a big, fat eyesore reminding them that a dangerous thing is being done.
That is all to the good.
See, if your suggested implementation were how ssize behaved, then that means we must treat every use of ssize as just as inherently dangerous as the compiler treats your attempted implicit conversion. But unlike static_cast, ssize is small and easily missed.
Dangerous operations should be called out as such. Since ssize is small and difficult to notice by design, it should be as safe as possible. Ideally, it should be as safe as size, but failing that, it should be unsafe only to the extent that it is impossible to make it safe.
Users should not look on ssize usage as something dubious or disconcerting; they should not fear to use it.

Is there a reason to use C++11's std::int_fast32_t or std::int_fast16_t over int in cross-platform code?

In C++11 we are provided with fixed-width integer types, such as std::int32_t and std::int64_t, which are optional and therefore not optimal for writing cross-platform code. However, we also got non-optional variants of the types: the "fast" variants, e.g. std::int_fast32_t and std::int_fast64_t, as well as the "smallest-size" variants, e.g. std::int_least32_t, both of which are at least the specified number of bits in size.
The code I am working on is part of a C++11-based cross-platform library, which supports compilation on the most popular Unix/Windows/Mac compilers. A question that now came up is if there is an advantage in replacing the existing integer types in the code by the C++11 fixed-width integer types.
A disadvantage of using types like std::int16_t and std::int32_t is the lack of a guarantee that they are available, since they are only provided if the implementation directly supports the type (according to http://en.cppreference.com/w/cpp/types/integer).
However, since int is at least 16 bits and 16 bits are large enough for the integers used in the code, what about using std::int_fast16_t over int? Does it provide a benefit to replace all ints by std::int_fast16_t and all unsigned ints by std::uint_fast16_t, or is this unnecessary?
Analogously, if we know that all supported platforms and compilers feature an int of at least 32 bits, does it make sense to replace int and unsigned int by std::int_fast32_t and std::uint_fast32_t respectively?
int can be 16, 32 or even 64 bits on current computers and compilers. In the future, it could be bigger (say, 128 bits).
If your code is ok with that, go with it.
If your code is only tested and working with 32-bit ints, then consider using int32_t. Then the code will fail at compile time instead of at run time when built for a system that doesn't have 32-bit ints (which is extremely rare today).
int_fast32_t is for when you need at least 32 bits but care a lot about performance. On hardware where a 32-bit integer has to be loaded as a 64-bit integer and then bit-shifted back down to 32 bits in a cumbersome process, int_fast32_t may be a 64-bit integer. The cost of this is that on obscure platforms, your code behaves very differently.
If you are not testing on such platforms, I would advise against it.
Having things break at build time is usually better than having them break at run time. If and when your code is actually run on some obscure processor needing these features, then fix it. The rule of "you probably won't need it" applies.
Be conservative, generate early errors on hardware you are not tested on, and when you need to port to said hardware do the work and testing required to be reliable.
In short:
Use int_fast##_t if and only if you have tested your code (and will continue to test it) on platforms where the int size varies, and you have shown that the performance improvement is worth that future maintenance.
Using int##_t with common ## sizes means that your code will fail to compile on platforms that you have not tested it on. This is good; untested code is not reliable, and unreliable code is usually worse than useless.
Without using int32_t, and using int, your code will sometimes have ints that are 32 and sometimes ints that are 64 (and in theory more), and sometimes ints that are 16. If you are willing to test and support every such case in every such int, go for it.
Note that arrays of int_fast##_t can have cache problems: they could be unreasonably big. As an example, int_fast16_t could be 64 bits. An array of a few thousand or a few million of them could be individually fast to work with, but the cache misses caused by their bulk could make them slower overall, and the risk that things get swapped out to slower storage grows.
int_least##_t can be faster in those cases.
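A quick way to see the difference on your own toolchain; the printed sizes are implementation-defined (on x86-64 Linux with glibc, int_fast16_t is commonly 8 bytes while int_least16_t is 2):
#include <cstdint>
#include <cstdio>

int main() {
    std::printf("int_fast16_t:  %zu bytes\n", sizeof(std::int_fast16_t));
    std::printf("int_least16_t: %zu bytes\n", sizeof(std::int_least16_t));
    std::printf("int16_t:       %zu bytes\n", sizeof(std::int16_t));
}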
The same applies, doubly so, to network-transmitted and file-stored data, on top of the obvious issue that network/file data usually has to follow formats that are stable over compiler/hardware changes. This, however, is a different question.
However, when using fixed-width integer types you must pay special attention to the fact that int, long, etc. still have the same width as before. Integer promotion still happens based on the size of int, which depends on the compiler you are using. An integer literal in your code will be of type int, with the associated width. This can lead to unwanted behaviour if you compile your code using a different compiler. For more detailed info: https://stackoverflow.com/a/13424208/3144964
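A small sketch of that promotion pitfall:
#include <cstdint>

std::uint8_t x = 0;
// x is promoted to int before the subtraction, so x - 1 is the int -1.
// With an unsigned int operand it would instead wrap to a huge value,
// so the behaviour of "the same" expression depends on int's width.
bool negative = (x - 1) < 0;  // true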
I have just realised that the OP is asking only about int_fast##_t, not int##_t, since the latter is optional. However, I will keep the answer, hoping it may help someone.
I would add something: fixed-size integers are important (or even a must) for building APIs for other languages. One example is when you want to P/Invoke functions in a native C++ DLL and pass data to them from .NET managed code. In .NET, int is guaranteed to be a fixed size (I believe 32 bits). So if you used int in C++ and it was treated as 64-bit rather than 32-bit, this could cause problems and break the layout of the wrapped structs.

Forcing types to a specific size

I've been learning C++ and one thing that I'm not really comfortable with is the fact that datatype sizes are not consistent. Depending on what system something is deployed on an int could be 16 bits or 32 bits, etc.
So I was thinking it might be a good idea to make my own header file with data types like byte, word, etc. that are defined to be a specific size and will maintain that size on any platform.
Two questions. First, is this a good idea? Or is it going to create other problems I'm not aware of? Second, how do you define a type as being, say, 8 bits? I can't just say #define BYTE char, because char can vary across platforms.
Fortunately, other people have noticed the same problem. In C99 and C++11 (so set your compiler to compatibility with one of those two modes; there should be a switch in your compiler settings), they added the headers stdint.h (for C) and cstdint (for C++). If you #include <cstdint>, you get the types int8_t, int16_t, int32_t and int64_t, along with the same names prefixed with a u for the unsigned versions. If your platform supports those types, they will be defined in the header, along with several others.
If your compiler does not yet support that standard (or you are forced by reasons out of your control to remain on C++03), then there is also Boost.
However, you should only use these if you care about the exact size of the type. int and unsigned are fine for throwaway variables in most cases, while size_t should be used for indexing std::vector, etc.
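For example, a minimal sketch of those guidelines:
#include <cstddef>
#include <cstdint>
#include <vector>

std::uint8_t  packed_flags = 0xFF;  // exactly 8 bits: layout matters
std::int32_t  record_id    = -1;    // exactly 32 bits: wire format

void zeroAll(std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)  // size_t for indexing
        v[i] = 0;                               // plain int is fine here
}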
First you need to figure out if you really care what sizes things are. If you are using an int to count the number of lines in a file, do you really care if it's 32-bit or 64? You need BYTE, WORD, etc if you are working with packed binary data, but generally not for any other reason. So you may be worrying over something that doesn't really matter.
Better yet, use the already defined stuff in stdint.h. See here for more details. Similar question here.
Example:
int32_t is always 32 bits.
Many libraries have their own .h with lots of typedefs to provide constant-size types. This is useful when writing portable code, and avoids relying on the headers of the platform you are currently working with.
If you only want to make sure the built-in data types have a minimum size, you can use std::numeric_limits from the <limits> header to check.
std::numeric_limits<int>::digits
will give you, for example, the number of bits of an int without the sign bit. And
std::numeric_limits<int>::max()
will give you the max value.
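For example, a compile-time check along those lines (the 31-digit threshold, i.e. a 32-bit int, is illustrative):
#include <limits>

// digits excludes the sign bit, so a 32-bit int reports 31.
static_assert(std::numeric_limits<int>::digits >= 31,
              "this code assumes int is at least 32 bits");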

Do bit operations cause programs to run slower?

I'm dealing with a problem which needs to work with a lot of data. Currently its values are represented as unsigned int. I know that the real values do not exceed a limit of 1000.
Questions
I can use unsigned short to store it. An upside to this is that it'll use less storage space to store the value. Will performance suffer?
If I decided to store the data as short but all the calling functions use int, I realize I would need to convert between these datatypes when storing or extracting values. Will performance suffer? Will the loss in performance be dramatic?
If I decided not to use short but just 10 bits packed into an array of unsigned int, what will happen in this case compared with the previous ones?
This all depends on the architecture. Bit-fields are generally slower, but if you are able to significantly cut down memory usage with them, you can even gain performance due to better CPU caching and similar effects. Likewise with short (though the difference is not dramatic in any case).
The best way is to make your source code able to switch representations easily (at compile time, of course). Then you will be able to test and profile the different implementations in your specific circumstances just by, say, changing one #define.
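A minimal sketch of such a switch (the macro and alias names are made up):
// Build with -DUSE_SHORT_VALUES to test the compact representation.
#ifdef USE_SHORT_VALUES
using value_t = unsigned short;
#else
using value_t = unsigned int;
#endif

value_t values[100000];  // profile both builds, keep the faster one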
Also, don't forget about premature optimization rule. Make it work first. If it turns out to be slow/not fast enough, only then try to speed up.
I can use unsigned short to store it.
Yes, you can use unsigned short (assuming (sizeof(unsigned short) * CHAR_BIT) >= 10).
An upside to this is that it'll use less storage space to store the value.
Less than what? Less than int? That depends on the sizeof(int) on your system.
Will performance suffer?
Depends. The type int is supposed to be the most efficient integer type for your system, so potentially using short may affect your performance. Whether it does will depend on the system. Time it and find out.
If I decided to store the data as short but all the calling functions use int, I realize I would need to convert between these datatypes when storing or extracting values.
Yes. But the compiler will do the conversion automatically. One thing you need to watch, though, is conversion between signed and unsigned types. If the value does not fit, the exact result may be implementation-defined.
Will performance suffer?
Maybe. If sizeof(unsigned int) == sizeof(unsigned short), then probably not. Time it and see.
Will the loss in performance be dramatic?
Time it and see.
If I decided not to use short but just 10 bits packed into an array of unsigned int, what will happen in this case compared with the previous ones?
Time it and see.
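Since all three answers come down to measuring, here is a minimal timing harness for such experiments (a sketch; the workload is a placeholder you supply):
#include <chrono>

template <class F>
double seconds(F f) {
    auto t0 = std::chrono::steady_clock::now();
    f();  // run the short/int/bit-packed variant under test
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
// usage: double t = seconds([] { /* the code to measure */ });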
A good compromise for you is probably packing three values into a 32-bit int (with two bits unused). Untangling 10 bits from a bit array is a lot more expensive and doesn't save much space. You can either use bit-fields or do it by hand yourself:
(i&0x3FF) // Get i[0]
(i>>10)&0x3FF // Get i[1]
(i>>20)&0x3FF // Get i[2]
i = (i&0x3FFFFC00) | (j&0x3FF) // Set i[0] to j
i = (i&0x3FF003FF) | ((j&0x3FF)<<10) // Set i[1] to j
i = (i&0xFFFFF) | ((j&0x3FF)<<20) // Set i[2] to j
You can see here how much extra expense is involved: one bit operation and 2/3 of a shift (on average) for a get, and three bit operations and 2/3 of a shift (on average) for a set. Probably not too bad, especially if you're mostly getting the values rather than setting them.
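If you'd rather not repeat those expressions, they can be wrapped in small helpers; here is a sketch (the names are illustrative, and slot must be 0, 1 or 2):
#include <cstdint>

inline std::uint32_t get10(std::uint32_t word, int slot) {
    return (word >> (slot * 10)) & 0x3FFu;  // extract one 10-bit field
}

inline void set10(std::uint32_t& word, int slot, std::uint32_t value) {
    const std::uint32_t mask = 0x3FFu << (slot * 10);
    word = (word & ~mask) | ((value & 0x3FFu) << (slot * 10));
}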