Is static_cast on bounded types implementation-dependent?

I'm looking at static_cast with bounded types.
Is the behavior implementation-specific? In other words (given 16-bit shorts and 32-bit longs) is
long x = 70000;
short y = static_cast<short>(x);
guaranteed to produce y = 4464 (the low-order 16 bits of x)? Or only on a little-endian machine?
I have always assumed it would but I am getting odd results on a big-endian machine and trying to figure them out.
Here's the actual problem. I have two time_t's (presumably 64 bits) that I "know" will always be within some reasonable number of seconds of each other. I want to display that difference with printf. The code is multi-platform, so rather than worry about what the underlying type of time_t is, I am doing a printf("%d") passing static_cast<int>(time2-time1). I'm seeing a zero, despite the fact that the printf is in a block conditioned on (time2 != time1). (The printf is in a library; no reasonable possibility of using cout instead.)
Is static_cast possibly returning the high 32 bits of time_t?
Is there a better way to do this?
Thanks,

I think the problem was unrelated to the static_cast after all; it was #ifdef platform confusion. I'd still be interested if someone definitively knows the answer.
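For what it's worth, here is one way to sidestep the underlying-type question entirely. This is a sketch of mine, not from the original post (the helper name is invented): difftime returns the difference in seconds as a double regardless of how time_t is represented, and the result is printed through a long long.

#include <cstdio>
#include <ctime>

// Sketch only: difftime() hides the representation of time_t.
void print_elapsed(std::time_t time1, std::time_t time2)
{
    long long seconds = static_cast<long long>(std::difftime(time2, time1));
    std::printf("elapsed: %lld seconds\n", seconds);   // %lld needs a C99/C++11-style printf
}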


Difference and nuances of std::uint8_t(-1) vs std::uint8_t(0xffu)

I have seen the use of std::uint8_t(-1), for example here:
https://en.cppreference.com/w/cpp/language/fold
The cppreference page is illustrating an endianness swap, and I'm wondering what the difference is compared to std::uint8_t(0xffu).
On x86 there doesn't seem to be any difference:
https://godbolt.org/z/Kb7v8K1nT
My question may be reading too much into it; perhaps it's just a convention in how somebody writes code and there is no deeper meaning to it. However, I suspect it's about the portability of the code to some esoteric architectures where CHAR_BIT != 8.
But then I was wondering: in the case of a byte-order swap, which needs to operate on 8-bit units, I would expect std::uint8_t(0xffu), forcing 8-bit calculations even when CHAR_BIT != 8, to produce more portable code, since the result would not be expected to change between platforms. For example, when I'm producing TCP/IP packets and need a specific endianness (and possibly need to swap some values), the values have to be the same no matter what underlying architecture is used.
Maybe for a nibble or char swap (where we expect the size of the type to change and the mechanism to adjust), std::uint8_t(-1) would be better?
In essence, with std::uint8_t(-1) we are saying "set all the bits high, however many there are" (more than eight if CHAR_BIT > 8), while with std::uint8_t(0xffu) we want exactly 8 bits set (and we get fewer if CHAR_BIT < 8)?
Or is there something I'm completely missing?
It's a shortcut to get the maximum value of an unsigned type without having to care how wide it is. All unsigned types behave modulo 2^N (where N is the number of bits in the type), so unsigned_type(-1) is the same as std::numeric_limits<unsigned_type>::max().
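A minimal sketch of that equivalence (my own compile-time check, not from the original answer):

#include <cstdint>
#include <limits>

// unsigned_type(-1) wraps modulo 2^N, giving the all-bits-set maximum.
static_assert(std::uint8_t(-1)  == std::numeric_limits<std::uint8_t>::max(),  "");
static_assert(std::uint16_t(-1) == std::numeric_limits<std::uint16_t>::max(), "");
static_assert(std::uint8_t(-1)  == std::uint8_t(0xffu), "");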

Perform 64 bit calculations in 64 bit executable

I am using MinGW64 (with the -m64 flag) with Code::Blocks and want to know how to perform 64-bit calculations without having to cast a really big number to int64_t before multiplying it. For example, this does not result in overflow:
int64_t test = int64_t(2123123123) * 17; //Returns 36093093091
Without the cast, the calculation overflows like such:
int64_t test = 2123123123 * 17; //Returns 1733354723
A VirusTotal scan confirms that my executable is x64.
Additional Information: OS is Windows 7 x64.
The default int type is still 32 bits even in 64-bit compilations, for compatibility reasons.
The "shortest" version, I guess, would be to add the ll suffix to the number:
int64_t test = 2123123123ll * 17;
Another way would be to store the numbers in their own variables of type int64_t (or long long) and multiply the variables. Usually it's rare anyway for a program to have many "magic numbers" hard-coded into the codebase.
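A sketch of that second approach (the variable names are just for illustration):

#include <cstdint>

int64_t big    = 2123123123;     // widened on initialisation, no suffix needed
int64_t factor = 17;
int64_t test   = big * factor;   // the multiplication itself is done in 64 bits: 36093093091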
Some background:
Once upon a time, most computers had 8-bit arithmetic logic units and a 16-bit address bus. We called them 8-bit computers.
One of the first things we learned was that no real-world arithmetic problem can be expressed in 8-bits. It's like trying to reason about space flight with the arithmetic abilities of a chimpanzee. So we learned to write multi-word add, multiply, subtract and divide sequences. Because in most real-world problems, the numerical domain of the problem was bigger than 255.
Then we briefly had 16-bit computers (where the same problem applied, 65535 is just not enough to model things) and then quite quickly, 32-bit arithmetic logic built in to chips. Gradually, the address bus caught up (20 bits, 24 bits, 32 bits if designers were feeling extravagant).
Then an interesting thing happened. Most of us didn't need to write multi-word arithmetic sequences any more. It turns out that most(tm) real world integer problems could be expressed in 32 bits (up to 4 billion).
Then we started producing more data at a faster rate than ever before, and we perceived the need to address more memory. The 64-bit computer eventually became the norm.
But still, most real-world integer arithmetic problems could be expressed in 32 bits. 4 billion is a big (enough) number for most things.
So, presumably through statistical analysis, your compiler writers decided that on your platform, the most useful size for an int would be 32 bits. Any smaller would be inefficient for 32-bit arithmetic (which we have needed from day 1) and any larger would waste space/registers/memory/cpu cycles.
Expressing an integer literal in C++ (and C) yields an int - the natural arithmetic size for the environment. In the present day, that is almost always a 32-bit value.
The C++ specification says that multiplying two ints yields an int. If it didn't, then multiplying two ints would need to yield a long. But then what would multiplying two longs yield? A long long? OK, that's possible. Now what if we multiply those? A long long long long?
So that's that.
int64_t x = 1 * 2; will do the following:
take the integer (32 bits) of value 1.
take the integer (32 bits) of value 2.
multiply them together, storing the result in an integer. If the arithmetic overflows, so be it. That's your lookout.
convert the resulting integer (whatever value it may now hold) to int64_t (probably a long or long long on your system).
So in a nutshell, no. There is no shortcut to spelling out the type of at least one of the operands in the code snippet in the question. You can, of course, specify a literal. But there is no guarantee that a long long (the LL literal suffix) on your system is the same as int64_t. If you want an int64_t, and you want the code to be portable, you must spell it out.
For what it's worth:
In a post-C++11 world, all the worrying about extra keystrokes and non-DRYness can disappear:
definitely an int64_t:
auto test = int64_t(2123123123) * 17;
definitely a long long:
auto test = 2'123'123'123LL * 17;
definitely an int64_t, definitely initialised with a (possibly narrowing, but that's OK) long long:
auto test = int64_t(36'093'093'091LL);
Since you're most likely in an LP64 or LLP64 environment, where int is only 32 bits, you have to be careful about literal constants in expressions. The easiest way to do this is to get into the habit of using the proper suffix on literal constants, so you would write the above as:
int64_t test = 2123123123LL * 17LL;
2123123123 is an int (usually 32 bits).
Add an L to make it a long: 2123123123L (usually 32 or 64 bits, even in 64-bit mode).
Add another L to make it a long long: 2123123123LL (64 bits or more starting with C++11).
Note that you only need to add the suffix to constants that exceed the size of an int. Integral conversion will take care of producing the right result*.
(2123123123LL * 17) // 17 is automatically converted to long long, the result is long long
* But beware: even if individual constants in an expression fit into an int, the whole operation can still overflow like in
(1024 * 1024 * 1024 * 10)
In that case you should make sure the arithmetic is performed at sufficient width (taking operator precedence into account):
(1024LL * 1024 * 1024 * 10)
- will perform all 3 operations in 64 bits, with a 64-bit result.
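A small compile-time check of that point (my own sketch, not part of the original answer): the type of an expression is decided by its operands, not by the variable it is eventually assigned to.

#include <type_traits>

static_assert(std::is_same<decltype(1024 * 1024), int>::value,
              "int * int stays int");
static_assert(std::is_same<decltype(1024LL * 1024 * 1024 * 10), long long>::value,
              "one LL operand makes the whole chain long long");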
Edit: Literal constants (A.K.A. magic numbers) are frowned upon, so the best way to do it would be to use symbolic constants (const int64_t value = 5). See What is a magic number, and why is it bad? for more info. It's best that you don't read the rest of this answer, unless you really want to use magic numbers for some strange reason.
Also, you can use intptr_t and uintptr_t from #include <cstdint> to let the compiler choose a pointer-sized integer type (32 or 64 bits depending on the target).
For those who stumble upon this question, `LL` at the end of a number can do the trick, but it isn't recommended, as Richard Hodges told me that `long long` may not always be 64 bits and can increase in size in the future, although that's not likely. See Richard Hodges' answer and the comments on it for more information.
The reliable way would be to put `using QW = int64_t;` at the top and use `QW(5)` instead of `5LL`.
Personally I think there should be an option to define all literals as 64-bit without having to add any suffixes or functions to them, and to use `int32_t(5)` when necessary, because some programs are unaffected by such a change - for example, code that only uses numbers for normal calculations instead of relying on integer overflow to do its work. The problem is going from 64 bits to 32 bits rather than from 32 to 64, since the upper 4 bytes are cut off.
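A tiny sketch of that alias idea (QW is the poster's name for it; the rest is mine):

#include <cstdint>

using QW = int64_t;

// QW(...) keeps the "exactly 64 bits" intent visible at the use site,
// whereas an LL suffix only promises a long long of at least 64 bits.
QW test = QW(2123123123) * 17;   // multiplication performed in 64 bits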

Is there a way to increase the size of an int in C++ without using long?

If the range of int only goes up to 32767, but I have to input a value of around 50000 and use it, how can I do that without using long and, if possible, without typecasting either? Is there any way to do it? I want the data type to remain int only.
No built-in type can be altered or expanded in any sense. You have to switch to a different type.
The type int has the following requirements:
represents at least the range -32767 to 32767 (16 bits)
is at least as large as short (sizeof(short) <= sizeof(int))
This means that, strictly speaking (although most platforms use at least 32 bits for int), you can't safely store the value 50000 in an int.
If you need a guaranteed range, use int16_t, int32_t or int64_t. They are defined in the header <cstdint>. There is no arbitrary precision integer type in the language or in the standard library.
If you only need to observe the range of valid integers, use the header <limits>:
std::cout << std::numeric_limits<int>::min() << " to " << std::numeric_limits<int>::max() << "\n";
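A slightly fuller sketch combining both headers (my own example, not from the original answer):

#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
    std::int32_t value = 50000;   // int32_t is exactly 32 bits, so 50000 always fits

    std::cout << "int range on this platform: "
              << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << "\n";
    std::cout << "value = " << value << "\n";
}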
You may try unsigned int. It's the same size as int but with a non-negative range (if you really don't want to use long).
See this for the ranges of the data types.
Suggestion:
You might as well consider switching your compiler. From the range you've mentioned for int, it seems you are using a 16-bit compiler (probably Turbo C). A 16-bit compiler restricts the unsigned int range to 0 to 65,535 (2^16 - 1) and signed int to -32,768 to 32,767.
No!
An int depends on the native machine word, which really means it depends on 3 things - the processor, the OS, and the compiler.
The only way you can "increase" an int foo; (not a long foo;, int is not a long) is:
You are compiling with Turbo-C or a legacy 16-bit DOS compiler on a modern computer, likely because your university requires you to use that, because that's what your professor knows. Switch the compiler. If your professor insists you use it, switch the university.
You are compiling with a 32-bit compiler on a 64-bit OS. Switch the compiler.
You have a 32-bit OS on a 64-bit computer. Reinstall a 64-bit OS.
You have a 32-bit processor. Buy a new computer.
You have a 16-bit processor. Really, buy a new computer.
Several possibilities come to mind.
@abcthomas had the idea to use unsigned; since you are restricted to int, you may abuse int as unsigned. That will probably work, although relying on it is undefined behaviour according to the standard (cf. Integer overflow in C: standards and compilers).
Use two ints. That probably involves writing your own scanf and printf versions, but it shouldn't be too hard. Strictly speaking though, you still haven't expanded the range of an int.
[Use long long] Not possible since you must use int.
You can always use some big number library. Probably not allowed either.
Keep the numbers in strings and do arithmetic digit-wise on the strings. Doesn't use int though.
But you'll never ever be able to store something > INT_MAX in an int.
Try splitting up your value (that would fit inside a 64-bit int) into two 32-bit chunks of data, then use two 32-bit ints to store it. A while ago, I wrote some code that helped me split 16-bit values into 8-bit ones. If you alter this code a bit, then you can split your 64-bit values into two 32-bit values each.
#include <stdint.h>

#define BYTE_T uint8_t
#define TWOBYTE_T uint16_t
#define LOWBYTE(x) ((BYTE_T)(x))                          /* low 8 bits of a 16-bit value  */
#define HIGHBYTE(x) ((BYTE_T)((TWOBYTE_T)(x) >> 0x8))     /* high 8 bits of a 16-bit value */
#define BYTE_COMBINE(h, l) ((TWOBYTE_T)(((BYTE_T)(h) << 0x8) | (BYTE_T)(l)))
I don't know if this is helpful or not, since it doesn't actually answer your original question, but at least you could store your values this way even if your platform only supports 32-bit ints.
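The 64-to-32-bit variant this alludes to could look roughly like the following (a sketch with invented names, assuming the fixed-width types from <cstdint>):

#include <cstdint>

// Split a 64-bit value into two 32-bit halves, and put it back together.
inline uint32_t low_half(uint64_t x)  { return static_cast<uint32_t>(x); }
inline uint32_t high_half(uint64_t x) { return static_cast<uint32_t>(x >> 32); }
inline uint64_t combine_halves(uint32_t high, uint32_t low)
{
    return (static_cast<uint64_t>(high) << 32) | low;
}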
Here is an idea to actually store values larger than INT_MAX in an int. It is based on the condition that there is only a small, known number of possible values.
You could write a compression method which computes something akin to a 2-byte hash. The hashes would have to have a bijective (1:1) relation to the known set of possible values. That way you would actually store the value (in compressed form) in the int, and not in a string as before, and thus expand the range of possible values at the cost of not being able to represent every value within that range.
The hashing algorithm would depend on the set of possible values. As a simple example let's assume that the possible values are 2^0, 2^1, 2^2... 2^32767. The obvious hash algorithm is to store the exponent in the int. A stored value of 4 would represent the value 16, 5 would represent 32, 1000 would represent a number close to 10^301 etc. One can see that one can "store" extraordinarily large numbers in a 16 bit int ;-). Less regular sets would require more complicated algorithms, of course.

Is using a non-32-bit integer reasonable? [duplicate]

Possible Duplicate:
The importance of using a 16bit integer
If today's processors perform (under standard conditions) 32-bit operations -- then is using a "short int" reasonable? Because in order to perform an operation on that data, it will convert it to a 32-bit (from 16-bit) integer, perform the operations, and then go back to 16-bit -- I think. So what is the point?
In essence my questions are as follows:
What (if any) performance gain/hindrance does using a smaller ranged integer bring? Like, if instead of using a standard 32-bit integer for storage, I use a 16-bit short integer.
"and then go back to 16-bit" -- Am I correct here? See above.
Are all integer data stored as 32-bit integer space on CPU/RAM?
The answer to your first question should also clarify the last one: if you need to store large numbers of 16-bit ints, you save half the amount of memory required for 32-bit ints, with whatever "fringe benefits" that may come along with it, such as using the cache more efficiently.
Most CPUs these days have separate instructions for 16-bit vs. 32-bit operations, along with instructions to read and write 16-bit values from and to memory. Internally, the ALU may be performing a 32-bit operation, but the result for the upper half does not make it back into the registers.
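A quick sketch of the storage half of that argument (my own example, not from either answer):

#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    std::vector<std::int16_t> narrow(1000000);
    std::vector<std::int32_t> wide(1000000);

    // Roughly 2 MB vs 4 MB of element storage: twice as many 16-bit
    // values fit into each cache line.
    std::cout << narrow.size() * sizeof(narrow[0]) << " bytes vs "
              << wide.size()   * sizeof(wide[0])   << " bytes\n";
}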
The processor doesn't need to "expand" a value to work with it. It just pads the unused spaces with zeroes and ignores them when performing calculations. So, actually, it is faster to operate on a short int than a long int, although with today's fast CPUs it is very hard to notice even a bit of difference (pun intended).
The machine doesn't really convert. When changing the size of a value, it either pads zeroes to the left or totally ignores extra bits to the left that won't fit in the target memory region.
No, and this is usually the reason people use short int values for purposes where the range of a long int just isn't needed. The memory allocated is different for each length of int, like a short int takes up fewer bits of memory than a long int. One of the steps in optimization is to change long int values to short int values when the range does not exceed that of a short int, meaning that the value would never use the extra bits allocated with a long int. The memory saved from such an optimization can actually be quite significant when dealing with a lot of elements in arrays or a lot of objects of the same struct or class.
Different int sizes are stored with different numbers of bits in both RAM and the processor's cache. This is also true of float, double, and long double, although long double is mainly for 64-bit systems; most compilers ignore the long on 32-bit machines, because a 64-bit value in a 32-bit accumulator and ALU would be 'mowed down' during any calculation and would likely never receive anything but zeros for its upper 32 bits.
What (if any) performance gain/hindrance does using a smaller ranged integer bring? Like, if instead of using a standard 32-bit integer for storage, I use a 16-bit short integer.
It uses less memory. Under normal circumstances, it will use half as much.
"and then go back to 16-bit" -- Am I correct here? See above.
It only converts between 16 and 32 bits if that is needed by your code, which you failed to show.
Are all integer data stored as 32-bit integer space on CPU/RAM?
No. 32-bit processors can address and work directly with values up to 32 bits. Many operations can be done on 8 and 16-bit values as well.
No, it is not reasonable. Unless you have some sort of (very tight) memory constraint, you should use int.
You don't gain performance, you just save memory. In fact you can lose performance because of what you just said, since the registers need to strip out the upper bits.
See above.
On the CPU it depends; in RAM a short is stored as 16 bits.
What (if any) performance gain/hindrance does using a smaller ranged integer bring? Like, if instead of using a standard 32-bit integer for storage, I use a 16-bit short integer.
Performance comes from cache locality. The more data you fit in cache, the faster your program runs. This is more relevant if you have lots of short values.
"and then go back to 16-bit" -- Am I correct here?
I'm not so sure about this. I would have expected that the CPU can optimize multiple operations in parallel, and you get bigger throughput if you can pack data into 16 bits. It may also be that this can happen at the same time as other 32-bit operations. I am speculating here, so I'll stop!
Are all integer data stored as 32-bit integer space on CPU/RAM?
No. The various integer datatypes have a specific size. However, you may encounter padding inside structs when you use char and short in particular.
Speed efficiency is not the only concern. Obviously you have storage benefits, as well as intrinsic behaviour (for example, I have written performance-specific code that exploits the integer wraparound of an unsigned short just so that I don't have to do any modulo). You also have the benefit of using specific data sizes for reading and writing binary data. There's probably more that I haven't mentioned, but you get the point =)
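A sketch of that wraparound trick (my own illustration; converting the promoted result back to an unsigned type is well defined, so no explicit modulo is needed):

#include <cstdint>

// A 16-bit counter that wraps "for free": the conversion back to
// uint16_t is defined to be reduction modulo 2^16.
std::uint16_t next_index(std::uint16_t i)
{
    return static_cast<std::uint16_t>(i + 1u);   // 65535 + 1 wraps to 0
}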

Usage of 'short' in C++

Why is it that for any numeric input we prefer an int rather than a short, even if the values involved are small?
The size of short is 2 bytes on my x86 and int is 4 bytes; shouldn't a short be better and faster to allocate than an int?
Or am I wrong in saying that short is not used?
CPUs are usually fastest when dealing with their "native" integer size. So even though a short may be smaller than an int, the int is probably closer to the native size of a register in your CPU, and therefore is likely to be the most efficient of the two.
In a typical 32-bit CPU architecture, to load a 32-bit value requires one bus cycle to load all the bits. Loading a 16-bit value requires one bus cycle to load the bits, plus throwing half of them away (this operation may still happen within one bus cycle).
A 16-bit short makes sense if you're keeping so many in memory (in a large array, for example) that the 50% reduction in size adds up to an appreciable reduction in memory overhead. They are not faster than 32-bit integers on modern processors, as Greg correctly pointed out.
In embedded systems, the short and unsigned short data types are used for accessing items that require less bits than the native integer.
For example, if my USB controller has 16-bit registers, and my processor has a native 32-bit integer, I would use an unsigned short to access the registers (provided that the unsigned short data type is 16 bits).
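For illustration, such a register access might be declared like this (the address and name are made up; real code would use the vendor's header):

#include <cstdint>

// Hypothetical 16-bit memory-mapped USB status register at an invented address.
volatile std::uint16_t* const usb_status =
    reinterpret_cast<volatile std::uint16_t*>(0x40000000u);

inline void clear_usb_status()
{
    *usb_status = 0;   // a genuine 16-bit write, matching the hardware register width
}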
Most of the advice from experienced users (see news:comp.lang.c++.moderated) is to use the native integer size unless a smaller data type must be used. The problem with using short to save memory is that the values may exceed the limits of short. Also, this may be a performance hit on some 32-bit processors, as they have to fetch 32 bits near the 16-bit variable and eliminate the unwanted 16 bits.
My advice is to work on the quality of your programs first, and only worry about optimization if it is warranted and you have extra time in your schedule.
Using type short does not guarantee that the actual values will be smaller than those of type int. It allows for them to be smaller, and ensures that they are no bigger. Note too that short must be larger than or equal in size to type char.
The original question above contains actual sizes for the processor in question, but when porting code to a new environment, one can only rely on weak relative assumptions without verifying the implementation-defined sizes.
The C header <stdint.h> -- or, from C++, <cstdint> -- defines types of specified size, such as uint8_t for an unsigned integral type exactly eight bits wide. Use these types when attempting to conform to an externally-specified format such as a network protocol or binary file format.
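For example, a header for some hypothetical binary format might be declared with fixed-width types so that every field has a known size (the struct and field names are invented; byte order and padding still need separate attention):

#include <cstdint>

// Hypothetical on-the-wire header: each field has an exact width,
// independent of the platform's int/short sizes. Endianness conversion
// (e.g. htons/htonl) and struct packing must still be handled explicitly.
struct PacketHeader {
    std::uint8_t  version;
    std::uint8_t  flags;
    std::uint16_t payload_length;
    std::uint32_t sequence_number;
};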
The short type is very useful if you have a big array full of them and int is just way too big.
Given that the array is big enough, the memory saving will be important (instead of just using an array of ints).
Unicode arrays are also often encoded in shorts (although other encoding schemes exist).
On embedded devices, space still matters and short might be very beneficial.
Last but not least, some transmission protocols insist on using 16-bit fields, so you still need shorts there.
Maybe we should consider it in different situations. For example, on x86 or x64 you should pick the most suitable type, not just default to int. In some cases int is faster than short. The first answer has already covered this.