I am just wondering whether differences will stay correct through an overflow. As an example, I am trying to use a Windows high-resolution timer with QueryPerformanceFrequency(&local).
The starting value of that counter is undefined. However, the interesting bit is only the difference from the starting point. So at the beginning you record the value, and then always look at the diff. Now if I can guarantee that the difference won't be larger than what a LARGE_INTEGER can hold, is this sufficient?
Say, e.g., one has 4 bits. That allows for 0...15. If the counter now starts at 14, and stops at 2, and I do 2 - 14, I should be getting 4, shouldn't I? So I needn't worry about an overflow as long as the difference is smaller than that?
Thanks
Since you are using a Windows-specific structure, your problem is easier: the code only needs to run on machines that support Windows. Windows requires two's-complement arithmetic, and two's-complement arithmetic behaves on overflow in the manner you expect (results are computed mod 2^n).
I'm not going to answer the general question but rather the specific one: do you need to worry about overflows from QueryPerformanceCounter?
If you have a performance counter that is incrementing at 4 GHz, it will take 73 years for a 63-bit signed integer to wrap around to a negative number. No need to worry about overflow.
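For reference, the usual pattern looks like this (a minimal sketch, assuming a Windows build environment with <windows.h> available); only the difference of the two QuadPart readings matters, not their absolute values:
#include <windows.h>
#include <iostream>

int main()
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // ticks per second
    QueryPerformanceCounter(&start);    // starting tick count

    // ... work to be timed ...

    QueryPerformanceCounter(&stop);
    double seconds = double(stop.QuadPart - start.QuadPart) / double(freq.QuadPart);
    std::cout << seconds << " s\n";
}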
On my computer at least, the definition of LARGE_INTEGER is:
typedef union _LARGE_INTEGER {
    struct {
        DWORD LowPart;
        LONG HighPart;
    };
    LONGLONG QuadPart;
} LARGE_INTEGER;
The tricky part is that all of those are signed. So if you have four bits, the range is [-8, 7]. Then if you start at 6, and stop at 0, you get a difference of -6.
However, if you cast the LONGLONG to an unsigned long long (either before or after the subtraction; either is fine), then you should get the correct answer. In the four-bit analogy, converting -6 to the unsigned type yields 10, which is the correct difference.
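A sketch of what that cast looks like in practice (the helper name elapsed_ticks is just for illustration; casting the operands or the result gives the same bit pattern on two's-complement hardware):
#include <windows.h>

unsigned long long elapsed_ticks(const LARGE_INTEGER& start, const LARGE_INTEGER& stop)
{
    // The subtraction may produce a negative LONGLONG if the counter wrapped;
    // converting to unsigned long long recovers the correct modular difference.
    return static_cast<unsigned long long>(stop.QuadPart - start.QuadPart);
}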
Using 2's complement (the way integers are represented in computers), you can add or subtract multiple numbers -- and the result will be correct as long as it fits in the number of bits allocated. The temporary results need not fit in the allocated number of bits.
So yes, if you use an integer of N bits, you'll get the correct result as long as the difference is less than 2^N.
Related
I'm using cout to print digits to the console. I am also storing values of up to 13+ billion in a single variable and doing computations on it. What data type should I use?
When I do the following:
int a = 6800000000;
cout << a;
It prints -1789934592.
thanks.
long long can hold up to 9223372036854775807. Use something like gmp if you need larger.
Use int64_t to guarantee a full 64 bits, so values like 13 billion won't overflow. It is available from stdint.h (cstdint in C++).
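For example (a minimal sketch), the snippet from the question works once the variable is 64 bits wide:
#include <cstdint>
#include <iostream>

int main()
{
    std::int64_t a = 6800000000LL;  // 64-bit variable; the LL suffix makes the literal's type explicit
    std::cout << a << '\n';         // prints 6800000000
}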
Just a note that both int64_t and long long are included in C99 and in C++0x, but not in the current version of C++. As such, using either does put your code at some risk of being non-portable. Realistically, however, that risk is probably already pretty low -- to the point that when/if you port your code, there are likely to be much bigger problems.
If, however, you really want to assure against that possibility, you might consider using a double precision floating point. Contrary to popular belief, floating point types can represent integers exactly up to a certain limit -- that limit being set (in essence) by the size of the mantissa in the F.P. type. The typical implementation of a double has a 53-bit mantissa, so you can represent 53-bit integers with absolute precision. That supports numbers up to 9,007,199,254,740,992 (which is substantially more than 13 of either of the popular meanings of "billion").
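A small demonstration of that limit (a sketch; 2^53 + 1 is the first integer an IEEE-754 double cannot represent):
#include <iomanip>
#include <iostream>

int main()
{
    double exact = 9007199254740992.0;    // 2^53, exactly representable
    double next = 9007199254740993.0;     // 2^53 + 1, rounds back to 2^53
    std::cout << std::setprecision(17) << exact << '\n' << next << '\n';
    // Both lines print 9007199254740992.
}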
Your data type (int) is too small to hold such large numbers. You should use a larger data type or one of the fixed size data types as given in the other answer (though you should really use uint64_t if you're not using negative numbers).
It's a good idea to understand the range limits of different sized types.
A 32-bit type (on most 32-bit platforms, both int and long are 32 bits) has the following ranges:
signed: -2,147,483,648 to 2,147,483,647
unsigned: 0 to 4,294,967,295
While 64-bit types (long long is typically 64 bits; on most 64-bit Unix platforms, long is also 64 bits) have the following ranges:
signed: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned: 0 to 18,446,744,073,709,551,615
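Rather than memorizing these, you can query the limits on your own platform; a quick sketch:
#include <iostream>
#include <limits>

int main()
{
    std::cout << "int:       " << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << '\n';
    std::cout << "long long: " << std::numeric_limits<long long>::min() << " to "
              << std::numeric_limits<long long>::max() << '\n';
    std::cout << "unsigned long long: 0 to "
              << std::numeric_limits<unsigned long long>::max() << '\n';
}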
Just use double in the declaration statement.
You could use a long int:
long int a;
Or if it's always going to be positive, an unsigned long int:
unsigned long int a;
See: http://forums.guru3d.com/showthread.php?t=131678
unsigned long long can be used.
I am using MinGW64 (with the -m64 flag) with Code::Blocks and would like to know how to perform 64-bit calculations without having to cast a really big number to int64_t before multiplying it. For example, this does not result in overflow:
int64_t test = int64_t(2123123123) * 17; //Returns 36093093091
Without the cast, the calculation overflows like such:
int64_t test = 2123123123 * 17; //Returns 1733354723
A VirusTotal scan confirms that my executable is x64.
Additional Information: OS is Windows 7 x64.
The default int type is still 32 bits even in 64-bit compilations, for compatibility reasons.
The "shortest" version I guess would be to add the ll suffix to the number
int64_t test = 2123123123ll * 17;
Another way would be to store the numbers in their own variables of type int64_t (or long long) and multiply the variables. It's usually rare anyway for a program to have many "magic numbers" hard-coded into the codebase.
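A sketch of that second approach (the variable names are made up for illustration):
#include <cstdint>

int main()
{
    const std::int64_t base = 2123123123;   // stored as 64 bits from the start
    const std::int64_t factor = 17;
    std::int64_t test = base * factor;      // 64-bit multiplication: 36093093091
    (void)test;                             // silence unused-variable warnings
}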
Some background:
Once upon a time, most computers had 8-bit arithmetic logic units and a 16-bit address bus. We called them 8-bit computers.
One of the first things we learned was that no real-world arithmetic problem can be expressed in 8-bits. It's like trying to reason about space flight with the arithmetic abilities of a chimpanzee. So we learned to write multi-word add, multiply, subtract and divide sequences. Because in most real-world problems, the numerical domain of the problem was bigger than 255.
Then we briefly had 16-bit computers (where the same problem applied, 65535 is just not enough to model things) and then quite quickly, 32-bit arithmetic logic built into chips. Gradually, the address bus caught up (20 bits, 24 bits, 32 bits if designers were feeling extravagant).
Then an interesting thing happened. Most of us didn't need to write multi-word arithmetic sequences any more. It turns out that most(tm) real world integer problems could be expressed in 32 bits (up to 4 billion).
Then we started producing more data at a faster rate than ever before, and we perceived the need to address more memory. The 64-bit computer eventually became the norm.
But still, most real-world integer arithmetic problems could be expressed in 32 bits. 4 billion is a big (enough) number for most things.
So, presumably through statistical analysis, your compiler writers decided that on your platform, the most useful size for an int would be 32 bits. Any smaller would be inefficient for 32-bit arithmetic (which we have needed from day 1) and any larger would waste space/registers/memory/cpu cycles.
Expressing an integer literal in c++ (and c) yields an int - the natural arithmetic size for the environment. In the present day, that is almost always a 32-bit value.
The c++ specification says that multiplying two ints yields an int. If it didn't then multiplying two ints would need to yield a long. But then what would multiplying two longs yield? A long long? Ok, that's possible. Now what if we multiply those? A long long long long?
So that's that.
int64_t x = 1 * 2; will do the following:
take the integer (32 bits) of value 1.
take the integer (32 bits) of value 2.
multiply them together, storing the result in an integer. If the arithmetic overflows, so be it. That's your lookout.
convert the resulting integer (whatever that may now be) to int64_t (on your system, probably a long long int).
So in a nutshell, no. There is no shortcut to spelling out the type of at least one of the operands in the code snippet in the question. You can, of course, specify a literal. But there is no guarantee that a long long (the LL literal suffix) on your system is the same as int64_t. If you want an int64_t, and you want the code to be portable, you must spell it out.
For what it's worth:
In a post-C++11 world all the worrying about extra keystrokes and non-DRYness can disappear:
definitely an int64_t:
auto test = int64_t(2123123123) * 17;
definitely a long long:
auto test = 2'123'123'123LL * 17;
definitely an int64_t, definitely initialised with a (possibly narrowing, but that's ok) long long:
auto test = int64_t(36'093'093'091LL);
Since you're most likely in an LP64 or LLP64 environment, where int is only 32 bits, you have to be careful about literal constants in expressions. The easiest way to do this is to get into the habit of using the proper suffix on literal constants, so you would write the above as:
int64_t test = 2123123123LL * 17LL;
2123123123 is an int (usually 32 bits).
Add an L to make it a long: 2123123123L (a long is usually 32 or 64 bits; it may still be only 32 bits even in 64-bit mode).
Add another L to make it a long long: 2123123123LL (64 bits or more starting with C++11).
Note that you only need to add the suffix to constants that exceed the size of an int. Integral conversion will take care of producing the right result*.
(2123123123LL * 17) // 17 is automatically converted to long long, the result is long long
* But beware: even if individual constants in an expression fit into an int, the whole operation can still overflow like in
(1024 * 1024 * 1024 * 10)
In that case you should make sure the arithmetic is performed at sufficient width (taking operator precedence into account):
(1024LL * 1024 * 1024 * 10)
- will perform all 3 operations in 64 bits, with a 64-bit result.
Edit: Literal constants (A.K.A. magic numbers) are frowned upon, so the best way to do it would be to use symbolic constants (const int64_t value = 5). See What is a magic number, and why is it bad? for more info. It's best that you don't read the rest of this answer, unless you really want to use magic numbers for some strange reason.
Also, you can use intptr_t and uintptr_t from #include <cstdint> to let the compiler choose whether to use int or __int64.
For those who stumble upon this question, `LL` at the end of a number can do the trick, but it isn't recommended, as Richard Hodges told me that `long long` may not always be 64 bits, and can increase in size in the future, although it's not likely. See Richard Hodge's answer and the comments on it for more information.
The reliable way would be to put `using QW = int64_t;` at the top and use `QW(5)` instead of `5LL`.
Personally I think there should be an option to define all literals as 64 bit without having to add any suffixes or functions to them, and to use `int32_t(5)` when necessary, because some programs are unaffected by this change; for example, programs that only use numbers for normal calculations instead of relying on integer overflow to do their work. The problem is going from 64 bit to 32 bit, rather than from 32 to 64, as the upper 4 bytes are cut off.
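A sketch of that alias in use (the name QW is just the one chosen in this answer):
#include <cstdint>

using QW = std::int64_t;

int main()
{
    auto test = QW(2123123123) * 17;  // the whole expression is evaluated as int64_t
    (void)test;
}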
I have read that the int range (signed) is [−32767, +32767],
but I can say, for example
int a=70000;
int b=71000;
int c=a+b;
printf("%i", c);
return 0;
And the output is 141000 (correct). Should not the debugger tell me
"this operation is out of range" or something similar?
I suppose that this has to do with me ignoring the basics of C programming, but none of the books that I'm currently reading say anything about this "issue".
EDIT:
2147483647 seems to be the upper limit, thank you. If a sum exceeds that number, the result is negative, which is expected, BUT if it is a subtraction, for example 2147483649 - 2147483647 = 2, the result is still good. I mean, why is the value 2147483649 correctly held for that subtraction purpose (or at least it seems to be)?
The range [−32767, +32767] is the required minimum range. An implementation is allowed to provide a larger range.
All types are compiler-dependent. int used to be the "native word" of the underlying hardware, which on 16-bit systems meant that int was 16 bits (which leads to the -32k to +32k range). When 32-bit systems started coming then int naturally followed along and became 32 bits, which can store values around -2 billion to +2 billion.
However this "native word" use for int didn't follow along when 64-bit systems came around, I know of no 64-bit system or compiler that have int being 64 bits.
See e.g. this reference of integer types for more information.
In C++, int is at least 16 bits wide, but typically 32 bits on modern hardware. You can print INT_MIN and INT_MAX and check for yourself.
Note that signed integer overflow is undefined behavior, you are not guaranteed to get a warning, except perhaps with high compiler warnings and debug mode.
You have misunderstood. The standard guarantees that an int can hold at least [-32767, +32767], but it is permitted to hold more. (In particular, nearly every compiler you are likely to use allows a range of [-2147483648, 2147483647].)
There is another problem. If you make the value you assign to a and b bigger you still probably won't get any warning or error. Integer overflow causes "undefined behaviour", and literally anything is allowed to happen.
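Because the behavior is undefined rather than merely wrong, the only portable option is to check before the operation happens; a minimal sketch:
#include <climits>
#include <iostream>

int main()
{
    int a = 2000000000;
    int b = 2000000000;
    if (b > 0 && a > INT_MAX - b)         // would a + b overflow?
        std::cout << "a + b would overflow int\n";
    else
        std::cout << a + b << '\n';
}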
If an int is four bytes, the unsigned maximum is 4294967295, the signed maximum is 2147483647 and the signed minimum is -2147483648. You can derive them yourself (assuming two's complement):
unsigned int ui = ~0;   // -1 converted to unsigned: all bits set, i.e. UINT_MAX
int max = ui >> 1;      // 0xFFFFFFFF >> 1 = 0x7FFFFFFF = INT_MAX
int min = ~max;         // bitwise complement of INT_MAX = INT_MIN on two's complement
int size = sizeof(max); // number of bytes in an int
While the standard only guarantees int to be at least 16 bits, it is usually implemented as a 32-bit value.
The size of an int (and the maximum value it can hold) depends on the compiler and the computer you are using. There is no guarantee that it will have 2 bytes or 4 bytes, but there are guaranteed minimum sizes for the C++ types.
You can see a list of minimum sizes for c++ types in this page: http://www.cplusplus.com/doc/tutorial/variables/
I'm investigating a standard for my team around using size_t vs int (or long, etc). The biggest drawback I've seen pointed out is that taking the difference of two size_t objects can cause problems (I'm unsure of the specific problems; maybe something isn't two's complement and the signed/unsigned mismatch angers the compiler). I wrote a quick program in C++ using the V120 VS2013 compiler that allowed me to do the following:
#include <iostream>

int main()
{
    size_t a = 10;
    size_t b = 100;
    int result = a - b;
    std::cout << result << '\n';  // prints -90 here
}
The program resulted in -90, which, although correct, makes me nervous about type mismatches, signed/unsigned problems, or just plain undefined behavior if the size_t happens to get used in complex math.
My question is whether it's safe to do math with size_t objects, specifically taking the difference. I'm considering using size_t as a standard for things like indexes. I've seen some interesting posts on the topic here, but they don't address the math issue (or I missed it).
What type for subtracting 2 size_t's?
typedef for a signed type that can contain a size_t?
This is not guaranteed to work portably, but is not UB either. The code must run without error, but the resulting int value is implementation-defined. So as long as you are working on platforms that guarantee the desired behavior, this is fine (as long as the difference can be represented by an int, of course); otherwise, just use signed types everywhere (see the last paragraph).
Subtracting two std::size_ts will yield a new std::size_t† and its value will be determined by wrapping. In your example, assuming a 64-bit size_t, a - b will equal 18446744073709551526. This does not fit into a (commonly used 32-bit) int, so an implementation-defined value is assigned to result.
To be honest, I would recommend to not use unsigned integers for anything but bit magic. Several members of the standard committee agree with me: https://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything 9:50, 42:40, 1:02:50
Rule of thumb (paraphrasing Chandler Carruth from the above video): If you could count it yourself, use int, otherwise use std::int64_t.
†Unless its conversion rank is less than int, e.g. if std::size_t is unsigned short. In that case, the result is an int and everything will work fine (unless int is not wider than short). However:
I do not know of any platform that does this.
This would still be platform-specific; see the first paragraph.
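If you do need a signed difference of two sizes, converting the operands before subtracting makes the intent explicit; a sketch, assuming the real difference fits in std::ptrdiff_t:
#include <cstddef>
#include <iostream>

int main()
{
    std::size_t a = 10;
    std::size_t b = 100;
    // Convert first so the subtraction itself is done in a signed type.
    std::ptrdiff_t diff = static_cast<std::ptrdiff_t>(a) - static_cast<std::ptrdiff_t>(b);
    std::cout << diff << '\n';  // -90
}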
The size_t type is unsigned. The subtraction of any two size_t values has defined behavior.
However, firstly, if a larger value is subtracted from a smaller one, the result wraps around: it is the mathematical value, reduced to the smallest non-negative residue modulo SIZE_MAX + 1, so the numeric value depends on the implementation's size_t. For instance, if the largest value of size_t is 65535, and the mathematical result of subtracting two size_t values is -3, then the actual result will be 65536 - 3 = 65533. On a different compiler or machine with a different size_t, the numeric value will be different.
Secondly, a size_t value might be out of range of the type int. If that is the case, we get a second implementation-defined result arising from the forced conversion. In this situation, any behavior can apply; it just has to be documented by the implementation, and the conversion must not fail. For instance, the result could be clamped into the int range, producing INT_MAX. A common behavior seen on two's complement machines (virtually all) in the conversion of wider (or equal width) unsigned types to narrower signed types is simple bit truncation: enough bits are taken from the unsigned value to fill the signed value, including its sign bit.
Because of the way two's complement works, if the original arithmetically correct abstract result itself fits into int, then the conversion will produce that result.
For instance, suppose that the subtraction of a pair of 64-bit size_t values on a two's complement machine yields the abstract arithmetic value -3, which becomes the positive value 0xFFFFFFFFFFFFFFFD. When this is coerced into a 32-bit int, then the common behavior seen in many compilers for two's complement machines is that the lower 32 bits are taken as the image of the resulting int: 0xFFFFFFFD. And, of course, that is just the value -3 in 32 bits.
So the upshot is, that your code is de facto quite portable because virtually all mainstream machines are two's complement with conversion rules based on sign extension and bit truncation, including between signed and unsigned.
Except that sign extension doesn't occur when a value is widened while converting from unsigned to signed. Thus the one problem is the rare situation in which int is wider than size_t. If a 16-bit size_t result is 65533, due to 4 being subtracted from 1, this will not produce a -3 when converted to a 32-bit int; it will produce 65533!
If you don't use size_t, you are screwed: size_t is the one type that exists to be used for memory sizes, and which is consequently guaranteed to always be big enough for that purpose. (uintptr_t is quite similar, but it's neither the first such type, nor is it used by the standard libraries, nor is it available without including stdint.h.) If you use an int, you can get undefined behavior when your allocations exceed 2GiB of address space (or 32kiB if you are on a platform where int is only 16 bits!), even though the machine has more memory and you are executing in 64 bit mode.
If you need a difference of size_t that may become negative, use the signed variant ssize_t.
This question already has answers here:
Why is unsigned integer overflow defined behavior but signed integer overflow isn't?
I just simply wanted to know, who is responsible to deal with mathematical overflow cases in a computer ?
For example, in the following C++ code:
short x = 32768;
std::cout << x;
Compiling and running this code on my machine gave me a result of -32767
A "short" variable's size is 2 bytes .. and we know 2 bytes can hold a maximum decimal value of 32767 (if signed) .. so when I assigned 32768 to x .. after exceeding its max value 32767 .. It started counting from -32767 all over again to 32767 and so on ..
What exactly happened so the value -32767 was given in this case ?
i.e. what are the binary calculations done in the background that resulted in this value?
So, who decided that this happens ? I mean who is responsible to decide that when a mathematical overflow happens in my program .. the value of the variable simply starts again from its min value, or an exception is thrown for example, or the program simply freezes .. etc ?
Is it the language standard, the compiler, my OS, my CPU, or who is it ?
And how does it deal with that overflow situation ? (Simple explanation or a link explaining it in details would be appreciated :) )
And btw, pls .. Also, who decides what the size of a 'short int', for example, on my machine would be? Again, is it the language standard, compiler, OS, CPU .. etc?
Thanks in advance! :)
Edit:
Ok so I understood from here : Why is unsigned integer overflow defined behavior but signed integer overflow isn't?
that It's the processor who defines what happens in an overflow situation (like for example in my machine it started from -32767 all over again), depending on "representations for signed values" of the processor, ie. is it sign magnitude, one's complement or two's complement ...
is that right ?
and in my case (When the result given was like starting from the min value -32767 again.. how do you suppose my CPU is representing the signed values, and how did the value -32767 for example come up (again, binary calculations that lead to this, pls :) ? )
It doesn't start at its min value per se. It just truncates its value, so for a 4-bit number, you can count up to 1111 (binary, = 15 decimal). If you increment by one, you get 10000, but there is no room for that, so the first digit is dropped and 0000 remains. If you calculate 1111 + 10 (both binary), you get 1.
You can add them up as you would on paper:
1111
0010
---- +
10001
But instead of adding up the entire number, the processor will just add up until it reaches (in this case) 4 bits. After that, there is no more room to add, but if there is still a 1 to 'carry', it sets the overflow flag, so you can check whether the last addition it did overflowed.
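Portable C++ has no way to read that flag directly, but GCC and Clang expose checked-arithmetic builtins that compile down to exactly this kind of check; a sketch (the builtin is compiler-specific, not standard C++):
#include <iostream>

int main()
{
    int a = 2000000000, b = 2000000000, sum;
    // Returns true if the mathematically correct sum does not fit in an int;
    // sum receives the wrapped two's-complement result either way.
    if (__builtin_add_overflow(a, b, &sum))
        std::cout << "overflow detected\n";
    else
        std::cout << sum << '\n';
}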
Processors have basic instructions to add up numbers, and they have those for smaller and larger values. A 64-bit processor can add up 64-bit numbers (usually they don't add two separate numbers, but add a second number to the first, modifying the first, but that's not really important for the story).
But apart from 64 bits, they often can also add up 32, 16 and 8 bit numbers. That's partly because it can be efficient to add up only 8 bits if you don't need more, but also sometimes to be backwards compatible with older programs for a previous version of a processor which could add up to 32 bits but not 64 bits.
Such a program uses an instruction to add up 32 bits numbers, and the same instruction must also exist on the 64 bit processor, with the same behavior if there is an overflow, otherwise the program wouldn't be able to run properly on the newer processor.
Apart from adding up using the core instructions of the processor, you could also add up in software. You could make an inc function that treats a big chunk of bits as a single value. To increment it, you let the processor increment the first 64 bits. The result is stored in the first part of your chunk. If the overflow flag is set in the processor, you take the next 64 bits and increment those too. This way, you can extend the limitation of the processor and handle large numbers in software.
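A sketch of that software-extension idea, building a 128-bit counter out of two 64-bit words and detecting the 'carry' by checking for wrap-around (the type and function names are invented for illustration):
#include <cstdint>

struct Counter128 {
    std::uint64_t low = 0;
    std::uint64_t high = 0;
};

void increment(Counter128& c)
{
    ++c.low;
    if (c.low == 0)   // the low word wrapped around, so carry into the high word
        ++c.high;
}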
And same goes for the way an overflow is handled. The processor just sets the flag. Your application can decide whether to act on it or not. If you want to have a counter that just increments to 65535 and then wraps to 0, you (your program) don't need to do anything with the flag.