I have a few questions about divide overflow errors on x86 or x86_64 architecture. Lately I've been reading about integer overflows. Usually, when an arithmetic operation results in an integer overflow, the carry bit or overflow bit in the FLAGS register is set. But apparently, according to this article, overflows resulting from division operations don't set the overflow bit, but rather trigger a hardware exception, similar to when you divide by zero.
Now, integer overflows resulting from division are a lot rarer than, say, those from multiplication. There are only a few ways to even trigger a division overflow. One way would be to do something like:
int16_t a = -32768;
int16_t b = -1;
int16_t c = a / b;
In this case, due to the two's complement representation of signed integers, you can't represent positive 32768 in a signed 16-bit integer, so the division operation overflows, resulting in the erroneous value of -32768.
A few questions:
1) Contrary to what this article says, the above did NOT cause a hardware exception. I'm using an x86_64 machine running Linux, and when I divide by zero the program terminates with a Floating point exception. But when I cause a division overflow, the program continues as usual, silently ignoring the erroneous quotient. So why doesn't this cause a hardware exception?
2) Why are division errors treated so severely by the hardware, as opposed to other arithmetic overflows? Why should a multiplication overflow (which is much more likely to accidentally occur) be silently ignored by the hardware, but a division overflow is supposed to trigger a fatal interrupt?
=========== EDIT ==============
Okay, thanks everyone for the responses. I've gotten responses saying basically that the above 16-bit integer division shouldn't cause a hardware fault because the quotient is still less than the register size. I don't understand this. In this case, the register storing the quotient is 16-bit - which is too small to store signed positive 32768. So why isn't a hardware exception raised?
Okay, let's do this directly in GCC inline assembly and see what happens:
int16_t a = -32768;
int16_t b = -1;
__asm__
(
    "xorw %%dx, %%dx;"    // Clear the DX register (upper-bits of dividend)
    "movw %1, %%ax;"      // Load lower bits of dividend into AX
    "movw %2, %%bx;"      // Load the divisor into BX
    "idivw %%bx;"         // Divide a / b (quotient is stored in AX)
    "movw %%ax, %0;"      // Copy the quotient into 'b'
    : "=rm"(b)            // Output list
    : "ir"(a), "rm"(b)    // Input list
    : "%ax", "%dx", "%bx" // Clobbered registers
);
printf("%d\n", b);
This simply outputs an erroneous value: -32768. Still no hardware exception, even though the register storing the quotient (AX) is too small to fit the quotient. So I don't understand why no hardware fault is raised here.
In the C language, arithmetic operations are never performed in types smaller than int. Any time you attempt arithmetic on smaller operands, they are first subjected to the integral promotions, which convert them to int. If on your platform int is, say, 32 bits wide, then there's no way to force a C program to perform 16-bit division; the compiler will generate 32-bit division instead. This is probably why your C experiment does not produce the expected overflow on division. If your platform does indeed have a 32-bit int, then your best bet is to try the same thing with 32-bit operands (i.e. divide INT_MIN by -1). I'm pretty sure that way you'll eventually be able to reproduce the overflow exception even in C code.
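To make the promotion visible, here is a small sketch of my own (not from the original answer) showing that the 16-bit operands are divided as ints, so the quotient 32768 is perfectly representable and only the conversion back to int16_t loses it:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int16_t a = -32768;
    int16_t b = -1;

    int q = a / b;      // both operands promoted to int; quotient 32768 fits, no fault
    printf("%d\n", q);  // prints 32768

    int16_t c = a / b;  // 32768 converted back to int16_t: implementation-defined,
    printf("%d\n", c);  // typically prints -32768
    return 0;
}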
In your assembly code you are using 16-bit division, since you specified BX as the operand for idiv. 16-bit division on x86 divides the 32-bit dividend stored in DX:AX pair by the idiv operand. This is what you are doing in your code. The DX:AX pair is interpreted as one composite 32-bit register, meaning that the sign bit in this pair is now actually the highest-order bit of DX. The highest-order bit of AX is not a sign bit anymore.
And what did you do with DX? You simply cleared it. You set it to 0. But with DX set to 0, your dividend is interpreted as positive! From the machine's point of view, such a DX:AX pair actually represents the positive value +32768. I.e. in your assembly-language experiment you are dividing +32768 by -1. And the result is -32768, as it should be. Nothing unusual here.
If you want to represent -32768 in the DX:AX pair, you have to sign-extend it, i.e. you have to fill DX with all-one bit pattern, instead of zeros. Instead of doing xor DX, DX you should have initialized AX with your -32768 and then done cwd. That would have sign-extended AX into DX.
For example, in my experiment (not GCC) this code
__asm {
    mov AX, -32768   ; low half of the dividend
    cwd              ; sign-extend AX into DX, so DX:AX = -32768
    mov BX, -1       ; the divisor
    idiv BX          ; DX:AX / BX -- raises the divide error
}
causes the expected exception, because it does indeed attempt to divide -32768 by -1.
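For anyone who wants to try the same thing from GCC instead of MSVC inline assembly, here is an untested sketch of a GNU-style equivalent (the constraint choices are my own assumptions, not taken from the question's snippet); the key difference from the original code is the cwd, which makes DX:AX really hold -32768. On x86_64 Linux the resulting divide error shows up as SIGFPE:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int16_t a = -32768;
    int16_t b = -1;
    int16_t q;

    __asm__ volatile
    (
        "movw %1, %%ax;"  // load the 16-bit dividend into AX
        "cwd;"            // sign-extend AX into DX, so DX:AX == -32768
        "movw %2, %%bx;"  // load the divisor into BX
        "idivw %%bx;"     // DX:AX / BX -- expected to raise the divide error here
        "movw %%ax, %0;"  // not reached if the fault fires
        : "=rm"(q)
        : "rm"(a), "rm"(b)
        : "ax", "dx", "bx", "cc"
    );

    printf("%d\n", q);
    return 0;
}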
When you get an integer overflow with integer 2's complement add/subtract/multiply you still have a valid result - it's just missing some high order bits. This behaviour is often useful, so it would not be appropriate to generate an exception for this.
With integer division, however, the result of a divide by zero is useless (since, unlike floating point, 2's complement integers have no INF representation).
Contrary to what this article says, the above did NOT cause a hardware exception
The article did not say that. It says:
... they generate a division error if the source operand (divisor) is zero or if the quotient is too large for the designated register
The register size is definitely greater than 16 bits (32 or 64).
From the relevant section on integer overflow:
Unlike the add, mul, and imul instructions, the Intel division instructions div and idiv do not set the overflow flag; they generate a division error if the source operand (divisor) is zero or if the quotient is too large for the designated register.
The size of a register on a modern platform is either 32 or 64 bits; 32768 will fit into one of those registers. However, the following code will very likely throw an integer overflow exception (it does on my Core Duo laptop with VC8):
int x= INT_MIN;
int y= -1;
int z= x/y;
The reason your example did not generate a hardware exception is due to C's integer promotion rules. Operands smaller than int get automatically promoted to ints before the operation is performed.
As to why different kinds of overflows are handled differently, consider that at the x86 machine level, there's really no such thing as a multiplication overflow. When you multiply AX by some other register, the result goes in the DX:AX pair, so there is always room for the result, and thus no occasion to signal an overflow exception. However, in C and other languages, the product of two ints is supposed to fit in an int, so there is such a thing as overflow at the C level. The x86 does sometimes set OF (overflow flag) on MUL, but it just means that the high part of the result is non-zero.
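To illustrate the "result always has room" point in C terms, here is a small sketch of my own (not from the answer) that mirrors what a widening MUL does by multiplying into a type twice as wide:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t a = 0xFFFFFFFFu, b = 0xFFFFFFFFu;

    uint64_t full = (uint64_t)a * b;  // full 64-bit product, like EDX:EAX after a 32-bit MUL
    uint32_t low  = a * b;            // only the low half -- "overflow" at the C level

    printf("full = %llx, low = %x\n", (unsigned long long)full, (unsigned)low);
    return 0;
}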
On an implementation with 32-bit int, your example does not result in a divide overflow. It results in a perfectly representable int, 32768, which then gets converted to int16_t in an implementation-defined manner when you make the assignment. This is due to the default promotions specified by the C language, and as a result, an implementation which raised an exception here would be non-conformant.
If you want to try to cause an exception (which still may or may not actually happen, it's up to the implementation), try:
int a = INT_MIN, b = -1, c = a/b;
You might have to do some tricks to prevent the compiler from optimizing it out at compile-time.
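One common trick is to make the operands volatile so the compiler cannot fold the division at compile time; this is a sketch of my own, and whether it actually traps is still up to the implementation, as noted above:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    volatile int a = INT_MIN;  // volatile keeps the values out of reach of constant folding
    volatile int b = -1;
    volatile int c = a / b;    // on x86/x86_64 this typically raises SIGFPE
    printf("%d\n", c);
    return 0;
}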
I would conjecture that on some old computers, attempting to divide by zero would cause some severe problems (e.g. put the hardware into an endless cycle of trying to subtract enough so the remainder would become less than the divisor, until an operator came along to fix things), and this started a tradition of divide overflows being regarded as more severe faults than integer overflows.
From a programming standpoint, there's no reason that an unexpected divide overflow should be any more or less serious than an unexpected integer overflow (signed or unsigned). Given the cost of division, the marginal cost of checking an overflow flag afterward would be pretty slight. Tradition is the only reason I can see for having a hardware trap.
Related
#include <iostream>
using namespace std;

int main() {
    unsigned int num1 = 0x65764321;
    unsigned int num2 = 0x23657432;
    unsigned int sum = num1 + num2;
    cout << hex << sum;
    return 0;
}
If I have two unsigned integers, say num1 and num2, and then I tell the computer to do unsigned int sum = num1 + num2;, what method does the computer use to add them? Would it be two's complement? Would the sum variable be printed in two's complement?
2's complement addition is identical to unsigned addition as far as the actual bits are concerned. In the actual hardware, the design will be something more elaborate, like a carry-lookahead adder (https://en.wikipedia.org/wiki/Carry-lookahead_adder), so it can be low latency (not having to wait for the carry to ripple across 32 or 64 bits, because that's too many gate delays for add to be single-cycle latency).
One's complement and sign/magnitude are the other signed-integer representations that C++ allows implementations to use, and their wrap-around behaviour is different from unsigned.
For example, one's complement addition has to wrap the carry-out back into the low bit. See this article about optimizing TCP checksum calculation for how you implement one's complement addition on hardware that only provide 2's complement / unsigned addition. (Specifically x86).
C++ leaves signed overflow as undefined behaviour, but real one's complement and sign/magnitude hardware does have specific documented behaviour. reinterpret_casting an unsigned bit pattern to a signed integer gives a result that depends on what kind of hardware you're running on. (All modern hardware is 2's complement, though.)
Since the bitwise operation is the same for unsigned or 2's complement, it's all about how you interpret the results. On CPU architectures like x86 that set flags based on the results of an instruction, the overflow flag is only relevant for the signed interpretation, and the carry flag is only relevant for the unsigned interpretation. The hardware produces both from a single instruction, instead of having separate signed/unsigned add instructions that do the same thing.
See http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt for a great write-up about unsigned carry vs. signed overflow, and x86 flags.
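As a quick illustration of "same bits, different interpretation" (my own example, not from the linked write-up):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t a = 0x7FFFFFFFu;  // INT32_MAX when viewed as signed
    uint32_t b = 1u;
    uint32_t sum = a + b;      // bit pattern 0x80000000

    // Unsigned view: nothing wrapped, so the ADD would leave CF = 0.
    printf("unsigned: %u\n", sum);

    // Signed view of the same bits: the largest positive value flipped to the most
    // negative one, so the same ADD would set OF = 1.
    printf("signed:   %d\n", (int32_t)sum);
    return 0;
}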
On other architectures, like MIPS, there is no FLAGS register. You have to use a compare or test instruction to figure out what happened (carry or zero or whatever). The add instruction doesn't set flags. See this MIPS Q&A about add-with-carry for a 64-bit add on 32-bit MIPS.
But for detecting signed overflow, MIPS add raises an exception on overflow (where x86 would just set OF), so you use addu for signed or unsigned addition when you don't want a fault on signed overflow.
Now, the overflow flag here is 1 (it's an example given by our instructor), meaning there is overflow, but there is no carry. So how can there be overflow here?
You have a C++ program, not an x86 assembly language program! C++ doesn't have a carry or overflow flag.
If you compiled this program for x86 with a non-optimizing compiler, and it used the ADD instruction with your two inputs, you would get OF=1 and CF=0 from that ADD instruction.
But the compiler might use lea edi, [rax+rdx] to do the sum without overwriting either input, and LEA doesn't set flags.
Or if the compiler did the addition at compile time, your code would compile the same as source like this:
cout << hex << 0x88dbb753U;
and no addition of your numbers would take place at run-time. (There will of course be lots of addition in the iostream library functions, and maybe even an add instruction in main() as part of making a stack frame, if your compiler chooses to emit code that sets up a stack frame.)
I have two unsigned integers
What method does the computer use to add them
Whatever method is available on the target CPU architecture. Most have an instruction named ADD.
Would the sum variable be printed in two's complement.
Two's complement is a way to represent an integer type in binary. It is not a way to print numbers.
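To make the "representation vs. printing" distinction concrete, here is a small sketch of my own (plain C for brevity) that prints the same stored bits three ways:
#include <stdio.h>

int main(void)
{
    unsigned int sum = 0x88dbb753u;  // the value computed in the example above

    printf("%x\n", sum);        // the hex digits of the stored bit pattern
    printf("%u\n", sum);        // the same bits read as an unsigned decimal number
    printf("%d\n", (int)sum);   // the same bits reinterpreted as signed (prints a negative value)
    return 0;
}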
I don't doubt the need to check for division by zero. I've never heard of checking for division by negative one though!
if (*y == 0)
    return 0; // undefined
else
    return *x / *y;
x, y are pointers to int32_t, I include this detail in case of relevance.
At runtime, if *x==0x80000000, *y==0xffffffff, I get the error (in Xcode):
EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
All I can find online is suggestions that it is division by zero, but as you can see from the check above, and I can see from the debug window, that is not the case here.
What does the error mean, and how can I fix it?
2's complement representation is asymmetric: there is one more negative number than positive number, and that negative number has no positive counterpart. Consequently, negating MIN_INT is an integer overflow (where MIN_INT is the value whose only 1-bit is the sign bit, 0x80000000 for 32-bit integers).
MIN_INT / -1 is, therefore, also an arithmetic overflow. Unlike overflow from subtraction (which is rarely checked), overflow when dividing can cause a trap, and apparently that's what is happening in your case.
And, yes, technically speaking you should check for the MIN_INT / -1 overflow case before dividing, because the result is undefined.
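A hedged sketch of such a guard, using the question's int32_t pointers (the safe_div name and the fallback return values are my own choices, not part of the original code):
#include <stdint.h>

static int32_t safe_div(const int32_t *x, const int32_t *y)
{
    if (*y == 0)
        return 0;              // division by zero: pick some fallback
    if (*x == INT32_MIN && *y == -1)
        return INT32_MIN;      // overflow case: pick some fallback
    return *x / *y;            // now well-defined
}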
Note: In the common case of Intel x64 architecture, division overflow does trap, exactly the same as division by 0. Confusingly, the corresponding Posix signal is SIGFPE which is normally thought of as "Floating Point Exception", although there is no floating point in sight. The current Posix standard actually glosses SIGFPE as meaning "Erroneous Arithmetic Operation".
Disclaimer: I know that unsigned integers are primitive numeric types that don't technically overflow; I'm using the term "overflow" for all the primitive numeric types in general here.
In C or C++, according to the standard or to a particular implementation, are there primitive numeric types where, given an arithmetic operation that could overflow, I can save the result plus the part that overflows?
Even if this sounds strange, my idea is that the registers on modern CPUs are usually much larger than a 32-bit float or a 64-bit uint64_t, so there is the potential to actually control the overflow and store it somewhere.
No, the registers are not "usually much larger than a 64 bit uint64_t".
There's an overflow flag, and for a limited number of operations (addition and subtraction), pairing this single additional bit with the result is enough to capture the entire range of outcomes.
But in general, you'd need to cast to a larger type (potentially implemented in software) to handle results that overflow the type of your input.
Any operations that do this sort of thing (for example, some 32-bit processors have a widening 32x32 multiply instruction that produces a 64-bit result as a 32-bit high half and a 32-bit low half) will be provided by your compiler as intrinsic functions, or via inline assembly.
Look, I found a 64-bit version of that, named Multiply128, and the matching __mul128 intrinsic available in Visual C++.
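As another concrete example of such compiler support (a GCC/Clang extension, not standard C, and not mentioned in the original answer), the __builtin_add_overflow family hands back the wrapped result together with an overflow flag:
#include <stdio.h>

int main(void)
{
    unsigned a = 4000000000u, b = 1000000000u, sum;

    // __builtin_add_overflow stores the (wrapped) result and returns true on overflow.
    if (__builtin_add_overflow(a, b, &sum))
        printf("overflowed, wrapped result = %u\n", sum);
    else
        printf("sum = %u\n", sum);
    return 0;
}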
See @Ben Voigt's answer about larger registers.
There really may only be an overflow bit that could help you.
Another approach, without resorting to wider integers, is to test overflow yourself:
unsigned a, b, sum;
sum = a + b;
if (sum < a) {
    OverflowDetected(); // mathematical result is `sum` + UINT_MAX + 1
}
Similar approach for int.
The following can likely be simplified; I just don't have a simpler version at hand.
[Edit]
My approach below has potential UB. For a better way to detect int overflow, see Simpler method to detect int overflow.
int a, b, sum;
sum = a + b;  // note: this signed addition is itself the potential UB
// out-of-range only possible when the signs are the same.
if ((a < 0) == (b < 0)) {
    if (a < 0) {
        if (sum > b) UnderflowDetected();
    }
    else {
        if (sum < b) OverflowDetected();
    }
}
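For comparison, here is a sketch of a pre-check that avoids the UB entirely (the helper name is my own): it tests whether the signed addition would overflow before performing it:
#include <limits.h>

// Returns 1 if a + b would overflow or underflow the range of int.
static int add_would_overflow(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return 1;  // would overflow
    if (b < 0 && a < INT_MIN - b) return 1;  // would underflow
    return 0;
}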
For floating point types, you can actually 'control' the overflow on the x86 platform. With the functions _control87, _controlfp, and __control87_2, you can get and set the floating-point control word. By default, the run-time libraries mask all floating-point exceptions; you can unmask them in your code, so when an overflow occurs, you'll get an exception. However, most code written today assumes that the floating-point exceptions are masked, so if you unmask them you may run into problems.
You can use these functions to get the status word.
For floating point types the results are well defined by the hardware, and you're not going to be able to get much control without working around C/C++.
For integer types smaller than int, they will be upsized to an int by any arithmetic operation. You will probably be able to detect overflow before you coerce the result back into the smaller type.
For addition and subtraction you can detect overflow by comparing the result to the inputs. Adding two positive integers will always yield a result larger than either of the inputs, unless there was overflow.
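As a sketch of the "detect before you coerce the result back" idea for small types (the helper name is my own, not from the answer):
#include <stdint.h>

// Add two int16_t values; returns 1 on success, 0 if the result would not fit.
static int add16_checked(int16_t a, int16_t b, int16_t *out)
{
    int wide = a + b;                          // operands promoted to int, cannot overflow
    if (wide < INT16_MIN || wide > INT16_MAX)
        return 0;                              // overflow detected before narrowing
    *out = (int16_t)wide;
    return 1;
}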
I believe that during arithmetic overflow (in the context of an integer variable being assigned a value too large for it to hold), bits beyond the end of the variable could be overwritten.
But in the following C++11 program does this really still hold? I don't know whether it's UB, or disallowed, or implementation-specific or what, but when I take the variable past its maximum value, on a modern architecture, will I actually see arithmetic overflow of bits in memory? Or is that really more of a historical thing?
#include <limits>

int main() {
    // (not unsigned; unsigned is defined to wrap around)
    int x = std::numeric_limits<int>::max();
    x++;
}
I don't know whether it's UB
It is undefined, as specified in C++11 5/4:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.
(As you say, it is defined for unsigned types, since they are defined by 3.9.1/4 to obey modular arithmetic)
on a modern architecture, will I actually see arithmetic overflow of bits in memory?
On all the modern architectures I know of (x86, ARM, 68000, and various DSPs), arithmetic is modular, with fixed-width 2s-complement results; on those architectures that can write the result to memory rather than registers, it will never overwrite more memory than the result size. For addition and subtraction, there is no difference to the CPU between signed and unsigned arithmetic. Overflow (signed or unsigned) can be detected from the state of CPU flags after the operation.
I could imagine a compiler for, say, a 32-bit DSP that tried to implement arithmetic on 8 or 16-bit values packed into a larger word, where overflow would affect the rest of the word; however, all compilers I've seen for such architectures just defined char, short and int to be 32-bit types.
Or is that really more of a historical thing?
It would have happened on Babbage's Difference Engine, since "memory" is a single number; if you partition it into smaller numbers, and don't insert guard digits, then overflow from one will alter the value of the next. However, you couldn't run any non-trivial C++ program on this architecture.
Historically, I believe some processors would produce an exception on overflow - that would be why the behaviour is undefined.
The content of addresses almost never really "overflowed". For example we would expect primitive integers to roll over their values. See http://www.pbm.com/~lindahl/mel.html.
I think overflow is when the pointers move beyond their desired limits so that you have pointers that point to unexpected places.
He had located the data he was working on near the top of memory -- the largest locations the instructions could address -- so, after the last datum was handled, incrementing the instruction address would make it overflow. The carry would add one to the operation code, changing it to the next one in the instruction set: a jump instruction. Sure enough, the next program instruction was in address location zero, and the program went happily on its way.
Personally, I have never seen an architecture where overflow would cause memory outside the variable to be overwritten.
I think you should read "overflow" as leaving a domain that is well defined (like the positive values of a signed integer) and entering one that is not well defined.
Concretely, let's take the maximum short value, 0x7FFF. If you add one to it you get 0x8000. This value has a different meaning depending on whether you use one's complement or two's complement negative numbers, both of which are allowed by the C standard.
It still happens. With floating point arithmetic it is sometimes necessary to ensure that the calculations are carried out in the right order so that this event is unlikely to happen (it can also reduce rounding errors!).
There are two main kinds of overflow in the computer business:
Arithmetic overflow: like in your example; it is defined at the hardware level and something code has to work with.
Negative example:
int a = std::numeric_limits<int>::max()/2;
int b = a + a + 3; // b becomes negative and the plane crashes
Positive example:
double a = std::numeric_limits<double>::max()/2;
double b = a + a + 3; // b becomes Inf, which is well determined
That's how overflow is defined for processor integer and floating-point units; it will not go away and has to be handled.
Memory overflow: still happens when things do not match, such as:
memory size
calling conventions
data type sizes between client and server
memory alignment
...
A lot of security issues are based on memory overflow.
Read wikipedia for more facets of overflows:
http://en.wikipedia.org/wiki/Overflow
For an integer that is never expected to take negative values, one could use unsigned int or int.
From a compiler perspective, or a purely CPU-cycle perspective, is there any difference on x86_64?
It depends. It might go either way, depending on what you are doing with that int as well as on the properties of the underlying hardware.
An obvious example in unsigned ints' favor is integer division by a constant power of two. In C/C++, integer division is required to round towards zero, while the cheap replacement the compiler would like to use for such divisors, an arithmetic right shift, rounds towards negative infinity. So, in order to satisfy the standard's requirements, the compiler is forced to adjust the signed integer division result with additional machine instructions. In the case of unsigned integer division this problem does not arise, which is why such divisions generally work much faster for unsigned types than for signed types.
For example, consider this simple expression
rand() / 2
The code generated for this expression by the MSVC compiler will generally look as follows:
call rand
cdq
sub eax,edx
sar eax,1
Note that instead of a single shift instruction (sar) we are seeing a whole bunch of instructions here, i.e. our sar is preceded by two extra instructions (cdq and sub). These extra instructions are there just to "adjust" the division in order to force it to generate the "correct" (from the C language point of view) result. Note that the compiler does not know that your value will always be positive, so it has to generate these instructions always, unconditionally. They will never do anything useful, thus wasting CPU cycles.
Now take a look at the code for
(unsigned) rand() / 2
It is just
call rand
shr eax,1
In this case a single shift did the trick, thus providing us with astronomically faster code (for the division alone).
On the other hand, when you are mixing integer arithmetic and FPU floating-point arithmetic, signed integer types might work faster, since the FPU instruction set contains instructions for loading/storing signed integer values but has no instructions for unsigned integer values.
To illustrate this one can use the following simple function
double zero() { return rand(); }
The generated code will generally be very simple
call rand
mov dword ptr [esp],eax
fild dword ptr [esp]
But if we change our function to
double zero() { return (unsigned) rand(); }
the generated code will change to
call rand
test eax,eax
mov dword ptr [esp],eax
fild dword ptr [esp]
jge zero+17h
fadd qword ptr [__real@41f0000000000000 (4020F8h)]
This code is noticeably larger because the FPU instruction set does not work with unsigned integer types, so the extra adjustments are necessary after loading an unsigned value (which is what that conditional fadd does).
There are other contexts and examples that can be used to demonstrate that it works either way. So, again, it all depends. But generally, all this will not matter in the big picture of your program's performance. I generally prefer to use unsigned types to represent unsigned quantities. In my code 99% of integer types are unsigned. But I do it for purely conceptual reasons, not for any performance gains.
Signed types are inherently more optimizable in most cases because the compiler can ignore the possibility of overflow and simplify/rearrange arithmetic in whatever ways it sees fit. On the other hand, unsigned types are inherently safer because the result is always well-defined (even if not to what you naively think it should be).
The one case where unsigned types are better optimizable is when you're writing division/remainder by a power of two. For unsigned types this translates directly to a bit shift and a bitwise AND. For signed types, unless the compiler can establish that the value is known to be positive, it must generate extra code to compensate for the off-by-one issue with negative numbers (according to C, -3/2 is -1, whereas an arithmetic shift would give -2, i.e. it rounds towards negative infinity).
It will almost certainly make no difference, but occasionally the compiler can play games with the signedness of types in order to shave a couple of cycles, but to be honest it probably is a negligible change overall.
For example suppose you have an int x and want to write:
if(x >= 10 && x < 200) { /* ... */ }
You (or better yet, the compiler) can transform this a little to do one less comparison:
if((unsigned int)(x - 10) < 190) { /* ... */ }
This makes the assumption that int is represented in 2's complement, so that if (x - 10) is less than 0 it becomes a huge value when viewed as an unsigned int. For example, on a typical x86 system, (unsigned int)-1 == 0xffffffff, which is clearly bigger than the 190 being tested.
This is micro-optimization at best and best left up to the compiler; instead, you should write code that expresses what you mean, and if it is too slow, profile and decide where it really is necessary to get clever.
I don't imagine it would make much difference in terms of CPU or the compiler. One possible case would be if it enabled the compiler to know that the number would never be negative and optimize away code.
However it IS useful to a human reading your code so they know the domain of the variable in question.
From the ALU's point of view, adding (or whatever) signed or unsigned values doesn't make any difference, since they're both represented by a group of bits. 0100 + 1011 is always 1111, but you choose whether that is 4 + (-5) = -1 or 4 + 11 = 15.
So I agree with @Mark: you should choose the best data type to help others understand your code.