C++ usual arithmetic conversions not converting

First, I'd like to point out that I did read some of the other answers with generally similar titles. However, those either refer to what I assume is an older version of the standard, or deal with promotions, not conversions.
I've been crawling through "C++ Crash Course", where the author states the following in the chapter exploring built-in operators:
If none of the floating-point promotion rules apply, you then check
whether either argument is signed. If so, both operands become signed.
Finally, as with the promotion rules for floating-point types, the size of the
largest operand is used to promote the other operand: ...
If I read the standard correctly, this is not true, because, as per cppreference.com,
If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank
Otherwise, if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand's type.
Otherwise, if the signed operand's type can represent all values of the unsigned operand, the unsigned operand is converted to the signed operand's type
Otherwise, both operands are converted to the unsigned counterpart of the signed operand's type.
What confuses me even more is the fact that the following code:
printf("Size of int is %d bytes\n", sizeof(int));
printf("Size of short is %d bytes\n", sizeof(short));
printf("Size of long is %d bytes\n", sizeof(long));
printf("Size of long long is %d bytes\n", sizeof(long long));
unsigned int x = 4000000000;
signed int y = -1;
signed long z = x + y;
printf("x + y = %ld\n", z);
produces the following output:
Size of int is 4 bytes
Size of short is 2 bytes
Size of long is 8 bytes
Size of long long is 8 bytes
x + y = 3999999999
As I understand the standard, y should have been converted to unsigned int, leading to an incorrect result. The result is correct, which leads me to assume that no conversion happens in this case. Why? I would be grateful for any clarification on this matter. Thank you.
(I would also appreciate someone telling me that I won't ever need this kind of arcane knowledge in real life, even if it's not true - just for the peace of mind.)

Converting a negative signed value to unsigned yields the binary-complement value that corresponds to it. Adding this unsigned value then wraps around, which produces exactly the same result as adding the negative value would.
By the way: that's the trick the processor uses to do a subtraction (or am I wrong in these modern times?).
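A minimal sketch (my own illustration, not part of the original answer) showing that adding the two's complement of b is the same as subtracting b in unsigned arithmetic:

#include <cstdio>

int main() {
    unsigned int a = 10;
    unsigned int b = 3;
    unsigned int complement = ~b + 1u;  // two's complement of b
    std::printf("a - b        = %u\n", a - b);           // 7
    std::printf("a + (~b + 1) = %u\n", a + complement);  // also 7
}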

If I read the standard correctly, this is not true
You're correct. CCC is wrong.
As I understand the standard, y should have been converted to unsigned int,
It did.
The result is correct, which leads me to assume that no conversion happens in this case.
You assume wrong.
Since -1 is not representable as an unsigned number, it is simply converted to the representable number that is congruent with -1 modulo the number of representable values. When you add this to x, the result is larger than any representable value, so again the result is the representable value that is congruent with the sum modulo that same number. Because of the magic of how modular arithmetic works, you get the correct result.

You have an addition of an unsigned int and a signed int, so the conversion rank is the same for both operands. So the second rule will apply (emphasis mine):
[Otherwise,] if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand's type.
-1 is converted to an unsigned type by adding to it the smallest power of two greater than the highest unsigned int (in effect, its representation in two's complement);
that number is added to 4000000000, and the result is the sum modulo that power of two (in effect, discarding the highest bit);
and you just get the expected result.
The standard mandates conversion to the unsigned type because unsigned wraparound is defined by the standard, while signed overflow is not.
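As a small sketch of the two steps (conversion first, then modular addition), assuming a 32-bit unsigned int:

#include <climits>
#include <cstdio>

int main() {
    unsigned int x = 4000000000u;
    int y = -1;
    unsigned int y_conv = y;  // step 1: -1 + (UINT_MAX + 1) = UINT_MAX
    std::printf("y converts to %u (UINT_MAX is %u)\n", y_conv, UINT_MAX);
    unsigned int sum = x + y_conv;  // step 2: wraps modulo UINT_MAX + 1
    std::printf("x + y = %u\n", sum);  // 3999999999
}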

I would also appreciate someone telling me that I won't ever need this
kind of arcane knowledge in real life
There is one implicit promotion not addressed above that is a common cause of bugs:
unsigned short x = 0xffff;
unsigned short y = 0x0001;
if ((x + y) == 0)
The condition is false because the operands of arithmetic operations are implicitly promoted to at least int: x + y is computed as the int value 0x10000, not as an unsigned short that wraps to 0.
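A runnable version of that fragment (a sketch of mine, assuming int is wider than 16 bits, as on mainstream platforms):

#include <cstdio>

int main() {
    unsigned short x = 0xffff;
    unsigned short y = 0x0001;
    // Both operands are promoted to int, so x + y is the int 0x10000, not 0.
    if ((x + y) == 0)
        std::printf("wrapped to zero\n");
    else
        std::printf("x + y = %d (computed in int)\n", x + y);
    // Casting the sum back to unsigned short restores the wraparound.
    std::printf("(unsigned short)(x + y) = %u\n", (unsigned)(unsigned short)(x + y));
}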

From the output of the first print statement in your code snippet, it is evident that since the size of int is 4 bytes, unsigned int values can range from 0 to 4,294,967,295 (2^32 − 1).
The declaration below is hence perfectly valid:
unsigned int x = 4000000000;
Remember here that declaring x as an unsigned integer gives it twice the positive range of its signed counterpart, but both types occupy exactly the same number of bits, and hence conversions between them are possible.
signed int y = -1;
signed long z = x + y;
printf("x + y = %ld\n", z);
Hence for the rest of the code above, it doesn't matter whether y is signed or unsigned: because you are dealing with an unsigned integer in the first place, the signed operand will be converted to its corresponding unsigned version by adding UINT_MAX + 1 to the signed value.
Here UINT_MAX stands for the largest unsigned int value, so UINT_MAX + 1 is one more than that. The intermediate sum would be larger than any value we can store, hence it is reduced modulo UINT_MAX + 1 to yield the correct unsigned int.
However, if you were to declare x as signed instead (note that 4000000000 does not fit in a 32-bit signed int, so this initialization is implementation-defined):
signed int x = 4000000000;
int y = -1;
long z = x + y;
printf("x + y = %ld\n", z);
you would get what you might have expected:
x + y = -294967297

Related

Comparing unsigned integer with negative literals

I have this simple C program.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

bool foo (unsigned int a) {
    return (a > -2L);
}

bool bar (unsigned long a) {
    return (a > -2L);
}

int main() {
    printf("foo returned = %d\n", foo(99));
    printf("bar returned = %d\n", bar(99));
    return 0;
}
Output when I run this -
foo returned = 1
bar returned = 0
My question is why does foo(99) return true but bar(99) return false.
To me it makes sense that bar would return false. For simplicity, let's say longs are 8 bits; then (using two's complement for the signed value):
99 == 0110 0011
-2 == unsigned 254 == 1111 1110
So clearly the CMP instruction will see that 1111 1110 is bigger and return false.
But I don't understand what is going on behind the scenes in the foo function. The assembly for foo seems to be hardcoded to always return 1 (mov eax,0x1). I would have expected foo to do something similar to bar. What is going on here?
This is covered in C classes and is specified in the documentation. Here is how you use documents to figure this out.
In the 2018 C standard, you can look up > or “relational expressions” in the index to see they are discussed on pages 68-69. On page 68, you will find clause 6.5.8, which covers relational operators, including >. Reading it, paragraph 3 says:
If both of the operands have arithmetic type, the usual arithmetic conversions are performed.
“Usual arithmetic conversions” is listed in the index as defined on page 39. Page 39 has clause 6.3.1.8, “Usual arithmetic conversions.” This clause explains that operands of arithmetic types are converted to a common type, and it gives rules determining the common type. For two integer types of different signedness, such as the unsigned long and the long int in bar (a and -2L), it says that, if the unsigned type has rank greater than or equal to the rank of the other type, the signed type is converted to the unsigned type.
“Rank” is not in the index, but you can search the document to find it is discussed in clause 6.3.1.1, where it tells you the rank of long int is greater than the rank of int, and any unsigned type has the same rank as the corresponding signed type.
Now you can consider a > -2L in bar, where a is unsigned long. Here we have an unsigned long compared with a long. They have the same rank, so -2L is converted to unsigned long. Conversion of a signed integer to unsigned is discussed in clause 6.3.1.3. It says the value is converted by wrapping it modulo ULONG_MAX+1, so converting the signed long −2 produces ULONG_MAX+1−2 = ULONG_MAX−1, which is a large integer. Then comparing a, which has the value 99, to a large integer with > yields false, so zero is returned.
For foo, we continue with the rules for the usual arithmetic conversions. When the unsigned type does not have rank greater than or equal to the rank of the signed type, but the signed type can represent all the values of the type of the operand with unsigned type, the operand with the unsigned type is converted to the type of the operand with signed type. In foo, a is unsigned int and -2L is long int. Presumably in your C implementation, long int is 64 bits, so it can represent all the values of a 32-bit unsigned int. So this rule applies, and a is converted to long int. This does not change the value. So the original value of a, 99, is compared to −2 with >, and this yields true, so one is returned.
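To make both conversions visible, here is a small sketch of my own (assuming 32-bit int and 64-bit long, as in the explanation above):

#include <cstdio>

int main() {
    unsigned int  a32 = 99;
    unsigned long a64 = 99;
    // foo: long can represent every unsigned int value, so a32 is
    // converted to long and keeps its value.
    std::printf("foo-style: %d\n", (long)a32 > -2L);  // 1
    // bar: same rank, so -2L is converted to unsigned long (ULONG_MAX - 1).
    std::printf("-2L as unsigned long: %lu\n", (unsigned long)-2L);
    std::printf("bar-style: %d\n", a64 > (unsigned long)-2L);  // 0
}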
In the first function
bool foo (unsigned int a) {
    return (a > -2L);
}
both operands of the expression a > -2L have the type long (the first operand is converted to the type long due to the usual arithmetic conversions, because the rank of the type long is greater than the rank of the type unsigned int and all values of the type unsigned int on the system in use can be represented by the type long). And it is evident that the positive value 99L is greater than the negative value -2L.
The first function could produce the result 0 provided that sizeof( long ) is equal to sizeof( unsigned int ). In this case the type long is unable to represent all (positive) values of the type unsigned int. As a result, due to the usual arithmetic conversions, both operands will be converted to the type unsigned long.
For example, running the function foo using MS VS 2019, where sizeof( long ) equals 4, the same as sizeof( unsigned int ), you will get the result 0.
Here is a demonstration program written in C++ that visually shows the reason why the result of a call of the function foo using MS VS 2019 can be equal to 0.
#include <iostream>
#include <iomanip>
#include <type_traits>

int main()
{
    unsigned int x = 0;
    long y = 0;

    std::cout << "sizeof( unsigned int ) = " << sizeof( unsigned int ) << '\n';
    std::cout << "sizeof( long ) = " << sizeof(long) << '\n';

    std::cout << "std::is_same_v<decltype( x + y ), unsigned long> is "
              << std::boolalpha
              << std::is_same_v<decltype( x + y ), unsigned long>
              << '\n';
}
The program output is
sizeof( unsigned int ) = 4
sizeof( long ) = 4
std::is_same_v<decltype( x + y ), unsigned long> is true
That is, in general, the result of the first function is implementation-defined.
In the second function
bool bar (unsigned long a) {
    return (a > -2L);
}
both operands have the type unsigned long (again due to the usual arithmetic conversions: the ranks of the types unsigned long and signed long are equal to each other, so an object of the type signed long is converted to the type unsigned long), and -2L interpreted as unsigned long is greater than 99.
The reason for this has to do with the rules of integer conversions.
In the first case, you compare an unsigned int with a long using the > operator, and in the second case you compare an unsigned long with a long.
These operands must first be converted to a common type using the usual arithmetic conversions. These are spelled out in section 6.3.1.8p1 of the C standard, with the following excerpt focusing on integer conversions:
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser integer
conversion rank is converted to the type of the operand with greater
rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is converted
to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
In the case of comparing an unsigned int with a long, the second bolded paragraph applies. long has higher rank and (assuming long is 64 bit and int is 32 bit) can hold all values that an unsigned int can, so the unsigned int operand a is converted to a long. Since the value in question is in the range of long, section 6.3.1.3p1 dictates how the conversion happens:
When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged
So the value is preserved and we're left with 99 > -2 which is true.
In the case of comparing an unsigned long with a long, the first bolded paragraph applies. Both types are of the same rank with different signs, so the long constant -2L is converted to unsigned long. -2 is outside the range of an unsigned long so a value conversion must happen. This conversion is specified in section 6.3.1.3p2:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
So the long value -2 will be converted to the unsigned long value 2^64 − 2, assuming unsigned long is 64 bit. So we're left with 99 > 2^64 − 2, which is false.
I think what is happening here is implicit promotion by the compiler. When you perform a comparison on two different primitives, the compiler will promote one of them to the same type as the other. I believe the rule is, roughly, that the type with the larger possible values is used as the standard.
So in foo() your argument is implicitly promoted to a signed long type and the comparison works as expected.
In bar() your argument is an unsigned long, which has a larger maximum value than signed long. Here the compiler promotes -2L to unsigned long, which turns into a very large number.

Vector size comparison with integer for non-zero vector size [duplicate]

See this code snippet
#include <stdio.h>

int main()
{
    unsigned int a = 1000;
    int b = -1;

    if (a > b) printf("A is BIG! %d\n", a - b);
    else printf("a is SMALL! %d\n", a - b);

    return 0;
}
This gives the output: a is SMALL! 1001
I don't understand what's happening here. How does the > operator work here? Why is a smaller than b? If it is indeed smaller, why do I get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by so called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such value cannot be printed with %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to C language specification, the common part of the range of the corresponding signed and unsigned integer type shall have identical representation (implying, according to the footnote 31, "interchangeability as arguments to functions"). So, the result of a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
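For completeness, a tiny sketch of mine printing the same expression with the matching %u specifier, which sidesteps the question of %d's validity entirely:

#include <cstdio>

int main() {
    unsigned int a = 1000;
    int b = -1;
    // a - b is computed in unsigned int: b becomes UINT_MAX, and the
    // subtraction wraps modulo 2^32 to 1001.
    std::printf("a - b = %u\n", a - b);
}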
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you treat the subtraction in an unsigned world, 1000 − 4,294,967,295 = −4,294,966,295, which wraps modulo 2^32 to 1,001, which is what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid them and do casts when in doubt.
#include <stdio.h>

int main()
{
    int a = 1000;
    signed int b = -1, c = -2;

    printf("%d", (unsigned int)b);
    printf("%d\n", (unsigned int)c);
    printf("%d\n", (unsigned int)a);

    if (1000 > -1) {
        printf("\ntrue");
    }
    else
        printf("\nfalse");

    return 0;
}
For this you need to understand how the conversion rules work.
Relational operators evaluate their operands left to right,
so when it comes to
if (a > b)
then first of all -1 will be converted to an unsigned integer, because when a signed and an unsigned operand meet, the unsigned type wins, and its range of positive values is greater than the signed one's.
-1 converted to an unsigned number becomes a very big number.
Here is an easy way to compare, maybe useful when you cannot get rid of the unsigned declaration (for example, [NSArray count]): just force the "unsigned int" to an "int".
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler will assume that the comparison is really between unsigned values.
And -1 is represented as 1111..1111, so it is a very big quantity... the biggest... when interpreted as unsigned.
While comparing a > b, where a is of unsigned int type and b is of int type, b is converted to unsigned int, so the signed int value -1 is converted into the MAX value of unsigned int (range: 0 to 2^32 − 1).
Thus, a > b, i.e., (1000 > 4294967295), becomes false. Hence the else branch printf("a is SMALL! %d\n", a-b); is executed.

C++ cannot calculate a formula with a vector's size in it?

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    if (0 < v.size() - 1) {
        std::printf("true");
    } else {
        std::printf("false");
    }
}
It prints true, which would indicate that 0 < -1.
std::vector::size() returns an unsigned integer. If it is 0 and you subtract 1, it underflows and becomes a huge value (specifically std::numeric_limits<std::vector<int>::size_type>::max()). The comparison works fine, but the subtraction produces a value you did not expect.
For more about unsigned underflow (and overflow), see: C++ underflow and overflow
The simplest fix for your code is probably if (1 < v.size()).
v.size() returns a result of size_t, which is an unsigned type. An unsigned value minus 1 is still unsigned. And all non-zero unsigned values are greater than zero.
std::vector<int>::size() returns type size_t which is an unsigned type whose rank is usually at least that of int.
When, in a math operation, you put together a signed type with an unsigned type and the unsigned type doesn't have a lower rank, the signed type will get converted to the unsigned type (see 6.3.1.8 Usual arithmetic conversions (I'm linking to the C standard, but rules for integer arithmetic are foundational and need to be common to both languages)).
In other words, assuming that size_t isn't unsigned char or unsigned short
(it's usually unsigned long and the C standard recommends it shouldn't be unsigned long long unless necessary)
(size_t)0 - 1
gets implicitly translated to
(size_t)0 - (size_t)1
which is a positive number equal to SIZE_MAX (-1 cannot be represented in an unsigned type, so it gets converted by formally "repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type" (6.3.1.3p2)).
0 is always less than SIZE_MAX.
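Two ways to rewrite the check so that no unsigned subtraction ever happens (a sketch under the assumption that the size fits in long long; not the only options):

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    // Reformulate the condition so nothing is subtracted from size():
    if (v.size() > 1)
        std::printf("more than one element\n");
    // Or do the arithmetic in a signed type wide enough for the size:
    if (0 < static_cast<long long>(v.size()) - 1)
        std::printf("true\n");
    else
        std::printf("false\n");  // printed for the empty vector
}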

c++ safeness of code with implicit conversion between signed and unsigned

According to the rules on implicit conversions between signed and unsigned integer types, discussed here and here, when summing an unsigned int with an int, the signed int is first converted to an unsigned int.
Consider, e.g., the following minimal program
#include <iostream>

int main()
{
    unsigned int n = 2;
    int x = -1;

    std::cout << n + x << std::endl;

    return 0;
}
The output of the program is, nevertheless, 1 as expected: x is converted first to an unsigned int, and the sum with n leads to an integer overflow, giving the "right" answer.
In a code like the previous one, if I know for sure that n + x is positive, can I assume that the sum of unsigned int n and int x gives the expected value?
In a code like the previous one, if I know for sure that n + x is positive, can I assume that the sum of unsigned int n and int x gives the expected value?
Yes.
First, the signed value converted to unsigned, using modulo arithmetic:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type).
Then two unsigned values will be added using modulo arithmetic:
Unsigned integers shall obey the laws of arithmetic modulo 2^n, where n is the number of bits in the value representation of that particular size of integer.
This means that you'll get the expected answer.
Even if the result would be negative in the mathematical sense, the result in C++ will be a number that is modulo-equal to the negative number.
Note that I've supposed here that you add two same-sized integers.
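A short sketch of my own (assuming a 32-bit unsigned int) contrasting a mathematically positive sum with a mathematically negative one:

#include <cstdio>

int main() {
    unsigned int n = 2;
    int x = -1;
    std::printf("%u\n", n + x);  // 1: the true sum is positive, so it is exact
    int y = -5;
    // The true sum is -3, which is not representable; the result is the
    // congruent value modulo 2^32, i.e. 4294967293.
    std::printf("%u\n", n + y);
}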
I think you can be sure and it is not implementation defined, although this statement requires some interpretations of the standard when it comes to systems that do not use two's complement for representing negative values.
First, let's state the things that are clear: unsigned integrals do not overflow but take on a value modulo 2^numberOfBits (cf. this online C++ standard draft):
6.7.1 Fundamental types
(7) Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
So it's just a matter of whether a negative value nv is converted correctly into an unsigned integral bit pattern nv(conv) such that x + nv(conv) will always be the same as x - nv. For the case of a system using two's complement, things are clear, since the two's complement is actually designed such that this arithmetic works immediately.
For systems using other representations of negative values, we'll have to read the standard carefully:
7.8 Integral conversions
(2) If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
As the footnote explicitly says that in a two's complement representation there is no change in the bit pattern, we may assume that on systems other than two's complement a real conversion will take place, such that x + nv(conv) == x - nv.
So due to 7.8 (2), I'd say that your assumption is valid.

In case of integer overflows what is the result of (unsigned int) * (int) ? unsigned or int?

In case of integer overflows what is the result of (unsigned int) * (int) ? unsigned or int? What type does the array index operator (operator[]) take for char*: int, unsigned int or something else?
I was auditing the following function, and suddenly this question arose. The function has a vulnerability at line 17.
// Create a character array and initialize it with init[]
// repeatedly. The size of this character array is specified by
// w*h.
char *function4(unsigned int w, unsigned int h, char *init)
{
    char *buf;
    int i;

    if (w*h > 4096)                  // line 9
        return (NULL);
    buf = (char *)malloc(4096+1);
    if (!buf)
        return (NULL);
    for (i=0; i<h; i++)
        memcpy(&buf[i*w], init, w);  // line 17
    buf[4096] = '\0';
    return buf;
}
Consider that both w and h are very large unsigned integers. The multiplication at line 9 has a chance to pass the validation.
Now the problem is at line 17. Multiplying int i by unsigned int w: if the result is int, the product may be negative, resulting in accessing a position before buf. If the result is unsigned int, the product will always be positive, resulting in accessing a position after buf.
It's hard to write code to demonstrate this: int is too large. Does anyone have ideas on this?
Is there any documentation that specifies the type of the product? I have searched for it, but so far haven't found anything.
I suppose that as far as the vulnerability is concerned, whether (unsigned int) * (int) produces unsigned int or int doesn't matter, because in the compiled object file, they are just bytes. The following code works the same no matter the type of the product:
unsigned int x = 10;
int y = -10;
printf("%d\n", x * y); // print x * y in signed integer
printf("%u\n", x * y); // print x * y in unsigned integer
Therefore, it does not matter what type the multiplication returns. What matters is whether the consumer function takes int or unsigned.
The question here is not how bad the function is, or how to improve the function to make it better. The function undoubtedly has a vulnerability. The question is about the exact behavior of the function, based on the prescribed behavior from the standards.
Do the w*h calculation in long long and check whether it is bigger than UINT_MAX.
EDIT: alternative: if it has overflowed, then (w*h)/h != w (is this always the case?! it should be, right?)
To answer your question: the type of an expression multiplying an int and an unsigned int will be an unsigned int in C/C++.
To answer your implied question, one decent way to deal with possible overflow in integer arithmetic is to use the "IntSafe" set of routines from Microsoft:
http://blogs.msdn.com/michael_howard/archive/2006/02/02/523392.aspx
It's available in the SDK and contains inline implementations so you can study what they're doing if you're on another platform.
Ensure that w * h doesn't overflow by limiting w and h.
The type of w*i is unsigned in your case. If I read the standard correctly, the rule is that the operands are converted to the larger type (with its signedness), or to the unsigned type corresponding to the signed type (which is unsigned int in your case).
However, even if it's unsigned, it doesn't prevent the wraparound (writing to memory before buf), because it might be the case (on i386 platform, it is), that p[-1] is the same as p[-1u]. Anyway, in your case, both buf[-1] and buf[big unsigned number] would be undefined behavior, so the signed/unsigned question is not that important.
Note that signed/unsigned matters in other contexts - eg. (int)(x*y/2) gives different results depending on the types of x and y, even in the absence of undefined behaviour.
I would solve your problem by checking for overflow on line 9; since 4096 is a pretty small constant and 4096*4096 doesn't overflow on most architectures (you need to check), I'd do
if (w>4096 || h>4096 || w*h > 4096)
return (NULL);
This leaves out the case when w or h are 0, you might want to check for it if needed.
In general, you could check for overflow like this:
if(w*h > 4096 || (w*h)/w!=h || (w*h)%w!=0)
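One caveat with this division-based test is w == 0; a hedged variant of the check (my sketch, not from the answer above) guards that case before dividing:

#include <cstdio>

// Reject zero dimensions, wrapped products, and products over the limit.
static bool size_ok(unsigned int w, unsigned int h) {
    if (w == 0 || h == 0) return false;  // or true, depending on the spec
    unsigned int wh = w * h;             // may wrap around
    if (wh / w != h) return false;       // wrapped: reject
    return wh <= 4096;
}

int main() {
    std::printf("%d\n", size_ok(64, 64));        // 1: exactly 4096
    std::printf("%d\n", size_ok(65536, 65537));  // 0: w*h wraps to 65536
}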
In C/C++ the p[n] notation is really a shortcut for writing *(p+n), and this pointer arithmetic takes the sign into account. So p[-1] is valid and refers to the value immediately before *p.
So the sign really matters here. The results of arithmetic operators on integers follow a set of rules defined by the standard: the integer promotions and the usual arithmetic conversions.
Check out this page: INT02-C. Understand integer conversion rules
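A tiny sketch (mine) of the negative indexing just described:

#include <cstdio>

int main() {
    int arr[3] = {10, 20, 30};
    int *p = &arr[1];
    // p[-1] is *(p + (-1)): pointer arithmetic honours the sign of the index.
    std::printf("%d\n", p[-1]);  // prints 10
}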
2 changes make it safer:
if (w >= 4096 || h >= 4096 || w*h > 4096) return NULL;
...
unsigned i;
Note also that reading from past the buffer end is no less a bad idea than writing past it. So the question is not whether i*w may become negative, but whether 0 <= i*w + w <= 4096 holds.
So it's not the type that matters, but the result of i*w.
For example, it doesn't make a difference whether this is (unsigned)0x80000000 or (int)0x80000000; the program will seg-fault anyway.
For C, refer to "Usual arithmetic conversions" (C99: Section 6.3.1.8, ANSI C K&R A6.5) for details on how the operands of the mathematical operators are treated.
In your example the following rules apply:
C99:
Otherwise, if the type of the operand
with signed integer type can represent
all of the values of the type of the
operand with unsigned integer type,
then the operand with unsigned integer
type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted
to the unsigned integer type
corresponding to the type of the
operand with signed integer type.
ANSI C:
Otherwise, if either operand is unsigned int, the other is converted to unsigned int.
Why not just declare i as unsigned int? Then the problem goes away.
In any case, i*w is guaranteed to be <= 4096, as the code tests for this, so it's never going to overflow.
memcpy(&buf[i*w > -1 ? i*w < 4097 ? i*w : 0 : 0], init, w);
(I don't think the triple calculation of i*w degrades the performance.)
w*h could overflow if w and/or h are sufficiently large and the following validation could pass.
9. if (w*h > 4096)
10. return (NULL);
In mixed int / unsigned int operations, the int is converted to unsigned int, in which case a negative value of i would become a large positive value. In that case
&buf[i*w]
would be accessing an out-of-bounds location.
Unsigned arithmetic is done as modular (or wrap-around), so the product of two large unsigned ints can easily be less than 4096. The multiplication of int and unsigned int will result in an unsigned int (see section 4.5 of the C++ standard).
Therefore, given large w and a suitable value of h, you can indeed get into trouble.
Making sure integer arithmetic doesn't overflow is difficult. One easy way is to convert to floating-point and doing a floating-point multiplication, and seeing if the result is at all reasonable. As qwerty suggested, long long would be usable, if available on your implementation. (It's a common extension in C90 and C++, does exist in C99, and will be in C++0x.)
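A sketch of the wider-type approach suggested above, assuming unsigned long long is wider than unsigned int (it is guaranteed at least 64 bits):

#include <cstdio>

int main() {
    unsigned int w = 70000, h = 70000;
    // Force the multiplication into unsigned long long so it cannot wrap
    // for 32-bit inputs, then compare against the limit.
    unsigned long long wh = 1ULL * w * h;
    if (wh > 4096)
        std::printf("rejected: %llu exceeds 4096\n", wh);
}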
There are 3 paragraphs in the current C1X draft on calculating (UNSIGNED TYPE1) x (SIGNED TYPE2) in 6.3.1.8 Usual arithmetic conversions, N1494,
WG 14: C - Project status and milestones
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
So if a is unsigned int and b is int, parsing of (a * b) should generate code (a * (unsigned int)b). Will overflow if b < 0 or a * b > UINT_MAX.
If a is unsigned int and b is long of greater size, (a * b) should generate ((long)a * (long)b). Will overflow if a * b > LONG_MAX or a * b < LONG_MIN.
If a is unsigned int and b is long of the same size, (a * b) should generate ((unsigned long)a * (unsigned long)b). Will overflow if b < 0 or a * b > ULONG_MAX.
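These result types can be verified at compile time. A sketch of mine (C++17, assuming an LP64 platform where long is 64-bit) covering the first two cases:

#include <type_traits>

int main() {
    unsigned int a = 1;
    int b = -1;
    long c = -1L;
    // unsigned int vs int: equal rank, so the result is unsigned int.
    static_assert(std::is_same_v<decltype(a * b), unsigned int>, "");
    // unsigned int vs 64-bit long: long represents every unsigned int
    // value, so the result is long. (On LLP64 Windows, where long is
    // 32-bit, the result would be unsigned long instead.)
    static_assert(std::is_same_v<decltype(a * c), long>, "");
    (void)a; (void)b; (void)c;  // silence unused-variable warnings
}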
On your second question, about the type expected by the "indexer", the answer appears to be "integer type", which allows for any (signed) integer index.
6.5.2.1 Array subscripting
Constraints
1 One of the expressions shall have type ‘‘pointer to complete object type’’, the other
expression shall have integer type, and the result has type ‘‘type’’.
Semantics
2 A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th
element of E1 (counting from zero).
It is up to the compiler to perform static analysis and warn the developer about the possibility of a buffer overrun when the pointer expression is an array variable and the index may be negative. The same goes for warnings about possible array size overruns even when the index is positive or unsigned.
To actually answer your question: without specifying the hardware you're running on, you don't know, and in code intended to be portable, you shouldn't depend on any particular behavior.