I understand how this solution works:
int add_no_arithm(int a, int b) {
    if (b == 0) return a;
    int sum = a ^ b;                   // add without carrying
    int carry = (a & b) << 1;          // carry, but don't add
    return add_no_arithm(sum, carry);  // recurse
}
But the author comments over this problem as:
"Our first instinct in problems like these should be that we’re going to have to work with bits. Why? Because when you take away the + sign, what other choice do we have? Plus, that’s how computers do it."
What is the author trying to imply?
What he means is quite simple: if you don't have the + operation, you'll need to replicate its behaviour at the bit level of an integer. The code you posted does roughly the same thing that the + operation does natively in the ALU (arithmetic logic unit, the place where calculations take place in a CPU).
Yeah, this is what is used - http://en.wikipedia.org/wiki/Adder_%28electronics%29
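To make the parallel with a hardware adder explicit, here is a minimal iterative sketch of the same idea (my own illustration, not from the book): each pass computes the partial sum and the carries separately, then feeds the carries back in, just like a ripple-carry adder propagates them through the bit positions.

int add_no_arithm_iter(int a, int b)
{
    while (b != 0) {
        int sum = a ^ b;           // bitwise sum, ignoring carries
        int carry = (a & b) << 1;  // the carry bits, shifted into place
        a = sum;
        b = carry;
    }
    return a;
}

For example, 5 + 3 runs through the (sum, carry) pairs (110, 010) -> (100, 100) -> (000, 1000) and returns 8.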
Related
Is there a more concise way to write the second line of this function without repeating a redundantly?
void myfunc()
{
    bool a = false; // Assume that this won't always be hardcoded to false.
    a = !a;
}
Is there a more concise way to write the second line of this function without repeating a redundantly?
! is a unary operator; you cannot combine it with = the way you can with binary operators, if that is what you expected.
But you can do something like a ^= true;. That is not more concise, but it does avoid repeating a. Still, it is not a good way to do it compared to a = !a;, which is much more readable.
Well, I don't really see the value of doing so, but you could use xor or a simple decrement. Both of these work in C, because arithmetic on a bool is converted back through _Bool: any nonzero result becomes 1, so false - 1 (which is -1) toggles to true.
a ^= 1;
a--;
Then you won't have to repeat a.
But if you want it to be very clear, consider using a function:

void flip(bool *b)
{
    (*b)--; /* decrementing a bool toggles it in C */
}
bool b = true;
flip(&b);
In C++, you can use references. Note that -- is not valid on a bool in C++, so spell out the negation inside the function (the repetition of b here doesn't matter, since the point was to avoid repeating the variable at the call site):

void flip(bool &b)
{
    b = !b;
}
bool b = true;
flip(b);
Or write a macro. Actually, macros are pretty handy for solving duplication problems, although I almost never use them for that, since it's simply rarely worth the effort. I wrote one macro to avoid duplication in malloc calls. Such a call typically looks like this:
int *x = malloc(12 * sizeof *x);
You can avoid the duplication with this:
#define ALLOC(p, n) \
    ((p) = malloc((n) * sizeof *(p)))
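For illustration, a call site would then look like this (in C, where the implicit conversion from void * is fine):

int *x;
ALLOC(x, 12); /* expands to ((x) = malloc((12) * sizeof *(x))) */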
But even this is something I hesitate to use.
To be honest, it's not really a problem worth solving. :)
The only possible answer is: unfortunately not.
The unary operator ! is actually concise enough, and any other trick would lead to unreadable code.
Any other shortcut form, for example for binary operators such as + or *:

a += 5;
b *= 7;

maintains the reference to the original meaning of the operator: addition and multiplication respectively.
In your sentence:
without repeating a redundantly

there's the wrong assumption that a is redundant, as if the compiler could be instructed to negate a without repeating the variable name. No: it tells people reading the code (even yourself, for example a year later!) that the variable to be negated is a. And since the C grammar doesn't define any syntactic sugar for the logical negation operator, a is not redundant at all.
Often I convert some if statements into boolean expressions for code compactness. For instance, if I have something like
int foo(int x)
{
    if (x > 5) return 100 + 5;
    return 100;
}
I'll do it like this:

int foo(int x)
{
    return 100 + (x > 5) * 5;
}
This is very simple, so no problem there. The thing is, when I have multiple tests, I can greatly simplify them this way (at the expense of readability, but that's a different issue).
So the question is whether that (x > 5) evaluation is as onerous as explicitly branching on it.
In both cases the expression (x > 5) has to be evaluated, and as demonstrated already, both versions compile to the same assembly even without any optimization enabled.
However, the Philosophy section of C++ Core Guidelines has these two rules you would do well to pay heed to:
P.1: Express ideas directly in code
P.3: Express intent
Though these rules cannot be enforced in any way, adhering to them will make you adopt the version with the if statement.
Doing so will make the code less onerous for those who have to maintain it after you, and even for yourself a few months later.
You seem to be conflating C++ language constructs with patterns in the assembly. It may have been viable to reason about code on this level given the compilers of the late eighties or early nineties. At this point, however, compilers apply a lot of optimizations and transformations whose correctness or utility is not even obvious to the average programmer. A very simple example is the common beginner's mistake of assuming the following equivalences:
std::uint16_t a = ...;
a *= 2; // a multiplication in assembly
a *= 17; // ditto
a /= 3; // a division in assembly
They may then be surprised to find out that their compiler of choice translates these into the assembly equivalent of e.g.:
a <<= 1u;
a = (a << 4u) + a; // or even (a << 4u) | a if a < 16
a *= 43691u;
Note that the last transformation is only allowed if a is known to be a multiple of the divisor, so you may not see this kind of optimization all too often. How does it even work? In mathematical terms, uint16_t can be thought of as the residue class ring Z/(2^16)Z, and in this ring, there exists a multiplicative inverse for any element that is coprime to 2^16 (i.e. not divisible by 2). If d (e.g. 3) is coprime to 2, it has such an inverse, and then dividing by d is simply equivalent to multiplying by the inverse of d if the remainder is known to be zero. (I won't go into how this inverse can be calculated here.)
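As a quick sanity check of that last claim (my addition, not part of the original answer): 3 * 43691 = 131073 = 2 * 65536 + 1, which is congruent to 1 modulo 2^16, so 43691 really is the inverse of 3 in that ring:

#include <cassert>
#include <cstdint>

int main()
{
    const std::uint16_t inv3 = 43691u; // 3 * 43691 = 131073 = 2 * 65536 + 1
    assert(static_cast<std::uint16_t>(3u * inv3) == 1u);  // 43691 is the inverse of 3 mod 2^16
    const std::uint16_t a = 3u * 7u;                      // a multiple of 3
    assert(static_cast<std::uint16_t>(a * inv3) == 7u);   // "dividing" by 3 via multiplication
    return 0;
}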
Here is another surprising optimization:
long arithsum(long n)
{
    long result = 0;
    for (long i = 0; i <= n; ++i)
        result += i;
    return result;
}
GCC with -O3 rather mundanely translates this into an unrolled loop of additions. My version (9.0.0svn-something) of Clang, however, will pull a Gauss on you if you do this, and translate this into something like:
long arithsum(long n)
{
    return (n * (n + 1)) >> 1;
}
Anyway, the same caveats apply to if/switch etc. – while these are control flow structures, and so you'd think they correspond to branching, this may not be so. Likewise, what appears to be a non-branching operation might be translated to a branching operation if the compiler has an optimization rule under which this seems beneficial, or even if it is just unable to translate its own AST or intermediate representation into machine code without use of branching (on the given architecture).
TL;DR: Before you try to outsmart your compiler, figure out which assembly the compiler produces for the straightforward, readable code in the first place. If that assembly is good, there is no point in making the code more subtle and less readable.
I assume that by this you mean relying on (x > 5) evaluating to 1 or 0. Sure, it might work in C/C++ due to the implicit bool-to-int conversion, but it might not in other languages. If that's what you want to achieve, why not use the ternary operator (? :), which also makes the code more readable:
int foo(int x) {
    return (x > 5) ? (100 + 5) : 100;
}
Also read this Stack Overflow question on bool to int conversion.
The following code snippet is scattered all over the web and seems to be used in multiple different projects with very little changes:
union Float_t {
    Float_t(float num = 0.0f) : f(num) {}

    // Portable extraction of components.
    bool Negative() const { return (i >> 31) != 0; }
    int RawMantissa() const { return i & ((1 << 23) - 1); }
    int RawExponent() const { return (i >> 23) & 0xFF; }

    int i;
    float f;
};
inline bool AlmostEqualUlpsAndAbs(float A, float B, float maxDiff, int maxUlpsDiff)
{
    // Check if the numbers are really close -- needed
    // when comparing numbers near zero.
    float absDiff = std::fabs(A - B);
    if (absDiff <= maxDiff)
        return true;

    Float_t uA(A);
    Float_t uB(B);

    // Different signs means they do not match.
    if (uA.Negative() != uB.Negative())
        return false;

    // Find the difference in ULPs.
    return (std::abs(uA.i - uB.i) <= maxUlpsDiff);
}
However, I don't understand what is going on here. To my (maybe naive) understanding, the floating-point member variable f is initialized in the constructor, but the integer member i is not.
I'm not terribly familiar with the bitwise operators that are used here, but I fail to understand how accesses of uA.i and uB.i produce anything but random numbers, given that no line in the code actually connects the values of f and i in any meaningful way.
If somebody could enlighten me on why (and how) exactly this code produces the desired result, I would be very delighted!
A lot of undefined behaviour is being exploited here. The first assumption is that the fields of a union can be read in place of each other, which is, in itself, UB in C++ (C explicitly allows this kind of type punning, but C++ formally does not). Furthermore, the coder assumes that sizeof(int) == sizeof(float), that floats have a given length of mantissa and exponent, that all union members are aligned to offset zero, and that the binary representation of float coincides with the binary representation of int in a very specific way. In short, this will work as long as you're on x86, have specific int and float types, and say a prayer at every sunrise and sunset.
What you probably didn't notice is that this is a union, so int i and float f are usually laid out over the same memory by most compilers. This is, in general, still UB, and you can't even safely assume that the same physical bits of memory will be used without restricting yourself to a specific compiler and a specific architecture. All that's guaranteed is that the address of both members will be the same (though there might be alignment and/or aliasing issues).

Assuming that your compiler does use the same physical bits (which is by no means guaranteed by the standard) and that both members start at offset 0 and have the same size, then i will represent the binary storage format of f... as long as nothing changes in your architecture.

Word of advice? Do not use this unless you have to. Stick to floating-point operations for AlmostEquals(); you can implement it that way. Considering these specialities is the very final pass of optimization, usually done in a separate branch; you shouldn't plan your code around it.
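If you do need the integer view of a float in C++, a well-defined alternative to the union trick is to copy the bytes with std::memcpy (or std::bit_cast in C++20). A minimal sketch, still assuming IEEE-754 binary32 and sizeof(float) == sizeof(std::int32_t):

#include <cstdint>
#include <cstring>

// Returns the bit pattern of f as a 32-bit integer. Copying the bytes is
// well-defined in C++, unlike reading the inactive member of a union,
// though it still assumes IEEE-754 binary32 and matching sizes.
std::int32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::int32_t), "size mismatch");
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i;
}

Mainstream compilers recognize this pattern and compile it to a plain register move, so there is no performance penalty over the union version.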
In my software I take input values from the user at run time and perform some mathematical operations on them. Consider for simplicity the example below:
int multiply(const int a, const int b)
{
    if (a >= INT_MAX || b >= INT_MAX)
        return 0;
    else
        return a * b;
}
I can check whether the input values exceed the limits, but how do I check whether the result will be out of limits? It is quite possible that a = INT_MAX - 1 and b = 2. The inputs are perfectly valid, yet the multiplication overflows and triggers undefined behaviour, which makes my program meaningless: any code executed after that point may behave unpredictably and eventually crash. So how do I protect my program in such cases?
This really comes down to what you actually want to do in this case.
For a machine where long or long long (or int64_t) is a 64-bit value and int is a 32-bit value, you could do the following (I'm assuming long is 64-bit here):
long x = static_cast<long>(a) * b;
if (x > INT_MAX || x < INT_MIN)
    return 0;
else
    return static_cast<int>(x);
By casting one value to long, the other will be converted as well. You can cast both if that makes you happier. The overhead here, above a normal 32-bit multiply, is a couple of clock cycles on modern CPUs, and it's unlikely that you can find a safer solution that is also faster. [You can, in some compilers, add attributes to the if saying that it's unlikely, to encourage branch prediction to "get it right" for the common case of returning x.]
Obviously, this won't work for values whose type is already as big as the biggest integer you can deal with. (You could possibly use floating point, but that may be a bit dodgy, since the precision of float is not sufficient; it could be done with some "safety margin" [e.g. compare to less than LONG_MAX / 2] if you don't need the entire range of integers.) The penalty there is a bit worse, though, especially since transitions between float and integer aren't "pleasant".
Another alternative is to actually test the relevant code with "known invalid values" and check that the rest of the code copes with them. Make sure you test this with the relevant compiler settings, as changing the compiler options will change the behaviour. Note that your code then has to deal with "what do we do when 65536 * 100000 is a negative number", which it may not have expected. Perhaps add something like:
int x = a * b;
if (x < 0) return 0;
[But this only works if you don't expect negative results, of course]
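As an aside (not part of the original answer): on recent GCC and Clang there are also overflow-checking builtins, a compiler extension rather than standard C or C++, which avoid the widening cast entirely. A sketch:

int multiply(int a, int b)
{
    int result;
    // __builtin_mul_overflow computes a * b, stores the (possibly wrapped)
    // result, and returns true if the mathematical result did not fit.
    if (__builtin_mul_overflow(a, b, &result))
        return 0; // overflow
    return result;
}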
You could also inspect the generated assembly code and understand the architecture of the actual processor [the key here is to understand whether "overflow will trap", which it won't by default on x86, ARM, 68K or 29K; I think MIPS has an option to trap on overflow], determine whether it's likely to cause a problem [1], and add something like:
#if !(defined(__X86__) || defined(__ARM__))
#error This code needs inspecting for correct behaviour
#endif
return a * b;
One problem with this approach, however, is that even the slightest change in code or compiler version may alter the outcome, so it's important to couple this with the testing approach above (and make sure you test the ACTUAL production code, not some hacked-up mini-example).
[1] The behaviour is left undefined to allow C to "work" on processors that trap on integer overflow, and because what a * b yields when it overflows a signed value is hard to pin down unless you have a defined representation (two's complement, one's complement, distinct sign bit); so, to avoid prescribing the exact behaviour in these cases, the C standard says "it's undefined". It doesn't mean that it will definitely go bad.
Specifically for the multiplication of a by b, the mathematically correct way to detect overflow is to compare the log₂ of both values: if their sum is higher than the log₂ of the highest representable value of the result type, there is overflow.
log₂(a) + log₂(b) < log₂(UINT_MAX)
The difficulty is computing the log₂ of an integer quickly. For that, there are several bit-twiddling hacks that can be used, like counting bits or counting leading zeros (some processors even have instructions for that). This site has several implementations:
https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
The simplest implementation could be:
unsigned int log2(unsigned int v)
{
    unsigned int r = 0;
    while (v >>= 1)
        r++;
    return r;
}
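If you can rely on GCC or Clang, the counting-leading-zeros approach mentioned above can be written with a builtin (a compiler extension, not standard C; this variant is my addition):

#include <limits.h>

unsigned int log2_clz(unsigned int v)
{
    /* floor(log2(v)) = (bit width - 1) - number of leading zeros.
       __builtin_clz is undefined for v == 0, so callers must check. */
    return CHAR_BIT * sizeof v - 1 - __builtin_clz(v);
}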
In your program you then only need to check:

if (log2(a) + log2(b) < MYLOG2UINTMAX)
    return a * b;
else
    printf("Overflow");
The signed case is similar but has to take care of the negative case specifically.
EDIT: My solution is not complete and has an error which makes the test more severe than necessary. The equation works if the log₂ function returns a floating-point value; in the implementation I limited the values to unsigned integers. This means that completely valid multiplications get refused. Why? Because log₂(UINT_MAX) is truncated:

log₂(UINT_MAX) = log₂(4294967295) ≈ 31.9999999997, truncated to 31.

We therefore have to change the implementation and replace the constant we compare against:

#define MYLOG2UINTMAX (CHAR_BIT * sizeof (unsigned int))
You may try this (assuming positive values, and int inputs as in the question):

if (a != 0 && b > INT_MAX / a) // check a != 0 before the division
    return 0; // a*b would overflow and invoke UB
else
    return a * b;
I'm looking for a nice readable way to write this (a and b are input):
int value = 50;
if (a == value) return b;
if (b == value) return a;
return max(a, b);
This is far too long. I have come up with this, but it is not clear enough:

return (a == value) ? b : ((b == value) ? a : max(a, b));

Is there a way to achieve this with only the max define?
Code should be easy to read and understand. Your first version, which you said is far too long, is very clear, and three lines is not an unreasonable amount of code for C or C++.
If you will do this often, package it up in a function and just call the function. If the forbidden number can change, make it an argument to the function.
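A sketch of such a helper (the name and the extra parameter for the forbidden value are illustrative):

#include <algorithm>

int max_excluding(int a, int b, int forbidden)
{
    if (a == forbidden) return b;
    if (b == forbidden) return a;
    return std::max(a, b);
}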
Note that the ternary expression is essentially identical to the code using if statements. They will likely compile to identical code.