I'm looking for a neat way (most likely, a "bitwise shortcut") for calculating the signed value of the expression (x - y) / z, given unsigned operands x, y and z.
Here is a "kinda real kinda pseudo" code illustrating what I am currently doing (please don't mind the actual syntax being "100% perfect C or C++"):
int64 func(uint64 x, uint64 y, uint64 z)
{
    if (x >= y) {
        uint64 result = (x - y) / z;
        if (int64(result) >= 0)
            return int64(result);
    }
    else {
        uint64 result = (y - x) / z;
        if (int64(result) >= 0)
            return -int64(result);
    }
    throwSomeError();
}
Please assume that I don't have a larger type at hand.
I'd be happy to read any idea of how to make this simpler/shorter/neater.
There is a shortcut: use a bitwise trick for conditional negation twice (once to form the absolute difference, and again to restore the sign).
I'll use some similar non-perfect C-ish syntax I guess, to match the question.
First get a mask that has all bits set iff x < y:
uint64 m = -uint64(x < y);
(x - y) and -(y - x) are actually the same value, even in unsigned (wrap-around) arithmetic, and conditional negation can be done using the definition of two's complement: -a = ~(a - 1) = (a + (-1)) ^ -1. Since (a + 0) ^ 0 is of course just a, the expression (a + m) ^ m equals -a when m is -1 (all bits set) and a when m is zero, so it is a conditional negation.
uint64 absdiff = (x - y + m) ^ m;
Then divide as usual, and restore the sign by doing another conditional negation:
return int64((absdiff / z + m) ^ m);
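Putting the pieces together, a minimal sketch in real C (using the stdint.h names for the pseudo-types, and leaving out the range check that your original code performs) could look like this:

#include <stdint.h>

/* Sketch only: assumes z != 0 and that the magnitude of the result fits in
   int64_t, which the original code guarded with its explicit checks. */
int64_t func(uint64_t x, uint64_t y, uint64_t z)
{
    uint64_t m = -(uint64_t)(x < y);          /* all bits set iff x < y     */
    uint64_t absdiff = (x - y + m) ^ m;       /* conditional negation no. 1 */
    return (int64_t)((absdiff / z + m) ^ m);  /* divide, then negate back   */
}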
Find the number of paths on the Cartesian plane from (0, 0) to (n, n) which never rise above the line y = x. It is possible to make three types of moves along the path:
move up, i.e. from (i, j) to (i, j + 1);
move to the right, i.e. from (i, j) to (i + 1, j);
the right-up move, i.e. from (i, j) to (i + 1, j + 1)
Path count 101
First, we solve a simpler problem:
Find the number of paths on the Cartesian plane from (0, 0) to (n, n) with:
move up, i.e. from (i, j) to (i, j + 1);
move to the right, i.e. from (i, j) to (i + 1, j);
and we are allowed to visit cells where x < y.
How do we solve it? Too hard? Okay, let's first find the number of paths from (0, 0) to (2, 2). We could draw all the paths in a grid:
We define
f(x,y) => the number of paths from (0, 0) to (x, y)
You can see that the path reaches (2, 2) from either (2, 1) or (1, 2), so we can get:
f(2, 2) = f(2, 1) + f(1, 2)
And then you will notice that for a point (x, y), its paths come from either (x, y - 1) or (x - 1, y). That's very natural, since we have only two possible moves:
move up, i.e. from (i, j) to (i, j + 1);
move to the right, i.e. from (i, j) to (i + 1, j);
I drew a larger illustration for you, so you can check our conclusion:
So we can get that:
f(x, y) = f(x, y - 1) + f(x - 1, y)
Wait... What if x = 0 or y = 0? That's straightforward:
if x = 0 => f(x, y) = f(x, y - 1)
if y = 0 => f(x, y) = f(x - 1, y)
The last... How about f(0, 0)? We define:
f(0, 0) = 1
since there is exactly one path from (0, 0) to each of (1, 0) and (0, 1).
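To make this concrete, here are the values of f for this simpler problem on the grid up to (2, 2); each cell is the sum of the cell to its left and the cell below it:

f(0, 2) = 1   f(1, 2) = 3   f(2, 2) = 6
f(0, 1) = 1   f(1, 1) = 2   f(2, 1) = 3
f(0, 0) = 1   f(1, 0) = 1   f(2, 0) = 1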
OK, to summarise:
f(x, y) = f(x, y - 1) + f(x - 1, y)
if x = 0 => f(x, y) = f(x, y - 1)
if y = 0 => f(x, y) = f(x - 1, y)
f(0, 0) = 1
And by recursion, we can solve that problem.
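Here is a minimal C sketch of that recursion, just to show the shape of the memoised computation (the bound N and the table names are purely illustrative):

#include <stdio.h>

#define N 16                      /* illustrative grid bound */
static long long memo[N][N];
static int seen[N][N];

/* Number of up/right paths from (0, 0) to (x, y), no restriction. */
long long f(int x, int y)
{
    if (x == 0 && y == 0) return 1;           /* f(0, 0) = 1        */
    if (seen[x][y]) return memo[x][y];
    long long ret = 0;
    if (x > 0) ret += f(x - 1, y);            /* came from the left */
    if (y > 0) ret += f(x, y - 1);            /* came from below    */
    seen[x][y] = 1;
    return memo[x][y] = ret;
}

int main(void)
{
    printf("%lld\n", f(2, 2));                /* prints 6, i.e. C(4, 2) */
    return 0;
}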
Your problem
Now let's discuss your original problem, just modify our equations a little bit:
f(x, y) = f(x, y - 1) + f(x - 1, y) + f(x - 1, y - 1)
if x = 0 => f(x, y) = f(x, y - 1)
if y = 0 => f(x, y) = f(x - 1, y)
if x < y => f(x, y) = 0   (this rule is checked first, as in the code below)
f(0, 0) = 1
and this leads to my code below.
The last thing I added to my code is memoization. In short, memoization eliminates repeated calculations -- once we have calculated f(x, y), we store it in a dictionary and never calculate it again. You can read the wiki page for more details.
So, that's all of my code. If you still have questions, you can leave a comment here and I will reply as soon as possible.
Code:
d = {}  # Memoization

def find(x, y):
    if x == 0 and y == 0:
        return 1
    if x < y:
        return 0
    if (x, y) in d:
        return d[(x, y)]
    ret = 0
    if x > 0:
        ret += find(x - 1, y)
    if y > 0:
        ret += find(x, y - 1)
    if x > 0 and y > 0:
        ret += find(x - 1, y - 1)
    d[(x, y)] = ret
    return ret

print(find(2, 1))  # 4
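For the original problem you would call find(n, n); for example, find(2, 2) returns 6 with this recurrence.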
For additional ideas on solving problems such as this one, there is a mathematical curiosity that is in one-to-one correspondence not only with the sequences produced by lattice walks whose steps must remain below the line x = y, but with a veritable plethora of fantastical mathematical beasts that cut a wide swath of applicability across problem solving and research.
What are these curiosities?
Catalan Numbers:
C_n = 1/(n + 1) * (2n choose n), for n >= 0, so C_0 = 1.
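If you want to generate a few of them, here is a small sketch (plain C, purely for illustration) that uses the equivalent recurrence C_{n+1} = C_n * 2(2n + 1) / (n + 2):

#include <stdio.h>

int main(void)
{
    unsigned long long c = 1;                 /* C_0 = 1 */
    for (int n = 0; n < 10; n++) {
        printf("C_%d = %llu\n", n, c);        /* 1, 1, 2, 5, 14, 42, ... */
        c = c * 2 * (2 * n + 1) / (n + 2);    /* C_{n+1} from C_n        */
    }
    return 0;
}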
They also count:
The number of expressions containing $n$ pairs of parentheses
e.g. n = 3: ((())), (()()), (())(), ()(()), ()()()
Number of plane trees with n + 1 vertices
Number of Triangulations of a convex (n + 2)-gon
The number of monomials in the expansion of the product p(x1, ..., xn) = x1(x1 + x2)(x1 + x2 + x3) ... (x1 + ... + xn)
bipartite vertices of rooted planar trees
And soooo many more things
These objects appear in a lot of active research in mathematical physics, for instance, which in turn is an active area of algorithms research because of its enormous data sets. So you never know which seemingly far-flung concepts are intimately linked in some deep, dark mathematical recess.
I was trying to write a simple CUDA function to blur images. I use my own min and max macros, defined as
#define min(a, b) ((float)a > (float)b)? (float)b: (float)a
#define max(a, b) ((float)a > (float)b)? (float)a: (float)b
The relevant part of the __global__ kernel is:
float norm;
float sum = 0;  // when the filter exceeds the border, norm will be affected!
int center = radius * filterWidth + radius;
int imgx = 0, imgy = 0;
for (int y = -radius; y <= radius; y++)
{
    for (int x = -radius; x <= radius; x++)
    {
        imgx = min(max(x + absolute_image_position_x, 0), numCols - 1);
        //imgx = min(numCols - 1, imgx);
        imgy = min(max(y + absolute_image_position_y, 0), numRows - 1);
        //imgy = min(numRows-1, imgy);
        sum += (float) inputChannel[(imgy * numCols) + imgx] * filter[center + (y * filterWidth) + x];
    }
}
outputChannel[pos] = (unsigned char) sum;
But min and max do not give the correct answer when I debug: for example, min(max(10, 0), 100) gives 100.0f! I did not check step by step why it goes wrong, but after I switched to the CUDA math functions the results became correct. Does anyone have an idea? Is there any restriction on using macros in a CUDA kernel?
Getting rid of the (float) casts to clear the clutter, your macros look like this:
#define min(a, b) (a > b)? b: a
#define max(a, b) (a > b)? a: b
And an example use (simplifying a few variable names):
imgx = min(max(x + aipx, 0), nc-1);
will expand to:
imgx = ((x + aipx > 0)? x + aipx: 0 > nc-1)? nc-1: (x + aipx > 0)? x + aipx: 0;
And indeed it parses differently from what you intended: the first parenthesised chunk groups as (x + aipx > 0) ? (x + aipx) : (0 > nc-1), so whenever x + aipx is positive the outer condition is non-zero and the whole expression evaluates to nc-1 -- which is exactly the min(max(10, 0), 100) == 100 behaviour you observed. Try putting extra parens around the use of your macros' arguments:
#define min(a, b) ((a) > (b))? (b): (a)
#define max(a, b) ((a) > (b))? (a): (b)
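Not strictly needed to fix this particular bug, but the usual defensive convention also wraps the whole expansion in parentheses, and a small __device__ helper sidesteps both the precedence problem and double evaluation of an argument (the my_/dev_ names below are purely illustrative):

// Fully parenthesised macro versions.
#define my_min(a, b) (((a) > (b)) ? (b) : (a))
#define my_max(a, b) (((a) > (b)) ? (a) : (b))

// Or small device helpers, which also avoid evaluating an argument twice.
__device__ __forceinline__ float dev_min(float a, float b) { return a < b ? a : b; }
__device__ __forceinline__ float dev_max(float a, float b) { return a > b ? a : b; }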
I have an integer-valued bounded variable, call it X. (Somewhere around 0<=X<=100)
I want to have a binary variable, call it Y, such that Y=1 if X >= A and X <= B, otherwise Y=0.
The best I've come up with thus far is the following (where T<x> are introduced binary variables, and M is a large number)
(minimize Y)
(X - A) <= M*Ta
(B - X) <= M*Tb
Y <= Ta
Y <= Tb
Y >= Ta + Tb - 1
(In other words, introducing two binary variables that are true if the variable satisfies the lower and upper bounds of the range, respectively, and setting the result to the binary multiplication of those variables)
This... works, sort of, but has a couple of major flaws. In particular, it is not rigorously defined: Y can be 1 even if X is outside the range (for example, with A = 10, B = 20, X = 50 the constraints still allow Ta = Tb = 1, and then Y >= Ta + Tb - 1 forces Y = 1). So: is there a better way to do this? In particular, is there a way to rigorously define it, or if not, a way to at least prevent false positives?
Edit: to clarify: A and B are variables, not parameters.
I think the below works.
(I) A * Y <= X <= B * Y + 100 * (1 - Y)
(II) (X - A) <= M * Ta
(III) (B - X) <= M * Tb
(IV) Y >= Ta + Tb - 1
So X < A gives:
(I) Y = 0
and (II), (III), (IV) do not matter.
X > B gives:
(I) Y = 0
and (II), (III), (IV) do not matter.
A <= X <= B gives:
(I) Y = 1 or Y = 0
(II) Ta = 1
(III) Tb = 1
(IV) Y = 1
Rewriting Ioannis's answer in a linear form by expanding out the multiplication of a binary variable with a continuous variable:
Tc <= M*Y
Tc <= A
Tc >= A - M*(1-Y)
Tc >= 0
Tc <= X
Td <= M*Y
Td <= B
Td >= B - M*(1-Y)
Td >= 0
X <= Td + 100*(1-Y)
(X - A + 1) <= M * Ta
(B - X + 1) <= M * Tb
Y >= Ta + Tb - 1
This seems to work, although I have not yet had the chance to expand it out to prove it. Also, some of these constraints may be unnecessary; I have not checked.
The expansion I did was according to the following rule:
If b is a binary variable, and c is a continuous one, and 0 <= c <= M, then y=b*c is equivalent to the following:
y <= M*b
y <= c
y >= c - M*(1 - b)
y >= 0
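As a quick check of that rule with, say, M = 100 and c = 7: if b = 1 the constraints become y <= 100, y <= 7, y >= 7 and y >= 0, forcing y = 7 = b*c; if b = 0 they become y <= 0, y <= 7, y >= -93 and y >= 0, forcing y = 0 = b*c.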
I worked with C++ for a long time and now I am on a C project. I am in the process of converting a C++ program to C, and I am having difficulty with the constants used in the program.
In the C++ code we have constants defined like
static const int X = 5 + 3;
static const int Y = (X + 10) * 5;
static const int Z = ((Y + 8) + 0xfff) & ~0xfff;
In C, these definitions produce compile errors. When I use #defines instead of the constants, like
#define X (5+3);
#define Y (((X) + 10) * 5)
#define Z ((((Y) + 8) + 0xfff) & ~0xfff)
the C compiler complains about the definitions of "Y" and "Z".
Could anyone please help me find a solution for this?
You need to remove the semicolon from the #define X line; with it, every use of Y expands to ((((5+3);) + 10) * 5), and the stray semicolon in the middle of the expression is a syntax error. The corrected definitions are:
#define X (5+3)
#define Y (((X) + 10) * 5)
#define Z ((((Y) + 8) + 0xfff) & ~0xfff)
#define X (5+3); is wrong; it needs to be #define X (5+3) (without the ';'). Also be aware of the difference between static const and #define: with static const the value is evaluated once, while #define is a pre-processor text substitution, so
#define n very_heavy_calc()
...
n*n;
will result in evaluating very_heavy_calc() twice
Another option is to use an enum:
enum {
    X = 5 + 3,
    Y = (X + 10) * 5,
    Z = ((Y + 8) + 0xfff) & ~0xfff
};
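Just to sanity-check the values (X = 8, Y = 90, Z = 0x1000), a tiny standalone program using the enum might look like this:

#include <stdio.h>

enum {
    X = 5 + 3,
    Y = (X + 10) * 5,
    Z = ((Y + 8) + 0xfff) & ~0xfff
};

int main(void)
{
    /* X = 8, Y = 90, Z = 0x1000: Z rounds Y + 8 up to a 4 KiB boundary. */
    printf("X = %d, Y = %d, Z = 0x%x\n", X, Y, Z);
    return 0;
}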