C++ define expression evaluation [duplicate]

This question already has answers here:
The need for parentheses in macros in C [duplicate]
(8 answers)
Closed 7 years ago.
Suppose we have this expression:
#define cube(x) x * x * x
And then we call it:
int n = 3, v;
v = cube(n + 1); // v = 10
v = cube((n + 1)); // v = 64
v = cube(n); // v = 27
So the question is: why does the first call not give v = 64?

Macros are not evaluated in the usual sense of the word; they are expanded before compilation proper begins.
Before the file is compiled, a separate step called the C preprocessor replaces each macro invocation textually and prepares the file for actual compilation. So, for your macro
#define cube(x) x * x * x
this
v = cube(n + 1);
is replaced with this ("expanded" is the correct term)
v = n + 1 * n + 1 * n + 1;
// Simplifies to
v = n + n + n + 1;
// and again
v = 3 * n + 1;
which for n = 3 gives 10, exactly the observed result.
Note that when you add parentheses
v = cube((n + 1));
then, the expansion is
v = (n + 1) * (n + 1) * (n + 1);
which is what you would expect cube() to do. To prevent this problem, redefine your macro like this:
#define cube(x) ((x) * (x) * (x))
If you are using gcc try
gcc -E source.c
and check the result to verify how the macro was expanded.

Related

Difference between "for" and "forall" on the right-hand side of assignment

In the following code, I am assigning some dummy (test) values to all the elements of a 2-dimensional array (a[][]) and then checking whether the array is filled as expected. When using the line (A), the code works as expected (i.e., it prints "passed").
proc test()
{
    const N = 1000;
    var a: [1..N, 1..3] real = (for m in 1 .. (3 * N) do (1.0 / m)); // (A)
    // var a: [1..N, 1..3] real = [m in 1 .. (3 * N)] (1.0 / m); // (B)
    // var a: [1..N, 1..3] real = forall m in 1 .. (3 * N) do (1.0 / m); // (C)
    var m = 0;
    for i in 1 .. N {
        for k in 1 .. 3 {
            m += 1;
            assert( abs( a[i, k] - 1.0 / m ) < 1.0e-10 );
        }
    }
    writeln("passed");
}
test();
On the other hand, if I use line (B) or (C) instead of (A), the code gives this error:
test.chpl:7: error: iteration over a range with multi-dimensional iterator
What does this error message mean? (My expectation is that the right-hand side of the assignment is evaluated first, i.e., a temporary array is created in parallel and then assigned to the left-hand side. Is this expectation incorrect?)
(I am also wondering whether it is valid to assign a one-dimensional array to a two-dimensional array, as in line (A), without reshape.)

Check multiple bits in bitset

I got the following code:
p = B[m] & B[m + 5] & B[m + 6] & B[m + 11];
m -= d * (l > 0) * 11 + !d * (c % 5 > 0);
p += m ^ M ? B[m] & B[m + 5] & B[m + 6] & B[m + 11] : 0;
I know it's hard to read, so here's a TL;DR: I check multiple bits (all at offsets relative to m) in a bitset, then I change the value of m and check again (other bits). Is there a way to access those bits with less code, or to factor the check into a template, since the same formula is used both times?
B[m] & B[m + 5] & B[m + 6] & B[m + 11]
Thank you :D.

I suggest using a function to precompute a helper bitset for that. Note that the shifts must be right shifts: (B >> 5)[m] equals B[m + 5], whereas (B << 5)[m] would equal B[m - 5].
bitset<99> prepare_bitset(const bitset<99>& B)
{
return B & (B>>5) & (B>>6) & (B>>11);
}
Then you can just use it like this:
auto HB = prepare_bitset(B);
p = HB[m];
m -= d * (l > 0) * 11 + !d * (c % 5 > 0);
p += m ^ M ? HB[m] : 0;
UPD: Another option is to just define HB in place:
auto HB = B & (B>>5) & (B>>6) & (B>>11);
p = HB[m];
m -= d * (l > 0) * 11 + !d * (c % 5 > 0);
p += m ^ M ? HB[m] : 0;

Make a function that takes B and m.
So p = yourFunc(B, m) and p += m ^M ? yourFunc(B, m) : 0
The function is something like:
TYPEOFP yourFunc(TYPEOFB b, TYPEOFM m) {
return b[m] & b[m + 5] & b[m + 6] & b[m + 11];
}
I don't know your types, so you need to fill it in.
I wouldn't recommend a macro, but if you want that it's
#define yourMACRO(b, m) ((b)[(m)] & (b)[(m) + 5] & (b)[(m) + 6] & (b)[(m) + 11])
All of those extra parens are to protect you if you ever pass in an expression for b or m. The macro will fail if you pass in something with side-effects (like ++m).
EDIT: From your comments, you said you can't write outside the function.
It's unorthodox, but you can do the #define in the function and #undef it at the end of the function.
Depending on the version of C++ you have, you might have lambdas, which let you make function expressions.
If you are desperate, you can define an inner class or struct with a static function: C++ can we have functions inside functions?

Range Reduction Poor Precision For Single Precision Floating Point

I am trying to implement range reduction as the first step of implementing the sine function.
I am following the method described in the paper "ARGUMENT REDUCTION FOR HUGE ARGUMENTS" by K.C. NG
I am getting errors as large as 0.002339146 when using the input range of x from 0 to 20000. The error obviously shouldn't be that large, and I'm not sure how I can reduce it. I noticed that the error magnitude grows with the magnitude of the input theta to cosine/sine.
I was able to obtain the nearpi.c code that the paper mentions, but I'm not sure how to utilize the code for single precision floating point. If anyone is interested, the nearpi.c file can be found at this link: nearpi.c
Here is my MATLAB code:
x = 0:0.1:20000;
% Perform range reduction
% Store constant 2/pi
twooverpi = single(2/pi);
% Compute y
y = (x.*twooverpi);
% Compute k (round to nearest integer)
k = round(y);
% Solve for f
f = single(y-k);
% Solve for r
r = single(f*single(pi/2));
% Find last two bits of k
n = bitand(fi(k,1,32,0),fi(3,1,32,0));
n = single(n);
% Preallocate for speed
z(length(x)) = 0;
for i = 1:length(x)
switch(n(i))
case 0
z(i)=sin(r(i));
case 1
z(i) = single(cos(r(i)));
case 2
z(i) = -sin(r(i));
case 3
z(i) = single(-cos(r(i)));
otherwise
end
end
maxerror = max(abs(single(z - single(sin(single(x))))))
minerror = min(abs(single(z - single(sin(single(x))))))
I have edited the program nearpi.c so that it compiles. However, I am not sure how to interpret the output. The program also expects input, which I had to type in by hand, and I am not sure of the significance of that input.
Here is the working nearpi.c:
/*
============================================================================
Name : nearpi.c
Author :
Version :
Copyright : Your copyright notice
Description : Hello World in C, Ansi-style
============================================================================
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
/*
* Global macro definitions.
*/
# define hex( double ) *(1 + ((long *) &double)), *((long *) &double)
# define sgn(a) (a >= 0 ? 1 : -1)
# define MAX_k 2500
# define D 56
# define MAX_EXP 127
# define THRESHOLD 2.22e-16
/*
* Global Variables
*/
int CFlength, /* length of CF including terminator */
binade;
double e,
f; /* [e,f] range of D-bit unsigned int of f;
form 1X...X */
// Function Prototypes
int dbleCF (double i[], double j[]);
void input (double i[]);
void nearPiOver2 (double i[]);
/*
* This is the start of the main program.
*/
int main (void)
{
int k; /* subscript variable */
double i[MAX_k],
j[MAX_k]; /* i and j are continued fractions
(coeffs) */
// fp = fopen("/src/cfpi.txt", "r");
/*
* Compute global variables e and f, where
*
* e = 2 ^ (D-1), i.e. the D bit number 10...0
* and
* f = 2 ^ D - 1, i.e. the D bit number 11...1 .
*/
e = 1;
for (k = 2; k <= D; k = k + 1)
e = 2 * e;
f = 2 * e - 1;
/*
* Compute the continued fraction for (2/e)/(pi/2) , i.e.
* q's starting value for the first binade, given the continued
* fraction for pi as input; set the global variable CFlength
* to the length of the resulting continued fraction (including
* its negative valued terminator). One should use as many
* partial coefficients of pi as necessary to resolve numbers
* of the width of the underflow plus the overflow threshold.
* A rule of thumb is 0.97 partial coefficients are generated
* for every decimal digit of pi .
*
* Note: for radix B machines, subroutine input should compute
* the continued fraction for (B/e)/(pi/2) where e = B ^ (D - 1).
*/
input (i);
/*
* Begin main loop over all binades:
* For each binade, find the nearest multiples of pi/2 in that binade.
*
* [ Note: for hexadecimal machines ( B = 16 ), the rest of the main
* program simplifies(!) to
*
* B_ade = 1;
* while (B_ade < MAX_EXP)
* {
* dbleCF (i, j);
* dbleCF (j, i);
* dbleCF (i, j);
* CFlength = dbleCF (j, i);
* B_ade = B_ade + 1;
* }
* }
*
* because the alternation of source & destination are no longer necessary. ]
*/
binade = 1;
while (binade < MAX_EXP)
{
/*
* For the current (odd) binade, find the nearest multiples of pi/2.
*/
nearPiOver2 (i);
/*
* Double the continued fraction to get to the next (even) binade.
* To save copying arrays, i and j will alternate as the source
* and destination for the continued fractions.
*/
CFlength = dbleCF (i, j);
binade = binade + 1;
/*
* Check for main loop termination again because of the
* alternation.
*/
if (binade >= MAX_EXP)
break;
/*
* For the current (even) binade, find the nearest multiples of pi/2.
*/
nearPiOver2 (j);
/*
* Double the continued fraction to get to the next (odd) binade.
*/
CFlength = dbleCF (j, i);
binade = binade + 1;
}
return 0;
} /* end of Main Program */
/*
* Subroutine DbleCF doubles a continued fraction whose partial
* coefficients are i[] into a continued fraction j[], where both
* arrays are of a type sufficient to do D-bit integer arithmetic.
*
* In my case ( D = 56 ) , I am forced to treat integers as double
* precision reals because my machine does not have integers of
* sufficient width to handle D-bit integer arithmetic.
*
* Adapted from a Basic program written by W. Kahan.
*
* Algorithm based on Hurwitz's method of doubling continued
* fractions (see Knuth Vol. 3, p.360).
*
* A negative value terminates the last partial quotient.
*
* Note: for the non-C programmers, the statement break
* exits a loop and the statement continue skips to the next
* case in the same loop.
*
* The call modf ( l / 2, &l0 ) assigns the integer portion of
* half of L to L0.
*/
int dbleCF (double i[], double j[])
{
double k,
l,
l0,
j0;
int n,
m;
n = 1;
m = 0;
j0 = i[0] + i[0];
l = i[n];
while (1)
{
if (l < 0)
{
j[m] = j0;
break;
};
modf (l / 2, &l0);
l = l - l0 - l0;
k = i[n + 1];
if (l0 > 0)
{
j[m] = j0;
j[m + 1] = l0;
j0 = 0;
m = m + 2;
};
if (l == 0) {
/*
* Even case.
*/
if (k < 0)
{
m = m - 1;
break;
}
else
{
j0 = j0 + k + k;
n = n + 2;
l = i[n];
continue;
};
}
/*
* Odd case.
*/
if (k < 0)
{
j[m] = j0 + 2;
break;
};
if (k == 0)
{
n = n + 2;
l = l + i[n];
continue;
};
j[m] = j0 + 1;
m = m + 1;
j0 = 1;
l = k - 1;
n = n + 1;
continue;
};
m = m + 1;
j[m] = -99999;
return (m);
}
/*
* Subroutine input computes the continued fraction for
* (2/e) / (pi/2) , where e = 2 ^ (D-1) , given pi 's
* continued fraction as input. That is, double the continued
* fraction of pi D-3 times and place a zero at the front.
*
* One should use as many partial coefficients of pi as
* necessary to resolve numbers of the width of the underflow
* plus the overflow threshold. A rule of thumb is 0.97
* partial coefficients are generated for every decimal digit
* of pi . The last coefficient of pi is terminated by a
* negative number.
*
* I'll be happy to supply anyone with the partial coefficients
* of pi . My ARPA address is mcdonald@ucbdali.BERKELEY.ARPA .
*
* I computed the partial coefficients of pi using a method of
* Bill Gosper's. I need only compute with integers, albeit
* large ones. After writing the program in bc and Vaxima ,
* Prof. Fateman suggested FranzLisp . To my surprise, FranzLisp
* ran the fastest! the reason? FranzLisp's Bignum package is
* hand coded in assembler. Also, FranzLisp can be compiled.
*
*
* Note: for radix B machines, subroutine input should compute
* the continued fraction for (B/e)/(pi/2) where e = B ^ (D - 1).
* In the case of hexadecimal ( B = 16 ), this is done by repeated
* doubling the appropriate number of times.
*/
void input (double i[])
{
int k;
double j[MAX_k];
/*
* Read in the partial coefficients of pi from a precalculated file
* until a negative value is encountered.
*/
k = -1;
do
{
k = k + 1;
scanf ("%lE", &i[k]);
printf("hello\n");
printf("%d", k);
} while (i[k] >= 0);
/*
* Double the continued fraction for pi D-3 times using
* i and j alternately as source and destination. On my
* machine D = 56 so D-3 is odd; hence the following code:
*
* Double twice (D-3)/2 times,
*/
for (k = 1; k <= (D - 3) / 2; k = k + 1)
{
dbleCF (i, j);
dbleCF (j, i);
};
/*
* then double once more.
*/
dbleCF (i, j);
/*
* Now append a zero on the front (reciprocate the continued
* fraction) and the return the coefficients in i .
*/
i[0] = 0;
k = -1;
do
{
k = k + 1;
i[k + 1] = j[k];
} while (j[k] >= 0);
/*
* Return the length of the continued fraction, including its
* terminator and initial zero, in the global variable CFlength.
*/
CFlength = k;
}
/*
* Given a continued fraction's coefficients in an array i ,
* subroutine nearPiOver2 finds all machine representable
* values near an integer multiple of pi/2 in the current binade.
*/
void nearPiOver2 (double i[])
{
int k, /* subscript for recurrences (see
handout) */
K; /* like k , but used during cancel. elim.
*/
double p[MAX_k], /* product of the q's (see
handout) */
q[MAX_k], /* successive tail evals of CF (see
handout) */
j[MAX_k], /* like convergent numerators (see
handout) */
tmp, /* temporary used during cancellation
elim. */
mk0, /* m[k - 1] (see
handout) */
mk, /* m[k] is one of the few ints (see
handout) */
mkAbs, /* absolute value of m sub k
*/
mK0, /* like mk0 , but used during cancel.
elim. */
mK, /* like mk , but used during cancel.
elim. */
z, /* the object of our quest (the argument)
*/
m0, /* the mantissa of z as a D-bit integer
*/
x, /* the reduced argument (see
handout) */
ldexp (), /* sys routine to multiply by a power of
two */
fabs (), /* sys routine to compute FP absolute
value */
floor (), /* sys routine to compute greatest int <=
value */
ceil (); /* sys routine to compute least int >=
value */
/*
* Compute the q's by evaluating the continued fraction from
* bottom up.
*
* Start evaluation with a big number in the terminator position.
*/
q[CFlength] = 1.0e+30;
for (k = CFlength - 1; k >= 0; k = k - 1)
q[k] = i[k] + 1 / q[k + 1];
/*
* Let THRESHOLD be the biggest | x | that we are interested in
* seeing.
*
* Compute the p's and j's by the recurrences from the top down.
*
* Stop when
*
* 1 1
* ----- >= THRESHOLD > ------ .
* 2 |j | 2 |j |
* k k+1
*/
p[0] = 1;
j[0] = 0;
j[1] = 1;
k = 0;
do
{
p[k + 1] = -q[k + 1] * p[k];
if (k > 0)
j[1 + k] = j[k - 1] - i[k] * j[k];
k = k + 1;
} while (1 / (2 * fabs (j[k])) >= THRESHOLD);
/*
* Then mk runs through the integers between
*
* k + k +
* (-1) e / p - 1/2 & (-1) f / p - 1/2 .
* k k
*/
for (mkAbs = floor (e / fabs (p[k]));
mkAbs <= ceil (f / fabs (p[k])); mkAbs = mkAbs + 1)
{
mk = mkAbs * sgn (p[k]);
/*
* For each mk , mk0 runs through integers between
*
* +
* m q - p THRESHOLD .
* k k k
*/
for (mk0 = floor (mk * q[k] - fabs (p[k]) * THRESHOLD);
mk0 <= ceil (mk * q[k] + fabs (p[k]) * THRESHOLD);
mk0 = mk0 + 1)
{
/*
* For each pair { mk , mk0 } , check that
*
* k
* m = (-1) ( j m - j m )
* 0 k-1 k k k-1
*/
m0 = (k & 1 ? -1 : 1) * (j[k - 1] * mk - j[k] * mk0);
/*
* lies between e and f .
*/
if (e <= fabs (m0) && fabs (m0) <= f)
{
/*
* If so, then we have found an
*
* k
* x = ((-1) m / p - m ) / j
* 0 k k k
*
* = ( m q - m ) / p .
* k k k-1 k
*
* But this later formula can suffer cancellation. Therefore,
* run the recurrence for the mk 's to get mK with minimal
* | mK | + | mK0 | in the hope mK is 0 .
*/
K = k;
mK = mk;
mK0 = mk0;
while (fabs (mK) > 0)
{
p[K + 1] = -q[K + 1] * p[K];
tmp = mK0 - i[K] * mK;
if (fabs (tmp) > fabs (mK0))
break;
mK0 = mK;
mK = tmp;
K = K + 1;
};
/*
* Then
* x = ( m q - m ) / p
* K K K-1 K
*
* as accurately as one could hope.
*/
x = (mK * q[K] - mK0) / p[K];
/*
* To return z and m0 as positive numbers,
* x must take the sign of m0 .
*/
x = x * sgn (m0);
m0 = fabs (m0);
/*
* Set z = m0 * 2 ^ (binade+1-D) .
*/
z = ldexp (m0, binade + 1 - D);
/*
* Print z (hex), z (dec), m0 (dec), binade+1-D, x (hex), x (dec).
*/
printf ("%08lx %08lx Z=%22.16E M=%17.17G L+1-%d=%3d %08lx %08lx x=%23.16E\n", hex (z), z, m0, D, binade + 1 - D, hex (x), x);
}
}
}
}

Theory
First let's note the difference using single-precision arithmetic makes.
[Equation 8] The minimal value of f can be larger. Since the double-precision numbers are a superset of the single-precision numbers, the closest single to a multiple of pi/2 can only be farther away than ~2.98e-19; therefore the fixed-point representation of f has at most 61 leading zeros (and probably fewer). Denote this quantity fdigits.
[Equation Before 9] Consequently, instead of 121 bits, y must be accurate to fdigits + 24 (non-zero significant bits in single-precision) + 7 (extra guard bits) = fdigits + 31, and at most 92.
[Equation 9] Therefore, together with the width of x's exponent, 2/pi must contain 127 (maximal exponent of single) + 31 + fdigits bits, that is, 158 + fdigits, and at most 219 bits.
[Subsection 2.5] The size of A is determined by the number of zeros in x before the binary point (and is unaffected by the move to single), while the size of C is determined by Equation Before 9.
For large x (x>=2^24), x looks like this: [24 bits, M zeros]. Multiplying it by A, whose size is the first M bits of 2/pi, will result in an integer (the zeros of x will just shift everything into the integers).
Choosing C to be starting from the M+d bit of 2/pi will result in the product x*C being of size at most d-24. In double precision, d is chosen to be 174 (and instead of 24, we have 53) so that the product will be of size at most 121. In single, it is enough to choose d such that d-24 <= 92, or more precisely, d-24 <= fdigits+31. That is, d can be chosen as fdigits+55, or at most 116.
As a result, B should be of size at most 116 bits.
We are therefore left with three problems:
Computing fdigits. This involves reading ref 6 from the linked paper and understanding it. Might not be that easy. :) As far as I can see, that's the only place where nearpi.c is used.
Computing B, the relevant bits of 2/pi. Since M is bounded by 127, we can just compute the first 127+116 bits of 2/pi offline and store them in an array. See Wikipedia.
Computing y=x*B. This involves multiplying x by a 116-bit number. This is where Section 3 is used. The size of the blocks is chosen to be 24 because 2*24 + 2 (multiplying two 24-bit numbers, and adding 3 such numbers) is smaller than the precision of double, 53 (and because 24 divides 96). We can use blocks of size 11 bits for single arithmetic for similar reasons.
Note - the trick with B only applies to numbers whose exponents are positive (x>=2^24).
To summarize: first, you have to solve the problem in double precision. Your Matlab code doesn't work in double precision either (try removing single and computing sin(2^53)), because your twooverpi has only 53 significant bits, not 175 (and in any case, you can't directly multiply such precise numbers in Matlab). Second, the scheme should be adapted to work with single, and again, the key problem is representing 2/pi precisely enough and supporting multiplication of highly precise numbers. Last, when everything works, you can try to figure out a better fdigits to reduce the number of bits you have to store and multiply.
Hopefully I'm not completely off - comments and contradictions are welcome.
Example
As an example, let us compute sin(x) where x = single(2^24-1), which has no zeros after the significant bits (M = 0). This simplifies finding B, as B consists of the first 116 bits of 2/pi. Since x has precision of 24 bits and B of 116 bits, the product
y = x * B
will have 92 bits of precision, as required.
Section 3 in the linked paper describes how to perform this product with enough precision; the same algorithm can be used with blocks of size 11 to compute y in our case. Being drudgery, I hope I'm excused for not doing this explicitly, instead relying on Matlab's symbolic math toolbox. This toolbox provides us with the vpa function, which allows us to specify the precision of a number in decimal digits. So,
vpa('2/pi', ceil(116*log10(2)))
will produce an approximation of 2/pi of at least 116 bits of precision. Because vpa accepts only integers for its precision argument, we usually can't specify the binary precision of a number exactly, so we use the next-best.
The following code computes sin(x) according to the paper, in single precision :
x = single(2^24-1);
y = x * vpa('2/pi', ceil(116*log10(2))); % Precision = 103.075
k = round(y);
f = single(y - k);
r = f * single(pi) / 2;
switch mod(k, 4)
case 0
s = sin(r);
case 1
s = cos(r);
case 2
s = -sin(r);
case 3
s = -cos(r);
end
sin(x) - s % Expected value: exactly zero.
(The precision of y is obtained using Mathematica, which turned out to be a much better numerical tool than Matlab :) )
In libm
The other answer to this question (which has since been deleted) led me to an implementation in libm which, although it works on double-precision numbers, follows the linked paper very closely.
See file s_sin.c for the wrapper (Table 2 from the linked paper appears as a switch statement at the end of the file), and e_rem_pio2.c for the argument reduction code (of particular interest is an array containing the first 396 hex-digits of 2/pi, starting at line 69).

Properties of the modulo operation

I have to compute the sum S = (a*x + b*y + c) % N. Yes, it looks like a quadratic equation, but it is not, because x and y have some special properties and have to be calculated using recurrence relations. Because the sum exceeds even the limits of unsigned long long, I want to know how I could compute it using the properties of the modulo operation, properties that allow the sum to be written as something like (a*x)%N + (b*y)%N + c%N (I say "something like" because I do not remember exactly what those properties are), thus avoiding exceeding the limits of unsigned long long.
Thanks in advance for your concern! :)

a % N = x means that for some integers 0 <= x < N and m: m * N + x = a.
You can simply deduce then that if a % N = x and b % N = y then
(a + b) % N =
= (m * N + x + l * N + y) % N =
= ((m + l) * N + x + y) % N =
= (x + y) % N =
= (a % N + b % N) % N.
We know that 0 <= x + y < 2N, which is why the final remainder step is still needed. This shows that it is okay to split the sum, calculate the remainders separately, and then add them, but don't forget to take the remainder of the sum.
For multiplication:
(a * b) % N =
= ((m * N + x) * (l * N + y)) % N =
= ((m * l + x * l + m * y) * N + x * y) % N =
= (x * y) % N =
= ((a % N) * (b % N)) % N.
Thus you can also do the same with products.
These properties can be simply derived in a more general setting using some abstract algebra (the remainders form a factor ring Z/nZ).

You can take the idea even further, if needed:
S = ( (a%N)*(x%N)+(b%N)*(y%N)+c%N )%N

You can apply the modulus to each term of the sum as you've suggested; but even so after summing them you must apply the modulus again to get your final result.

How about this:
int x = (7 + 7 + 7) % 10;
int y = (7 % 10 + 7 % 10 + 7 % 10) % 10;

You remember correctly. The expression you gave, where you apply %N to each of the summands, is correct, and it is exactly what I would use. You should also apply %N to every partial sum (and to the total), as the results of the additions can still be greater than or equal to N. But be careful: this works only if your size limit is at least twice as big as N. If this is not the case, it can get really nasty.
By the way, for the %N operations on the partial sums you don't have to perform a complete division; checking for >= N and, if so, subtracting N is enough.

Not only can you reduce all variable mod n before starting the calculation, you can write your own mod-mul to compute a*x mod n by using a shift-and-add method and reduce the result mod n at each step. That way your intermediate calculations will only require one more bit than n. Once these products are computed, you can add them pairwise and reduce mod n after each addition which will also not require more than 1 bit beyond the range of n.
There is a python implementation of modular multiplication in my answer to this question. Conversion to C should be trivial.

Defining constants in C

I was working with C++ for a long time and now I am on a C project.
I am in the process of converting a C++ program to C.
I am having difficulty with the constants used in the program.
In the C++ code we have constants defined like
static const int X = 5 + 3;
static const int Y = (X + 10) * 5;
static const int Z = ((Y + 8) + 0xfff) & ~0xfff;
In C, these definitions produce errors.
When I use #defines instead of the constants like
#define X (5+3);
#define Y (((X) + 10) * 5)
#define Z ((((Y) + 8) + 0xfff) & ~0xfff)
the C compiler complains about the definitions of "Y" and "Z".
Could anyone please help me to find a solution for this.

You need to remove the semi-colon from the #define X line
#define X (5+3)
#define Y (((X) + 10) * 5)
#define Z ((((Y) + 8) + 0xfff) & ~0xfff)

#define X (5+3); is wrong; it needs to be #define X (5+3) (without the ';').
Also be aware of the difference between static const and #define: a static const object holds a value that is computed once, whereas a #define is a preprocessor directive whose replacement text is pasted in at every use, so
#define n very_heavy_calc()
...
n*n;
will result in evaluating very_heavy_calc() twice

Another option is to use an enum:
enum {
X = 5 + 3,
Y = (X + 10) * 5,
Z = ((Y + 8) + 0xfff) & ~0xfff
};
