I am looking at some Fortran code from an old scanned paper. The scan quality is not great so I may have copied it wrong. I tried to run this using an online Fortran compiler but it bombs out. Not being familiar with Fortran, I was wondering if someone can point out where the syntax does not make sense? The code is from a paper on sediment dynamics:
Komar, P.D. and Miller, M.C., 1975. On the comparison between the threshold of sediment motion under waves and unidirectional currents with a discussion of the practical evaluation of the threshold: Reply. Journal of Sedimentary Research, 45(1).
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (6O,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
15 J = 1.20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM //
1 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC //
2 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl //
3 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH //
4 22X, 2HCM, 8X, 2HCM /)
C INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1.60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.0
1 + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
C CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
I0 FORMAT(IH0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END
The problem is most likely that old Fortran requires fixed-form source, where the column in which each part of a statement starts matters.
Here are some general rules:
Normal statements start in column 7 or later.
Anything past column 72 is ignored, so statements cannot extend beyond it.
Any character in column 6 (other than a blank or zero) marks the line as a continuation of the line above. I see that in the code above, in the lines following 9 FORMAT(..
A number in columns 1-5 is a label, which can be the target of a GO TO statement, a DO statement or a format specification.
The character C in column 1 (and, with some compilers, other characters there) marks the line as a comment.
see https://people.cs.vt.edu/~asandu/Courses/MTU/CS2911/fortran_notes/node4.html for more info.
Based on the rules above, here is how to enter the code with the correct spacing. I ran the F77 code through a converter to make it compatible with both F77 and F90. The code below might now compile with the online compiler.
      PROGRAM TSHOLD
      REAL LI, LO
      G = 981.0
      PIE = 3.1416
      RHOW = 1.00
      READ (60,1) DIAM, RHOS
    1 FORMAT (2X, F6.3,2X, F5.3)
      IF(DIAM .LT. 0.05) GO TO 5
      A = 0.463 * PIE
      B = 0.25
      GO TO 7
    5 A = 0.21
      B = 0.50
    7 PWR = 1.0 / (2.0 - B)
      FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
      FAC1 = FAC * DIAM**((1.0 - B) * PWR)
      T = 1.0
      DO 15 J=1,20
      LO = 156.13 * (T**2)
      UM = FAC1 * T**(B*PWR)
      WRITE(61,9) DIAM, T, UM
    9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM //             &
     & 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC //                        &
     & 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SEC //            &
     & 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH //          &
     & 22X, 2HCM, 8X, 2HCM /)
! INCREMENT WAVE HEIGHT, CALCULATE DEPTH
      H = 10.0
      DO 12 K = 1,60
      SING = PIE * H / (UM * T)
      X = SING
      IF(X.LT.1.0) GO TO 30
   30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.&
     & + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
   32 LI = LO * (SINH(ASINH)/COSH(ASINH))
      DPTH = ASINH * LI / 6.2832
! CHECK WAVE STABILITY
      RATIO = H / DPTH
      IF(RATIO.GE.0.78) GO TO 11
      STEEP = H / LI
      TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
      IF(STEEP.GE.TEST) GO TO 11
      WRITE(61,10) H, DPTH, STEEP, RATIO
   10 FORMAT(1H0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
   11 H = H + 10.0
   12 CONTINUE
      T = T + 1.0
   15 CONTINUE
      END
I found several transcription errors: dots in place of commas, the letter O in place of zero, and a missing DO statement.
I have two sets of financial data that tend to contain differences due to unit errors e.g. $10000 in one dataset may be $1000 in the other.
I'm trying to code a check for such differences, but the only way I can think of is to divide the two variables and see if the difference is in a table of 0.001, 0.01, 0.1, 10, 100 etc, but it would be hard to catch all of the differences.
Is there a smarter way to do this?
Use proc compare. Be sure the two datasets are sorted in identical order, either by row or by specific groups. Use the by statement as needed. More info on options can be found in the documentation.
Example - compare a modified cars dataset with sashelp.cars:
data cars_modified;
set sashelp.cars;
if(mod(_N_, 2) = 0) then msrp = msrp - 100;
run;
proc compare base = sashelp.cars
compare = cars_modified
out = out_differences
outnoequal
outdif
noprint;
var msrp;
run;
Only the observations with differences are output in out_differences:
_TYPE_ _OBS_ MSRP
DIF 2 $-100
DIF 4 $-100
DIF 6 $-100
DIF 8 $-100
DIF 10 $-100
...
So you appear to be asking to find cases where X/Y is a number that is exactly 1.00Exx where XX is an integer, other than 0.
data _null_;
do x=1,10,100,1000;
do y=1,2,3,10.1,10 ;
ratio = x/y;
power = floor(log10(ratio));
if power ne 0 and 1.00 = round(ratio/10**power,0.01) then
put 'Ratio of ' x 'over ' y 'is 10**' power '.'
;
end;
end;
run;
Results:
Ratio of 1 over 10 is 10**-1 .
Ratio of 10 over 1 is 10**1 .
Ratio of 100 over 1 is 10**2 .
Ratio of 100 over 10 is 10**1 .
Ratio of 1000 over 1 is 10**3 .
Ratio of 1000 over 10 is 10**2 .
For a numeric value X you can compute the nearest rational approximation, p/q.
If you calculate the ratio
X = amount_for_source_A / amount_from_source_B;
status = math.rational(X,1e5,p,q);
the ratio is an integer when q=1 and the reciprocal of an integer when p=1; for a unit error, that integer must additionally be a power of 10.
Example:
proc ds2;
package math / overwrite = yes;
method rational(double x, double maxden, in_out integer p, in_out integer q) returns double;
/*
** FROM: https://www.ics.uci.edu/~eppstein/numth/frap.c
** FROM: https://stackoverflow.com/questions/95727/how-to-convert-floats-to-human-readable-fractions
**
** find rational approximation to given real number
** David Eppstein / UC Irvine / 8 Aug 1993
**
** With corrections from Arno Formella, May 2008
**
** Modified for Proc DS2, Richard DeVenezia, Jan 2020.
**
** usage: rational(r,d,p,q)
** x is real number to approx
** maxden is the maximum denominator allowed
** p is return for numerator
** q is return for denominator
** returns 0 if no problems
**
** based on the theory of continued fractions
** if x = a1 + 1/(a2 + 1/(a3 + 1/(a4 + ...)))
** then best approximation is found by truncating this series
** (with some adjustments in the last term).
**
** Note the fraction can be recovered as the first column of the matrix
** ( a1 1 ) ( a2 1 ) ( a3 1 ) ...
** ( 1 0 ) ( 1 0 ) ( 1 0 )
** Instead of keeping the sequence of continued fraction terms,
** we just keep the last partial product of these matrices.
*/
declare integer m[0:1,0:1];
declare double startx e1 e2;
declare integer ai t result p1 q1 p2 q2;
startx = x;
/* initialize matrix */
m[0,0] = 1; m[1,1] = 1;
m[0,1] = 0; m[1,0] = 0;
/* loop finding terms until denom gets too big */
do while (1);
ai = x;
if not ( m[1,0] * ai + m[1,1] < maxden ) then leave;
t = m[0,0] * ai + m[0,1];
m[0,1] = m[0,0];
m[0,0] = t;
t = m[1,0] * ai + m[1,1];
m[1,1] = m[1,0];
m[1,0] = t;
if x = ai then leave; %* AF: division by zero;
x = 1 / (x - ai);
if x > 2147483647 /*x'7FFFFFFF'*/ then leave; %* AF: representation failure;
end;
/* now remaining x is between 0 and 1/ai */
/* approx as either 0 or 1/m where m is max that will fit in maxden */
/* first try zero */
p1 = m[0,0];
q1 = m[1,0];
e1 = startx - 1.0 * p1 / q1;
/* now try other possibility */
ai = (maxden - m[1,1]) / m[1,0];
m[0,0] = m[0,0] * ai + m[0,1];
m[1,0] = m[1,0] * ai + m[1,1];
p2 = m[0,0];
q2 = m[1,0];
e2 = startx - 1.0 * p2 / q2;
if abs(e1) <= abs(e2) then do;
p = p1;
q = q1;
end;
else do;
p = p2;
q = q2;
end;
return 0;
end;
endpackage;
run;
quit;
* Example usage;
proc ds2;
data _null_;
declare package math math();
declare double x;
declare int p1 q1 p q;
method run();
streaminit(12345);
x = 0;
do _n_ = 1 to 20;
p1 = ceil(rand('uniform',9));
q1 = ceil(rand('uniform',9));
x + 1. * p1 / q1;
math.rational (x, 10000, p, q);
put 'add' p1 '/' q1 ' ' x=best16. 'is' p '/' q;
end;
end;
enddata;
run;
quit;
----- LOG -----
add 4 / 1 x= 4 is 4 / 1
add 4 / 2 x= 6 is 6 / 1
add 2 / 7 x=6.28571428571429 is 44 / 7
add 4 / 6 x=6.95238095238095 is 146 / 21
add 5 / 2 x=9.45238095238095 is 397 / 42
add 5 / 2 x= 11.952380952381 is 251 / 21
add 7 / 1 x= 18.952380952381 is 398 / 21
add 8 / 6 x=20.2857142857143 is 142 / 7
add 9 / 3 x=23.2857142857143 is 163 / 7
add 8 / 2 x=27.2857142857143 is 191 / 7
add 3 / 1 x=30.2857142857143 is 212 / 7
add 9 / 3 x=33.2857142857143 is 233 / 7
add 4 / 3 x=34.6190476190476 is 727 / 21
add 4 / 6 x=35.2857142857143 is 247 / 7
add 1 / 9 x=35.3968253968254 is 2230 / 63
add 8 / 3 x=38.0634920634921 is 2398 / 63
add 2 / 4 x=38.5634920634921 is 4859 / 126
add 5 / 1 x=43.5634920634921 is 5489 / 126
add 1 / 2 x=44.0634920634921 is 2776 / 63
add 2 / 7 x=44.3492063492064 is 2794 / 63
DS2 math package
I've been searching for a fast DCT implementation. I found the Loeffler algorithm and implemented it in C++ and in ARM assembly with NEON. Moving on, I found the binDCT, which avoids floating-point calculation. My reference paper/scheme is this one:
That said, I tried to implement it in C++ with the following code, just as a test:
void my_binDCT(int in[8][8], int data[8][8],const int xpos, const int ypos)
{
int i;
int row[8][8];
int x0, x1, x2, x3, x4, x5, x6, x7;
int tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7, tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16, tmp17;
// transform rows
for (i = 0; i < 8; i++) {
x0 = in[xpos + 0][ypos + i];
x1 = in[xpos + 1][ypos + i];
x2 = in[xpos + 2][ypos + i];
x3 = in[xpos + 3][ypos + i];
x4 = in[xpos + 4][ypos + i];
x5 = in[xpos + 5][ypos + i];
x6 = in[xpos + 6][ypos + i];
x7 = in[xpos + 7][ypos + i];
//stage 1
tmp0 = x0 + x7;
tmp7 = x0 - x7;
tmp1 = x1 + x6;
tmp6 = x1 - x6;
tmp2 = x2 + x5;
tmp5 = x2 - x5;
tmp3 = x3 + x4;
tmp4 = x3 - x4;
//stage 2
tmp16 = ((tmp5*3)>>3) + tmp6;
tmp15 = ((tmp16*5)>>3) - tmp5;
//stage 3
tmp10 = tmp0 + tmp3;
tmp13 = tmp0 - tmp3;
tmp11 = tmp1 + tmp2;
tmp12 = tmp1 - tmp2;
tmp14 = tmp4 + tmp15;
tmp15 = tmp4 - tmp15;
auto z = tmp16;
tmp16 = tmp7 - tmp16;
tmp17 = z + tmp7;
//stage 4
tmp14 = (tmp17 >> 3) - tmp14;
tmp10 = tmp10 + tmp11;
tmp11 = (tmp10 >> 1) - tmp11;
tmp12 = ((tmp13*3)>>3) - tmp12;
tmp13 = ((tmp12*3)>>3) + tmp13;
tmp15 = ((tmp16*7)>>3) + tmp15;
tmp16 = (tmp15>>1) - tmp16;
//stage 5
row[i][0] = tmp10;
row[i][4] = tmp11;
row[i][6] = tmp12;
row[i][2] = tmp13;
row[i][7] = tmp14;
row[i][5] = tmp15;
row[i][3] = tmp16;
row[i][1] = tmp17;
}
//rotate columns
/* transform columns */
for (i = 0; i < 8; i++) {
x0 = row[0][i];
x1 = row[1][i];
x2 = row[2][i];
x3 = row[3][i];
x4 = row[4][i];
x5 = row[5][i];
x6 = row[6][i];
x7 = row[7][i];
//stage 1
tmp0 = x0 + x7;
tmp7 = x0 - x7;
tmp1 = x1 + x6;
tmp6 = x1 - x6;
tmp2 = x2 + x5;
tmp5 = x2 - x5;
tmp3 = x3 + x4;
tmp4 = x3 - x4;
//stage 2
tmp16 = ((tmp5*3)>>3) + tmp6;
tmp15 = ((tmp16*5)>>3) - tmp5;
//stage 3
tmp10 = tmp0 + tmp3;
tmp13 = tmp0 - tmp3;
tmp11 = tmp1 + tmp2;
tmp12 = tmp1 - tmp2;
tmp14 = tmp4 + tmp15;
tmp15 = tmp4 - tmp15;
auto z = tmp16;
tmp16 = tmp7 - tmp16;
tmp17 = z + tmp7;
//stage 4
tmp14 = (tmp17 >> 3) - tmp14;
tmp10 = tmp10 + tmp11;
tmp11 = (tmp10 >> 1) - tmp11;
tmp12 = ((tmp13*3)>>3) - tmp12;
tmp13 = ((tmp12*3)>>3) + tmp13;
tmp15 = ((tmp16*7)>>3) + tmp15;
tmp16 = (tmp15>>1) - tmp16;
//stage 5
data[0][i] = tmp10 >> 3;
data[4][i] = tmp11 >> 3;
data[6][i] = tmp12 >> 3;
data[2][i] = tmp13 >> 3;
data[7][i] = tmp14 >> 3;
data[5][i] = tmp15 >> 3;
data[3][i] = tmp16 >> 3;
data[1][i] = tmp17 >> 3;
}
}
I coded the first DCT pass by rows and the second by columns, and I assumed the results should be normalized by dividing by 8 (as per the DCT formula with N=8).
I tested it on an 8x8 matrix:
int matrix_a[8][8] = {
12, 16, 19, 12, 12, 27, 51, 47,
16, 24, 12, 19, 12, 20, 39, 51,
24, 27, 8, 39, 35, 34, 24, 44,
40, 17, 28, 32, 24, 27, 8, 32,
34, 20, 28, 20, 12, 8, 19, 34,
19, 39, 12, 27, 27, 12, 8, 34,
8, 28, -5, 39, 34, 16, 12, 19,
20, 27, 8, 27, 24, 19, 19, 8,
};
And I got this outcome:
MYBINDCT-2:
186 13 -3 4 -2 4 6 0
-13 -20 -10 1 2 -2 1 -4
1 19 -10 -3 7 -12 -2 -4
5 2 -4 -3 -1 -4 -2 -1
11 -5 -7 1 -3 4 -1 0
-13 8 -3 0 10 -4 -6 3
-11 6 -11 1 6 0 -1 -4
-13 4 -1 -3 5 -5 -1 0
which is quite far from the (rounded) real DCT:
186 20 -11 -9 -4 3 8 -1
-18 -35 -24 -5 9 -3 0 -8
14 26 -2 14 7 -19 -3 -3
-9 -10 5 -15 1 8 3 1
23 -11 -19 -9 -11 8 -2 1
-10 10 3 -3 17 -4 -8 4
-14 13 -21 -4 18 0 -1 -7
-19 7 -1 8 15 -7 -3 0
I've applied the algorithm and run a lot of tests, but I still don't understand where I went wrong.
Can anybody with more experience than me explain my mistakes?
The strange thing is that I implemented Loeffler, as I wrote, and it works very well. And the procedure, apart from the coefficients and the floating-point numbers, is quite similar (butterfly scheme, scaled floating-point factors, normalization).
I'm stuck with it.
Thanks to anyone who can suggest an answer.
EDIT:
A brief call is:
int main(int argc, char **argv)
{
int MYBINDCT[8][8];
my_binDCT(matrix_a, MYBINDCT, 0, 0);
cout << "\nMYBINDCT: \n";
for (int i = 0; i < 8; i++)
{
cout << '\n';
for (int j = 0; j < 8; j++)
{
cout << MYBINDCT[i][j] << " ";
}
}
return 0;
}
A calculation scheme that doesn't have multipliers (or has such crude ones as 3 or 5) cannot be very precise; I think your result is actually OK.
If your paper is any good, it should specify the expected precision of the results. Otherwise, 42 is a pretty universal answer to the 8x8 DCT problem, with an unspecified precision.
When doing approximations to DCT, it's pretty common to replace the definition of the DCT by something that is easier to implement. If you use DCT for image compression, then changing the definition of DCT to any transform will work, as long as you also change the IDCT (inverse transform) accordingly. For example, H.264 (the video coding standard) does this.
I think you are misinterpreting the "-" on the diagram. Where there is a "-" sign, you need to negate that input and then add: -A+B or A+(-B) gives B-A or A-B.
/* Chris */
void my_binDCT(int x[8])
{
int tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7, tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16, tmp17;
//stage 1
tmp0 = x[0] + x[7];
tmp7 = x[0] - x[7];
tmp1 = x[1] + x[6];
tmp6 = x[1] - x[6];
tmp2 = x[2] + x[5];
tmp5 = x[2] - x[5];
tmp3 = x[3] + x[4];
tmp4 = x[3] - x[4];
//stage 2
tmp16 = ((tmp5*3)>>3) + tmp6;
tmp15 = ((tmp16*5)>>3) - tmp5;
//stage 3
tmp10 = tmp0 + tmp3;
tmp13 = tmp0 - tmp3;
tmp11 = tmp1 + tmp2;
tmp12 = tmp1 - tmp2;
tmp14 = tmp4 + tmp15;
tmp15 = tmp4 - tmp15;
int z = tmp16;
tmp16 = tmp7 - tmp16;
tmp17 = (z + tmp7);
//stage 4
tmp14 = tmp14 - (tmp17 >> 3); //fix A+-B (tmp17 >> 3) - tmp14
tmp10 = tmp10 + tmp11;
tmp11 = (tmp10 >> 1) - tmp11;
tmp12 = tmp12 - ((tmp13*3)>>3); //fix A+-B ((tmp13*3)>>3) - tmp12;
tmp13 = ((tmp12*3)>>3) + tmp13;
tmp15 = (((tmp16*7)>>3) + tmp15);
tmp16 = tmp16 - (tmp15>>1); //fix A+-B (tmp15>>1) - tmp16
//stage 5
x[0] = tmp10;
x[4] = tmp11;
x[6] = tmp12;
x[2] = tmp13;
x[7] = tmp14;
x[5] = tmp15;
x[3] = tmp16;
x[1] = tmp17;
}
186 28 -14 -10 -4 3 4 0
-27 -66 -43 -9 13 -3 0 -3
18 47 -4 22 10 -19 -2 -1
-9 -15 9 -20 1 7 2 0
23 -16 -24 -10 -11 6 -1 0
-8 11 3 -3 13 -3 -3 1
-8 10 -15 -2 9 0 -1 -1
-5 2 -1 3 4 -2 -1 0
-----------
186 13 -7 -5 -2 4 -7 0
-13 -20 -11 -2 2 -2 -2 3
9 14 -1 4 2 -12 1 0
-6 -4 2 -3 0 3 -2 -1
11 -5 -6 -2 -3 4 0 -1
-12 8 1 -2 10 -4 6 -3
11 -7 10 2 -7 -1 -1 -4
12 -5 0 -3 -5 5 -1 -1
row_fdct my_binDCT
---------- ----------
72796704 72545773 (rows per second)
Look at intDCT (row_fdct). On x86 there is no performance gain at all! Using binDCT only makes sense on hardware that cannot multiply or that needs to save energy.
#define FIX_0_382683433 98
#define FIX_0_541196100 139
#define FIX_0_707106781 181
#define FIX_1_306562965 334
void row_fdct(int dataptr[]){
int tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7;
int tmp10, tmp11, tmp12, tmp13;
int z1, z2, z3, z4, z5, z11, z13;
/* Pass 1: process rows. */
tmp0 = dataptr[0] + dataptr[7];
tmp7 = dataptr[0] - dataptr[7];
tmp1 = dataptr[1] + dataptr[6];
tmp6 = dataptr[1] - dataptr[6];
tmp2 = dataptr[2] + dataptr[5];
tmp5 = dataptr[2] - dataptr[5];
tmp3 = dataptr[3] + dataptr[4];
tmp4 = dataptr[3] - dataptr[4];
/* Even part */
tmp10 = tmp0 + tmp3; /* phase 2 */
tmp13 = tmp0 - tmp3;
tmp11 = tmp1 + tmp2;
tmp12 = tmp1 - tmp2;
dataptr[0] = tmp10 + tmp11; /* phase 3 */
dataptr[4] = tmp10 - tmp11;
z1 = (tmp12 + tmp13) * FIX_0_707106781 >> 8; /* c4 */
dataptr[2] = tmp13 + z1; /* phase 5 */
dataptr[6] = tmp13 - z1;
/* Odd part */
tmp10 = tmp4 + tmp5; /* phase 2 */
tmp11 = tmp5 + tmp6;
tmp12 = tmp6 + tmp7;
/* The rotator is modified from fig 4-8 to avoid extra negations. */
z5 = (tmp10 - tmp12) * FIX_0_382683433 >> 8; /* c6 */
z2 = (tmp10 * FIX_0_541196100 >> 8) + z5; /* c2-c6 */
z4 = (tmp12 * FIX_1_306562965 >> 8) + z5; /* c2+c6 */
z3 = tmp11 * FIX_0_707106781 >> 8; /* c4 */
z11 = tmp7 + z3; /* phase 5 */
z13 = tmp7 - z3;
dataptr[5] = z13 + z2; /* phase 6 */
dataptr[3] = z13 - z2;
dataptr[1] = z11 + z4;
dataptr[7] = z11 - z4;
}
I googled binDCT and found another document that contains the binDCT C7 scheme. I played with it and tuned the output multipliers to bring the results closer to the canonical fastDCT (but I will still use intDCT rather than binDCT):
void row_bdct_c7_scale(int dataptr[8]){
int tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7,z1;
tmp0 = dataptr[0] + dataptr[7];
tmp7 = dataptr[0] - dataptr[7];
tmp1 = dataptr[1] + dataptr[6];
tmp6 = dataptr[1] - dataptr[6];
tmp2 = dataptr[2] + dataptr[5];
tmp5 = dataptr[2] - dataptr[5];
tmp3 = dataptr[3] + dataptr[4];
tmp4 = dataptr[3] - dataptr[4];
tmp5 = tmp5 - tmp6/2;
tmp6 = tmp5*3/4 + tmp6;
tmp5 = tmp6/2 - tmp5;
tmp0 = (z1=tmp0) + tmp3;
tmp3 = z1-tmp3;
tmp1 = (z1=tmp1) + tmp2;
tmp2 = z1-tmp2;
dataptr[0] = tmp0 = tmp0+tmp1;
dataptr[4] = (tmp0/2 - tmp1)*2;
dataptr[6] = tmp2 = tmp3/2-tmp2;
dataptr[2] = (tmp3 - tmp2/2)*2;
tmp4 = (z1=tmp4)+tmp5;
tmp5 = z1-tmp5;
tmp6 = tmp7 - (z1=tmp6);
tmp7 = tmp7 + z1;
dataptr[7] = tmp4 = (tmp7/4-tmp4)>>1;
dataptr[1] = (tmp7 - tmp4/4)*2; //scale x2
dataptr[5] = tmp5 = tmp6 + tmp5;
dataptr[3] = (tmp6 - tmp5/2)*2; //scale x2
}
186 28 -14 -10 -4 3 4 0
-27 -66 -43 -9 13 -3 0 -3
18 47 -4 22 10 -19 -2 -1
-9 -15 9 -20 1 7 2 0
23 -16 -24 -10 -11 6 -1 0
-8 11 3 -3 13 -3 -3 1
-8 10 -15 -2 9 0 -1 -1
-5 2 -1 3 4 -2 -1 0
-----------
186 28 -16 -8 -4 1 6 -1
-27 -63 -41 -7 16 -5 -2 -4
21 38 4 25 5 -19 -3 -1
-7 -18 4 -16 -3 6 4 0
22 -14 -23 -11 -11 6 -2 0
-11 13 8 -6 17 -3 -5 1
-11 15 -21 -1 15 -1 -2 -3
-8 4 -1 3 5 -2 -1 0
row_fdct row_bdct_c
---------- ----------
72404388 62906263 (rows per second)
The pseudocode for the Halton sequence can be found here. I wrote a function that does this, but for some reason, when I check against the Matlab results for the 4-dimensional Halton sequence, my numbers do not match up and I am not sure why. Here is my code:
double Halton_Seq(int index, double base){
double f = 1, r;
while(index > 0){
f = f/base;
r = r + f*(fmod(index,base));
index = index/base;
}
return r;
}
Here are the first 10 results I get:
1
0.25
0.5
0.75
0.0625
0.3125
0.5625
0.8125
0.125
0.375
Here are the first 10 results MATLAB gets:
Columns 1 through 2
0 0.5000
Columns 3 through 4
0.2500 0.7500
Columns 5 through 6
0.1250 0.6250
Columns 7 through 8
0.3750 0.8750
Columns 9 through 10
0.0625 0.5625
You forgot to initialize r in line 2.
r = 0;
double Halton_Seq(int index, int base){
double f = 1, r = 0;
while(index > 0){
f = f/base;
r = r + f* (index% base);
index = index/base;
}
return r;
}
// Output for 10 (base 2)
0.000000
0.500000
0.250000
0.750000
0.125000
0.625000
0.375000
0.875000
0.062500
0.562500
I am writing a little OpenGL code that takes the color of one square and adds 0.01 to its value, so the color becomes brighter. The color values for each square are stored in an array, and one variable holds the maximum value a color component can reach; in this case that value is one.
This is part of the function:
for(GLint i = 0; i < 3; i++) {
if(colors[selectedSquare][i] > 0) {
colors[selectedSquare][i] += 0.01;
if(colors[selectedSquare][i] == maxColor) {
flag = false;
}
}
}
I call this function from glutTimerFunc and increase the color value by 0.01 each time. When the color value equals 1 (maxColor), I start reducing the color in another part of the function.
The problem is that the comparison
(colors[selectedSquare][i] == maxColor)
never becomes true. I added some output to check, and this is what I got:
colors[selectedSquare][i] value = 0.99 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 0
colors[selectedSquare][i] value = 1 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 0
colors[selectedSquare][i] value = 1.01 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 0
colors[selectedSquare][i] value = 1.02 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) = 0
But the interesting part starts here: when I change the comparison to
((int)colors[selectedSquare][i] == maxColor)
I get this output
colors[selectedSquare][i] value = 0.99 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 0
colors[selectedSquare][i] value = 1 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 0
colors[selectedSquare][i] value = 1.01 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 1
colors[selectedSquare][i] value = 1.02 size = 4
maxColor value = 1 size = 4
(colors[selectedSquare][i] == maxColor) is 1
I measured the size using sizeof(), and colors and maxColor are declared like this:
GLfloat (Memoria::colors)[9][3] = {
{ 0.80, 0.80, 0.00 },
{ 0.00, 0.80, 0.80 },
{ 0.80, 0.00, 0.00 },
{ 0.00, 0.80, 0.00 },
{ 0.00, 1.00, 1.00 },
{ 1.00, 0.00, 0.00 },
{ 1.00, 0.00, 1.00 },
{ 1.00, 1.00, 0.00 },
{ 1.00, 1.00, 1.00 },
};
const GLfloat maxColor;
Both belong to the same class, but colors is static.
I hope someone can spot the problem.
Directly comparing floating-point values for equality is a bad idea. You could use >= instead of == or do something like
if(fabs(colors[selectedSquare][i] - maxColor) < delta)
where delta is the precision you want to use.
Your problem is that floating-point numbers are not stored exactly as you seem to expect: 0.01 has no exact binary representation, so small errors accumulate in the last digits, far beyond the decimal point.