Full data recovery using Reed-Solomon - C++

I'm testing a Reed-Solomon algorithm from this repository in order to recover data in case something is externally changed.
Assuming:
m = bits per symbol
k = data symbols
r = redundancy symbols
n = symbols per block = r + k = 2^m - 1
t = error correction capability = (n - k) / 2
I can encode and recover data using the following parameters:
m = 8
n = 255
r = 135
k = 120
t = 67
And it works perfectly; I can recover from up to 67 symbol errors.
My assumptions are:
Only the data will be corrupted, not the redundancy.
To get full recovery, n = 3 * k --> r = 2 * k.
Then, since n = 255, r in this case must be 170.
So I must have GF(2^m) and RS [n, k, n - k + 1], which in this case means GF(2^8) and RS [255, 85, 171].
With these parameters I get the error:
Error - Failed to create sequential root generator!
This means that the library function make_sequential_root_generator_polynomial fails. I call it like this:
const std::size_t field_descriptor = 8;                  /* m = bits per symbol */
const std::size_t generator_polynomial_index = 120;
const std::size_t generator_polynomial_root_count = 170; /* root count shall equal the redundancy */

const schifra::galois::field field(field_descriptor,
                                   schifra::galois::primitive_polynomial_size06,
                                   schifra::galois::primitive_polynomial06);

if (
    !schifra::make_sequential_root_generator_polynomial(field,
                                                        generator_polynomial_index,
                                                        generator_polynomial_root_count,
                                                        generator_polynomial)
   )
{
    std::cout << "Error - Failed to create sequential root generator!" << std::endl;
    return false;
}
My problem is that I don't know why the algorithm fails. I think I have a conceptual problem, but after reading about the topic here and here, I still don't see why it should not be possible.
Is it possible according to my assumptions, or does the theory say this is not possible?

The current code in the question is failing because it sets
generator_polynomial_index = 120;
and 120 (index) + 170 (root count) is > 255 (field size), which is checked for in
make_sequential_root_generator_polynomial()
generator_polynomial_index is normally set to 0 (first consecutive root = 1) or 1 (first consecutive root = field primitive = 2), unless the goal is to use a self-reciprocal generator polynomial.
Even in the case of a self-reciprocal polynomial, for 170 roots, generator_polynomial_index = 128 - (170/2) = 43. Setting it to 120 is unusually high.
It could be possible for this to work, since the roots are sequential powers modulo 255 and could simply wrap around: 2^120, 2^121, ..., 2^253, 2^254, 2^255 = 2^0, 2^256 = 2^1, ... This wrap-around is what is done for self-reciprocal polynomials with an odd number of roots, where generator_polynomial_index = 255 - (number of roots / 2); but perhaps the rest of the code has an issue with this.
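For reference, a minimal sketch of the usual configuration (not a tested program; it reuses the same Schifra calls as the question's snippet, only the two constants change, and generator_polynomial is assumed to be declared as a schifra::galois::field_polynomial as in the Schifra examples):

const std::size_t field_descriptor = 8;                  /* m = bits per symbol        */
const std::size_t generator_polynomial_index = 0;        /* first consecutive root = 1 */
const std::size_t generator_polynomial_root_count = 170; /* r = n - k = 255 - 85       */

const schifra::galois::field field(field_descriptor,
                                   schifra::galois::primitive_polynomial_size06,
                                   schifra::galois::primitive_polynomial06);

/* assumed declaration of the generator polynomial over the field */
schifra::galois::field_polynomial generator_polynomial(field);

/* 0 (index) + 170 (root count) <= 255 (field size), so the internal check passes */
if (!schifra::make_sequential_root_generator_polynomial(field,
                                                        generator_polynomial_index,
                                                        generator_polynomial_root_count,
                                                        generator_polynomial))
{
    std::cout << "Error - Failed to create sequential root generator!" << std::endl;
    return false;
}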

How to count the number of valid colourings of a graph?

I attempted this SPOJ problem.
Problem:
AMR10J - Mixing Chemicals
There are N bottles each having a different chemical. For each chemical i, you have determined C[i] which means that mixing chemicals i and C[i] causes an explosion. You have K distinct boxes. In how many ways can you divide the N chemicals into those boxes such that no two chemicals in the same box can cause an explosion together?
INPUT
The first line of input is the number of test cases T. T test cases follow each containing 2 lines.
The first line of each test case contains 2 integers N and K.
The second line of each test case contains N integers, the ith integer denoting the value C[i]. The chemicals are numbered from 0 to N-1.
OUTPUT
For each testcase, output the number of ways modulo 1,000,000,007.
CONSTRAINTS
T <= 50
2 <= N <= 100
2 <= K <= 1000
0 <= C[i] < N
For all i, i != C[i]
SAMPLE INPUT
3
3 3
1 2 0
4 3
1 2 0 0
3 2
1 2 0
SAMPLE OUTPUT
6
12
0
EXPLANATION
In the first test case, we cannot mix any 2 chemicals. Hence, each of the 3 boxes must contain 1 chemical, which leads to 6 ways in total.
In the third test case, we cannot put the 3 chemicals in the 2 boxes satisfying all the 3 conditions.
To summarize the problem: given a set of chemicals and a set of boxes, count how many ways there are to place the chemicals into the boxes such that no two chemicals in the same box explode.
At first I used a brute-force method to solve the problem: I recursively placed chemicals in boxes and counted valid configurations. I got TLE on my first attempt.
Later I learned that the problem can be solved with graph colouring.
I can represent the chemicals as vertices, and there's an edge between two chemicals if they cannot be placed together.
The set of boxes can be used as the vertex colours; all I need to do is count how many different valid colourings of the graph there are.
I applied this concept to solve the problem, but unfortunately I got TLE again. I don't know how to improve my code; I need help.
code:
#include <bits/stdc++.h>
#define MAXN 100
using namespace std;

const int mod = (int) 1e9 + 7;

int n;
int k;
int ways;

void greedy_coloring(vector<int> adj[], int color[])
{
    int u = 0;
    for (; u < n; ++u)
        if (color[u] == -1) // found first uncolored vertex
            break;
    if (u == n) // no uncolored vertices means all vertices are colored
    {
        ways = (ways + 1) % mod;
        return;
    }
    bool available[k];
    memset(available, true, sizeof(available));
    for (int v : adj[u])
        if (color[v] != -1) // if the adjacent vertex is colored, make its color unavailable
            available[color[v]] = false;
    for (int c = 0; c < k; ++c)
        if (available[c])
        {
            color[u] = c;
            greedy_coloring(adj, color);
            color[u] = -1; // don't forget to reset the color
        }
}

int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int T;
    cin >> T;
    while (T--)
    {
        cin >> n >> k;
        vector<int> adj[n];
        int c[n];
        for (int i = 0; i < n; ++i)
        {
            cin >> c[i];
            adj[i].push_back(c[i]);
            adj[c[i]].push_back(i);
        }
        ways = 0;
        int color[n];
        memset(color, -1, sizeof(color));
        greedy_coloring(adj, color);
        cout << ways << "\n";
    }
    return 0;
}
Counting the number of colorings of a general graph is #P-hard, but this graph has some special structure, which I'll exploit in a minute after I enumerate some basic properties of counting colorings. The first observation is that, if the graph has a node with no neighbors, then deleting that node decreases the number of colorings by a factor of k. The second is that, if a node has exactly one neighbor, then deleting it decreases the number of colorings by a factor of k-1. The third is that the number of colorings is equal to the product of the number of colorings of each connected component. The fourth is that we can delete all but one of any set of parallel edges.
Using these properties, it suffices to determine a formula for each connected component of the 2-core of this graph, which is a simple cycle of some length. Let P(n) and C(n) be the number of ways to color a path or cycle respectively with n nodes. We use the basic properties above to find
P(n) = k (k-1)^(n-1).
Finding a formula for C(n) I think requires the deletion contraction formula, which leads to a recurrence
C(3) = k (k-1) (k-2), i.e., three nodes of different colors;
C(n) = P(n) - C(n-1) = k (k-1)^(n-1) - C(n-1).
Multiply the above recurrence by (-1)^n.
(-1)^3 C(3) = -k (k-1) (k-2)
(-1)^n C(n) = (-1)^n k (k-1)^(n-1) - (-1)^n C(n-1)
= (-1)^n k (k-1)^(n-1) + (-1)^(n-1) C(n-1)
(-1)^n C(n) - (-1)^(n-1) C(n-1) = (-1)^n k (k-1)^(n-1)
Let D(n) = (-1)^n C(n).
D(3) = -k (k-1) (k-2)
D(n) - D(n-1) = (-1)^n k (k-1)^(n-1)
Now we can write D(n) as a telescoping sum:
D(n) = [sum_{i=4}^n (D(i) - D(i-1))] + D(3)
D(n) = [sum_{i=4}^n (-1)^i k (k-1)^(i-1)] - k (k-1) (k-2).
Break it down as two geometric sums which then cancel nicely.
D(n) = [sum_{i=4}^n (-1)^i ((k-1) + 1) (k-1)^(i-1)] - k (k-1) (k-2)
= sum_{i=4}^n (1-k)^i - sum_{i=4}^n (1-k)^(i-1) - k (k-1) (k-2)
= (1-k)^n - (1-k)^3 - k (k-1) (k-2)
= (1-k)^n - (1 - 3k + 3k^2 - k^3) - (2k - 3k^2 + k^3)
= (1-k)^n - (1-k)
C(n) = (-1)^n (1-k)^n - (-1)^n (1-k)
= (k-1)^n + (-1)^n (k-1).
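As a quick sanity check (a small standalone sketch of my own, not part of the original answer), the closed form can be compared against the recurrence C(n) = k (k-1)^(n-1) - C(n-1) for an arbitrary number of colours k:

#include <cstdio>
#include <cmath>

int main() {
    const int k = 5;                              // arbitrary number of colours
    double rec = 1.0 * k * (k - 1) * (k - 2);     // C(3), the base case
    for (int n = 3; n <= 10; ++n) {
        // closed form: C(n) = (k-1)^n + (-1)^n (k-1)
        double closed = std::pow(k - 1, n) + ((n % 2 == 0) ? (k - 1) : -(k - 1));
        std::printf("n=%d  recurrence=%.0f  closed=%.0f\n", n, rec, closed);
        rec = 1.0 * k * std::pow(k - 1, n) - rec; // C(n+1) = k (k-1)^n - C(n)
    }
    return 0;
}

Both columns agree, e.g. C(3) = 60 and C(4) = 260 for k = 5.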
Note that after removing all parallel edges, we can have at most n edges. This means that in any one connected component we can only see one cycle (and simple at that), which makes the combinatorics rather straightforward. (Cycles are only dependent on how many edges each node can spawn, which is capped at 1.)
Second example:
k = 3
   0 <-- 3
  / ^
 v   \
 1 --> 2
Since cycles are self-contained, any connection to one removes the possibility of another. In the example above, we cannot make a second cycle involving node 3 by adding more nodes, and the same issue would extend to any subsequently connected nodes.
It should be enough, therefore, to perform a search, separating out the connected components and marking their node count and whether they contain a cycle. Given a connected component where c of the nodes are part of a cycle and m nodes are not, we have the following formula (David Eisenstat helped me correct my combinatorics for the count of colourings of a cycle):
if the component has a cycle:
    [(k - 1)^c + (-1)^c * (k - 1)] * (k - 1)^m
otherwise:
    k * (k - 1)^(m - 1)
As David Eisenstat noted, multiply all these results for the final tally.
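Putting the pieces together, one possible implementation of this counting scheme looks like the following (a sketch only; since every chemical i points to exactly one partner C[i], every connected component of this particular graph contains exactly one cycle, possibly of length 2, so the "otherwise" branch never triggers here; the helper names are mine):

#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007LL;

// modular exponentiation: base^exp mod MOD
long long pow_mod(long long base, long long exp) {
    long long result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

int main() {
    int T;
    cin >> T;
    while (T--) {
        int n, k;
        cin >> n >> k;
        vector<int> c(n);
        vector<vector<int>> adj(n);              // undirected view of the edges i -- C[i]
        for (int i = 0; i < n; ++i) {
            cin >> c[i];
            adj[i].push_back(c[i]);
            adj[c[i]].push_back(i);
        }
        vector<int> comp(n, -1);
        long long answer = 1;
        for (int s = 0; s < n; ++s) {
            if (comp[s] != -1) continue;
            // collect the connected component containing s
            vector<int> nodes;
            queue<int> q;
            q.push(s);
            comp[s] = s;
            while (!q.empty()) {
                int u = q.front(); q.pop();
                nodes.push_back(u);
                for (int v : adj[u])
                    if (comp[v] == -1) { comp[v] = s; q.push(v); }
            }
            // walk i -> C[i] from s until a node repeats to find the cycle length
            unordered_map<int, int> pos;         // node -> position in the walk
            int u = s, steps = 0;
            while (pos.find(u) == pos.end()) {
                pos[u] = steps++;
                u = c[u];
            }
            int cyc = steps - pos[u];            // cycle length (>= 2)
            int rest = (int)nodes.size() - cyc;  // tree nodes hanging off the cycle
            // colourings of a c-cycle: (k-1)^c + (-1)^c (k-1); each remaining node adds (k-1)
            long long cycleTerm = (cyc % 2 == 0) ? (k - 1) : MOD - (k - 1);
            long long compWays = (pow_mod(k - 1, cyc) + cycleTerm) % MOD
                                 * pow_mod(k - 1, rest) % MOD;
            answer = answer * compWays % MOD;
        }
        cout << answer << "\n";
    }
    return 0;
}

On the three sample cases this produces 6, 12 and 0.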

OpenCv: FilterEngine() explanation / Create custom nonlinear filter

I would like to create my own nonlinear filter in OpenCV using C++, and if I understand it correctly, I can use the FilterEngine class to do so. Unfortunately, I'm not really able to follow the documentation of this class. (Link: http://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html#filterengine)
Could someone be so kind to explain the class to me in a little bit more detail?
I'm grateful for every input and every example you can provide me with :-)
.
My specific needs:
1) I would like to learn how to create my own nonlinear filters in general.
2) I would like to apply a rank-transform filter to my images:
Meaning: I have a kernel/region and I would like to flag every pixel inside that region with a one if the intensity-value of that (neighbourhood-) pixel is lower than the intensity of the center-pixel. Next, I want to use a simple convolution to save the sum of the transformed region, and store the value at the center-pixel. Let's look at a simple example:
100 120 200     rank-trans.     1 0 0     convolution
110 120 220        -->          1 0 0        -->        2
180 200 200                     0 0 0
P.S.: I know that I can achieve the result of 2) by combining 255 threshold operations with 255 box-filter operations, and then looping over every pixel and selecting the correct value. However, that seems quite inefficient to me ...
.
Code snippet [Edit]:
As I still struggle to understand the FilterEngine(), I started to write my own function for the above-described use case. I would also be happy if you could comment on how to improve its efficiency, as it is quite slow at the moment (~2 sec for a 1080x1920 image on one CPU core).
void rankTransform(Mat& out, Mat in, int kernel_size, int borderType) {
    // Issue warning if necessary:
    if (kernel_size >= 17) {
        std::cout << "Warning, need to change Mat-type. Unsigned short only supports kernels up to the size of 15x15" << std::endl << std::endl;
    }
    // First: add borders around the image:
    int border_size = (kernel_size - 1) / 2;
    Mat in_incl_border = Mat(in.rows + 2 * border_size, in.cols + 2 * border_size, in.depth());
    copyMakeBorder(in, in_incl_border, border_size, border_size, border_size, border_size, borderType);
    // Second: loop through the image, conduct a rank transform and
    // then sum over the kernel size:
    int start_pixel = border_size + 1;
    int end_pixel_width = in.cols + border_size;
    int end_pixel_height = in.rows + border_size;
    int i, j;
    int x_1, x_2, y_1;
    for (i = start_pixel; i < end_pixel_height; ++i) {
        x_1 = i - border_size;
        x_2 = i + border_size + 1;
        for (j = start_pixel; j < end_pixel_width; ++j) {
            y_1 = j - border_size;
            out.at<unsigned short>(x_1 - 1, y_1 - 1) = static_cast<unsigned short>(
                (sum(in_incl_border(Range(x_1, x_2), Range(y_1, j + border_size + 1))
                     < in_incl_border.at<unsigned short>(i, j))[0]) / 255);
        }
    }
}

How to filter given width of lines in a image?

I need to filter given width of lines in a image.
I am coding a program which will detect lines in a road image. I found something like that, but I can't understand its logic. My function has to do this:
I will pass the image and the width of the line in terms of pixel size (e.g. 30 pixels wide), and the function will filter just those lines in the image.
I found that code:
void filterWidth(Mat quad, Mat quadDst, int tau) // tau = width of the line I want to filter
{
    int aux = 0;
    for (int j = 0; j < quad.rows; ++j)
    {
        unsigned char *ptRowSrc = quad.ptr<uchar>(j);
        unsigned char *ptRowDst = quadDst.ptr<uchar>(j);
        for (int i = tau; i < quad.cols - tau; ++i)
        {
            if (ptRowSrc[i] != 0)
            {
                aux = 2 * ptRowSrc[i];
                aux += -ptRowSrc[i - tau];
                aux += -ptRowSrc[i + tau];
                aux += -abs((int)(ptRowSrc[i - tau] - ptRowSrc[i + tau]));
                aux = (aux < 0) ? (0) : (aux);
                aux = (aux > 255) ? (255) : (aux);
                ptRowDst[i] = (unsigned char)aux;
            }
        }
    }
}
What is the mathematical explanation of that code? And how does that work?
Read up about convolution filters. This code is a particular case of a 1-dimensional convolution filter (it only convolves with other pixels on the currently processed line).
The value of aux starts as 2 * the current pixel value; then the pixels on either side of it at distance tau are subtracted from that value. Next, the absolute difference of those two pixels is also subtracted from it. Finally it is clamped to the range 0...255 before being stored in the output image.
If you have an image:
0011100
This convolution will cause the centre 1 to gain the value:
2 * 1
- 0
- 0
- abs(0 - 0)
= 2
The first '1' will become:
2 * 1
- 0
- 1
- abs(0 - 1)
= 0
And so will the third '1' (it's a mirror image).
And of course the 0 values will always stay zero or become negative, which will be capped back to 0.
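To make the worked example concrete, here is a tiny standalone sketch (my own, with an assumed tau of 2) that applies the same per-row formula to the row 0 0 1 1 1 0 0 and reproduces the values above:

#include <cstdio>
#include <cstdlib>
#include <algorithm>

int main() {
    // one image row and the spacing used in the worked example above (tau assumed to be 2)
    unsigned char src[] = {0, 0, 1, 1, 1, 0, 0};
    unsigned char dst[] = {0, 0, 0, 0, 0, 0, 0};
    const int cols = 7, tau = 2;

    for (int i = tau; i < cols - tau; ++i) {
        if (src[i] == 0) continue;                // only non-zero pixels are processed
        int aux = 2 * src[i] - src[i - tau] - src[i + tau]
                  - std::abs((int)src[i - tau] - (int)src[i + tau]);
        dst[i] = (unsigned char)std::min(255, std::max(0, aux));
    }
    for (int i = 0; i < cols; ++i)
        std::printf("%d ", dst[i]);               // prints: 0 0 0 2 0 0 0
    std::printf("\n");
    return 0;
}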
This is a rather weird filter. It takes the pixel values three at a time on the same line, with a spacing of tau. Let these values be Vl, V and Vr.
The filter computes -Vl + 2V - Vr, which can be seen as a second derivative, and subtracts |Vl - Vr|, which can be seen as a first derivative (also called gradient). The second derivative gives a maximum response for a local-maximum configuration (Vl < V > Vr); the first derivative gives a minimum response for a symmetric configuration (Vl = Vr).
So the global filter will give a maximum response for a symmetric maximum (like a light road marking on a dark background, vertical, with a width less than 2*tau).
By rearranging the terms, you can see that the filter also yields twice the smaller of the left and right gradients, V - Vl and V - Vr (clamped to zero): if Vl >= Vr, then 2V - Vl - Vr - (Vl - Vr) = 2(V - Vl), and symmetrically 2(V - Vr) otherwise.

Incremental entropy computation

Let std::vector<int> counts be a vector of positive integers and let N := counts[0] + ... + counts[counts.size()-1] be the sum of the vector components. Setting pi := counts[i]/N, I compute the entropy using the classic formula H = p0*log2(p0) + ... + pn*log2(pn).
The counts vector is changing --- counts are incremented --- and every 200 changes I recompute the entropy. After a quick google and stackoverflow search I couldn't find any method for incremental entropy computation. So the question: Is there an incremental method, like the ones for variance, for entropy computation?
EDIT: Motivation for this question was usage of such formulas for incremental information gain estimation in VFDT-like learners.
Resolved: See this mathoverflow post.
I derived update formulas and algorithms for entropy and Gini index and made the note available on arXiv. (The working version of the note is available here.) Also see this mathoverflow answer.
For the sake of convenience I am including simple Python code, demonstrating the derived formulas:
from math import log
from random import randint

# maps x to -x*log2(x) for x > 0, and to 0 otherwise
h = lambda p: -p*log(p, 2) if p > 0 else 0

# update entropy if new example x comes in
def update(H, S, x):
    new_S = S + x
    return 1.0*H*S/new_S + h(1.0*x/new_S) + h(1.0*S/new_S)

# entropy of the union of two samples with entropies H1 and H2
def merge(H1, S1, H2, S2):
    S = S1 + S2
    return 1.0*H1*S1/S + h(1.0*S1/S) + 1.0*H2*S2/S + h(1.0*S2/S)

# compute entropy(L) using only the `update' function
def test(L):
    S = 0.0  # sum of the sample elements
    H = 0.0  # sample entropy
    for x in L:
        H = update(H, S, x)
        S = S + x
    return H

# compute entropy using the classic equation
def entropy(L):
    n = 1.0*sum(L)
    return sum([h(x/n) for x in L])

# entry point
if __name__ == "__main__":
    L = [randint(1, 100) for k in range(100)]
    M = [randint(100, 1000) for k in range(100)]
    L_ent = entropy(L)
    L_sum = sum(L)
    M_ent = entropy(M)
    M_sum = sum(M)
    T = L + M
    print("Full = ", entropy(T))
    print("Update = ", merge(L_ent, L_sum, M_ent, M_sum))
You could re-compute the entropy from the counts and use a simple mathematical identity to simplify the entropy formula
K = count.size();
N = count[0] + ... + count[K - 1];
H = count[0]/N * log2(count[0]/N) + ... + count[K - 1]/N * log2(count[K - 1]/N)
  = h/N - log2(N)
h = count[0] * log2(count[0]) + ... + count[K - 1] * log2(count[K - 1])
which holds because log2(a / b) == log2(a) - log2(b) and count[0] + ... + count[K - 1] == N
Now, given the old vector count of observations so far and another vector of 200 new observations called batch, you can do the following in C++11:
#include <cmath>
#include <numeric>
#include <vector>

void update_H(double& H, std::vector<int>& count, int& N, std::vector<int> const& batch)
{
    N += batch.size();
    for (auto b : batch)
        ++count[b];
    double h = std::accumulate(count.begin(), count.end(), 0.0,
                               [](double acc, int c) {
                                   return c > 0 ? acc + c * std::log2(c) : acc;
                               });
    H = h / N - std::log2(N);
}
Here I assume that you have encoded your observations as int. If you have some kind of symbol, you would need a symbol table std::map<Symbol, int>, and do a lookup for each symbol in batch before you update the count.
This seems the quickest way of writing some code for a general update. If you know that in every batch only a few counts actually change, you can do as #migdal does and keep track of the changing counts, subtract their old contribution to the entropy and add the new contribution.
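A possible O(1)-per-observation variant of that idea (my own sketch, not from the answers above): keep h = count[0]*log2(count[0]) + ... and N as running values, and when a single count is incremented, subtract that bin's old contribution to h and add the new one. The class name and layout are purely illustrative.

#include <cmath>
#include <vector>

struct IncrementalEntropy {
    std::vector<int> count;
    double h = 0.0;   // running value of sum_i count[i] * log2(count[i])
    long long N = 0;  // total number of observations

    explicit IncrementalEntropy(std::size_t bins) : count(bins, 0) {}

    void observe(int symbol) {
        int c = count[symbol]++;                       // old count of this bin
        if (c > 0) h -= c * std::log2((double)c);      // remove old contribution
        h += (c + 1) * std::log2((double)(c + 1));     // add new contribution
        ++N;
    }

    double entropy() const {                           // H = sum_i p_i * log2(p_i), as in the question
        return N > 0 ? h / N - std::log2((double)N) : 0.0;
    }
};

With this, recomputing the entropy after every 200 increments costs nothing beyond the 200 constant-time updates.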

Implementation of the Discrete Fourier Transform - FFT

I am trying to do a project in sound processing and need to transform the signal into the frequency domain. I have tried to implement an FFT, but that didn't go well. I tried to understand the z-transform, but that didn't go too well either. I read up and found DFTs a lot simpler to understand, especially the algorithm. So I coded the algorithm using examples, but I do not know whether the output is right. (I don't have Matlab here, and cannot find any resources to test it.) I wondered if you knew whether I was going in the right direction. Here is my code so far:
#include <iostream>
#include <complex>
#include <vector>

using namespace std;

const double PI = 3.141592;

vector< complex<double> > DFT(vector< complex<double> >& theData)
{
    // Define the size of the read-in vector
    const int S = theData.size();
    // Initialise new vector with size of S
    vector< complex<double> > out(S, 0);
    for (unsigned i = 0; (i < S); i++)
    {
        out[i] = complex<double>(0.0, 0.0);
        for (unsigned j = 0; (j < S); j++)
        {
            out[i] += theData[j] * polar<double>(1.0, -2 * PI * i * j / S);
        }
    }
    return out;
}

int main(int argc, char *argv[]) {
    vector< complex<double> > numbers;
    numbers.push_back(102023);
    numbers.push_back(102023);
    numbers.push_back(102023);
    numbers.push_back(102023);
    vector< complex<double> > testing = DFT(numbers);
    for (unsigned i = 0; (i < testing.size()); i++)
    {
        cout << testing[i] << endl;
    }
}
The inputs are:
102023 102023
102023 102023
And the result:
(408092, 0)
(-0.0666812, -0.0666812)
(1.30764e-07, -0.133362)
(0.200044, -0.200043)
Any help or advice would be great, I'm not expecting a lot, but, anything would be great. Thank you :)
#Phorce is right here. I don't think there is any reason to reinvent the wheel. However, if you want to do this so that you understand the methodology and have the joy of coding it yourself, I can provide a FORTRAN FFT code that I developed some years ago. Of course this is not C++ and will require a translation; this should not be too difficult and should enable you to learn a lot in doing so...
Below is a Radix 4 based algorithm; this radix-4 FFT recursively partitions a DFT into four quarter-length DFTs of groups of every fourth time sample. The outputs of these shorter FFTs are reused to compute many outputs, thus greatly reducing the total computational cost. The radix-4 decimation-in-frequency FFT groups every fourth output sample into shorter-length DFTs to save computations. The radix-4 FFTs require only 75% as many complex multiplies as the radix-2 FFTs. See here for more information.
!+ FILE: RADIX4.FOR
! ===================================================================
! Description: Radix 4 is a discrete complex Fourier transform algorithm. It
! is to be supplied with two real arrays, one for the real parts of the function,
! one for the imaginary parts. It can also unscramble transformed arrays.
! Usage: calling FASTF(XREAL,XIMAG,ISIZE,ITYPE,IFAULT); we supply the
! following:
!
! XREAL - array containing real parts of the transform sequence
! XIMAG - array containing imaginary parts of the transform sequence
! ISIZE - size of transform (a power of 2, at least 4)
! ITYPE - +1 forward transform
! -1 reverse transform
! IFAULT - 1 if error
! - 0 otherwise
! ===================================================================
!
! Forward transform computes:
! X(k) = sum_{j=0}^{isize-1} x(j)*exp(-2ijk*pi/isize)
! Backward computes:
! x(j) = (1/isize) sum_{k=0}^{isize-1} X(k)*exp(2ijk*pi/isize)
!
! Forward followed by backward will result in the original sequence!
!
! ===================================================================
SUBROUTINE FASTF(XREAL,XIMAG,ISIZE,ITYPE,IFAULT)
REAL*8 XREAL(*),XIMAG(*)
INTEGER MAX2,II,IPOW
PARAMETER (MAX2 = 20)
! Check for valid transform size up to 2**(MAX2):
IFAULT = 1
IF(ISIZE.LT.4) THEN
print*,'FFT: Error: Data array < 4 - Too small!'
return
ENDIF
II = 4
IPOW = 2
! Prepare mod 2:
1 IF((II-ISIZE).NE.0) THEN
II = II*2
IPOW = IPOW + 1
IF(IPOW.GT.MAX2) THEN
print*,'FFT: Error: FFT1!'
return
ENDIF
GOTO 1
ENDIF
! Check for correct type:
IF(IABS(ITYPE).NE.1) THEN
print*,'FFT: Error: Wrong type of transformation!'
return
ENDIF
! No entry errors - continue:
IFAULT = 0
! Call FASTG to perform the transformation:
CALL FASTG(XREAL,XIMAG,ISIZE,ITYPE)
! Due to the Radix 4 factorisation, results are not in the same order
! after transformation as they were when the data was submitted:
! We now call SCRAM to unscramble the results:
CALL SCRAM(XREAL,XIMAG,ISIZE,IPOW)
return
END
!-END: RADIX4.FOR
! ===============================================================
! Description: This is the radix 4 complex discrete fast Fourier
! transform without unscrambling. Suitable for convolutions or other
! applications that do not require unscrambling. Designed for use
! with FASTF.FOR.
!
SUBROUTINE FASTG(XREAL,XIMAG,N,ITYPE)
INTEGER N,IFACA,IFCAB,LITLA
INTEGER I0,I1,I2,I3
REAL*8 XREAL(*),XIMAG(*),BCOS,BSIN,CW1,CW2,PI
REAL*8 SW1,SW2,SW3,TEMPR,X1,X2,X3,XS0,XS1,XS2,XS3
REAL*8 Y1,Y2,Y3,YS0,YS1,YS2,YS3,Z,ZATAN,ZFLOAT,ZSIN
ZATAN(Z) = ATAN(Z)
ZFLOAT(K) = FLOAT(K) ! Real equivalent of K.
ZSIN(Z) = SIN(Z)
PI = (4.0)*ZATAN(1.0)
IFACA = N/4
! Forward transform:
IF(ITYPE.GT.0) THEN
GOTO 5
ENDIF
! If this is for an inverse transform - conjugate the data:
DO 4, K = 1,N
XIMAG(K) = -XIMAG(K)
4 CONTINUE
5 IFCAB = IFACA*4
! Perform the appropriate transformations:
Z = PI/ZFLOAT(IFCAB)
BCOS = -2.0*ZSIN(Z)**2
BSIN = ZSIN(2.0*Z)
CW1 = 1.0
SW1 = 0.0
! This is the main body of radix 4 calculations:
DO 10, LITLA = 1,IFACA
DO 8, I0 = LITLA,N,IFCAB
I1 = I0 + IFACA
I2 = I1 + IFACA
I3 = I2 + IFACA
XS0 = XREAL(I0) + XREAL(I2)
XS1 = XREAL(I0) - XREAL(I2)
YS0 = XIMAG(I0) + XIMAG(I2)
YS1 = XIMAG(I0) - XIMAG(I2)
XS2 = XREAL(I1) + XREAL(I3)
XS3 = XREAL(I1) - XREAL(I3)
YS2 = XIMAG(I1) + XIMAG(I3)
YS3 = XIMAG(I1) - XIMAG(I3)
XREAL(I0) = XS0 + XS2
XIMAG(I0) = YS0 + YS2
X1 = XS1 + YS3
Y1 = YS1 - XS3
X2 = XS0 - XS2
Y2 = YS0 - YS2
X3 = XS1 - YS3
Y3 = YS1 + XS3
IF(LITLA.GT.1) THEN
GOTO 7
ENDIF
XREAL(I2) = X1
XIMAG(I2) = Y1
XREAL(I1) = X2
XIMAG(I1) = Y2
XREAL(I3) = X3
XIMAG(I3) = Y3
GOTO 8
! Now IF required - we multiply by twiddle factors:
7 XREAL(I2) = X1*CW1 + Y1*SW1
XIMAG(I2) = Y1*CW1 - X1*SW1
XREAL(I1) = X2*CW2 + Y2*SW2
XIMAG(I1) = Y2*CW2 - X2*SW2
XREAL(I3) = X3*CW3 + Y3*SW3
XIMAG(I3) = Y3*CW3 - X3*SW3
8 CONTINUE
IF(LITLA.EQ.IFACA) THEN
GOTO 10
ENDIF
! Calculate a new set of twiddle factors:
Z = CW1*BCOS - SW1*BSIN + CW1
SW1 = BCOS*SW1 + BSIN*CW1 + SW1
TEMPR = 1.5 - 0.5*(Z*Z + SW1*SW1)
CW1 = Z*TEMPR
SW1 = SW1*TEMPR
CW2 = CW1*CW1 - SW1*SW1
SW2 = 2.0*CW1*SW1
CW3 = CW1*CW2 - SW1*SW2
SW3 = CW1*SW2 + CW2*SW1
10 CONTINUE
IF(IFACA.LE.1) THEN
GOTO 14
ENDIF
! Set up transform split for next stage:
IFACA = IFACA/4
IF(IFACA.GT.0) THEN
GOTO 5
ENDIF
! This is the calculation of a radix two-stage:
DO 13, K = 1,N,2
TEMPR = XREAL(K) + XREAL(K + 1)
XREAL(K + 1) = XREAL(K) - XREAL(K + 1)
XREAL(K) = TEMPR
TEMPR = XIMAG(K) + XIMAG(K + 1)
XIMAG(K + 1) = XIMAG(K) - XIMAG(K + 1)
XIMAG(K) = TEMPR
13 CONTINUE
14 IF(ITYPE.GT.0) THEN
GOTO 17
ENDIF
! For the inverse case, conjugate and scale the transform:
Z = 1.0/ZFLOAT(N)
DO 16, K = 1,N
XIMAG(K) = -XIMAG(K)*Z
XREAL(K) = XREAL(K)*Z
16 CONTINUE
17 return
END
! ----------------------------------------------------------
!-END of subroutine FASTG.FOR.
! ----------------------------------------------------------
!+ FILE: SCRAM.FOR
! ==========================================================
! Description: Subroutine for unscrambling FFT data:
! ==========================================================
SUBROUTINE SCRAM(XREAL,XIMAG,N,IPOW)
INTEGER L(19),II,J1,J2,J3,J4,J5,J6,J7,J8,J9,J10,J11,J12
INTEGER J13,J14,J15,J16,J17,J18,J19,J20,ITOP,I
REAL*8 XREAL(*),XIMAG(*),TEMPR
EQUIVALENCE (L1,L(1)),(L2,L(2)),(L3,L(3)),(L4,L(4))
EQUIVALENCE (L5,L(5)),(L6,L(6)),(L7,L(7)),(L8,L(8))
EQUIVALENCE (L9,L(9)),(L10,L(10)),(L11,L(11)),(L12,L(12))
EQUIVALENCE (L13,L(13)),(L14,L(14)),(L15,L(15)),(L16,L(16))
EQUIVALENCE (L17,L(17)),(L18,L(18)),(L19,L(19))
II = 1
ITOP = 2**(IPOW - 1)
I = 20 - IPOW
DO 5, K = 1,I
L(K) = II
5 CONTINUE
L0 = II
I = I + 1
DO 6, K = I,19
II = II*2
L(K) = II
6 CONTINUE
II = 0
DO 9, J1 = 1,L1,L0
DO 9, J2 = J1,L2,L1
DO 9, J3 = J2,L3,L2
DO 9, J4 = J3,L4,L3
DO 9, J5 = J4,L5,L4
DO 9, J6 = J5,L6,L5
DO 9, J7 = J6,L7,L6
DO 9, J8 = J7,L8,L7
DO 9, J9 = J8,L9,L8
DO 9, J10 = J9,L10,L9
DO 9, J11 = J10,L11,L10
DO 9, J12 = J11,L12,L11
DO 9, J13 = J12,L13,L12
DO 9, J14 = J13,L14,L13
DO 9, J15 = J14,L15,L14
DO 9, J16 = J15,L16,L15
DO 9, J17 = J16,L17,L16
DO 9, J18 = J17,L18,L17
DO 9, J19 = J18,L19,L18
J20 = J19
DO 9, I = 1,2
II = II +1
IF(II.GE.J20) THEN
GOTO 8
ENDIF
! J20 is the bit reverse of II!
! Pairwise exchange:
TEMPR = XREAL(II)
XREAL(II) = XREAL(J20)
XREAL(J20) = TEMPR
TEMPR = XIMAG(II)
XIMAG(II) = XIMAG(J20)
XIMAG(J20) = TEMPR
8 J20 = J20 + ITOP
9 CONTINUE
return
END
! -------------------------------------------------------------------
!-END:
! -------------------------------------------------------------------
Going through this and understanding it will take time! I wrote this using a CalTech paper I found years ago; I cannot recall the reference, I'm afraid. Good luck.
I hope this helps.
Your code works.
I would give more digits for PI ( 3.1415926535898 ).
Also, you have to divide the output of the DFT summation by S, the DFT size.
Since the input series in your test is constant, the DFT output should have only one non-zero coefficient.
And indeed all the output coefficients are very small relative to the first one.
But for a large input length, this is not an efficient way of implementing the DFT.
If timing is a concern, look into the Fast Fourier Transform for faster methods of calculating the DFT.
Your code looks right to me. I'm not sure what you were expecting for output but, given that your input is a constant value, the DFT of a constant is a DC term in bin 0 and zeroes in the remaining bins (or a close equivalent, which you have).
You might try testing your code with a longer sequence containing some type of waveform like a sine wave or a square wave. In general, however, you should consider using something like fftw in production code. It's been wrung out and highly optimized by many people for a long time. FFTs are optimized DFTs for special cases (e.g., lengths that are powers of 2).
Your code looks okay. out[0] should represent the "DC" component of your input waveform. In your case, it is 4 times bigger than the input value, because your normalization coefficient is 1.
The other coefficients represent the amplitude and phase of your input waveform. For a real input the coefficients are mirrored (complex conjugates), i.e., out[i] == conj(out[N-i]). You can test this with the following code:
double frequency = 1; /* use other values like 2, 3, 4 etc. */
for (int i = 0; i < 16; i++)
    numbers.push_back(sin((double)i / 16 * frequency * 2 * PI));
For frequency = 1, this gives:
(6.53592e-07,0)
(6.53592e-07,-8)
(6.53592e-07,1.75661e-07)
(6.53591e-07,2.70728e-07)
(6.5359e-07,3.75466e-07)
(6.5359e-07,4.95006e-07)
(6.53588e-07,6.36767e-07)
(6.53587e-07,8.12183e-07)
(6.53584e-07,1.04006e-06)
(6.53581e-07,1.35364e-06)
(6.53576e-07,1.81691e-06)
(6.53568e-07,2.56792e-06)
(6.53553e-07,3.95615e-06)
(6.53519e-07,7.1238e-06)
(6.53402e-07,1.82855e-05)
(-8.30058e-05,7.99999)
which seems correct to me: negligible DC, amplitude 8 for 1st harmonics, negligible amplitudes for other harmonics.
MoonKnight has already provided a radix-4 Decimation In Frequency Cooley-Tukey scheme in Fortran. Below, I provide a radix-2 Decimation In Frequency Cooley-Tukey scheme in Matlab.
The code is an iterative one and considers the scheme in the following figure:
A recursive approach is also possible.
As you will see, the implementation calculates also the number of performed multiplications and additions and compares it with the theoretical calculations reported in How many FLOPS for FFT?.
The code is obviously much slower than the highly optimized FFTW exploited by Matlab.
Note also that the twiddle factors omegaa^((2^(p - 1) * n)) can be calculated off-line and then read from a lookup table, but this point is skipped in the code below.
For a Matlab implementation of an iterative radix-2 Decimation In Time Cooley-Tukey scheme, please see Implementing a Fast Fourier Transform for Option Pricing.
% --- Radix-2 Decimation In Frequency - Iterative approach
clear all
close all
clc
N = 32;
x = randn(1, N);
xoriginal = x;
xhat = zeros(1, N);
numStages = log2(N);
omegaa = exp(-1i * 2 * pi / N);
mulCount = 0;
sumCount = 0;
tic
M = N / 2;
for p = 1 : numStages;
for index = 0 : (N / (2^(p - 1))) : (N - 1);
for n = 0 : M - 1;
a = x(n + index + 1) + x(n + index + M + 1);
b = (x(n + index + 1) - x(n + index + M + 1)) .* omegaa^((2^(p - 1) * n));
x(n + 1 + index) = a;
x(n + M + 1 + index) = b;
mulCount = mulCount + 4;
sumCount = sumCount + 6;
end;
end;
M = M / 2;
end
xhat = bitrevorder(x);
timeCooleyTukey = toc;
tic
xhatcheck = fft(xoriginal);
timeFFTW = toc;
rms = 100 * sqrt(sum(sum(abs(xhat - xhatcheck).^2)) / sum(sum(abs(xhat).^2)));
fprintf('Time Cooley-Tukey = %f; \t Time FFTW = %f\n\n', timeCooleyTukey, timeFFTW);
fprintf('Theoretical multiplications count \t = %i; \t Actual multiplications count \t = %i\n', ...
2 * N * log2(N), mulCount);
fprintf('Theoretical additions count \t\t = %i; \t Actual additions count \t\t = %i\n\n', ...
3 * N * log2(N), sumCount);
fprintf('Root mean square with FFTW implementation = %.10e\n', rms);
Your code is correct for obtaining the DFT.
The function you are testing is sin((double)i / points * frequency * 2 * PI), which corresponds to a sinusoid of amplitude 1, frequency 1 and sampling frequency Fs = the number of points taken.
Operating on the obtained data, we have:
As you can see, the DFT coefficients are symmetric with respect to the coefficient at position N/2, so only the first N/2 provide information. The amplitude, obtained from the modulus of the real and imaginary parts, must be divided by N and multiplied by 2 to reconstruct it. The frequencies of the coefficients are multiples of Fs/N, given by the coefficient number.
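As an illustration of that reconstruction rule (my own sketch, reusing the same DFT summation as in the question): a single sinusoid of amplitude 1.3 at frequency 2 with N = 16 samples, and Fs taken equal to N, should give 2*|out[k]|/N of about 1.3 at bin k = 2 and close to zero at the other bins up to N/2.

#include <iostream>
#include <complex>
#include <vector>
#include <cmath>
using namespace std;

int main() {
    const double PI = 3.14159265358979;
    const int N = 16;                         // number of samples; Fs assumed equal to N
    const double amplitude = 1.3, frequency = 2.0;

    // build the test signal and compute its DFT with the same formula as in the question
    vector< complex<double> > x(N), X(N);
    for (int n = 0; n < N; ++n)
        x[n] = amplitude * sin(2 * PI * frequency * n / N);
    for (int k = 0; k < N; ++k)
        for (int n = 0; n < N; ++n)
            X[k] += x[n] * polar<double>(1.0, -2 * PI * k * n / N);

    // only bins 0..N/2 carry independent information for a real signal
    double Fs = N;
    for (int k = 1; k <= N / 2; ++k)
        cout << "bin " << k << ", frequency " << k * Fs / N
             << ", amplitude " << 2.0 * abs(X[k]) / N << endl;
    return 0;
}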
If we introduce two sinusoids, one of frequency 2 and amplitude 1.3 and another of frequency 3 and amplitude 1.7.
for (int i = 0; i < 16; i++)
{
    numbers.push_back(1.3 * sin((double)i / 16 * frequency1 * 2 * PI)
                      + 1.7 * sin((double)i / 16 * frequency2 * 2 * PI));
}
The obtained data are:
Good luck.