I was just wondering which will the best BigInteger class in C++ for programming contests which do not allow external libraries?
Mainly I was looking for a class which could be used in my code( I will of course write it on my own, on similar grounds ).
The primary factors which I think are important are( according to their importance ):
Arbitrary length numbers and their operations should be supported.
Should be as small as possible, code-wise. Usually there's a limit on the size of the source code which can be submitted to ~50KB, so the code should be ( much )smaller than that.
Should be as fast as possible. I read somewhere that bigInt classes take O( log( n ) ) time, so this should have a similiar complexity. What I mean is that it should be as fast as possible.
So far I've only needed unsigned integer big numbers for codechef, but codechef only gives 2KB, so I don't have the full implementation up there anywhere, just the members needed for that problem. My code also assumes that long long has at least twice as many bits as a unsigned, though that's pretty safe. The only real trick to it is that different biguint classes may have different data lengths. Here's summaries of the more interesting functions.
#define BIG_LEN() (data.size()>rhs.data.size()?data.size():rhs.data.size())
//the length of data or rhs.data, whichever is bigger
#define SML_LEN() (data.size()<rhs.data.size()?data.size():rhs.data.size())
//the length of data or rhs.data, whichever is smaller
const unsigned char baselut[256]={ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0,
0,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,
25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
41,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,
25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40
};
const unsigned char base64lut[256]={ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,62, 0, 0, 0,63,
52,53,54,55,56,57,58,59,60,61, 0, 0, 0, 0, 0, 0,
0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,
15,16,17,18,19,20,21,22,23,24,25, 0, 0, 0, 0, 0,
0,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
41,42,43,44,45,46,47,48,49,50,51, 0, 0, 0, 0, 0
};
//lookup tables for creating from strings
void add(unsigned int amount, unsigned int index)
adds amount at index with carry, simplifies other members
void sub(unsigned int amount, unsigned int index)
subtracts amount at index with borrow, simplifies other members
biguint& operator+=(const biguint& rhs)
resize data to BIG_LEN()
int carry = 0
for each element i in data up to SML_LEN()
data[i] += rhs.data[i] + carry
carry = ((data[i]<rhs[i]+carry || (carry && rhs[i]+carry==0)) ? 1u : 0u);
if data.length > rhs.length
add(carry, SML_LEN())
biguint& operator*=(const biguint& rhs)
biguint lhs = *this;
resize data to data.length + rhs.length
zero out data
for each element j in lhs
long long t = lhs[j]
for each element i in rhs (and j+i<data.size)
t*=rhs[i];
add(t&UINT_MAX, k);
if (k+1<data.size())
add(t>>uint_bits, k+1);
//note this was public, so I could do both at the same time when needed
//operator /= and %= both just call this
//I have never needed to divide by a biguint yet.
biguint& div(unsigned int rhs, unsigned int & mod)
long long carry = 0
for each element i from data length to zero
carry = (carry << uint_bits) | data[i]
data[i] = carry/rhs;
carry %= rhs
mod = carry
//I have never needed to shift by a biguint yet
biguint& operator<<=(unsigned int rhs)
resize to have enough room, always at least 1 bigger
const unsigned int bigshift = rhs/uint_bits;
const unsigned int lilshift = rhs%uint_bits;
const unsigned int carry_shift = (uint_bits-lilshift)%32;
for each element i from bigshift to zero
t = data[i-bigshift] << lilshift;
t |= data[i-bigshift-1] >> carry_shift;
data[i] = t;
if bigshift < data.size
data[bigshift] = data[0] << lilshift
zero each element i from 0 to bigshift
std::ofstream& operator<<(std::ofstream& out, biguint num)
unsigned int mod
vector reverse
do
num.div(10,mod);
push back mod onto reverse
while num greater than 0
print out elements of reverse in reverse
std::ifstream& operator>>(std::ifstream& in, biguint num)
char next
do
in.get(next)
while next is whitespace
num = 0
do
num = num * 10 + next
while in.get(next) and next is digit
//these are handy for initializing to known values.
//I also have constructors that simply call these
biguint& assign(const char* rhs, unsigned int base)
for each char c in rhs
data *= base
add(baselut[c], 0)
biguint& assign(const char* rhs, std::integral_constant<unsigned int, 64> base)
for each char c in rhs
data *= base
add(base64lut[c], 0)
//despite being 3 times as much, code, the hex version is _way_ faster.
biguint& assign(const char* rhs, std::integral_constant<unsigned int, 16>)
if first two characters are "0x" skip them
unsigned int len = strlen(rhs);
grow(len/4+1);
zero out data
const unsigned int hex_per_int = uint_bits/4;
if (len > hex_per_int*data.size()) { //calculate where first digit goes
rhs += len-hex_per_int*data.size();
len = hex_per_int*data.size();
}
for(unsigned int i=len; i --> 0; ) { //place each digit directly in it's place
unsigned int t = (unsigned int)(baselut[*(rhs++)]) << (i%hex_per_int)*4u;
data[i/hex_per_int] |= t;
}
I also made specializations for multiplication, divide, modulo, shifts and others for std::integral_constant<unsigned int, Value>, which made massive improvements to my serializing and deserializing functions amongst others.
Related
The above parallelize code is taking much more time as compare to the original one. I have used bfs approach to solve the problem. I am getting the correct output but it is taking too much time.
(x, y) represents matrix cell coordinates, and
dist represents their minimum distance from the source
struct Node
{
int x, y, dist;
};
\\Below arrays detail all four possible movements from a cell
int row[] = { -1, 0, 0, 1 };
int col[] = { 0, -1, 1, 0 };
Function to check if it is possible to go to position (row, col)
from the current position. The function returns false if (row, col)
is not a valid position or has a value 0 or already visited.
bool isValid(vector<vector<int>> const &mat, vector<vector<bool>> &visited, int row, int col)
{
return (row >= 0 && row < mat.size()) && (col >= 0 && col < mat[0].size())
&& mat[row][col] && !visited[row][col];
}
Find the shortest possible route in a matrix mat from source
cell (i, j) to destination cell (x, y)
int findShortestPathLength(vector<vector<int>> const &mat, pair<int, int> &src,
pair<int, int> &dest)
{
if (mat.size() == 0 || mat[src.first][src.second] == 0 ||
mat[dest.first][dest.second] == 0) {
return -1;
}
// `M × N` matrix
int M = mat.size();
int N = mat[0].size();
// construct a `M × N` matrix to keep track of visited cells
vector<vector<bool>> visited;
visited.resize(M, vector<bool>(N));
// create an empty queue
queue<Node> q;
// get source cell (i, j)
int i = src.first;
int j = src.second;
// mark the source cell as visited and enqueue the source node
visited[i][j] = true;
q.push({i, j, 0});
// stores length of the longest path from source to destination
int min_dist = INT_MAX;
// loop till queue is empty
while (!q.empty())
{
// dequeue front node and process it
Node node = q.front();
q.pop();
// (i, j) represents a current cell, and `dist` stores its
// minimum distance from the source
int i = node.x, j = node.y, dist = node.dist;
// if the destination is found, update `min_dist` and stop
if (i == dest.first && j == dest.second)
{
min_dist = dist;
break;
}
// check for all four possible movements from the current cell
// and enqueue each valid movement
#pragma omp parallel for
for (int k = 0; k < 4; k++)
{
// check if it is possible to go to position
// (i + row[k], j + col[k]) from current position
#pragma omp task shared(i,visited,j)
{
if (isValid(mat, visited, i + row[k], j + col[k]))
{
// mark next cell as visited and enqueue it
visited[i + row[k]][j + col[k]] = true;
q.push({ i + row[k], j + col[k], dist + 1 });
}
}
}
}
if (min_dist != INT_MAX) {
return min_dist;
}
return -1;
}
main part of the code only contains a matrix and source and destination coordinates
int main()
{
vector<vector<int>> mat =
{
{ 1, 1, 1, 1, 1, 0, 0, 1, 1, 1 },
{ 0, 1, 1, 1, 1, 1, 0, 1, 0, 1 },
{ 0, 0, 1, 0, 1, 1, 1, 0, 0, 1 },
{ 1, 0, 1, 1, 1, 0, 1, 1, 0, 1 },
{ 0, 0, 0, 1, 0, 0, 0, 1, 0, 1 },
{ 1, 0, 1, 1, 1, 0, 0, 1, 1, 0 },
{ 0, 0, 0, 0, 1, 0, 0, 1, 0, 1 },
{ 0, 1, 1, 1, 1, 1, 1, 1, 0, 0 },
{ 1, 1, 1, 1, 1, 0, 0, 1, 1, 1 },
{ 0, 0, 1, 0, 0, 1, 1, 0, 0, 1 },
};
pair<int, int> src = make_pair(0, 0);
pair<int, int> dest = make_pair(7, 5);
int min_dist = findShortestPathLength(mat, src, dest);
if (min_dist != -1)
{
cout << min_dist<<endl;
}
else {
cout << "Destination cannot be reached from a given source"<<endl;
}
return 0;
}
I have used shared variable but it is taking too much time.
Can anyone help me?
As people have remarked, you only get parallelism over the 4 directions; a better approach is to keep a set of sets of points: first the starting point, then all points you can reach from there in 1 step, then the ones you can reach in two steps.
With this approach you get more parallelism: you're building what's known as a "wave front" and all the points can be tackled simultaneously.
auto last_level = reachable.back();
vector<int> newly_reachable;
for ( auto n : last_level ) {
const auto& row = graph.row(n);
for ( auto j : row ) {
if ( not reachable.has(j)
and not any_of
( newly_reachable.begin(),newly_reachable.end(),
[j] (int i) { return i==j; } ) )
newly_reachable.push_back(j);
}
}
if (newly_reachable.size()>0)
reachable.push_back(newly_reachable);
(I have written this for a general DAG; writing your maze as a DAG is an exercise for the reader.)
However, this approach still has big problems: if two points on the current wave front decide to add the same new point, you have to resolve that.
For a very parallel approach you need to abandon the "push" model of adding new points altogether, and to go a "pull" model: in every "distance" iteration you loop over all points and ask, if I am not reachable, was one of my neighbors reachable? If so, mark me as reachable in one step more than that neighbor.
If you think about that last approach for a second (or two) you'll see that you are essentially doing a sequence of matrix-vector products, with the adjacency matrix and the currently reachable set. Except that you replace the scalar "+" operation by "min" and the scalar "*" by "+1". Read any tutorial about the interpretation of graph operations as linear algebra. Except that it's not really linear.
I'm using arrayfire and I need to translate a lot of images at once and store it in a new array. The images are contained in a single array of size (w, h, c, b) and the amount by which each image needs to be translated is inside a (2, 1, 1, b) array.
The sequential implementation is as follows
for (int i=0; i<b; i++)
{
float x = coords(0, 0, 0, i).scalar<float>();
float y = coords(1, 0, 0, i).scalar<float>();
af::array t_imgs(af::span,af::span,af::span,i) =
af::translate(imgs(af::span,af::span,af::span,i), x, y);
}
How could I parallelize it? Translate doesn't accept arrays as arguments, so I can't do something like this:
gfor(af::seq i, b)
{
af::array x = coords(0, 0, 0, i);
af::array y = coords(1, 0, 0, i);
t_imgs(af::span,af::span,af::span,i) =
af::translate(imgs(af::span,af::span,af::span,i), x, y);
}
I am trying to integrate a very simple ODE using boost odeint.
For some cases, the values are the same (or very similar) to Python's scipy odeint function.
But for other initial conditions, the values are vastly different.
The function is: d(uhat) / dt = - alpha^2 * kappa^2 * uhat
where alpha is 1.0, and kappa is a constant depending on the case (see values below).
I have tried several different ODE solvers from boost, and none seem to work.
Update: The code below is now working.
In the code below, the first case gives nearly identical results, the 2nd case is kind of trivial (but reassuring), and the 3rd case gives erroneous answers in the C++ version.
Here is the C++ version:
#include <boost/numeric/odeint.hpp>
#include <cstdlib>
#include <iostream>
typedef boost::numeric::odeint::runge_kutta_dopri5<double> Stepper_Type;
struct ResultsObserver
{
std::ostream& m_out;
ResultsObserver( std::ostream &out ) : m_out( out ) { }
void operator()(const State_Type& x , double t ) const
{
m_out << t << " : " << x << std::endl;
}
};
// The rhs: d_uhat_dt = - alpha^2 * kappa^2 * uhat
class Eq {
public:
Eq(double alpha, double kappa)
: m_constant(-1.0 * alpha * alpha * kappa * kappa) {}
void operator()(double uhat, double& d_uhat_dt, const double t) const
{
d_uhat_dt = m_constant * uhat;
}
private:
double m_constant;
};
void integrate(double kappa, double initValue)
{
const unsigned numTimeIncrements = 100;
const double dt = 0.1;
const double alpha = 1.0;
double uhat = initValue; //Init condition
std::vector<double> uhats; //Results vector
Eq rhs(alpha, kappa); //The RHS of the ODE
//This is what I was doing that did not work
//
//boost::numeric::odeint::runge_kutta_dopri5<double> stepper;
//for(unsigned step = 0; step < numTimeIncrements; ++step) {
// uhats.push_back(uhat);
// stepper.do_step(rhs, uhat, step*dt, dt);
//}
//This works
integrate_const(
boost::numeric::odeint::make_dense_output<Stepper_Type>( 1E-12, 1E-6 ),
rhs, uhat, startTime, endTime, dt, ResultsObserver(std::cout)
);
std::cout << "kappa = " << kappa << ", initial value = " << initValue << std::endl;
for(auto val : uhats)
std::cout << val << std::endl;
std::cout << "---" << std::endl << std::endl;
}
int main() {
const double kappa1 = 0.062831853071796;
const double initValue1 = -187.097241230045967;
integrate(kappa1, initValue1);
const double kappa2 = 28.274333882308138;
const double initValue2 = 0.000000000000;
integrate(kappa2, initValue2);
const double kappa3 = 28.337165735379934;
const double initValue3 = -0.091204068895190;
integrate(kappa3, initValue3);
return EXIT_SUCCESS;
}
and the corresponding Python3 version:
enter code here
#!/usr/bin/env python3
import numpy as np
from scipy.integrate import odeint
def Eq(uhat, t, kappa, a):
d_uhat = -a**2 * kappa**2 * uhat
return d_uhat
def integrate(kappa, initValue):
dt = 0.1
t = np.arange(0,10,dt)
a = 1.0
print("kappa = " + str(kappa))
print("initValue = " + str(initValue))
uhats = odeint(Eq, initValue, t, args=(kappa,a))
print(uhats)
print("---")
print()
kappa1 = 0.062831853071796
initValue1 = -187.097241230045967
integrate(kappa1, initValue1)
kappa2 = 28.274333882308138
initValue2 = 0.000000000000
integrate(kappa2, initValue2)
kappa3 = 28.337165735379934
initValue3 = -0.091204068895190
integrate(kappa3, initValue3)
With boost::odeint you should use the integrate... interface functions.
What happens in your code using do_step is that you use the dopri5 method as a fixed-step method with your provided step size. In connection with the large coefficient L=kappa^2 of about 800, the product L*dt=80 is far outside the stability region of the method, the boundary is between values of 2 to 3.5. Divergence or oscillating divergence is the expected result.
What should happen and what is implemented in the scipy odeint, ode-dopri5 or solve_ivp-RK45 methods is a method with adaptive step size. Internally the optimal step size for the error tolerances is chosen, and in each internal step an interpolation polynomial is determined. The output or observed values are computed with this interpolator, also called "dense output" if the interpolation function is returned from the integrator. One result of the error control is that the step size is always inside the stability region, for stiff problems with medium error tolerances on or close to the boundary of this region.
This has all the hallmarks of precision issues.
Simply replacing double with long double gives:
Live On Compiler Explorer
#include <boost/numeric/odeint.hpp>
#include <boost/multiprecision/cpp_bin_float.hpp>
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <iostream>
using Value = long double;
//using Value = boost::multiprecision::number<
//boost::multiprecision::backends::cpp_bin_float<100>,
//boost::multiprecision::et_off>;
// The rhs: d_uhat_dt = - alpha^2 * kappa^2 * uhat
class Eq {
public:
Eq(Value alpha, Value kappa)
: m_constant(-1.0 * alpha * alpha * kappa * kappa)
{
}
void operator()(Value uhat, Value& d_uhat_dt, const Value) const
{
d_uhat_dt = m_constant * uhat;
}
private:
Value m_constant;
};
void integrate(Value const kappa, Value const initValue)
{
const unsigned numTimeIncrements = 100;
const Value dt = 0.1;
const Value alpha = 1.0;
Value uhat = initValue; // Init condition
std::vector<Value> uhats; // Results vector
Eq rhs(alpha, kappa); // The RHS of the ODE
boost::numeric::odeint::runge_kutta_dopri5<Value> stepper;
for (unsigned step = 0; step < numTimeIncrements; ++step) {
uhats.push_back(uhat);
auto&& stepdt = step * dt;
stepper.do_step(rhs, uhat, stepdt, dt);
}
fmt::print("kappa = {}, initial value = {}\n{}\n---\n", kappa, initValue,
uhats);
}
int main() {
for (auto [kappa, initValue] :
{
std::pair //
{ 0.062831853071796 , -187.097241230045967 },
{28.274333882308138 , 0.000000000000 },
{28.337165735379934 , -0.091204068895190 },
}) //
{
integrate(kappa, initValue);
}
}
Prints
kappa = 0.0628319, initial value = -187.097
[-187.097, -187.023, -186.95, -186.876, -186.802, -186.728, -186.655, -186.581, -186.507, -186.434, -186.36, -186.287, -186.213, -186.139, -186.066, -185.993, -185.919, -185.846, -185.772, -185.699, -185.626, -185.553, -185.479, -185.406, -185.333, -185.26, -185.187, -185.114, -185.04, -184.967, -184.894, -184.821, -184.748, -184.676, -184.603, -184.53, -184.457, -184.384, -184.311, -184.239, -184.166, -184.093, -184.021, -183.948, -183.875, -183.803, -183.73, -183.658, -183.585, -183.513, -183.44, -183.368, -183.296, -183.223, -183.151, -183.079, -183.006, -182.934, -182.862, -182.79, -182.718, -182.645, -182.573, -182.501, -182.429, -182.357, -182.285, -182.213, -182.141, -182.069, -181.998, -181.926, -181.854, -181.782, -181.71, -181.639, -181.567, -181.495, -181.424, -181.352, -181.281, -181.209, -181.137, -181.066, -180.994, -180.923, -180.852, -180.78, -180.709, -180.638, -180.566, -180.495, -180.424, -180.353, -180.281, -180.21, -180.139, -180.068, -179.997, -179.926]
---
kappa = 28.2743, initial value = 0
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
---
kappa = 28.3372, initial value = -0.0912041
[-0.0912041, -38364100, -1.61375e+16, -6.78809e+24, -2.85534e+33, -1.20107e+42, -5.0522e+50, -2.12516e+59, -8.93928e+67, -3.76022e+76, -1.5817e+85, -6.65327e+93, -2.79864e+102, -1.17722e+111, -4.95186e+119, -2.08295e+128, -8.76174e+136, -3.68554e+145, -1.55029e+154, -6.52114e+162, -2.74306e+171, -1.15384e+180, -4.85352e+188, -2.04159e+197, -8.58774e+205, -3.61235e+214, -1.5195e+223, -6.39163e+231, -2.68858e+240, -1.13092e+249, -4.75713e+257, -2.00104e+266, -8.41718e+274, -3.54061e+283, -1.48932e+292, -6.26469e+300, -2.63518e+309, -1.10846e+318, -4.66265e+326, -1.9613e+335, -8.25002e+343, -3.47029e+352, -1.45975e+361, -6.14028e+369, -2.58285e+378, -1.08645e+387, -4.57005e+395, -1.92235e+404, -8.08618e+412, -3.40137e+421, -1.43075e+430, -6.01833e+438, -2.53155e+447, -1.06487e+456, -4.47929e+464, -1.88417e+473, -7.92559e+481, -3.33382e+490, -1.40234e+499, -5.89881e+507, -2.48128e+516, -1.04373e+525, -4.39033e+533, -1.84675e+542, -7.76818e+550, -3.26761e+559, -1.37449e+568, -5.78166e+576, -2.432e+585, -1.023e+594, -4.30314e+602, -1.81008e+611, -7.61391e+619, -3.20272e+628, -1.34719e+637, -5.66684e+645, -2.3837e+654, -1.00268e+663, -4.21768e+671, -1.77413e+680, -7.4627e+688, -3.13911e+697, -1.32044e+706, -5.55429e+714, -2.33636e+723, -9.82768e+731, -4.13392e+740, -1.73889e+749, -7.31449e+757, -3.07677e+766, -1.29421e+775, -5.44399e+783, -2.28996e+792, -9.6325e+800, -4.05182e+809, -1.70436e+818, -7.16922e+826, -3.01567e+835, -1.26851e+844, -5.33587e+852]
---
As you can see, I tried some simple takes to get it to use Boost Multiprecision, but that didn't immediately work and may require someone with actual understanding of the maths / IDEINT.
My title might be confusing, please read on. The question seems long, but rather simple to understand.
I am using Eclipse Oxygen MacOSX GCC to write C++. In my project, I have a class called FilterArray, and inside of it there are three private fields:
int arrayLength;
int * originalArray;
int * filteredArray;
And, (1)a function that prints out the array (2)a function to set array length with a random number up to 50 (3)a function to initialize the original array (4)a function that's to filter the original array and store the filtered array in filteredArray, several getter functions, and so on.
void FilterArray::printArray(int *arr);
void FilterArray::setArrayLength();
void FilterArray::createOrginalArray(); //random value assigned to each index
void FilterArray::filterTheArray(); //by a certain algorithm
void FilterArray::getOriginalArray();
I made three test cases to test my filterArray() function in main(). At first, I forgot to initialize the fliteredArray with the size(only declared it) for each of the objects, but to my surprise, two of the original arrays gets filtered succesfully, and the program gets halted at the process when the function filtered the third array.
I soon realized that the problem is indeed that I forgot to initialize the filtered array with the size so when the function runs, fliteredArray[index] would make no sense, however, why this situation did happen to the first two objects, but not the third??
If you need to further understand my question, please refer to the following code. I've cut it to the most succinct.
In my FilterArray.hpp:
#ifndef FILTERARRAY_HPP_
#define FILTERARRAY_HPP_
class FilterArray {
int *originalArray;
int arrayLength;
int *filteredArray;
public:
// Constructor
FilterArray();
void setArraySize();
int getArraySize();
int * getOriginalArray(int index);
int * getFilteredArray();
void printArray(int *arr);
};
#endif /* FILTERARRAY_HPP_ */
In my FilterArray.cpp:
#include "FilterArray.hpp"
#include <iostream>
#include <stdlib.h>
using namespace std;
#include <math.h>
// Constructor
FilterArray::FilterArray(){
}
// function definitions
void FilterArray::setArraySize(){
this->arrayLength = rand() % 25 + 25;
}
int FilterArray::getArraySize(){
return this->arrayLength;
}
int * FilterArray::getOriginalArray(int index){
return &this->originalArray[index];
}
int * FilterArray::getFilteredArray(){
return this->filteredArray;
}
void FilterArray::printArr(int *arr){
cout <<"["<<arr[0]<<", ";
for (int i = 1; i < this->arrayLength - 1; i++){
cout <<arr[i]<<", ";
}
cout <<arr[this->arrayLength - 1]<<"]"<<endl;
}
In my main.cpp:
#include <iostream>
#include <stdlib.h>
using namespace std;
#include <string>
#include "FilterArray.hpp"
int main() {
srand(time(NULL));
// Test Cases
// Object array1 is for case 1, and the same for 2 and 3.
FilterArray *array1 = new FilterArray();
FilterArray *array2 = new FilterArray();
FilterArray *array3 = new FilterArray();
array1->setArraySize();
array2->setArraySize();
array3->setArraySize();
cout <<"1"<<endl;
//array1->printArr(array1->getOriginalArray(0));
array1->printArr(array1->getFilteredArray());
cout <<"2"<<endl;
//array2->printArr(array2->getOriginalArray(0));
array2->printArr(array2->getFilteredArray());
cout <<"3"<<endl;
//array3->printArr(array3->getOriginalArray(0));
array3->printArr(array3->getFilteredArray());
delete array1;
array1 = NULL;
delete array2;
array2 = NULL;
delete array3;
array3 = NULL;
return 1;
}
And to my suprise, the output is:
1
[8, 0, 0, 0, -824775304, 32767, -666960016, 32767, 0, 0, 0, 0, 0, 0, 0, 0, 27, 1, -1201811687, 32767, -824775248, 32767, -824816597, 32767, 17, 40, 40, 0, 0, 0, -824775295, 32767, -666959792]
2
[524288, 0, 0, 0, -310902784, 2147471062, -9437184, 2147473470, 0, 0, 0, 0, 0, 0, 0, 0, 1769472, 65536, -820445184, 2147465309, -307232768, 2147471062, 1277886464, 2147471062, 1114112, 2621440, 2621440, 0, 0, 0, -310312960, 2147471062, 5242880, 2147473471, 0, 0, 0, 0, 0, 0, 0, 0, 1769472, 196608]
3
You can see in my main() there is nowhere I initialized the filteredArray, but why the first two objects can have it, but not the third? I delete the pointer after the main though, or is it because the program halts before the deletion of the pointer or what?
Thanks a lot guys!
I'm trying to make sure FFTW does what I think it should do, but am having problems. I'm using OpenCV's cv::Mat. I made a test program that, given a Mat f, computes ifft(fft(f)) and compares the result to f. I would expect the difference between the two to be negligible, but there's a strange pattern in the data..
In this case, f is initialized to be an 8x8 array of floats with positive values less than 1.
Here's my test program code:
Mat f = .. //populate f
if (f.type() != CV_32FC1)
DLOG << "Bad f type";
const int y = f.rows;
const int x = f.cols;
double* input = fftw_alloc_real(y * 2*(x/2 + 1));
// forward fft
fftw_plan plan = fftw_plan_dft_r2c_2d(x, y, input, (fftw_complex*)input, FFTW_MEASURE);
// inverse fft
fftw_plan iplan = fftw_plan_dft_c2r_2d(x, y, (fftw_complex*)input, input, FFTW_MEASURE);
// populate fftw data from f
for (int yi = 0; yi < y; ++yi)
{
const float* yptr = f.ptr<float>(yi);
for (int xi = 0; xi < x; ++xi)
input[yi*x + xi] = (double)yptr[xi];
}
fftw_execute(plan);
fftw_execute(iplan);
// put data into another cv::Mat for comparison
Mat check(y, x, f.type());
for (int yi = 0; yi < y; ++yi)
{
float* yptr = check.ptr<float>(yi);
for (int xi = 0; xi < x ; ++xi)
yptr[xi] = (float)input[yi*x + xi];
}
DLOG << Util::summary(f, "f");
DLOG << f;
DLOG << Util::summary(check, "check");
DLOG << check;
Mat diff = f*x*y - check;
DLOG << Util::summary(diff, "diff");
DLOG << diff;
Where DLOG is my logger and Util::summary(cv::Mat m) just prints passed string and the dimensions, channels, min, and max of the mat.
Here's what the data looks like (output):
f: rows:8 cols:8 chans:1 min:0.00257996 max:0.4
[0.050668437, 0.04509116, 0.033668514, 0.10986148, 0.12855141, 0.048241843, 0.12613985,.09731093;
0.028602425, 0.0092236707, 0.037089188, 0.118964, 0.075040311, 0.40000001, 0.11959606, 0.071930833;
0.0025799556, 0.051522054, 0.22233701, 0.052993439, 0.032000393, 0.12673819, 0.015244827, 0.044803992;
0.13946071, 0.019708242, 0.0112687, 0.047459811, 0.019342113, 0.030085485, 0.018739942, 0.0098618753;
0.041809395, 0.029681522, 0.026837418, 0.16038358, 0.29034778, 0.17247421, 0.1789207, 0.042179305;
0.025630442, 0.017192598, 0.060540862, 0.1854037, 0.21287154, 0.04813192, 0.042614728, 0.034764063;
0.0030835248, 0.018511582, 0.0071733585, 0.017076733, 0.064545207, 0.0026390438, 0.088922881, 0.045725599;
0.12798512, 0.23215951, 0.027465452, 0.03174505, 0.04352935, 0.025079668, 0.044403922, 0.035459157]
check: rows:8 cols:8 chans:1 min:-3.26489 max:25.6
[3.24278, 2.8858342, 2.1547849, 7.0311346, 8.2272902, 3.0874779, 8.0729504, 6.2278996;
0.30818239, 0, 2.373708, 7.6136961, 4.8025799, 25.6, 7.6541481, 4.6035733;
0.16511716, 3.2974114, -3.2648909, 0, 2.0480251, 8.1112442, 0.97566891, 2.8674555;
8.9254856, 1.2613275, 0.72119683, 3.0374279, -0.32588482, 0, 1.1993563, 0.63116002;
2.6758013, 1.8996174, 1.7175947, 10.264549, 18.582258, 11.038349, 0.042666838, 0;
1.6403483, 1.1003263, 3.8746152, 11.865837, 13.623778, 3.0804429, 2.7273426, 2.2249;
0.44932228, 0, 0.45909494, 1.0929109, 4.1308932, 0.16889881, 5.6910644, 2.9264383;
8.1910477, 14.858209, -0.071794562, 0, 2.7858784, 1.6050987, 2.841851, 2.2693861]
diff: rows:8 cols:8 chans:1 min:-0.251977 max:17.4945
[0, 0, 0, 0, 0, 0, 0, 0;
1.5223728, 0.59031492, 0, 0, 0, 0, 0, 0;
0, 0, 17.494459, 3.3915801, 0, 0, 0, 0;
0, 0, 0, 0, 1.5637801, 1.9254711, 0, 0;
0, 0, 0, 0, 0, 0, 11.408258, 2.6994755;
0, 0, 0, 0, 0, 0, 0, 0;
-0.2519767, 1.1847413, 0, 0, 0, 0, 0, 0;
0, 0, 1.8295834, 2.0316832, 0, 0, 0, 0]
The difficult part for me is the nonzero entries in the diff matrix. I've accounted for the scaling FFTW does on the values and the padding needed to do an in-place fft on real data; what am I missing?
I find it surprising that the data could be off by a value of 17 (which is 66% of the max value), when there are so many zeros. Also, the data irregularities seem to form a diagonal pattern.
As you may have noticed when writting fftw_alloc_real(y * 2*(x/2 + 1)); fftw needs extra space in the x direction to store complex data. In your case, as x=8, it needs 2*(x/2+1)=10 reals.
http://www.fftw.org/doc/Real_002ddata-DFT-Array-Format.html#Real_002ddata-DFT-Array-Format
So...you should take care of this as you populate the input array or retreive values from it.
You way change
input[yi*x + xi] = (double)yptr[xi];
for
int xfft=2*(x/2 + 1);
...
input[yi*xfft + xi] = (double)yptr[xi];
And
yptr[xi] = (float)input[yi*x + xi];
for
yptr[xi] = (float)input[yi*xfft + xi];
It should solve your problem since the non-nul points in your diff correspond to the extra padding.
Bye,