Well, here is my code and I am having a problem because my n is not increasing:
#define N 100
#define N_EQUATIONS 18 + 2
//initial values
int v = 1;
int cai = 2;
int caSR = 3;
int nai = 4;
int ki = 5;
int dvdt = 18;
double V_init = -87.5;
double Cai_init=1.0e-4;
double cansr=1.2;
double cajsr=cansr;
double CaSR_init = cansr + cajsr;
double Nai_init = 7;
double Ki_init = 145;
double u[N + 1][N_EQUATIONS + 1];
double Im[N + 1];
int main () {
int n = 0;
for ( n = 0; n <= N; n++) {
printf("n=%.18f\n", n);
u[n][v] = V_init;
//printf("t=%.18f\n", u[n][v]);
u[n][cai] = Cai_init;
//printf("cai=%.18f\n", u[n][cai]);
u[n][caSR] = CaSR_init;
u[n][nai] = Nai_init;
u[n][ki] = Ki_init;
u[n][dvdt] = 0.0;//check it
tapend[n] = 0.0;
tapstart[n] = 0.0;
}
}
Sorry if it is a stupid question and the answer is staring me in the face.
P.S. see the new revised code
You are probably just confused because your printf is incorrect:
printf("n=%.18f\n", n);
should be, e.g.
printf("n=%18d\n", n);
Passing an int where %f expects a double is undefined behavior, so your loop currently prints garbage (0 in your case, it seems, but it could be anything). This can give the incorrect impression that n is not incrementing correctly.
Note that if you enable compiler warnings (and compiler warnings should always be enabled), then the compiler would have pointed out this mistake to you. Always enable compiler warnings and always take notice of any warnings, understand them, and fix them.
I have a loop with an inner loop inside it. How can I optimise it to reduce execution time, for example by avoiding repeated memory accesses to the same element and by minimising the number of additions and multiplications?
int n,m,x1,y1,x2,y2,cnst;
int N = 9600;
int M = 1800;
int temp11,temp12,temp13,temp14;
int temp21,temp22,temp23,temp24;
int *arr1 = new int [32000]; // suppose it's already filled
int *arr2 = new int [32000];// suppose it's already filled
int sumFirst = 0;
int maxFirst = 0;
int indexFirst = 0;
int sumSecond = 0;
int maxSecond = 0;
int indexSecond = 0;
int jump = 2400;
for( n = 0; n < N; n++)
{
temp14 = 0;
temp24 = 0;
for( m = 0; m < M; m++)
{
x1 = m + cnst;
y1 = m + n + cnst;
temp11 = arr1[x1];
temp12 = arr2[y1];
temp13 = temp11 * temp12;
temp14+= temp13;
x2 = m + cnst + jump;
y2 = m + n + cnst + jump;
temp21 = arr1[x2];
temp22 = arr2[y2];
temp23 = temp21 * temp22;
temp24+= temp23;
}
sumFirst += temp14;
if (temp14 > maxFirst)
{
maxFirst = temp14;
indexFirst = m;
}
sumSecond += temp24;
if (temp24 > maxSecond)
{
maxSecond = temp24;
indexSecond = n;
}
}
// At the end we use sum , index and max for first and second;
You are multiplying array elements and accumulating the result.
This can be optimized by:
SIMD (doing multiple operations at a single CPU step)
Parallel execution (using multiple physical/logical CPUs at once)
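Look for a CPU-specific SIMD way of doing this. For example, intrinsics such as _mm_mul_epi32 or _mm_mullo_epi32 from SSE4.1 can possibly be used on x86-64. Before trying to write your own SIMD version with compiler intrinsics, make sure the compiler doesn't already do it for you.
As for parallel execution, look into OpenMP, or the C++17 parallel algorithms (std::reduce or std::transform_reduce with an execution policy; std::accumulate itself has no parallel overload).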
Let's say I have a float called foo. And foo increases or decreases at certain intervals. How can I make foo start from zero again once it succeeded a specified number / and the same in reverse for the decreasing?
For example:
int min = 0;
int max = 50;
float foo = 45;
foo += 7.5;
foo would be 52.5 now. But since I specified 50 as the max number, I want it to sort of overflow at that point so that the result is just 2.5.
Or:
int min = 0;
int max = 50;
float foo = 45;
foo += 108.3;
the result should be 3.3. It just overflowed 3 times.
And for the reverse:
int min = 0;
int max = 50;
float foo = 1;
foo -= 5.5;
the result would be -4.5 but it should be 45.5.
I was thinking that maybe something like this would solve the problem:
foo = foo % max;
if (foo <0)
foo+=max;
But % is only for integers and including a library just to get fmod feels overkill.
Anyway, I'm wondering if this would work, if it could be done with less code and if it could be done without fmod.
Because #include <cmath> is too mainstream, you can use:
if(foo > max)
{
int c = foo / max;
foo = foo - max*c;
}
else if(foo < min)
{
int c = foo / max;
if( c < 0 ) c = c * -1;
foo = foo + max*c;
foo = max + foo;
}
results:
for:
int min = 0;
int max = 50;
float foo = 1;
foo -= 5.5;
cpp.sh/4akud
for:
int min= 0;
int max= 50;
float foo = 45;
foo += 7.5;
cpp.sh/3accv
including a library just to get fmod feels overkill.
How so? If you don't use the other functions, they won't lead to any overhead, and the increase in compile time from including cmath is surely negligible.
Just use the standard library unless you have a good reason not to.
You might want to read this: Are unnecessary include files an overhead?
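const float TOP_CAP = 5.0f;
const float LOW_CAP = 1.0f;
const float RANGE = TOP_CAP - LOW_CAP; // width of the interval [LOW_CAP, TOP_CAP)
float value = 42.0f;
// Step down by one interval width until value is below the top cap.
while(value >= TOP_CAP)
{
value -= RANGE;
}
// Step up by one interval width until value is at least the low cap.
while(value < LOW_CAP)
value += RANGE;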
If a while loop's condition is false, the loop is skipped.
Or just #include <cmath> and use fmod.
Concurring with Ivan Rubinson: adding or subtracting the interval width inside while loops seems the simplest solution to the problem, especially if one can expect values only slightly outside the interval.
int min = 0;
int max = 50;
int mod = max - min;
float foo = 45;
foo += 7.5;
while(foo >= max) foo -= mod;
while(foo < min) foo += mod;
I'm sure this may seem like a trivial problem, but I'm having issues with the "clock()" function in my program (please note I have checked similar issues but they didn't seem to relate in the same context). My clock outputs are almost always 0, however there seem to be a few 10's as well (which is why I'm baffled). I considered the fact that maybe the function calling is too quickly processed but judging by the sorting algorithms, surely there should be some time taken.
Thank you all in advance for any help! :)
P.S. I'm really sorry about the mess regarding correlation of variables between functions (it's group code I've merged together, and I'm focusing on correct output before beautifying it) :D
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
int bruteForce(int array[], const int& sizeofArray);
int maxSubArray (int array[], int arraySize);
int kadane_Algorithm(int array[], int User_size);
int main()
{
int maxBF, maxDC, maxKD;
clock_t t1, t2, t3;
int arraySize1 = 1;
double t1InMSEC, t2InMSEC, t3InMSEC;
while (arraySize1 <= 30000)
{
int* array = new int [arraySize1];
for (int i = 0; i < arraySize1; i++)
{
array[i] = rand()% 100 + (-50);
}
t1 = clock();
maxBF = bruteForce(array, arraySize1);
t1 = clock() - t1;
t1InMSEC = (static_cast <double>(t1))/CLOCKS_PER_SEC * 1000.00;
t2 = clock();
maxDC = maxSubArray(array, arraySize1);
t2 = clock() - t2;
t2InMSEC = (static_cast <double>(t2))/CLOCKS_PER_SEC * 1000.00;
t3 = clock();
maxKD = kadane_Algorithm(array, arraySize1);
t3 = clock() - t3;
t3InMSEC = (static_cast <double>(t3))/CLOCKS_PER_SEC * 1000.00;
cout << arraySize1 << '\t' << t1InMSEC << '\t' << t2InMSEC << '\t' << t3InMSEC << '\t' << endl;
arraySize1 = arraySize1 + 100;
delete [] array;
}
return 0;
}
int bruteForce(int array[], const int& sizeofArray)
{
int maxSumOfArray = 0;
int runningSum = 0;
int subArrayIndex = sizeofArray - 1; // last valid index; array[sizeofArray] is out of bounds
while(subArrayIndex >= 0)
{
runningSum += array[subArrayIndex];
if (runningSum >= maxSumOfArray)
{
maxSumOfArray = runningSum;
}
subArrayIndex--;
}
return maxSumOfArray;
}
int maxSubArray (int array[], int arraySize)
{
int leftSubArray = 0;
int rightSubArray = 0;
int leftSubArraySum = -50;
int rightSubArraySum = -50;
int sum = 0;
if (arraySize == 1) return array[0];
else
{
int midPosition = arraySize/2;
leftSubArray = maxSubArray(array, midPosition);
rightSubArray = maxSubArray(array, (arraySize - midPosition));
for (int j = midPosition; j < arraySize; j++)
{
sum = sum + array[j];
if (sum > rightSubArraySum)
rightSubArraySum = sum;
}
sum = 0;
for (int k = (midPosition - 1); k >= 0; k--)
{
sum = sum + array[k];
if (sum > leftSubArraySum)
leftSubArraySum = sum;
}
}
int maxSubArraySum = 0;
if (leftSubArraySum > rightSubArraySum)
{
maxSubArraySum = leftSubArraySum;
}
else maxSubArraySum = rightSubArraySum;
return max(maxSubArraySum, (leftSubArraySum + rightSubArraySum));
}
int kadane_Algorithm(int array[], int User_size)
{
int maxsofar=0, maxending=0, i;
for (i=0; i < User_size; i++)
{
maxending += array[i];
if (maxending < 0)
{
maxending = 0 ;
}
if (maxsofar < maxending)
{
maxsofar = maxending;
}
}
return maxsofar; // the best sum seen, not the current running sum
}
Output is as follows: (just used a snippet for visualization)
29001 0 0 0
29101 0 10 0
29201 0 0 0
29301 0 0 0
29401 0 10 0
29501 0 0 0
29601 0 0 0
29701 0 0 0
29801 0 10 0
29901 0 0 0
Your problem is probably that clock() doesn't have enough resolution: you truly are taking zero time to within the precision allowed. And if not, then it's almost surely that the compiler is optimizing away the computation since it's computing something that isn't being used.
By default, you really should be using the <chrono> header rather than the antiquated C facilities for timing. In particular, high_resolution_clock tends to be good for measuring relatively quick things should you find yourself really needing that, although steady_clock is the one guaranteed to be monotonic.
Accurately benchmarking things is a nontrivial exercise, and you really should read up on how to do it properly. There are a surprising number of issues involving things like cache or surprising compiler optimizations or variable CPU frequency that many programmers have never even thought about before, and ignoring them can lead to incorrect conclusions.
One particular aspect is that you should generally arrange to time things that actually take some time; e.g. run whatever it is you're testing on thousands of different inputs, so that the duration is, e.g., on the order of a whole second or more. This tends to improve both the precision and the accuracy of your benchmarks.
Thanks for all the help guys! It seems to be some kind of issue with any Windows/Windows emulation software.
I decided to boot up Ubuntu and give it a shot there instead, and voila! I get a perfect output :)
I guess Open Source really is the best way to go :D
I hesitate to ask this question because there is probably something wrong with my C++ template program, but this problem has been bugging me for the past couple of hours. I am running the exact same program on the Visual C++ and MinGW g++ compilers, but only VC2010 is giving me the expected results. I am not a proficient C++ programmer by any means, so not getting any error messages from either compiler is even more frustrating.
Edit : I did mingw-get upgrade after failing to resolve the error. I was running g++ 4.5.2 and now I have version 4.7.2 but the problem persists.
Late Update - I did a complete uninstall of MinGW platform, manually removed every folder and then installed TDM-GCC but the problem persists. Maybe there is some conflict with my Windows Installation. I have installed Cygwin and g++ 4.5.3 for the time being (It is working) as OS reinstallation isn't really an option right now. Thanks for all the help.
Here is my code. (Header File itertest.h)
#ifndef ITERTEST_H
#define ITERTEST_H
#include <iostream>
#include <cmath>
#include <vector>
#include <string>
#include <algorithm>
#include <functional>
using namespace std;
template <typename T>
class fft_data{
public:
vector<T> re;
vector<T> im;
};
template <typename T>
void inline twiddle(fft_data<T> &vec,int N,int radix){
// Calculates twiddle factors for radix-2
T PI2 = (T) 6.28318530717958647692528676655900577;
T theta = (T) PI2/N;
vec.re.resize(N/radix,(T) 0.0);
vec.im.resize(N/radix,(T) 0.0);
vec.re[0] = (T) 1.0;
for (int K = 1; K < N/radix; K++) {
vec.re[K] = (T) cos(theta * K);
vec.im[K] = (T) sin(theta * K);
}
}
template <typename T>
void inline sh_radix5_dif(fft_data<T> &x,fft_data<T> &wl, int q, int sgn) {
int n = x.re.size();
int L = (int) pow(5.0, (double)q);
int Ls = L / 5;
int r = n / L;
T c1 = 0.30901699437;
T c2 = -0.80901699437;
T s1 = 0.95105651629;
T s2 = 0.58778525229;
T tau0r,tau0i,tau1r,tau1i,tau2r,tau2i,tau3r,tau3i;
T tau4r,tau4i,tau5r,tau5i;
T br,bi,cr,ci,dr,di,er,ei;
fft_data<T> y = x;
T wlr,wli,wl2r,wl2i,wl3r,wl3i,wl4r,wl4i;
int lsr = Ls*r;
for (int j = 0; j < Ls; j++) {
int ind = j*r;
wlr = wl.re[ind];
wli = wl.im[ind];
wl2r = wlr*wlr - wli*wli;
wl2i = 2.0*wlr*wli;
wl3r = wl2r*wlr - wli*wl2i;
wl3i= wl2r*wli + wl2i*wlr;
wl4r = wl2r*wl2r - wl2i*wl2i;
wl4i = 2.0*wl2r*wl2i;
for (int k =0; k < r; k++) {
int index = k*L+j;
int index1 = index+Ls;
int index2 = index1+Ls;
int index3 = index2+Ls;
int index4 = index3+Ls;
tau0r = y.re[index1] + y.re[index4];
tau0i = y.im[index1] + y.im[index4];
tau1r = y.re[index2] + y.re[index3];
tau1i = y.im[index2] + y.im[index3];
tau2r = y.re[index1] - y.re[index4];
tau2i = y.im[index1] - y.im[index4];
tau3r = y.re[index2] - y.re[index3];
tau3i = y.im[index2] - y.im[index3];
tau4r = c1 * tau0r + c2 * tau1r;
tau4i = c1 * tau0i + c2 * tau1i;
tau5r = sgn * ( s1 * tau2r + s2 * tau3r);
tau5i = sgn * ( s1 * tau2i + s2 * tau3i);
br = y.re[index] + tau4r + tau5i;
bi = y.im[index] + tau4i - tau5r;
er = y.re[index] + tau4r - tau5i;
ei = y.im[index] + tau4i + tau5r;
tau4r = c2 * tau0r + c1 * tau1r;
tau4i = c2 * tau0i + c1 * tau1i;
tau5r = sgn * ( s2 * tau2r - s1 * tau3r);
tau5i = sgn * ( s2 * tau2i - s1 * tau3i);
cr = y.re[index] + tau4r + tau5i;
ci = y.im[index] + tau4i - tau5r;
dr = y.re[index] + tau4r - tau5i;
di = y.im[index] + tau4i + tau5r;
int indexo = k*Ls+j;
int indexo1 = indexo+lsr;
int indexo2 = indexo1+lsr;
int indexo3 = indexo2+lsr;
int indexo4 = indexo3+lsr;
x.re[indexo]= y.re[index] + tau0r + tau1r;
x.im[indexo]= y.im[index] + tau0i + tau1i;
x.re[indexo1] = wlr*br - wli*bi;
x.im[indexo1] = wlr*bi + wli*br;
x.re[indexo2] = wl2r*cr - wl2i*ci;
x.im[indexo2] = wl2r*ci + wl2i*cr;
x.re[indexo3] = wl3r*dr - wl3i*di;
x.im[indexo3] = wl3r*di + wl3i*dr;
x.re[indexo4] = wl4r*er - wl4i*ei;
x.im[indexo4] = wl4r*ei + wl4i*er;
}
}
}
template <typename T>
void inline fftsh_radix5_dif(fft_data<T> &data,int sgn, unsigned int N) {
//unsigned int len = data.re.size();
int num = (int) ceil(log10(static_cast<double>(N))/log10(5.0));
//indrev(data,index);
fft_data<T> twi;
twiddle(twi,N,5);
if (sgn == 1) {
transform(twi.im.begin(), twi.im.end(),twi.im.begin(),bind1st(multiplies<T>(),(T) -1.0));
}
for (int i=num; i > 0; i--) {
sh_radix5_dif(data,twi,i,sgn);
}
}
#endif
main.cpp
#include "itertest.h"
using namespace std;
int main(int argc, char **argv)
{
int N = 25;
//vector<complex<double> > sig1;
fft_data<double> sig1;
for (int i =0; i < N; i++){
//sig1.push_back(complex<double>((double)1.0, 0.0));
//sig2.re.push_back((double) i);
//sig2.im.push_back((double) i+2);
sig1.re.push_back((double) 1);
sig1.im.push_back((double) 0);
}
fftsh_radix5_dif(sig1,1,N);
for (int i =0; i < N; i++){
cout << sig1.re[i] << " " << sig1.im[i] << endl;
}
cin.get();
return 0;
}
The expected Output (which I am getting from VC2010)
25 0
4.56267e-016 -2.50835e-016
2.27501e-016 -3.58484e-016
1.80101e-017 -2.86262e-016
... rest 21 rows same as the last three rows ( < 1e-015)
The Output from Mingw-g++
20 0
4.94068e-016 -2.10581e-016
2.65385e-016 -3.91346e-016
-5.76751e-017 -2.93654e-016
5 0
-1.54508 -4.75528
-3.23032e-017 1.85061e-017
-4.68253e-017 -1.18421e-016
-6.32003e-017 -2.05833e-016
1.11022e-016 0
4.04508 -2.93893
8.17138e-017 6.82799e-018
3.5246e-017 9.06767e-017
-6.59101e-017 -1.62762e-016
1.11022e-016 0
4.04508 2.93893
-6.28467e-017 6.40636e-017
1.79807e-016 3.34411e-017
-6.94919e-017 -1.05831e-016
1.11022e-016 0
-1.54508 4.75528
5.70402e-017 -1.68674e-017
-1.36169e-016 -8.30473e-017
-9.75639e-017 3.40359e-016
1.11022e-016 0
There must be something wrong with your MinGW installation. You might have an out-of-date, buggy version of GCC. The unofficial TDM-GCC distribution usually has a more up-to-date version: http://tdm-gcc.tdragon.net/
When I compile your code with GCC 4.6.3 on Ubuntu, it produces the output below, which appears to match the VC2010 output exactly (but I can't verify this, since you didn't provide it in full). Adding the options -O3 -ffast-math -march=native doesn't seem to change anything.
Note that I had to fix an obvious typo in fftsh_radix5_dif (a missing closing angle bracket in the list of template arguments to multiplies), but I assume you do not have it in your code, since it wouldn't compile at all.
25 0
4.56267e-16 -2.50835e-16
2.27501e-16 -3.58484e-16
1.80101e-17 -2.86262e-16
-5.76751e-17 -1.22566e-16
8.88178e-16 0
9.45774e-17 1.19479e-17
1.27413e-16 -5.04465e-17
7.97139e-17 -9.63575e-17
1.35142e-17 -7.08438e-17
8.88178e-16 0
4.84283e-17 4.54772e-17
1.02473e-16 2.63107e-17
1.02473e-16 -2.63107e-17
4.84283e-17 -4.54772e-17
8.88178e-16 0
1.35142e-17 7.08438e-17
7.97139e-17 9.63575e-17
1.27413e-16 5.04465e-17
9.45774e-17 -1.19479e-17
8.88178e-16 0
-5.76751e-17 1.22566e-16
1.80101e-17 2.86262e-16
2.27501e-16 3.58484e-16
4.56267e-16 2.50835e-16
Check the creation date of the executable you're running.
You may be running an earlier draft of your program.
I'm fairly new to C++ and I'm attempting to learn how to use pointers. I have the following file that creates coordinates and then moves them in random directions using a random number generator.
The value sigmaf_point is inputted from a text file:
void methane_coords(double *&sigmaf_point)
{
double dummy_int = 1; // number of values to read
string dummystring;
string s;
ifstream Dfile;
std::stringstream out;
// Build the file name "1.TXT"
out << 1;
s = out.str() + ".TXT";
Dfile.open (s.c_str());
if (Dfile.fail())
{
return;
}
// Read the values into the caller's array
for (int i=0; i<dummy_int; i++)
{
Dfile >> sigmaf_point[i];
}
}
Which I then use in another function:
double initial_energy(double **coords_fluid, const double *box_size){
// Loop over all pairs of atoms and calculate the LJ energy
double total_energy = 0;
for (int i = 0; i <= n_atoms-1; i++)
{
sf1=sigmaf_point(coords_fluid[i][3]);
ef1=epsilonf_point(coords_fluid[i][3]);
// Energy fluid-fluid
for (int j = i+1; j <= n_atoms-1; j++)
{
sf2=sigmaf_point(coords_fluid[j][3]);
ef2=epsilonf_point(coords_fluid[j][3]);
double delta_x = coords_fluid[j][0] - coords_fluid[i][0];
double delta_y = coords_fluid[j][1] - coords_fluid[i][1];
double delta_z = coords_fluid[j][2] - coords_fluid[i][2];
// Apply periodic boundaries
delta_x = make_periodic(delta_x, box_size[0]);
delta_y = make_periodic(delta_y, box_size[1]);
delta_z = make_periodic(delta_z, box_size[2]);
// Calculate the LJ potential
s=(sf1+sf2)/2.0;
e=pow((ef1*ef2),0.5);
double r = pow((delta_x*delta_x) + (delta_y*delta_y) +
(delta_z*delta_z),0.5)/s;
// Standard 12-6 Lennard-Jones form; r is already scaled by s
double e_lj = 4*e*((1/pow(r,12.0))-(1/pow(r,6.0)));
total_energy = (total_energy + e_lj);
}
}
return total_energy;
}
coords_fluid is created in the main file like so:
double **coords_fluid = new double*[5000];
Now the problem is with sf1=sigmaf_point(coords_fluid[i][3]);
I get the error "expression must have pointer to function type" for sigmaf_point. I'm a bit confused about this, I know it's about how I call the variable but can't seem to fix it.
Cheers
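First of all: a reference to a pointer is unnecessary here. The function only writes through the pointer and never changes the pointer itself, and a pointer already gives it access to the caller's data.
So change double *& to double * or double &. It avoids an extra level of indirection.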
Besides I see that you're using sigmaf_point as a function and as an array.
Which one is it?
Could you give the declaration of sigmaf_point?
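Assuming it's an array, change
sf1 = sigmaf_point(coords_fluid[i][3]);
to (with a cast, because an array index must be an integer and coords_fluid holds doubles)
sf1 = sigmaf_point[static_cast<int>(coords_fluid[i][3])];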