Weird behaviour in a for loop changing the results - c++

I've got a weird problem in my code.
Here's the context: in my method I create an object and then fill the (int) buffer of this object with data in two for loops.
The problem is that when I insert a printf in my loop to look at the data going into my buffer, it changes the data in the buffer.
In other words, the result in the buffer differs depending on whether there is a printf inside the loop.
Here's my code; maybe it helps to understand:
bool Mod::Realiser(FFTResult * inputdata,FFTSample_s * & moduleData){
bool done = true;
float module;
unsigned int r,n;
moduleData = new FFTSample_s(NbPointsSample);
unsigned int limit = NbPointsSample >> 1;
int iGain= 0;
for (n = CentrageFFT, r = 0; r < limit; n++, r++)
{
module = inputdata->buffer[n][0] * inputdata->buffer[n][0] + inputdata->buffer[n][1] * inputdata->buffer[n][1];
// printf(" M = %lf\n",module);
moduleData->buffer[r] = (int)(10.0*log10(module)) + iGain;
}
for (n = 0; n < limit; n++, r++)
{
module = inputdata->buffer[n][0] * inputdata->buffer[n][0] + inputdata->buffer[n][1] * inputdata->buffer[n][1];
moduleData->buffer[r] = (int)(10.0*log10(module)) + iGain;
}
/* for (int i=0;i<2048;i++){
printf(" X = %lf \n",inputdata->buffer[i][0]);
printf(" Y = %lf \n",inputdata->buffer[i][1]);
printf(" M = %d\n",moduleData->buffer[i]);
}*/
return done;
}

This is normal behavior. See "What Every Computer Scientist Should Know About Floating-Point Arithmetic". By passing a floating-point number to printf, you probably force the implementation to convert it from an internal form that happens to have higher precision (for example, an extended-precision register) into its canonical float form. The (int) cast then truncates, so a difference in the last bits of module can become a difference of 1 in the buffer.
The results can be different. There is not one and only one right answer.
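To illustrate why a last-bit difference can show up in an int buffer, here is a minimal sketch. The helper names (`to_db_truncated`, `to_db_rounded`, `just_below`) are hypothetical and not part of the question's code; the sketch only shows how truncation behaves next to a whole number.

```cpp
#include <cmath>

// Hypothetical helper: truncation toward zero, as the (int) cast in the question does.
int to_db_truncated(double db) { return static_cast<int>(db); }

// Hypothetical helper: round to nearest, which tolerates last-bit noise near whole numbers.
int to_db_rounded(double db) { return static_cast<int>(std::lround(db)); }

// Largest double strictly below a positive x (one unit in the last place lower).
double just_below(double x) { return std::nextafter(x, 0.0); }
```

With these, `to_db_truncated(30.0)` is 30 but `to_db_truncated(just_below(30.0))` is 29, while `to_db_rounded` gives 30 for both. Rounding only moves the sensitive points to the half-way values; it does not make the computation deterministic across compilers.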
Also:
"The result of a + b is stored in a temporary destination of unspecified precision. Neither the C++ or IEEE standards mandate what precision intermediate calculations are done to and this intermediate precision will affect your results. The temporary result could equally easily be stored in a float or a double and there are significant advantages to both options. " - Floating-Point Determinism

Related

Arithmetic Coding FPAQ0 (a simple order-0 arithmetic file compressor )

I am trying to understand the code of the fpaq0 arithmetic compressor, but I am not able to fully understand it. Here is the link to the code: fpaq0.cpp
I do not understand exactly how ct[512][2] and cxt work. I am also not very clear on how the decoder works, or why e.encode(0) is called before encoding every character.
NOTE: I have understood the arithmetic coder presented in the link "Data Compression with Arithmetic Encoding".
void update(int y) {
if (++ct[cxt][y] > 65534) {
ct[cxt][0] >>= 1;
ct[cxt][1] >>= 1;
}
if ((cxt+=cxt+y) >= 512)
cxt=1;
}
// Assume a stationary order 0 stream of 9-bit symbols
int p() const {
return 4096*(ct[cxt][1]+1)/(ct[cxt][0]+ct[cxt][1]+2);
}
inline void Encoder::encode(int y) {
// Update the range
const U32 xmid = x1 + ((x2-x1) >> 12) * predictor.p();
assert(xmid >= x1 && xmid < x2);
if (y)
x2=xmid;
else
x1=xmid+1;
predictor.update(y);
// Shift equal MSB's out
while (((x1^x2)&0xff000000)==0) {
putc(x2>>24, archive);
x1<<=8;
x2=(x2<<8)+255;
}
}
inline int Encoder::decode() {
// Update the range
const U32 xmid = x1 + ((x2-x1) >> 12) * predictor.p();
assert(xmid >= x1 && xmid < x2);
int y=0;
if (x<=xmid) {
y=1;
x2=xmid;
}
else
x1=xmid+1;
predictor.update(y);
// Shift equal MSB's out
while (((x1^x2)&0xff000000)==0) {
x1<<=8;
x2=(x2<<8)+255;
int c=getc(archive);
if (c==EOF) c=0;
x=(x<<8)+c;
}
return y;
}
fpaq0 is a file compressor that uses an order-0 bitwise model for modeling and a carry-less arithmetic coder with 12-bit probabilities for the entropy coding stage. ct[512][2] stores the counters for each context, from which the bit probabilities are computed. The context (order-0 in fpaq0) is built from the bits of the current byte seen so far, with a leading one (to simplify calculations).
To keep the explanation simple, let's skip the EOF symbol for now. Without the EOF symbol, the order-0 context is calculated as follows (simplified):
// Full byte encoding
int cxt = 1; // context starts with leading one
for (int i = 0; i < 8; ++i) {
// Encoding part
int y = ReadNextBit();
int p = GetProbability(cxt);
EncodeBit(y, p);
// Model updating
UpdateCounter(cxt, y); // Update related counter
cxt = (cxt << 1) | y; // shift left and insert new bit
}
For decoding, the context is used in the same way (again simplified, without the EOF symbol):
// Full byte decoding
int cxt = 1; // context starts with leading one
for (int i = 0; i < 8; ++i) {
// Decoding part
int p = GetProbability(cxt);
int y = DecodeBit(p);
WriteBit(y);
// Model updating
UpdateCounter(cxt, y); // Update related counter
cxt = (cxt << 1) | y; // shift left and insert new bit
}
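As a concrete illustration of the context walk in the two loops above, the following sketch lists the contexts visited while coding one byte. `context_path` is a hypothetical helper, and most-significant-bit-first order is an assumption about how the bits are fed in.

```cpp
#include <vector>

// Hypothetical helper: contexts used to predict each of the 8 bits of `byte`,
// assuming bits are coded most significant first and no EOF flag is involved.
std::vector<int> context_path(unsigned char byte) {
    std::vector<int> path;
    int cxt = 1;                      // context starts with a leading one
    for (int i = 7; i >= 0; --i) {
        path.push_back(cxt);          // context used to predict this bit
        int y = (byte >> i) & 1;
        cxt = (cxt << 1) | y;         // shift left and insert the new bit
    }
    return path;
}
```

For the byte 0x41 (binary 01000001) this yields 1, 2, 5, 10, 20, 40, 80, 160. The leading one is what keeps the prefixes "0", "00" and "000" in different slots of the counter table.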
fpaq0 is designed as a streaming compressor, meaning that it doesn't need to know the exact length of the input stream. So how does the decoder know when to stop? The EOF flag is used exactly for that. Before every byte, a zero bit is encoded as a flag to indicate that more data follows; a one bit indicates that the end of the stream has been reached. That is why e.encode(0) is called before each character, and why the context model is 9 bits wide (EOF flag + 8 data bits).
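The framing can be sketched by listing the bits that would be handed to the encoder. `framed_bits` is a hypothetical helper that only collects the bits; it does no arithmetic coding, and the MSB-first order is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the sequence of bits passed to encode() for `input`.
std::vector<int> framed_bits(const std::string& input) {
    std::vector<int> bits;
    for (unsigned char c : input) {
        bits.push_back(0);                        // flag: another byte follows
        for (int i = 7; i >= 0; --i) {
            bits.push_back((c >> i) & 1);         // the 8 data bits, MSB first
        }
    }
    bits.push_back(1);                            // flag: end of stream
    return bits;
}
```

For the one-byte input "A" (0x41) this gives 0, then 0 1 0 0 0 0 0 1, then 1. An empty input produces just the single end-of-stream bit.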
Now, the last part: probability calculation. fpaq0 uses just the counts of past bits seen under the order-0 context to calculate the final probability.
n0 = count of 0
n1 = count of 1
p = n1 / (n0 + n1)
There are two implementation details that should be addressed: counter overflow and division by zero.
Counter overflow is addressed by halving both counts when one of them reaches a threshold. Since p depends only on the ratio of the counts, halving both roughly preserves it.
Division by zero is addressed by adding one to each count in the formula. So,
p = (n1 + 1) / ((n0 + 1) + (n1 + 1))
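The two details can be put together in a small sketch of a single counter pair. `BitCounter` and `p12` are hypothetical names; the scaling by 4096 and the 65534 threshold mirror the `p()` and `update()` functions quoted in the question, but this is a simplified stand-in for one row of `ct`, not the original class.

```cpp
// Hypothetical stand-in for one row ct[cxt][0..1] of the counter table.
struct BitCounter {
    int n0 = 0;  // number of 0 bits seen in this context
    int n1 = 0;  // number of 1 bits seen in this context

    // Count bit y; halve both counts when the incremented one exceeds `limit`.
    void update(int y, int limit = 65534) {
        int& n = y ? n1 : n0;
        if (++n > limit) {
            n0 >>= 1;
            n1 >>= 1;
        }
    }

    // Probability that the next bit is 1, scaled to 12 bits (0..4095).
    int p12() const { return 4096 * (n1 + 1) / (n0 + n1 + 2); }
};
```

A fresh counter predicts 2048 (one half). After two 1 bits it predicts 4096*3/4 = 3072, and after three 0 bits it predicts 4096*1/5 = 819.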

Weird but close fft and ifft of image in c++

I wrote a program that loads, saves, and performs the fft and ifft on black and white png images. After much debugging headache, I finally got some coherent output only to find that it distorted the original image.
(Images omitted: input, fft output, ifft output.)
As far as I have tested, the pixel data in each array is stored and converted correctly. Pixels are stored in two arrays, 'data' which contains the b/w value of each pixel and 'complex_data' which is twice as long as 'data' and stores real b/w value and imaginary parts of each pixel in alternating indices. My fft algorithm operates on an array structured like 'complex_data'. After code to read commands from the user, here's the code in question:
if (cmd == "fft")
{
if (height > width) size = height;
else size = width;
N = (int)pow(2.0, ceil(log((double)size)/log(2.0)));
temp_data = (double*) malloc(sizeof(double) * width * 2); //array to hold each row of the image for processing in FFT()
for (i = 0; i < (int) height; i++)
{
for (j = 0; j < (int) width; j++)
{
temp_data[j*2] = complex_data[(i*width*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*width*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) width; j++)
{
complex_data[(i*width*2)+(j*2)] = temp_data[j*2];
complex_data[(i*width*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, width, height); //tested
free(temp_data);
temp_data = (double*) malloc(sizeof(double) * height * 2);
for (i = 0; i < (int) width; i++)
{
for (j = 0; j < (int) height; j++)
{
temp_data[j*2] = complex_data[(i*height*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*height*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) height; j++)
{
complex_data[(i*height*2)+(j*2)] = temp_data[j*2];
complex_data[(i*height*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, height, width);
free(temp_data);
free(data);
data = complex_to_real(complex_data, image.size()/4); //tested
image = bw_data_to_vector(data, image.size()/4); //tested
cout << "*** fft success ***" << endl << endl;
}
void FFT(double* data, unsigned long nn, int f_or_b){ // f_or_b is 1 for fft, -1 for ifft
unsigned long n, mmax, m, j, istep, i;
double wtemp, w_real, wp_real, wp_imaginary, w_imaginary, theta;
double temp_real, temp_imaginary;
// reverse-binary reindexing to separate even and odd indices
// and to allow us to compute the FFT in place
n = nn<<1;
j = 1;
for (i = 1; i < n; i += 2) {
if (j > i) {
swap(data[j-1], data[i-1]);
swap(data[j], data[i]);
}
m = nn;
while (m >= 2 && j > m) {
j -= m;
m >>= 1;
}
j += m;
}
// here begins the Danielson-Lanczos section
mmax = 2;
while (n > mmax) {
istep = mmax<<1;
theta = f_or_b * (2 * M_PI/mmax);
wtemp = sin(0.5 * theta);
wp_real = -2.0 * wtemp * wtemp;
wp_imaginary = sin(theta);
w_real = 1.0;
w_imaginary = 0.0;
for (m = 1; m < mmax; m += 2) {
for (i = m; i <= n; i += istep) {
j = i + mmax;
temp_real = w_real * data[j-1] - w_imaginary * data[j];
temp_imaginary = w_real * data[j] + w_imaginary * data[j-1];
data[j-1] = data[i-1] - temp_real;
data[j] = data[i] - temp_imaginary;
data[i-1] += temp_real;
data[i] += temp_imaginary;
}
wtemp = w_real;
w_real += w_real * wp_real - w_imaginary * wp_imaginary;
w_imaginary += w_imaginary * wp_real + wtemp * wp_imaginary;
}
mmax=istep;
}
}
My ifft is the same only with the f_or_b set to -1 instead of 1. My program calls FFT() on each row, transposes the image, calls FFT() on each row again, then transposes back. Is there maybe an error with my indexing?
Not an actual answer, as this question is debug-only, so here are some hints instead:
1. Your results are really bad.
   It should look like this: (image omitted)
   The first line is the actual DFFT result.
   Re, Im and Power are amplified by a constant, otherwise you would see a black image.
   The last image is the IDFFT of the original, non-amplified Re, Im result.
   The second line is the same, but the DFFT result is wrapped by half the image size in both x and y to match the common results in most DIP/CV texts.
   As you can see, if you IDFFT the wrapped results back, the result is not correct (checkerboard mask).
2. You have just a single image as the DFFT result.
   Is it the power spectrum?
   Or did you forget to include the imaginary part? Only for viewing, or perhaps also somewhere in the computation?
3. Is your 1D DFFT working?
   For real data the result should be symmetric.
   Check the links from my comment and compare the results for some sample 1D array.
   Debug/repair your 1D FFT first and only then move to the next level.
   Do not forget to test both real and complex data.
4. Your IDFFT looks black-and-white (no gray), i.e. saturated.
   So did you amplify the DFFT results to see the image and then use that for the IDFFT instead of the original DFFT result?
   Also check that you do not round to integers somewhere along the computation.
5. Beware of (I)DFFT overflows/underflows.
   If your image pixel intensities are big and the resolution of the image is too, then your computation could lose precision. I have never seen this in images, but if your image is HDR then it is possible. This is a common problem with convolution computed by DFFT for big polynomials.
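The symmetry hint for real input can be checked against a slow reference transform. The sketch below uses a naive O(N^2) DFT with the exp(-2*pi*i*k*t/N) sign convention (an assumption; the check works with either sign). `naive_dft` and `is_conjugate_symmetric` are hypothetical helpers, not functions from the question.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive reference DFT of a real sequence; only meant for small test inputs.
std::vector<cd> naive_dft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cd> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * cd(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// For real input, X[k] must equal conj(X[N-k]) for k = 1..N-1.
bool is_conjugate_symmetric(const std::vector<cd>& X, double tol = 1e-9) {
    const std::size_t n = X.size();
    for (std::size_t k = 1; k < n; ++k) {
        if (std::abs(X[k] - std::conj(X[n - k])) > tol) return false;
    }
    return true;
}
```

For the input 1, 2, 3, 4 the reference gives 10, -2+2i, -2, -2-2i, which passes the symmetry check. Comparing a fast 1D routine against such a reference on a few short arrays is a cheap way to validate it before moving to 2D.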
Thank you everyone for your opinions. All that stuff about memory corruption, while it makes a point, is not the root of the problem. The sizes of the data I'm mallocing are not overly large, and I am freeing them in the right places. I had a lot of practice with this while learning C. The problem was not the fft algorithm either, nor even my 2D implementation of it.
All I missed was the scaling by 1/(M*N) at the very end of my ifft code. Because the image is 512x512, I needed to scale my ifft output by 1/(512*512). Also, my fft looks like white noise because the pixel data was not rescaled to fit between 0 and 255.
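The two fixes can be sketched as follows. `scale_inverse` and `rescale_to_bytes` are hypothetical helpers; the first applies the 1/(M*N) factor to an unnormalized inverse transform (it works on the interleaved real/imaginary layout too, since every element is scaled), the second linearly maps values onto 0..255 for display.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Divide every element of an unnormalized 2D inverse transform by M*N.
void scale_inverse(std::vector<double>& data, std::size_t M, std::size_t N) {
    const double factor = 1.0 / (static_cast<double>(M) * static_cast<double>(N));
    for (double& v : data) v *= factor;
}

// Linearly map arbitrary values onto 0..255 so they can be shown as pixels.
std::vector<unsigned char> rescale_to_bytes(const std::vector<double>& values) {
    std::vector<unsigned char> out(values.size(), 0);
    if (values.empty()) return out;
    const auto mm = std::minmax_element(values.begin(), values.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    if (hi == lo) return out;  // flat input: leave everything at 0
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double t = (values[i] - lo) / (hi - lo);
        out[i] = static_cast<unsigned char>(std::lround(t * 255.0));
    }
    return out;
}
```

For a 512x512 image the factor is 1/262144. For viewing a spectrum, a logarithm of the magnitude is often applied before the linear rescale, because the DC term otherwise dominates the range.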
Suggest you look at the article http://www.yolinux.com/TUTORIALS/C++MemoryCorruptionAndMemoryLeaks.html
Christophe has a good point, but he is wrong about it not being related to the problem, because it seems that using malloc/free instead of new/delete does not initialise memory or select the best data type, which could result in all the problems listed below.
Possible causes are:
The sign of a number changing somewhere. I have seen similar issues when a platform invoke has been used on a DLL and a value is passed by value instead of by reference. It is caused by memory not necessarily being empty, so when your image data enters it, Boolean maths will be performed on its values. I would suggest that you make sure the memory is empty before you put your image data there.
Memory rotating right (ROR in assembly language) or left (ROL). This will occur if data types are used which do not necessarily match, e.g. a signed value entering an unsigned data type, or if the number of bits differs between one variable and another.
Data being lost due to an unsigned value entering a signed variable. The outcome is 1 bit being lost because it is used to determine negative or positive, or, at the extremes, if two's complement takes place, the number will become inverted in meaning. Look for two's complement on Wikipedia.
Also see how memory should be cleared/assigned before use: http://www.cprogramming.com/tutorial/memory_debugging_parallel_inspector.html

float value issue

I am facing a problem using float:
in a loop its value gets stuck at 8388608.00
int count=0;
long X=10;
cout.precision(flt::digits10);
cout<<"Iteration #"<<setw(15)<<"Add"<<setw(21)<<"Mult"<<endl;
float Start=0.0;
float Multiplication = Addition * N;
long i = 1;
for (i; i <= N; i++){
float temp = Start + Addition;
Start=temp;
count++;
if(count%X==0 && count!=0)
{
X*=10;
cout<<i;
cout<<fixed<<setw(30)<<Start<<setw(20)<<fixed<<i*Addition<<endl;
}
}
What should I do?
Floating-point addition doesn't work when you're adding a (relatively) small number to a (relatively) big one. It's caused by the way a float is stored in memory: a float has a 24-bit significand, so from 8388608 (2^23) upwards adjacent floats are 1.0 apart, and an increment of 0.5 or less is rounded away.
You may try replacing the single-precision floating point (float) with a double-precision floating point (double) representation, but if that doesn't work you'll probably need to implement a hack like this:
// Let's say the real step is
double OriginalAddition = 0.123;
// Use a simple substitution: count steps in an integer instead,
// i.e. treat the value as fixed point with step 0.123, so 1 == 0.123
long long Addition = 1;
long long Start = 0;
// Inside the loop, accumulate exactly in integers:
Start += Addition;
// And when displaying the result, transform back into the original floating point:
printf("%f", (double)Start * OriginalAddition);
This needs a lot of thought to find a substitution that doesn't cause data loss, covers the required precision and won't cause the integer to overflow. Try searching for "fixed point int C" to get a better idea of what to do.
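The stall described above can be reproduced in a few lines. This is a minimal sketch with hypothetical helper names; the step of 0.5 and the starting value just below 2^23 are chosen for illustration and are not taken from the question, which does not show its Addition value.

```cpp
// Hypothetical helper: add `step` to `start` n times in single precision.
float add_steps_float(float start, float step, int n) {
    float sum = start;
    for (int i = 0; i < n; ++i) sum = sum + step;
    return sum;
}

// Same accumulation in double precision.
double add_steps_double(double start, double step, int n) {
    double sum = start;
    for (int i = 0; i < n; ++i) sum = sum + step;
    return sum;
}
```

Starting from 8388607 and adding 0.5 ten times, the float version reaches 8388608 after two steps and then stays there, while the double version ends at 8388612. Double only postpones the effect: with a 53-bit significand the same stall appears at much larger magnitudes.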

How can you convert a std::bitset<64> to a double?

Is there a way to convert a std::bitset<64> to a double without using any external library (Boost, etc.)? I am using a bitset to represent a genome in a genetic algorithm and I need a way to convert a set of bits to a double.
The C++11 road:
union Converter { uint64_t i; double d; };
double convert(std::bitset<64> const& bs) {
Converter c;
c.i = bs.to_ullong();
return c.d;
}
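EDIT: As noted in the comments, reading a union member other than the one last written is undefined behavior in C++. We can copy the bytes through char* aliasing instead, since the result of that is unspecified rather than undefined.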
double convert(std::bitset<64> const& bs) {
static_assert(sizeof(uint64_t) == sizeof(double), "Cannot use this!");
uint64_t const u = bs.to_ullong();
double d;
// Aliases to `char*` are explicitly allowed in the Standard (and only them)
char const* cu = reinterpret_cast<char const*>(&u);
char* cd = reinterpret_cast<char*>(&d);
// Copy the bitwise representation from u to d
memcpy(cd, cu, sizeof(u));
return d;
}
C++11 is still required for to_ullong.
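For completeness, here is a self-contained variant of the memcpy approach together with the reverse direction. It is a sketch under the assumption that double is a 64-bit IEEE-754 type stored with the same byte order as 64-bit integers; the function names are hypothetical.

```cpp
#include <bitset>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(std::uint64_t) == sizeof(double), "double must be 64 bits wide");
static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEEE-754 doubles");

// Reinterpret the 64 bits as the object representation of a double.
double bitset_to_double(const std::bitset<64>& bs) {
    const std::uint64_t u = bs.to_ullong();
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Reverse direction: expose the bits of a double as a bitset.
std::bitset<64> double_to_bitset(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return std::bitset<64>(u);
}
```

Under those assumptions the bit pattern 0x3FF0000000000000 maps to 1.0. Note that arbitrary bit patterns also include NaNs and infinities, which a genetic algorithm would have to handle.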
Most people are trying to provide answers that let you treat the bit-vector as though it directly contained an encoded int or double.
I would advise you to completely avoid that approach. While it does "work" for some definition of working, it introduces Hamming cliffs all over the place: neighbouring values whose bit patterns differ in many positions. You usually want your encoding to arrange things so that if two decoded values are near one another, then their encoded values are near one another as well. It also forces you to use 64 bits of precision.
I would manage the conversion manually. Say you have three variables to encode, x, y, and z. Your domain expertise can be used to say, for example, that -5 <= x < 5, 0 <= y < 100, and 0 <= z < 1, where you need 8 bits of precision for x, 12 bits for y, and 10 bits for z. This gives you a total search space of only 30 bits. You can have a 30-bit string, treat the first 8 bits as encoding x, the next 12 as y, and the last 10 as z. You are also free to Gray code each one to remove the Hamming cliffs.
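A compact sketch of the two ingredients, Gray coding and mapping an integer level onto a range, is shown here. All names are hypothetical. `decode_param` maps onto the closed interval [lo, hi] by dividing by 2^bits - 1, which is one possible convention rather than the only one.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Binary-reflected Gray code: adjacent integers differ in exactly one bit.
std::uint32_t to_gray(std::uint32_t b) { return b ^ (b >> 1); }

// Inverse of to_gray, computed as a prefix XOR over the bits.
std::uint32_t from_gray(std::uint32_t g) {
    std::uint32_t b = g;
    for (std::uint32_t shift = 1; shift < 32; shift <<= 1) b ^= b >> shift;
    return b;
}

// Number of bit positions in which a and b differ.
std::size_t hamming(std::uint32_t a, std::uint32_t b) {
    return std::bitset<32>(a ^ b).count();
}

// Map an integer level in [0, 2^num_bits - 1] linearly onto [lo, hi].
// Assumes 1 <= num_bits <= 32.
double decode_param(std::uint32_t level, unsigned num_bits, double lo, double hi) {
    const double max_level = static_cast<double>((1ULL << num_bits) - 1);
    return lo + (hi - lo) * static_cast<double>(level) / max_level;
}
```

The cliff between 7 (0111) and 8 (1000) is four bit flips in plain binary, but only one in Gray code (0100 versus 1100). With 8 bits and the range -5 to 5, level 0 decodes to -5 and level 255 to 5.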
I've personally done the following in the past:
inline void binary_encoding::encode(const vector<double>& params)
{
unsigned int start=0;
for(unsigned int param=0; param<params.size(); ++param) {
// m_bpp[i] = number of bits in encoding of parameter i
unsigned int num_bits = m_bpp[param];
// map the double onto the appropriate integer range
// m_range[i] is a pair of (min, max) values for ith parameter
pair<double,double> prange=m_range[param];
double range=prange.second-prange.first;
double max_bit_val=pow(2.0,static_cast<double>(num_bits))-1;
int int_val=static_cast<int>((params[param]-prange.first)*max_bit_val/range+0.5);
// convert the integer to binary
vector<int> result(m_bpp[param]);
for(unsigned int b=0; b<num_bits; ++b) {
result[b]=int_val%2;
int_val/=2;
}
if(m_gray) {
for(unsigned int b=0; b<num_bits-1; ++b) {
result[b]=!(result[b]==result[b+1]);
}
}
// insert the bits into the correct spot in the encoding
copy(result.begin(),result.end(),m_genotype.begin()+start);
start+=num_bits;
}
}
inline void binary_encoding::decode()
{
unsigned int start = 0;
// for each parameter
for(unsigned int param=0; param<m_bpp.size(); param++) {
unsigned int num_bits = m_bpp[param];
unsigned int intval = 0;
if(m_gray) {
// convert from gray to binary
vector<int> binary(num_bits);
binary[num_bits-1] = m_genotype[start+num_bits-1];
intval = binary[num_bits-1];
for(int i=num_bits-2; i>=0; i--) {
binary[i] = !(binary[i+1] == m_genotype[start+i]);
intval += intval + binary[i];
}
}
else {
// convert from binary encoding to integer
for(int i=num_bits-1; i>=0; i--) {
intval += intval + m_genotype[start+i];
}
}
// convert from integer to double in the appropriate range
pair<double,double> prange = m_range[param];
double range = prange.second - prange.first;
double m = range / (pow(2.0,double(num_bits)) - 1.0);
// m_phenotype is a vector<double> containing all the decoded parameters
m_phenotype[param] = m * double(intval) + prange.first;
start += num_bits;
}
}
Note that for reasons that probably don't matter to you, I wasn't using bit vectors, just an ordinary vector<int> to encode things. And of course, there's a bunch of stuff tied into this code that isn't shown here, but you can probably get the basic idea.
One other note, if you're doing GPU calculations or if you have a particular problem such that 64 bits are the appropriate size anyway, it may be worth the extra overhead to stuff everything into native words. Otherwise, I would guess that the overhead you add to the search process will probably overwhelm whatever benefits you get by faster encoding and decoding.
Edit: I've decided that I was being a bit silly with this. While you do end up with a double, it assumes that the bitset holds an integer, which is a big assumption to make. You will end up with a predictable and repeatable value per bitset, but I still don't think that this is what the author intended.
Well, if you iterate over the bit values and do
output_double += pow( 2, 64-(bit_position+1) ) * bit_value;
that would work, as long as the bitset is big-endian.

weird performance in C++ (VC 2010)

I have this loop written in C++, that compiled with MSVC2010 takes a long time to run. (300ms)
for (int i=0; i<h; i++) {
for (int j=0; j<w; j++) {
if (buf[i*w+j] > 0) {
const int sy = max(0, i - hr);
const int ey = min(h, i + hr + 1);
const int sx = max(0, j - hr);
const int ex = min(w, j + hr + 1);
float val = 0;
for (int k=sy; k < ey; k++) {
for (int m=sx; m < ex; m++) {
val += original[k*w + m] * ds[k - i + hr][m - j + hr];
}
}
heat_map[i*w + j] = val;
}
}
}
It seemed a bit strange to me, so I did some tests then changed a few bits to inline assembly: (specifically, the code that sums "val")
for (int i=0; i<h; i++) {
for (int j=0; j<w; j++) {
if (buf[i*w+j] > 0) {
const int sy = max(0, i - hr);
const int ey = min(h, i + hr + 1);
const int sx = max(0, j - hr);
const int ex = min(w, j + hr + 1);
__asm {
fldz
}
for (int k=sy; k < ey; k++) {
for (int m=sx; m < ex; m++) {
float val = original[k*w + m] * ds[k - i + hr][m - j + hr];
__asm {
fld val
fadd
}
}
}
float val1;
__asm {
fstp val1
}
heat_map[i*w + j] = val1;
}
}
}
Now it runs in half the time, 150ms. It does exactly the same thing, but why is it twice as quick? In both cases it was run in Release mode with optimizations on. Am I doing anything wrong in my original C++ code?
I suggest you try different floating-point calculation models supported by the compiler - precise, strict or fast (see /fp option) - with your original code before making any conclusions. I suspect that your original code was compiled with some overly restrictive floating-point model (not followed by your assembly in the second version of the code), which is why the original is much slower.
In other words, if the original model was indeed too restrictive, then you were simply comparing apples to oranges. The two versions didn't really do the same thing, even though it might seem so at first sight.
Note, for example, that in the first version of the code the intermediate sum is accumulated in a float value. If it was compiled with precise model, the intermediate results would have to be rounded to the precision of float type, even if the variable val was optimized away and the internal FPU register was used instead. In your assembly code you don't bother to round the accumulated result, which is what could have contributed to its better performance.
I'd suggest you compile both versions of the code in /fp:fast mode and see how their performances compare in that case.
A few things to check out:
1. You need to check that it actually is the same code. As in, are your inline assembly statements exactly the same as those generated by the compiler? I can see three potential differences (potential because they may be optimised out). The first is the initial setting of val to zero, the second is the extra variable val1 (unlikely, since it will most likely just change the constant subtraction of the stack pointer), and the third is that your inline assembly version may not put the interim results back into val.
2. You need to make sure your sample space is large. You didn't mention whether you'd done only one run of each version or a hundred runs but, the more runs, the better, so as to remove the effect of "noise" in your statistics.
3. An even better measurement would be CPU time rather than elapsed time. Elapsed time is subject to environmental changes (like your virus checker or one of your services deciding to do something at the time you're testing). The large sample space will alleviate, but not necessarily solve, this.
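The last two points can be sketched as a small timing harness that runs the work many times and reports the median. `median` and `median_cpu_ms` are hypothetical helpers. `std::clock` is specified by the C standard to report processor time, but the Microsoft runtime is documented as returning elapsed wall-clock time, so on Windows a call such as GetProcessTimes would be needed for true CPU time.

```cpp
#include <algorithm>
#include <cstddef>
#include <ctime>
#include <vector>

// Median of a set of samples (0 for an empty set).
double median(std::vector<double> v) {
    if (v.empty()) return 0.0;
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return (n % 2) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Run `work` `runs` times and return the median time per run in milliseconds,
// as measured by std::clock.
template <typename F>
double median_cpu_ms(F&& work, int runs) {
    std::vector<double> samples;
    for (int r = 0; r < runs; ++r) {
        const std::clock_t t0 = std::clock();
        work();
        const std::clock_t t1 = std::clock();
        samples.push_back(1000.0 * static_cast<double>(t1 - t0) / CLOCKS_PER_SEC);
    }
    return median(samples);
}
```

Each version of the loop would be wrapped in a lambda and passed as `work`. The median is used here instead of the mean because a single run disturbed by another process then has little influence on the reported figure.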