I'm writing code that reads an image, processes it, and produces a Mat of double/float. I save the Mat to a file and read it back from that file later.
With double, a 1K x 1K image takes 8 MB; with float it takes 4 MB. So I want to use float.
Here is my code and output:
Mat data = readFloatFile("file_location");
cout << data.at<float>(0,0) << " " << data.at<double>(0,0);
When I run this code in DEBUG mode, the float printout is -0 and the double access throws an exception (assertion failed). In RELEASE mode, the float printout is -0 and the double printout is 0.832, which is the true value.
My question is: why can't I get the correct output with data.at<float>(0,0), and why don't I get an exception with data.at<double>(0,0) in RELEASE mode, as I would expect?
EDIT: Here is my code which writes and reads
void writeNoiseFloat(string imageName,string fingerprintname) throw(){
Mat noise = getNoise(imageName);
FILE* fp = fopen(fingerprintname.c_str(),"wb");
if (!fp){
cout << "not found ";
perror("fopen");
}
float *buffer = new float[noise.cols];
for(int i=0;i<noise.rows;++i){
for(int j=0;j<noise.cols;++j)
buffer[j]=noise.at<float>(i,j);
fwrite(buffer,sizeof(float),noise.cols,fp);
}
fclose(fp);
free(buffer);
}
void readNoiseFloat(string fpath,Mat& data){
clock_t start = clock();
cout << fpath << endl;
FILE* fp = fopen(fpath.c_str(),"rb");
if (!fp)perror("fopen");
int size = 1024;
data.create(size,size,CV_32F);
float* buffer= new float[size];
for(int i=0;i<size;++i) {
fread(buffer,sizeof(float),size,fp);
for(int j=0;j<size;++j){
data.at<float>(i,j)=buffer[j];
cout << data.at<float>(i,j) << " " ;
cout << data.at<double>(i,j);
}
}
fclose(fp);
}
Thanks in advance,
First of all, you cannot mix float and double in one cv::Mat, because the storage itself is just an array of bytes. The size of that array differs between a matrix of float and a matrix of double.
So you have to decide which one you are using.
Essentially, data.at<type>(x,y) is equivalent to (type*)data_ptr[x][y] (note this is not exact code; its purpose is to show what is happening).
EDIT:
Based on the code you added, you are creating a CV_32F matrix, which means you must use float to write and read an element. Using double reinterprets the stored bytes and will definitely give you an incorrect result.
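To make the reinterpretation concrete, here is a minimal sketch in plain C++ (no OpenCV). The helpers packFloats and readAs are hypothetical names invented for this illustration; they only mimic a matrix whose storage is a byte array holding floats.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: pack floats into raw bytes, similar in spirit to
// how a CV_32F matrix keeps its elements in a byte array.
std::vector<unsigned char> packFloats(const std::vector<float>& values) {
    std::vector<unsigned char> bytes(values.size() * sizeof(float));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), values.data(), bytes.size());
    }
    return bytes;
}

// Hypothetical helper: read element `index` of the byte array as type T.
// With T = double, each read consumes the bytes of TWO stored floats.
// The caller must make sure (index + 1) * sizeof(T) <= bytes.size().
template <typename T>
T readAs(const std::vector<unsigned char>& bytes, std::size_t index) {
    T value;
    std::memcpy(&value, bytes.data() + index * sizeof(T), sizeof(T));
    return value;
}
```

Reading with readAs<float> returns the stored values; reading the same bytes with readAs<double> glues two neighbouring floats together and yields an unrelated number.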
Regarding the assertion, I am sure that inside cv::Mat::at<T> there is something like the following code:
assert(sizeof(T) == this->elemSize1());
Asserts are usually compiled only in DEBUG mode, which is why you do not get this error in RELEASE.
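The following sketch shows how such a check behaves. RawMatrix is a hypothetical stand-in written for this illustration, not OpenCV's actual implementation; the only point is that the assert line disappears when NDEBUG is defined, which is the usual setting for release builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a matrix type; NOT OpenCV's implementation.
class RawMatrix {
public:
    RawMatrix(std::size_t rows, std::size_t cols, std::size_t elemSize)
        : cols_(cols), elemSize_(elemSize), bytes_(rows * cols * elemSize) {}

    template <typename T>
    T get(std::size_t row, std::size_t col) const {
        // Compiled out when NDEBUG is defined (typical for release builds),
        // so a mismatched T is then silently accepted.
        assert(sizeof(T) == elemSize_ && "element type does not match storage");
        T value;
        std::memcpy(&value, bytes_.data() + (row * cols_ + col) * elemSize_, sizeof(T));
        return value;
    }

    template <typename T>
    void set(std::size_t row, std::size_t col, T value) {
        assert(sizeof(T) == elemSize_ && "element type does not match storage");
        std::memcpy(bytes_.data() + (row * cols_ + col) * elemSize_, &value, sizeof(T));
    }

private:
    std::size_t cols_;
    std::size_t elemSize_;
    std::vector<unsigned char> bytes_;
};
```

In a debug build, calling get<double> on a matrix created with sizeof(float) elements trips the assertion; with NDEBUG defined the same call runs unchecked and returns reinterpreted bytes.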
EDIT2:
Unrelated to the issue, but never use free() with new, or delete with malloc(). The result can be a hard-to-debug problem.
So please use delete[] for buffer.
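One way to avoid the new/free mismatch entirely is to let std::vector own the buffer. The sketch below is only an illustration under that assumption: writeFloats and readFloats are hypothetical names, and it writes the whole array in one call instead of row by row.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: write all floats to a binary file.
// Returns false if the file cannot be opened or the write is short.
bool writeFloats(const std::string& path, const std::vector<float>& values) {
    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (!fp) return false;
    const std::size_t written =
        std::fwrite(values.data(), sizeof(float), values.size(), fp);
    const bool closed = (std::fclose(fp) == 0);
    return closed && written == values.size();
}

// Hypothetical helper: read exactly `count` floats from a binary file.
// The vector releases its memory automatically, so no delete[] is needed.
bool readFloats(const std::string& path, std::size_t count, std::vector<float>& out) {
    std::FILE* fp = std::fopen(path.c_str(), "rb");
    if (!fp) return false;
    out.resize(count);
    const std::size_t got = std::fread(out.data(), sizeof(float), count, fp);
    std::fclose(fp);
    return got == count;
}
```

Checking the return values of fwrite and fread also catches the case where the file holds fewer elements (or a different element type) than the reader expects.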
Difference between debug & release:
There's a bug in your code. It just doesn't show up in Release mode. That is what the Debug build is for: its extra checks (such as assertions) tell you when something is wrong with the code, while Release just runs through it...
Also, in Release the compiler optimizes your code so that it runs faster and is smaller; the Debug build takes more space on your HD because it carries the information you need to actually DEBUG it.
Uninitialized variables can hold different values in Debug and Release builds. This varies between compilers.
Related
I have a string whose length is 1600 and I know that it contains 200 double. When I print out the string I get the following :Y���Vz'#��y'#��!U�}'#�-...
I would like to convert this string to a vector containing the 200 doubles.
Here is the code I tried (blobString is a string 1600 characters long):
string first_eight = blobString.substr(0, sizeof(double)); // I get the first 8 values of the string which should represent the first double
double double_value1;
memcpy(&double_value1, &first_eight, sizeof(double)); // First thing I tried
double* double_value2 = (double*)first_eight.c_str(); // Second thing I tried
cout << double_value1 << endl;
cout << double_value2 << endl;
This outputs the following:
6.95285e-310
0x7ffd9b93e320
--- Edit solution---
The second method works; all I had to do was look at what double_value2 was pointing to.
cout << *double_value2 << endl;
Here's an example that might get you closer to what you need. Bear in mind that unless the numbers in your blob are in the exact format that your particular C++ compiler expects, this isn't going to work like you expect. In my example I'm building up the buffer of doubles myself.
Let's start with our array of doubles.
double doubles[] = { 0.1, 5.0, 0.7, 8.6 };
Now I'll build an std::string that should look like your blob. Notice that I can't simply initialize a string with a (char *) that points to the base of my list of doubles, as it will stop when it hits the first zero byte!
std::string double_buf_str;
double_buf_str.append((char *)doubles, 4 * sizeof(double));
// A quick sanity check, should be 32
std::cout << "Length of double_buf_str "
<< double_buf_str.length()
<< std::endl;
Now I'll reinterpret the c_str() pointer as a (double *) and iterate through the four doubles.
for (auto i = 0; i < 4; i++) {
std::cout << ((double*)double_buf_str.c_str())[i] << std::endl;
}
Depending on your circumstances you might consider using a std::vector<uint8_t> for your blob, instead of an std::string. C++11 gives you a data() function that would be the equivalent of c_str() here. Turning your blob directly into a vector of doubles would give you something even easier to work with--but to get there you'd potentially have to get dirty with a resize followed by a memcpy directly into the internal array.
I'll give you an example for completeness. Note that this is of course not how you would normally initialize a vector of doubles...I'm imagining that my double_blob is just a pointer to a blob containing a known number of doubles in the correct format.
const int count = 200; // 200 doubles incoming
std::vector<double> double_vec;
double_vec.resize(count);
memcpy(double_vec.data(), double_blob, sizeof(double) * count);
for (double& d : double_vec) {
std::cout << d << std::endl;
}
@Mooning Duck brought up the great point that the result of c_str() is not necessarily aligned to an appropriate boundary, which is another good reason not to use std::string as a general-purpose blob (or at least not to interpret its contents until they are copied somewhere that guarantees valid alignment for the type you are interested in). The impact of reading a double from a non-aligned location in memory varies by architecture, which gives you a portability concern. On x86-based machines there will only be a performance impact, AFAIK, as the CPU reads across alignment boundaries and assembles the double correctly (you can test this on an x86 machine by writing and then reading back a double at successive locations in a buffer with an increasing 1-byte offset; it'll just work). On other architectures you may get a fault.
The std::vector<double> solution will not suffer from this issue due to guarantees about the alignment of newed memory built into the standard.
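Putting the pieces together, a blob held in a std::string can be copied into a std::vector<double> without ever dereferencing a possibly misaligned pointer. This is a minimal sketch; blobToDoubles is a hypothetical name, and it assumes the blob was produced on a machine with the same double representation and byte order.

```cpp
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: copy a binary blob into properly aligned doubles.
// Assumes the bytes use this machine's double format and byte order.
std::vector<double> blobToDoubles(const std::string& blob) {
    if (blob.size() % sizeof(double) != 0) {
        throw std::invalid_argument("blob size is not a multiple of sizeof(double)");
    }
    std::vector<double> out(blob.size() / sizeof(double));
    if (!out.empty()) {
        // memcpy has no alignment requirement on its source.
        std::memcpy(out.data(), blob.data(), blob.size());
    }
    return out;
}
```

For the 1600-character string from the question, this would return a vector of 200 doubles.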
Currently, I am working on a real-time interface in Visual Studio C++.
The problem I am facing is that while the buffer is being filled with data, the .exe stops responding at the point where the data is stored in the buffer. I collect data at 130 Hz from a motion sensor. I have tried increasing the computer's virtual memory, but that did not solve the problem.
Code Structure:
int main(){
int no_data = 0;
float x_abs;
float y_abs;
int sensorID = 0;
while (1){
// Define Buffer
char before_trial_output_data[][8 * 4][128] = { { { 0, }, }, };
// Collect Real Time Data
x_abs = abs(inchtocm * record[sensorID].y);
y_abs = abs(inchtocm * record[sensorID].x);
//Save in buffer
sprintf(before_trial_output_data[no_data][sensorID], "%d %8.3f %8.3f\n",no_data,x_abs,y_abs);
//Increment point
no_data++;
// Break While loop, Press ESc key
if (GetAsyncKeyState(VK_ESCAPE)){
break;
}
}
//Data Save in File
printf("\nSaving results to 'RecordData.txt'..\n");
FILE *fp3 = fopen("RecordData.dat", "w");
for (i = 0; i<no_data-1; i++)
fprintf(fp3, output_data[i][sensorID]);
fclose(fp3);
printf("Complete...\n");
}
The code you posted doesn't show how you allocate more memory for your before_trial_output_data buffer when needed. Do you want me to guess? I guess you are using some flavor of realloc(), which needs to allocate an ever-increasing amount of memory, fragmenting your heap terribly.
However, in order to save that data to a file later on, it doesn't need to be in contiguous memory, so some kind of list will work much better than an array.
Also, there is no provision in your "pseudo" code for a 130 Hz reading; it processes records as fast as possible, and my guess is that this is much faster.
Is your printf() call also "pseudo code"? Otherwise you are asking for trouble by having a mismatch between the % format specifiers and the number and types of the arguments passed in.
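As a rough sketch of the "list instead of array" idea, the samples can be kept as small structs in a std::deque, which grows without needing one contiguous block, and formatted only when they are saved. Sample, formatSample, and saveSamples are hypothetical names introduced for this illustration.

```cpp
#include <cstdio>
#include <deque>
#include <string>

// Hypothetical record for one reading; field names are assumptions.
struct Sample {
    int index;
    float x;
    float y;
};

// Format one sample; the format string matches the three arguments exactly.
std::string formatSample(const Sample& s) {
    char line[128];
    std::snprintf(line, sizeof(line), "%d %8.3f %8.3f\n", s.index, s.x, s.y);
    return line;
}

// Write all samples; fputs treats the text as data, not as a format string.
bool saveSamples(const char* path, const std::deque<Sample>& samples) {
    std::FILE* fp = std::fopen(path, "w");
    if (!fp) return false;
    for (const Sample& s : samples) {
        std::fputs(formatSample(s).c_str(), fp);
    }
    return std::fclose(fp) == 0;
}
```

Inside the acquisition loop this reduces to something like samples.push_back({no_data, x_abs, y_abs}), with no per-sample text buffer and no fixed upper bound on the number of readings.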
I have a struct in a file which is included by both the host code and the kernel
typedef struct {
float x, y, z,
dir_x, dir_y, dir_z;
int radius;
} WorklistStruct;
I'm building this struct in my C++ host code and passing it via a buffer to the OpenCL kernel.
If I choose a CPU device for computation, I get the following result:
printf ( "item:[%f,%f,%f][%f,%f,%f]%d,%d\n", item.x, item.y, item.z, item.dir_x, item.dir_y,
item.dir_z , item.radius ,sizeof(float));
Host:
item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4
Device (CPU):
item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4
And if I choose a GPU device (AMD) for computation, weird things happen:
Host:
item:[58.406261,57.786015,58.137501][2.000000,2.000000,2.000000]2,4
Device (GPU):
item:[58.406261,2.000000,0.000000][0.000000,0.000000,0.000000]0,0
Notably, sizeof(float) is garbage on the GPU.
I assume there is a problem with the layout of floats on different devices.
Note: the struct is contained in an array of structs of this type, and every struct in this array is garbage on the GPU.
Does anyone have an idea why this is the case and how I can predict it?
EDIT: I added a %d at the end and passed a 1 for it; the result is: 1065353216
EDIT: here are the two structs which I'm using
typedef struct {
float x, y, z,//base coordinates
dir_x, dir_y, dir_z;//direction
int radius;//radius
} WorklistStruct;
typedef struct {
float base_x, base_y, base_z; //base point
float radius;//radius
float dir_x, dir_y, dir_z; //initial direction
} ReturnStruct;
I tested some other things, and it looks like a problem with printf. The values seem to be right. I passed the arguments to the return struct, read them back, and these values were correct.
I don't want to post all of the related code; that would be a few hundred lines.
If no one has an idea, I will condense it a bit.
Ah, and for printing I'm using #pragma OPENCL EXTENSION cl_amd_printf : enable.
Edit:
It really looks like a problem with printf. I simply don't use it anymore.
There is a simple method to check what happens:
1 - Create host-side data & initialize it:
int num_points = 128;
std::vector<WorklistStruct> works(num_points);
std::vector<ReturnStruct> returns(num_points);
for(WorklistStruct &work : works){
work = InitializeItSomehow();
std::cout << work.x << " " << work.y << " " << work.z << std::endl;
std::cout << work.radius << std::endl;
}
// Same stuff with returns
...
2 - Create Device-side buffers using COPY_HOST_PTR flag, map it & check data consistency:
cl::Buffer dev_works(..., COPY_HOST_PTR, (void*)&works[0]);
cl::Buffer dev_rets(..., COPY_HOST_PTR, (void*)&returns[0]);
// Then map it to check data
WorklistStruct *mapped_works = dev_works.Map(...);
ReturnStruct *mapped_rets = dev_rets.Map(...);
// Output values & unmap buffers
...
3 - Check data consistency on Device side as you did previously.
Also, make sure that the code (presumably a header) which is included by both the kernel and the host-side code is pure OpenCL C (the AMD compiler can sometimes "swallow" errors), and that you've specified the include search directory when building the OpenCL kernel (the "-I" flag at the clBuildProgram stage).
Edited:
At every step, please collect return codes (or catch exceptions). Besides that, the "-Werror" flag at the clBuildProgram stage can also be helpful.
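One more host-side check that needs no OpenCL calls at all is to pin down the struct layout at compile time. The sketch below is an assumption-laden illustration: it swaps int for std::int32_t (in real host code the cl_float and cl_int typedefs from the OpenCL headers serve the same purpose), and worklistBufferBytes is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>

// Host-side mirror of the kernel struct. std::int32_t is an assumption
// made here so the field width is explicit; OpenCL C defines int as 32 bits.
struct WorklistStruct {
    float x, y, z;              // base coordinates
    float dir_x, dir_y, dir_z;  // direction
    std::int32_t radius;        // radius
};

// Fail the build if the host layout is not the packed 7 x 4 bytes
// that the kernel side is expected to see.
static_assert(sizeof(float) == 4, "host float must be 32 bits");
static_assert(sizeof(WorklistStruct) == 28, "unexpected padding in WorklistStruct");
static_assert(offsetof(WorklistStruct, radius) == 24, "radius is not at byte 24");

// Hypothetical helper: bytes needed for a buffer holding `count` structs.
constexpr std::size_t worklistBufferBytes(std::size_t count) {
    return count * sizeof(WorklistStruct);
}
```

If these assertions hold and the mapped buffer still shows the right values, the layout is unlikely to be the culprit, which points back at the printing path or the headers.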
It looks like I used the wrong OpenCL headers for compiling. If I try the code on the Intel platform(OpenCL 1.2) everything is fine. But on my AMD platform (OpenCL 1.1) I get weird values.
I will try other headers.
As far as I can tell, calling malloc() basically means the program is asking the OS for a hunk of memory. I'm writing a program to interface with a camera, in which I need to allocate chunks of memory large enough to store hundreds of images at a time (it's a fast camera).
When I allocate space for about 1.9 GB worth of images, everything works just fine. The allocation calculation is pretty simple:
int allocateBurst( int numImages )
{
int streamSize = ZIMAGESIZE * numImages;
data.images = new unsigned short [streamSize];
return 0;
}
But as soon as I go over the 2 GB limit, I get runtime errors like this:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
It seems like 2 Gigs might be the maximum size that I can allocate at once. I have 32 Gigs of ram, and would like to simply be able to allocate larger pieces of memory in one allocation. Is this possible?
I'm running Ubuntu 12.10.
There may be an underlying issue: the OS can't grant your large memory allocation because it is using memory for other applications. Check with your OS to see what the limits are.
Also be aware that some OSes will "page" memory to the hard disk. When your program accesses memory that is not currently resident, the OS swaps pages with the hard disk. Knowing this, I recommend the classic technique of "Double Buffering" or "Multiple Buffering".
You will need at least two threads: reading and writing. One thread is responsible for reading data from the camera and placing it into a buffer. When it fills up a buffer, it starts on another buffer. Meanwhile, the writing thread starts at the first buffer and writes it to disk (block file writes). When the writing thread finishes a buffer, it starts on the next one. The buffers should be arranged in a circular sequence so they can be reused.
The magic is to have enough buffers so that the reader never catches up to the writer.
Since you are using a few small buffers, you should not get any errors from the OS.
There are methods to optimize this, such as obtaining static buffers from the OS.
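The scheme above can be sketched with a small bounded queue shared by two threads. Everything here is an assumption made for illustration: Frame, FrameQueue, and transferFrames are hypothetical names, the "camera" just fabricates frames, and the "writer" only counts pixels where real code would write to disk.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using Frame = std::vector<unsigned short>;  // one image worth of pixels

// Bounded queue: at most `capacity` frames are in flight at once.
class FrameQueue {
public:
    explicit FrameQueue(std::size_t capacity) : capacity_(capacity) {}

    // Camera side: blocks while every slot is still waiting to be written.
    void push(Frame frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return frames_.size() < capacity_; });
        frames_.push(std::move(frame));
        notEmpty_.notify_one();
    }

    // Disk side: blocks until a filled frame is available.
    Frame pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !frames_.empty(); });
        Frame frame = std::move(frames_.front());
        frames_.pop();
        notFull_.notify_one();
        return frame;
    }

private:
    std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable notFull_, notEmpty_;
    std::queue<Frame> frames_;
};

// Runs a fake camera thread against a consumer; returns pixels consumed.
std::size_t transferFrames(std::size_t frameCount, std::size_t pixelsPerFrame,
                           std::size_t slots) {
    FrameQueue queue(slots);
    std::thread camera([&] {
        for (std::size_t i = 0; i < frameCount; ++i) {
            queue.push(Frame(pixelsPerFrame, static_cast<unsigned short>(i)));
        }
    });
    std::size_t pixelsSeen = 0;
    for (std::size_t i = 0; i < frameCount; ++i) {
        pixelsSeen += queue.pop().size();  // real code would write the frame to disk
    }
    camera.join();
    return pixelsSeen;
}
```

This sketch allocates a fresh Frame per image for brevity; a real ring would hand the emptied buffers back to the camera thread for reuse, and would need enough slots that push never has to block while the camera is delivering data.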
The problem is you're using a signed 32-bit variable to describe an unsigned 64-bit number.
Use "size_t" instead of "int" for holding the storage count. This has nothing to do with what you intend to store, just how large a count of them you need.
#include <iostream>
#include <new> // std::bad_alloc
int main(int /*argc*/, const char** /*argv*/)
{
int units = 2;
// 32-bit signed, i.e. 31-bit numbers.
int intSize = units * 1024 * 1024 * 1024;
// 64-bit values (ULL suffix)
size_t sizetSize = units * 1024ULL * 1024ULL * 1024ULL;
std::cout << "intSize = " << intSize << ", sizetSize = " << sizetSize << std::endl;
try {
unsigned short* intAlloc = new unsigned short[intSize];
std::cout << "intAlloc = " << intAlloc << std::endl;
delete [] intAlloc;
} catch (const std::bad_alloc&) {
std::cout << "intAlloc failed (std::bad_alloc)" << std::endl;
}
try {
unsigned short* sizetAlloc = new unsigned short[sizetSize];
std::cout << "sizetAlloc = " << sizetAlloc << std::endl;
delete [] sizetAlloc;
} catch (const std::bad_alloc&) {
std::cout << "sizetAlloc failed (std::bad_alloc)" << std::endl;
}
return 0;
}
Output (g++ -m64 -o test test.cpp under Mint 15 64 bit with g++ 4.7.3 on a virtual machine with 4Gb of memory)
intSize = -2147483648, sizetSize = 2147483648
intAlloc failed
sizetAlloc = 0x7f55affff010
int allocateBurst( int numImages )
{
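// change that from int to long, and widen one operand first
// so the multiplication itself is done in long rather than int
long streamSize = static_cast<long>(ZIMAGESIZE) * numImages;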
data.images = new unsigned short [streamSize];
return 0;
}
Try using
long
OR
cast the result computed in the allocateBurst function to "uint64_t" and change the return type of the function to uint64_t
Because an int gives you a 32-bit value, while long (on 64-bit Linux) or uint64_t gives you a 64-bit value, which can describe a larger allocation.
Hope that helps
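A minimal sketch of doing the size arithmetic in size_t with an explicit overflow check follows. elementCount is a hypothetical helper name, and the parameters stand in for ZIMAGESIZE and numImages from the question.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: number of array elements for `numImages` images of
// `imageSize` elements each, computed in size_t so it cannot wrap silently.
std::size_t elementCount(std::size_t imageSize, std::size_t numImages) {
    if (imageSize != 0 &&
        numImages > std::numeric_limits<std::size_t>::max() / imageSize) {
        throw std::length_error("element count overflows size_t");
    }
    return imageSize * numImages;
}
```

The result can be passed straight to new unsigned short[...]; note that the count is in elements, so the allocation is twice that many bytes for unsigned short.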
My program crashes only when I run the release executable directly (no problems occur when running through Visual Studio). When I use the "attach to process" function, Visual Studio indicates the crash occurred in the following function:
World::blockmap World::newBlankBlockmap(int sideLen, int h){
cout << "newBlankBlockmap side: "<<std::to_string((long long)sideLen) << endl;
cout << "newBlankBlockmap height: "<<std::to_string((long long)h) << endl;
short*** bm = new short**[sideLen];
for(int i=0;i<sideLen;i++){
bm[i] = new short*[h];
for(int j=0;j<h;j++){
bm[i][j] = new short[sideLen];
for (int k = 0; k < sideLen ; k++)
{
bm[i][j][k] = blocks->getAIR_BLOCK();
}
}
}
return (blockmap)bm;
}
Which is called from a child class...
World::chunk* World_E::newChunkMap(World::floatmap north, World::floatmap east, World::floatmap south, World::floatmap west
,float lowlow, float highlow, float highhigh, float lowhigh, bool displaceSides){
World::chunk* c = newChunk(World::CHUNK_SIZE+1,World::HEIGHT);
for (int i = 0; i < World::CHUNK_SIZE ; i++)
{
for (int k = 0; k < World::CHUNK_SIZE ; k++)
{
c->bm[i][0][k] = blocks->getDUMMY_BLOCK();
}
}
c->bm[(int)floor((float)(World::CHUNK_SIZE+1)/2.0f)-1][1][(int)floor((float)(World::CHUNK_SIZE+1)/2.0f)-1] = blocks->getSTONE_BLOCK();
c->bm[(int)ceil((float)(World::CHUNK_SIZE+1)/2.0f)-1][1][(int)floor((float)(World::CHUNK_SIZE+1)/2.0f)-1] = blocks->getSTONE_BLOCK();
c->bm[(int)floor((float)(World::CHUNK_SIZE+1)/2.0f)-1][1][(int)ceil((float)(World::CHUNK_SIZE+1)/2.0f)-1] = blocks->getSTONE_BLOCK();
c->bm[(int)ceil((float)(World::CHUNK_SIZE+1)/2.0f)-1][1][(int)ceil((float)(World::CHUNK_SIZE+1)/2.0f)-1] = blocks->getSTONE_BLOCK();
return c;
}
where...
class World {
public: typedef short*** blockmap;
...
The line which VS points at is...
short*** bm = new short**[sideLen];
The "attach to process" function states the local variables are...
sideLen = 1911407648
h = 0
which is NOT what I expected, but the cout statements output 9 and 30 respectively, which is what I expected.
I am aware that most "crashes in release only" problems are due to uninitialized variables; however, I fail to see how that applies here.
The only error message I get is...
Windows has triggered a breakpoint in Blocks Project.exe.
This may be due to a corruption of the heap
I am stumped by this problem. What's the error? How can I better debug a release executable?
I can post more code if needed; however, bear in mind there is a lot of it.
Thank you in advance.
"And I don't see World::newBlankBlockmap() called from that second chunk of code. – Michael Burr", I forgot that bit, here you go...
World::chunk* World::newChunk(int side, int height){
cout << "newChunk side: "<<std::to_string((long long)side) << endl;
cout << "newChunk height: "<<std::to_string((long long)height) << endl;
chunk* ch = new chunk();
ch->bm = newBlankBlockmap(side,height);
ch->fm = newBlankFloatmap(side);
return ch;
}
where...
struct chunk {
blockmap bm;
floatmap fm;
};
as defined in the World class
To reiterate what the comments were hinting at: from what you've posted, your code seems to be badly structured. Triple-pointer constructs like short*** are almost impossible to debug and should be avoided at all costs. The heap corruption error message you got suggests that you have a bad memory access somewhere in your code, which is impossible to find automatically with your current setup.
Your only options at this point are to either dig through your entire code manually until you've found the bug, or start refactoring. The latter might seem like the more time-consuming option now, but it won't be if you plan to work with this code in the future.
Consider the following as possible hints for a refactoring:
Don't use plain arrays for storing values. std::vector is just as effective and a lot easier to debug.
Avoid plain new and delete. In modern C++ with the STL containers and smart pointers, plain memory allocation should only happen in very rare exceptional cases.
Always range-check your array access operations. If you worry about performance, use asserts which disappear in release builds, but be sure the checks are there when you need them for debugging.
Modeling three-dimensional arrays in C++ can be tricky, since operator[] only offers support for one-dimensional arrays. A nice compromise is using operator() instead, which can take an arbitrary number of indices.
Avoid C-style casts. They can be very unpredictable. Use the C++ casts static_cast, dynamic_cast and reinterpret_cast instead. If you find yourself using reinterpret_cast regularly, you probably have a mistake in your design somewhere.
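Several of these hints can be combined in one small class. The following is only a sketch of the idea, not a drop-in replacement for the question's blockmap: BlockGrid is a hypothetical name, and the side x height x side shape is taken from how newBlankBlockmap allocates its arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for short***: one contiguous std::vector,
// indexed through operator() with range checks in debug builds.
class BlockGrid {
public:
    BlockGrid(std::size_t side, std::size_t height, short fill)
        : side_(side), height_(height), cells_(side * height * side, fill) {}

    short& operator()(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < side_ && j < height_ && k < side_);  // removed under NDEBUG
        return cells_[(i * height_ + j) * side_ + k];
    }

    short operator()(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < side_ && j < height_ && k < side_);
        return cells_[(i * height_ + j) * side_ + k];
    }

    std::size_t side() const { return side_; }
    std::size_t height() const { return height_; }

private:
    std::size_t side_;
    std::size_t height_;
    std::vector<short> cells_;  // freed automatically, no new/delete
};
```

An access such as c->bm[i][0][k] would then become grid(i, 0, k), and an out-of-range index stops at the assert in a debug build instead of corrupting the heap.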
There is a problem in this line short*** bm = new short**[sideLen];. The memory is allocated for sideLen elements, but the assignment line bm[i][j][k] = blocks->getAIR_BLOCK(); requires an array having size sideLen * sideLen * h. To fix this problem changing of the 1st line to short*** bm = new short**[sideLen * sideLen * h]; is required.