Hmm, hello! I am trying to read a binary file that contains a number of float values at a specific position. As seemingly must be done with binary files, they were saved as arrays of bytes, and I have been searching for a way to convert them back to floats with no success. Basically I have a char* memory block, and am attempting to extract the floats stored at a particular location and seamlessly insert them into a vector. I wonder, would that be possible, or would I be forced to rely on arrays instead if I wished to save copying the data? And how could it possibly be done? Thank you ^_^
If you know where the floats are you can read them back:
float a = *(float*)&buffer[position];
Then you can do whatever you need with a, including 'push_back'ing it into a vector.
Make sure you read the file in binary mode, and if you know the positions of the float in the file it should work.
I'd need to see the code that generated the file to be more specific.
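An alternative to the pointer cast is std::memcpy, which avoids alignment and aliasing concerns. The following is a minimal sketch, assuming the floats were written on a machine with the same byte order and float format as the one reading them. The names floats_at, size, offset and count are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy `count` floats that start `offset` bytes into `buffer` into a vector.
// `size` is the total number of bytes available in `buffer`.
// Assumption: the data has the same endianness and float format as this machine.
std::vector<float> floats_at(const char* buffer, std::size_t size,
                             std::size_t offset, std::size_t count)
{
    // Refuse to read past the end of the memory block.
    assert(offset <= size && count <= (size - offset) / sizeof(float));

    std::vector<float> out(count);
    if (count != 0)
        std::memcpy(out.data(), buffer + offset, count * sizeof(float));
    return out;
}
```

A std::vector always owns its own storage and cannot adopt an existing memory block, so one copy like this is needed if the result has to live in a vector.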
I have a struct. I would like to have an array in this struct, and then write this to a binary file, and then read it. However this array should be dynamically allocated. I'm not sure how should I approach this. My current guess is this:
I define and then write the struct to the file like this:
struct map {
int *tiles;
};
int main() {
map sample;
sample.tiles = new int[2];
sample.tiles[0]=1;
sample.tiles[1]=2;
ofstream file("sample.data", ios::binary);
file.write((char *)&sample, sizeof(sample));
file.close();
return 0;
}
Then read it like this in another program:
map test;
ifstream file("sample.data", ios::binary);
file.read((char *)&test, sizeof(test));
When I want to check the results with
cout << test.tiles[0];
I get a weirdly huge number, but clearly not the number I originally wrote to the file.
What is the right way to do this? How can I read an array without knowing its size?
file.write((char *)&sample, sizeof(sample));
This writes to the file the following structure:
struct map {
int *tiles;
};
That is, a structure that contains a single pointer. A pointer is an address in memory. So, what you end up writing to the file is one, meaningless, raw, memory address.
Which, obviously, is of absolutely no use, whatsoever, when it gets read back later. You did not write any of your integers to your file, just what their address in memory was, in a program that long ago terminated.
In order to do this correctly, you will need to:
Record how many integers the tiles pointer is pointing to.
write() not the structure itself, but the integers that the tiles pointer is pointing to.
You will also need to write, to the file, how many integers there are in this array. If the file contains only these integers, this is not really needed, since you can simply read the entire contents of the file. But, if you expect to write some additional data in the file, it obviously becomes necessary to also record the size of the written array, in the file itself, so that it's possible to figure out how many integers there are when reading them back. The simplest way to do this is to write a single int, the size of the array, first, followed by the contents of the array itself. And do the reverse process when reading everything back.
Your sample object contains a pointer tiles and it's the pointer value that you're writing to the file. What you want to write is what the pointer points at.
In your example, you'd want to do file.write(reinterpret_cast<const char*>(sample.tiles), 2*sizeof(*sample.tiles));. The cast is needed because write() takes a const char*. It's 2* because you did new int[2]. In general, you'd also want to save the size (2 in this case) in the file so you know how many ints to read back in. In this simple case you could infer the 2 from the size of the file.
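The "count first, then the elements" layout described in these answers could be sketched as follows. This is only an illustration under a few assumptions: the array is held in a std::vector<int> rather than a raw pointer, the count is stored as a 64-bit integer, and the file is read back on the same kind of machine that wrote it. The function names write_tiles and read_tiles are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write the element count first, then the elements themselves.
bool write_tiles(const std::string& path, const std::vector<int>& tiles)
{
    std::ofstream file(path.c_str(), std::ios::binary);
    const std::uint64_t count = tiles.size();
    file.write(reinterpret_cast<const char*>(&count), sizeof(count));
    file.write(reinterpret_cast<const char*>(tiles.data()),
               static_cast<std::streamsize>(count * sizeof(int)));
    file.flush();                 // surface write errors before reporting success
    return file.good();
}

// Read the count back, size the vector, then read all elements in one call.
bool read_tiles(const std::string& path, std::vector<int>& tiles)
{
    std::ifstream file(path.c_str(), std::ios::binary);
    std::uint64_t count = 0;
    if (!file.read(reinterpret_cast<char*>(&count), sizeof(count)))
        return false;
    // Note: a corrupt file could claim a huge count; real code should sanity-check it.
    tiles.resize(static_cast<std::size_t>(count));
    file.read(reinterpret_cast<char*>(tiles.data()),
              static_cast<std::streamsize>(count * sizeof(int)));
    return !file.fail();
}
```

Using a vector also removes the need to remember the size separately in memory, since tiles.size() always carries it.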
I am new to vectors in C++. I want to initialize a 2D matrix of unknown size, so I turned to vectors. I have two files: 1) .h and 2) .cpp. In the .h file I declared the vector like this
vector<vector<double> > vector_stor;
Then in .cpp after getting the size of each dimension from another source i re-sized the vector like this
size_X=5; //assumption
size_Y=5; //assumption
vector_stor.resize(size_X);
for(int i=0;i<size_X;i++)
vector_stor[i].resize(size_Y);
Now i want to store a data from a .mat file, initially read by matIO library, using Mat_VarRead function like this
Mat_VarReadData(vector_stor); //there are other arguments also but for demo just assume it
Mat_VarReadData takes its data argument as void*, and I have a 2D vector. When I call it like this, it gives this error:
Error 1 error C2664: 'Mat_VarReadData' : cannot convert parameter
from 'std::vector<_Ty>' to 'void *'
Can anyone please guide me that how i can do this? It will be very helpful for me.
Edited Part:
matvar = Mat_VarReadInfo(mat,"data_struct");
field=Mat_VarGetStructFieldByName(matvar,"vect_stor",0);
int start[2]={0,0};
int stride[2]={1,1};
int edge[2];
edge[0]=field->dims[0];
edge[1]=field->dims[1];
Mat_VarReadData(mat,field,vector_stor,start,stride,edge);
where vector_stor is the variable for what i am seeking help.
Thanks
Check the ordering of your inputs to Mat_VarReadData. The function needs to be something like
Mat_VarReadData( ..., vector<vector<double> > mat, ... )
and you need to line up your inputs so that vector_stor lines up with that input.
If I have the function:
foo(int a, double b);
then when I call foo the first argument needs to be an int and the second a double. Same here, you need to match your input types to what you're actually trying to pass.
Also check out:
http://libmatio.sourcearchive.com/documentation/1.3.3/group__MAT_g1845000f4fc6252ec5ff11c4b9f0759f.html
It looks like the function is going to dump the data into a single dimensional array, rather than a vector of vectors. Try this:
std::vector<double> mat;
mat.resize(size_X*size_Y);
// call Mat_VarReadData with &mat[0] as your void*
// now you can index with
mat[i*size_Y + j];
That indexing assumes the data is stored in row-major order. MATLAB, as far as I remember, stores matrices in column-major order, so data read from a .mat file will most likely need to be indexed with
mat[i + j*size_X];
EDIT: If you're curious, the reason &mat[0] or mat.data() (the second requires C++11, thanks for pointing it out) works is that the storage of a std::vector is guaranteed to be contiguous, see
Are std::vector elements guaranteed to be contiguous?
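If the rest of the code expects a vector<vector<double> >, the flat buffer can be unpacked after the read. Below is a small sketch that assumes the buffer holds the matrix in column-major order (the MATLAB convention). The function name to_rows and its parameters are invented for this example, and no MatIO calls are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a column-major flat buffer of rows*cols values into a vector of rows.
std::vector<std::vector<double> > to_rows(const std::vector<double>& flat,
                                          std::size_t rows, std::size_t cols)
{
    assert(flat.size() == rows * cols);
    std::vector<std::vector<double> > out(rows, std::vector<double>(cols));
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            out[i][j] = flat[i + j * rows];   // column-major: the row index varies fastest
    return out;
}
```

Keeping the flat vector and indexing into it directly avoids this extra copy, so the conversion is only worth it when the nested layout is really required.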
As others have already pointed out, you won't be able to pass either vector<vector<double> > or vector<double> directly to Mat_VarReadData in the form of void*; there's simply no safe way to do that. The best you can do is first to retrieve the data into some raw array, then convert it to the container you like.
I'm not familiar with MatIO, but I'll try to point you in the right direction. I took a look at the documentation for Mat_VarReadData. Not very helpful, I must admit, but at any rate it states that data can only be read once you have retrieved the information about the corresponding variable. That can be done using the function Mat_VarReadInfo. This function returns a matvar_t, which essentially is a descriptor for variables. It seems to me that matvar_t contains all the information you need to allocate the data dynamically, that is, through the use of new[]. More precisely, matvar_t::data_size should hold exactly how many bytes are needed to store the data of a given variable.
I think that's more or less what you need to do:
warning, not tested
matvar_t* varInfo = Mat_VarReadInfo(matFileDescriptor, varName);
char* data = new char[varInfo->data_size];
Mat_VarReadData(matFileDescriptor, varInfo, (void*)(data), start, stride, edge);
I'll leave it to you to figure out what start, stride and edge actually stand for.
After you have the data read into the array data, you will have to convert it to the appropriate arithmetic type, probably double, but I can't be sure. Only then you will be able to fit them into a vector<double>. On this part I unfortunately can't help you, because it gets too deep into MatIO.
I understand you are struggling with the basics of C/C++ and also with MatIO. It is not a simple library for someone just starting out with C/C++, so I would strongly advise you to first carefully read whatever documentation you have available on MatIO before going further with your project. Some reading on the basics of C/C++ would also be very helpful.
I am working with 3D volumetric images, possibly large (256x256x256). I have 3 such volumes that I want to read in and operate on. Presently, each volume is stored as a text file of numbers which I read in using ifstream. I save it as a matrix (This is a class I have written by dynamic allocation of a 3D array). Then I perform operations on these 3 matrices, addition, multiplication and even Fourier transform. So far, everything works well, but, it takes a hell lot of time, especially the Fourier transform since it has 6 nested loops.
I want to know how I can speed this up. Also, whether the fact that I have stored the images in text files makes a difference. Should I save them as binary or in some other easier/faster to read in format? Is fstream the fastest way I can read in? I use the same 3 matrices each time without changing them. Does that make a difference? Also, is pointer to pointer to pointer the best way to store a 3D volume? If not what else can I do?
Also, is pointer to pointer to pointer best way to store a 3d volume?
Nope, that's usually very inefficient.
If not what else can I do?
It's likely that you will get better performance if you store it in a contiguous block, and use computed offsets into the block.
I'd usually use a structure like this:
class DataBlock {
unsigned int nx;
unsigned int ny;
unsigned int nz;
std::vector<double> data;
public:
//Allocates nx*ny*nz values, all initialised to zero
DataBlock(unsigned int in_nx, unsigned int in_ny, unsigned int in_nz) :
nx(in_nx), ny(in_ny), nz(in_nz) , data(in_nx*in_ny*in_nz, 0.0)
{}
//You may want to make this check bounds in debug builds
double& at(unsigned int x, unsigned int y, unsigned int z) {
return data[ x + y*nx + z*nx*ny ];
}
const double& at(unsigned int x, unsigned int y, unsigned int z) const {
return data[ x + y*nx + z*nx*ny ];
}
private:
//Don't want this class copied, so remove the copy constructor and assignment.
DataBlock(const DataBlock&);
DataBlock& operator=(const DataBlock&);
};
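With a contiguous block, the loop order matters as well: the index that varies fastest in memory should be the innermost loop. The sketch below uses the same x + y*nx + z*nx*ny layout on a plain std::vector<double>. The helper names flat_index and sum_volume are made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Flat index for an nx*ny*nz volume in which x varies fastest.
inline std::size_t flat_index(std::size_t x, std::size_t y, std::size_t z,
                              std::size_t nx, std::size_t ny)
{
    return x + y * nx + z * nx * ny;
}

// Sum every voxel, visiting the values in the order they sit in memory.
double sum_volume(const std::vector<double>& data,
                  std::size_t nx, std::size_t ny, std::size_t nz)
{
    double total = 0.0;
    for (std::size_t z = 0; z < nz; ++z)
        for (std::size_t y = 0; y < ny; ++y)
            for (std::size_t x = 0; x < nx; ++x)   // innermost loop = fastest-varying index
                total += data[flat_index(x, y, z, nx, ny)];
    return total;
}
```

Swapping the loop nesting so that z is innermost would touch memory with a stride of nx*ny elements, which tends to be much less cache friendly on large volumes.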
Storing a large (256^3 elements) 3D image file as plain text is a waste of resources.
Without loss of generality, if you have a plain text file for your image and each line of your file consists of one value, you will have to read several characters until you find the end of the line (for a 3-digit number, these will be 4 bytes; 3 bytes for the digits, 1 byte for newline). Afterwards you will have to convert these single digits to a number. When using binary, you directly read a fixed amount of bytes and you will have your number. You could and should write and read it as a binary image.
There are several formats for doing so, the one I would recommend is the meta image file format of VTK. In this format, you have a plain text header file and a binary file with the actual image data. With the information from the header file you will know how large your image is and what datatype you will be using. In your program, you then directly read the binary data and save it to a 3D array.
If you really want to speed things up, use CUDA or OpenCL which will be pretty fast for your applications.
There are several C++ libraries that can help you with writing, saving and manipulating image data, including the before-mentioned VTK and ITK.
256^3 is a rather large number. Parsing 256^3 text strings will take a considerable amount of time. Using binary will make reading and writing much faster because it doesn't require converting numbers to and from strings, and it uses much less space. For example, to read the number 123 from a text file, the program has to read it as a string and convert it from decimal to binary using multiplications by 10. If you had written it directly as the binary value 0b01111011, you would only need to read that byte back into memory, with no conversion at all.
Using hexadecimal numbers may also increase reading speed, since each hex digit maps directly to a binary value, but if you need more speed, a binary file is the way to go. A single fread call is enough to load the whole 256^3 bytes = 16 MB file into memory in less than a second. When you're done, just fwrite it back to the file. To speed up the processing you can use SIMD (SSE/AVX), CUDA or another parallel processing technique. You can improve the speed even further by multithreading, or by saving only the non-zero values, because in many cases most values will be 0.
Another reason may be that your array is large and each dimension is a power of 2. This has been discussed in many questions on SO:
Why is there huge performance hit in 2048x2048 versus 2047x2047 array multiplication?
Why is my program slow when looping over exactly 8192 elements?
Why is transposing a matrix of 512x512 much slower than transposing a matrix of 513x513?
You may consider changing the last dimension to 257 and try again. Or better use another algorithm like divide and conquer that's more cache friendly
You should add timers around the load and the process so you know which is taking the most time, and focus your optimization efforts on it. If you control the file format, make one that is more efficient to read. If it is the processing, I'll echo what previous folks have said, investigate efficient memory layout as well as GPGPU computing. Good luck.
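A timer of this kind can be as small as the following sketch, which uses std::chrono from C++11. The helper name time_ms is invented for the example.

```cpp
#include <chrono>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be used as double load_ms = time_ms([&]{ load_volumes(); }); and likewise around the Fourier transform, where load_volumes stands for whatever function currently reads the three files. Comparing the two numbers shows whether the file format or the computation deserves attention first.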
I am a newbie to C++.
I want to write a program to read values from file which has data in format:
text<tab or space>text
text<tab or space>text
...
(... indicates more such lines)
The number of lines in file varies. Now, I want to read this file and store the text into either 1 2D string array or 2 1D string arrays.
How do I do it?
Furthermore, I want to run a for loop over this array to process the each entry in file. How can I write that loop?
You're looking for a resizable array. Try std::vector<string>. See the std::vector documentation for details.
Edit: You could probably also manage to do this by opening the file, looping through to count the lines of the file, generating your fixed-size array, closing and reopening the file, and then looping through the file to populate the array. However, this is not recommended, as it will increase your runtime complexity far more than the slight overhead involved with managing vector, and it will make your code much more confusing for anyone who reads it.
(ps - I agree with #matthias-vallentin, you should've been able to find this on the site with minimal work)
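A possible shape for the vector-based approach is sketched below. It assumes that neither text field contains whitespace of its own, and it stores each line as a pair instead of using two separate arrays. The function name read_pairs is made up for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read "text<whitespace>text" lines into a vector of (first, second) pairs.
// Lines that do not contain two fields are skipped.
std::vector<std::pair<std::string, std::string> > read_pairs(std::istream& in)
{
    std::vector<std::pair<std::string, std::string> > rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string first, second;
        if (fields >> first >> second)        // operator>> splits on tabs and spaces alike
            rows.push_back(std::make_pair(first, second));
    }
    return rows;
}
```

Pass it a std::ifstream opened on the file. The result can then be processed with an ordinary loop, for (std::size_t i = 0; i < rows.size(); ++i), reading rows[i].first and rows[i].second inside the body.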
Currently I read arrays in C++ with ifstream, read and reinterpret_cast by making a loop on values. Is it possible to load for example an unsigned int array from a binary file in one time without making a loop ?
Thank you very much
Yes, simply pass the address of the first element of the array, and the size of the array in bytes:
// Allocate, for example, 47 ints
std::vector<int> numbers(47);
// Read in as many ints as 'numbers' has room for.
inFile.read(reinterpret_cast<char*>(&numbers[0]), numbers.size()*sizeof(numbers[0]));
Note: I almost never use raw arrays. If I need a sequence that looks like an array, I use std::vector. If you must use an array, the syntax is very similar.
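When the number of elements is not known in advance, it can be derived from the file size. The sketch below assumes the file contains nothing but raw unsigned int values written on the same kind of machine. The function name load_uints is invented for the example.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Load a whole file of raw unsigned ints with a single read() call.
// Returns an empty vector if the file cannot be opened or read.
std::vector<unsigned int> load_uints(const std::string& path)
{
    // std::ios::ate opens the file positioned at its end.
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in)
        return std::vector<unsigned int>();

    const std::streamoff bytes = in.tellg();   // position at the end = file size
    if (bytes <= 0)
        return std::vector<unsigned int>();

    std::vector<unsigned int> values(
        static_cast<std::size_t>(bytes) / sizeof(unsigned int));
    in.seekg(0, std::ios::beg);
    if (!values.empty() &&
        !in.read(reinterpret_cast<char*>(&values[0]),
                 static_cast<std::streamsize>(values.size() * sizeof(unsigned int))))
        values.clear();                        // short or failed read
    return values;
}
```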
The ability to read and write binary images is non-portable. You may not be able to re-read the data on another machine, or even on the same machine with a different compiler. But, you have that problem already, with the solution that you are using now.
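One common way around the portability problem is to fix the byte order and width in the file format and convert explicitly, instead of dumping memory as it is. The following sketch does this for 32-bit unsigned values stored little-endian. The helper names put_u32_le and get_u32_le are made up, and this is a different technique from the raw read shown above, at the cost of a per-element conversion.

```cpp
#include <cstdint>

// Store `value` as four bytes, least significant byte first.
inline void put_u32_le(unsigned char* out, std::uint32_t value)
{
    out[0] = static_cast<unsigned char>(value & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFFu);
}

// Rebuild the value from four little-endian bytes.
inline std::uint32_t get_u32_le(const unsigned char* in)
{
    return static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}
```

Because the shifts work on values rather than on memory layout, the same code produces and consumes identical files on little-endian and big-endian machines.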