I am writing many instances of a C++ object (below) to a file in binary.
struct Example
{
double a, b;
char c;
Object d;
};
If I want to read this data back in C++ I just reinterpret_cast the raw bytes to an Example* and extract the members.
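Roughly like this (a sketch, assuming Object is trivially copyable and "data.bin" stands in for my real file):
#include <fstream>

// "data.bin" is a hypothetical file name
std::ifstream in("data.bin", std::ios::binary);
Example e;
in.read(reinterpret_cast<char*>(&e), sizeof(e));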
However, I'd like a non-programmer to be able to scroll through and read the file with a pager like the Linux program less. Unfortunately, less doesn't understand the binary layout of Example.
I know I could just write a small C++ program that converts the binary file to an ASCII file, but the data is large and cannot be rewritten to disk.
What is the best way of achieving this?
I am working on a project in C++ that adopts many ideas from a Go project.
I don't properly understand from the documentation how binary.Write works, or how I can replicate it in C++. I am stuck at this line in my project:
binary.Write(e.offsets, nativeEndian, e.offset)
The type of e.offsets is *bytes.Buffer and e.offset is a uint64.
In the C++ standard library, it is generally up to you to deal with endianness concerns, so let's skip that for the time being. If you just want to write binary data to a stream such as a file, you can do something like this:
#include <cstdint>
#include <fstream>

uint64_t value = 0xfeedfacedeadbeef;
std::ofstream file("output.bin", std::ios::binary);
file.write(reinterpret_cast<char*>(&value), sizeof(value));
The cast is necessary because the file stream deals with char*, but you can write whatever byte streams to it you like.
You can write entire structures this way as well so long as they are "Plain Old Data" (POD). For example:
struct T {
uint32_t a;
uint16_t b;
};
T value2 = { 123, 45 };
std::ofstream file("output.bin", std::ios::binary);
file.write(reinterpret_cast<char*>(&value2), sizeof(value2));
Reading these things back is similar using file.read, but as mentioned, if you REALLY do care about endianness, then you need to take care of that yourself.
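For instance, a minimal read-back sketch for the same struct T and file as above:
#include <cstdint>
#include <fstream>

std::ifstream file("output.bin", std::ios::binary);
T value3{};
file.read(reinterpret_cast<char*>(&value3), sizeof(value3));
// value3.a == 123 and value3.b == 45, assuming the write above succeeded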
If you are dealing with non-POD types (such as std::string), then you will need to deal with a more involved data serialization system. There are numerous options to deal with this if needed.
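For a flavor of what that involves, one common hand-rolled approach (just a sketch, not tied to any particular library) is to length-prefix variable-sized fields such as strings:
#include <cstdint>
#include <fstream>
#include <string>

// Write a std::string as a 32-bit length followed by the raw characters.
void writeString(std::ofstream &out, const std::string &s)
{
    uint32_t len = static_cast<uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(s.data(), static_cast<std::streamsize>(len));
}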
I'm trying to create a lookup table for my Xilinx Zynq SoC (the ARM Cortex).
I have a CSV file with 1330 entries which I cannot read or parse at runtime; the latest I can do that is at compile time. I have read that it is possible to embed a file into an executable so it can be used at runtime.
In other words, I want to read and parse the CSV file at runtime, without the original file actually being on any filesystem since it's an embedded device. So I would need to somehow embed the CSV file into the executable. How would I achieve something like this?
The CSV file looks like this (full file is here):
0,0,48,112,160,208,272,320,368,....,65440,65488
You asked for a vector, but I'm not sure why you'd necessarily want that. The data will unavoidably occupy space in the application's read-only section (".text" or ".rodata" or something like that), and while you can convert it into a vector if necessary (which will consume heap space and require runtime construction and initialization from the data in that read-only section), you might as well just use it as a POD array, since I doubt you'll be changing the data at runtime. So to create a const POD array of the data you could just do something like this:
const int myArray[] =
{
#include "myCsvFile.csv"
};
If the number of elements is not fixed, your program can determine the number with sizeof(myArray)/sizeof(myArray[0]). Even if it is a fixed size, this technique is probably best anyway. And of course if all of your entries are unsigned and can fit within 16 bits (a cursory examination suggested so), instead of defining it as an array of int, you can define it as an array of unsigned short or uint16_t to save space.
I should also mention that the const keyword is important here: Without that, your array will occupy twice as much memory: first, it will occupy space in the read-only section (.text or .rodata or whatever), and during application initialization, the runtime will make a read/writable copy of the read-only data in the read/write data section (.data probably), where myArray is allocated. To avoid that, define it as const and then the address of myArray will be in the read-only section, and it won't be copied to the read/write data section.
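Putting those suggestions together, the whole pattern might look like this (myCsvFile.csv as above, and uint16_t on the assumption that all entries fit in 16 bits):
#include <cstddef>
#include <cstdint>

const uint16_t myArray[] =
{
#include "myCsvFile.csv"
};
// Element count derived at compile time from the embedded data.
const std::size_t myArrayCount = sizeof(myArray) / sizeof(myArray[0]);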
As your data is a plain array of unsigned integers, you can use the preprocessor. Assuming your CSV data is in the file data.csv, you can then simply do the following in a .cpp file:
const unsigned int k_Data[] = {
#include "data.csv" // << Preprocessor will replace this line with the contents of data.csv
};
#include <iostream>
int main()
{
std::cout << k_Data[3];
}
Output: 112
For the specific type of CSV data you have, e.g.
0,0,48,112,160,208,272,320,368,432,480,512,576,640,704,752,800,848,896,......
which is basically just a bunch of numbers, you should be able to include them using an actual #include statement, so:
const unsigned short myCSV[] ={
#include "./theCSV.data"
};
I'm using unsigned short since the largest number looks like 64k and it would save you some space -- but you may want to use int instead if you believe the numbers can be larger than 64k.
For the time being I am developing a C++ program based on some MATLAB code. During development I need to output intermediate results to MATLAB in order to compare the C++ implementation against the MATLAB result. What I am doing now is writing a binary file with C++, and then loading the binary file with MATLAB. The following code shows an example:
#include <fstream>

using namespace std;

int main ()
{
    ofstream abcdef;
    abcdef.open("C:/test.bin", ios::out | ios::trunc | ios::binary);
    for (int i = 0; i < 10; i++)
    {
        float x_cord = i * 1.38f;
        float y_cord = i * 10.0f;
        abcdef << x_cord << " " << y_cord << endl;
    }
    abcdef.close();
    return 0;
}
When I have the file test.bin, I can load the file automatically with MATLAB command:
data = load('test.bin');
This method works well when the output is numerical data; however, it can fail if the output is a class with many member variables. I was wondering whether there are better ways to do the job, not only for simple numerical data but also for complicated data structures. Thanks!
I would suggest the use of the MATLAB engine, through which you can pass data to MATLAB in real time and even visualize it using the various graph-plotting facilities available in MATLAB.
All you have to do is invoke the MATLAB engine from your C/C++ program; then you can execute MATLAB commands directly from the C/C++ program and/or exchange data between MATLAB and C/C++. It works in both directions, i.e. from C++ to MATLAB and vice versa.
You can have a look at a working example for the same as shown here.
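For a flavor of what that looks like, here is a minimal sketch using the C Engine API (engOpen, engPutVariable, engEvalString); it assumes you compile and link against the MATLAB engine and mx libraries:
#include <cstring>
#include "engine.h" // MATLAB Engine API

int main()
{
    Engine *ep = engOpen(""); // start or connect to a MATLAB session
    if (!ep)
        return 1;

    double data[10];
    for (int i = 0; i < 10; i++)
        data[i] = i * 1.38;

    mxArray *x = mxCreateDoubleMatrix(1, 10, mxREAL);
    std::memcpy(mxGetPr(x), data, sizeof(data));
    engPutVariable(ep, "x", x); // push the array into the MATLAB workspace
    engEvalString(ep, "plot(x); grid on;"); // run MATLAB commands directly

    mxDestroyArray(x);
    engClose(ep);
    return 0;
}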
I would suggest using the fread command in MATLAB. I do this all the time for exchanging data between MATLAB and other programs. For instance:
fd = fopen('datafile.bin','r');
a = fread(fd,3,'*uint32');
b = fread(fd,1,'float32');
With fread you have all the flexibility to read any type of data. By placing a * in the type name, as above, you also say that you want to store into that data type instead of the default MATLAB data type. So the first call reads in 3 32-bit unsigned integers and stores them as integers. The second reads in a single-precision floating point number, but stores it as the default double precision.
You need to control the way the data is written by your C++ code, but that is inevitable. You can write a class method in C++ that packs the data in a deterministic way.
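For example (a sketch; the record layout is made up to match the fread calls above):
#include <cstdint>
#include <fstream>

struct Record {
    uint32_t a[3]; // read in MATLAB with fread(fd,3,'*uint32')
    float    b;    // read in MATLAB with fread(fd,1,'float32')

    // Write each field explicitly so the byte layout is deterministic.
    void pack(std::ofstream &out) const {
        out.write(reinterpret_cast<const char*>(a), sizeof(a));
        out.write(reinterpret_cast<const char*>(&b), sizeof(b));
    }
};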
My code looks like this:
#include <cstdio>
#include <string>

using std::string;

struct Dog {
    string name;
    unsigned int age;
};

int main()
{
    Dog d = {"Lion", 3};
    FILE *fp = fopen("dog.txt", "wb");
    fwrite(&d, sizeof(d), 1, fp); // write d into dog.txt
    fclose(fp);
}
My problem is: what's the point of writing a data object or structure into a binary file? I assume it is for making the data generated by a running program persistent, right? If so, how can I get the data back? Using fread?
This makes me think of database-like stuff. Does a database write data to disk the same way?
You can do it, but you will have a lot of issues to take care of:
structure contents: all your data really needs to live inside the struct itself, or you will just be writing a pointer to some other place.
structure changes: if you ever change your structure, you will need to write a converter that reads the old struct and writes the new one.
language interoperability: it will be hard to access the data from other languages.
It was a common practice in the early days, before relational databases became popular. You can make index files pointing to a record number.
However, nowadays I would advise you to serialize the data and write strings instead of raw binary.
NOTE:
If string is something like char[40], your code may survive... but if your question is about C++ and string is the std::string class, stop this before it grows: the string object's characters are not stored inside your struct but on the heap.
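To make that concrete, here is a sketch of serializing the Dog from the question with a length prefix for the string, using the same stdio calls:
#include <cstdint>
#include <cstdio>
#include <string>

struct Dog {
    std::string name;
    unsigned int age;
};

void writeDog(FILE *fp, const Dog &d)
{
    uint32_t len = static_cast<uint32_t>(d.name.size());
    fwrite(&len, sizeof(len), 1, fp);  // length prefix
    fwrite(d.name.data(), 1, len, fp); // the characters themselves
    uint32_t age = d.age;              // fixed-width on every platform
    fwrite(&age, sizeof(age), 1, fp);
}

bool readDog(FILE *fp, Dog &d)
{
    uint32_t len = 0, age = 0;
    if (fread(&len, sizeof(len), 1, fp) != 1) return false;
    d.name.resize(len);
    if (len > 0 && fread(&d.name[0], 1, len, fp) != len) return false;
    if (fread(&age, sizeof(age), 1, fp) != 1) return false;
    d.age = age;
    return true;
}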
Writing data in binary is extremely useful and much faster than reading/writing text. Take video games, for instance (although not every video game does this): when the game is saved, all of the necessary structures/classes and other data are written to a save file in binary.
That is just one use of binary, but the major reason for doing it is speed.
And to read the data back, you need to know the format you saved it in. As a simple example, if I saved an integer, a char array of size n, and a boolean, I would need to read the binary file back as an integer, a char array of size n, and a boolean, in that order. Otherwise the data is read improperly and will not be very useful at all.
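As a sketch (the file name and sizes are made up), the read side mirrors the write side exactly:
#include <cstdio>

int main()
{
    FILE *fp = fopen("save.dat", "rb"); // hypothetical save file
    int score;
    char name[16]; // n == 16 in this example
    bool alive;
    fread(&score, sizeof(score), 1, fp); // read back in the same order
    fread(name, sizeof(name), 1, fp);    // as written: int, char[n], bool
    fread(&alive, sizeof(alive), 1, fp);
    fclose(fp);
    return 0;
}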
Be careful: the type of the field 'name' in your structure is 'string'. This class holds its data in dynamically allocated memory, so writing 'string' data to a file this way writes only the pointers, not the data itself.
The C++ Middleware Writer supports binary serialization to/from files.
From a marshalling perspective, the "unsigned int age" member of your struct is a potential problem, because its size can differ between platforms. I'd consider changing the type to uint32_t.
Due to annoying overflow problems in C++, I want to use Python to precompute some values instead. I have a function f(a,b) that spits out a value. I want to output all the values I need, based on ranges of a and b, into a file, and then read that file from C++ to populate a vector or array or whatever's better.
What is a good format to output f(a,b) in?
What's the best way to read this back into C++?
Vector or multidim array?
You can use Python to write out a .h file that is compatible with C++ source syntax.
# h_file is an open file object; f(a, b), a_size and b_size are defined elsewhere
h_file.write('{\n')
for a in range(a_size):
    h_file.write('{' + ','.join(str(f(a, b)) for b in range(b_size)) + '},\n')
h_file.write('}\n')
You will probably want to modify that code to throw some extra newlines in, and in fact I have such code that I can show later (don't have access to it now).
You can use Python to write out C++ source code that contains your data. E.g:
def f(a, b):
# Your function here, e.g:
return pow(a, b, 65537)
num_a_values = 50
num_b_values = 50
# Write source file
with open('data.cpp', 'wt') as cpp_file:
cpp_file.write('/* Automatically generated file, do not hand edit */\n\n')
cpp_file.write('#include "data.hpp"\n')
cpp_file.write('const int f_data[%d][%d] =\n'
% (num_a_values, num_b_values))
cpp_file.write('{\n')
for a in range(num_a_values):
values = [f(a, b) for b in range(num_b_values)]
cpp_file.write(' {' + ','.join(map(str, values)) + '},\n')
    cpp_file.write('};\n')
# Write corresponding header file
with open('data.hpp', 'wt') as hpp_file:
hpp_file.write('/* Automatically generated file, do not hand edit */\n\n')
hpp_file.write('#ifndef DATA_HPP_INCLUDED\n')
hpp_file.write('#define DATA_HPP_INCLUDED\n')
hpp_file.write('#define NUM_A_VALUES %d\n' % num_a_values)
hpp_file.write('#define NUM_B_VALUES %d\n' % num_b_values)
hpp_file.write('extern const int f_data[%d][%d];\n'
% (num_a_values, num_b_values))
hpp_file.write('#endif\n')
You then compile the generated source code as part of your project. You can then use it by #including the header and accessing the f_data[] array directly.
This works really well for small to medium size data tables, e.g. icons. For larger data tables (millions of entries) some C compilers will fail, and you may find that the compile/link is unacceptably slow.
If your data is more complicated, you can use this same method to define structures.
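For instance, the generated file might look something like this for the struct case (a hypothetical shape, not the output of the script above):
/* Automatically generated file, do not hand edit */
struct FEntry {
    int a;
    int b;
    int result;
};

const FEntry f_data[] = {
    { 0, 0, 1 },
    { 0, 1, 1 },
    { 1, 0, 1 },
    /* ... one entry per (a, b) pair ... */
};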
[Based on Mark Ransom's answer, but with some style differences and more explanation].
If there are megabytes of data, then I would read the data in by memory-mapping the data file, read-only. I would arrange things so I can use the data file directly, without having to read it all in at startup.
The reason for doing it this way is that you don't want to read megabytes of data at startup if you're only going to use some of the values. By using memory mapping, your OS will automatically read just the parts of the file that you need. And if you run low on RAM, your OS can reuse the memory allocated for that file without having to waste time writing it to the swap file.
If the output of your function is a single number, you probably just want an array of ints, most likely a 2D array, e.g.:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

#define DATA_SIZE (25 * 50 * sizeof(int)) /* mmap length is in bytes */
typedef const int (*data_table_type)[50]; /* pointer to rows of 50 ints */

int fd = open("my_data_file.dat", O_RDONLY);
data_table_type data_table = (data_table_type)mmap(0, DATA_SIZE,
    PROT_READ, MAP_SHARED, fd, 0);
printf("f(5, 11) = %d\n", data_table[5][11]);
For more info on memory mapped files, see Wikipedia, or the UNIX mmap() function, or the Windows CreateFileMapping() function.
If you need more complicated data structures, you can put C/C++ structures and arrays into the file. But you can't embed pointers or any C++ class that has a virtual anything.
Once you've decided how you want to read the data, the next question is how to generate it. struct.pack() is very useful for this: it will convert Python values into properly formatted binary data, which you can then write to a file.