Write int16_t Array to Text File Without Looping

I am trying to write int16_t data to a ply file (a text file with a special header). I have a piece of code that does this with a time-consuming loop, and I am trying to speed it up.
I want to avoid the time spent iterating through the array by putting my data directly into ofs.write. Can I do this?
This is what I've tried
The functional, but slow, code is as follows:
int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;
// get data
int16_t* point_cloud_image_data = (int16_t*)(void*)point_cloud_image.get_buffer();
std::stringstream ss;
for (int i = 0; i < i_max; i++) // executes 921600 times - this is the bottleneck
{
ss << point_cloud_image_data[3 * i + 0] << " " << point_cloud_image_data[3 * i + 1] << " " << point_cloud_image_data[3 * i + 2] << "\n";
}
// save to the ply file
std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary); // text mode first
ofs << "ply\n" << "format ascii 1.0\n" << "element vertex" << " " << i_max << "\n" << "property float x\n" << "property float y\n" << "property float z\n" << "end_header\n" << std::endl;
ofs.write(ss.str().c_str(), (std::streamsize)ss.str().length());
ofs.close();
I want to avoid the time spent iterating through the array by putting my point_cloud_image_data pointer directly into ofs.write. My code to do that looks like this:
int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;
// get data
int16_t* point_cloud_image_data = (int16_t*)(void*)point_cloud_image.get_buffer();
// save to the ply file
std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary); // text mode first
ofs << "ply\n" << "format ascii 1.0\n" << "element vertex" << " " << i_max << "\n" << "property float x\n" << "property float y\n" << "property float z\n" << "end_header\n" << std::endl;
ofs.write((char*)(char16_t*)point_cloud_image_data, i_max);
ofs.close();
This is a lot faster, but now point_cloud_image_data is written in binary (the file contains characters like this: ¥ûú). How can I write the array to the text file without a time consuming loop?

Integers are stored in the computer in binary representation. To write an array of integer values to a text file, each one needs to be converted to a series of decimal digits. So you're going to need a loop. Even with buffering and compiler optimizations enabled, conversion of binary to text and back will inevitably be slower than directly working with binary data.
But if all you care about is raw performance, the PLY format can actually be binary. So your second attempt might actually work and produce a working (albeit non-human-readable) PLY file.
int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;
std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary);
ofs << "ply\n"
<< (is_little_endian
? "format binary_little_endian 1.0\n"
: "format binary_big_endian 1.0\n")
<< "element vertex " << i_max << "\n"
<< "property short x\n"
<< "property short y\n"
<< "property short z\n"
<< "end_header\n";
ofs.write((const char*)point_cloud_image.get_buffer(), i_max * 2 * 3);
ofs.close();
The is_little_endian check is optional and can be omitted, but it makes the code a little bit more portable.
int num = 1;
bool is_little_endian = *(char *)&num == 1;
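If the file really has to stay ASCII, the loop cannot go away, but the per-value cost can be reduced. The following is a minimal sketch, assuming C++17 is available: it formats the values with std::to_chars into one pre-reserved std::string instead of going through a std::stringstream. The function name format_points_ascii is hypothetical, and whether this is actually faster on your machine needs measuring.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: formats n_points (x, y, z) triples as "x y z\n" lines.
std::string format_points_ascii(const std::int16_t* data, std::size_t n_points)
{
    assert(data != nullptr || n_points == 0);
    std::string out;
    out.reserve(n_points * 21); // worst case: three times "-32768" plus separators
    char buf[8];                // enough for any int16_t in decimal
    for (std::size_t i = 0; i < n_points; ++i) {
        for (int c = 0; c < 3; ++c) {
            const auto res = std::to_chars(buf, buf + sizeof buf, data[3 * i + c]);
            out.append(buf, static_cast<std::size_t>(res.ptr - buf));
            out.push_back(c == 2 ? '\n' : ' ');
        }
    }
    return out;
}
```

The result can then be written in one call after the header, for example ofs.write(text.data(), text.size()) where text is the returned string.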

Related

Am I really copying the bytes or am I copying characters in this case?

I have a vector of unsigned char where I copy bytes in C++. I convert all primitive types to bytes and copy to this vector of char (which is interpreted as bytes in C++). Now I am copying also strings. But I am not sure if I am converting strings to bytes. If you take a look at my output when I am printing the vector of unsigned char I am printing bytes from double int float but I am printing the real string of my variable testString. So I suppose that I am not inserting bytes of this testString on my vector of unsigned char. How should I do that?
Thanks
const std::string lat = "lat->", alt = "alt->", lon = "lon->", testString = "TEST-STRING";
double latitude = 10.123456;
double longitude = 50.123456;
double altitude = 1.123456;
std::vector<unsigned char> result(
sizeof(latitude) + sizeof(longitude) + sizeof(altitude) + testString.length());
std::cout << "copying to the vector" << std::endl;
memcpy(result.data(), &longitude, sizeof(longitude));
memcpy(result.data() + sizeof(longitude), &latitude, sizeof(latitude));
memcpy(result.data() + sizeof(longitude) + sizeof(latitude), &altitude, sizeof(altitude));
memcpy(result.data() + sizeof(longitude) + sizeof(latitude) + sizeof(altitude), testString.c_str(),
testString.length()); // the vector has no room for the terminating '\0'
std::cout << "copied to the vector\n" << std::endl;
std::cout << "printing the vector" << std::endl;
for (unsigned int j = 0; j < result.size(); j++) {
std::cout << result[j];
}
std::cout << std::endl;
std::cout << "printed the vector\n" << std::endl;
// testing converting back ...................
std::cout << "printing back the original value" << std::endl;
double dLat, dLon, dAlt;
std::string value;
memcpy(&dLon, result.data(), sizeof(longitude));
memcpy(&dLat, result.data() + sizeof(longitude), sizeof(latitude));
memcpy(&dAlt, result.data() + sizeof(longitude) + sizeof(latitude), sizeof(altitude));
value.resize(testString.length());
memcpy(&value[0], result.data() + sizeof(longitude) + sizeof(latitude) + sizeof(altitude),
testString.size()); // copy only the string's bytes; sizeof(value.data()) is the size of a pointer
std::cout << alt << dAlt;
std::cout << lat << dLat;
std::cout << lon << dLon;
std::cout << " " << value << std::endl;
std::cout << "printed back the original value\n" << std::endl;
output:
copying to the vector
copied to the vector
printing the vector
[?�gI#m���5?$#l������?TEST-STRING
printed the vector
printing back the original value
alt->1.12346lat->10.1235lon->50.1235 TEST-STRING
printed back the original value

There's no problem with your code! You're printing the actual bytes of your variables. The bytes in a double can't really be interpreted as a text string (at least, it doesn't make sense if you do) but the bytes in a text string can, producing what you see.
Let's say you've got the following code (which is really just disguised C):
#include <cstdio>
int main(int argc, char *argv[]) {
struct {
double latitude;
double longitude;
char name[30];
} structure = {
53.6344,
126.5223167,
"Keyboard Mash"
};
printf("%f %f %s\n", structure.latitude, structure.longitude, structure.name);
for (size_t i = 0; i < sizeof(structure); i += 1) {
printf("%c", ((char*)&structure)[i]);
}
printf("\n");
}
This code would (probably) print:
53.634400 126.522317 Keyboard Mash
����������������Keyboard Mash�����������������
The first 16 bytes are from the doubles, and the next 30 are from the char[]. That's just how char[]s are stored! Your code is doing what you'd expect it to.
Of course, you can't rely on the bytes looking exactly like this: the byte layout of a double, the endianness and any padding in the struct are implementation-specific.

I feel like you were expecting something like: 128565TESTSTRING where 12, 85 and 65 are values of longitude, latitude and altitude. Well, that's not going to happen because you wrote 12 in the data, not "12"; therefore, it will give you the character whose ASCII code is 12. Maybe you could use something like sprintf() instead.
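To make the difference concrete, here is a small sketch that stores the same double twice: once as its raw bytes and once as decimal text produced by snprintf. The helper names raw_bytes and text_bytes are made up for this illustration, and the 64-byte scratch buffer assumes values of moderate magnitude.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Raw object representation: sizeof(double) bytes, not printable text.
std::vector<unsigned char> raw_bytes(double v)
{
    std::vector<unsigned char> out(sizeof v);
    std::memcpy(out.data(), &v, sizeof v);
    return out;
}

// Decimal text: the characters '1', '0', '.', ... as they would appear on screen.
std::vector<unsigned char> text_bytes(double v)
{
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "%.6f", v);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::vector<unsigned char>(buf, buf + n);
}
```

Printing the first vector byte by byte gives unreadable characters, while the second prints as the number itself. Only the first form can be copied back into a double with memcpy; the text form has to be parsed again, for example with strtod.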

Writing and reading a binary file to fill a vector - C++

I'm working on a project that involves binary files.
So I started researching about binary files but I'm still confused about how to write and fill a vector from that binary file that I wrote before
Here's the code for writing:
void binario(){
ofstream fout("./Binario/Data.AFe", ios::out | ios::binary);
vector<int> enteros;
enteros.push_back(1);
enteros.push_back(2);
enteros.push_back(3);
enteros.push_back(4);
enteros.push_back(5);
//fout.open()
//if (fout.is_open()) {
std::cout << "Entre al if" << '\n';
//while (!fout.eof()) {
std::cout << "Entre al while" << '\n';
std::cout << "Enteros size: "<< enteros.size() << '\n';
int size1 = enteros.size();
for (int i = 0; i < enteros.size(); i++) {
std::cout << "for " << i << '\n';
fout.write((char*)&size1, 4);
fout.write((char*)&enteros[i], size1 * sizeof(enteros));
//cout<< fout.get(entero[i])<<endl;
}
//fout.close();
//}
fout.close();
cout<<"copiado con exito"<<endl;
//}
}
Here's code for reading:
void leerBinario(){
vector<int> list2;
ifstream is("./Binario/Data.AFe", ios::binary);
int size2;
is.read((char*)&size2, 4);
list2.resize(size2);
is.read((char*)&list2[0], size2 * sizeof(list2));
std::cout << "Size del vector: " << list2.size() <<endl;
for (int i = 0; i < list2.size(); i++) {
std::cout << i << ". " << list2[i] << '\n';
}
std::cout << "Antes de cerrar" << '\n';
is.close();
}
I don't know if I'm writing correctly to the file, this is just a test so I don't mess up my main file, instead of writing numbers I need to save Objects that are stored in a vector and load them everytime the user runs the program.

Nope, you're a bit confused. You're writing the size in every iteration, and then you're doing something completely undefined when you try to write the value. You can actually do this without the loop, when you are using a vector.
fout.write(reinterpret_cast<const char*>(&size1), sizeof(size1));
fout.write(reinterpret_cast<const char*>(enteros.data()), size1 * sizeof(int));
And reading in:
is.read(reinterpret_cast<char*>(&list2[0]), size2 * sizeof(int));
To be more portable you might want to use data types that won't change (for example when you switch from 32-bit compilation to 64-bit). In that case, use the fixed-width types from <cstdint>, e.g. int32_t for both the size and value data.
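Put together, the write and read sides could look like the sketch below. The function names write_ints and read_ints are hypothetical. They take generic streams so the same code works with an ofstream/ifstream opened in binary mode, and they store the data in host byte order, so the file is only readable on machines with the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Writes a 32-bit element count followed by the raw elements (host byte order).
void write_ints(std::ostream& out, const std::vector<std::int32_t>& values)
{
    assert(values.size() <= 0xFFFFFFFFu);
    const std::uint32_t count = static_cast<std::uint32_t>(values.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(count * sizeof(std::int32_t)));
}

// Reads the count, then the elements; returns an empty vector on a short read.
std::vector<std::int32_t> read_ints(std::istream& in)
{
    std::uint32_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count))
        return {};
    std::vector<std::int32_t> values(count);
    if (!in.read(reinterpret_cast<char*>(values.data()),
                 static_cast<std::streamsize>(count * sizeof(std::int32_t))))
        return {};
    return values;
}
```

The size is written once, before the data, and the whole vector goes out in a single write call, which is the point the answer makes about not needing the loop.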

Access violation reading location using binary file

First off, I know there are posts with similar problems, but I cannot find the solution to mine in any of them.
This is for a programming assignment using binary and text files to store "Corporate sales data." (Division name, quarter, and sales), and then to search for specified records inside the binary data file and display them.
Here are the important parts of my code:
#include stuff
...
// Struct to hold division data
struct DIVISION_DATA_S
{
string divisionName;
int quarter;
double sales;
};
int main()
{
...
// Open the data file
fstream dataFile;
dataFile.open(dataFilePath, ios::in | ios::out | ios::binary);
... Get data from user, store in an instance of my struct ...
// Dump struct into binary file
dataFile.write(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
// Cycle through the targets file and display the record from divisiondata.dat for each entry
while(targetsFile >> targetDivisionName)
{
int targetQuarter; // Target quarter
string targetQuarterStr;
targetsFile.ignore(); // Ignore the residual '\n' from the ">>" read
getline(targetsFile, targetQuarterStr);
targetQuarter = atoi(targetQuarterStr.c_str()); // Parses into an int
cout << "Target: " << targetDivisionName << " " << targetQuarter << endl;
// Linear search the data file for the required name and quarter to find sales amount
double salesOfTarget;
bool isFound = false;
while (!isFound && !dataFile.eof())
{
cout << "Found division data: " << targetDivisionName << " " << targetQuarter << endl;
DIVISION_DATA_S divisionData;
// Read an object from the file, cast as DIVISION_DATA_S
dataFile.read(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
cout << "Successfully read data for " << targetDivisionName << " " << targetQuarter << endl
<< "Name: " << divisionData.divisionName << ", Q: " << divisionData.quarter << ", "
<< "Sales: " << divisionData.sales << endl;
// Test for a match of both fields
if (divisionData.divisionName == targetDivisionName && divisionData.quarter == targetQuarter)
{
isFound = true;
cout << "Match!" << endl;
salesOfTarget = divisionData.sales;
}
}
if (!isFound) // Error message if record is not found in data file
{
cout << "\nError. Could not find record for " << targetDivisionName
<< " division, quarter " << targetQuarter << endl;
}
else
{
// Display the corresponding record
cout << "Division: " << targetDivisionName << ", Quarter: " << targetQuarter
<< "Sales: " << salesOfTarget << endl;
totalSales += salesOfTarget; // Add current sales to the sales accumulator
numberOfSalesFound++; // Increment total number of sales found
}
}
Sorry for the lack of indent for the while loop, copy/paste kind of messed it up.
My problem appears when attempting to access information read from the binary file. For instance, when it tries to execute the cout statement I added for debugging, it gives me this error:
Unhandled exception at 0x0FED70B6 (msvcp140d.dll) in CorporateSalesData.exe: 0xC0000005: Access violation reading location 0x310A0D68.
Now, from what I have read, it seems that this means something is trying to read from the very low regions of memory, AKA something somewhere has something to do with a null pointer, but I can't imagine how that would appear. This whole read operation is copied exactly from my textbook, and I have no idea what a reinterpret_cast is, let alone how it works or how to fix errors with it. Please help?
EDIT: Thanks for all the help. To avoid complications or using something I don't fully understand, I'm gonna go with switching to a c-string for the divisionName.

dataFile.write(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
Works only if you have POD types. It does not work when you have a std::string in there. You'll need to use something along the lines of:
// Write the size of the string.
std::string::size_type size = divisionData.divisionName.size();
dataFile.write(reinterpret_cast<const char*>(&size), sizeof(size));
// Now write the string. c_str() already returns a const char*, so no cast is needed.
dataFile.write(divisionData.divisionName.c_str(), size);
// Write the quarter and the sales.
dataFile.write(reinterpret_cast<const char*>(&divisionData.quarter), sizeof(divisionData.quarter));
dataFile.write(reinterpret_cast<const char*>(&divisionData.sales), sizeof(divisionData.sales));
Change the read calls to match the write calls.
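The matching read has to pull the fields back in the same order and with the same sizes. Here is a sketch of both directions, using a stand-in struct DivisionData and hypothetical helpers write_record and read_record. It stores the length as a fixed 64-bit value rather than std::string::size_type, and it does not sanity-check the stored length, which real code reading untrusted files should do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct DivisionData   // stand-in for DIVISION_DATA_S
{
    std::string divisionName;
    int quarter;
    double sales;
};

void write_record(std::ostream& out, const DivisionData& d)
{
    const std::uint64_t size = d.divisionName.size();
    out.write(reinterpret_cast<const char*>(&size), sizeof size);
    out.write(d.divisionName.data(), static_cast<std::streamsize>(size));
    out.write(reinterpret_cast<const char*>(&d.quarter), sizeof d.quarter);
    out.write(reinterpret_cast<const char*>(&d.sales), sizeof d.sales);
}

// Returns false when the stream ends or a record is incomplete.
bool read_record(std::istream& in, DivisionData& d)
{
    std::uint64_t size = 0;
    if (!in.read(reinterpret_cast<char*>(&size), sizeof size))
        return false;
    std::string name(static_cast<std::size_t>(size), '\0');
    if (size > 0 && !in.read(&name[0], static_cast<std::streamsize>(size)))
        return false;
    d.divisionName = name;
    in.read(reinterpret_cast<char*>(&d.quarter), sizeof d.quarter);
    in.read(reinterpret_cast<char*>(&d.sales), sizeof d.sales);
    return static_cast<bool>(in);
}
```

A search loop can then be written as while (read_record(dataFile, divisionData)) { ... }, which also avoids testing eof() before the read.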

// Dump struct into binary file
dataFile.write(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
/*...*/
// Read an object from the file, cast as DIVISION_DATA_S
dataFile.read(reinterpret_cast<char *>(&divisionData), sizeof(divisionData));
This will categorically not work under any circumstances.
std::string uses heap-allocated pointers to store any string data it contains. What you're writing to the file is not the contents of the string, but simply the address where the string's data is located (along with some meta-data). If you arbitrarily read those pointers and treat them as memory (like you are in the cout statement) you'll reference deleted memory.
You have two options.
If all you want is a struct that can be easily serialized, then simply convert it like so:
// Struct to hold division data
struct DIVISION_DATA_S
{
char divisionName[500];
int quarter;
double sales;
};
Of course, with this style, you're limited to interacting with the name as a c-string, and also are limited to 500 characters.
The other option is to properly serialize this object.
// Struct to hold division data
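// Requires <array>, <cstdint>, <cstring> and <string>.
struct DIVISION_DATA_S
{
string divisionName;
int quarter;
double sales;
string serialize() const { // Could also return std::vector<char>, but a string makes writing easier.
string output;
std::array<char, 8> size_array;
// 8-byte little-endian length of the name
std::uint64_t size_of_string = divisionName.size();
for(char & c : size_array) {
c = static_cast<char>(size_of_string & 0xFF);
size_of_string >>= 8;
}
output.insert(output.end(), size_array.begin(), size_array.end());
output.insert(output.end(), divisionName.begin(), divisionName.end());
// 4-byte little-endian quarter (only the first 4 bytes of size_array are used)
std::uint32_t temp_quarter = static_cast<std::uint32_t>(quarter);
for(char & c : size_array) {
c = static_cast<char>(temp_quarter & 0xFF);
temp_quarter >>= 8;
}
output.insert(output.end(), size_array.begin(), size_array.begin() + 4);
// 8-byte little-endian bit pattern of sales; memcpy, because reinterpret_cast cannot convert a double to an integer
std::uint64_t temp_sales;
static_assert(sizeof(temp_sales) == sizeof(sales), "double must be 64 bits");
std::memcpy(&temp_sales, &sales, sizeof(sales));
for(char & c : size_array) {
c = static_cast<char>(temp_sales & 0xFF);
temp_sales >>= 8;
}
output.insert(output.end(), size_array.begin(), size_array.end());
return output;
}
// Returns the number of bytes consumed. Assumes input holds at least one complete record.
size_t unserialize(const string & input) {
std::uint64_t size_of_string = 0;
for(int i = 7; i >= 0; i--) {
size_of_string <<= 8;
size_of_string += static_cast<unsigned char>(input[i]);
}
const size_t name_begin = 8;
const size_t quarter_begin = name_begin + size_of_string;
const size_t sales_begin = quarter_begin + 4;
divisionName = input.substr(name_begin, size_of_string);
std::uint32_t temp_quarter = 0;
for(size_t i = quarter_begin + 4; i-- > quarter_begin; ) {
temp_quarter <<= 8;
temp_quarter += static_cast<unsigned char>(input[i]);
}
quarter = static_cast<int>(temp_quarter);
std::uint64_t temp_sales = 0;
for(size_t i = sales_begin + 8; i-- > sales_begin; ) {
temp_sales <<= 8;
temp_sales += static_cast<unsigned char>(input[i]);
}
std::memcpy(&sales, &temp_sales, sizeof(sales));
return 8 + size_of_string + 4 + 8;
}
};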
Writing to files is pretty easy:
dataFile << divisionData.serialize();
Reading can be a little harder:
stringstream ss;
ss << dataFile.rdbuf();
string file_data = ss.str();
size_t size = divisionData.unserialize(file_data);
file_data = file_data.substr(size);
size = divisionData.unserialize(file_data);
/*...*/
By the way, I haven't checked my code for syntax or completeness. This example is meant to serve as a reference for the kind of code that you'd need to write to properly serialize/unserialize complex objects. I believe it to be correct, but I wouldn't just throw it in untested.

Welcome to the world of serialization. You are trying to 'bit blit' your structure into a file. This only works for very simple types (int, float, char[xxx]) where the data is actually inline. And even when it does work you are stuck with reloading the data into the same type of machine (same word size, same endianness).
What you need to do is serialize the data and then deserialize it back. You can invent ways of doing that yourself or you can use one of many standards. There are 2 basic types: binary (efficient, not human readable) and text (less efficient but human readable).
text: json, yaml, xml, csv
binary: protobuf
boost has a serialization library http://www.boost.org/doc/libs/1_61_0/libs/serialization/doc/
Also you might like to look here
https://isocpp.org/wiki/faq/serialization

C++ Reading back "incorrect" values from binary file?

The project I'm working on has a custom file format consisting of a header of a few different variables, followed by the pixel data. My colleagues have developed a GUI where processing, writing, reading and displaying this type of file format works fine.
But my problem is that, while I have assisted in writing the code for writing data to disk, I cannot myself read this kind of file and get satisfactory values back. I am able to read the first variable back (char array) but not the following value(s).
So the file format matches the following structure:
typedef struct {
char hxtLabel[8];
u64 hxtVersion;
int motorPositions[9];
int filePrefixLength;
char filePrefix[100];
..
} HxtBuffer;
In the code, I create an object of the above structure and then set these example values:
setLabel("MY_LABEL");
setFormatVersion(3);
setMotorPosition( 2109, 5438, 8767, 1234, 1022, 1033, 1044, 1055, 1066);
setFilePrefixLength(7);
setFilePrefix( string("prefix_"));
setDataTimeStamp( string("000000_000000"));
My code for opening the file:
// Open data file, binary mode, reading
ifstream datFile(aFileName.c_str(), ios::in | ios::binary);
if (!datFile.is_open()) {
cout << "readFile() ERROR: Failed to open file " << aFileName << endl;
return false;
}
// How large is the file?
datFile.seekg(0, datFile.end);
int length = datFile.tellg();
datFile.seekg(0, datFile.beg);
cout << "readFile() file " << setw(70) << aFileName << " is: " << setw(15) << length << " long\n";
// Allocate memory for buffer:
char * buffer = new char[length];
// Read data as one block:
datFile.read(buffer, length);
datFile.close();
/// Looking at the start of the buffer, I should be seeing "MY_LABEL"?
cout << "buffer: " << buffer << " " << *(buffer) << endl;
int* mSSX = reinterpret_cast<int*>(*(buffer+8));
int* mSSY = reinterpret_cast<int*>(&buffer+9);
int* mSSZ = reinterpret_cast<int*>(&buffer+10);
int* mSSROT = reinterpret_cast<int*>(&buffer+11);
int* mTimer = reinterpret_cast<int*>(&buffer+12);
int* mGALX = reinterpret_cast<int*>(&buffer+13);
int* mGALY = reinterpret_cast<int*>(&buffer+14);
int* mGALZ = reinterpret_cast<int*>(&buffer+15);
int* mGALROT = reinterpret_cast<int*>(&buffer+16);
int* filePrefixLength = reinterpret_cast<int*>(&buffer+17);
std::string filePrefix; std::string dataTimeStamp;
// Read file prefix character by character into stringstream object
std::stringstream ss;
char* cPointer = (char *)(buffer+18);
int k;
for(k = 0; k < *filePrefixLength; k++)
{
//read string
char c;
c = *cPointer;
ss << c;
cPointer++;
}
filePrefix = ss.str();
// Read timestamp character by character into stringstream object
std::stringstream timeStampStream;
/// Need not increment cPointer, already pointing at the 1st char of timeStamp
for (int l= 0; l < 13; l++)
{
char c;
c = * cPointer;
timeStampStream << c;
}
dataTimeStamp = timeStampStream.str();
cout << 25 << endl;
cout << " mSSX: " << mSSX << " mSSY: " << mSSY << " mSSZ: " << mSSZ;
cout << " mSSROT: " << mSSROT << " mTimer: " << mTimer << " mGALX: " << mGALX;
cout << " mGALY: " << mGALY << " mGALZ: " << mGALZ << " mGALROT: " << mGALROT;
Finally, what I see is shown below. I added the 25 just to double check that not everything was coming out in hexadecimal. As you can see, I am able to see the label "MY_LABEL" as expected. But the 9 motorPositions all come out looking suspiciously like addresses and not values. The file prefix and the data timestamp (which should be strings, or at least characters) are just empty.
buffer: MY_LABEL M
25
mSSX: 0000000000000003 mSSY: 00000000001BF618 mSSZ: 00000000001BF620 mSSROT: 00000000001BF628 mTimer: 00000000001BF630 mGALX: 00000000001BF638 mGALY: 00000000001BF640 mGALZ: 00000000001BF648 mGALROT: 00000000001BF650filePrefix: dataTimeStamp:
I'm sure the solution can't be too complicated, but I reached a stage where I had this just spinning and I cannot make sense of things.
Many thanks for reading this somewhat long post.
-- Edit--
I might hit the maximum length allowed for a post, but just in case I thought I shall post the code that generates the data that I'm trying to read back:
bool writePixelOutput(string aOutputPixelFileName) {
// Write pixel histograms out to binary file
ofstream pixelFile;
pixelFile.open(aOutputPixelFileName.c_str(), ios::binary | ios::out | ios::trunc);
if (!pixelFile.is_open()) {
LOG(gLogConfig, logERROR) << "Failed to open output file " << aOutputPixelFileName;
return false;
}
// Write binary file header
string label("MY_LABEL");
pixelFile.write(label.c_str(), label.length());
pixelFile.write((const char*)&mFormatVersion, sizeof(u64));
// Include File Prefix/Motor Positions/Data Time Stamp - if format version > 1
if (mFormatVersion > 1)
{
pixelFile.write((const char*)&mSSX, sizeof(mSSX));
pixelFile.write((const char*)&mSSY, sizeof(mSSY));
pixelFile.write((const char*)&mSSZ, sizeof(mSSZ));
pixelFile.write((const char*)&mSSROT, sizeof(mSSROT));
pixelFile.write((const char*)&mTimer, sizeof(mTimer));
pixelFile.write((const char*)&mGALX, sizeof(mGALX));
pixelFile.write((const char*)&mGALY, sizeof(mGALY));
pixelFile.write((const char*)&mGALZ, sizeof(mGALZ));
pixelFile.write((const char*)&mGALROT, sizeof(mGALROT));
// Determine length of mFilePrefix string
int filePrefixSize = (int)mFilePrefix.size();
// Write prefix length, followed by prefix itself
pixelFile.write((const char*)&filePrefixSize, sizeof(filePrefixSize));
size_t prefixLen = 0;
if (mFormatVersion == 2) prefixLen = mFilePrefix.size();
else prefixLen = 100;
pixelFile.write(mFilePrefix.c_str(), prefixLen);
pixelFile.write(mDataTimeStamp.c_str(), mDataTimeStamp.size());
}
// Continue writing header information that is common to both format versions
pixelFile.write((const char*)&mRows, sizeof(mRows));
pixelFile.write((const char*)&mCols, sizeof(mCols));
pixelFile.write((const char*)&mHistoBins, sizeof(mHistoBins));
// Write the actual data - taken out for brevity's sake
// ..
pixelFile.close();
LOG(gLogConfig, logINFO) << "Written output histogram binary file " << aOutputPixelFileName;
return true;
}
-- Edit 2 (11:32 09/12/2015) --
Thank you for all the help, I'm closer to solving the issue now. Going with the answer from muelleth, I try:
/// Read into char buffer
char * buffer = new char[length];
datFile.read(buffer, length);// length determined by ifstream.seekg()
/// Let's try HxtBuffer
HxtBuffer *input = new HxtBuffer;
cout << "sizeof HxtBuffer: " << sizeof *input << endl;
memcpy(input, buffer, length);
I can then display the different struct variables:
qDebug() << "Slice BUFFER label " << QString::fromStdString(input->hxtLabel);
qDebug() << "Slice BUFFER version " << QString::number(input->hxtVersion);
qDebug() << "Slice BUFFER hxtPrefixLength " << QString::number(input->filePrefixLength);
for (int i = 0; i < 9; i++)
{
qDebug() << i << QString::number(input->motorPositions[i]);
}
qDebug() << "Slice BUFFER filePrefix " << QString::fromStdString(input->filePrefix);
qDebug() << "Slice BUFFER dataTimeStamp " << QString::fromStdString(input->dataTimeStamp);
qDebug() << "Slice BUFFER nRows " << QString::number(input->nRows);
qDebug() << "Slice BUFFER nCols " << QString::number(input->nCols);
qDebug() << "Slice BUFFER nBins " << QString::number(input->nBins);
The output is then mostly as expected:
Slice BUFFER label "MY_LABEL"
Slice BUFFER version "3"
Slice BUFFER hxtPrefixLength "2"
0 "2109"
1 "5438"
...
7 "1055"
8 "1066"
Slice BUFFER filePrefix "-1"
Slice BUFFER dataTimeStamp "000000_000000P"
Slice BUFFER nRows "20480"
Slice BUFFER nCols "256000"
Slice BUFFER nBins "0"
EXCEPT, dataTimeStamp, which is 13 chars long, displays instead 14 chars. The 3 variables that follow: nRows, nCols and nBins are then incorrect. (Should be nRows=80, nCols=80, nBins=1000). My guess is that the bits belonging to the 14th char of dataTimeStamp should be read along with nRows, and so cascade on to produce the correct nCols and nBins.
I have separately verified (not shown here) using qDebug that what I'm writing into the file, really are the values I expect, and their individual sizes.

I personally would try to read exactly the number of bytes your struct is from the file, i.e. something like
int length = sizeof(HxtBuffer);
and then simply use memcpy to assign a local structure from the read buffer:
HxtBuffer input;
memcpy(&input, buffer, length);
You can then access your data e.g. like:
std::cout << "Data: " << input.hxtLabel << std::endl;

Why do you read to buffer, instead of using the structure for reading?
HxtBuffer data;
datFile.read(reinterpret_cast<char *>(&data), sizeof data);
if(!datFile || datFile.gcount() != static_cast<std::streamsize>(sizeof data))
throw std::runtime_error("short read"); // needs <stdexcept>
// Can use data.
If you want to read into a character buffer, then your way of getting the data out of it is just wrong. You probably want to do something like this.
char *buf_offset=buffer+8+sizeof(u64); // Skip label (8 chars) and version (int64)
int mSSX = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSY = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSZ = *reinterpret_cast<int*>(buf_offset);
/* etc. */
Or, a little better (provided you don't change the contents of the buffer).
int *ptr_motors=reinterpret_cast<int *>(buffer+8+sizeof(u64));
int &mSSX = ptr_motors[0];
int &mSSY = ptr_motors[1];
int &mSSZ = ptr_motors[2];
/* etc. */
Notice that I don't declare mSSX, mSSY etc. as pointers. Your code was printing them as addresses because you told the compiler that they were addresses (pointers).
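Casting the char buffer to int* relies on the data being suitably aligned, and copying the whole file into an HxtBuffer additionally relies on the struct having no padding that the file lacks. A sketch of an alternative copies each field out with memcpy at a running offset, following the order in which writePixelOutput() writes them. The helper read_field and the struct HxtHeaderStart are hypothetical. The sketch assumes u64 is a 64-bit unsigned integer, a format version greater than 1 (otherwise the motor positions are not in the file), and a file written on a machine with the same endianness and sizeof(int).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Copies one trivially copyable value out of buf at offset and advances offset.
template <typename T>
T read_field(const char* buf, std::size_t& offset)
{
    T value;
    std::memcpy(&value, buf + offset, sizeof value);
    offset += sizeof value;
    return value;
}

struct HxtHeaderStart   // hypothetical: only the first fields of the header
{
    std::string label;
    std::uint64_t version;
    int motorPositions[9];
    int filePrefixLength;
};

HxtHeaderStart parse_header_start(const char* buf, std::size_t length)
{
    assert(length >= 8 + sizeof(std::uint64_t) + 10 * sizeof(int));
    (void)length;             // only used by the assert above
    HxtHeaderStart h;
    std::size_t offset = 0;
    h.label.assign(buf, 8);   // the label is 8 chars with no terminating '\0'
    offset += 8;
    h.version = read_field<std::uint64_t>(buf, offset);
    for (int& m : h.motorPositions)
        m = read_field<int>(buf, offset);
    h.filePrefixLength = read_field<int>(buf, offset);
    return h;
}
```

The remaining fields (prefix, timestamp, rows, columns, bins) can be read the same way by continuing from the final offset, so a 13-character timestamp is followed directly by the next integer without any struct padding getting in the way.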

Faster Alternative to std::ofstream

I generate a set of data files. As the files are supposed to be readable, they are text files (as opposed to binary files).
To output information to my files, I used the very convenient std::ofstream object.
In the beginning, when the data to be exported was smaller, the time needed to write the files was not noticeable. However, as the information to be exported has accumulated, it now takes around 5 minutes to generate them.
As I started being bothered by the waiting, my question is obvious: is there any faster alternative to std::ofstream? If there is, will it be worth rewriting my application? In other words, could the time saved be 50% or more? Thank you.
Update:
I was asked to show you my code that generates the above files, so here you are - the most time consuming loop:
ofstream fout;
fout.open(strngCollectiveSourceFileName,ios::out);
fout << "#include \"StdAfx.h\"" << endl;
fout << "#include \"Debug.h\"" << endl;
fout << "#include \"glm.hpp\"" << endl;
fout << "#include \"" << strngCollectiveHeaderFileName.substr( strngCollectiveHeaderFileName.rfind(TEXT("\\")) + 1) << "\"" << endl << endl;
fout << "using namespace glm;" << endl << endl << endl;
for (unsigned int nSprite = 0; nSprite < vpTilesetSprites.size(); nSprite++ )
{
for(unsigned int nFrameSet = 0; nFrameSet < vpTilesetSprites[nSprite]->vpFrameSets.size(); nFrameSet++)
{
// display index definition
fout << "// Index Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";
string strngIndexSignature = strngIndexDefinitionSignature;
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("IndexData") );
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );
fout << strngIndexSignature << "[4] = {0, 1, 2, 3};\t\t" << "// " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << ": Index Definition\n\n";
// display vertex definition
fout << "// Vertex Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";
string strngVertexSignature = strngVertexDefinitionSignature;
strngVertexSignature.replace(strngVertexSignature.find(TEXT("#aVertexArrayName#")), strlen(TEXT("#aVertexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("VertexData") );
strngVertexSignature.replace(strngVertexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );
fout << strngVertexSignature << "[" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFramesCount() << "] =\n";
fout << "{\n";
for (int nFrameNo = 0; nFrameNo < vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFramesCount(); nFrameNo++)
{
fout << "\t" << "{{ vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vTextureUV.fv << "f) }, // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 1: vec4(x, y, z, w), vec2(u, v) \n";
fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vTextureUV.fv << "f) }, // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 2: vec4(x, y, z, w), vec2(u, v) \n";
fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vTextureUV.fv << "f) }, // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 3: vec4(x, y, z, w), vec2(u, v) \n";
fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vTextureUV.fv << "f) }}, // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 4: vec4(x, y, z, w), vec2(u, v) \n\n";
}
fout << "};\n\n\n\n";
}
}
fout.close();

If you don't want to use C file I/O, you can give FastFormat a try. Look at its comparison page for more info.

How are vpTilesetSprites and vpTilesetSprites[nSprite] stored? Are they implemented with lists or arrays? There is a lot of indexed access to them, and if they are list-like structures, you'll spend a lot of extra time following needless pointers. Ed S.'s comment is right: giving the long indexed expressions temporary variables and adding line breaks could make the code easier to read, and maybe faster, too:
fout << "// Index Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";
string strngIndexSignature = strngIndexDefinitionSignature;
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("IndexData") );
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );
vs
string idxsig = strngIndexDefinitionSignature;
auto* sp = vpTilesetSprites[nSprite];   // assumes the containers hold raw pointers, as the -> usage suggests
auto* fs = sp->vpFrameSets[nFrameSet];
fout << "// Index Definition: " << fs->GetLongDescription() << "\n";
idxsig.replace(idxsig.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")),
TEXT("a") + sp->GetObjectName() + fs->GetFrameSetName() + TEXT("IndexData"));
idxsig.replace(idxsig.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")),
strngCollectiveArrayClassName);
But, the much bigger problem is how you're using strings as templates; you're looking for a given text string (and computing the length of your needle string every single time you need it!) over and over again.
Consider this: You're performing the find and replace operations nSprite * nFrameSet times. Each time through, this loop:
makes a copy of strngIndexDefinitionSignature
creates four temporary string objects when concatenating static and dynamic strings
computes strlen(TEXT("#ClassName#"))
computes strlen(TEXT("#aIndexArrayName#"))
finds the start point of both placeholders
replaces both placeholders with the new text
And that's just the first four lines of your loop.
Can you replace your strngIndexDefinitionSignature with a format string? I assume it currently looks like this:
"flubber #aIndexArrayName# { blubber } #ClassName# blorp"
If you rewrite it like this:
"flubber a%s%sIndexData { blubber } %s blorp"
Then your two find and replace lines can be replaced with:
sprintf(out, index_def_sig, sp->GetObjectName().c_str(), fs->GetFrameSetName().c_str(),
strngCollectiveArrayClassName.c_str()); // %s needs C strings; .c_str() assumes these are std::string
This would remove two find() operations, two replace() operations, creating and destroying four temporary string objects, a string duplicate that was promptly over-written with two replace() calls, and two strlen() operations that return the same result every time (but aren't actually needed anyway).
You can then output your string with << as usual. Or, you can change sprintf(3) to fprintf(3), and avoid even the temporary C string.
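As a rough sketch of the format-string idea, here is a small helper built on std::snprintf. The function name format_index_signature, the template text, and the three std::string parameters are all made up for illustration; the real signature string and getters may look different.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: fills a printf-style template in one pass.
// The template mirrors the placeholder example, not the asker's real signature.
std::string format_index_signature(const std::string& object_name,
                                   const std::string& frame_set_name,
                                   const std::string& class_name)
{
    static const char kTemplate[] = "flubber a%s%sIndexData { blubber } %s blorp";

    // A call with a null buffer and size 0 only reports the length needed.
    const int needed = std::snprintf(nullptr, 0, kTemplate, object_name.c_str(),
                                     frame_set_name.c_str(), class_name.c_str());
    assert(needed >= 0);

    // One extra byte for the terminating NUL that snprintf always writes.
    std::string out(static_cast<std::size_t>(needed) + 1, '\0');
    std::snprintf(&out[0], out.size(), kTemplate, object_name.c_str(),
                  frame_set_name.c_str(), class_name.c_str());
    out.resize(static_cast<std::size_t>(needed)); // drop the trailing NUL
    return out;
}
```

Calling format_index_signature(sp->GetObjectName(), fs->GetFrameSetName(), strngCollectiveArrayClassName) inside the loop would leave no find(), replace(), or strlen() calls there, assuming those three values are std::string. The measuring call costs a second pass over the template; a fixed-size stack buffer would avoid it if you know an upper bound.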

Assuming you do it in large enough chunks, calling write() directly might be faster; that said, it's more likely that your biggest bottleneck doesn't have anything directly to do with std::ofstream. The most obvious thing is to make sure you aren't using std::endl (because flushing the stream frequently will kill performance). Beyond that, I would suggest profiling your app to see where it's actually spending the time.
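To illustrate both points (large chunks, no std::endl), here is a minimal sketch that formats everything into one in-memory string and hands it to the stream in a single write() call. The function name write_lines_in_one_chunk and the int payload are placeholders, not anything from the question.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Builds the whole text in memory, then issues one write() call.
// Lines end with '\n' rather than std::endl, so nothing flushes per line.
bool write_lines_in_one_chunk(const std::string& path, const std::vector<int>& values)
{
    assert(!path.empty());

    std::string buffer;
    buffer.reserve(values.size() * 8); // rough guess at bytes per line
    for (int v : values) {
        buffer += std::to_string(v);
        buffer += '\n';
    }

    std::ofstream out(path, std::ios::out | std::ios::binary);
    if (!out) return false;
    out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return static_cast<bool>(out.flush()); // a single flush at the very end
}
```

Whether this beats plain operator<< depends on the data and the standard library, so it is worth measuring before and after.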

The performance of ostream is probably not your actual issue; I suggest using a profiler to determine where your real bottlenecks are. If ostream turns out to be your actual problem, you can drop down to <cstdio> and use fprintf(FILE*, const char*, ...) for formatted output to a file handle.
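If profiling does point at the stream, the stdio route might look roughly like this. The Vertex struct and write_vertices_stdio are invented for the example; only fopen, fprintf, and fclose are the real library calls.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical vertex record standing in for whatever the real code formats.
struct Vertex { float x, y, z; };

// Writes one formatted line per vertex through a C FILE handle.
bool write_vertices_stdio(const std::string& path, const Vertex* verts, int count)
{
    assert(verts != nullptr || count == 0);

    std::FILE* f = std::fopen(path.c_str(), "w");
    if (!f) return false;

    bool ok = true;
    for (int i = 0; i < count && ok; ++i) {
        // %f prints six decimals, comparable to std::fixed with default precision.
        ok = std::fprintf(f, "vec4(%ff, %ff, %ff, 1.0f)\n",
                          verts[i].x, verts[i].y, verts[i].z) >= 0;
    }
    // fclose flushes the stdio buffer, so its result matters as well.
    return std::fclose(f) == 0 && ok;
}
```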

The best answer will depend on what sort of text you are generating, and how you are generating it. C++ streams can be slow, but that is mostly because they can also do a lot more for you, such as locale-dependent formatting.
You may find speedups with streams by bypassing some of the formatting (e.g. ostream::write), or by writing characters directly to a streambuf instead (streambuf::sputn). Sometimes increasing the buffer size on the relevant streambuf helps (via streambuf::pubsetbuf).
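A minimal sketch of the streambuf route, assuming a 1 MiB buffer is a sensible size for your data (that number is a guess, as is the helper name write_via_streambuf). When and whether pubsetbuf takes effect on a file stream is implementation-defined; libstdc++ expects the call before open(), and other standard libraries may behave differently.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Writes lines straight into the stream buffer, skipping operator<< formatting.
bool write_via_streambuf(const std::string& path, const std::vector<std::string>& lines)
{
    assert(!path.empty());

    // Declared before the stream so it is destroyed after the stream closes.
    std::vector<char> big_buffer(1 << 20);

    std::ofstream out;
    // Request the larger buffer before opening the file (implementation-defined effect).
    out.rdbuf()->pubsetbuf(big_buffer.data(),
                           static_cast<std::streamsize>(big_buffer.size()));
    out.open(path, std::ios::out | std::ios::binary);
    if (!out) return false;

    std::streambuf* sb = out.rdbuf();
    for (const std::string& line : lines) {
        const std::streamsize n = static_cast<std::streamsize>(line.size());
        if (sb->sputn(line.data(), n) != n) return false;
        if (sb->sputc('\n') == std::char_traits<char>::eof()) return false;
    }
    return sb->pubsync() == 0; // push buffered bytes to the file
}
```

The output is the same whether or not the library honors the buffer request; only the number of underlying writes changes.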
If this isn't good enough, you might want to try C-style stdio files, e.g. fopen, fprintf, etc. It takes a little while to get used to format strings if you haven't worked with them before, but the performance is usually pretty good.
For the absolute top performance you usually have to go to OS-specific routines. Sometimes the direct low-level file routines are significantly better than the C stdio, but sometimes not. For example, I've seen some people say WriteFile on Win32 is the fastest method on Windows, whereas some Google hits report it as being slower than stdio. Another approach might be a memory-mapped file, e.g. mmap + msync. This maps the file into your address space, so you write to memory and the OS writes the actual data to disk in large blocks, which is likely to be near optimal. However, you run the risk of losing all the data if a crash occurs halfway through, which may or may not be a problem for you.
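For the memory-mapped variant, a POSIX-only sketch could look like the following. It assumes the full text is already in memory and its final size is known, which is what lets the file be sized up front; write_with_mmap is a made-up name. This will not build on Windows, which would need CreateFileMapping and MapViewOfFile instead.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// POSIX-only: size the file, map it, copy the text in, sync, unmap.
bool write_with_mmap(const std::string& path, const std::string& text)
{
    assert(!text.empty()); // mmap rejects zero-length mappings

    const int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    const std::size_t len = text.size();
    // The file must already be as large as the region we intend to map.
    if (::ftruncate(fd, static_cast<off_t>(len)) != 0) { ::close(fd); return false; }

    void* map = ::mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { ::close(fd); return false; }

    std::memcpy(map, text.data(), len);

    // MS_SYNC blocks until the dirty pages have been written back.
    const bool synced = ::msync(map, len, MS_SYNC) == 0;
    const bool unmapped = ::munmap(map, len) == 0;
    const bool closed = ::close(fd) == 0;
    return synced && unmapped && closed;
}
```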
