Binary reader not triggering eof bit when reading exact number of bytes - c++

I am writing images to a binary file using this code:
std::ofstream edgefile("C:\\****\\edge.bin", std::ofstream::binary | std::ofstream::app | std::ofstream::out);
Mat edges;
Canny(bilat, edges, cthr1, cthr2, 3); //cany sliders
if (writeedge){
int rows = edges.rows;
int cols = edges.cols;
edgefile.write(reinterpret_cast<const char *>(&rows), sizeof(int));
edgefile.write(reinterpret_cast<const char *>(&cols), sizeof(int));
edgefile.write(reinterpret_cast<char*>(edges.data), edges.rows*edges.cols*sizeof(uchar));
cout << "writen r:" << rows << "C: " << cols << "Bytes: " << edges.rows*edges.cols*sizeof(uchar) << endl;
}
And then reading the same images with this:
std::ifstream infile;
int main(int argc, char* argv[])
{
int * ptr;
ptr = new int;
int rows;
int cols;
infile.open("C:\\****\\edge.bin", std::ofstream::binary | std::ofstream::app | std::ofstream::in);
while (!infile.eof())
{
infile.read(reinterpret_cast<char*>(ptr), sizeof(int));
rows = *ptr;
infile.read(reinterpret_cast<char*>(ptr), sizeof(int));
cols = *ptr;
Mat ed(rows, cols, CV_8UC1, Scalar::all(0));
infile.read(reinterpret_cast<char*>(ed.data), rows * cols * (sizeof uchar));
cout << "writen r: " << rows << " C: " << cols << " Bytes: " << rows * cols * (sizeof uchar) << endl;
imshow("God Knows", ed);
cvWaitKey();
}
infile.close();
return 0;
}
The images are read accurately; however, the EOF bit is not set after the last image, so the loop runs once more, reuses the last ptr value, and reads another blank image at the end. After this the loop ends. How can I check whether the next read would hit EOF without changing the current read position?
(I know that if 1 more byte would be read it would trigger the EOF bit)

The EOF bit is only set once you try to read past the end of the file; that is just how streams work.
You can easily restructure the main loop to check the status after the first read. This works because read returns a reference to the stream, and converting that reference to bool checks whether the stream is still in a good state (i.e. the read did not fail, which it does when it hits EOF before extracting all the requested bytes).
while (infile.read(reinterpret_cast<char*>(ptr), sizeof(int)))
{
// ...

Related

std::fstream read block of data from file and write data back to file until EOF

I'm reading blocks of data from a file a few bytes at a time (e.g. 3 bytes per read/write), writing the same 3 bytes back to the very same position inside the file, and looping until there are no more blocks to read.
In other words, I'm trying to rewrite the file with its own contents.
However, the final output isn't the same as it was in the beginning.
The following sample code reads 3 bytes per iteration from a file "sample.txt"; the file contents are simple:
0123456789
after reading data and writing data back to file, the contents are:
012345345345
As you can see, the data doesn't get rewritten correctly for some reason.
#include <fstream>
#include <iostream>
using namespace std;
#define BLOCK_SIZE 3
int main()
{
// open file
fstream file;
file.open("sample.txt", ios::binary | ios::out | ios::in);
// determine size and number of blocks to read
file.seekg(0, ios::end);
streampos size = file.tellg();
int blocks = size / BLOCK_SIZE;
cout << "size:\t" << size << endl;
if (size % BLOCK_SIZE != 0)
{
++blocks;
}
cout << "blocks:\t" << blocks << endl;
// return to beginning
file.seekg(ios::beg);
// we will read data here
unsigned char* data = new unsigned char[BLOCK_SIZE];
streampos pos;
// read blocks of data and write data back
for (int i = 0; i < blocks; ++i)
{
pos = file.tellg();
cout << "before read:\t" << pos << endl;
// read block
file.read(reinterpret_cast<char*>(data), BLOCK_SIZE);
cout << "after read:\t" << file.tellg() << endl;
// write same block back to same position
file.seekp(pos);
cout << "before write:\t" << file.tellg() << endl;
file.write(reinterpret_cast<char*>(data), BLOCK_SIZE);
cout << "after write:\t" << file.tellg() << endl;
// reset buffer
memset(data, 0, BLOCK_SIZE);
}
file.close();
delete[] data;
cin.get();
return 0;
}
Do you see what could be the reason for bad overwrite?
EDIT:
Sorry, I can't see how the linked duplicate answers my question, I'm simply unable to apply given answer to the code above.
Your code does not handle the EOF condition well, and leaves the stream in a bad state after trying to read past the end of the file. On my system, this results in all further calls to the stream having no effect. I bet that isn't the case on your system (which I suspect is a bug in its iostream implementation). I re-did your code to handle the EOF condition correctly, and also to be a lot cleaner in a few other ways:
#include <fstream>
#include <iostream>
using namespace std;
const int BLOCK_SIZE = 3;
int main()
{
// open file
fstream file;
file.open("sample.txt", ios::binary | ios::out | ios::in);
// we will read data here
bool found_eof = false;
// read blocks of data and write data back
while (!found_eof)
{
unsigned char data[BLOCK_SIZE] = {0};
char * const data_as_char = reinterpret_cast<char *>(data);
streampos const pos = file.tellp();
int count_to_write = BLOCK_SIZE;
cout << "before read:\t" << file.tellg() << ' ' << pos << '\n';
// read block
if (!file.read(data_as_char, BLOCK_SIZE)) {
found_eof = true;
count_to_write = file.gcount();
file.clear();
cout << "Only " << count_to_write << " characters extracted.\n";
}
cout << "after read:\t" << file.tellg() << ' ' << file.tellp() << '\n';
// write same block back to same position
file.seekp(pos);
cout << "before write:\t" << file.tellg() << ' ' << file.tellp() << '\n';
file.write(data_as_char, count_to_write);
cout << "after write:\t" << file.tellg() << ' ' << file.tellp() << '\n';
file.seekp(file.tellp());
}
file.close();
cin.get();
return 0;
}
But this is not fundamentally different. Both versions work the same for me. I'm on Linux with g++.
Following the linked possible duplicate, I would also suggest adding this just before the closing } of your for loop:
file.seekp(file.tellp());
I've put that in my code in the appropriate place.

vector of structs with weird behavior c++

I have a problem with an assignment. I have to read a .ts file, read the packets inside it, and extract header information from each packet.
I have created a struct Packet that holds all the header info, and I also have a vector into which I push_back each Packet.
The problem is that the for loop stops for some reason on the 163rd iteration. If I loop until, let's say, i=160, then the code exits the loop, but when I print vector.size() I get a really huge number, which doesn't make sense. I guess it should be an integer equal to the number of Packets pushed back. Here is the code that I have so far:
int main() {
FILE *ts_file = NULL;
ts_file = fopen64("/home/ddd/Desktop/Assignment/Streams/ddd.ts", "rb");
if (ts_file == NULL){
cout << "No file detected on this path, try again" << endl; // prints !!!Hello World!!!
}
TS_Analyzer *ts_analyzer;
ts_analyzer->parse_file(ts_file);
cout << "Finished main" << endl;
return 0;
}
void TS_Analyzer::parse_file(FILE *ts_file){
cout << "Inside parser" << endl;
fseek(ts_file,0,SEEK_END);
long file_size = ftell(ts_file);
rewind (ts_file);
number_of_packets = file_size/PACKET_SIZE;
unsigned int current_header_add = 0;
unsigned int i=0;
for (unsigned int j=1; i<number_of_packets; j++)
{
i++;
unsigned char TS_raw_header[4];
cout << "current position " << int(current_header_add) << endl;
current_header_add = ftell(ts_file);
fread(&TS_raw_header, sizeof(TS_raw_header), 1, ts_file);
Packet current_packet;
current_packet.sync_byte = TS_raw_header[0];
current_packet.transport_error_indicator = (TS_raw_header[1] & 0x80) >> 7;
current_packet.payload_start_indicator = (TS_raw_header[1] & 0x40) >> 6;
current_packet.transport_priority = (TS_raw_header[1] & 0x20) >> 5;
current_packet.PID = ((TS_raw_header[1] & 31) << 8) | TS_raw_header[2];
current_packet.transport_scrambling_control = (TS_raw_header[3] & 0xC0);
current_packet.adaption_field_control = (TS_raw_header[3] & 0x30) >> 4;
current_packet.continuity_counter = (TS_raw_header[3] & 0xF);
stream_packets.push_back(current_packet);
//cout << hex << int(current_packet.PID) << endl;
//cout << dec << "continuity counter " << int(current_packet.continuity_counter) << endl;
cout << " i " << int(i) << endl;
fseek(ts_file, 184, SEEK_CUR);
}
cout << "##" << endl;
cout << stream_packets.size() << endl;
}
class TS_Analyzer: public Analyzer {
public:
TS_Analyzer();
~TS_Analyzer();
struct Packet {
unsigned char sync_byte;
unsigned char transport_error_indicator;
unsigned char payload_start_indicator;
unsigned char transport_priority;
unsigned int PID;
unsigned char transport_scrambling_control;
unsigned char adaption_field_control;
unsigned char continuity_counter;
};
std::vector<Packet>stream_packets;
int number_of_packets = 0;
void parse_file(FILE *);
};
Any ideas why the vector push_back breaks the for loop and why I cannot get a correct vector size?
If I put this code through the clang compiler, I get an error on the following code:
TS_Analyzer *ts_analyzer;
ts_analyzer->parse_file(ts_file);
>> variable 'ts_analyzer' is uninitialized when used here
I guess you are encountering undefined behavior: since the pointer ts_analyzer is never initialized, it holds an arbitrary value, and the members reached through it are arbitrary memory as well.
I'm actually surprised that this code runs at all without crashing, though you can always be lucky.
If you would like to fix this, try avoiding pointers by creating the object on the stack:
TS_Analyzer ts_analyzer;
ts_analyzer.parse_file(ts_file);
or, if you really need allocated memory, at least make the pointer refer to a real object:
auto ts_analyzer = std::make_unique<TS_Analyzer>();
ts_analyzer->parse_file(ts_file);
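Both fixes can be tried in isolation with a stripped-down class. The sketch below uses a hypothetical `Analyzer` with only a vector member (not the real TS_Analyzer) to show that size() reports the number of pushed elements once the object actually exists. std::make_unique requires C++14 and the <memory> header.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, stripped-down analyzer: only the vector member matters here.
struct Analyzer {
    std::vector<int> packets;

    void parse(int count) {
        for (int i = 0; i < count; ++i) {
            packets.push_back(i);
        }
    }
};

// Automatic storage: the object is fully constructed before it is used.
std::size_t parse_on_stack(int count) {
    Analyzer analyzer;
    analyzer.parse(count);
    return analyzer.packets.size();
}

// Heap allocation: the smart pointer refers to a real, constructed object
// and releases it automatically at the end of the function.
std::size_t parse_on_heap(int count) {
    auto analyzer = std::make_unique<Analyzer>();
    analyzer->parse(count);
    return analyzer->packets.size();
}
```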

C++ Reading back "incorrect" values from binary file?

The project I'm working on has a custom file format consisting of a header of a few different variables, followed by the pixel data. My colleagues have developed a GUI where processing, writing, reading and displaying this type of file format works fine.
But my problem is that, while I have assisted in writing the code for writing data to disk, I cannot read this kind of file myself and get satisfactory values back. I am able to read the first variable back (char array) but not the following value(s).
So the file format matches the following structure:
typedef struct {
char hxtLabel[8];
u64 hxtVersion;
int motorPositions[9];
int filePrefixLength;
char filePrefix[100];
..
} HxtBuffer;
In the code, I create an object of the above structure and then set these example values:
setLabel("MY_LABEL");
setFormatVersion(3);
setMotorPosition( 2109, 5438, 8767, 1234, 1022, 1033, 1044, 1055, 1066);
setFilePrefixLength(7);
setFilePrefix( string("prefix_"));
setDataTimeStamp( string("000000_000000"));
My code for opening the file:
// Open data file, binary mode, reading
ifstream datFile(aFileName.c_str(), ios::in | ios::binary);
if (!datFile.is_open()) {
cout << "readFile() ERROR: Failed to open file " << aFileName << endl;
return false;
}
// How large is the file?
datFile.seekg(0, datFile.end);
int length = datFile.tellg();
datFile.seekg(0, datFile.beg);
cout << "readFile() file " << setw(70) << aFileName << " is: " << setw(15) << length << " long\n";
// Allocate memory for buffer:
char * buffer = new char[length];
// Read data as one block:
datFile.read(buffer, length);
datFile.close();
/// Looking at the start of the buffer, I should be seeing "MY_LABEL"?
cout << "buffer: " << buffer << " " << *(buffer) << endl;
int* mSSX = reinterpret_cast<int*>(*(buffer+8));
int* mSSY = reinterpret_cast<int*>(&buffer+9);
int* mSSZ = reinterpret_cast<int*>(&buffer+10);
int* mSSROT = reinterpret_cast<int*>(&buffer+11);
int* mTimer = reinterpret_cast<int*>(&buffer+12);
int* mGALX = reinterpret_cast<int*>(&buffer+13);
int* mGALY = reinterpret_cast<int*>(&buffer+14);
int* mGALZ = reinterpret_cast<int*>(&buffer+15);
int* mGALROT = reinterpret_cast<int*>(&buffer+16);
int* filePrefixLength = reinterpret_cast<int*>(&buffer+17);
std::string filePrefix; std::string dataTimeStamp;
// Read file prefix character by character into stringstream object
std::stringstream ss;
char* cPointer = (char *)(buffer+18);
int k;
for(k = 0; k < *filePrefixLength; k++)
{
//read string
char c;
c = *cPointer;
ss << c;
cPointer++;
}
filePrefix = ss.str();
// Read timestamp character by character into stringstream object
std::stringstream timeStampStream;
/// Need not increment cPointer, already pointing at 1st char of timeStamp
for (int l= 0; l < 13; l++)
{
char c;
c = * cPointer;
timeStampStream << c;
}
dataTimeStamp = timeStampStream.str();
cout << 25 << endl;
cout << " mSSX: " << mSSX << " mSSY: " << mSSY << " mSSZ: " << mSSZ;
cout << " mSSROT: " << mSSROT << " mTimer: " << mTimer << " mGALX: " << mGALX;
cout << " mGALY: " << mGALY << " mGALZ: " << mGALZ << " mGALROT: " << mGALROT;
Finally, what I see is shown below. I added the 25 just to double check that not everything was coming out in hexadecimal. As you can see, I am able to see the label "MY_LABEL" as expected. But the 9 motorPositions all come out looking suspiciously like addresses and not values. The file prefix and the data timestamp (which should be strings, or at least characters) are just empty.
buffer: MY_LABEL M
25
mSSX: 0000000000000003 mSSY: 00000000001BF618 mSSZ: 00000000001BF620 mSSROT: 00000000001BF628 mTimer: 00000000001BF630 mGALX: 00000000001BF638 mGALY: 00000000001BF640 mGALZ: 00000000001BF648 mGALROT: 00000000001BF650filePrefix: dataTimeStamp:
I'm sure the solution can't be too complicated, but I reached a stage where I had this just spinning and I cannot make sense of things.
Many thanks for reading this somewhat long post.
-- Edit--
I might hit the maximum length allowed for a post, but just in case I thought I shall post the code that generates the data that I'm trying to read back:
bool writePixelOutput(string aOutputPixelFileName) {
// Write pixel histograms out to binary file
ofstream pixelFile;
pixelFile.open(aOutputPixelFileName.c_str(), ios::binary | ios::out | ios::trunc);
if (!pixelFile.is_open()) {
LOG(gLogConfig, logERROR) << "Failed to open output file " << aOutputPixelFileName;
return false;
}
// Write binary file header
string label("MY_LABEL");
pixelFile.write(label.c_str(), label.length());
pixelFile.write((const char*)&mFormatVersion, sizeof(u64));
// Include File Prefix/Motor Positions/Data Time Stamp - if format version > 1
if (mFormatVersion > 1)
{
pixelFile.write((const char*)&mSSX, sizeof(mSSX));
pixelFile.write((const char*)&mSSY, sizeof(mSSY));
pixelFile.write((const char*)&mSSZ, sizeof(mSSZ));
pixelFile.write((const char*)&mSSROT, sizeof(mSSROT));
pixelFile.write((const char*)&mTimer, sizeof(mTimer));
pixelFile.write((const char*)&mGALX, sizeof(mGALX));
pixelFile.write((const char*)&mGALY, sizeof(mGALY));
pixelFile.write((const char*)&mGALZ, sizeof(mGALZ));
pixelFile.write((const char*)&mGALROT, sizeof(mGALROT));
// Determine length of mFilePrefix string
int filePrefixSize = (int)mFilePrefix.size();
// Write prefix length, followed by prefix itself
pixelFile.write((const char*)&filePrefixSize, sizeof(filePrefixSize));
size_t prefixLen = 0;
if (mFormatVersion == 2) prefixLen = mFilePrefix.size();
else prefixLen = 100;
pixelFile.write(mFilePrefix.c_str(), prefixLen);
pixelFile.write(mDataTimeStamp.c_str(), mDataTimeStamp.size());
}
// Continue writing header information that is common to both format versions
pixelFile.write((const char*)&mRows, sizeof(mRows));
pixelFile.write((const char*)&mCols, sizeof(mCols));
pixelFile.write((const char*)&mHistoBins, sizeof(mHistoBins));
// Write the actual data - taken out for brevity's sake
// ..
pixelFile.close();
LOG(gLogConfig, logINFO) << "Written output histogram binary file " << aOutputPixelFileName;
return true;
}
-- Edit 2 (11:32 09/12/2015) --
Thank you for all the help, I'm closer to solving the issue now. Going with the answer from muelleth, I try:
/// Read into char buffer
char * buffer = new char[length];
datFile.read(buffer, length);// length determined by ifstream.seekg()
/// Let's try HxtBuffer
HxtBuffer *input = new HxtBuffer;
cout << "sizeof HxtBuffer: " << sizeof *input << endl;
memcpy(input, buffer, length);
I can then display the different struct variables:
qDebug() << "Slice BUFFER label " << QString::fromStdString(input->hxtLabel);
qDebug() << "Slice BUFFER version " << QString::number(input->hxtVersion);
qDebug() << "Slice BUFFER hxtPrefixLength " << QString::number(input->filePrefixLength);
for (int i = 0; i < 9; i++)
{
qDebug() << i << QString::number(input->motorPositions[i]);
}
qDebug() << "Slice BUFFER filePrefix " << QString::fromStdString(input->filePrefix);
qDebug() << "Slice BUFFER dataTimeStamp " << QString::fromStdString(input->dataTimeStamp);
qDebug() << "Slice BUFFER nRows " << QString::number(input->nRows);
qDebug() << "Slice BUFFER nCols " << QString::number(input->nCols);
qDebug() << "Slice BUFFER nBins " << QString::number(input->nBins);
The output is then mostly as expected:
Slice BUFFER label "MY_LABEL"
Slice BUFFER version "3"
Slice BUFFER hxtPrefixLength "2"
0 "2109"
1 "5438"
...
7 "1055"
8 "1066"
Slice BUFFER filePrefix "-1"
Slice BUFFER dataTimeStamp "000000_000000P"
Slice BUFFER nRows "20480"
Slice BUFFER nCols "256000"
Slice BUFFER nBins "0"
EXCEPT dataTimeStamp, which is 13 chars long but displays 14 chars instead. The 3 variables that follow (nRows, nCols and nBins) are then incorrect. (They should be nRows=80, nCols=80, nBins=1000.) My guess is that the bits belonging to the 14th char of dataTimeStamp should be read along with nRows, and so cascade on to produce the correct nCols and nBins.
I have separately verified (not shown here) using qDebug that what I'm writing into the file, really are the values I expect, and their individual sizes.
I personally would try to read exactly the number of bytes your struct is from the file, i.e. something like
int length = sizeof(HxtBuffer);
and then simply use memcpy to assign a local structure from the read buffer:
HxtBuffer input;
memcpy(&input, buffer, length);
You can then access your data e.g. like:
std::cout << "Data: " << input.hxtLabel << std::endl;
Why do you read to buffer, instead of using the structure for reading?
HxtBuffer data;
datFile.read(reinterpret_cast<char *>(&data), sizeof data);
if(!datFile || datFile.gcount()!=sizeof data) // failed or short read
throw io_exception();
// Can use data.
If you want to read into a character buffer, then your way of getting the data is just wrong. You probably want to do something like this:
char *buf_offset=buffer+8+sizeof(u64); // Skip label (8 chars) and version (int64)
int mSSX = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSY = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSZ = *reinterpret_cast<int*>(buf_offset);
/* etc. */
Or, a little better (provided you don't change the contents of the buffer).
int *ptr_motors=reinterpret_cast<int *>(buffer+8+sizeof(u64));
int &mSSX = ptr_motors[0];
int &mSSY = ptr_motors[1];
int &mSSZ = ptr_motors[2];
/* etc. */
Notice that I don't declare mSSX, mSSY etc. as pointers. Your code was printing them as addresses because you told the compiler that they were addresses (pointers).
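Dereferencing a reinterpret_cast'ed pointer assumes the bytes are suitably aligned for int, and a struct-sized memcpy assumes the file contains the same padding as the struct. That need not hold when the writer emits fields one by one, as writePixelOutput in the question does. A minimal sketch that copies each field separately avoids both assumptions. `read_field`, `Header` and `parse_header` are hypothetical names, u64 is assumed to be a 64-bit unsigned integer, and the caller is assumed to pass a buffer that is long enough.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies one trivially copyable value out of a byte buffer and advances the
// offset. memcpy has no alignment requirement on the source bytes.
template <typename T>
T read_field(const char* buffer, std::size_t& offset) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copy needs a trivially copyable type");
    T value;
    std::memcpy(&value, buffer + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Hypothetical in-memory view of the first header fields from the question.
struct Header {
    char label[9];            // 8 characters from the file plus '\0'
    std::uint64_t version;    // assumption: u64 is a 64-bit unsigned integer
    int motorPositions[9];
    int filePrefixLength;
};

// Parses the fields in the order they were written, with no padding assumed
// between them. No bounds checking is done in this sketch.
Header parse_header(const char* buffer) {
    Header h{};
    std::size_t offset = 0;
    std::memcpy(h.label, buffer, 8);  // the file stores no terminator
    h.label[8] = '\0';
    offset += 8;
    h.version = read_field<std::uint64_t>(buffer, offset);
    for (int& m : h.motorPositions) {
        m = read_field<int>(buffer, offset);
    }
    h.filePrefixLength = read_field<int>(buffer, offset);
    return h;
}
```

The remaining fields (prefix, timestamp, rows, columns, bins) could be read the same way, each from the running offset.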

C++ read/write random access file using gcount() method

Consider the code:
const int length = 1024 * 1024; // 1048576
char buffer[length];
fstream f;
int main(int argc, char *argv[])
{
f.open("file.bin", ios::in | ios::out | ios::binary);
f.read(buffer, length);
int k = 0;
while (f.gcount() > 0)
{
k++;
cout << "Block #" << k << ": " << f.gcount() << " bytes" << endl;
f.read(buffer, f.gcount());
} // while
f.close();
return 0;
} // main
The size of the file "file.bin" is 2,895,872 bytes.
When I ran this code, the output is:
Block #1: 1048576 bytes
Block #2: 1048576 bytes
Block #3: 798720 bytes
Now, suppose that I want to do a useless thing: read each block and then write it again to the same file (in practical terms this is a do-nothing operation)
const int length = 1024 * 1024; // 1048576
char buffer[length];
fstream f;
int main(int argc, char *argv[])
{
f.open("file.bin", ios::in | ios::out | ios::binary);
f.read(buffer, length);
int k = 0;
while (f.gcount() > 0)
{
k++;
cout << "Block #" << k << ": " << f.gcount() << " bytes" << endl;
// this is the code I added
f.seekp(-f.gcount(), ios_base::cur); // move file pointer backwards
f.write(buffer, f.gcount()); // write the buffer again <=> do nothing
// end of the code I added
f.read(buffer, f.gcount());
} // while
f.close();
return 0;
} // main
Now the output is
Block #1: 1048576 bytes
Why are Block #2 and #3 not listed?
Thank you
The function seekp seeks on the output sequence, but the output sequence didn't change due to the fact that you were just reading (which changes the input sequence).
I think the best thing to do is to update the output sequence each time you perform a read. I'm not sure if it will work, but you might try:
// ...
f.read(buffer, length);
f.seekp(f.gcount(), ios_base::cur); // update output sequence
int k = 0;
while (f.gcount() > 0)
{
k++;
cout << "Block #" << k << ": " << f.gcount() << " bytes" << endl;
// this is the code I added
f.seekp(-f.gcount(), ios_base::cur); // move file pointer backwards
f.write(buffer, f.gcount()); // write the buffer again <=> do nothing
// end of the code I added
f.read(buffer, f.gcount());
f.seekp(f.gcount(), ios_base::cur); // update output sequence
}
// ...
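An alternative sketch that does not rely on relative seeks: it tracks the byte offset of the current block itself and seeks explicitly every time the stream switches between reading and writing, which file streams require. `rewrite_in_place` is a hypothetical helper, not code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads a file block by block and writes every block back to the position it
// came from. Returns the number of blocks processed, or -1 if the file could
// not be opened.
int rewrite_in_place(const std::string& path, std::size_t block_size) {
    assert(block_size > 0);
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) {
        return -1;
    }

    std::vector<char> buffer(block_size);
    std::streamoff offset = 0;  // start of the current block
    int blocks = 0;
    for (;;) {
        f.seekg(offset, std::ios::beg);  // seek before switching to reading
        f.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = f.gcount();
        if (got <= 0) {
            break;  // nothing left to read
        }
        f.clear();                       // a short final block sets eofbit and failbit
        f.seekp(offset, std::ios::beg);  // seek before switching to writing
        f.write(buffer.data(), got);
        if (!f) {
            break;  // write error
        }
        offset += got;
        ++blocks;
    }
    return blocks;
}
```

Saving gcount() in a local variable right after the read keeps the block size available even if later calls on the stream change it.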

.dat ASCII file I/O in C++

I have a .dat file with ASCII characters like in the following picture:
It is basically a series of 16-bit numbers. I can read it, as unsigned short, into my data structure, but I have no idea how to save my unsigned short values in the same format as the input. Here is my current code; although the value is right, the format is not. See the following picture:
Does anyone have any idea how I should save it in the same format as the input? Here is my saving function:
void SavePxlShort(vector<Point3D> &pts, char * fileName)
{
ofstream os(fileName, ios::out);
size_t L = pts.size();
cout << "writing data (pixel as short) with length "<< L << " ......" << endl;
unsigned short pxl;
for (long i = 0; i < L; i++)
{
pxl = Round(pts[i].val());
if (pts[i].val() < USHRT_MAX)
{
os << pxl << endl;
}
else
{
cout << "pixel intensity overflow ushort" << endl;
return;
}
}
os.close();
return;
}
void SavePxlShort(vector<Point3D> &pts, char * fileName)
{
ofstream os(fileName, ios::out, ios::binary);
size_t L = pts.size();
cout << "writing data (pixel as short) with length "<< L << " ......" << endl;
unsigned short* pData = new unsigned short[L];
unsigned short pxl;
for (long i = 0; i < L; i++)
{
pxl = pts[i].val();
if (pts[i].val() < USHRT_MAX)
{
pData[i] = pxl ;
}
else
{
cout << "pixel intensity overflow ushort" << endl;
return;
}
}
os.write(reinterpret_cast<char*> (pData), sizeof(unsigned short)*L);
os.close();
delete pData;
return;
}
Two things:
You are not opening the stream in binary mode. Try this:
ofstream os(fileName, ios::out | ios::binary);
Actually, because ofstream automatically sets the ios::out flag, you just need this:
ofstream os(fileName, ios::binary);
Another problem is that you are calling std::endl. This outputs a \n and then flushes the stream.
os << pxl << endl;
Change the above to just:
os << pxl;
In place of
os << pxl << endl;
you could put
os.write((char*)&pxl, sizeof(pxl));
to write the raw bytes of pxl into the file instead of the ASCII representation. Bear in mind the byte order and word size of an unsigned short may vary between systems.