Inconsistent results while reading a file in binary mode in C++

As shown in the picture, if I read 2 bytes at offset 254786 and print them in hexadecimal, I should get 0xffd9, and I do get exactly that value if I seek directly to offset 254786. However, if I start from an offset far away from 254786 and advance in a while loop, as shown in the second picture, I do not get 0xffd9. I really don't know where I could be going wrong here.
std::ifstream myfile ("test-01.jpg");
if(!myfile) throw std::runtime_error("unable to open input file");
myfile.seekg(254786,myfile.beg);
std::string buf {};
buf.resize(2);
myfile.read(&buf[0], 2);
std::cout << std::hex << std::showbase << big_endian_2_bytes_to_int(buf);
0xffd9
int offset = 17000;
std::cout << offset << std::endl;
myfile.seekg(offset,myfile.beg);
myfile.read(&buffer[0],2);
while ( big_endian_2_bytes_to_int(buffer) != 0xffd9){
offset++;
myfile.seekg(offset,myfile.beg);
myfile.read(&buffer[0],2);
if (offset == 254786){
std::cout << offset <<std::endl;
std::cout << std::hex << std::showbase << big_endian_2_bytes_to_int(buffer) << std::endl;
return 0;
}
}
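One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the stream above is opened without std::ios::binary, and the loop never tests whether a read succeeded. Once a read fails, later seekg and read calls on that stream do nothing until the state is cleared, so the buffer keeps stale bytes. The sketch below sidesteps both issues by reading the file once in binary mode and scanning in memory. The helper names read_all_bytes and find_marker are made up for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the whole file as raw bytes. std::ios::binary disables any
// newline or end-of-file character translation the platform may apply.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> read_all_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Return the offset of the first big-endian 16-bit `marker` at or after
// `start`, or -1 if it does not occur.
long find_marker(const std::vector<unsigned char>& bytes,
                 std::size_t start, unsigned marker) {
    for (std::size_t i = start; i + 1 < bytes.size(); ++i) {
        const unsigned value =
            (static_cast<unsigned>(bytes[i]) << 8) | bytes[i + 1];
        if (value == marker) return static_cast<long>(i);
    }
    return -1;
}
```

For the file in the question this would be called as find_marker(read_all_bytes("test-01.jpg"), 17000, 0xffd9).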

Related

std::fstream read block of data from file and write data back to file until EOF

I'm reading blocks of data from a file a few bytes at a time (e.g. 3 bytes per read/write), writing the same 3 bytes back to the very same position in the file, and looping until there are no more blocks to read.
In other words, I'm trying to rewrite the file with its own contents.
However, the final output isn't the same as it was in the beginning.
The following sample code reads 3 bytes per iteration from the file "sample.txt", whose contents are simply:
0123456789
After reading the data and writing it back to the file, the contents are:
012345345345
As you can see, the data doesn't get rewritten correctly for some reason.
#include <cstring> // memset
#include <fstream>
#include <iostream>
using namespace std;
#define BLOCK_SIZE 3
int main()
{
// open file
fstream file;
file.open("sample.txt", ios::binary | ios::out | ios::in);
// determine size and number of blocks to read
file.seekg(0, ios::end);
streampos size = file.tellg();
int blocks = size / BLOCK_SIZE;
cout << "size:\t" << size << endl;
if (size % BLOCK_SIZE != 0)
{
++blocks;
}
cout << "blocks:\t" << blocks << endl;
// return to beginning
file.seekg(ios::beg);
// we will read data here
unsigned char* data = new unsigned char[BLOCK_SIZE];
streampos pos;
// read blocks of data and write data back
for (int i = 0; i < blocks; ++i)
{
pos = file.tellg();
cout << "before read:\t" << pos << endl;
// read block
file.read(reinterpret_cast<char*>(data), BLOCK_SIZE);
cout << "after read:\t" << file.tellg() << endl;
// write same block back to same position
file.seekp(pos);
cout << "before write:\t" << file.tellg() << endl;
file.write(reinterpret_cast<char*>(data), BLOCK_SIZE);
cout << "after write:\t" << file.tellg() << endl;
// reset buffer
memset(data, 0, BLOCK_SIZE);
}
file.close();
delete[] data;
cin.get();
return 0;
}
Do you see what could be the reason for bad overwrite?
EDIT:
Sorry, I can't see how the linked duplicate answers my question, I'm simply unable to apply given answer to the code above.
Your code does not handle the EOF condition well, and leaves the stream in a bad state after trying to read past the end of the file. On my system, this results in all further calls to the stream having no effect. I bet that isn't the case on your system (which I suspect is a bug in its iostream implementation). I re-did your code to handle the EOF condition correctly, and also to be a lot cleaner in a few other ways:
#include <fstream>
#include <iostream>
using namespace std;
const int BLOCK_SIZE = 3;
int main()
{
// open file
fstream file;
file.open("sample.txt", ios::binary | ios::out | ios::in);
// we will read data here
bool found_eof = false;
// read blocks of data and write data back
while (!found_eof)
{
unsigned char data[BLOCK_SIZE] = {0};
char * const data_as_char = reinterpret_cast<char *>(data);
streampos const pos = file.tellp();
int count_to_write = BLOCK_SIZE;
cout << "before read:\t" << file.tellg() << ' ' << pos << '\n';
// read block
if (!file.read(data_as_char, BLOCK_SIZE)) {
found_eof = true;
count_to_write = file.gcount();
file.clear();
cout << "Only " << count_to_write << " characters extracted.\n";
}
cout << "after read:\t" << file.tellg() << ' ' << file.tellp() << '\n';
// write same block back to same position
file.seekp(pos);
cout << "before write:\t" << file.tellg() << ' ' << file.tellp() << '\n';
file.write(data_as_char, count_to_write);
cout << "after write:\t" << file.tellg() << ' ' << file.tellp() << '\n';
file.seekp(file.tellp());
}
file.close();
cin.get();
return 0;
}
But this is not fundamentally different. Both versions work the same for me; I'm on Linux with g++.
Following the linked possible duplicate, I would also suggest adding this just before the closing } of your for loop:
file.seekp(file.tellp());
I've put that in the appropriate place in my code.

Why does ifstream read() behave differently in two different programs?

I'm trying to write a program that reads in an OpenGL shader from a .txt file. I've actually already done this a few days ago, this was the code I used:
char vShaderData[2000];
char fShaderData[2000];
void readShaders() {
std::ifstream vShaderF;
std::ifstream fShaderF;
vShaderF.open("shaders//vertexShader.txt");
fShaderF.open("shaders//fragShader.txt");
if (vShaderF.is_open() && fShaderF.is_open()) std::cout << m << "Shader read success" << std::endl;
else std::cout << "Shader read fail" << std::endl;
std::cout << m << "vertex shader: " << std::endl;
vShaderF.read(vShaderData, 2000);
for (int i = 0; i < 2000; i++) {
std::cout << vShaderData[i];
}
std::cout << std::endl << std::endl;
std::cout << m << "frag shader: " << std::endl;
fShaderF.read(fShaderData, 2000);
for (int i = 0; i < 2000; i++) {
std::cout << fShaderData[i];
}
std::cout << std::endl;
vShaderF.close();
fShaderF.close();
}
This worked great. My shader file was not actually 2000 characters long, but the read() call seemed to store the extra characters as whitespace in the char array, which is what I wanted.
Now having restructured my code a little bit in a newer program, my reader now looks like this:
std::ifstream shaderFile;
shaderFile.open(path);
if (shaderFile.is_open()) cout << "Shader at: " << path << ", initalized" << endl;
char data[2000];
shaderFile.read(data, 2000);
for (int i = 0; i < 2000; i++) std::cout << data[i];
The actual text portion is still read correctly. However, the extra space in the char array is now filled with this instead of whitespace:
In case the image won't show, it is basically just a repeating pattern of these two characters: [|[|[|....
Why is this happening and how can I fix it?
NOTE: I'm using the same shader file, same computer, same IDE, same everything. The old one still works.
std::istream::read() does not set the parts of the buffer that were not read to spaces; that memory is left untouched. If you want spaces in the unread area of the buffer, you need to put them there yourself. If the old program really showed blanks, it was because of what the buffer already contained: the arrays there appear to be globals, which are zero-initialized, while the local array in the new code is uninitialized.
You can use std::istream::gcount() to determine how many characters were read.
If you want the array to contain predefined data, you have to initialize it with that data. If the stream then reads less data than the array holds, you get the padding you want.

C++ Reading back "incorrect" values from binary file?

The project I'm working on has a custom file format consisting of a header of a few different variables, followed by the pixel data. My colleagues have developed a GUI in which processing, writing, reading and displaying this file format works fine.
My problem is that, while I have assisted in writing the code that writes the data to disk, I cannot read this kind of file myself and get satisfactory values back. I am able to read the first variable back (a char array) but not the following value(s).
So the file format matches the following structure:
typedef struct {
char hxtLabel[8];
u64 hxtVersion;
int motorPositions[9];
int filePrefixLength;
char filePrefix[100];
..
} HxtBuffer;
In the code, I create an object of the above structure and then set these example values:
setLabel("MY_LABEL");
setFormatVersion(3);
setMotorPosition( 2109, 5438, 8767, 1234, 1022, 1033, 1044, 1055, 1066);
setFilePrefixLength(7);
setFilePrefix( string("prefix_"));
setDataTimeStamp( string("000000_000000"));
My code for opening the file:
// Open data file, binary mode, reading
ifstream datFile(aFileName.c_str(), ios::in | ios::binary);
if (!datFile.is_open()) {
cout << "readFile() ERROR: Failed to open file " << aFileName << endl;
return false;
}
// How large is the file?
datFile.seekg(0, datFile.end);
int length = datFile.tellg();
datFile.seekg(0, datFile.beg);
cout << "readFile() file " << setw(70) << aFileName << " is: " << setw(15) << length << " long\n";
// Allocate memory for buffer:
char * buffer = new char[length];
// Read data as one block:
datFile.read(buffer, length);
datFile.close();
/// Looking at the start of the buffer, I should be seeing "MY_LABEL"?
cout << "buffer: " << buffer << " " << *(buffer) << endl;
int* mSSX = reinterpret_cast<int*>(*(buffer+8));
int* mSSY = reinterpret_cast<int*>(&buffer+9);
int* mSSZ = reinterpret_cast<int*>(&buffer+10);
int* mSSROT = reinterpret_cast<int*>(&buffer+11);
int* mTimer = reinterpret_cast<int*>(&buffer+12);
int* mGALX = reinterpret_cast<int*>(&buffer+13);
int* mGALY = reinterpret_cast<int*>(&buffer+14);
int* mGALZ = reinterpret_cast<int*>(&buffer+15);
int* mGALROT = reinterpret_cast<int*>(&buffer+16);
int* filePrefixLength = reinterpret_cast<int*>(&buffer+17);
std::string filePrefix; std::string dataTimeStamp;
// Read file prefix character by character into stringstream object
std::stringstream ss;
char* cPointer = (char *)(buffer+18);
int k;
for(k = 0; k < *filePrefixLength; k++)
{
//read string
char c;
c = *cPointer;
ss << c;
cPointer++;
}
filePrefix = ss.str();
// Read timestamp character by character into stringstream object
std::stringstream timeStampStream;
/// Need not increment cPointer, already pointing # 1st char of timeStamp
for (int l= 0; l < 13; l++)
{
char c;
c = * cPointer;
timeStampStream << c;
}
dataTimeStamp = timeStampStream.str();
cout << 25 << endl;
cout << " mSSX: " << mSSX << " mSSY: " << mSSY << " mSSZ: " << mSSZ;
cout << " mSSROT: " << mSSROT << " mTimer: " << mTimer << " mGALX: " << mGALX;
cout << " mGALY: " << mGALY << " mGALZ: " << mGALZ << " mGALROT: " << mGALROT;
Finally, what I see is shown below. I added the 25 just to double-check that not everything was coming out in hexadecimal. As you can see, I am able to see the label "MY_LABEL" as expected. But the 9 motorPositions all come out looking suspiciously like addresses, not values. The file prefix and the data timestamp (which should be strings, or at least characters) are just empty.
buffer: MY_LABEL M
25
mSSX: 0000000000000003 mSSY: 00000000001BF618 mSSZ: 00000000001BF620 mSSROT: 00000000001BF628 mTimer: 00000000001BF630 mGALX: 00000000001BF638 mGALY: 00000000001BF640 mGALZ: 00000000001BF648 mGALROT: 00000000001BF650filePrefix: dataTimeStamp:
I'm sure the solution can't be too complicated, but I reached a stage where I had this just spinning and I cannot make sense of things.
Many thanks for reading this somewhat long post.
-- Edit--
I might hit the maximum length allowed for a post, but just in case, here is the code that generates the data I'm trying to read back:
bool writePixelOutput(string aOutputPixelFileName) {
// Write pixel histograms out to binary file
ofstream pixelFile;
pixelFile.open(aOutputPixelFileName.c_str(), ios::binary | ios::out | ios::trunc);
if (!pixelFile.is_open()) {
LOG(gLogConfig, logERROR) << "Failed to open output file " << aOutputPixelFileName;
return false;
}
// Write binary file header
string label("MY_LABEL");
pixelFile.write(label.c_str(), label.length());
pixelFile.write((const char*)&mFormatVersion, sizeof(u64));
// Include File Prefix/Motor Positions/Data Time Stamp - if format version > 1
if (mFormatVersion > 1)
{
pixelFile.write((const char*)&mSSX, sizeof(mSSX));
pixelFile.write((const char*)&mSSY, sizeof(mSSY));
pixelFile.write((const char*)&mSSZ, sizeof(mSSZ));
pixelFile.write((const char*)&mSSROT, sizeof(mSSROT));
pixelFile.write((const char*)&mTimer, sizeof(mTimer));
pixelFile.write((const char*)&mGALX, sizeof(mGALX));
pixelFile.write((const char*)&mGALY, sizeof(mGALY));
pixelFile.write((const char*)&mGALZ, sizeof(mGALZ));
pixelFile.write((const char*)&mGALROT, sizeof(mGALROT));
// Determine length of mFilePrefix string
int filePrefixSize = (int)mFilePrefix.size();
// Write prefix length, followed by prefix itself
pixelFile.write((const char*)&filePrefixSize, sizeof(filePrefixSize));
size_t prefixLen = 0;
if (mFormatVersion == 2) prefixLen = mFilePrefix.size();
else prefixLen = 100;
pixelFile.write(mFilePrefix.c_str(), prefixLen);
pixelFile.write(mDataTimeStamp.c_str(), mDataTimeStamp.size());
}
// Continue writing header information that is common to both format versions
pixelFile.write((const char*)&mRows, sizeof(mRows));
pixelFile.write((const char*)&mCols, sizeof(mCols));
pixelFile.write((const char*)&mHistoBins, sizeof(mHistoBins));
// Write the actual data - taken out for brevity's sake
// ..
pixelFile.close();
LOG(gLogConfig, logINFO) << "Written output histogram binary file " << aOutputPixelFileName;
return true;
}
-- Edit 2 (11:32 09/12/2015) --
Thank you for all the help, I'm closer to solving the issue now. Going with the answer from muelleth, I try:
/// Read into char buffer
char * buffer = new char[length];
datFile.read(buffer, length);// length determined by ifstream.seekg()
/// Let's try HxtBuffer
HxtBuffer *input = new HxtBuffer;
cout << "sizeof HxtBuffer: " << sizeof *input << endl;
memcpy(input, buffer, length);
I can then display the different struct variables:
qDebug() << "Slice BUFFER label " << QString::fromStdString(input->hxtLabel);
qDebug() << "Slice BUFFER version " << QString::number(input->hxtVersion);
qDebug() << "Slice BUFFER hxtPrefixLength " << QString::number(input->filePrefixLength);
for (int i = 0; i < 9; i++)
{
qDebug() << i << QString::number(input->motorPositions[i]);
}
qDebug() << "Slice BUFFER filePrefix " << QString::fromStdString(input->filePrefix);
qDebug() << "Slice BUFFER dataTimeStamp " << QString::fromStdString(input->dataTimeStamp);
qDebug() << "Slice BUFFER nRows " << QString::number(input->nRows);
qDebug() << "Slice BUFFER nCols " << QString::number(input->nCols);
qDebug() << "Slice BUFFER nBins " << QString::number(input->nBins);
The output is then mostly as expected:
Slice BUFFER label "MY_LABEL"
Slice BUFFER version "3"
Slice BUFFER hxtPrefixLength "2"
0 "2109"
1 "5438"
...
7 "1055"
8 "1066"
Slice BUFFER filePrefix "-1"
Slice BUFFER dataTimeStamp "000000_000000P"
Slice BUFFER nRows "20480"
Slice BUFFER nCols "256000"
Slice BUFFER nBins "0"
Except for dataTimeStamp, which is 13 chars long but displays as 14 chars. The 3 variables that follow (nRows, nCols and nBins) are then incorrect; they should be nRows=80, nCols=80, nBins=1000. My guess is that the bits belonging to the 14th char of dataTimeStamp should be read as part of nRows, and that the error cascades from there into nCols and nBins.
I have separately verified (not shown here) using qDebug that what I'm writing into the file, really are the values I expect, and their individual sizes.
I personally would try to read exactly as many bytes from the file as your struct occupies, i.e. something like
int length = sizeof(HxtBuffer);
and then simply use memcpy to assign a local structure from the read buffer:
HxtBuffer input;
memcpy(&input, buffer, length);
You can then access your data e.g. like:
std::cout << "Data: " << input.hxtLabel << std::endl;
Why do you read to buffer, instead of using the structure for reading?
HxtBuffer data;
datFile.read(reinterpret_cast<char *>(&data), sizeof data);
// Throw if the read failed or came up short (io_exception stands in for your own exception type)
if(!datFile || datFile.gcount()!=sizeof data)
throw io_exception();
// Can use data.
If you want to read into a character buffer, then your way of getting the data out is just wrong. You probably want to do something like this.
char *buf_offset=buffer+8+sizeof(u64); // Skip label (8 chars) and version (int64)
int mSSX = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSY = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSZ = *reinterpret_cast<int*>(buf_offset);
/* etc. */
Or, a little better (provided you don't change the contents of the buffer).
int *ptr_motors=reinterpret_cast<int *>(buffer+8+sizeof(u64));
int &mSSX = ptr_motors[0];
int &mSSY = ptr_motors[1];
int &mSSZ = ptr_motors[2];
/* etc. */
Notice that I don't declare mSSX, mSSY etc. as pointers. Your code was printing them as addresses because you told the compiler that they were addresses (pointers).
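A hedged sketch of reading the header field by field, following the order the writer code in the question uses for format versions above 1. It copies each value out with memcpy, so it does not depend on the buffer's alignment or on struct padding. The Header struct and the helpers take, take_chars and check_room are hypothetical; u64 is assumed to be a 64-bit unsigned integer, and the file is assumed to come from a machine with the same int size and byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

struct Header {              // hypothetical in-memory view of the file header
    std::string label;       // 8 characters, not NUL-terminated in the file
    std::uint64_t version;
    int motors[9];
    int prefixLength;
    std::string prefix;
    std::string timeStamp;   // 13 characters
};

// Throw if fewer than n bytes remain at offset off.
void check_room(const std::vector<char>& buf, std::size_t off, std::size_t n) {
    if (off > buf.size() || n > buf.size() - off)
        throw std::out_of_range("header is truncated");
}

// Copy sizeof(T) bytes out of the buffer and advance the offset.
template <typename T>
T take(const std::vector<char>& buf, std::size_t& off) {
    check_room(buf, off, sizeof(T));
    T value;
    std::memcpy(&value, buf.data() + off, sizeof(T));
    off += sizeof(T);
    return value;
}

// Copy n raw characters out of the buffer and advance the offset.
std::string take_chars(const std::vector<char>& buf, std::size_t& off,
                       std::size_t n) {
    check_room(buf, off, n);
    std::string s(buf.data() + off, n);
    off += n;
    return s;
}

Header parse_header(const std::vector<char>& buf) {
    std::size_t off = 0;
    Header h;
    h.label = take_chars(buf, off, 8);
    h.version = take<std::uint64_t>(buf, off);
    for (int& m : h.motors) m = take<int>(buf, off);
    h.prefixLength = take<int>(buf, off);
    // The writer stores 100 bytes of prefix unless the format version is 2.
    const std::size_t stored =
        (h.version == 2) ? static_cast<std::size_t>(h.prefixLength) : 100;
    h.prefix = take_chars(buf, off, stored)
                   .substr(0, static_cast<std::size_t>(h.prefixLength));
    h.timeStamp = take_chars(buf, off, 13);
    return h;
}
```

Reading this way also makes it obvious where the 13-character timestamp ends, which a memcpy into a padded struct can hide.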

c++ string concatenation second time

I am a newbie writing C++ code to open and read multiple files and then dump part of the data into other files.
I want to generate the file names in a for loop.
But I can't concatenate a string (the file number) and a string literal (the file extension). The same line of code works at the very beginning of the program, but not in a later part.
int main(int argc, char *argv[])
{
std::cout << std::string("9") + ".dat" << std::endl;
// many more lines
dump = 1;
if (dump == 1){
for (int ilevel=std::max(levelmin,lmin); ilevel < lmax + 1; ilevel++){
std::cout << std::string("9") + ".dat" << std::endl; // crashes here!
std::ofstream fout (std::string("9") + ".dat", std::ios::out | std::ios::binary);
std::cout << grid[ilevel].cube[0] << std::endl;
fout.write ((char*)&grid[ilevel].cube[0], grid[ilevel].cube.size() * sizeof(grid[ilevel].cube[0]));
fout.close();
}
}
...
}
If I put std::cout << std::string("9") + ".dat" << std::endl; at the beginning, it works and prints "9.dat",
but in the later loop it causes a segmentation fault.
In between, I call a function that uses a stringstream to pad an integer with leading zeros. The function looks like this:
std::string int2str(const int n, const int m){
std::stringstream ss;
ss << std::setfill('0') << std::setw(m) << n;
std::string s2(ss.str());
ss.clear();
return s2;
}
I don't have a clear understanding of string and stringstream in C++.
But out of the many things in my program, this function is the only one I can think of as being relevant. The other parts of the code do not deal with strings; it's mostly array manipulation code.
I've also tried std::string("9") + std::string(".dat")
but had no luck.
What is wrong?
Note that std::string("9") rather than just "9" is actually needed here: at least one operand of + has to be a std::string, because adding two string literals does not compile.
Where does the 9 come from? If it's generated as part of a loop or returned from a function, you can concatenate the variable itself or the function call, provided it is a std::string (convert an int with std::to_string first), so:
std::cout << std::to_string(iFileNumber) + ".dat" << std::endl;
or
std::cout << fileNumberGenerator() + ".dat" << std::endl;
For the hardcoded examples you've provided, be aware that
std::cout << 9 + ".dat" << endl;
does not concatenate anything: it is pointer arithmetic on the string literal and points past its end.
For the sake of printing to the command line, it's also worth noting that this is equally acceptable syntax (assuming you're not already aware):
std::cout << 9 << ".dat" << endl;
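For building the file name itself, here is a short sketch under the assumption that the number is an int. Keep in mind that + only concatenates when at least one operand is a std::string, so the int has to be converted first. The function names are made up for illustration; the padded variant mirrors what int2str in the question does.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Build "<number>.dat". std::to_string turns the int into a std::string,
// so + means concatenation rather than pointer arithmetic.
std::string data_file_name(int number) {
    return std::to_string(number) + ".dat";
}

// Same, but zero-padded to at least `width` digits.
std::string padded_file_name(int number, int width) {
    std::ostringstream ss;
    ss << std::setfill('0') << std::setw(width) << number << ".dat";
    return ss.str();
}
```

The result can be passed straight to std::ofstream, e.g. std::ofstream fout(data_file_name(ilevel), std::ios::binary).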

C++ equivalent of C fgets

I am looking for a C++ fstream equivalent of C's fgets. I tried the get function of fstream but did not get what I wanted: get does not extract the delimiter character, whereas fgets does. So I wrote code to insert the delimiter character myself, but it behaves strangely. Please see my sample code below:
#include <stdio.h>
#include <fstream>
#include <iostream>
int main(int argc, char **argv)
{
char str[256];
int len = 10;
std::cout << "Using C fgets function" << std::endl;
FILE * file = fopen("C:\\cpp\\write.txt", "r");
if(file == NULL){
std::cout << " Error opening file" << std::endl;
}
int count = 0;
while(!feof(file)){
char *result = fgets(str, len, file);
std::cout << result << std::endl ;
count++;
}
std::cout << "\nCount = " << count << std::endl;
fclose(file);
std::fstream fp("C:\\cpp\\write.txt", std::ios_base::in);
int iter_count = 0;
while(!fp.eof() && iter_count < 10){
fp.get(str, len,'\n');
int count = fp.gcount();
std::cout << "\nCurrent Count = " << count << std::endl;
if(count == 0){
//only new line character encountered
//adding newline character
str[1] = '\0';
str[0] = '\n';
fp.ignore(1, '\n');
//std::cout << fp.get(); //ignore new line character from stream
}
else if(count != (len -1) ){
//adding newline character
str[count + 1] = '\0';
str[count ] = '\n';
//std::cout << fp.get(); //ignore new line character from stream
fp.ignore(1, '\n');
//std::cout << "Adding new line \n";
}
std::cout << str << std::endl;
std::cout << " Stream State : Good: " << fp.good() << " Fail: " << fp.fail() << std::endl;
iter_count++;
}
std::cout << "\nCount = " << iter_count << std::endl;
fp.close();
return 0;
}
The txt file that I am using is write.txt with following content:
This is a new lines.
Now writing second
line
DONE
If you observe my program, I am using fgets function first and then using the get function on same file. In case of get function, the stream state goes bad.
Can anyone please point me out what is going wrong here?
UPDATED: I am now posting the simplest code that does not work on my end. If I don't care about the delimiter character for now and just read the entire file 10 characters at a time using getline:
void read_file_getline_no_insert(){
char str[256];
int len =10;
std::cout << "\nREAD_GETLINE_NO_INSERT FUNCITON\n" << std::endl;
std::fstream fp("C:\\cpp\\write.txt", std::ios_base::in);
int iter_count = 0;
while(!fp.eof() && iter_count < 10){
fp.getline(str, len,'\n');
int count = fp.gcount();
std::cout << "\nCurrent Count = " << count << std::endl;
std::cout << str << std::endl;
std::cout << " Stream State : Good: " << fp.good() << " Fail: " << fp.fail() << std::endl;
iter_count++;
}
std::cout << "\nCount = " << iter_count << std::endl;
fp.close();
}
int main(int argc, char **argv)
{
read_file_getline_no_insert();
return 0;
}
If we look at the output of the above code:
READ_GETLINE_NO_INSERT FUNCITON
Current Count = 9
This is a
Stream State : Good: 0 Fail: 1
Current Count = 0
Stream State : Good: 0 Fail: 1
You can see that the stream goes into a bad state and the fail bit is set. I am unable to understand this behavior.
Rgds
Sapan
std::getline() will read a string from a stream, until it encounters a delimiter (newline by default).
Unlike fgets(), std::getline() discards the delimiter. But, also unlike fgets(), it will read the whole line (available memory permitting) since it works with a std::string rather than a char *. That makes it somewhat easier to use in practice.
All types derived from std::istream (which is the base class for all input streams) also have a member function called getline() which works a little more like fgets() - accepting a char * and a buffer size. It still discards the delimiter though.
The C++-specific options are overloaded functions (i.e. available in more than one version) so you need to read documentation to decide which one is appropriate to your needs.