reading bytes from a file to short / long integer - c++

Hiho everyone! I'm trying to read the first 4 bytes of a file and store them in an integer variable.
Here's what I'm doing:
#include <iostream>
#include <fstream>
#include <iomanip>
#include <cstring>
using namespace std;
int main(){
    ifstream is;
    is.open ("binary_file.dat", ios::binary );
    char file_version[4];
    is.read(file_version, 4);
    int fv_int;
    memcpy(&fv_int, file_version, sizeof(fv_int));
    cout << fv_int;
}
But the result is not what I meant it to be. The program copies the first byte of the file into the correct position, but treats the rest of the bytes as 0's. Example:
First 4 bytes of my file:
10101010 00101100 00101100 00101100
What is the content of fv_int after program execution:
10101010 00000000 00000000 00000000
Is there any way to access specific bytes of integer? Or maybe better method of reading bytes from a file?

istream::read is not guaranteed to read exactly 4 bytes. It returns the stream itself, not a byte count, so test the stream after the read or call is.gcount() to see how many bytes were actually read. Your file may be too short.
Additional hint:
You could do is.read(reinterpret_cast<char*>(&fv_int), sizeof(fv_int)); to reduce the amount of code and make the intent more explicit.

If I feed your program a file that starts with those 4 bytes, it reads and displays them perfectly. For further diagnosis, change the last cout to: cout <<sizeof(int)<<" "<<hex<<fv_int<<endl;

Related

Binary input of text file

Programming Principles and Practice says in the Chapter 11:
"In memory, we can represent the number 123 as an integer value (each int on 4 bytes) or as a string value (each character on 1 byte)".
I'm trying to understand what is stored in memory when a text file is read in binary mode.
So I'm writing the content of the vector v.
If the input file contains this text: "test these words"
The output file shows these numbers: 1953719668 1701344288 1998611827 1935962735 168626701 168626701 168626701 168626701 168626701 168626701
I tried to convert each char of "test" to binary
and I have 01110100 01100101 01100101 01110100
and if I consider this as an integer of 4 bytes and convert it to decimal I get 1952802164, which is still different from the output.
How is this done correctly, so I can understand what's going on? Thanks!
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<cmath>
#include<sstream>
#include <fstream>
#include <iomanip>
using namespace std;
template <class T>
char *as_bytes(T &i) // treat a T as a sequence of bytes
{
    void *addr = &i; // get the address of the first byte of memory used to store the object
    return static_cast<char *>(addr); // treat that memory as bytes
}
int main()
{
    string iname{"11.9_in.txt"};
    ifstream ifs {iname,ios_base::binary}; // note: stream mode
    string oname{"11.9_out.txt"};
    ofstream ofs {oname,ios_base::binary}; // note: stream mode
    vector<int> v;
    // read from binary file:
    for(int x; ifs.read(as_bytes(x),sizeof(int)); ) // note: reading bytes
        v.push_back(x);
    for(int x : v)
        ofs << x << ' ';
}
Let me assume you are using a little-endian machine (for example, x86) and an ASCII-compatible character encoding (such as Shift_JIS or UTF-8).
"test" is stored as the bytes 74 65 73 74 (in hexadecimal).
On a little-endian machine, the more significant bytes of a multi-byte integer are placed at higher addresses, so the first byte in the file becomes the least significant byte.
Therefore, reading these 4 bytes as a 4-byte integer gives 0x74736574, which is 1953719668 in decimal.

Reading multiple bytes from file and storing them for comparison in C++

I want to read a photo in binary mode in 1460-byte increments and compare consecutive packets to detect corrupted transmission. I have a Python script that I wrote and want to translate to C++, but I'm not sure that what I intend to use is correct.
for i in range(0, fileSize-1):
    buff=f.read(1460)  # buff stores a packet of 1460 bytes where f is the opened file
    secondPacket=''
    for j in buff:
        secondPacket+="{:02x}".format(j)
    if(secondPacket==firstPacket):
        print(f'Packet {i+1} identical with {i}')
    firstPacket=secondPacket
I have found int fseek ( FILE * stream, long int offset, int origin ); but it's unclear if it reads the first byte that is located offset away from origin or everything in between.
Thanks for clarifications.
#include <iostream>
#include <fstream>
#include <array>
std::array<char, 1460> firstPacket;
std::array<char, 1460> secondPacket;
int i=0;
int main() {
    std::ifstream file;
    file.open("photo.jpg", std::ios::binary);
    while (file.read(firstPacket.data(), firstPacket.size())){
        ++i;
        if (firstPacket==secondPacket)
            std::cout<<"Packet "<<i<<" is a copy of packet "<<i-1<<std::endl;
        memcpy(&secondPacket, &firstPacket, firstPacket.size());
    }
    std::cout<<i; //tested to check if i iterate correctly
    return 0;
}
This is the code I have so far, which doesn't work.
fseek doesn't read anything; it just moves the position where the next read operation will begin. If you read the file from start to end, you don't need it.
To read binary data you want the aptly named std::istream::read. You can use it like this with a fixed-size buffer:
// char is one byte, could also be uint8_t, but then you would need a cast later on
std::array<char, 1460> bytes;
while(myInputStream.read(bytes.data(), bytes.size())) {
    // do work with the read data
}

Why doesn't file.write() store bytes in the sequence I have given ? c++

Short:
I want to write data byte for byte in sequence to a file.
The data is transferred to the file with file.write.
But when I review the file with hexdump the written data is not in sequence.
Here is my code:
#include <iostream>
#include <fstream>
#include <stdint.h>
int main() {
    // array with four bytes I want to write
    // This should be 0x01020304 in memory
    char write_arr[4]={1,2,3,4};
    // int with four bytes I want to write
    // I use little endian so this should be 0x04030201 in memory
    int write_num=0x01020304;
    std::ofstream outfile;
    outfile.open("output.txt",std::ios::out | std::ios::binary | std::ios::trunc);
    if( outfile.is_open() ) {
        outfile.write( write_arr ,sizeof(write_arr)/sizeof(char) );
        outfile.write( reinterpret_cast<char *>(&write_num),sizeof(write_num) );
        outfile.close();
    }
    return 0;
}
When I use hexdump on the output it displays this:
0201 0403 0304 0102
The bytes have been rearranged.
I expect the output to be:
0102 0304 0403 0201
Why is the rearranging happening ?
And how can I achieve a transfer where the bytes are in sequence ?
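By default, hexdump prints 2-byte words, not individual bytes, and on a little-endian machine each word is shown with its two bytes swapped. The file itself is in the order you wrote it.
That is what is confusing you. For a byte-by-byte dump, try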
od -t x1

Writing preceding zeros with ofstream

I am writing a program that will read and write a file format that dictates the content of the file, byte by byte. The nature of this format is that the first two bytes give how many bytes are left in that part of the file, followed by another two bytes that indicate what that part of the file actually represents. This pattern is repeated for the length of the file. This means I have to write the exact numbers padded with leading zeros such that each component is the exact size it needs to be. I have written up a dummy file that illustrates my points:
#include <fstream>
#include <stdint.h>
int main() {
    std::ofstream outputFile;
    outputFile.open("test.txt",
                    std::ios::out | std::ios::ate | std::ios::binary);
    const int16_t HEADER = 0x0002;
    int16_t recordSize = 2*sizeof(int16_t);
    int16_t version = 0x0258;
    outputFile << recordSize << HEADER << version;
    outputFile.close();
}
which writes a file named "test.txt" whose hex contents are:
34 32 36 30 30
and for those of us that can read straight hex this translates to:
42600
As you can see the preceding zeros are removed and my record is not what I was hoping it to be. Is there a way to use ofstream to buffer my numbers with zeros as I naively tried to do by using int16_t for all of the writes that I wanted to be exactly two bytes long? Is there another, possibly more stylistically correct way of doing this?
operator<< is for text formatting. You probably want to use .write() instead.
e.g.
outputFile.write(reinterpret_cast<const char*>(&recordSize), sizeof(int16_t));
outputFile.write(reinterpret_cast<const char*>(&HEADER), sizeof(int16_t)); // const char*: HEADER is const
// ...

Reading and writing int to a binary file in C++

I am unclear about how reading long integers work. If I say
long int a[1]={666666};
ofstream o("ex",ios::binary);
o.write((char*)a,sizeof(a));
to store values to a file and want to read them back as it is
long int stor[1];
ifstream i("ex",ios::binary);
i.read((char*)stor,sizeof(stor));
how will I be able to display the same number as stored using the information stored in multiple bytes of character array?
o.write does not write characters, it writes raw bytes. (The ios::binary flag only turns off newline translation; write itself is always unformatted.) A char pointer is used because a char is exactly 1 byte long.
o.write((char*)a,sizeof(a));
(char*)a is the address of the data that o.write should write. It then writes sizeof(a) bytes to the file. No characters are stored, just bytes.
If you open the file in a hex editor, you would see something like this for an int with the value 10:
0A 00 00 00 (4 bytes, least significant byte first, on x64).
Reading is analogous.
Here is a working example:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main (int argc, char* argv[]){
    const char* FILENAM = "a.txt";
    int toStore = 10;
    ofstream o(FILENAM,ios::binary);
    o.write((char*)&toStore,sizeof(toStore));
    o.close();
    int toRestore=0;
    ifstream i(FILENAM,ios::binary);
    i.read((char*)&toRestore,sizeof(toRestore));
    cout << toRestore << endl;
    return 0;
}
Sorry I took so long to see your question.
I think the difference is that binary mode reads and writes the file as is, while non-binary (i.e. text) mode may translate the end-of-line character '\n'. Depending on the target operating system, '\n' may be written as "\r\n" (Windows), as '\r' (classic Mac OS), or left as '\n' (Unix), and the translation is reversed on reading.
I think if you read and write an integer in text mode, it will usually come back fine and look correct. But if some byte(s) of the integer happen to look like '\r' or '\n', the translation can alter them and the integer will not read back correctly from the file.
Binary mode ensures that reading back an int will always be correct. But you want text mode when the file is meant to be read in a text editor such as Windows Notepad.