Byte output to binary file C++

I'm writing a Huffman coder and everything was fine until I tried to save the result to the archive file. Our teacher suggested the following function (it takes one bit at a time and, after collecting 8 of them, should output a byte):
long buff=0;
int counter=0;
std::ofstream out("output", std::iostream::binary);
void putbit(bool b)
{
buff<<=1;
if (b) buff++;
counter++;
if (counter>=8)
{
out.put(buff);
counter=0;
buff=0;
}
}
I tried an example with inputting sequence of bits like this:
0011001011001101111010010001000001010101101100
but the output file in binary mode includes just: 1111111
Since buff holds the correct values (25 102 250 68 21 108), I suspected that I had copied the code into my notebook incorrectly and that something is wrong with this line:
out.put(buff);
I tried to replace it with this line:
out << buff;
but got: 1111111111111111
Another way was:
out.write((char *) &buff, 8);
which gives:
100000001000000010000000100000001000000010000000
It looks like the closest to the correct answer, but it still doesn't work correctly.
Maybe I don't understand something about file output.
Question:
Could you explain how to make it work and why the previous variants are wrong?
UPD:
The input comes from this function:
void code(std::vector<bool> cur, std::vector<bool> sh, std::vector<bool>* codes, Node* r)
{
if (r->l)
{
cur.push_back(0);
if (r->l->symb)
{
putbit(0);
codes[(int)r->l->symb] = cur;
for (int i=7; i>=0; i--)
{
if ((int)r->l->symb & (1 << i))
putbit(1);
else putbit(0);
}
}
else
{
putbit(0);
code(cur, sh, codes, r->l);
}
cur.pop_back();
}
if (r->r)
{
cur.push_back(1);
if (r->r->symb)
{
putbit(1);
codes[(int)r->r->symb] = cur;
for (int i=7; i>=0; i--)
{
if ((int)r->r->symb & (1 << i))
putbit(1);
else putbit(0);
}
}
else
{
putbit(1);
code(cur, sh, codes, r->r);
}
cur.pop_back();
}
}

The thing is, your putbit function works (though it's terrible: you use globals, and your buffer should be a char).
For example, this is how I tested your function.
out.open( "outfile", std::ios::binary );
if ( out.is_open() ) {
putbit(1);
putbit(1);
putbit(0);
putbit(1);
putbit(0);
putbit(1);
putbit(0);
putbit(0);
out.close();
}
This should output 1101 0100, or d4 in hex.
I believe this is an XY problem. The problem you're trying to solve is not in the putbit function but in the way you use it and in your algorithm.
You said that you had the right values before writing your data to the output file. There are many similar questions on Stack Overflow; just look for them.
The real problem is that the putbit function alone is not enough to solve your problem. You rely on the fact that it writes a byte after you call it 8 times. What if the total number of bits is not a multiple of 8? The leftover bits stay in the buffer and are never written. Also, you never flush your file (at least in the code you posted), so there's no guarantee that all data will be written.
First you must understand how file handles (streams) work. Open your file locally, check that it's open, and close it when you're done. Closing also guarantees that all data in the file buffer is written to the file.
outfile.open( "output", std::ios::binary );
if ( outfile.is_open() ) {
// ... use file ...
outfile.close();
}
else {
// Couldn't open file!
}
Other questions solve this by writing, or using, a BitStream. It would look somewhat like this,
class OutBitstream {
public:
OutBitstream();
~OutBitstream(); // close file
bool isOpen();
void open( const std::string &file );
void close(); // close file, also write pending bits
void writeBit( bool b ); // or putbit, use the names you prefer
void writeByte( char c );
void writePendingBits(); // write the bits still in the buffer; there may be
// fewer than 8, so you may have to pad them
private:
std::ofstream _out;
char _bitBuffer; //or std::bitset<8>
int _numbits;
};
With this interface it should be easier to handle bit output. No globals as well. I hope that helps.
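For illustration, here is one way the interface above could be filled in. This is only a sketch under my own assumptions: bits are packed MSB-first (as putbit does), the last byte is padded with zero bits, and the buffer is an unsigned char to avoid sign issues when shifting. None of these choices come from the original answer.

```cpp
#include <fstream>
#include <string>

// Sketch of the OutBitstream interface; the zero padding is an assumption.
class OutBitstream {
public:
    OutBitstream() = default;
    ~OutBitstream() { close(); }

    bool isOpen() const { return _out.is_open(); }

    void open(const std::string& file) {
        _out.open(file, std::ios::binary);
    }

    // Flush any partial byte, then close the underlying file.
    void close() {
        if (_out.is_open()) {
            writePendingBits();
            _out.close();
        }
    }

    // Shift the new bit in from the right; emit a byte once 8 bits are collected.
    void writeBit(bool b) {
        _bitBuffer = static_cast<unsigned char>((_bitBuffer << 1) | (b ? 1 : 0));
        if (++_numBits == 8) {
            _out.put(static_cast<char>(_bitBuffer));
            _bitBuffer = 0;
            _numBits = 0;
        }
    }

    // Write the 8 bits of c, most significant bit first.
    void writeByte(char c) {
        for (int i = 7; i >= 0; --i)
            writeBit(((static_cast<unsigned char>(c) >> i) & 1) != 0);
    }

    // Pad the current partial byte with zero bits until it is written out.
    void writePendingBits() {
        while (_numBits != 0) writeBit(false);
    }

private:
    std::ofstream _out;
    unsigned char _bitBuffer = 0;
    int _numBits = 0;
};
```

Because of the padding, a decoder cannot tell real bits from filler, so you would also need to store the number of valid bits (or the symbol count) somewhere in the file.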

Related

C, C++ extract struct member from binary file

I'm using the following code to extract a struct member from a binary file.
I'm wondering why this prints out multiple times? when there is only one ID record, and only one struct in the file. I need to access just this member, what is the best way to do it?
I don't really understand what the while loop is doing? Is it testing for whether the file is open and returning 1 until that point?
Why use fread inside the while loop?
Does the fread need to be set to the specific size of the struct member?
Is the printf statement reading the binary and outputting an int?
FILE *p;
struct myStruct x;
p=fopen("myfile","rb");
while(1) {
size_t n = fread(&x, sizeof(x), 1, p);
if (n == 0) {
break;
}
printf("\n\nID:%d", x.ID); // Use matching specifier
fflush(stdout); // Ensure output occurs promptly
}
fclose(p);
return 0;
The struct looks like this:
struct myStruct
{
int cm;
int bytes;
int ID;
int version;
char chunk[1];
};
Not really an answer but to answer a comment.
Just do
FILE *p = fopen("myfile","rb");
if (p == NULL) {
    // Some error message: the file could not be opened
} else {
    struct myStruct x;
    size_t n = fread(&x, sizeof(x), 1, p);
    if (n != 1) {
        // Some error message
    } else {
        printf("\n\nID:%d\n", x.ID);
    }
    ...Do as you wish with the rest of the file
    fclose(p);
}
I'm wondering why this prints out multiple times? when there is only one ID record, and only one struct in the file.
It won't! So if you have multiple prints the likely explanation is that the file contains more than just one struct. Another explanation could be that the file (aka the struct) was not saved in the same way as you use for reading.
I need to access just this member, what is the best way to do it?
Your approach looks fine to me.
I don't really understand what the while loop is doing?
The while is there because the code should be able to read multiple structs from the file. Using while(1) means something like "loop forever". To get out of such a loop, you use break. In your code the break happens when it can't read more structs from the file, i.e. if (n == 0) { break; }
Is it testing for whether the file is open and returning 1 until that point?
No - see answer above.
Why use fread inside the while loop?
As above: to be able to read multiple structs from the file.
Does the fread need to be set to the specific size of the struct member?
Well, fread is not "set" to anything. It is told how many elements to read and the size of each element. Therefore you call it with sizeof(x).
Is the printf statement reading the binary and outputting an int?
No, the reading is done by fread. Yes, printf outputs the decimal value.
You can try out this code:
#include <stdio.h>
#include <unistd.h>
struct myStruct
{
int cm;
int bytes;
int ID;
int version;
char chunk[1];
};
void rr()
{
printf("Reading file\n");
FILE *p;
struct myStruct x;
p=fopen("somefile","rb");
while(1) {
size_t n = fread(&x, sizeof(x), 1, p);
if (n == 0) {
break;
}
printf("\n\nID:%d", x.ID); // Use matching specifier
fflush(stdout); // Ensure output occurs promptly
}
fclose(p);
}
void ww()
{
printf("Creating file containing a single struct\n");
FILE *p;
struct myStruct x;
x.cm = 1;
x.bytes = 2;
x.ID = 3;
x.version = 4;
x.chunk[0] = 'a';
p=fopen("somefile","wb");
fwrite(&x, sizeof(x), 1, p);
fclose(p);
}
int main(void) {
if( access( "somefile", F_OK ) == -1 )
{
// If "somefile" isn't there already, call ww to create it
ww();
}
rr();
return 0;
}
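On the question of whether fread has to match the size of a single member: it does not have to, but it can. As a rough sketch (written as C++, and assuming the file was produced by fwrite-ing the same struct with the same compiler and platform, so the layout matches), you could seek to the member's offset and read only that int. The helper name readFirstId is made up for this example.

```cpp
#include <cstddef>
#include <cstdio>

struct myStruct {
    int cm;
    int bytes;
    int ID;
    int version;
    char chunk[1];
};

// Hypothetical helper: reads only the ID field of the first record in the file.
// Returns true on success and stores the value in *id.
bool readFirstId(const char* path, int* id) {
    std::FILE* p = std::fopen(path, "rb");
    if (!p) return false;
    // offsetof gives the byte position of ID inside the struct, padding included.
    bool ok = std::fseek(p, static_cast<long>(offsetof(myStruct, ID)), SEEK_SET) == 0
           && std::fread(id, sizeof *id, 1, p) == 1;
    std::fclose(p);
    return ok;
}
```

For a struct this small, reading the whole thing and picking x.ID is just as good; seeking only pays off when the records are large.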
Answers in-line
I'm wondering why this prints out multiple times? when there is only one ID record, and only one struct in the file. I need to access just this member, what is the best way to do it?
The file size is 2906 bytes, and fread reads only sizeof(x) bytes at a time (17 bytes of members, typically 20 once padding is added), so the loop runs many times.
I don't really understand what the while loop is doing? Is it testing for whether the file is open and returning 1 until that point?
The total number of elements successfully read is returned by fread
Why use fread inside the while loop?
In this case the while loop is not necessary; a single fread is enough. fread is sometimes called in a while loop when the input comes from some other source, such as a UART, and the program has to wait for the requested number of bytes to be read.
Does the fread need to be set to the specific size of the struct member?
No. Reading the entire struct is better
Is the printf statement reading the binary and outputting an int?
No

Using fscanf to read from /proc C++

I am currently getting started on writing a program that will read info from /proc using fscanf and I am not sure where to start. Looking through the man page for proc(5), I noticed that you can use fscanf to get certain attributes from the /proc directory.
For example, MemTotal %lu gives the total usable amount of RAM when reading /proc/meminfo. Would the fscanf call then look like this:
unsigned long memTotal=0;
FILE* file = fopen("/proc/meminfo", "r");
fscanf(file, "MemTotal %lu", &memTotal);
How would I iterate over the file while using fscanf to get certain values?
I wrote some code to do exactly [well, it wasn't exactly "/proc/meminfo", but reading data from a "/proc/something" using scanf] this at work the other day.
The principle is to check the return value of fscanf. With a single conversion it is EOF, 0 or 1: end of input, nothing matched, or the value you were looking for was found. If the result is EOF, you exit the loop. If it is 0 for all of your patterns, you need to do something else to skip the line; I use a loop around fgetc() to read to the end of the line.
If you want to read several elements, it's probably best to do that using some kind of list.
I'd probably do something like this:
// The literal text must match the file exactly; /proc/meminfo lines
// look like "MemTotal:       16384 kB", hence the colon.
std::vector<std::pair<std::string, unsigned long*>> list =
    { { "MemTotal: %lu", &memTotal },
      { "MemFree: %lu", &memFree },
      // ...
    };
bool done = false;
while (!done)
{
    bool found_something = false;
    for (const auto& i : list)
    {
        int res = fscanf(file, i.first.c_str(), i.second);
        if (res == EOF)
        {
            done = true;
            break;
        }
        if (res != 0)
        {
            found_something = true;
        }
    }
    if (!found_something)
    {
        // Skip line that didn't contain what we were looking for.
        int ch;
        while ((ch = fgetc(file)) != EOF)
        {
            if (ch == '\n')
                break;
        }
    }
}
This is just a sketch of how to do this, but it should give you an idea.

writing bits into a c++ file

I'm working on Huffman coding and I have built the chars frequency table with a
std::map<char,int> frequencyTable;
Then I built the Huffman tree, and then I built the codes table in this way:
std::map<char,std::vector<bool> > codes;
Now I want to read the input file character by character and encode each character through the codes table, but I don't know how to write bits into a binary output file.
Any advice?
UPDATE:
Now i'm trying with these functions:
void Encoder::makeFile()
{
char c,ch;
unsigned char ch2;
while(inFile.get(c))
{
ch=c;
//send the Huffman string to output file bit by bit
for(unsigned int i=0;i < codes[ch].size();++i)
{
if(codes[ch].at(i)==false){
ch2=0;
}else{
ch2=1;
}
encode(ch2, outFile);
}
}
ch2=2; // send EOF
encode(ch2, outFile);
inFile.close();
outFile.close();
}
and this:
void Encoder::encode(unsigned char i, std::ofstream & outFile)
{
int bit_pos=0; //0 to 7 (left to right) on the byte block
unsigned char c; //byte block to write
if(i<2) //if not EOF
{
if(i==1)
c |= (i<<(7-bit_pos)); //add a 1 to the byte
else //i==0
c=c & static_cast<unsigned char>(255-(1<<(7-bit_pos))); //add a 0
++bit_pos;
bit_pos%=8;
if(bit_pos==0)
{
outFile.put(c);
c='\0';
}
}
else
{
outFile.put(c);
}
}
but, I don't know why, it doesn't work: the loop is never executed and the encode function is never called. Why?
You can't write a single bit directly to a file. The unit of I/O for reading and writing is a byte (8 bits). So you need to pack your bools into chunks of 8 bits and then write the bytes. See Writing files in bit form to a file in C or How to write single bits to a file in C for example.
The C++ standard streams give access to the smallest unit the underlying platform can address, which is a byte.
There are implementations of a bit stream class in C++, such as the
Stanford Bitstream class.
Another approach could use the std::bitset class.
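To make the std::bitset idea a bit more concrete, here is a minimal sketch that packs a std::vector<bool> (such as the concatenated Huffman codes) into bytes. The function name packBits, the MSB-first bit order and the zero padding of the last byte are my assumptions, not something the answers above prescribe.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs bits MSB-first into bytes, zero-padding the last byte.
std::vector<unsigned char> packBits(const std::vector<bool>& bits) {
    std::vector<unsigned char> bytes;
    for (std::size_t i = 0; i < bits.size(); i += 8) {
        std::bitset<8> byte;  // all eight bits start at 0
        for (std::size_t j = 0; j < 8 && i + j < bits.size(); ++j)
            byte[7 - j] = bits[i + j];  // first input bit goes to the top bit
        bytes.push_back(static_cast<unsigned char>(byte.to_ulong()));
    }
    return bytes;
}
```

The resulting bytes can then be written in one call, e.g. with write() on a std::ofstream opened with std::ios::binary. The decoder still needs to know how many bits are valid, so the bit count has to be stored alongside the data.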

C++ Make a file of a specific size

Here is my current problem: I am trying to create a file of x MB in C++. The user will enter in the file name then enter in a number between 5 and 10 for the size of the file they want created. Later on in this project i'm gonna do other things with it but I'm stuck on the first step of creating the darn thing.
My problem code (so far):
char empty[1024];
for(int i = 0; i < 1024; i++)
{
empty[i] = 0;
}
fileSystem = fopen(argv[1], "w+");
for(int i = 0; i < 1024*fileSize; i++){
int temp = fputs(empty, fileSystem);
if(temp > -1){
//Success!
}
else{
cout<<"error"<<endl;
}
}
Now if I'm doing my math correctly, 1 char is 1 byte. There are 1024 bytes in 1 KB and 1024 KB in 1 MB. So if I wanted a 2 MB file, I'd have to write 1024*1024*2 bytes to this file. Yes?
I don't encounter any errors but I end up with a file of 0 bytes... I'm not sure what I'm doing wrong here so any help would be greatly appreciated!
Thanks!
Potentially sparse file
This creates output.img of size 300 MB:
#include <fstream>
int main()
{
std::ofstream ofs("output.img", std::ios::binary | std::ios::out);
ofs.seekp((300<<20) - 1);
ofs.write("", 1);
}
Note that technically, this will be a good way to trigger your filesystem's support for sparse files.
Dense file - filled with 0's
Functionally identical to the above, but filling the file with 0's:
#include <iostream>
#include <fstream>
#include <vector>
int main()
{
std::vector<char> empty(1024, 0);
std::ofstream ofs("output.img", std::ios::binary | std::ios::out);
for(int i = 0; i < 1024*300; i++)
{
if (!ofs.write(&empty[0], empty.size()))
{
std::cerr << "problem writing to file" << std::endl;
return 255;
}
}
}
Your code doesn't work because you are using fputs which writes a null-terminated string into the output buffer. But you are trying to write all nulls, so it stops right when it looks at the first byte of your string and ends up writing nothing.
Now, to create a file of a specific size, all you need to do is call the truncate function (or _chsize on Windows) exactly once and set the size you want the file to be.
Good luck!
To make a 2 MB file, seek to offset 2*1024*1024 - 1 and write a single zero byte. Calling fputs() on an empty string does no good, no matter how many times you call it, and the string is empty because C strings are 0-terminated: the very first byte of your buffer is the terminator.
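If you want to stay close to the original stdio code, the smallest change is to swap fputs for fwrite, which takes an explicit byte count and does not care about terminators. A rough sketch follows; the helper name makeZeroFile is made up, and I open the file in "wb" mode so nothing is translated on Windows.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes `megabytes` MiB of zero bytes to `path`.
// Returns true only if every block was written and the file closed cleanly.
bool makeZeroFile(const char* path, std::size_t megabytes) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::vector<char> block(1024, 0);  // one KiB of zeros
    bool ok = true;
    for (std::size_t i = 0; ok && i < 1024 * megabytes; ++i)
        ok = std::fwrite(block.data(), 1, block.size(), f) == block.size();
    return std::fclose(f) == 0 && ok;
}
```

Calling makeZeroFile(argv[1], fileSize) would then correspond to the loop in the question.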

Possible reasons for tellg() failing?

ifstream::tellg() is returning -13 for a certain file.
Basically, I wrote a utility that analyzes some source code; I open all files alphabetically, I start with "Apple.cpp" and it works perfectly.. But when it gets to "Conversion.cpp", always on the same file, after reading one line successfully tellg() returns -13.
The code in question is:
for (int i = 0; i < files.size(); ++i) { /* For each .cpp and .h file */
TextIFile f(files[i]);
while (!f.AtEof()) // When it gets to conversion.cpp (not on the others)
// first is always successful, second always fails
lines.push_back(f.ReadLine());
The code for AtEof is:
bool AtEof() {
if (mFile.tellg() < 0)
FATAL(format("DEBUG - tellg(): %d") % mFile.tellg());
if (mFile.tellg() >= GetSize())
return true;
return false;
}
After it reads successfully the first line of Conversion.cpp, it always crashes with DEBUG - tellg(): -13.
This is the whole TextIFile class (written by me, the error may be there):
class TextIFile
{
public:
TextIFile(const string& path) : mPath(path), mSize(0) {
mFile.open(path.c_str(), std::ios::in);
if (!mFile.is_open())
FATAL(format("Cannot open %s: %s") % path.c_str() % strerror(errno));
}
string GetPath() const { return mPath; }
size_t GetSize() {
if (mSize) return mSize;
const size_t current_position = mFile.tellg();
mFile.seekg(0, std::ios::end);
mSize = mFile.tellg();
mFile.seekg(current_position);
return mSize;
}
bool AtEof() {
if (mFile.tellg() < 0)
FATAL(format("DEBUG - tellg(): %d") % mFile.tellg());
if (mFile.tellg() >= GetSize())
return true;
return false;
}
string ReadLine() {
string ret;
getline(mFile, ret);
CheckErrors();
return ret;
}
string ReadWhole() {
string ret((std::istreambuf_iterator<char>(mFile)), std::istreambuf_iterator<char>());
CheckErrors();
return ret;
}
private:
void CheckErrors() {
if (!mFile.good())
FATAL(format("An error has occurred while performing an I/O operation on %s") % mPath);
}
const string mPath;
ifstream mFile;
size_t mSize;
};
Platform is Visual Studio, 32 bit, Windows.
Edit: Works on Linux.
Edit: I found the cause: line endings. Both Conversion and Guid and others had \n instead of \r\n. I saved them with \r\n instead and it worked. Still, this is not supposed to happen is it?
It's difficult to guess without knowing exactly what's in Conversion.cpp. However, using < with stream positions is not defined by the standard. You might want to consider an explicit cast to the correct integer type before formatting it; I don't know what formatting FATAL and format() expect to perform or how the % operator is overloaded. Stream positions don't have to map in a predictable way to integers, certainly not if the file isn't opened in binary mode.
You might want to consider an alternative implementation for AtEof(). Say something like:
bool AtEof()
{
return mFile.peek() == ifstream::traits_type::eof();
}
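Another option that sidesteps tellg() and file sizes entirely is to let std::getline itself decide when to stop: the stream converts to false once a read fails, including at end of file. A minimal sketch (the free function readLines is hypothetical and simply returns an empty vector if the file cannot be opened):

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line of a text file into a vector.
std::vector<std::string> readLines(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream in(path);
    std::string line;
    // The loop ends when getline fails, so no separate EOF test is needed.
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}
```

With this shape the caller's loop no longer depends on stream positions behaving like integers, which is the part that went wrong in text mode.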