File Size to store an integer

I want to write an integer (for example, 222222) to a text file in a way that reduces the size of the file. If I write the integer as a string, it takes 6 bytes because of the six characters. If I write it as an integer, it again takes 6 bytes. Why isn't the file size 4 bytes, since an int takes 4 bytes?
#include <stdio.h>

int main()
{
    //char* x = "222222.2222";
    //double x = 222222.2222;
    int x = 222222;
    FILE *fp = fopen("now.txt", "w");
    if (fp == NULL)
        return 1;            // could not open the file
    fprintf(fp, "%d", x);    // writes the six characters "222222"
    fclose(fp);
    return 0;
}

Here is the definition of fprintf:
writes the C string pointed by format to the stream.
So whatever you pass to the function is converted to text according to the format string, which is why the output file contains the six characters 222222.
If you want to store the integer's binary representation rather than a string, you could use fwrite:
int x = 222222;
FILE *fp = fopen("now.txt","wb");
fwrite(&x, sizeof(int), 1, fp);
fclose(fp);
Then the file stores 0E 64 03 00 if you switch your editor to hex mode (on a little-endian machine with a 4-byte int). It's 4 bytes.
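To see the size difference directly, here is a minimal sketch that writes the same value both ways and counts the bytes on disk. The helper names and the file names used with them are made up for illustration, and error handling is kept to a minimum.

```cpp
#include <cstdio>

// Counts the bytes in a file by reading it back in binary mode; -1 on failure.
long file_size(const char* path) {
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return -1;
    long n = 0;
    while (std::fgetc(fp) != EOF) ++n;
    std::fclose(fp);
    return n;
}

// Writes value as decimal text: one byte per digit.
bool write_as_text(const char* path, int value) {
    std::FILE* fp = std::fopen(path, "w");
    if (!fp) return false;
    std::fprintf(fp, "%d", value);
    return std::fclose(fp) == 0;
}

// Writes the in-memory bytes of value: always sizeof(int) bytes.
bool write_as_binary(const char* path, int value) {
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    std::fwrite(&value, sizeof value, 1, fp);
    return std::fclose(fp) == 0;
}
```

For 222222 the text file should come out at 6 bytes and the binary file at sizeof(int) bytes, which is 4 on most current platforms.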

There is a simple reason behind this.
Formatted output functions such as fprintf store numbers as characters. So when you write the integer 222222 to a file this way, it is written character by character, one byte per digit, not as a 4-byte integer.

When you write an integer in its in-memory form, the file becomes a binary file.
When you write and read binary files, you have to take care of padding, byte order, and so on.
The alternative is plain text: you read it back as strings and convert them to integers with library functions.
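The two approaches can be compared side by side with a small round trip. This is only a sketch: the function names are hypothetical, there is no error handling, and the binary version assumes the file is read back on the same kind of machine that wrote it.

```cpp
#include <fstream>
#include <string>

// Text round trip: the digits go to disk and std::stoi parses them back.
int text_round_trip(const std::string& path, int value) {
    {
        std::ofstream out(path);
        out << value;                 // formatted output: "222222"
    }                                 // the stream is closed here
    std::ifstream in(path);
    std::string digits;
    in >> digits;
    return std::stoi(digits);
}

// Binary round trip: the raw bytes of the int go to disk and come back as is.
int binary_round_trip(const std::string& path, int value) {
    {
        std::ofstream out(path, std::ios::binary);
        out.write(reinterpret_cast<const char*>(&value), sizeof value);
    }
    std::ifstream in(path, std::ios::binary);
    int result = 0;
    in.read(reinterpret_cast<char*>(&result), sizeof result);
    return result;
}
```

Both return the original value, but only the text file is portable between machines with a different byte order or int size without extra work.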

Related

How to write byte(s) to a file in C++?

I have created a bitset using std::bitset<8> bits which is equivalent to 00000000 i.e., 1 byte.
I have output file defined as std::ofstream outfile("./compressed", std::ofstream::out | std::ofstream::binary) but when I write the bits using outfile << bits, the content of outfile becomes 00000000 but the size of file is 8 bytes. (each bit of bits end up taking 1 byte in the file). Is there any way to truly write byte to a file? For example if I write 11010001 then this should be written as a byte and the file size should be 1 byte not 8 bytes. I am writing a code for Huffman encoder and I am not able to find a way to write the encoded bytes to the output compressed file.

The issue is that operator<< performs formatted (text) output, even if you've specified std::ofstream::binary. You can use put to write a single character, or write to output multiple characters. Note that you are responsible for converting your data to its char representation.
std::bitset<8> bits = foo();
std::ofstream outfile("compressed", std::ofstream::out | std::ofstream::binary);
// In reality, your conversion code is probably more complicated than this
char repr = static_cast<char>(bits.to_ulong());
// Use scoped sentries to output with put/write
{
    std::ofstream::sentry sentry(outfile);
    if (sentry)
    {
        // Pick one option; running both writes the byte twice
        outfile.put(repr); // <- Option 1
        outfile.write(&repr, sizeof repr); // <- Option 2
    }
}
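For a Huffman encoder the bit sequence is usually longer than one byte, so the conversion step has to pack bits into bytes. A minimal sketch, assuming the encoded bits are available as a string of '0' and '1' characters (the name pack_bits and the most-significant-bit-first layout are choices made for this example, not something the question prescribes):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Packs a string of '0'/'1' characters into bytes, most significant bit first.
// A final partial byte is padded with zero bits on the right.
std::vector<unsigned char> pack_bits(const std::string& bits) {
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == '1') {
            bytes[i / 8] |= static_cast<unsigned char>(0x80u >> (i % 8));
        }
    }
    return bytes;
}
```

The result can then be written in one call with outfile.write(reinterpret_cast<const char*>(bytes.data()), bytes.size()). Because of the padding, a decoder also needs to know how many bits are valid, for example by storing the bit count at the start of the file.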

c++ how to write integers to a binary file that stay 4 bytes long

I want to write a bunch of integers to a file and then be able to read them later. My problem is that when I write the integers to a file, smaller integers end up using less than 4 bytes. So 1 for example is represented as 01 rather than 00 00 00 01. This means I'll have trouble reading the file because I don't know where one integer begins and ends. How do I make it so that the integer I write to the file is always 4 bytes long? My code is below:
std::fstream file;
file.open("test.bin", std::ios::out | std::ios::binary);
for each(int i in vectorOfInts) {
file << i;
}
file.close();

You seem to be confusing text and binary files. The << operator is used for text output: it converts the value to text and writes that to the file. You need to use the write method to write an integer in native binary format. The line below writes the sizeof(i) bytes of i (4 on most platforms) to the file.
file.write( reinterpret_cast<const char *>(&i), sizeof(i));
You may also need to consider the endianness of data depending on what will be reading the data back.
You could also write the whole vector without a loop using:
file.write( reinterpret_cast<const char *>(&vectorOfInts[0]), vectorOfInts.size()*sizeof(int));
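Reading the values back follows the same pattern with read. The sketch below pairs a writer with a reader; the function names are illustrative, and it assumes the file is read on a machine with the same int size and byte order as the one that wrote it.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Writes every int in native binary form, sizeof(int) bytes each.
void write_ints(const std::string& path, const std::vector<int>& values) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(int)));
}

// Reads fixed-size ints until the file runs out of complete values.
std::vector<int> read_ints(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<int> values;
    int v = 0;
    while (in.read(reinterpret_cast<char*>(&v), sizeof v)) {
        values.push_back(v);
    }
    return values;
}
```

Since every value occupies exactly sizeof(int) bytes, the reader never has to guess where one integer ends and the next begins.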

How to read a binary number as input?

Is there a way for the user to input a binary number in C or C++?
If we write something like
int a = 0b1010;
std::cout << a << std::endl;
then the output is 10 (when using the appropriate compiler extensions).
But when we try to write
int n;
std::cin >> n;
int t = 0bn;
it gives an error. Can anyone suggest how to read a binary number directly as input, rather than using a string to store the input?

There is a bit of confusion here; let's disentangle it.
0b1010 is an integer literal: a constant, compile-time integer value written in base 2. Likewise, 0xA is a literal in base 16 and 10 is in base 10. All of these refer to the same integer; they are just different ways of telling the compiler which number you mean. At runtime, in memory, this integer is always represented as a base-2 number.
std::cout << a; takes the integer value of a and outputs a string representation of it. By default it outputs it in base 10, but you can, for example, use the std::hex manipulator to have it output in base 16. There is no predefined manipulator to print in binary, so you need to do that on your own (or search for it; it is a common question).
Finally, 0b is only used to define integer literals. It is not a runtime operator. Recall that all ints are represented as base-2 numbers in memory. Other bases do not exist from the machine's point of view; an int is an int, so there is nothing to convert. If you need to read a binary number from a string, you would write the reverse of the code you use to print it (std::cin >> n assumes that the input is a base-10 number, so it reads the wrong number if the input is actually intended to be in base 2).
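As a rough illustration of both directions, the sketch below prints a value in binary through std::bitset and parses a string of binary digits by hand. The function names are made up for this example, the width is fixed at 8 bits for printing, and the parser does not check for overflow.

```cpp
#include <bitset>
#include <string>

// Returns the 8-bit binary text of value, e.g. 10 becomes "00001010".
std::string to_binary8(unsigned value) {
    return std::bitset<8>(value).to_string();
}

// Parses a string of '0'/'1' characters into out.
// Returns false for an empty string or any other character.
bool parse_binary(const std::string& text, unsigned& out) {
    if (text.empty()) return false;
    unsigned result = 0;
    for (char c : text) {
        if (c != '0' && c != '1') return false;
        result = (result << 1) | static_cast<unsigned>(c - '0');
    }
    out = result;
    return true;
}
```

Reading the input with std::cin >> into a std::string and passing it to parse_binary gives the behaviour the question asks for, with the string only serving as a temporary buffer.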

While there is no function to read binary numbers directly, there are the strtoX functions (strtol, strtoul, and so on, where the suffix names the result type), which convert a string containing a binary number (or a number in any other base) to a numeric value.
So the solution is to first read the number as a string and then convert it.
Example:
char input[100];
char *endpointer;
fgets(input, sizeof input, stdin); // read one line; std::cin.getline(input, sizeof input) also works
int n = (int) strtol(input, &endpointer, 2);
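The end pointer and errno make it possible to tell valid input from garbage. A minimal sketch of such a check (parse_base2 is a hypothetical helper name; a trailing newline, as left behind by fgets, is tolerated):

```cpp
#include <cerrno>
#include <cstdlib>

// Converts a base-2 string to long; returns false if the text is not a
// valid binary number or does not fit in a long.
bool parse_base2(const char* text, long& out) {
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(text, &end, 2);
    if (end == text) return false;                   // no digits were consumed
    if (errno == ERANGE) return false;               // value out of range for long
    if (*end != '\0' && *end != '\n') return false;  // trailing characters, e.g. "102"
    out = value;
    return true;
}
```

Note that strtol itself skips leading whitespace and accepts an optional sign, so this helper does too.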

To take a binary number as input, there are two ways I use frequently.
(Key note: take the input as a string! Use #include <string>.)
1. The to_ulong() method of the std::bitset template
For this you need #include <bitset>.
Example:
string s;
cin >> s; // Suppose s = "100100101"
int n = (int) bitset<64>(s).to_ulong();
cout << n; // 293
2. The stoi() function of the string library
For this you need #include <string>.
Example:
string s;
cin >> s; // Suppose s = "100100101"
int n = stoi(s, nullptr, 2);
cout << n; // 293

Or do it yourself:
uint32_t a = 0;
int c; // int, not char, so that EOF can be detected
while ((c = getchar()) != '\n' && c != EOF) { // read a line char by char
    a <<= 1; // shift a one bit to the left
    a += (c - '0') & 1; // convert the char to 0/1 and put it in the lowest bit
}
printf("%u\n", a);

What is the correct way to output hex data to a file?

I've read about [ostream] << hex << 0x[hex value], but I have some questions about it
(1) I defined my file stream, output, to be a hex output file stream, using output.open("BWhite.bmp",ios::binary);, since I did that, does that make the hex parameter in the output<< operation redundant?
(2) If I have an integer value I wanted to store in the file, and I used this:
int i = 0;
output << i;
would i be stored in little endian or big endian? Will the endianness change based on which computer the program is executed or compiled on?
Does the size of this value depend on the computer it's run on? Would I need to use the hex parameter?
(3) Is there a way to output raw hex digits to a file? If I want the file to have the hex digit 43, what should I use?
output << 0x43 and output << hex << 0x43 both output ASCII 4, then ASCII 3.
The purpose of outputting these hex digits is to make the header for a .bmp file.

The formatted output operator << is for just that: formatted output. It produces text.
As such, the std::hex stream manipulator tells streams to output numbers as text formatted in hexadecimal.
If you want to output raw binary data, use the unformatted output functions only, e.g. basic_ostream::put and basic_ostream::write.
You could output an int like this:
int n = 42;
output.write(reinterpret_cast<const char*>(&n), sizeof(int));
The endianness of this output will depend on the architecture. If you wish to have more control, I suggest the following:
int32_t n = 42;
char data[4];
data[0] = static_cast<char>(n & 0xFF);
data[1] = static_cast<char>((n >> 8) & 0xFF);
data[2] = static_cast<char>((n >> 16) & 0xFF);
data[3] = static_cast<char>((n >> 24) & 0xFF);
output.write(data, 4);
This sample will output a 32 bit integer as little-endian regardless of the endianness of the platform. Be careful converting that back if char is signed, though.
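For the reverse direction, the signed-char pitfall can be avoided by converting each byte to unsigned char before combining. A minimal sketch (decode_le32 is a name chosen for this example):

```cpp
#include <cstdint>

// Rebuilds a 32-bit value from four bytes stored least significant byte first.
// Each char is converted to unsigned char first so that a negative (signed)
// char does not sign-extend and corrupt the higher bits.
std::uint32_t decode_le32(const char* data) {
    std::uint32_t result = 0;
    for (int i = 3; i >= 0; --i) {
        result = (result << 8) | static_cast<unsigned char>(data[i]);
    }
    return result;
}
```

Like the writing code, this gives the same result on little-endian and big-endian hosts, because it works on values rather than on the in-memory layout.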

You say
"Is there a way to output raw hex digits to a file? If I want the file to have the hex digit 43, what should I use? "
What "raw hex digits" means depends on how you interpret a collection of bits. Consider the following:
Binary  : 0 1 0 0 1 0 1 0
Hex     : 4 A
Octal   : 1 1 2
Decimal : 7 4
ASCII   : J
All of the above represent the same numeric quantity, but we interpret it differently.
So you simply need to store the data in binary format, that is, the exact bit pattern that represents the number.
EDIT1
When you open a file in text mode and write a number to it with <<, say 74 (as in the example above), it is stored as the two ASCII characters '7' and '4'. To avoid this, open the file in binary mode (ios::binary) and write it with write(). See http://courses.cs.vt.edu/~cs2604/fall00/binio.html#write
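The same point can be made in code: one byte, several textual renderings. This sketch uses the standard stream manipulators; describe_byte is a hypothetical helper, not a library function.

```cpp
#include <sstream>
#include <string>

// Formats one byte as hex, octal, decimal and as a character, space separated.
std::string describe_byte(unsigned char byte) {
    std::ostringstream os;
    os << std::hex << std::uppercase << static_cast<int>(byte) << ' '
       << std::oct << static_cast<int>(byte) << ' '
       << std::dec << static_cast<int>(byte) << ' '
       << static_cast<char>(byte);
    return os.str();
}
```

For the byte 0x4A this yields "4A 112 74 J". All four are text produced from the same bit pattern; only write or put stores the bit pattern itself.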

The purpose of outputting these hex digits is to make the header for a .bmp file.
You seem to have a large misconception of how files work.
The stream operators << generate text (human-readable output). The .bmp file format is a binary format that is not human readable (well, it is, but it's not nice, and I would not read it without tools).
What you really want to do is generate binary output and place it the file:
char x = 0x43;
output.write(&x, sizeof(x));
This will write one byte of data with the hex value 0x43 to the output stream. This is the binary representation you want.
would i be stored in little endian or big endian? Will the endianness change based on which computer the program is executed or compiled on?
Neither; you are again outputting text (not binary data).
int i = 0;
output.write(reinterpret_cast<char*>(&i), sizeof(i)); // Writes the binary representation of i
Here you do need to worry about the endianness (and size) of the integer value, and this will vary depending on the hardware that you run your application on. For the value 0 there is not much to worry about regarding endianness, but you should worry about the size of the integer.
I would put some asserts into my code to validate that the architecture is OK for the code (CHAR_BIT comes from <climits>, assert from <cassert>). Then let people worry about it if their architecture does not match the requirements:
int test = 0x12345678;
assert((sizeof(test) * CHAR_BIT == 32) && "BMP uses 32 bit ints");
assert((((char*)&test)[0] == 0x78) && "BMP uses little endian");
There is a family of functions that will help you with endianness and size.
http://www.gnu.org/s/hello/manual/libc/Byte-Order.html
Function: uint32_t htonl (uint32_t hostlong)
This function converts the uint32_t integer hostlong from host byte order to network byte order.
// Writing to a file
uint32_t hostValue = 0x12345678;
uint32_t network = htonl(hostValue);
output.write(reinterpret_cast<const char*>(&network), sizeof(network));
// Reading from a file (input is an input stream opened in binary mode)
uint32_t network;
input.read(reinterpret_cast<char*>(&network), sizeof(network));
uint32_t hostValue = ntohl(network); // convert back to the platform-specific value
// Unfortunately, BMP was designed with Intel in mind,
// and thus all its integers are little-endian.
// Network byte order (as used by htonl() and family) is big-endian,
// so this may not be much use to you.
One last thing. Opening a file in binary mode with output.open("BWhite.bmp",ios::binary) changes nothing about the stream apart from how it treats the end-of-line sequence. In binary mode the output is not modified (what you put in the stream is what is written to the file). If you leave the stream in text mode, '\n' characters are converted to the end-of-line sequence (an OS-specific set of characters that marks the end of a line). Since you are writing a binary file, you definitely do not want any interference with the characters you write, so binary is the correct mode. But it does not affect any other operation that you perform on the stream.
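Since BMP stores its integers little-endian, one option is to build the header bytes by hand instead of relying on the host layout. The sketch below assembles the 14-byte file header as it is commonly documented (the "BM" signature, a 32-bit file size, two 16-bit reserved fields, and a 32-bit offset to the pixel data). Treat the field layout as an assumption to check against the BMP specification; the helper names are made up for this example, and the info header that follows in a real file is not covered.

```cpp
#include <cstdint>
#include <string>

// Appends a 16-bit value, least significant byte first.
void append_le16(std::string& out, std::uint16_t v) {
    out.push_back(static_cast<char>(v & 0xFF));
    out.push_back(static_cast<char>((v >> 8) & 0xFF));
}

// Appends a 32-bit value, least significant byte first.
void append_le32(std::string& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
    }
}

// Builds the 14-byte BMP file header (assumed layout, see the text above).
std::string bmp_file_header(std::uint32_t file_size, std::uint32_t pixel_offset) {
    std::string header;
    header += "BM";                     // signature bytes 0x42 0x4D
    append_le32(header, file_size);     // total size of the file in bytes
    append_le16(header, 0);             // reserved
    append_le16(header, 0);             // reserved
    append_le32(header, pixel_offset);  // where the pixel data starts
    return header;
}
```

The resulting string can be written with output.write(header.data(), header.size()) on a stream opened with ios::binary.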

Writing multiple variable types to a text file using ofstream

I just want to write a simple text file:
ofstream test;
test.clear();
test.open("test.txt",ios::out);
float var = 132.26;
BYTE var2[2];
var2[0] = 45;
var2[1] = 55;
test << var << (BYTE)var2[0] << (BYTE)var2[1];
test.close();
But in the output file I get:
132.26-7
I don't get what the problem is...

I think the problem might be that BYTE is a typedef for char. If so, then whenever you try to write a BYTE to a stream, it prints the ASCII character corresponding to that byte rather than the numeric value of the byte. Notice that the characters - and 7 correspond to ASCII values 45 and 55, for example.
To fix this, you'll want to do two things:
1. Cast the BYTEs you're writing to some integral type like int or short before writing them to the file. This forces the stream to write a numeric value rather than a character.
2. Output some whitespace between the values you write. Right now everything is bleeding together because there are no spaces, which makes things harder to read.
Hope this helps!

BYTE is nothing but an alias for unsigned char. By default, when you output a char in a stream, it is converted to its ASCII character. In the ASCII table, the character 45 is '-' and the character 55 is '7'.
Try this instead:
test << var << (int)var2[0] << (int)var2[1];
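The difference between the two outputs can be reproduced with a string stream. In this sketch, unsigned char stands in for BYTE (an assumption about the typedef), and both function names are illustrative.

```cpp
#include <sstream>
#include <string>

// Streams the bytes as characters, which is what the original code does.
std::string format_raw(float var, unsigned char b0, unsigned char b1) {
    std::ostringstream os;
    os << var << b0 << b1;
    return os.str();
}

// Streams the bytes as numbers, separated by spaces.
std::string format_values(float var, unsigned char b0, unsigned char b1) {
    std::ostringstream os;
    os << var << ' ' << static_cast<int>(b0) << ' ' << static_cast<int>(b1);
    return os.str();
}
```

With the values from the question, format_raw(132.26f, 45, 55) gives "132.26-7" and format_values(132.26f, 45, 55) gives "132.26 45 55".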
