I'm trying to read bytes from a binary file, but without success. I've tried many solutions, but I get no result.
Structure of the file:
[offset] [type]          [value]           [description]
0000     32 bit integer  0x00000803 (2051) magic number
0004     32 bit integer  60000             number of images
0008     32 bit integer  28                number of rows
0012     32 bit integer  28                number of columns
0016     unsigned byte   ??                pixel
0017     unsigned byte   ??                pixel
........
xxxx     unsigned byte   ??                pixel
What I tried (doesn't work):
auto myfile = fopen("t10k-images.idx3-ubyte", "r");
char buf[30];
auto x = fread(buf, 1, sizeof(int), myfile);
Read the bytes as unsigned char:
ifstream ifs;                       // "if" is a keyword, so the stream needs another name
ifs.open("filename", ios::binary);
if (ifs.fail())
{
    //error
}
ifs >> noskipws;                    // binary data: don't skip whitespace bytes
vector<unsigned char> bytes;
unsigned char byte;
while (ifs >> byte)                 // extraction fails at EOF or on error
{
    bytes.push_back(byte);
}
ifs.close();
Then, to turn multiple bytes into a 32-bit integer, for example:
uint32_t number;
number = (static_cast<uint32_t>(byte3) << 24)
       | (static_cast<uint32_t>(byte2) << 16)
       | (static_cast<uint32_t>(byte1) << 8)
       | (static_cast<uint32_t>(byte0));
This also takes care of endianness on the host: it doesn't matter whether an int is laid out as B0B1B2B3 or B3B2B1B0 in memory, because the conversion is done with bit shifts on values, not by reinterpreting memory. The code doesn't assume any particular byte order.
The C++ stream library function read() can be used for binary file I/O. Given the file layout above, I would start like this:
std::ifstream myfile("t10k-images.idx3-ubyte", std::ios::binary);
std::uint32_t magic, numim, numro, numco;
myfile.read(reinterpret_cast<char*>(&magic), 4);
myfile.read(reinterpret_cast<char*>(&numim), 4);
myfile.read(reinterpret_cast<char*>(&numro), 4);
myfile.read(reinterpret_cast<char*>(&numco), 4);
// Changing byte order if necessary
//endswap(&magic);
//endswap(&numim);
//endswap(&numro);
//endswap(&numco);
if (myfile) {
    std::cout << "Magic = " << magic << std::endl
              << "Images = " << numim << std::endl
              << "Rows = " << numro << std::endl
              << "Cols = " << numco << std::endl;
}
If the byte order (endianness) needs to be reversed, you can write a simple swap function like the endswap() used in the commented-out lines above.
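A minimal sketch of such a helper (the name endswap matches the commented-out calls above; this version byte-reverses any trivially copyable object in place):
#include <algorithm> // std::reverse

template <class T>
void endswap(T* obj)
{
    unsigned char* raw = reinterpret_cast<unsigned char*>(obj);
    std::reverse(raw, raw + sizeof(T)); // reverse the object's bytes in place
}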
Knowing the endianness of your file layout when reading multi-byte numerics is important. Assuming big-endian is always the written format, and assuming the value is indeed a 32-bit unsigned value:
uint32_t magic = 0;
unsigned char bytes[4];
if (1 == fread(bytes, sizeof(bytes), 1, f))
{
    magic = ((uint32_t)bytes[0] << 24) |
            ((uint32_t)bytes[1] << 16) |
            ((uint32_t)bytes[2] << 8) |
            (uint32_t)bytes[3];
}
Note: this will work regardless of whether the reader (your program) is little-endian or big-endian. The casts matter: without them, bytes[0] is promoted to a signed int, and shifting a one into the sign bit is undefined behavior. The only safe and portable way of reading multi-byte numerics is to (a) know the endianness they were written with, and (b) read and assemble them byte by byte.
This is how you read a uint32_t from a file:
auto f = fopen("t10k-images.idx3-ubyte", "rb"); // note the 'b': binary files need it
std::uint32_t magic = 0;
fread(&magic, sizeof(std::uint32_t), 1, f);
Hope this helps.
Related
I am trying to create bitmapped data in C. Here is the code I used, but I am not able to figure out the right logic:
bool a=1;
bool b=0;
bool c=1;
bool d=0;
uint8_t output = a|b|c|d;
printf("outupt = %X", output);
I want my output to be "1010", which is equivalent to hex 0x0A. How do I do it?
The bitwise OR operator ORs the bits in each position. The result of a|b|c|d will be 1 because you're ORing the values 1, 0, 1, 0 in the least significant bit position.
You can shift (<<) the bits to the correct positions like this:
uint8_t output = a << 3 | b << 2 | c << 1 | d;
This will result in
  00001000 (a << 3)
  00000000 (b << 2)
  00000010 (c << 1)
| 00000000 (d; d << 0)
  --------
  00001010 (output)
Strictly speaking, the calculation happens with ints and the intermediate results have more leading zeroes, but in this case we do not need to care about that.
If you're interested in setting/clearing/accessing specific bits very simply, you could consider std::bitset:
bitset<8> s; // bit set of 8 bits
s[3]=a; // access individual bits, as if it was an array
s[2]=b;
s[1]=c;
s[0]=d; // the first bit is the least significant bit
cout << s <<endl; // streams the bitset as a string of '0' and '1'
cout << "0x"<< hex << s.to_ulong()<<endl; // convert the bitset to unsigned long
cout << s[3] <<endl; // access a specific bit
cout << "Number of bits set: " << s.count()<<endl;
The advantage is that the code is easier to read and maintain, especially if you're modifying bitmapped data. Setting specific bits with a combination of the << and | operators, as explained by Anttii, is a workable solution. But clearing specific bits in an existing bitmap, by combining << and ~ (to create a bit mask) with &, is a little more tricky, as the sketch below shows.
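A minimal sketch contrasting the two styles of clearing one bit (the values here are just examples):
#include <bitset>
#include <cstdint>

int main()
{
    // Raw arithmetic: clear bit 3 of an existing bitmap with a mask.
    std::uint8_t bitmap = 0x0A;                      // 00001010
    bitmap &= static_cast<std::uint8_t>(~(1u << 3)); // now 00000010

    // bitset: same effect, and the intent is immediately visible.
    std::bitset<8> s(0x0A);
    s[3] = false;
}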
Another advantage is that you can easily manage large bitsets of hundreds of bits, much larger than the largest built-in type unsigned long long (although doing so will not allow you to convert as easily to an unsigned long or an unsigned long long: you'll have to go via a string).
C only
I would use bitfields. I know that they are not portable, but for particular embedded hardware (especially microcontrollers) their layout is well defined.
#include <string.h>
#include <stdio.h>
#include <stdbool.h>
typedef union
{
    struct
    {
        bool a:1;
        bool b:1;
        bool c:1;
        bool d:1;
        bool e:1;
        bool f:1;
    };
    unsigned char byte;
} mydata;
int main(void)
{
    mydata d;
    d.byte = 0; // clear all bits, including the two unused ones, before setting fields
    d.a = 1;
    d.b = 0;
    d.c = 1;
    d.d = 0;
    // Whether 'a' maps to the least or the most significant bit is
    // implementation-defined, so this may print 5 rather than A.
    printf("output = %hhX", d.byte);
}
Sorry for my bad English. I need to build an app which converts hex to RGB. I have a file U1.txt with this content:
2 3
008000
FF0000
FFFFFF
FFFF00
FF0000
FFFF00
And my Code::Blocks app:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    int a;
    int b;
    string color;
    ifstream data("U1.txt");
    ofstream result("U1result.txt");
    data >> a;
    data >> b;
    for (int i = 0; i < a * b; i++) {
        data >> color;
        cout << color[0] * 16 + color[1] << endl;
    }
    data.close();
    result.close();
    return 0;
}
This gives me 816, but it should be 0. I think color[0] is not an integer but a char, so it multiplies by the ASCII code. I've tried many ways with atoi and c_str(), and it is not working. P.S. Do not suggest stoi(), because I need to do this homework with older C++. Thanks in advance and have a good day ;)
You can directly store the hexadecimal values in an int with std::hex.
int b;
ifstream data("U1.txt");
data >> std::hex >> b;
Since those encodings use 24 bits, you have to start out with an integer type that holds at least 24 bits. And for this kind of packing and unpacking, it really ought to be unsigned, so you don't get tangled up in sign bits. That means using std::uint_least32_t, which is the smallest unsigned type that can hold at least 32 bits. (Yes, 24 would fit better, but there is no least24 type; 32 is the best you can do).
If your compiler doesn't provide the fixed-width types, you can use unsigned long, which is required to be at least 32 bits wide. It could be larger; the point of std::uint_least32_t is that it is the smallest type with at least 32 bits, so on a compiler with a 32-bit int it may simply be unsigned int. But you can't count on int being that wide, so either use the fixed-width type or unsigned long to ensure that you have enough bits.
Since the character inputs are encoded in hexadecimal, you need to tell the input system to interpret them as hex values. So:
std::uint_least32_t value;
data >> std::hex >> value;
Now you've got the value in the low 24 bits of value. You need to pick out the individual R, G, and B parts of that value. That's straightforward. To get the low 8 bits, just mask out the higher ones:
std::cout << (value & 0xFF) << '\n';
To get the next 8 bits, shift and mask:
std::cout << ((value >> 8) & 0xFF) << '\n';
And, naturally, to get the upper 8 bits, shift and mask:
std::cout << ((value >> 16) & 0xFF) << '\n';
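Put together, a minimal sketch of the homework loop might look like this (the file name and the "2 3" header line are taken from the question):
#include <fstream>
#include <iostream>

int main()
{
    std::ifstream data("U1.txt");
    int a, b;
    data >> a >> b;                                // the "2 3" header line
    for (int i = 0; i < a * b; ++i) {
        unsigned long value;                       // at least 32 bits; holds the 24-bit color
        data >> std::hex >> value;
        std::cout << ((value >> 16) & 0xFF) << ' ' // R
                  << ((value >> 8) & 0xFF) << ' '  // G
                  << (value & 0xFF) << '\n';       // B
    }
    return 0;
}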
A rather inelegant but also working answer is to subtract 48 from each char, as that's where the digits start in ASCII (the letters A-F need a different offset). This is also the reason why you get 816: color[0] and color[1] are both '0' (ASCII 48), and
48*16+48 = 816
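A small helper along those lines, handling both digits and letters (a sketch; it assumes valid hex input):
// Convert one hex digit character to its numeric value.
int hexVal(char c)
{
    if (c >= '0' && c <= '9') return c - 48;       // '0'..'9'
    if (c >= 'A' && c <= 'F') return c - 'A' + 10; // 'A'..'F'
    return c - 'a' + 10;                           // 'a'..'f'
}
The red component of "FF0000" is then hexVal(color[0]) * 16 + hexVal(color[1]).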
First off, I apologize if this is a duplicate; but my Google-fu seems to be failing me today.
I'm in the middle of writing an image format module for Photoshop, and one of the save options for this format includes a 4-bit alpha channel. Of course, the data I have to convert is 8-bit/1-byte alpha, so I essentially need to take every two bytes of alpha and merge them into one.
My attempt (below), I believe, has a lot of room for improvement:
for (int x = 0, w = 0; x < alphaData.size(); x += 2, w++)
{
    short ashort = (alphaData[x] << 8) + alphaData[x+1];
    alphaFinal[w] = (unsigned char)ashort;
}
alphaData and alphaFinal are vectors that contain the 8-bit alpha data and the 4-bit alpha data, respectively. I realize that reducing two bytes into the value of one is bound to result in loss of "resolution", but I can't help but think there's a better way of doing this.
For extra information, here's the loop that does the reverse (converts 4-bit alpha from the format to 8-bit for Photoshop)
alphaData serves the same purpose as above, and imgData is an unsigned char vector that holds the raw image data (the alpha data is tacked on after the actual RGB data in this particular variant of the format):
for (int b = alphaOffset, x2 = 0; b < (alphaOffset + dataLength); b++, x2 += 2)
{
    unsigned char lo = (imgData[b] & 15);
    unsigned char hi = ((imgData[b] >> 4) & 15);
    alphaData[x2] = lo * 17;
    alphaData[x2+1] = hi * 17;
}
Are you sure that it's
alphaData[x2]=lo*17;
alphaData[x2+1]=hi*17;
and not
alphaData[x2]=lo*16;
alphaData[x2+1]=hi*16;
In any case, to generate the values that work with the decoding function you have posted, you just have to reverse the operations. So multiplying by 17 becomes dividing by 17 and the shifts and masks get reordered to look like this:
for (int x = 0, w = 0; x < alphaData.size(); x += 2, w++)
{
    unsigned char alpha1 = alphaData[x] / 17;
    unsigned char alpha2 = alphaData[x+1] / 17;
    assert(alpha1 < 16 && alpha2 < 16); // <cassert>
    alphaFinal[w] = (alpha2 << 4) | alpha1;
}
short ashort = (alphaData[x] << 8) + alphaData[x+1];
alphaFinal[w] = (unsigned char)ashort;
You're actually losing alphaData[x] in alphaFinal: you shift alphaData[x] eight bits to the left, and the cast then keeps only the low eight bits. Also, your for loop is unsafe: if for some reason alphaData.size() is odd, you'll run out of range.
What you want to do, I think, is to truncate an 8-bit value into a 4-bit one, not to combine two 8-bit values. In other words, you want to drop the four least significant bits of each alpha value, not combine two different alpha values.
So, basically, you want to right-shift by 4.
output = (input >> 4); /* truncate four bits */
In case you're not familiar with binary shifts, take this random 8-bit number:
10110110
>> 1
= 01011011
>> 1
= 00101101
>> 1
= 00010110
>> 1
= 00001011
so,
10110110
>> 4
= 00001011
and to reverse, left-shift instead...
input = (output << 4); /* expand four bits */
which, using the result from that same random 8-bit number as before, would be
00001011
<< 4
= 10110000
Obviously, as you noted, 4 bits of precision are lost, but you'd be surprised how little it's noticed in a fully composited work.
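Applied to the loop from the question, a sketch (which nibble goes first depends on the format spec, so the order here is an assumption):
for (size_t x = 0, w = 0; x + 1 < alphaData.size(); x += 2, w++)
{
    unsigned char hi = alphaData[x] >> 4;            // truncate first alpha to 4 bits
    unsigned char lo = alphaData[x+1] >> 4;          // truncate second alpha to 4 bits
    alphaFinal[w] = (unsigned char)((hi << 4) | lo); // pack two 4-bit alphas per byte
}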
This code
for (int x = 0, w = 0; x < alphaData.size(); x += 2, w++)
{
    short ashort = (alphaData[x] << 8) + alphaData[x+1];
    alphaFinal[w] = (unsigned char)ashort;
}
is broken. Given
#include <iostream>
using std::cout;
using std::endl;
typedef unsigned char uchar;

int main() {
    uchar x0 = 1;                  // for alphaData[x]
    uchar x1 = 2;                  // for alphaData[x+1]
    short ashort = (x0 << 8) + x1; // the value 0x0102
    uchar afinal = (uchar)ashort;  // truncates to 0x02
    cout << std::hex
         << "x0 = 0x" << (unsigned int)x0 // cast: a uchar would print as a character
         << " << 8 = 0x" << (x0 << 8) << endl
         << "x1 = 0x" << (unsigned int)x1 << endl
         << "ashort = 0x" << ashort << endl
         << "afinal = 0x" << (unsigned int)afinal << endl;
}
If you are saying that your source stream contains sequences of 4-bit pairs stored in 8-bit storage values, which you need to re-store as a single 8-bit value, then what you want is:
for (int x = 0, w = 0; x < alphaData.size(); x += 2, w++)
{
    unsigned char aleft = alphaData[x] & 0x0f;    // 4 bits
    unsigned char aright = alphaData[x+1] & 0x0f; // 4 bits
    alphaFinal[w] = (aleft << 4) | aright;
}
"<<4" is equivalent to "*16", as ">>4" is equivalent to "/16".
Say I have a binary file; it contains positive binary numbers, written in little-endian as 32-bit integers.
How do I read this file? I have this right now.
int main() {
    FILE * fp;
    char buffer[4];
    int num = 0;
    fp = fopen("file.txt", "rb");
    while (fread(&buffer, 1, 4, fp) != 0) {
        // I think buffer should be the 32-bit integer I read;
        // how can I set num to the 32-bit little-endian integer?
    }
    // Say I just want to get the sum of all these little-endian integers:
    // is there another way to read and sum faster? Since it's all binary,
    // shouldn't it be faster if I just add in binary? Not sure.
    return 0;
}
This is one way to do it that works on either big-endian or little-endian architectures:
int main() {
    unsigned char bytes[4];
    unsigned int sum = 0; // unsigned: avoids signed-overflow issues when summing
    FILE *fp = fopen("file.txt", "rb");
    while (fread(bytes, 4, 1, fp) == 1) {
        // assemble the little-endian value byte by byte
        sum += bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) |
               ((unsigned int)bytes[3] << 24);
    }
    fclose(fp);
    return 0;
}
If you are using Linux you should look at the endian(3) man page ;-)
It describes useful conversion functions such as le32toh.
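For instance, a sketch using le32toh from <endian.h> (a glibc/Linux header, not standard C) to sum the file's values:
#include <endian.h> // Linux-specific
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("file.txt", "rb");
    uint32_t raw, sum = 0;
    while (fread(&raw, sizeof raw, 1, fp) == 1)
        sum += le32toh(raw); /* little-endian file value -> host order */
    fclose(fp);
    printf("%" PRIu32 "\n", sum);
    return 0;
}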
From CodeGuru:
inline void endian_swap(unsigned int& x)
{
    x = (x >> 24) |
        ((x << 8) & 0x00FF0000) |
        ((x >> 8) & 0x0000FF00) |
        (x << 24);
}
So you can read directly into an unsigned int and then just call this:
while (fread(&num, 1, 4, fp) != 0) {
    endian_swap(num); // only needed when the host byte order differs from the file's
    // conversion done; then use num
}
If you are working with short files, I recommend the simple use of the stringstream class and the stoul function. The code below reads byte by byte (in this case 2 bytes) from an ifstream and writes them in hex into a string stream. Then, thanks to stoul, it converts the string into a 16-bit integer:
#include <fstream>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <cstdint>
using namespace std;

ifstream is("filename.bin", ios::binary);
if (!is) { /*Error*/ }
is.unsetf(ios_base::skipws);
stringstream ss;
uint8_t byte1, byte2;
uint16_t val;
is >> byte1; is >> byte2;
ss << setw(2) << setfill('0') << hex << static_cast<size_t>(byte1);
ss << setw(2) << setfill('0') << hex << static_cast<size_t>(byte2);
val = static_cast<uint16_t>(stoul(ss.str(), nullptr, 16));
cout << val << endl;
For example, if you have to read a 16-bit integer stored in big-endian (00 f3) from a binary file, you put it inside a stringstream ("00f3") and then convert it into an integer (243). The example writes the value in hex, but it could be dec or oct, or even binary using the bitset class. The iomanip functions (setw, setfill) give the sstream the correct format.
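As a sketch of the binary variant (reusing val from the snippet above):
#include <bitset>
cout << bitset<16>(val) << endl; // prints 0000000011110011 for 0x00f3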
The downside of this method is that it's tremendously slow if you have to work with large files.
You read the file normally; however, when you interpret the data you need to make the proper conversions.
This can be a pain: if you want your code to be portable, i.e. to run on both little- and big-endian machines, you need to handle all the combinations: little to big, big to little, little to little, and big to big (the last two being no-ops).
Fortunately this all can be automated with the boost::endian library. An example from their documentation:
#include <iostream>
#include <cstdio>
#include <boost/endian/arithmetic.hpp>
#include <boost/static_assert.hpp>
using namespace boost::endian;
namespace
{
    // This is an extract from a very widely used GIS file format.
    // Why the designer decided to mix big and little endians in
    // the same file is not known. But this is a real-world format
    // and users wishing to write low level code manipulating these
    // files have to deal with the mixed endianness.
    struct header
    {
        big_int32_t file_code;
        big_int32_t file_length;
        little_int32_t version;
        little_int32_t shape_type;
    };

    const char* filename = "test.dat";
}

int main(int, char* [])
{
    header h;
    BOOST_STATIC_ASSERT(sizeof(h) == 16U); // reality check

    h.file_code = 0x01020304;
    h.file_length = sizeof(header);
    h.version = 1;
    h.shape_type = 0x01020304;

    // Low-level I/O such as POSIX read/write or <cstdio>
    // fread/fwrite is sometimes used for binary file operations
    // when ultimate efficiency is important. Such I/O is often
    // performed in some C++ wrapper class, but to drive home the
    // point that endian integers are often used in fairly
    // low-level code that does bulk I/O operations, <cstdio>
    // fopen/fwrite is used for I/O in this example.
    std::FILE* fi = std::fopen(filename, "wb"); // MUST BE BINARY
    if (!fi)
    {
        std::cout << "could not open " << filename << '\n';
        return 1;
    }
    if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
    {
        std::cout << "write failure for " << filename << '\n';
        return 1;
    }
    std::fclose(fi);

    std::cout << "created file " << filename << '\n';
    return 0;
}
After compiling and executing endian_example.cpp, a hex dump of test.dat shows:
01020304 00000010 01000000 04030201
Basically what I want to do is to read a binary file and extract 4 consecutive values at an address, e.g. 0x8000. For example, the 4 bytes are 89 ab cd ef. I want to read these values into a buffer, and then convert the buffer to an int type. I have tried the following method:
ifstream *pF = new ifstream();
buffer = new char[4];
memset(buffer, 0, 4);
pF->read(buffer, 4);
When I tried
cout << buffer << endl;
nothing happens. I guarantee that there are values at this location (I can view the binary file in a hex viewer). Could anyone show me how to convert the buffer to an int type and display it properly? Thank you.
Update
int number = 0; // start from zero; starting from buffer[0] would count that byte twice
for (int i = 0; i < 4; ++i)
{
    number <<= 8;
    number |= static_cast<unsigned char>(buffer[i]); // avoid sign extension of char
}
It also depends on little-endian vs big-endian notation. If your file stores the number the other way around, you can use number |= static_cast<unsigned char>(buffer[3 - i]);
And in order to display an int in hex you can use
#include <iomanip>
cout << hex << number;
cout << hex << (int)(unsigned char)buffer[0] << (int)(unsigned char)buffer[1]
     << (int)(unsigned char)buffer[2] << (int)(unsigned char)buffer[3] << endl;
// casts needed: plain chars would print as characters, not as hex numbers
See http://www.cplusplus.com/reference/iostream/manipulators/hex/