I am trying to get number of bits per pixel in a bmp file. According to Wikipedia, it is supposed to be at 28th byte. So after reading a file:
// Seek to the byte where the number of bits per pixel is stored
plik.seekg(28, ios::beg);
// Read the number of bits used per pixel
int liczbaBitow;
plik.read((char*)&liczbaBitow, 2);
cout << "liczba bitow " << liczbaBitow << endl;
But liczbaBitow (variable that is supposed to hold number of bits per pixel value) is -859045864. I don't know where it comes from... I'm pretty lost.
Any ideas?
To clarify @TheBluefish's answer, this code has the bug:
// Read the number of bits used per pixel
int liczbaBitow;
plik.read((char*)&liczbaBitow, 2);
When you use (char*)&liczbaBitow, you're taking the address of a 4-byte integer and telling the code to put 2 bytes there.
The other two bytes of that integer are never written, so they keep whatever value was already there. In this case they're 0xCC, most likely because that is the fill pattern MSVC debug builds write to uninitialized stack memory.
But if you're calling this from another function or repeatedly, you can expect the stack to contain other bogus values.
If you initialize the variable, you'll get the value you expect.
But there's another bug: byte order matters here too. This code assumes that the machine's native byte order exactly matches the byte order in the file specification. There are a number of different bitmap formats, but from your reference, the Wikipedia article says:
All of the integer values are stored in little-endian format (i.e. least-significant byte first).
That's the same as yours, which is obviously also x86 little endian. Other fields aren't defined to be little endian, so as you proceed to decode the image, you'll have to watch for it.
Ideally, you'd read into a byte array and put the bytes where they belong.
See Convert Little Endian to Big Endian
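int liczbaBitow;
unsigned char bpp[2];
// read() expects a char*, so the unsigned buffer needs a cast
plik.read(reinterpret_cast<char*>(bpp), 2);
// low byte first (little-endian), independent of the host byte order
liczbaBitow = bpp[0] | (bpp[1] << 8);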
-859045864 can be represented in hexadecimal as 0xCCCC0018.
The lower two bytes are 0x0018 = 24 bpp.
What is most likely happening here is that liczbaBitow starts out as 0xCCCCCCCC, while your plik.read writes only the lower 16 bits and leaves the upper 16 bits unchanged. Changing that line should fix the issue:
int liczbaBitow = 0;
Though, especially with something like this, it's best to use a datatype that exactly matches your data:
int16_t liczbaBitow = 0;
This can be found in <cstdint>.
Related
I'm working on an LZW compression app in C++. Since there are no data types that can store 12-bit numbers for representing table elements up to 4095, I thought I could store two of those numbers as 3 bytes in a file and then read them as a struct with two unsigned short members. Is there a way to do that, or should I just use unsigned short? This is what I have tried, but it stores 4 bytes because there are two unsigned short members.
#define BITS 12
struct LZWStruct {
unsigned short code1 : BITS;
unsigned short code2 : BITS;
};
int main() {
LZWStruct test;
test.code1 = 144;
test.code2 = 233;
FILE* f = fopen("binary.bin", "wb");
fwrite(&test, sizeof(test), 1, f);
fclose(f);
}
Your question title and question body are two different questions with different answers.
No, you absolutely cannot store 3 * 12-bit unsigned numbers (36 bits) in four bytes (32 bits).
Yes, you can store two 12-bit numbers (24 bits) in three bytes (24 bits).
The bit fields in C++, inherited from C, that you are trying to use do not guarantee exactly how the bits are packed in the structure, so you cannot know which three bytes in the structure have your data. You should simply use the shift and or operators to put them in an integer. Then you will know exactly which three bytes to write to the file.
Then, to be portable and in particular not dependent on the endianness of the machine, you should also write the bytes from the integer using the shift operator. If you write using a pointer to the integer, it won't be portable.
In your example, you could have tried fwrite(&test, 3, 1, f), and it might work, if the compiler put the codes in the low bits of test, and if your machine is little-endian. Otherwise, no.
So to do it reliably:
Put in an integer:
unsigned short code1;
unsigned short code2;
uint32_t test = (code1 & 0xfff) | ((uint32_t)(code2 & 0xfff) << 12);
Write to a file:
putc(test, f);
putc(test >> 8, f);
putc(test >> 16, f);
You can skip the intermediate step if you like:
putc(code1, f);
putc(((code1 >> 8) & 0xf) | (code2 << 4), f);
putc(code2 >> 4, f);
(In the above I am ensuring that I only store the low 12 bits of each code with the & operators, in case the bits above the low 12 are not zero. If you know for certain that the code values are less than 4096, then you can remove the & operations above.)
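As a sketch of the full round trip, the following packs two 12-bit codes into three bytes (low byte first, matching the putc order above) and unpacks them again. The names pack12 and unpack12 are hypothetical and not from the original answer.

```cpp
#include <cstdint>

// Hypothetical helper: packs two 12-bit codes into three bytes, low byte first.
void pack12(std::uint16_t code1, std::uint16_t code2, unsigned char out[3]) {
    // Keep only the low 12 bits of each code, then place code2 above code1.
    const std::uint32_t v = (code1 & 0xfffu) |
                            (static_cast<std::uint32_t>(code2 & 0xfffu) << 12);
    out[0] = static_cast<unsigned char>(v & 0xff);
    out[1] = static_cast<unsigned char>((v >> 8) & 0xff);
    out[2] = static_cast<unsigned char>((v >> 16) & 0xff);
}

// Hypothetical helper: the inverse of pack12.
void unpack12(const unsigned char in[3], std::uint16_t& code1, std::uint16_t& code2) {
    const std::uint32_t v = in[0] |
                            (static_cast<std::uint32_t>(in[1]) << 8) |
                            (static_cast<std::uint32_t>(in[2]) << 16);
    code1 = static_cast<std::uint16_t>(v & 0xfff);
    code2 = static_cast<std::uint16_t>((v >> 12) & 0xfff);
}
```

For the values in the question (144 and 233) the three bytes come out as 0x90, 0x90, 0x0E.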
From here, multiple adjacent bit fields are usually packed together. The special unnamed bit field of size zero can be forced to break up padding. It specifies that the next bit field begins at the beginning of its allocation unit. Use sizeof to verify the size of your structure.
The exact packing, however, may depend on platform and compiler. This may be less a problem if the data are later loaded by the same program, or some closely related, but may be an issue for some generic format.
I have read 4 bytes as an array from a file on an SD card on an Arduino Mega. Now I want to convert this array into one number so that I can work with it as an integer (the bytes are the length of the next file section). Is there a built-in function for this, or must I code my own?
I read the file into the byte array with the file.read() function from SDFat:
byte array[4]; //creates the byte array
file.read(array,4); //reads 4 bytes from the file and stores it in the array
I hope you can understand my problem.
It depends on the endianness of the stored bytes.
If the endianness matches that of your target system (AVR-based boards such as the Arduino Mega are little-endian), you can just do
int32_t number = *(int32_t*)array;
to get a 32 bit integer.
If the endianness does not match, or you do not want to depend on the host byte order, assemble the value from the individual bytes yourself. For a little-endian encoded number:
int32_t number = uint32_t(array[3]) << 24 | uint32_t(array[2]) << 16 | uint32_t(array[1]) << 8 | uint32_t(array[0]);
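A sketch of both directions as small functions may make the difference clearer. The names fromLittleEndian and fromBigEndian are hypothetical. Because they use only shifts, they give the same result on any host and avoid the pointer cast.

```cpp
#include <cstdint>

// Hypothetical helper: b[0] is the least significant byte.
std::uint32_t fromLittleEndian(const unsigned char b[4]) {
    return static_cast<std::uint32_t>(b[0]) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           (static_cast<std::uint32_t>(b[2]) << 16) |
           (static_cast<std::uint32_t>(b[3]) << 24);
}

// Hypothetical helper: b[0] is the most significant byte.
std::uint32_t fromBigEndian(const unsigned char b[4]) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}
```

Which one to call depends on how the length field was written to the file, not on the board.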
I recently needed to convert the MNIST data set to images and labels. It is a binary format whose structure is described in the previous link, so I did a little research and, being a fan of C++, read up on binary I/O in C++. After that I found this link on Stack Overflow. The code there works well, but it has no comments and no explanation of the algorithm, so I got confused, and it raised some questions that I would like to ask a professional C++ programmer.
1- What is the algorithm to convert the data set in C++ with the help of ifstream?
I understand that we read the file as binary with file.read and move on to the next record. In C we would define a struct and read it from the file, but I can't see any struct in the C++ program, for example to read this:
[offset]  [type]          [value]            [description]
0000      32 bit integer  0x00000803(2051)   magic number
0004      32 bit integer  60000              number of images
0008      32 bit integer  28                 number of rows
0012      32 bit integer  28                 number of columns
0016      unsigned byte   ??                 pixel
How can we go to a specific offset, for example 0004, read a 32-bit integer there, and put it into an integer variable?
2- What is the function ReverseInt doing? (It is obviously not simply reversing an integer.)
int ReverseInt (int i)
{
unsigned char ch1, ch2, ch3, ch4;
ch1 = i & 255;
ch2 = (i >> 8) & 255;
ch3 = (i >> 16) & 255;
ch4 = (i >> 24) & 255;
return((int) ch1 << 24) + ((int)ch2 << 16) + ((int)ch3 << 8) + ch4;
}
I did a little debugging with cout: when it received, for example, 270991360, it returned 10000, and I cannot find any relation between the two. I understand that it shifts the number by multiples of 8 bits and ANDs it with 255, but why?
PS:
1- I already have the converted MNIST images, but I want to understand the algorithm.
2- I have already unzipped the gz files, so the file is pure binary.
1-What is the algorithm to convert the data-set in c++ with help of ifstream?
This function reads a file (t10k-images-idx3-ubyte.gz) as follows:
Read the magic number and adjust endianness
Read the number of images and adjust endianness
Read the number of rows and adjust endianness
Read the number of columns and adjust endianness
Read all images x rows x columns pixel bytes (but discard them).
The function uses a plain int and always swaps the byte order, which means it targets a very specific architecture and is not portable.
How can we go to the specific offset for example 0004 and read for example 32 bit integer and put it to an integer variable.
ifstream provides a function to seek to a given position:
file.seekg( posInBytes, std::ios_base::beg);
At the given position, you could read the 32-bit integer:
int32_t val;
file.read ((char*)&val,sizeof(int32_t));
2- What is the function ReverseInt doing?
This function reverses the order of the bytes of an int value:
Given a 32-bit integer like aaaaaaaabbbbbbbbccccccccdddddddd, it returns the integer ddddddddccccccccbbbbbbbbaaaaaaaa.
This is useful for normalizing endianness. However, it is probably not very portable, as int might not be 32 bits wide (it could be 16 or 64 bits, for example).
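To make the byte reversal concrete, here is a sketch using a fixed-width unsigned type, which sidesteps the question of how wide int is. The name reverseBytes32 is hypothetical. The value from the question, 270991360, is 0x10270000. Reversing its bytes gives 0x00002710, which is 10000.

```cpp
#include <cstdint>

// Hypothetical fixed-width variant of ReverseInt: swaps byte 0 with byte 3
// and byte 1 with byte 2.
constexpr std::uint32_t reverseBytes32(std::uint32_t v) {
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8) |
           ((v & 0xff000000u) >> 24);
}
```

Applying the function twice returns the original value.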
I was wondering if there is any way to use a specific range of bits in an if statement.
I'm using an FPGA to send 8-bit binary data over USB to a PC. Each transaction has three 8-bit packets. In each packet the first four bits are generated by an outside module, and I want to send control data in the last four bits.
The USB interface accepts data as integers, and I have a bitset function to convert integers to 8-bit binary. I want to use the last four bits in if statements. Is there any way I can do this?
Thanks in advance.
If you have an unsigned char as input:
void foo(uint8_t x) {
uint8_t top4 = x >> 4; // moves top 4 bits down by 4 positions
uint8_t bottom4 = x & 0x0f; // zeros out top 4 bits, leaving bottom 4
}
This will work with int, short, etc., types as well, but you would need to &-mask the top results too to strip away the unwanted high bits there.
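As a sketch of how this could feed an if statement: the example below assumes the control data sits in the low-order four bits and that the USB layer hands over each packet as an int. The control code kCtrlStart and the function isStart are made up for illustration.

```cpp
#include <cstdint>

constexpr std::uint8_t highNibble(std::uint8_t x) {
    return static_cast<std::uint8_t>(x >> 4);    // top 4 bits moved down
}

constexpr std::uint8_t lowNibble(std::uint8_t x) {
    return static_cast<std::uint8_t>(x & 0x0f);  // bottom 4 bits only
}

// Hypothetical control code, not from the original question.
constexpr std::uint8_t kCtrlStart = 0x1;

// Assumes the packet arrives as an int; only the low 8 bits are meaningful.
bool isStart(int received) {
    const std::uint8_t packet = static_cast<std::uint8_t>(received & 0xff);
    if (lowNibble(packet) == kCtrlStart) {
        return true;
    }
    return false;
}
```

If the control bits are actually the high-order four, highNibble would be used in the comparison instead.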
So if I have a 4 byte number (say hex) and want to store a byte say DD into hex, at the nth byte position without changing the other elements of hex's number, what's the easiest way of going about that? I'm guessing it's some combination of bitwise operations, but I'm still quite new with them, and have found them quite confusing thus far?
byte n = 0xDD;
uint i = 0x12345678;
i = (i & ~0x0000FF00) | ((uint)n << 8);
Edit: Forgot to mention, be careful if you're doing this with signed data types, so that things don't get inadvertently sign-extended.
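The same mask-and-or idea generalizes to any byte position. The following is a sketch in C++ with a hypothetical helper setByte, where n = 0 is the least significant byte and n must be in the range 0 to 3. Using unsigned types throughout avoids the sign-extension problem mentioned above.

```cpp
#include <cstdint>

// Hypothetical helper: replaces byte n (0 = least significant, n must be
// 0..3) of value with b and leaves the other three bytes unchanged.
constexpr std::uint32_t setByte(std::uint32_t value, unsigned n, std::uint8_t b) {
    return (value & ~(std::uint32_t{0xff} << (8 * n))) |
           (static_cast<std::uint32_t>(b) << (8 * n));
}
```

For example, setByte(0x12345678, 1, 0xDD) yields 0x1234DD78, matching the snippet above.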
Mehrdad's answer shows how to do it with bit manipulation. You could also use the old byte array trick (assuming C or some other language that allows this silliness):
byte n = 0xDD;
uint i = 0x12345678;
byte *b = (byte*)&i;
b[1] = n;
Of course, that's processor specific in that big-endian machines have the bytes reversed from little-endian. Also, this technique limits you to working on exact byte boundaries whereas the bit manipulation will let you modify any given 8 bits. That is, you might want to turn 0x12345678 into 0x12345DD8, which the technique I show won't do.
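For the unaligned case, a sketch of the bit-manipulation variant that takes a bit offset instead of a byte index could look like this. The name setBits8 is hypothetical, and bitOffset must be between 0 and 24 so the 8-bit field fits in the 32-bit value.

```cpp
#include <cstdint>

// Hypothetical helper: overwrites the 8 bits starting at bitOffset
// (0 = least significant bit, bitOffset must be 0..24) with b.
constexpr std::uint32_t setBits8(std::uint32_t value, unsigned bitOffset, std::uint8_t b) {
    return (value & ~(std::uint32_t{0xff} << bitOffset)) |
           (static_cast<std::uint32_t>(b) << bitOffset);
}
```

With bitOffset = 4 this turns 0x12345678 into 0x12345DD8.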