I have a binary file and documentation of the format the information is stored in. I'm trying to write a simple program in C++ that pulls a specific piece of information from the file, but I'm missing something since the output isn't what I expect.
The documentation is as follows:
Half-word  Field Name     Type   Units    Range        Precision
10         Block Divider  INT*2  N/A      -1           N/A
11-12      Latitude       INT*4  Degrees  -90 to +90   0.001
There are other items in the file obviously but for this case I'm just trying to get the Latitude value.
My code is:
#include <cstdlib>
#include <iostream>
#include <fstream>
using namespace std;

int main(int argc, char* argv[])
{
    const char* dataFileLocation = "testfile.bin";
    ifstream dataFile(dataFileLocation, ios::in | ios::binary);
    if(dataFile.is_open())
    {
        char* buffer = new char[32768];
        dataFile.seekg(10, ios::beg);
        dataFile.read(buffer, 4);
        dataFile.close();
        cout << "value is " << (int)(buffer[0] & 255);
    }
}
The result of this is "value is 226", which is not in the allowed range.
I'm quite new to this, and here's what my intentions were when writing the above code:
Open file in binary mode
Seek to the 11th byte from the start of the file
Read in 4 bytes from that point
Close the file
Output those 4 bytes as an integer.
If someone could point out where I'm going wrong, I'd sure appreciate it. I don't really understand the (buffer[0] & 255) part (I took that from some example code), so layman's terms for that would be greatly appreciated.
Hex Dump of the first 100 bytes:
testfile.bin 98,402 bytes 11/16/2011 9:01:52
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
00000000- 00 5F 3B BF 00 00 C4 17 00 00 00 E2 2E E0 00 00 [._;.............]
00000001- 00 03 FF FF 00 00 94 70 FF FE 81 30 00 00 00 5F [.......p...0..._]
00000002- 00 02 00 00 00 00 00 00 3B BF 00 00 C4 17 3B BF [........;.....;.]
00000003- 00 00 C4 17 00 00 00 00 00 00 00 00 80 02 00 00 [................]
00000004- 00 05 00 0A 00 0F 00 14 00 19 00 1E 00 23 00 28 [.............#.(]
00000005- 00 2D 00 32 00 37 00 3C 00 41 00 46 00 00 00 00 [.-.2.7.<.A.F....]
00000006- 00 00 00 00 [.... ]
Since the documentation lists the field as an integer but shows the precision to be 0.001, I would assume that the actual value is the stored value multiplied by 0.001. The integer range would be -90000 to 90000.
The 4 bytes must be combined into a single integer. There are two ways to do this, big endian and little endian, and which you need depends on the machine that wrote the file. x86 PCs for example are little endian.
int little_endian = buffer[0] | buffer[1]<<8 | buffer[2]<<16 | buffer[3]<<24;
int big_endian = buffer[0]<<24 | buffer[1]<<16 | buffer[2]<<8 | buffer[3];
The &255 is used to remove the sign extension that occurs when you convert a signed char to a signed integer. Use unsigned char instead and you probably won't need it.
Edit: I think "half-word" refers to 2 bytes, so you'll need to skip 20 bytes instead of 10.
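Putting that together, a minimal sketch might look like this (assuming the data is big-endian and that half-words 11-12 start at byte offset 20, as the edit above suggests; swap the shifts if the file turns out to be little-endian):

#include <cstdint>
#include <fstream>
#include <iostream>
using namespace std;

int main()
{
    ifstream dataFile("testfile.bin", ios::in | ios::binary);
    if (!dataFile.is_open())
        return 1;

    unsigned char buffer[4];                      // unsigned char avoids sign extension
    dataFile.seekg(20, ios::beg);                 // half-words 11-12 => bytes 20..23
    dataFile.read(reinterpret_cast<char*>(buffer), 4);

    // Combine the 4 bytes in big-endian order, then reinterpret as a signed INT*4.
    uint32_t u = (uint32_t)buffer[0] << 24 | (uint32_t)buffer[1] << 16 |
                 (uint32_t)buffer[2] << 8  | (uint32_t)buffer[3];
    int32_t raw = (int32_t)u;

    cout << "latitude is " << raw * 0.001 << " degrees" << endl;   // precision 0.001
}

With the hex dump above, bytes 18-19 would be FF FF (the -1 block divider) and bytes 20-23 would be 00 00 94 70, which read big-endian gives 38000, i.e. 38.000 degrees, comfortably inside the documented range.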
In my Qt code I have a function f1() that emits a Qt signal with a char array of binary data as a parameter.
My problem is that the slot connected to this signal receives the array, but the data is incomplete: it only gets the data up to the first 0x00 byte.
I tried changing the char[] to char*, but that didn't help.
How can I receive the full data, including the 0x00 bytes?
connect(dataStream, &BaseConnection::GotPacket, this, &myClass::HandleNewPacket);
void f1()
{
    qDebug() << "Binary read = " << inStream.readRawData(logBuffer, static_cast<int>(frmIndex->dataSize));
    // logBuffer contains the following hex bytes: "10 10 01 30 00 00 30 00 00 00 01 00 D2 23 57 A5 38 A2 05 00 E8 03 00 00 6C E9 01 00 00 00 00 00 0B 00 00 00 00 00 00 00 A6 AF 01 00 00 00 00 00"
    Q_EMIT GotPacket(logBuffer, frmIndex->dataSize);
}

void myClass::HandleNewPacket(char p[LOG_BUFFER_SIZE], int size)
{
    // p contains the following hex bytes: "10 10 01 30"
}
Thank you.
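One way to avoid this kind of truncation at embedded null bytes is to carry the payload in a QByteArray, which stores its length explicitly instead of relying on null termination. A minimal sketch, reusing the names from the question (inStream, logBuffer and frmIndex are class members as in the original code) and with the signal re-declared to take a QByteArray:

// signals:
//     void GotPacket(const QByteArray &packet);

void f1()
{
    int n = inStream.readRawData(logBuffer, static_cast<int>(frmIndex->dataSize));
    if (n > 0)
        Q_EMIT GotPacket(QByteArray(logBuffer, n));   // explicit size, embedded 0x00 bytes preserved
}

void myClass::HandleNewPacket(const QByteArray &packet)
{
    // packet.size() is the full payload length; packet.constData() gives the raw bytes,
    // including any embedded 0x00.
    qDebug() << "received" << packet.size() << "bytes:" << packet.toHex();
}

The connect() call from the question can stay as it is, since the pointer-to-member syntax does not spell out the parameter types.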
I'm trying to store the bytes read from a file into a buffer and then display them on the console as hex, but so far it doesn't seem to work. This is my code:
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    ifstream file("Fishie.ch8", ios::binary);
    if (!file.is_open())
    {
        cout << "Error";
    }
    else
    {
        file.seekg(0, ios::end);
        streamoff size = file.tellg();
        file.seekg(0, ios::beg);
        char *buffer = new char[size];
        file.read(buffer, size);
        file.close();
        for (int i = 0; i < size; i++)
        {
            cout << hex << buffer[i] << " ";
        }
    }
    delete[] buffer;
    cin.get();
}
The expected output should be this:
00 e0 a2 20 62 08 60 f8 70 08 61 10 40 20 12 0e
d1 08 f2 1e 71 08 41 30 12 08 12 10 00 00 00 00
00 00 00 00 00 18 3c 3c 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
3e 3f 3f 3b 39 38 38 38 00 00 80 c1 e7 ff 7e 3c
00 1f ff f9 c0 80 03 03 00 80 e0 f0 78 38 1c 1c
38 38 39 3b 3f 3f 3e 3c 78 fc fe cf 87 03 01 00
00 00 00 00 80 e3 ff 7f 1c 38 38 70 f0 e0 c0 00
3c 18 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Instead of the above output I get some strange-looking symbols with lots of empty spaces.
What could be the problem?
As your buffer is of type char, all elements will be printed as characters. What you want is the numeric value converted to hex.
BTW: since you want hexadecimal output, it is a question whether you really want to read char from the file or unsigned char.
As you found out, the signature of istream::read uses char, so you have to convert first to unsigned char and then to unsigned int, like:
cout << hex << (unsigned int)(unsigned char)buffer[i] << " ";
As a real C++ user you should write a proper static_cast ;)
This will print out the hex values. But for a byte like 0x0a you would see 'a' instead of '0a', so you also have to set the fill character and the field width; note that the width resets after every output, so it has to be set for each value:
cout.fill('0');
for (int i = 0; i < size; i++)
{
    cout.width(2);
    cout << hex << (unsigned int)(unsigned char)buffer[i] << " ";
}
BTW: delete[] buffer; is in the wrong scope and must be moved into the scope where buffer was defined.
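If you prefer manipulators, the same formatting can be written inline with <iomanip> (just an equivalent variant of the loop above):

#include <iomanip>   // for setw / setfill

for (int i = 0; i < size; i++)
{
    cout << hex << setw(2) << setfill('0')
         << (unsigned int)(unsigned char)buffer[i] << " ";
}

setw() only applies to the next insertion, which is why it sits inside the loop; setfill() would stick, but keeping it next to setw() makes the intent obvious.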
My compiler is Visual C++ 2013. The following very simple program causes a few memory leaks.
Why? How can I fix it?
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
#include <cstdlib>
#include <iostream>
#include <locale>
using namespace std;
int main()
{
_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF|_CRTDBG_LEAK_CHECK_DF);
cout.imbue(locale("")); // If this statement is commented out, then everything is OK.
}
The debug window outputs as follows:
Detected memory leaks!
Dumping objects ->
{387} normal block at 0x004FF8C8, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
{379} normal block at 0x004FF678, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
{352} normal block at 0x004FE6E8, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
{344} normal block at 0x004FE498, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
{318} normal block at 0x004FD5C8, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
{308} normal block at 0x004F8860, 12 bytes long.
Data: <z h - C N > 7A 00 68 00 2D 00 43 00 4E 00 00 00
Object dump complete.
The program '[0x5B44] cpptest.exe' has exited with code 0 (0x0).
I was using std::codecvt and got a similar problem. I am not sure whether it has the same cause; I just want to suggest a possible way to discover the root cause.
You can reference the example at http://www.cplusplus.com/reference/locale/codecvt/in/
It actually "uses" a member of mylocale, and there seems to be no r-value reference overload. So writing const facet_type& myfacet = std::use_facet<facet_type>(std::locale()); directly may cause the same problem.
So try
auto myloc = locale("");
cout.imbue(myloc);
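Applied to the program from the question, the suggested change would look like this (whether it actually silences the CRT report with VC++ 2013 would need testing on your side):

#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
#include <cstdlib>
#include <iostream>
#include <locale>
using namespace std;

int main()
{
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);

    // Keep the locale in a named lvalue instead of passing a temporary to imbue().
    locale loc("");
    cout.imbue(loc);
}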
It is unclear to me what the correct .tar file format is, as I am seeing proper functionality with all three scenarios below.
Based on the .tar specification I have been working with, the magic field (ustar) is a null-terminated character string and the version field is an octal number with no trailing nulls.
However, I've reviewed several .tar files I found on my server and found different implementations of the magic and version fields, and all three of them seem to work properly, probably because the system ignores those fields.
See the three differing bytes between the words ustar and root in the following examples:
Scenario 1 (20 20 00):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 20 20 .....ustar
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 2 (00 20 20):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 20 .....ustar.
00000108 20 72 6F 6F | 74 00 00 00 | 00 00 00 00 root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 3 (00 00 00):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 00 .....ustar..
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Which one is the correct format?
In my opinion none of your examples is the correct one, at least not for the POSIX format.
As you can read here:
/* tar Header Block, from POSIX 1003.1-1990. */
/* POSIX header */
struct posix_header { /* byte offset */
char name[100]; /* 0 */
char mode[8]; /* 100 */
char uid[8]; /* 108 */
char gid[8]; /* 116 */
char size[12]; /* 124 */
char mtime[12]; /* 136 */
char chksum[8]; /* 148 */
char typeflag; /* 156 */
char linkname[100]; /* 157 */
char magic[6]; /* 257 */
char version[2]; /* 263 */
char uname[32]; /* 265 */
char gname[32]; /* 297 */
char devmajor[8]; /* 329 */
char devminor[8]; /* 337 */
char prefix[155]; /* 345 */
};
#define TMAGIC "ustar" /* ustar and a null */
#define TMAGLEN 6
#define TVERSION "00" /* 00 and no null */
#define TVERSLEN 2
The format of your first example (Scenario 1) seems to match the old GNU header format:
/* OLDGNU_MAGIC uses both magic and version fields, which are contiguous.
Found in an archive, it indicates an old GNU header format, which will be
hopefully become obsolescent. With OLDGNU_MAGIC, uname and gname are
valid, though the header is not truly POSIX conforming */
#define OLDGNU_MAGIC "ustar " /* 7 chars and a null */
In both your second and third examples (Scenario 2 and Scenario 3), the version field is set to an unexpected value (according to the above documentation, the correct value should be 00 ASCII or 0x30 0x30 hex), so this field is most likely ignored.
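Based on the offsets and constants above, a small sketch that classifies the magic/version bytes of a given header could look like this (the archive name is hypothetical; offsets 257 and 263 come from the struct above):

#include <cstring>
#include <fstream>
#include <iostream>

int main()
{
    std::ifstream tarFile("archive.tar", std::ios::binary);
    if (!tarFile.is_open())
        return 1;

    char magic[6];    // offset 257: "ustar" plus NUL for POSIX
    char version[2];  // offset 263: "00" for POSIX; " \0" completes the old GNU "ustar  "

    tarFile.seekg(257, std::ios::beg);
    tarFile.read(magic, 6);
    tarFile.read(version, 2);

    if (std::memcmp(magic, "ustar\0", 6) == 0 && std::memcmp(version, "00", 2) == 0)
        std::cout << "POSIX (ustar) header\n";
    else if (std::memcmp(magic, "ustar ", 6) == 0 && std::memcmp(version, " \0", 2) == 0)
        std::cout << "old GNU header (OLDGNU_MAGIC \"ustar  \")\n";
    else
        std::cout << "unknown magic/version combination\n";
}

Your Scenario 1 (20 20 00 after ustar) would land in the old GNU branch, while Scenarios 2 and 3 would fall through to the last branch.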
With Fedora 18, if I execute this command:
tar --format=posix -cvf testPOSIX.tar test.txt
I get a POSIX tar file format with: ustar\0 (0x757374617200)
whereas if I execute this:
tar --format=gnu -cvf testGNU.tar test.txt
I get a GNU tar file format with: ustar 0x20 0x20 0x00 (0x7573746172202000) (old GNU format)
From /usr/share/magic file:
# POSIX tar archives
257 string ustar\0 POSIX tar archive
!:mime application/x-tar # encoding: posix
257 string ustar\040\040\0 GNU tar archive
!:mime application/x-tar # encoding: gnu
0x20 is 40 in octal.
I've also tried editing the hex code to:
00 20 20
and the tar still worked correctly; I extracted test.txt without a problem.
But when I tried editing the hex code to:
00 00 00
the tar was not recognized.
So my conclusion is that the correct format is:
20 20 00
Hi, I am reading in a binary file and inspecting it in hex. It is an image file; below is a short example of the first few lines produced by the hd ... | more command on Linux. The image is a binary graphic, so the only pixel colours are black or white. It is a 1024 by 1024 image, however the size comes out to be 2097152 bytes.
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000dfbf0 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 ff 00 |................|
000dfc00 ff 00 ff 00 ff 00 00 00 00 00 00 00 00 00 00 00 |................|
000dfc10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
This is the code I am using to read it in, found in another thread on SO:
ifstream file (argv[1], ios::in | ios::binary | ios::ate);
ifstream::pos_type fileSize;
char* fileContents;
if(file.is_open())
{
    fileSize = file.tellg();
    fileContents = new char[fileSize];
    file.seekg(0, ios::beg);
    if(!file.read(fileContents, fileSize))
    {
        cout << "fail to read" << endl;
    }
    file.close();
    cout << fileSize << endl;
The code works, however when I run this for loop
for (i = 0; i < 2097152; i++)
    printf("%hd", fileContents[i]);
the only thing printed out are zeros and no 1s. Why is this? Are my parameters in printf not correctly specifying the pixel size? I know for a fact that there are 1s in the image representing the white areas. Also, how do I figure out how many bytes represent a pixel in this image?
Your printf() is wrong. %hd means short, while fileContents[i] is a char; on all modern systems I'm familiar with, this is a size mismatch. Use an array of short instead, since you have twice as many bytes as pixels.
Also, stop using printf() and use std::cout, avoiding all type mismatch problems.
Since 2097152/1024 is exactly 2048, which is in turn 2*1024, I would assume each pixel is 2 bytes.
The other problem is probably in the printf. I'm not sure what %hd is; I would use %02x myself and cast the data to int.
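Following the suggestions above, a minimal sketch that reads the file into unsigned bytes and prints each one as two hex digits might look like this (the 2-bytes-per-pixel interpretation is only an assumption based on the size arithmetic above):

#include <cstdio>
#include <fstream>
#include <vector>
using namespace std;

int main(int argc, char* argv[])
{
    if (argc < 2)
        return 1;

    ifstream file(argv[1], ios::in | ios::binary | ios::ate);
    if (!file.is_open())
        return 1;

    streamsize fileSize = file.tellg();
    file.seekg(0, ios::beg);

    vector<unsigned char> fileContents(fileSize);   // unsigned char: no sign extension
    if (!file.read(reinterpret_cast<char*>(fileContents.data()), fileSize))
        return 1;

    // Print every byte as two hex digits; if each pixel really is 2 bytes
    // (2097152 / (1024*1024) == 2), then every pair of values is one pixel.
    for (streamsize i = 0; i < fileSize; i++)
        printf("%02x ", (unsigned int)fileContents[i]);
    printf("\n");
}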