Read binary file into struct and also problems with endianness - c++

I want to read a binary file image.dd into struct teststruct *test;. Basically there are two problems:
1. Wrong order because of little / big endian.
printf("%02x", test->magic); just gives me 534b554c instead of 4c55b453 (maybe this has something to do with the "main problem" in the next part). Its just "one value". As an example, printf("%c", test->magic); gives L instead of LUKS.
2. No output with test->version.
uint16_t version; in struct teststruct gives no output. Which means, i call printf("%x ", test->version); and there is no result.
This is exampleh.h which contains struct:
#ifndef _EXAMPLEH_H
#define _EXAMPLEH_H
#define MAGIC_L 6
struct teststruct {
char magic [MAGIC_L];
uint16_t version;
};
#endif
This is the main code:
using namespace std;
#include <stdint.h>
#include <string.h>
#include <iostream>
#include <fstream>
#include "exampleh.h"
struct teststruct *test;
int main() {
FILE *fp = fopen("C:\\image.dd", "rb"); // open file in binary mode
if (fp == NULL) {
fprintf(stderr, "Can't read file");
return 0;
}
fread(&test,sizeof(test),1,fp);
//printf("%x ", test->magic); //this works, but in the wrong order because of little/big endian
printf("%x ", test->version); //no output at all
fclose(fp);
return 0;
}
And this here are the first 114 Bytes of image.dd:
4C 55 4B 53 BA BE 00 01 61 65 73 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 78 74 73 2D 70 6C 61 69
6E 36 34 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 73 68 61 32 35 36 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 40

You must allocate the structure and read data into the structure instead of reading into an pointer directly. If you are going to read only one structure, you won't need to declare pointers for the structure.
printf("%x ", test->magic); invokes undefined behavior because pointer (automatically converted from the array) is passed to where unsigned int is required.
In this case, the observed behavior is because:
Firstly, fread(&test,sizeof(test),1,fp); read the first few bytes from the file as pointer value.
Then, printf("%02x", test->magic); printed the first 4-byte integer from the file because test->magic is (converted to) the pointer to the array placed at the top of the structure, and the address of the array is same as the address of the structure itself, so the address read from the file is printed. One more lucky is that where to read 4-byte integer and address (pointer) from as function arguments are the same.
Finally, you didn't get any output from printf("%x ", test->version); because the address read from the file is unfortunately in region that is not readable and trying to read there caused Segmentation Fault.
Fixed code:
using namespace std;
#include <stdint.h>
#include <string.h>
#include <iostream>
#include <fstream>
#include "exampleh.h"
struct teststruct test; // allocate structure directly instead of pointer
int main() {
FILE *fp = fopen("C:\\image.dd", "rb"); // open file in binary mode
if (fp == NULL) {
fprintf(stderr, "Can't read file");
return 0;
}
fread(&test,sizeof(test),1,fp); // now structure is read instead of pointer
for (int i = 0; i < 6; i++) {
printf("%02x", (unsigned char)test.magic[i]); // print using proper combination of format and data
}
printf(" ");
printf("%x ", test.version); // also use . instead of ->
fclose(fp);
return 0;
}

struct teststruct *test; points to NULL, as it is defined in the global namespace. You do not allocate memory for this pointer, so test->version is UB.
fread(&test,sizeof(test),1,fp); is also wrong, this will read a pointer, not the content of the struct.
An easy fix is to change test to be a struct teststruct and not a pointer to it.
using namespace std;
#include <stdint.h>
#include <string.h>
#include <iostream>
#include <fstream>
#include "exampleh.h"
struct teststruct test; //not a pointer anymore
int main() {
FILE *fp = fopen("C:\\image.dd", "rb"); // open file in binary mode
if (fp == NULL) {
fprintf(stderr, "Can't read file");
return 0;
}
fread(&test,sizeof(test),1,fp);
//printf("%x ", test.magic); //this works, but in the wrong order because of little/big endian
printf("%x ", test.version); //no output at all
fclose(fp);
return 0;
}

Related

C++ structure/array initialization

I have a C++ array or structure initialization issue that I have not been able to resolve.
I have a 4-level nested structure. Each level is actually the same 48 bytes wrapped in the structure next level up. The issue is when the structure is initialized and declared as a scalar value, it is correctly initialized with the provided values. However, when it is declared as a single element array, all 48 bytes become zeros, as shown below. Unfortunately the structures are too complicated to be pasted here.
If I define 4 simple structures, one containing another, with the innermost one containing the same 12 unsigned integers, then it is initialized correctly, even if it is declared in an array.
Has anyone experienced similar issues? What am I missing? What compiler flags, options, etc could lead to such a problem? Appreciate any comments and help.
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "bls12_381/fq.hpp"
static constexpr embedded_pairing::bls12_381::Fq scalar = {
{{{.std_words = {0x1c7238e5, 0xcf1c38e3, 0x786f0c70, 0x1616ec6e, 0x3a6691ae, 0x21537e29,
0x4d9e82ef, 0xa628f1cb, 0x2e5a7ddf, 0xa68a205b, 0x47085aba, 0xcd91de45}}}}
};
static constexpr embedded_pairing::bls12_381::Fq array[1] = {
{{{{.std_words = {0x1c7238e5, 0xcf1c38e3, 0x786f0c70, 0x1616ec6e, 0x3a6691ae, 0x21537e29,
0x4d9e82ef, 0xa628f1cb, 0x2e5a7ddf, 0xa68a205b, 0x47085aba, 0xcd91de45}}}}}
};
void print_struct(const char *title, const uint8_t *cbuf, int len)
{
printf("\n");
printf("[%s] %d\n", title, len);
for (int i = 0; i < len; i++) {
if (i % 30 == 0 && i != 0)
printf("\n");
else if ((i % 10 == 0 || i % 20 == 0) && i != 0)
printf(" ");
printf("%02X ", cbuf[i]);
}
printf("\n");
}
void run_tests()
{
print_struct("scalar", (const uint8_t *) &scalar, sizeof(scalar));
print_struct("array", (const uint8_t *) &array[0], sizeof(array[0]));
}
[scalar] 48
E5 38 72 1C E3 38 1C CF 70 0C 6F 78 6E EC 16 16 AE 91 66 3A 29 7E 53 21 EF 82 9E 4D CB F1
28 A6 DF 7D 5A 2E 5B 20 8A A6 BA 5A 08 47 45 DE 91 CD
[array] 48
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
I've just narrowed down the example.
The following is a complete and standalone example. I also forgot to mention that the initialization on Linux using g++ 9.3.0, -std=c++17, gets the expected results of all FF's. However, on an embedded device, the inherited structure gets all 0's.
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct Data {
uint32_t words;
};
struct Overlay {
Data val;
};
struct Inherit : Data {
};
static Overlay overlay[1] = {
{{.words = 0xffffffff}}
};
static Inherit inherit[1] = {
{{.words = 0xffffffff}}
};
void print_struct(const char *title, const uint8_t *cbuf, int len)
{
printf("[%s] %d\n", title, len);
for (int i = 0; i < len; i++) {
printf("%02X ", cbuf[i]);
}
printf("\n");
}
int main()
{
print_struct("overlay", (const uint8_t *) &overlay[0], sizeof(overlay[0])); // FF FF FF FF
print_struct("inherit", (const uint8_t *) &inherit[0], sizeof(inherit[0])); // 00 00 00 00 <-- incorrect?
return 0;
}

The size of these structs are different in a file but the same in program memory

Consider the following POD struct:
struct MessageWithArray {
uint32_t raw;
uint32_t myArray[10];
//MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } { };
};
Running the following:
#include <type_traits>
#include <iostream>
#include <fstream>
#include <string>
struct MessageWithArray {
uint32_t raw;
uint32_t myArray[10];
//MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } { };
};
//https://stackoverflow.com/questions/46108877/exact-definition-of-as-bytes-function
template <class T>
char* as_bytes(T& x) {
return &reinterpret_cast<char&>(x);
// or:
// return reinterpret_cast<char*>(std::addressof(x));
}
int main() {
MessageWithArray msg = { 0, {0,1,2,3,4,5,6,7,8,9} };
std::cout << "Size of MessageWithArray struct: " << sizeof(msg) << std::endl;
std::cout << "Is a POD? " << std::is_pod<MessageWithArray>() << std::endl;
std::ofstream buffer("message.txt");
buffer.write(as_bytes(msg), sizeof(msg));
return 0;
}
Gives the following output:
Size of MessageWithArray struct: 44
Is a POD? 1
A hex dump of the "message.txt" file looks like this:
00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00
03 00 00 00 04 00 00 00 05 00 00 00 06 00 00 00
07 00 00 00 08 00 00 00 09 00 00 00
Now if I uncomment the constructor (so that MessageWithArray has a zero-argument constructor), MessageWithArray becomes a non-POD struct. Then I use the constructor to initialize instead. This results in the following changes in the code:
....
struct MessageWithArray {
.....
MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 }{ };
};
....
int main(){
MessageWithArray msg;
....
}
Running this code, I get:
Size of MessageWithArray struct: 44
Is a POD? 0
A hex dump of the "message.txt" file looks like this:
00 00 00 00 0D 0A 00 00 00 14 00 00 00 1E 00 00
00 28 00 00 00 32 00 00 00 3C 00 00 00 46 00 00
00 50 00 00 00 5A 00 00 00 64 00 00 00
Now, I'm not so interested in the actual hex values, what I'm curious about is why there is one more byte in the non-POD struct dump compared to the POD struct dump, when sizeof() declares they are the same number of bytes? Is it possible that, because the constructor makes the struct non-POD, that something hidden has been added to the struct? sizeof() should be an accurate compile-time check, correct? Is something possibly avoiding being measured by sizeof()?
Specifications: I am running this in an empty project in Visual Studio 2017 version 15.7.5, Microsoft Visual C++ 2017, on a Windows 10 machine.
Intel Core i7-4600M CPU
64-bit Operating System, x64-based processor
EDIT: I decided to initialize the struct to avoid Undefined Behaviour, and because the question is still valid with the initialization. Initializing it to a value without 10 preserves the behaviour I observed initially, because the data the array had never contained any 10s (even if it was garbage, and random).
It has nothing to do with POD-ness.
Your ofstream is opened in text mode (rather than binary mode). On windows it means that \n gets converted to \r\n.
In the second case there happened to be one 0x0A (\n) byte in the struct, that became 0x0D 0x0A (\r\n). That's why you see an extra byte.
Also, using uninitialized variables in the first case leads to undefined behaviour, which is this case didn't manifest itself.
Other answer explains the problem with writing binary data into stream opened in text mode, however this code is fundamentally wrong. There is no need to dump anything, the proper way to check sizes of those structures and verify that they are equal would be to use static_assert:
struct MessageWithArray {
uint32_t raw;
uint32_t myArray[10];
};
struct NonPodMessageWithArray {
uint32_t raw;
uint32_t myArray[10];
NonPodMessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } {}
};
static_assert(sizeof(MessageWithArray) == sizeof(NonPodMessageWithArray));
online compiler

couldn't write specific content into stringstream

I have some sample code reading some binary data from file and then writing the content into stringstream.
#include <sstream>
#include <cstdio>
#include <fstream>
#include <cstdlib>
std::stringstream * raw_data_buffer;
int main()
{
std::ifstream is;
is.open ("1.raw", std::ios::binary );
char * buf = (char *)malloc(40);
is.read(buf, 40);
for (int i = 0; i < 40; i++)
printf("%02X ", buf[i]);
printf("\n");
raw_data_buffer = new std::stringstream("", std::ios_base::app | std::ios_base::out | std::ios_base::in | std::ios_base::binary);
raw_data_buffer -> write(buf, 40);
const char * tmp = raw_data_buffer -> str().c_str();
for (int i = 0; i < 40; i++)
printf("%02X ", tmp[i]);
printf("\n");
delete raw_data_buffer;
return 0;
}
With a specific input file I have, the program doesn't function correctly. You could download the test file here.
So the problem is, I write the file content into raw_data_buffer and immediately read it back, and the content differs. The program's output is:
FFFFFFC0 65 59 01 00 00 00 00 00 00 00 00 00 00 00 00 FFFFFFE0 0A 40 00 00 00 00 00 FFFFFF80 08 40 00 00 00 00 00 70 FFFFFFA6 57 6E FFFFFFFF 7F 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FFFFFFE0 0A 40 00 00 00 00 00 FFFFFF80 08 40 00 00 00 00 00 70 FFFFFFA6 57 6E FFFFFFFF 7F 00 00
The content FFFFFFC0 65 59 01 is overwritten with 0. Why so?
I suspect this a symptom of undefined behavior from using deallocated memory. You're getting a copy of the string from the stringstream but you're only grabbing a raw pointer to the internals that is then immediately deleted. (the link actually warns against this exact case)
const char* tmp = raw_data_buffer->str().c_str();
// ^^^^^ returns a temporary that is destroyed
// at the end of this statement
// ^^^ now a dangling pointer
Any use of tmp would exhibit undefined behavior and could easily cause the problem you're seeing. Keep the result of str() in scope.

Having trouble getting the output of Linux 'dd' command in C++ program

I'm trying to use the sudo dd if=/dev/sda ibs=1 count=64 skip=446 command to get the partition table information from the master boot record in order to parse it I'm basically trying to read the output to a string in order to parse it, but all I'm getting is the following: � !. What I'm expecting is:
80 01 01 00 83 FE 3F 01 3F 00 00 00 43 7D 00 00
00 00 01 02 83 FE 3F 0D 82 7D 00 00 0C F1 02 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
My current code looks like this, and is just taken from here: How to execute a command and get output of command within C++ using POSIX?
#include <iostream>
#include <stdexcept>
#include <stdio.h>
#include <string>
using namespace std;
string exec(const char* cmd) {
char buffer[128];
string result = "";
FILE* pipe = popen(cmd, "r");
if (!pipe) throw std::runtime_error("popen() failed!");
try {
while (!feof(pipe)) {
if (fgets(buffer, 128, pipe) != NULL)
result += buffer;
}
} catch (...) {
pclose(pipe);
throw;
}
pclose(pipe);
return result;
}
int main() {
string s = exec("sudo dd if=/dev/sda ibs=1 count=64 skip=446");
cout << s;
}
Obviously I'm doing something wrong, but I can't figure out the problem. How do I get the proper output into my string?
while (!feof(pipe)) {
This is your first bug.
result += buffer;
This is your second bug. buffer is a char array, which decays to a char * in this context. As you know, a char * in a string context gets typically interpreted as a C-style string that's terminated by a '\0' byte.
You might've noticed that you expect to get a bunch of 00 bytes read. Well, after the char array gets decayed to a char *, everything up to the first 00 byte is going to get appended to your result, rather than the 128 bytes, exactly. And if there were no 00 bytes in those 128 bytes, you'll probably end up getting some random garbage, as an extra bonus, with a small possibility of a crash.
if (fgets(buffer, 128, pipe) != NULL)
This is your third bug. If the read data happens to include a 0A byte, an '\n' character, this is not going to read 128 bytes.
cout << s;
This is your fourth bug. Since the data will (after all the other bugs are fixed) presumably contain binary stuff, your terminal is inlikely to have much success displaying various bytes, especially bytes 00 through 1F.
To fix your code you will need to:
Correctly handle the end-of-file condition.
Correctly read binary data. fgets(), et al, are completely unsuitable for the task. If you insist on using C file structures, your only reasonable option is to use fread().
Correctly assemble a std::string from a blob of binary data. Merely appending a char buffer to it, crossing your fingers, and hoping for the best, will not work. You will most likely need to use the two-argument std::string constructor, that takes a beginning and an ending iterator value as parameters.
Display binary data correctly, instead of just dumping the entire blob to std::cout, just like that. The most common approach is a std::hex manipulator, and diligent up-conversion of each char to an int, as an unsigned value.

Examining Output of raw files C++

Hi I am reading in a binary file formatted in hex. It is an image file below is a short example of the first few lines of code using hd ... |more command on linux. The image is a binary graphic so the only pixel colours are either black or white. It is a 1024 by 1024 image however the size comes out to be 2097152 bytes
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000dfbf0 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 ff 00 |................|
000dfc00 ff 00 ff 00 ff 00 00 00 00 00 00 00 00 00 00 00 |................|
000dfc10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
This is the code I am using to read it in found in another thread on SO
ifstream file (argv[1], ios::in | ios::binary | ios::ate);
ifstream::pos_type fileSize;
char* fileContents;
if(file.is_open())
{
fileSize = file.tellg();
fileContents = new char[fileSize];
file.seekg(0, ios::beg);
if(!file.read(fileContents, fileSize))
{
cout << "fail to read" << endl;
}
file.close();
cout << fileSize << endl;
The code works however when I run this for loop
for (i=0; i<2097152; i++)
printf("%hd",fileContents[i]);
The only thing printed out are zeros and no 1s. Why is this are my parameters in printf not correctly specifying the pixel size. I know for a fact that there are 1's in the image representing the white areas. Also how do i figure out how many bytes represent a pixel in this image.
Your printf() is wrong. %hd means short, while fileContents[i] is a char; on all modern systems I'm familiar with, this is a size mismatch. Use an array of short instead, since you have twice as many bytes as pixels.
Also, stop using printf() and use std::cout, avoiding all type mismatch problems.
Since 2097152/1024 is exactly 2048 which is in turn 2*1024, I would assume each pixel is 2 bytes.
The other problem is probably in the printf. I'm not sure what %hd is, I would use %02x myself and cast the data to int.