Put a string of hexadecimal values directly into memory - c++

I am working on a project in which I take the hexadecimal memory values of a variable/struct and print them into a file.
My goal is to get the hexadecimal memory values from that file and place it back in a pointer pointing to an "empty" variable. The part in which I get the hexadecimal memory elements works like this:
template <typename T>
void encode (T n, std::ofstream& file) {
char *ptr = reinterpret_cast<char*>(&n);
for (int i = 0; i < sizeof(T); i++) {
unsigned int byte = static_cast<unsigned int>(ptr[i]);
file << std::setw(2) << std::setfill('0') << std::hex << (byte & 0xff) << " ";
}
}
This piece of code results in creating the following hexadecimal string:
7b 00 00 00 33 33 47 41 d9 22 00 00 01 ff 02 00 03 14 00 00 c6 1f 00 00
Now I want to place these hexadecimal values directly back into memory, but at a different location. (Think of it as a client receiving this string and having to decode it.)
My problem now is that I don't know how to put it directly into memory. I've tried the following and unfortunately failed:
template<typename T>
void decode(T* ptr, std::ifstream& file){
//Line of hex values
std::string line;
std::getline(file,line);
//Size of string
int n = line.length();
//Converting the string into char *
char * array = new char[n];
strcpy(array, line.c_str());
//copying the char * into the pointer given to the function
memcpy(ptr,array,n);
}
This is the item which will be encoded. Its memory pattern is the same as in the outputted file:
This is the result I'm getting which as you can see stores the char * into memory but not the way I want it:
The expected result is that the decoded variable should have the same memory pattern as the encoded variable, how can I do this?

std::getline(file,line);
This reads exactly what's in the file, character by character.
You indicate that your file contains this hexadecimal string:
7b 00 00 00 33 33 47 41 d9 22 00 00 01 ff 02 00 03 14 00 00 c6 1f 00 00
That is: the first character in the file is '7'. The next one is 'b', then a space character. And so on.
That's what you will get in line after std::getline() returns. That is, the first character of line will be '7', the next one will be 'b', the next one will be a space, and so on.
My problem now is that I don't know how to put it directly into memory.
No, your problem seems to be that you need to convert the line of text you read back into actual, raw binary bytes. You will need to write some code to do that first: additional code that does the exact opposite of what you did here:
file << std::setw(2) << std::setfill('0') << std::hex << (byte & 0xff) << " ";
Once that code is written, the first byte in your read buffer will be 0x7B instead of the three characters '7', 'b' and ' ', and so on.
There are many different ways to do it, ranging from using std::istringstream to writing a very simple hex-to-integer conversion function. If you flip through the pages of your C++ textbook you are likely to find sample code for this; it is a fairly common algorithm that's offered as a basic exercise in most introductory textbooks.
And once you do that, you can copy the bytes to your pointer. You cannot use strcpy() for that, of course, because it copies whatever it sees up until the first 00 byte. You'll need to use std::copy, or maybe even your own manual copy loop.
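As a rough illustration of the std::istringstream route, here is a minimal sketch. The names parse_hex_bytes and decode_line are made up for this example, and it assumes the line contains only whitespace-separated two-digit hex values, as produced by encode() in the question.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: turn "7b 00 ff ..." into the raw bytes 0x7b 0x00 0xff ...
// Parsing stops silently at the first token that is not a hex number.
std::vector<unsigned char> parse_hex_bytes(const std::string& line) {
    std::vector<unsigned char> bytes;
    std::istringstream in(line);
    unsigned int value = 0;
    // With std::hex, operator>> reads one whitespace-separated hex number at a time.
    while (in >> std::hex >> value) {
        bytes.push_back(static_cast<unsigned char>(value & 0xff));
    }
    return bytes;
}

// Hypothetical counterpart of encode(): returns false, leaving *out untouched,
// if the number of bytes on the line does not match sizeof(T).
template <typename T>
bool decode_line(const std::string& line, T* out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies are only valid for trivially copyable types");
    const std::vector<unsigned char> bytes = parse_hex_bytes(line);
    if (bytes.size() != sizeof(T)) {
        return false;
    }
    // Copy the raw bytes over the object representation of *out.
    std::copy(bytes.begin(), bytes.end(), reinterpret_cast<unsigned char*>(out));
    return true;
}
```

With this, the decode() from the question would read the line with std::getline() as before and then call decode_line(line, ptr). The size check is a design choice for the sketch: it refuses to write anything when the text does not describe exactly one T.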

Related

converting a string read from binary file to integer

I have a binary file. I am reading it 16 bytes at a time using fstream.
I want to convert each chunk to an integer. I tried atoi, but it didn't work.
In Python we can do that by converting to a byte stream using stringobtained.encode('utf-8') and then converting it to int using int(bytestring.hex(),16). Do we have to follow such elaborate steps as in Python, or is there a way to convert it directly?
ifstream file(binfile, ios::in | ios::binary | ios::ate);
if (file.is_open())
{
size = file.tellg();
memblock = new char[size];
file.seekg(0, ios::beg);
while (!file.eof())
{
file.read(memblock, 16);
int a = atoi(memblock); // doesnt work 0 always
cout << a << "\n";
memset(memblock, 0, sizeof(memblock));
}
file.close();
Edit:
This is the sample contents of the file.
53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00
04 00 01 01 00 40 20 20 00 00 05 A3 00 00 00 47
00 00 00 2E 00 00 00 3B 00 00 00 04 00 00 00 01
I need to read it 16 bytes (i.e. 32 hex digits) at a time, which is one row in the sample file content, and convert it to an integer.
So when reading 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00, I should get 110748049513798795666017677735771517696.
But I couldn't do it. I always get 0, even after trying strtoull. Am I reading the file wrong, or what am I missing?
You have a number of problems here. First is that C++ doesn't have a standard 128-bit integer type. You may be able to find a compiler extension, see for example Is there a 128 bit integer in gcc? or Is there a 128 bit integer in C++?.
Second is that you're trying to decode raw bytes instead of a character string. atoi will stop at the first non-digit character it runs into, which 246 times out of 256 will be the very first byte, thus it returns zero. If you're very unlucky you will read 16 valid digits and atoi will start reading uninitialized memory, leading to undefined behavior.
You don't need atoi anyway; your problem is much simpler than that. You just need to assemble 16 bytes into an integer, which can be done with shift and bitwise OR operators. The only complication is that read wants a char type, which will probably be signed, and you need unsigned bytes.
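ifstream file(binfile, ios::in | ios::binary);
char memblock[16];
while (file.read(memblock, 16))
{
    // uint128_t is not standard C++; it stands for whatever 128-bit unsigned
    // type your compiler offers (for example unsigned __int128 on GCC and Clang).
    uint128_t a = 0;
    for (int i = 0; i < 16; ++i)
    {
        // Mask to 8 bits so that a negative (signed) char does not sign-extend.
        a = (a << 8) | (static_cast<unsigned int>(memblock[i]) & 0xff);
    }
    cout << a << "\n"; // needs an operator<< for the 128-bit type you chose
}
file.close();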
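If no 128-bit type is available, one alternative (a different technique from the shift-and-or loop above, shown only as a sketch) is to produce the decimal digits directly from the 16 bytes with long division by 10. The function name bytes_to_decimal is hypothetical, and it assumes the first byte is the most significant one, as in the question's expected value.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: treat 16 bytes as one big-endian unsigned integer
// and return its decimal representation. input must point to 16 bytes.
std::string bytes_to_decimal(const unsigned char* input) {
    unsigned char digits[16];          // working copy, base-256 digits
    std::copy(input, input + 16, digits);

    std::string result;
    bool nonzero = true;
    while (nonzero) {
        // One pass of long division of the base-256 number by 10.
        unsigned int remainder = 0;
        nonzero = false;
        for (int i = 0; i < 16; ++i) {
            unsigned int current = remainder * 256 + digits[i];
            digits[i] = static_cast<unsigned char>(current / 10);
            remainder = current % 10;
            if (digits[i] != 0) nonzero = true;
        }
        // The remainder is the next decimal digit, least significant first.
        result.push_back(static_cast<char>('0' + remainder));
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

In the read loop this could be called as bytes_to_decimal(reinterpret_cast<const unsigned char*>(memblock)), which also takes care of the signed char problem.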
If the number is binary, what you want is:
short value;
file.read(reinterpret_cast<char*>(&value), sizeof(value));
Depending upon how the file was written and your processor, you may have to reverse the bytes in value using bit operations.

Problems reading binary file

I'm trying to read the first 6 bytes of a file, but it's giving me weird results, and I can't seem to figure out what I'm doing wrong.
My code:
struct Block {
char fileSize[3];
char initialDataBlockId[3];
};
int main(int c, char **a) {
ifstream file("C\\main_file_cache.idx0", ios::binary);
Block block;
file.get((char*)&block, sizeof(block));
printf("File Size: %i\n", block.fileSize);
printf("Initial Data Block ID: %i\n", block.initialDataBlockId);
file.close();
system("pause");
return 0;
}
Before I ran the code, I opened the file in a binary editor,
and it showed me this hex code:
00 00 00 00-00 00 05 af-4b 00 00 01-26 df cd 00
00 6f 03 3f-ed 00 03 61-05 08 35 00-04 8b 01 61
59 00 08 39-03 23 0a 00-05 6c 00 35-d0 00 06 fe
03 69 d8 00-07 19
There are a total of 54 bytes. The first 6 bytes are just zero.
So, I expected my program to produce the following output:
File Size: 0
Initial Data Block ID: 0
Instead, the output is as follows:
File Size: 10419128
Initial Data Block ID: 10419131
This result makes no sense. Maybe there is something wrong with my code?
You should use type unsigned char in your Block structure.
You should use file.read() to read binary data instead of file.get().
You are printing the addresses of the arrays in the Block structure, not their contents. Furthermore, the specifier %i expects an int, not a char *, so the behavior is undefined: you get some weird integer value, but anything could have happened, including program termination. Increasing the warning level is advisable so that the compiler warns about such mistakes.
If the file format is little endian, you could convert these 3 byte arrays to numbers this way:
int block_fileSize = (unsigned char)block.fileSize[0] +
((unsigned char)block.fileSize[1] << 8) +
((unsigned char)block.fileSize[2] << 16);
int block_initialDataBlockId = (unsigned char)block.initialDataBlockId[0] +
((unsigned char)block.initialDataBlockId[1] << 8) +
((unsigned char)block.initialDataBlockId[2] << 16);
printf("File Size: %i\n", block_fileSize);
printf("Initial Data Block ID: %i\n", block_initialDataBlockId);
If you want to read binary data, you can use the read method of ifstream, and likewise the write method of ofstream:
istream & ifstream::read (char * s, streamsize n);
ostream & ofstream::write (const char * s, streamsize n);
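Note that on UNIX-like systems binary mode makes no difference, because text and binary streams behave the same there; it matters on systems such as Windows, where text mode translates line endings.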

Having trouble getting the output of Linux 'dd' command in C++ program

I'm trying to use the sudo dd if=/dev/sda ibs=1 count=64 skip=446 command to get the partition table information from the master boot record. I'm basically trying to read the output into a string in order to parse it, but all I'm getting is the following: � !. What I'm expecting is:
80 01 01 00 83 FE 3F 01 3F 00 00 00 43 7D 00 00
00 00 01 02 83 FE 3F 0D 82 7D 00 00 0C F1 02 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
My current code looks like this, and is just taken from here: How to execute a command and get output of command within C++ using POSIX?
#include <iostream>
#include <stdexcept>
#include <stdio.h>
#include <string>
using namespace std;
string exec(const char* cmd) {
char buffer[128];
string result = "";
FILE* pipe = popen(cmd, "r");
if (!pipe) throw std::runtime_error("popen() failed!");
try {
while (!feof(pipe)) {
if (fgets(buffer, 128, pipe) != NULL)
result += buffer;
}
} catch (...) {
pclose(pipe);
throw;
}
pclose(pipe);
return result;
}
int main() {
string s = exec("sudo dd if=/dev/sda ibs=1 count=64 skip=446");
cout << s;
}
Obviously I'm doing something wrong, but I can't figure out the problem. How do I get the proper output into my string?
while (!feof(pipe)) {
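This is your first bug. feof() only becomes true after a read has already hit the end of the file, so testing it before reading is too early.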
result += buffer;
This is your second bug. buffer is a char array, which decays to a char * in this context. As you know, a char * in a string context gets typically interpreted as a C-style string that's terminated by a '\0' byte.
You might've noticed that you expect to get a bunch of 00 bytes read. Well, after the char array gets decayed to a char *, everything up to the first 00 byte is going to get appended to your result, rather than the 128 bytes, exactly. And if there were no 00 bytes in those 128 bytes, you'll probably end up getting some random garbage, as an extra bonus, with a small possibility of a crash.
if (fgets(buffer, 128, pipe) != NULL)
This is your third bug. If the read data happens to include a 0A byte, an '\n' character, this is not going to read 128 bytes.
cout << s;
This is your fourth bug. Since the data will (after all the other bugs are fixed) presumably contain binary stuff, your terminal is unlikely to have much success displaying various bytes, especially bytes 00 through 1F.
To fix your code you will need to:
Correctly handle the end-of-file condition.
Correctly read binary data. fgets(), et al, are completely unsuitable for the task. If you insist on using C file structures, your only reasonable option is to use fread().
Correctly assemble a std::string from a blob of binary data. Merely appending a char buffer to it, crossing your fingers, and hoping for the best, will not work. You will most likely need to use the two-argument std::string constructor, that takes a beginning and an ending iterator value as parameters.
Display binary data correctly, instead of just dumping the entire blob to std::cout, just like that. The most common approach is a std::hex manipulator, and diligent up-conversion of each char to an int, as an unsigned value.
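The four fixes above can be sketched roughly as follows. The helper names read_all and to_hex are invented for this example; read_all works on any open FILE*, including one returned by popen(). It appends with an explicit length (std::string::append) instead of the iterator-pair constructor, which has the same effect of keeping embedded 00 bytes.

```cpp
#include <cstddef>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: read everything from an open stream as raw bytes.
std::string read_all(std::FILE* stream) {
    std::string result;
    char buffer[128];
    std::size_t n;
    // fread() returns the number of bytes actually read; 0 means EOF or error,
    // so the loop ends without ever consulting feof() up front.
    while ((n = std::fread(buffer, 1, sizeof buffer, stream)) > 0) {
        result.append(buffer, n);  // length-based append keeps 00 and 0A bytes
    }
    return result;
}

// Hypothetical helper: format bytes as "80 01 00 ..." for display.
std::string to_hex(const std::string& bytes) {
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out << ' ';
        // Go through unsigned char first so bytes >= 0x80 do not sign-extend.
        out << std::setw(2)
            << static_cast<unsigned int>(static_cast<unsigned char>(bytes[i]));
    }
    return out.str();
}
```

In exec() from the question, the fgets() loop would be replaced by a call to read_all(pipe), and main() would print to_hex(s) rather than s itself.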

processing a raw hex communication data log into readable values

I have an embedded microcontroller system that is communicating with my computer via UART, and at this point of development it should be sending out valid data on its own. I have triggered a 'scan' of this device; its role is then to send the data it has read to the host device autonomously. I now have a text file with raw hex data values in it, ready to be processed. I'm sending out packets that start with 0x54 and end in 0x55, and they come in lots of 4 (4 packets per single 'read' of the instrument). A packet contains two 'identifier' bytes after the 0x54, then a bunch of 6 bytes for data, totaling 10 bytes per packet. Depending on the packet identifiers, the data within could be a floating point number or an integer.
I basically want to design a command line program that takes in a raw text file with all this data in it and outputs a text file, comma separated between packets, with the data converted to its readable decimal counterpart. A new line for every 4th converted packet would be very useful. I'd like to do this in C (I am relatively proficient in coding embedded C, and I can convert the hex values back and forth between types easily enough) but I do not have much experience in writing C executables. Any help with getting started on this little mini project would be awesome. I need to know how to create a command line controllable executable, read in a data file, manipulate the data (which I think I can do now) and then export the data to another file. I've installed NetBeans C/C++. Pointers in the right direction are all I require (no pun intended =] )
Here's the raw data file:
http://pastebin.com/dx4HetT0
There is not much variation in the data to deduce the bytes with certainty, but the following should get the OP started.
void db_parse(FILE *outf, const unsigned char *s) {
fprintf(outf, "%c", *s++);
fprintf(outf, " %3u", *((uint8_t *) s));
s += sizeof(uint8_t);
uint8_t id1 = *((uint8_t *) s);
fprintf(outf, " %3u", *((uint8_t *) s));
s += sizeof(uint8_t);
if (id1 >= 3) {
fprintf(outf, " %13e", *((float *) s));
s += sizeof(float);
} else {
fprintf(outf, " %13u", *((uint32_t *) s));
s += sizeof(uint32_t);
}
fprintf(outf, " %5u", *((uint16_t *) s));
s += sizeof(uint16_t);
fprintf(outf, " %c\n", *s);
}
// Test code below
const char *h4 =
"54 12 04 00 00 40 C0 00 00 55 54 12 01 02 00 00 00 00 00 55 54 12 02 03 00 00 00 00 00 55 54 12 03 00 00 40 C0 00 00 55 ";
void db_test() {
unsigned char uc[10];
const char *s = h4;
while (*s) {
for (int i = 0; i < 10; i++) {
unsigned x;
sscanf(s, "%x", &x);
uc[i] = x;
s += 3;
}
db_parse(stdout, uc);
}
}
Output
T 18 4 -3.000000e+00 0 U
T 18 1 2 0 U
T 18 2 3 0 U
T 18 3 -3.000000e+00 0 U
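The pointer casts in db_parse() read a float and a uint32_t from an unaligned address, which is not guaranteed to work on every platform. A more defensive variant, sketched here in C++, assembles the values byte by byte and copies the bits with memcpy. Packet and parse_packet are hypothetical names, the field layout simply mirrors the guess made in db_parse(), and the sketch assumes little-endian payloads and a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical decoded packet; the field meanings mirror db_parse() above
// and are guesses, not a confirmed description of the protocol.
struct Packet {
    std::uint8_t id0;
    std::uint8_t id1;
    bool is_float;
    float f;             // meaningful when is_float is true
    std::uint32_t u;     // meaningful when is_float is false
    std::uint16_t tail;
};

// p must point to 10 bytes. Returns false unless they are framed by 0x54 ... 0x55.
bool parse_packet(const unsigned char* p, Packet& out) {
    if (p[0] != 0x54 || p[9] != 0x55) return false;
    out.id0 = p[1];
    out.id1 = p[2];
    // Assemble the 32-bit payload least significant byte first (assumption).
    std::uint32_t raw = static_cast<std::uint32_t>(p[3]) |
                        (static_cast<std::uint32_t>(p[4]) << 8) |
                        (static_cast<std::uint32_t>(p[5]) << 16) |
                        (static_cast<std::uint32_t>(p[6]) << 24);
    out.is_float = out.id1 >= 3;   // same rule as db_parse()
    out.f = 0.0f;
    out.u = 0;
    if (out.is_float) {
        static_assert(sizeof(float) == sizeof raw, "assumes a 32-bit float");
        std::memcpy(&out.f, &raw, sizeof raw);  // bit copy instead of a pointer cast
    } else {
        out.u = raw;
    }
    out.tail = static_cast<std::uint16_t>(p[7] | (p[8] << 8));
    return true;
}
```

Returning false on a bad start or end byte gives the caller a chance to resynchronise on the next 0x54 instead of printing garbage.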

Bit reading puzzle (reading a binary file in C++)

I am trying to read the file 'train-images-idx3-ubyte', which can be found here along with the corresponding file format description (at the bottom of the webpage). When I look at the bytes with od -t x1 train-images-idx3-ubyte | less (hexadecimal, bytewise), I get the following output:
address bytes
0000000 00 00 08 03 00 00 ea 60 00 00 00 1c 00 00 00 1c
0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
This is what I expected according to 1. But when I try to read the data with C++ I've got a problem. What I do is this:
std::fstream trainingData("minst/train-images-idx3-ubyte",
std::ios::in | std::ios::binary);
int8_t zero = 0, encoding = 0, dimension = 0;
int32_t samples = -1;
trainingData >> zero >> zero >> encoding >> dimension;
trainingData >> samples;
debugLogger << "training set image file, encoding = "
<< (int) encoding << ", dimension = "
<< (int) dimension << ", items = " << (int) samples << "\n";
But the output of these few lines of code is:
training set image file, encoding = 8, dimension = 3, items = 0
Everything but the number of instances (items, samples) is correct. I tried reading the next 4 bytes as int8_t and that gave me at least the same result as od. I cannot imagine how samples can be 0. What I actually wanted to read here was 10,000. Maybe you've got a clue?
As mentioned in other answers, you need to use unformatted input, i.e. istream::read(...) instead of operator>>. Translating your code above to use read yields:
trainingData.read(reinterpret_cast<char*>(&zero), sizeof(zero));
trainingData.read(reinterpret_cast<char*>(&zero), sizeof(zero));
trainingData.read(reinterpret_cast<char*>(&encoding), sizeof(encoding));
trainingData.read(reinterpret_cast<char*>(&dimension), sizeof(dimension));
trainingData.read(reinterpret_cast<char*>(&samples), sizeof(samples));
Which gets you most of the way there, but 00 00 ea 60 looks like it's in big-endian format, so you'll have to pass it through ntohl (declared in <arpa/inet.h> on POSIX systems) to make sense of it if you're running on an Intel-based machine:
samples = ntohl(samples);
which gives encoding = 8, dimension = 3, items = 60000.
operator>> performs formatted input: it skips whitespace and tries to parse text, so it gives wrong results on raw bytes. Reading with unformatted input (read()) will provide the correct results.