c++ parsing hex char array efficiently

I'm trying to figure out how to most efficiently parse the following into hex segments with C++98.
//One lump, no delimiters
char hexData[] = "50FFFEF080";
I want to parse out 50, FF, FE and F080 (assuming I know hexData will be in this format every time) and convert each to base 10, yielding something like:
var1=80
var2=255
var3=254
var4=61568

Here's one strategy.
Copy the necessary characters one at a time to a temporary string.
Use strtol to extract the numbers.
Program:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    char hexData[] = "50FFFEF080";
    int i = 0;
    int var[4];
    char temp[5] = {};
    char* end = NULL;
    for ( i = 0; i < 3; ++i )
    {
        temp[0] = hexData[i*2];
        temp[1] = hexData[i*2+1];
        var[i] = (int)strtol(temp, &end, 16);
        printf("var[%d]: %d\n", i, var[i]);
    }
    // The last number.
    temp[0] = hexData[3*2];
    temp[1] = hexData[3*2+1];
    temp[2] = hexData[3*2+2];
    temp[3] = hexData[3*2+3];
    var[3] = (int)strtol(temp, &end, 16);
    printf("var[3]: %d\n", var[3]);
    return 0;
}
Output:
var[0]: 80
var[1]: 255
var[2]: 254
var[3]: 61568

You can convert the whole string to a single number and then use bitwise operations to extract any bytes or bits. Try this:
#include <stdio.h>
int main()
{
    char hexData[] = "50FFFEF080";
    unsigned long long number; // at least 64 bits, enough for 10 hex digits
    // conversion from char-string to one big number
    sscanf(hexData, "%llx", &number); // read as a hex number
    unsigned long long tmp = number; // working copy for the bitwise operations
    // use masks to get particular bytes
    printf("%llu \n", tmp & 0xFFFF); // prints the last two bytes as a decimal number: 61568
    // or copy to some other memory
    unsigned int lastValue = tmp & 0xFFFF; // now lastValue has 61568 (0xF080)
    tmp >>= 16; // remove the last two bytes with a right shift
    printf("%llu \n", tmp & 0xFF); // prints the last byte: 254
    tmp >>= 8; // remove the last byte with a right shift
    printf("%llu \n", tmp & 0xFF); // prints 255
    tmp >>= 8; // remove the last byte with a right shift
    printf("%llu \n", tmp & 0xFF); // prints 80
    return 0;
}

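#include <iostream>
#include <sstream>
int main() {
    std::istringstream buffer("50FFFEF080");
    unsigned long long value;
    buffer >> std::hex >> value; // value == 0x50FFFEF080
    int var1 = (value >> 32) & 0xFF; // 0x50 == 80
    int var2 = (value >> 24) & 0xFF; // 0xFF == 255
    int var3 = (value >> 16) & 0xFF; // 0xFE == 254
    int var4 = value & 0xFFFF; // 0xF080 == 61568
    std::cout << var1 << ' ' << var2 << ' ' << var3 << ' ' << var4 << '\n';
    return 0;
}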
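
For comparison, here is a minimal sketch that avoids both the temporary buffer and the 64-bit intermediate by decoding each fixed-width field digit by digit. The helper names hex_digit_value and parse_hex_field are made up for this illustration, the 2, 2, 2 and 4 digit layout is taken from the question, and the input is assumed to be valid hex. It stays within C++98.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode `width` hex digits starting at s[offset].
// Assumes the caller knows the field layout, as stated in the question.
static unsigned long parse_hex_field(const char* s, std::size_t offset, std::size_t width)
{
    assert(offset + width <= std::strlen(s));
    unsigned long value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        int d = hex_digit_value(s[offset + i]);
        assert(d >= 0); // the input is assumed to be valid hex
        value = (value << 4) | static_cast<unsigned long>(d);
    }
    return value;
}
```

With hexData from the question, parse_hex_field(hexData, 0, 2), parse_hex_field(hexData, 2, 2), parse_hex_field(hexData, 4, 2) and parse_hex_field(hexData, 6, 4) would give var1 through var4. Each digit is visited once, and nothing is copied.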

Related

Converting string hexadecimals to unsigned char (BYTE) in C

I want to convert the hexadecimal string value 0x1B6 to unsigned char bytes, stored in the format 0x1B, 0x60. We achieved this in C++, but C doesn't support std::stringstream.
The following code is C++. How do I achieve similar behavior in C?
unsigned char byte[2];
std::string hexa;
std::string str = "0x1B6"; // directly assigned the char* value in to string here
int index = 0;
unsigned int i;
for (i = 2; i < str.length(); i += 2) {
    hexa = "0x";
    if (i + 1 < str.length()) {
        hexa = hexa + str[i] + str[i + 1];
        short temp;
        std::istringstream(hexa) >> std::hex >> temp;
        byte[index++] = static_cast<unsigned char>(temp);
    } else {
        // odd number of digits: pad the last one with a trailing zero
        hexa = hexa + str[i] + "0";
        short temp;
        std::istringstream(hexa) >> std::hex >> temp;
        byte[index++] = static_cast<unsigned char>(temp);
    }
}
output:
byte[0] --> 0x1B
byte[1]--> 0x60

I don't think your solution is very efficient. But disregarding that, with C you would use strtol. This is an example of how to achieve something similar:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(void) {
    const char *hex_string = "0x1B60";
    long hex_as_long = strtol(hex_string, NULL, 16);
    int digits = (int)strlen(hex_string + 2); // hex digits after the "0x" prefix
    printf("%lx\n", hex_as_long);
    // From right to left
    for (int i = 0; i < digits; i += 2) {
        printf("%lx\n", (hex_as_long >> (i * 4)) & 0xff);
    }
    printf("---\n");
    // From left to right
    for (int i = digits - 2; i >= 0; i -= 2) {
        printf("%lx\n", (hex_as_long >> (i * 4)) & 0xff);
    }
}
So here we get the full value as a long inside hex_as_long. We then print the whole long with the first printf and the individual bytes inside the two for loops, lowest byte first and then highest byte first. We shift by multiples of 4 bits because one hex digit (0xf) covers exactly 4 bits of data.
To get the bytes or the long printed to a string rather than to stdout (if that is what you want to achieve), you can use sprintf or snprintf in a similar way to how printf is used, but with a char array as the destination.
This solution scans whole bytes (0xff) at a time. If you need to handle one hex digit (0xf) at a time, step i by one instead of two and mask with 0xf instead of 0xff.
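
The strtol approach above needs an even number of digits, while the question's 0x1B6 has three. Here is a minimal sketch that packs digits left to right and pads an odd trailing digit with a zero low nibble, which is how the question's expected output reads. The names nibble and hex_to_bytes are made up for this illustration, and the input is assumed to be valid hex. It is written as C++, but it uses only arrays and pointers, so it should port to C by swapping the headers and casts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: value of one hex digit; the input is assumed valid.
static unsigned char nibble(char c)
{
    if (c >= '0' && c <= '9') return static_cast<unsigned char>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned char>(c - 'a' + 10);
    assert(c >= 'A' && c <= 'F');
    return static_cast<unsigned char>(c - 'A' + 10);
}

// Hypothetical helper: pack the digits of hex (optionally prefixed with "0x")
// into out, two digits per byte, left to right. An odd trailing digit becomes
// the high nibble of the last byte, so "0x1B6" gives 0x1B, 0x60.
// Returns the number of bytes written; out must hold at least out_size bytes.
static std::size_t hex_to_bytes(const char* hex, unsigned char* out, std::size_t out_size)
{
    if (hex[0] == '0' && (hex[1] == 'x' || hex[1] == 'X')) hex += 2;
    std::size_t digits = std::strlen(hex);
    std::size_t count = (digits + 1) / 2;
    assert(count <= out_size);
    for (std::size_t i = 0; i < count; ++i) {
        unsigned char hi = nibble(hex[2 * i]);
        unsigned char lo = (2 * i + 1 < digits) ? nibble(hex[2 * i + 1]) : 0;
        out[i] = static_cast<unsigned char>((hi << 4) | lo);
    }
    return count;
}
```

Because it never builds an intermediate long, the input length is limited only by the size of the output buffer.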

Base64 image file encoding with C++

I am writing some simple code to encode files to base64. I have a short C++ program that reads a file into a vector and converts it to unsigned char*. I do this so I can properly use the encoding function I got.
The problem: it works with text files (of different sizes), but it won't work with image files, and I can't figure out why. What gives?
For a simple text.txt containing the text abcd, the output of my code and of a bash $( base64 text.txt ) is the same.
On the other hand, when I input an image, the output is something like iVBORwOKGgoAAAAAAA......AAA==, or sometimes it ends with a "corrupted size vs prev_size Aborted (core dumped)" message. The first few bytes are correct.
The code:
static std::vector<char> readBytes(char const* filename)
{
std::ifstream ifs(filename, std::ios::binary|std::ios::ate);
std::ifstream::pos_type pos = ifs.tellg();
std::vector<char> result(pos);
ifs.seekg(0, std::ios::beg);
ifs.read(&result[0], pos);
return result;
}
static char Base64Digits[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
int ToBase64Simple( const BYTE* pSrc, int nLenSrc, char* pDst, int nLenDst )
{
int nLenOut= 0;
while ( nLenSrc > 0 ) {
if (nLenOut+4 > nLenDst) {
cout << "error\n";
return(0); // error
}
// read three source bytes (24 bits)
BYTE s1= pSrc[0]; // (but avoid reading past the end)
BYTE s2= 0; if (nLenSrc>1) s2=pSrc[1]; //------ corrected, thanks to jprichey
BYTE s3= 0; if (nLenSrc>2) s3=pSrc[2];
DWORD n;
n = s1; // xxx1
n <<= 8; // xx1x
n |= s2; // xx12
n <<= 8; // x12x
n |= s3; // x123
//-------------- get four 6-bit values for lookups
BYTE m4= n & 0x3f; n >>= 6;
BYTE m3= n & 0x3f; n >>= 6;
BYTE m2= n & 0x3f; n >>= 6;
BYTE m1= n & 0x3f;
//------------------ lookup the right digits for output
BYTE b1 = Base64Digits[m1];
BYTE b2 = Base64Digits[m2];
BYTE b3 = Base64Digits[m3];
BYTE b4 = Base64Digits[m4];
//--------- end of input handling
*pDst++ = b1;
*pDst++ = b2;
if ( nLenSrc >= 3 ) { // 24 src bits left to encode, output xxxx
*pDst++ = b3;
*pDst++ = b4;
}
if ( nLenSrc == 2 ) { // 16 src bits left to encode, output xxx=
*pDst++ = b3;
*pDst++ = '=';
}
if ( nLenSrc == 1 ) { // 8 src bits left to encode, output xx==
*pDst++ = '=';
*pDst++ = '=';
}
pSrc += 3;
nLenSrc -= 3;
nLenOut += 4;
}
// Could optionally append a NULL byte like so:
*pDst++= 0; nLenOut++;
return( nLenOut );
}
int main(int argc, char* argv[])
{
std::vector<char> mymsg;
mymsg = readBytes(argv[1]);
char* arr = &mymsg[0];
int len = mymsg.size();
int lendst = ((len+2)/3)*4;
unsigned char* uarr = (unsigned char *) malloc(len*sizeof(unsigned char));
char* dst = (char *) malloc(lendst*sizeof(char));;
mymsg.clear(); //free()
// convert to unsigned char
strncpy((char*)uarr, arr, len);
int lenOut = ToBase64Simple(uarr, len, dst, lendst);
free(uarr);
int cont = 0;
while (cont < lenOut) //(dst[cont] != 0)
cout << dst[cont++];
cout << "\n";
}
Any insight is welcomed.

I see two problems.
First, you are clearing your mymsg vector before you're done using it. This leaves the arr pointer dangling (pointing at memory that is no longer allocated). When you access arr to get the data out, you end up with Undefined Behavior.
Then you use strncpy to copy (potentially) binary data. This copy will stop when it reaches the first nul (0) byte within the file, so not all of your data will be copied. You should use memcpy instead.
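
As a minimal sketch of how both problems could be avoided, the version below keeps the bytes in a vector for as long as they are needed and never goes through a string copy. The names read_bytes and to_base64 are made up for this illustration. It also sidesteps a third issue that appears to be in the question's code: dst is allocated with exactly lendst bytes, but ToBase64Simple writes a terminating 0 after the encoded text, one byte past the buffer, which would be consistent with the "corrupted size vs prev_size" abort.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file as raw bytes (binary mode, no text translation).
static std::vector<unsigned char> read_bytes(const char* filename)
{
    std::ifstream ifs(filename, std::ios::binary);
    std::vector<char> raw((std::istreambuf_iterator<char>(ifs)),
                          std::istreambuf_iterator<char>());
    return std::vector<unsigned char>(raw.begin(), raw.end());
}

// Encode raw bytes as base64 with '=' padding. The std::string grows as
// needed, so there is no destination size to get wrong.
static std::string to_base64(const std::vector<unsigned char>& src)
{
    static const char digits[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((src.size() + 2) / 3) * 4);
    for (std::size_t i = 0; i < src.size(); i += 3) {
        std::size_t left = src.size() - i; // bytes remaining, at least 1
        unsigned long n = static_cast<unsigned long>(src[i]) << 16;
        if (left > 1) n |= static_cast<unsigned long>(src[i + 1]) << 8;
        if (left > 2) n |= src[i + 2];
        out += digits[(n >> 18) & 0x3f];
        out += digits[(n >> 12) & 0x3f];
        out += (left > 1) ? digits[(n >> 6) & 0x3f] : '=';
        out += (left > 2) ? digits[n & 0x3f] : '=';
    }
    assert(out.size() == ((src.size() + 2) / 3) * 4);
    return out;
}
```

A main would then reduce to something like std::cout << to_base64(read_bytes(argv[1])) << "\n"; with no malloc, strncpy or free involved.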

Shifting arrays of bytes and skipping bits

I'm trying to make a function that would return N number of bits of a given memory chunk, and optionally skipping M bits.
Example:
unsigned char *data = malloc(3);
data[0] = 'A'; data[1] = 'B'; data[2] = 'C';
read(data, 8, 4);
would skip 4 bits and then read 8 bits from the data chunk "ABC".
"Skipping" bits means it would actually bitshift the entire array, carrying bits from the right to the left.
In this example ABC is
01000001 01000010 01000011
and the function would need to return
0001 0100
This question is a follow up of my previous question
Minimal compilable code
#include <ios>
#include <cmath>
#include <bitset>
#include <cstdio>
#include <cstring>
#include <cstdlib>
#include <iostream>
using namespace std;
typedef unsigned char byte;
typedef struct bit_data {
byte *data;
size_t length;
} bit_data;
/*
Assume 0 <= skip_n_bits <= 8
*/
bit_data *read(size_t n_bits, size_t skip_n_bits) {
bit_data *bits = (bit_data *) malloc(sizeof(struct bit_data));
size_t bytes_to_read = ceil(n_bits / 8.0);
size_t bytes_to_read_with_skip = ceil(n_bits / 8.0) + ceil(skip_n_bits / 8.0);
bits->data = (byte *) calloc(1, bytes_to_read);
bits->length = n_bits;
/* Hardcoded for the sake of this example*/
byte *tmp = (byte *) malloc(3);
tmp[0] = 'A'; tmp[1] = 'B'; tmp[2] = 'C';
/*not working*/
if(skip_n_bits > 0){
unsigned char *tmp2 = (unsigned char *) calloc(1, bytes_to_read_with_skip);
size_t i;
for(i = bytes_to_read_with_skip - 1; i > 0; i--) {
tmp2[i] = tmp[i] << skip_n_bits;
tmp2[i - 1] = (tmp[i - 1] << skip_n_bits) | (tmp[i] >> (8 - skip_n_bits));
}
memcpy(bits->data, tmp2, bytes_to_read);
free(tmp2);
}else{
memcpy(bits->data, tmp, bytes_to_read);
}
free(tmp);
return bits;
}
int main(void) {
//Reading "ABC"
//01000001 01000010 01000011
bit_data *res = read(8, 4);
cout << bitset<8>(*res->data);
cout << " -> Should be '00010100'";
return 0;
}
The current code returns 00000000 instead of 00010100.
I feel like the error is something small, but I'm missing it. Where is the problem?

Your code is tagged as C++, and indeed you're already using C++ constructs like bitset; however, it's very C-like. The first thing to do, I think, would be to use more C++.
It turns out bitset is pretty flexible already. My approach would be to create one that stores all the bits of the input data, then grab a subset of it based on the number of bits you wish to skip, and return that subset:
template<size_t N, size_t M, typename T = unsigned char>
std::bitset<N> read(size_t skip_n_bits, const std::array<T, M>& data)
{
const size_t numBits = sizeof(T) * 8;
std::bitset<N> toReturn; // initially all zeros
// if we want to skip all bits, return all zeros
if (M*numBits <= skip_n_bits)
return toReturn;
// create a bitset to store all the bits represented in our data array
std::bitset<M*numBits> tmp;
// set bits in tmp based on data
// convert T into bit representations
size_t pos = M*numBits-1;
for (const T& element : data)
{
for (size_t i=0; i < numBits; ++i)
{
tmp.set(pos-i, (1 << (numBits - i-1)) & element);
}
pos -= numBits;
}
// grab just the bits we need
size_t startBit = tmp.size()-skip_n_bits-1;
for (size_t i = 0; i < N; ++i)
{
toReturn[N-i-1] = tmp[startBit];
tmp <<= 1;
}
return toReturn;
}
And now we can call it like so:
// return 8-bit bitset, skip 12 bits
std::array<unsigned char, 3> data{{'A', 'B', 'C'}};
auto&& returned = read<8>(12, data);
std::cout << returned << std::endl;
Prints
00100100
which is precisely our input 01000001 01000010 01000011 skipping the first twelve bits (from the left towards the right), and only grabbing the next 8 available.
I'd argue this is a bit easier to read than what you've got, esp. from a C++ programmer's point of view.
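
If the sizes are only known at run time, a bitset is not an option because its length is a template parameter. Here is a minimal sketch of the same idea that indexes individual bits in a byte array. The name read_bits is made up for this illustration, and it assumes bits are numbered from the most significant bit of the first byte, which is how the examples in this thread are written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: skip `skip` bits from the start of data (most
// significant bit of data[0] first), then return the next `n` bits packed
// MSB-first into bytes. The last byte is zero-padded on the right if n is
// not a multiple of 8.
static std::vector<unsigned char> read_bits(const unsigned char* data, std::size_t len,
                                            std::size_t skip, std::size_t n)
{
    assert(skip + n <= len * 8);
    std::vector<unsigned char> out((n + 7) / 8, 0);
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t src = skip + i; // absolute bit index into data
        unsigned bit = (data[src / 8] >> (7 - src % 8)) & 1u;
        out[i / 8] = static_cast<unsigned char>(out[i / 8] | (bit << (7 - i % 8)));
    }
    return out;
}
```

For the bytes "ABC", skipping 4 bits and reading 8 gives 00010100 (the value the question expects), and skipping 12 bits gives 00100100 (the value printed above). Copying one bit at a time is slow for large ranges but keeps the index arithmetic easy to check.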

Convert unsigned short to char

I know there are other posts like this, but I think mine is different. I have a group of numbers I wish to display, but they are saved as unsigned short. Because the data comes from a network buffer, all of it is in unsigned short format. So for a serial number starting with "ABC-", the two unsigned shorts hold 0x4142 and 0x432D (already ASCII). I need to convert those to type char to display them using printf and %s, but for the rest of my system they need to remain unsigned short. This is what I've tried so far, but the output is blank:
unsigned char * num[3];
num[0] = (unsigned char*)(SYSTEM_N >> 8);
num[1] = (unsigned char*)(SYSTEM_N & 0x00FF);
printf("System Number: %s \r\n", num);
Can anyone shed some light on this for me? Thanks!

There are several errors: 1) the array is too short, 2) the array is defined as an array of pointers, and 3) the string is not terminated.
#include <stdio.h>
int main (void)
{
unsigned short system_m = 0x4142;
unsigned short system_n = 0x432D;
unsigned char num[5];
num[0] = system_m >> 8;
num[1] = system_m & 0xFF;
num[2] = system_n >> 8;
num[3] = system_n & 0xFF;
num[4] = '\0';
printf("System Number: %s \r\n", num);
return 0;
}
EDIT alternatively if you don't want to keep the string, just display the information with this:
#include <stdio.h>
int main (void)
{
unsigned short system_m = 0x4142;
unsigned short system_n = 0x432D;
printf("System Number: %c%c%c%c \r\n",
system_m >> 8, system_m & 0xFF, system_n >> 8, system_n & 0xFF);
return 0;
}
Program output:
System Number: ABC-

You probably meant to write
unsigned char num[3];
As you have it, you declare an array holding three unsigned char* pointers.
Also don't forget to set the terminating NUL character before printing:
num[2] = '\0';

A general solution might be something like this, assuming ushort_arr contains the unsigned shorts in an array and ushort_arr_size indicates its size.
char *str = malloc(ushort_arr_size * 2 + 1);
// check if str == NULL
int j = 0;
for (int i = 0; i < ushort_arr_size; i++) {
str[j++] = ushort_arr[i] >> 8;
str[j++] = ushort_arr[i];
}
str[j] = '\0';
printf("string is: %s\n", str);
free(str);
Though this might be more efficient without the memory management, if you only want to print it once:
for (int i = 0; i < ushort_arr_size; i++) {
    putchar(ushort_arr[i] >> 8);
    putchar(ushort_arr[i] & 0xFF);
}
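
If C++ is available, the same unpacking can be sketched with std::string, which removes the manual malloc and free. The name ushorts_to_string is made up for this illustration, and it assumes each unsigned short carries two characters with the high byte first, as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: unpack big-endian character pairs from an array of
// unsigned short into a std::string (high byte first, then low byte).
static std::string ushorts_to_string(const unsigned short* words, std::size_t count)
{
    assert(words != 0 || count == 0);
    std::string s;
    s.reserve(count * 2);
    for (std::size_t i = 0; i < count; ++i) {
        s += static_cast<char>((words[i] >> 8) & 0xFF);
        s += static_cast<char>(words[i] & 0xFF);
    }
    return s;
}
```

The result can be passed to printf with %s through c_str(), and the original unsigned short values are left untouched.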

CRC24Q implementation

I am trying to implement the algorithm of a CRC check, which basically creates a value based on an input message.
So, consider I have a hex message 3F214365876616AB15387D5D59, and I want to obtain the CRC24Q value of the message.
The algorithm that I found to do this is the following:
typedef unsigned long crc24;
crc24 crc_check(unsigned char *input) {
unsigned char *octets;
crc24 crc = 0xb704ce; // CRC24_INIT;
int i;
int len = strlen(input);
octets = input;
while (len--) {
crc ^= ((*octets++) << 16);
for (i = 0; i < 8; i++) {
crc <<= 1;
if (crc & 0x1000000)
crc ^= CRC24_POLY;
}
}
return crc & 0xFFFFFF;
}
where *input=3F214365876616AB15387D5D59.
The problem is that ((*octets++) << 16) will shift by 16 bits the ascii value of the hex character and not the character itself.
So, I made a function to convert the hex numbers to characters.
I know the implementation looks weird, and I wouldn't be surprised if it were wrong.
This is the convert function:
char* convert(unsigned char* message) {
unsigned char* input;
input = message;
int p;
char *xxxx[20];
xxxx[0]="";
for (p = 0; p < length(message) - 1; p = p + 2) {
char* pp[20];
pp[0] = input[0];
char *c[20];
*input++;
c[0]= input[0];
*input++;
strcat(pp,c);
char cc;
char tt[2];
cc = (char ) strtol(pp, &pp, 16);
tt[0]=cc;
strcat(xxxx,tt);
}
return xxxx;
}
So:
unsigned char *msg_hex="3F214365876616AB15387D5D59";
crc_sum = crc_check(convert((msg_hex)));
printf("CRC-sum: %x\n", crc_sum);
Thank you very much for any suggestions.

Shouldn't the if (crc & 0x8000000) be if (crc & 0x1000000)? Otherwise you're testing the 28th bit, not the 25th, for 24-bit overflow.
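
Apart from the overflow bit, the conversion step can be sketched much more simply: decode the hex text into raw bytes first, then run the CRC over those bytes with an explicit length. An explicit length matters because strlen would stop at the first 0x00 byte in binary data. The names hex_to_octets and crc24 are made up for this illustration. The polynomial and initial value are left as parameters because they depend on the variant. 0x1864CFB with initial value 0xB704CE is what OpenPGP's CRC-24 uses, while CRC-24Q is usually described with the same polynomial and an initial value of 0, so check both against your specification.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode pairs of hex digits into raw bytes.
// The input is assumed to be valid hex with an even number of digits.
static std::vector<unsigned char> hex_to_octets(const std::string& hex)
{
    assert(hex.size() % 2 == 0);
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        unsigned v = 0;
        for (std::size_t k = 0; k < 2; ++k) {
            char c = hex[i + k];
            unsigned d;
            if (c >= '0' && c <= '9') d = c - '0';
            else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
            else { assert(c >= 'A' && c <= 'F'); d = c - 'A' + 10; }
            v = (v << 4) | d;
        }
        out.push_back(static_cast<unsigned char>(v));
    }
    return out;
}

// Bitwise CRC-24 over raw bytes, using the same loop as the question's
// crc_check. poly must include bit 24 (e.g. 0x1864CFB) so that the XOR
// clears the overflow bit.
static unsigned long crc24(const std::vector<unsigned char>& data,
                           unsigned long poly, unsigned long init)
{
    unsigned long crc = init;
    for (std::size_t n = 0; n < data.size(); ++n) {
        crc ^= static_cast<unsigned long>(data[n]) << 16;
        for (int i = 0; i < 8; ++i) {
            crc <<= 1;
            if (crc & 0x1000000UL) crc ^= poly;
        }
    }
    return crc & 0xFFFFFFUL;
}
```

A call would look like crc24(hex_to_octets("3F214365876616AB15387D5D59"), 0x1864CFBUL, init), with init chosen for the variant in use. One way to sanity-check an implementation of this form is that appending the three CRC bytes (high byte first) to the message and running the CRC again should give 0.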
