Today I tried to convert a hex string to an unsigned char[]:
string test = "fe5f0c";
unsigned char* uchar = (unsigned char *)test.c_str();
cout << uchar << endl;
This resulted in the output of
fe5f0c
hrmpf :-(. The desired behaviour would be as follows:
unsigned char caTest[3];
caTest[0] = (unsigned char)0xfe;
caTest[1] = (unsigned char)0x5f;
caTest[2] = (unsigned char)0x0c;
cout << caTest << endl;
which prints unreadable ASCII characters. As so often, I am doing something wrong ^^. I would appreciate any suggestions.
Thanks in advance
Sure, you just have to isolate the bits you are interested in after parsing:
#include <string>
#include <cstdlib>
#include <iostream>
typedef unsigned char byte;
int main()
{
    std::string test = "40414243";
    unsigned long x = strtoul(test.c_str(), 0, 16);
    byte a[] = {byte(x >> 24), byte(x >> 16), byte(x >> 8), byte(x), 0};
    std::cout << a << std::endl;
}
Note that I changed the input string to an eight-digit number, since otherwise the array would start with the value 0, and operator<< would interpret that as the end of the string and you wouldn't be able to see anything.
"fe5f0c" is a string of 6 bytes (7 containing the null terminator). If you looked at it as an array you would see:
char str[] = { 102, 101, 53, 102, 48, 99 };
But you want
unsigned char str[] = { 0xfe, 0x5f, 0x0c };
The former is a "human readable" representation whereas the latter is "machine readable" numbers. If you want to convert between them, you need to do so explicitly using code similar to what @Fred wrote.
Casting (most of the time) does not imply a conversion, you just tell the compiler to trust you and that it can forget what it thinks it knows about the expression you're casting.
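For illustration, here is a minimal sketch of such an explicit conversion (my own example, not from the answers above; it assumes the input is a valid, even-length hex string):

#include <iostream>
#include <string>
#include <vector>

// Convert each pair of hex digits into one byte.
// Assumes a valid hex string of even length.
std::vector<unsigned char> hexToBytes(const std::string& hex)
{
    std::vector<unsigned char> bytes;
    for (std::string::size_type i = 0; i + 1 < hex.size(); i += 2)
    {
        // Parse the two-character substring as a base-16 number.
        bytes.push_back(static_cast<unsigned char>(
            std::stoul(hex.substr(i, 2), nullptr, 16)));
    }
    return bytes;
}

int main()
{
    for (unsigned char b : hexToBytes("fe5f0c"))
        std::cout << std::hex << static_cast<int>(b) << ' '; // prints: fe 5f c
    std::cout << '\n';
}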
Here is a simpler way for hexadecimal string literals:
const unsigned char *uchar = reinterpret_cast<const unsigned char *>("\xfe\x5f\x0c");
(A string literal is an array of const char, hence the cast.)
Related
I have a simple program converting a dynamic char array to its hex string representation.
#include <iostream>
#include <sstream>
#include <iomanip>
#include <string>
using namespace std;
int main(int argc, char const* argv[]) {
    int length = 2;
    char *buf = new char[length];
    buf[0] = 0xFC;
    buf[1] = 0x01;
    stringstream ss;
    ss << hex << setfill('0');
    for(int i = 0; i < length; ++i) {
        ss << std::hex << std::setfill('0') << std::setw(2) << (int) buf[i] << " ";
    }
    string mystr = ss.str();
    cout << mystr << endl;
}
Output:
fffffffc 01
Expected output:
fc 01
Why is this happening? What are those ffffff before fc? This happens only on certain bytes, as you can see the 0x01 is formatted correctly.
Three things you need to know to understand what's happening:
The first is that char can be either signed or unsigned; it's implementation (compiler) specific.
The second is that when converting a small signed type to a larger signed type (e.g. a signed char to an int), the value is sign-extended.
The third is how negative values are stored in the common two's complement system, where the highest bit of a value determines whether it is negative (bit set) or not (bit clear).
What happens here is that char is evidently signed on your platform: 0xfc stored in a char is a negative value, and when you convert it to an int it is sign-extended to 0xfffffffc.
To solve it, use unsigned char explicitly and convert to unsigned int.
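A minimal sketch of that fix, applied to the code from the question (the buffer is now unsigned char, so no sign extension can occur):

#include <iostream>
#include <sstream>
#include <iomanip>

int main() {
    const int length = 2;
    unsigned char buf[length] = { 0xFC, 0x01 }; // unsigned: 0xFC stays 252
    std::stringstream ss;
    ss << std::hex << std::setfill('0');
    for (int i = 0; i < length; ++i) {
        // unsigned char -> unsigned int zero-extends, giving fc, not fffffffc
        ss << std::setw(2) << static_cast<unsigned int>(buf[i]) << " ";
    }
    std::cout << ss.str() << std::endl; // prints: fc 01
}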
This is called "sign extension".
char is evidently a signed type on your platform, so 0xfc becomes a negative value when you force it into a char.
Its decimal value is -4.
When you cast it to int, the sign bit is extended to preserve that value.
(That happens here: (int) buf[i].)
On your system, int is 4 bytes, so you get the extra bytes filled with ff.
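If you would rather keep the buffer as plain char, an equivalent fix (my suggestion, not from the answer above) is to mask off everything but the low byte after the cast:

ss << std::setw(2) << ((int) buf[i] & 0xff) << " "; // 0xfffffffc & 0xff == 0xfc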
I would like to initialize an unsigned char array with 16 hex values. However, I don't seem to know how to properly initialize/access those values. When I try to access them as I might want to intuitively, I'm getting no value at all.
This is my output
The program was run with the following command: 4
Please be a value! -----> p
Here's some plaintext
when run with the code below -
#include <cstdio>
#include <iostream>
#include <string>

int main(int argc, char** argv)
{
    int n;
    if (argc > 1) {
        n = std::stoi(argv[1]);
    } else {
        std::cerr << "Not enough arguments\n";
        return 1;
    }
    char buff[100];
    sprintf(buff, "The program was run with the following command: %d", n);
    std::cout << buff << std::endl;
    unsigned char plaintext[16] =
        {0x0f, 0xb0, 0xc0, 0x0f,
         0xa0, 0xa0, 0xa0, 0xa0,
         0x00, 0x00, 0xa0, 0xa0,
         0x00, 0x00, 0x00, 0x00};
    unsigned char test = plaintext[1] ^ plaintext[2];
    std::cout << "Please be a value! -----> " << test << std::endl;
    std::cout << "Here's some plaintext " << plaintext[3] << std::endl;
    return 0;
}
By way of context, this is part of a group project for school. We are ultimately trying to implement the Serpent cipher, but keep on getting tripped up by unsigned char arrays. Our project specification says that we must have two functions that take what would be Byte arrays in Java. I assume the closest relative in C++ is an unsigned char[]. Otherwise I would use vector. Elsewhere in the code I've implemented a setKey function which takes an unsigned char array, packs its values into 4 long long ints (the key needs to be 256 bits) and performs various bit-shifting and xor operations on those ints to generate the keys necessary for the cryptographic algorithm. Hope that's enough background on what I'm looking to do. I'm guessing I'm just overlooking some basic C++ functionality here. Thanks for any and all help!
A char is an 8-bit value; when signed it can store -128 <= n <= +127. It is frequently used to store character representations in different encodings; commonly, in Western, Roman-alphabet installations, char holds ASCII or UTF-8 encoded values. 'Encoded' means the symbols/letters in the character set have been assigned numeric values. Think of the periodic table as an encoding of elements, so that 'H' (Hydrogen) is encoded as 1 and Germanium as 32. In the ASCII (and UTF-8) tables, position 32 represents the character we call "space".
When you use operator << on a char value, the default behavior is to assume you are passing it a character encoding, e.g. an ASCII character code. If you do
char c = 'z';
char d = 122;
char e = 0x7A;
char f = '\x7a';
std::cout << c << d << e << f << "\n";
All four assignments are equivalent. 'z' is syntactic sugar for char(122), 0x7A is hex for 122, and '\x7a' is an escape that forms the character with the value 0x7a, i.e. 122, i.e. 'z'.
Where many new programmers go wrong is that they do this:
char n = 8;
std::cout << n << std::endl;
This does not print "8"; it prints the ASCII character at position 8 in the ASCII table, the backspace control character.
Think for a moment:
char n = 8; // stores the value 8
char n = 'a'; // what does this store?
char n = '8'; // why is this different than the first line?
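To make the difference concrete, a small illustration (my own example; the first output line depends on how your terminal renders control characters):

#include <iostream>

int main()
{
    char a = 8;   // the backspace control character
    char b = '8'; // the digit '8', i.e. the value 56 in ASCII
    std::cout << a << ' ' << b << '\n';             // a control char, then 8
    std::cout << (int) a << ' ' << (int) b << '\n'; // prints: 8 56
}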
Let's rewind for a moment: when you store 120 in a variable, it can represent the ASCII character 'x', but ultimately what is stored is just the numeric value 120, plain and simple.
Specifically: when you pass 122 to a function that will ultimately use it to look up a font entry from a character set using the Latin-1, ISO-8859-1, UTF-8 or similar encodings, then 122 means 'z'.
At the end of the day, char is just one of the standard integer types; it can store values -128 <= n <= +127 (when signed), and it can trivially be promoted to a short, int, long or long long.
While it is generally used to denote characters, it also frequently gets used as a way of saying "I'm only storing very small values" (such as integer percentages).
int incoming = 5000;
int outgoing = 4000;
char percent = char(outgoing * 100 / incoming);
If you want to print the numeric value, you simply need to promote it to a different value type:
std::cout << (unsigned int)test << "\n";
std::cout << unsigned(test) << "\n"; // a functional-style cast needs a single-word type name
or the preferred C++ way
std::cout << static_cast<unsigned int>(test) << "\n";
I think (it's not completely clear what you are asking) that the answer is as simple as this
std::cout << "Please be a value! -----> " << static_cast<unsigned>(test) << std::endl;
If you want to output the numeric value of a char or unsigned char, you have to cast it to an int or unsigned first.
Not surprisingly, by default, chars are output as characters not integers.
BTW this funky code
char buff[100];
sprintf(buff,"The program was run with the following command: %d",n);
std::cout << buff << std::endl;
is more simply written as
std::cout << "The program was run with the following command: " << n << std::endl;
std::cout and std::cin always treat a char variable as a character.
If you want to input or output it as an int, you must do the conversion manually, as below.
int int_var;
char c;
std::cin >> int_var;  // read the number as an int
c = int_var;          // then store it in the char
std::cout << (int) c; // print the char's numeric value
If you use scanf or printf, there is no such problem, as the format specifier ("%d", "%c", "%s") tells it how to convert the argument (integer, char, string).
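For example (a small sketch of my own showing how the specifier picks the conversion):

#include <cstdio>

int main()
{
    char c = 56;            // the ASCII code of '8'
    std::printf("%c\n", c); // prints: 8   (converted as a character)
    std::printf("%d\n", c); // prints: 56  (converted as an integer)
}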
I'd like to take the next two hex characters from a stream and store the associated hex-to-decimal numeric value in a char.
So if an input file contains 2a3123, I'd like to grab 2a, and store the numeric value (decimal 42) in a char.
I've tried
char c;
instream >> std::setw(2) >> std::hex >> c;
but this gives me garbage (if I replace c with an int, I get the maximum value for signed int).
Any help would be greatly appreciated! Thanks!
edit: I should note that the characters are guaranteed to be within the proper range for chars and that the file is valid hexadecimal.
OK, I think decoding ASCII by hand is a bad idea altogether and does not really answer the question.
I think your code does not work because setw() or istream::width() only takes effect when you read into a std::string or char*; I gather this from the reference documentation.
However, you can use the goodness of the standard C++ iostream converters. I came up with an idea that uses the stringstream class and a string as a buffer: read n characters into the buffer, then use the stringstream as a conversion facility.
I am not sure if this is the most optimal version. Probably not.
Code:
#include <iostream>
#include <sstream>
#include <string>

int main(void){
    int c;
    std::string buff;
    std::stringstream ss_buff;
    std::cin.width(2);
    std::cin >> buff;
    ss_buff << buff;
    ss_buff >> std::hex >> c;
    std::cout << "read val: " << c << '\n';
}
Result:
luk32@genaker:~/projects/tmp$ ./a.out
0a10
read val: 10
luk32@genaker:~/projects/tmp$ ./a.out
10a2
read val: 16
luk32@genaker:~/projects/tmp$ ./a.out
bv00
read val: 11
luk32@genaker:~/projects/tmp$ ./a.out
bc01
read val: 188
luk32@genaker:~/projects/tmp$ ./a.out
01bc
read val: 1
And as you can see, it is not very error-resistant. Nonetheless, it works for the given conditions, can be expanded into a loop, and most importantly uses the iostream conversion facilities, so no ASCII magic on your side. The C/ASCII approach would probably be way faster, though.
PS. Improved version below. It uses a plain char buffer and non-formatted write/read to move the data through the buffer (get/write as opposed to operator<</operator>>). The rationale is simple: we do not need any fanciness to move 2 bytes of data. We do, however, use the formatted extractor to do the conversion. I made it a loop version for convenience. It was not super simple, though; it took me a good 40 minutes of fooling around to figure out two very important lines. Without them, the extraction only works for the first 2 characters.
#include <iostream>
#include <sstream>

int main(void){
    int c;
    char* buff = new char[3];
    std::stringstream ss_buff;
    std::cout << "read vals: ";
    while( std::cin.get(buff, 3).gcount() == 2 ){
        std::cout << '(' << buff << ") ";
        ss_buff.seekp(0); // VERY important lines: rewind the internal
        ss_buff.seekg(0); // put and get positions before reusing the stream
        ss_buff.write(buff, 2);
        if( ss_buff.fail() ){ std::cout << "error\n"; break; }
        std::cout << ss_buff.str() << ' ';
        ss_buff >> std::hex >> c;
        std::cout << c << '\n';
    }
    std::cout << '\n';
    delete [] buff;
}
Sample output:
luk32@genaker:~/projects/tmp$ ./a.out
read vals: 0aabffc
(0a) 0a 10
(ab) ab 171
(ff) ff 255
Please note the trailing c was not read, as intended.
I found everything needed here http://www.cplusplus.com/reference/iostream/
You can cast a char to an int and the int will hold the ASCII value of the char. For example, '0' will be 48 and '5' will be 53. The letters occur higher up, so 'a' will be cast to 97, 'b' to 98, etc. Knowing this, you can take the int value and subtract 48; if the result is greater than 9, subtract another 39. Then char '0' will have been turned into int 0, char '1' into int 1, all the way up to char 'a' becoming int 10, char 'b' int 11, etc.
Next you need to multiply the value of the first digit by 16 and add it to the second, to account for its position. Using your example of 2a:
char 2 casts to int 50. Subtract 48 and get 2. Multiply by 16 and get 32.
char a casts to int 97. Subtract 48 and get 49; this is higher than 9, so subtract another 39 and get 10. Add this to the result of the first digit (32) and you get 42.
Here is the code:
int Convert(int in); // forward declaration, so HexToInt compiles

int HexToInt(char hi, char low)
{
    int retVal = 0;
    int hiBits = (int)hi;
    int loBits = (int)low;
    retVal = Convert(hiBits) * 16 + Convert(loBits);
    return retVal;
}

int Convert(int in)
{
    int retVal = in - 48;
    // If it was not a digit (i.e. 'A'-'F'), skip the gap between '9' and 'A'
    if(retVal > 10)
        retVal = retVal - 7;
    // If it was not an upper-case hex digit (i.e. 'a'-'f'), skip down by the case offset
    if(retVal > 15)
        retVal = retVal - 32;
    return retVal;
}
The first function can actually be written as one line thus:
int HexToInt(char hi, char low)
{
    return Convert((int)hi) * 16 + Convert((int)low);
}
NOTE: This accounts for both upper- and lower-case letters, but only works on systems that use ASCII, i.e. not IBM EBCDIC-based systems.
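For example, a possible driver (my addition, not part of the original answer), reusing the question's 2a3123 input:

#include <iostream>
#include <string>

int Convert(int in);             // as defined above
int HexToInt(char hi, char low); // as defined above

int main()
{
    std::string input = "2a3123";
    for (std::string::size_type i = 0; i + 1 < input.size(); i += 2)
        std::cout << HexToInt(input[i], input[i + 1]) << ' '; // prints: 42 49 35
    std::cout << '\n';
}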
I have a long long holding ASCII hex values and want to convert it to a string. I have this code:
char myBuffer[8];
long long myLongLong = 0x7177657274797569;
sprintf(myBuffer,"%c%c%c%c%c%c%c%c",myLongLong);
int x;
cout << myBuffer;
cin >> x;
return 0;
The decoded string should be "qwertyui", but it always prints something else.
I tried %c, %s, and %X, but none gives the output I need; the closest was %c, but it prints only one char.
That code is wrong in so many ways I don't know where to start...
myBuffer is too small to hold the 8 chars plus the NUL terminator, i.e. it should be myBuffer[9].
sprintf is expecting 8 arguments but you're only passing 1; the other required arguments will be whatever happens to be on the stack.
myLongLong is not a char.
You don't take endianness into account.
You're using C functions and doing things in a C way in C++. Why don't you use std::strings as opposed to C-style strings and stringstreams as an alternative to sprintf?
The closest almost working example of what you want, as similar to your example, is something like:
#include <cstdio>
#include <iostream>
using namespace std;
int main(void)
{
    char myBuffer[9];
    long long myLongLong = 0x7177657274797569;
    char *c_ptr = (char*)&myLongLong;
    sprintf(myBuffer, "%c%c%c%c%c%c%c%c",
            c_ptr[0], c_ptr[1], c_ptr[2], c_ptr[3],
            c_ptr[4], c_ptr[5], c_ptr[6], c_ptr[7]);
    int x;
    cout << myBuffer;
    cin >> x;
    return 0;
}
This will output "iuytrewq" on my little-endian machine. As I mentioned, it doesn't take endianness into account; if the machine is little-endian, you can read/print the bytes in reverse.
I really don't understand why you're trying to do this though...
You could try
union { char buf[8]; long long num; } u;
u.num = 0x7177657274797569LL;
cout.write(u.buf, 8); // buf is not null-terminated, so write exactly 8 bytes
cout << endl;
But I don't really understand what you want to do. What about endianness?
Use a string stream
long long myLongLong = 0x7177657274797569;
std::stringstream ss;
ss << std::hex << myLongLong;
std::cout << ss.str() << std::endl;
You want to print each byte of the long long as an ASCII char?
Then you need to loop over the long long, extracting one byte at a time; look at bit shifts and masking.
Hint: it's generally easier (if you know the length) to work from the last byte and shift right.
Or you could just memcpy the long long into the char array, except for any byte-ordering issues.
Try the following code.
#include <iostream>
using namespace std;
int main(void)
{
    char myBuffer[9]; // 8 chars plus a null terminator
    long long myLongLong = 0x7177657274797569;
    for(int i = 0; i < 8; i++)
    {
        myBuffer[i] = myLongLong >> (64 - (i + 1) * 8);
    }
    myBuffer[8] = '\0';
    cout << myBuffer << endl;
    return 0;
}
I have a problem which I do not understand. I add characters to a standard string, but when I take them out the values printed are not what I expected.
#include <iostream>
#include <string>
using namespace std;

int main (int argc, char *argv[])
{
    string x;
    unsigned char y = 0x89, z = 0x76;
    x += y;
    x += z;
    cout << hex << (int) x[0] << " " << (int) x[1] << endl;
}
The output:
ffffff89 76
What I expected:
89 76
Any ideas as what is happening here?
And how do I fix it?
The string's operator[] yields a char, i.e. a signed value on your platform. When you cast this to an int for output, it will be a signed value as well.
The input value, stored in a char, is negative, and therefore the int will be too. Thus you see the output you described.
Most likely char is signed on your platform, therefore 0x89 and 0x76 become negative when represented as char.
You have to make sure that the string has unsigned char as its value_type, so this should work:
typedef basic_string<unsigned char> ustring; //string of unsigned char!
ustring ux;
ux += y;
ux += z;
cout << hex << (int) ux[0] << " " <<(int) ux[1]<< endl;
It prints what you think should print:
89 76
Online demo : http://www.ideone.com/HLvcv
You have to account for the fact that char may be signed. If you promote it to int directly, the signed value will be preserved. Instead, first convert it to the unsigned type of the same width (i.e. unsigned char) to get the desired value, and then promote that value to an integer type to get correct formatted printing.
Putting it all together, you want something like this:
std::cout << (int)(unsigned char)(x[0]);
Or, using the C++-style cast:
std::cout << static_cast<int>(static_cast<unsigned char>(x[0]));
The number 0x89 is 137 in the decimal system. It exceeds the signed char cap of 127 and becomes a negative number, which is why you see those ffffff. You can simply insert an (unsigned char) cast after the (int) cast and you will get the required result.
-Sandip