Convert hexadecimal string to binary and separate into bits in C++

I need to convert a hexadecimal string to binary and then pass the bits into different variables.
For example, my input is:
std::string hex = "E136";
How do I convert the string into the binary output 1110 0001 0011 0110?
After that I need to pass bit 0 to variable A, bits 1-9 to variable B, and bits 10-15 to variable C.
Thanks in advance

How do I convert the string [...]?
Start with a result value of zero, then for each character (starting at the first, i.e. the most significant one) determine its value (in the range [0:15]), multiply the result accumulated so far by 16, and add the current value. For your given example, this results in
(((0 * 16 + v('E')) * 16 + v('1')) * 16 + v('3')) * 16 + v('6')
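In code, that accumulation might look like the following minimal sketch; the digit-value helper v() is written out here as an assumption (any equivalent digit-to-value mapping would do):
#include <iostream>
#include <string>
// value of one hex digit; assumes valid input ('0'-'9', 'a'-'f', 'A'-'F')
unsigned v(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}
int main()
{
    std::string hex = "E136";
    unsigned long value = 0;
    for (char c : hex)
        value = value * 16 + v(c); // accumulate digit by digit
    std::cout << value << '\n';    // 57654
}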
There are standard library functions that do this for you, such as std::strtoul:
char* end;
unsigned long value = strtoul(hex.c_str(), &end, 16);
// ^^ base!
The end pointer is useful to check whether you have read the entire string:
if(*end == 0)
{
// end of string reached
}
else
{
// some part of the string was left over; you might consider this
// an error (could occur if e.g. "f10s12" was passed – then
// end would point to the 's')
}
If you don't care about end checking, you can just pass nullptr instead.
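Putting the pieces together, a self-contained sketch of the strtoul call with the end check could look like this (nothing beyond the standard library is assumed):
#include <cstdlib>
#include <iostream>
#include <string>
int main()
{
    std::string hex = "E136";
    char* end = nullptr;
    unsigned long value = std::strtoul(hex.c_str(), &end, 16); // base 16!
    if (*end == '\0')
        std::cout << "parsed: " << value << '\n';      // parsed: 57654
    else
        std::cout << "unparsed rest: " << end << '\n'; // e.g. "s12" for "f10s12"
}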
Don't convert back to a string afterwards; you can get the required values by masking (&) and bit-shifting (>>), e.g. getting bits [1-9]:
uint32_t b = value >> 1 & 0x1ffU;
Working on integral types is much more efficient than working on strings. Only when you want to print the final result should you convert back to a string (if you use a std::ostream, operator<< already does that work for you...).

While playing with this sample, I realized that I gave a wrong recommendation:
std::setbase(2) does not work per the standard. Ouch! (SO: Why doesn't std::setbase(2) switch to binary output?)
For conversion of numbers to a string of binary digits, something else must be used. I made this small sample. Though the separation of bits is covered as well, my main focus was on output with different bases (and IMHO worth another answer):
#include <algorithm>
#include <cstdlib> // strtoul
#include <iomanip>
#include <iostream>
#include <string>
std::string bits(unsigned value, unsigned w)
{
std::string text;
for (unsigned i = 0; i < w || value; ++i) {
text += '0' + (value & 1); // bit -> character '0' or '1'
value >>= 1; // shift right one bit
}
// text is right to left -> must be reversed
std::reverse(text.begin(), text.end());
// done
return text;
}
void print(const char *name, unsigned value)
{
std::cout
<< name << ": "
// decimal output
<< std::setbase(10) << std::setw(5) << value
<< " = "
// binary output
#if 0 // OLD, WRONG:
// std::setbase(2) is not supported by standard - Ouch!
<< "0b" << std::setw(16) << std::setfill('0') << std::setbase(2) << value
#else // NEW:
<< "0b" << bits(value, 16)
#endif // 0
<< " = "
// hexadecimal output
<< "0x" << std::setw(4) << std::setfill('0') << std::setbase(16) << value
<< '\n';
}
int main()
{
std::string hex = "E136";
unsigned value = std::strtoul(hex.c_str(), nullptr, 16);
print("hex", value);
// bit 0 -> a
unsigned a = value & 0x0001;
// bit 1 ... 9 -> b
unsigned b = (value & 0x03FE) >> 1;
// bit 10 ... 15 -> c
unsigned c = (value & 0xFC00) >> 10;
// report
print(" a ", a);
print(" b ", b);
print(" c ", c);
// done
return 0;
}
Output:
hex: 57654 = 0b1110000100110110 = 0xe136
a : 00000 = 0b0000000000000000 = 0x0000
b : 00155 = 0b0000000010011011 = 0x009b
c : 00056 = 0b0000000000111000 = 0x0038
Live Demo on coliru
Concerning the bit operations:
The binary bitwise AND operator (&) is used to set all unintended bits to 0. The second value can be understood as a mask. It would be more obvious if I had used binary literals, but those only arrived with C++14 (0b...). Hex codes do nearly as well, since a hex digit always represents the same pattern of 4 bits (16 = 2^4). After some practice, you usually learn to "see" the bits in a hex code.
About the right shift (>>), I was not quite sure. The OP didn't require that the bits be moved anywhere – only that they be separated into distinct variables. So these right shifts might be unnecessary.
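To illustrate the difference, a small sketch: without the shift, the extracted bits keep their original positions; with the shift, b becomes a plain 0-based value:
#include <iostream>
int main()
{
    unsigned value = 0xE136;
    unsigned bInPlace = value & 0x03FE;        // bits 1..9, left where they are
    unsigned bShifted = (value & 0x03FE) >> 1; // bits 1..9, moved down to bit 0
    std::cout << std::hex << bInPlace << ' ' << bShifted << '\n'; // 136 9b
}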
So this question, which seemed to be trivial, led to a surprising enlightenment (for me).

Related

Unreal Engine “FParse::HexNumber” How is it working?

(0x8877665544332211 >> 8 & 0xffff)
I am trying to convert the result of this operation to hexadecimal, but I have not been successful.
I couldn’t use FParse::HexNumber.
https://docs.unrealengine.com/5.0/en-US/API/Runtime/Core/Misc/FParse/HexNumber64/
Has anyone tried or can help?
#include <iostream>
#include <sstream>
int main()
{
int i = (0x8877665544332211 >> 8 & 0xffff);
std::ostringstream ss;
ss << std::hex << i;
std::string result = ss.str();
std::cout << result << std::endl; // prints: 3322
return 0;
}
I wrote this code in standard C++; I couldn't run it in Unreal Engine.
I don't understand how FParse::HexNumber works.
Let's start with just 0x8877665544332211 >> 8. One hexadecimal digit represents 4 bits, so 8 bits is two hexadecimal digits. >> means shift right, so this much evaluates to 0x88776655443322.
That gives us: 0x88776655443322 & 0xffff. As above, one hex digit is 4 bits. 0xf means all four bits are set. So, 0xffff is 16 bits, all set. For some bit x, x & 1 = x, so this translates to keeping the 16 least significant bits of the left operand, which is: 0x3322
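A quick sketch verifying both steps in plain standard C++ (no Unreal Engine involved; the ULL suffix is needed because the constant doesn't fit into 32 bits):
#include <iostream>
int main()
{
    unsigned long long v = 0x8877665544332211ULL;
    std::cout << std::hex
              << (v >> 8) << '\n'             // 88776655443322
              << ((v >> 8) & 0xffff) << '\n'; // 3322
}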

Create and fill a 10 bits set from two 8 bits characters

We have 2 characters a and b of 8 bits each that we want to encode in a 10-bit set. We want to take the 8 bits of character a and put them in the first 8 bits of the 10-bit set, then take only the first 2 bits of character b to fill the rest.
QUESTION: Do I need to shift the 8 bits in order to concatenate the other 2?
// Online C++ compiler to run C++ program online
#include <iostream>
#include <bitset>
#include <cstdint> // uint16_t
struct uint10_t {
uint16_t value : 10;
uint16_t _ : 6;
};
uint10_t hash(char a, char b){
uint10_t hashed;
// Concatenate 2 bits to the other 8
hashed.value = (a << 8) + (b & 11000000);
return hashed;
}
int main() {
uint10_t hashed = hash('a', 'b');
std::bitset<10> newVal = hashed.value;
std::cout << newVal << " " << hashed.value << std::endl;
return 0;
}
Thanks #Scheff's Cat. My cat says Hi
Do I need to shift the 8 bits in order to concatenate the other 2?
Yes.
The bits of a have to be shifted left to make room for the two bits of b. As room is needed for two bits, a left shift by 2 is appropriate. (Before my recent update, there was a wrong left shift by 8 which I didn't notice. Shame on me.)
The bits of b have to be shifted right. The reason is that the OP wants to combine the two most significant bits of b with those of a. As these two bits have to appear as the least significant bits in the result, they have to be shifted to that position.
It should be:
hashed.value = (a << 2) + ((b & 0xc0) >> 6);
or
hashed.value = (a << 2) + ((b & 0b11000000) >> 6);
As b is of type char (which is signed or unsigned depending on the compiler), it is even better to swap the order of & and >>:
hashed.value = (a << 2) + ((b >> 6) & 0x03);
or
hashed.value = (a << 2) + ((b >> 6) & 0b11);
This ensures that any possible sign-bit extension is eliminated, which may occur if char is a signed type in the specific compiler and b has a negative value (i.e. the most significant bit is set and is replicated in the conversion to int).
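A minimal sketch of that pitfall, assuming a platform where char is signed (shifting a negative value right is implementation-defined before C++20, but an arithmetic shift is the common behavior):
#include <iostream>
int main()
{
    char b = '\xC0';              // bit pattern 1100 0000 -> negative if char is signed
    int naive  = b >> 6;          // sign bits are shifted in: -1 on typical platforms
    int masked = (b >> 6) & 0b11; // the mask removes the extended sign bits: 3
    std::cout << naive << ' ' << masked << '\n'; // -1 3
}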
MCVE on coliru:
#include <iostream>
#include <bitset>
#include <cstdint> // uint16_t
struct uint10_t {
uint16_t value : 10;
uint16_t _ : 6;
};
uint10_t hash(char a, char b){
uint10_t hashed;
// Concatenate 2 bits to the other 8
hashed.value = (a << 2) + ((b >> 6) & 0b11);
return hashed;
}
int main() {
uint10_t hashed = hash('a', 'b');
std::cout << "a: " << std::bitset<8>('a') << '\n';
std::cout << "b: " << std::bitset<8>('b') << '\n';
std::bitset<10> newVal = hashed.value;
std::cout << " " << newVal << " " << hashed.value << std::endl;
}
Output:
a: 01100001
b: 01100010
0110000101 389
One may wonder why the two upper bits of a are not lost although a is of type char, which is usually an 8-bit type. The reason is that integral arithmetic operations work on at least int. Hence, a << 2 involves an implicit conversion of a to int, which has at least 16 bits.
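A small sketch making the promotion visible; decltype confirms that the shift operates on int:
#include <iostream>
#include <type_traits>
int main()
{
    char a = 'a';          // 0110 0001
    auto shifted = a << 2; // a is promoted to int before the shift
    static_assert(std::is_same<decltype(shifted), int>::value,
                  "shift yields int, not char");
    std::cout << shifted << '\n'; // 388 -- the upper bits survived
}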

Totally unrelated result when multiplying two integers?

#include <iostream>
using std::cout;
using std::endl;
int main(void) {
std::string fx = "6x^2+6x+4";
int part1 = fx[0] * fx[3];
cout << fx[0] << endl;
cout << fx[3] << endl;
cout << part1;
}
So I have this string, and fx[0] and fx[3] are obviously integers: when I print them to the console they print out just fine; however, part1 (their product) equals some totally unrelated number? Can anyone help?
Here is the output:
6
2
2700
Your fx[0] and fx[3] variables are of type char (which is an integer type in C++). However, the actual values in those two elements of your fx string will be representations of the digits, 6 and 2, not the numerical values of those digits.
Very often, those representations will be ASCII codes (but that's not required); however, what is required is that the representations of the digits 0 through 9 have contiguous, sequential values. Thus, by subtracting the value of the digit 0, we can convert them to their numeric values.
In your case, the following line will do the conversion:
int part1 = (fx[0]-'0') * (fx[3]-'0');
The reason why you see the correct values when printing fx[0] and fx[3] is because the version of the cout << operator that takes a char argument is designed to print the represented character (not its 'ASCII' code); however, the cout << operator for an int type (like your part1) will print the actual value represented internally. (Try changing one of your lines to cout << (int)fx[0] << endl; to see the difference.)
P.S. Don't forget the #include <string> header – some implementations do that implicitly inside the <iostream> header, but don't rely on that!
Well, first of all, string::operator[] returns a char... then, a char can be cast to an int, and the cast yields the character's code in the ASCII table (in your case).
In ASCII, the codes of '6' and '2' are respectively 54 and 50 (you can check it here, for example)... so your program is taking the two chars, casting them to int, and multiplying them (54 * 50 = 2700).
If you need to interpret those as the integer value they represent, you can check this answer:
int val = '6' - '0'; // val == 6
Characters are values representing glyphs from some character set, usually the ASCII table. The numeric value of a character is not the same as the glyph that is printed on the screen. To convert a numeric-looking char to the actual 0-based numeric value, subtract '0' from your char value.
(fx[3]-'0') will be the numeric value of the character at position 3.
You are multiplying character types, so the characters '6' and '2' are converted to their integer values, 54 and 50 respectively, and then the multiplication is applied. This follows the C++ type-conversion rules, and you get 2700. Try the modified sample code:
#include <iostream>
using std::cout;
using std::endl;
int main(void) {
std::string fx = "6x^2+6x+4";
int part1 = fx[0] * fx[3];
cout << fx[0] << endl;
cout << fx[3] << endl;
cout << part1;
cout << std::endl;
cout << (int)fx[0] << " " << (int)fx[3] << std::endl;
}
And the results
6
2
2700
54 50

Two bytes into one

First off, I apologize if this is a duplicate; but my Google-fu seems to be failing me today.
I'm in the middle of writing an image format module for Photoshop, and one of the save options for this format includes a 4-bit alpha channel. Of course, the data I have to convert is 8-bit/1-byte alpha – so I essentially need to take every two bytes of alpha and merge them into one.
My attempt (below), I believe, has a lot of room for improvement:
for(int x=0,w=0;x < alphaData.size();x+=2,w++)
{
short ashort=(alphaData[x] << 8)+alphaData[x+1];
alphaFinal[w]=(unsigned char)ashort;
}
alphaData and alphaFinal are vectors that contain the 8-bit alpha data and the 4-bit alpha data, respectively. I realize that reducing two bytes into the value of one is bound to result in loss of "resolution", but I can't help but think there's a better way of doing this.
For extra information, here's the loop that does the reverse (converts 4-bit alpha from the format to 8-bit for Photoshop)
alphaData serves the same purpose as above, and imgData is an unsigned char vector that holds the raw image data. (alpha data is tacked on after the actual rgb data for the image in this particular variant of the format)
for(int b=alphaOffset,x2=0;b < (alphaOffset+dataLength); b++,x2+=2)
{
unsigned char lo = (imgData[b] & 15);
unsigned char hi = ((imgData[b] >> 4) & 15);
alphaData[x2]=lo*17;
alphaData[x2+1]=hi*17;
}
Are you sure that it's
alphaData[x2]=lo*17;
alphaData[x2+1]=hi*17;
and not
alphaData[x2]=lo*16;
alphaData[x2+1]=hi*16;
In any case, to generate the values that work with the decoding function you have posted, you just have to reverse the operations. So multiplying by 17 becomes dividing by 17 and the shifts and masks get reordered to look like this:
for(int x=0,w=0;x < alphaData.size();x+=2,w++)
{
unsigned char alpha1 = alphaData[x] / 17;
unsigned char alpha2 = alphaData[x+1] / 17;
assert(alpha1 < 16 && alpha2 < 16); // needs <cassert>
alphaFinal[w]=(alpha2 << 4) | alpha1;
}
short ashort=(alphaData[x] << 8)+alphaData[x+1];
alphaFinal[w]=(unsigned char)ashort;
You're actually losing alphaData[x] in alphaFinal: you shift alphaData[x] left by 8 bits, and the final cast to unsigned char then keeps only the low 8 bits (i.e. alphaData[x+1]).
Also, your for loop is unsafe: if for some reason alphaData.size() is odd, you'll run out of range.
What you want to do, I think, is to truncate an 8-bit value into a 4-bit one, not to combine two 8-bit values. In other words, you want to drop the four least significant bits of each alpha value, not combine two different alpha values.
So, basically, you want to right-shift by 4.
output = (input >> 4); /* truncate four bits */
In case you're not familiar with binary shifts, take this random 8-bit number:
10110110
>> 1
= 01011011
>> 1
= 00101101
>> 1
= 00010110
>> 1
= 00001011
so,
10110110
>> 4
= 00001011
and to reverse, left-shift instead...
input = (output << 4); /* expand four bits */
which, using the result from that same random 8-bit number as before, would be
00001011
<< 4
= 10110000
Obviously, as you noted, 4 bits of precision are lost. But you'd be surprised how little it's noticed in a fully-composited work.
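A tiny round-trip sketch showing exactly what is lost (the value 0xB7 is an arbitrary example):
#include <iostream>
int main()
{
    unsigned in  = 0xB7;     // original 8-bit alpha
    unsigned enc = in >> 4;  // truncated to 4 bits: 0xb
    unsigned dec = enc << 4; // expanded back: 0xb0 -- the low nibble is gone
    std::cout << std::hex << in << " -> " << enc << " -> " << dec << '\n'; // b7 -> b -> b0
}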
This code
for(int x=0,w=0;x < alphaData.size();x+=2,w++)
{
short ashort=(alphaData[x] << 8)+alphaData[x+1];
alphaFinal[w]=(unsigned char)ashort;
}
is broken. Given:
#include <iostream>
using std::cout;
using std::endl;
typedef unsigned char uchar;
int main() {
uchar x0 = 1; // for alphaData[x]
uchar x1 = 2; // for alphaData[x+1]
short ashort = (x0 << 8) + x1; // The value 0x0102
uchar afinal = (uchar)ashort; // truncates to 0x02.
cout << std::hex
<< "x0 = 0x" << x0 << " << 8 = 0x" << (x0 << 8) << endl
<< "x1 = 0x" << x1 << endl
<< "ashort = 0x" << ashort << endl
<< "afinal = 0x" << (unsigned int)afinal << endl
;
}
If you are saying that your source stream contains pairs of 4-bit values, each stored in an 8-bit storage unit, which you need to re-store as a single 8-bit value, then what you want is:
for(int x=0,w=0;x < alphaData.size();x+=2,w++)
{
unsigned char aleft = alphaData[x] & 0x0f; // 4 bits.
unsigned char aright = alphaData[x + 1] & 0x0f; // 4 bits.
alphaFinal[w] = (aleft << 4) | (aright);
}
"<<4" is equivalent to "*16", as ">>4" is equivalent to "/16".

Take two hex characters from file and store as a char with associated hex value

I'd like to take the next two hex characters from a stream and store them as the associated hex->decimal numeric value in a char.
So if an input file contains 2a3123, I'd like to grab 2a, and store the numeric value (decimal 42) in a char.
I've tried
char c;
instream >> std::setw(2) >> std::hex >> c;
but this gives me garbage (if I replace c with an int, I get the maximum value for signed int).
Any help would be greatly appreciated! Thanks!
edit: I should note that the characters are guaranteed to be within the proper range for chars and that the file is valid hexadecimal.
OK, I think dealing with ASCII decoding by hand is a bad idea and does not really answer the question.
I think your code does not work because setw() or istream::width() works only when you read into a std::string or char*. I guess that from here.
However, you can use the goodness of the standard C++ iostream converters. I came up with an idea that uses the stringstream class and a string as a buffer. The trick is to read n chars into the buffer and then use the stringstream as a conversion facility.
I am not sure if this is the most optimal version. Probably not.
Code:
#include <iostream>
#include <sstream>
int main(void){
int c;
std::string buff;
std::stringstream ss_buff;
std::cin.width(2);
std::cin >> buff;
ss_buff << buff;
ss_buff >> std::hex >> c;
std::cout << "read val: " << c << '\n';
}
Result:
luk32#genaker:~/projects/tmp$ ./a.out
0a10
read val: 10
luk32#genaker:~/projects/tmp$ ./a.out
10a2
read val: 16
luk32#genaker:~/projects/tmp$ ./a.out
bv00
read val: 11
luk32#genaker:~/projects/tmp$ ./a.out
bc01
read val: 188
luk32#genaker:~/projects/tmp$ ./a.out
01bc
read val: 1
And as you can see, it is not very error-resistant. Nonetheless, it works for the given conditions, can be expanded into a loop, and most importantly uses the iostream converting facilities, so no ASCII magic on your side. C/ASCII would probably be way faster, though.
PS. Improved version. It uses a simple char buffer (two data characters plus a terminating one) and non-formatted write/read to move the data through the buffer (get/write as opposed to operator<</operator>>). The rationale is pretty simple: we do not need any fanciness to move 2 bytes of data. We do, however, use the formatted extractor to perform the conversion. I made it a loop version for convenience. It was not super simple, though: it took me a good 40 minutes of fooling around to figure out two very important lines. Without them the extraction works only for the first 2 characters.
#include <iostream>
#include <sstream>
int main(void){
int c;
char* buff = new char[3];
std::stringstream ss_buff;
std::cout << "read vals: ";
std::string tmp;
while( std::cin.get(buff, 3).gcount() == 2 ){
std::cout << '(' << buff << ") ";
ss_buff.seekp(0); //VERY important lines
ss_buff.seekg(0); //VERY important lines
ss_buff.write(buff, 2);
if( ss_buff.fail() ){ std::cout << "error\n"; break;}
std::cout << ss_buff.str() << ' ';
ss_buff >> std::hex >> c;
std::cout << c << '\n';
}
std::cout << '\n';
delete [] buff;
}
Sample output:
luk32#genaker:~/projects/tmp$ ./a.out
read vals: 0aabffc
(0a) 0a 10
(ab) ab 171
(ff) ff 255
Please note that the trailing c was not read, as intended.
I found everything needed here http://www.cplusplus.com/reference/iostream/
You can cast a char to an int and the int will hold the ASCII value of the char. For example, '0' will be 48 and '5' will be 53. The letters occur higher up, so 'a' will be cast to 97, 'b' to 98, etc. Knowing this, you can take the int value and subtract 48; if the result is greater than 9, subtract another 39. Then char '0' will have been turned into int 0, char '1' into int 1, all the way up to char 'a' becoming int 10, char 'b' int 11, etc.
Next you need to multiply the value of the first character by 16 and add the second to it, to account for the place value. Using your example of 2a:
char '2' casts to int 50. Subtract 48 and get 2. Multiply by 16 and get 32.
char 'a' casts to int 97. Subtract 48 and get 49; this is higher than 9, so subtract another 39 and get 10. Add this to the result of the first character (32) and you get 42.
Here is the code:
int Convert(int in); // forward declaration -- HexToInt below uses it
int HexToInt(char hi, char low)
{
int retVal = 0;
int hiBits = (int)hi;
int loBits = (int)low;
retVal = Convert(hiBits) * 16 + Convert(loBits);
return retVal;
}
int Convert(int in)
{
int retVal = in - 48; // '0'..'9' -> 0..9
//if it was not a digit (an upper case hex letter)
if(retVal > 9)
retVal = retVal - 7; // 'A'..'F' -> 10..15
//if it was not an upper case hex digit (a lower case letter)
if(retVal > 15)
retVal = retVal - 32; // 'a'..'f' -> 10..15
return retVal;
}
The first function can actually be written as one line thus:
int HexToInt(char hi, char low)
{
return Convert((int)hi) * 16 + Convert((int)low);
}
NOTE: This handles both upper- and lower-case letters, but it only works on systems that use ASCII, i.e. not IBM EBCDIC-based systems.
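For what it's worth, here is a portable sketch that avoids hand-rolled character arithmetic altogether by letting std::stoi do the base-16 conversion (my suggestion, not part of the answer above; the input string is just an example):
#include <cstddef>
#include <iostream>
#include <string>
int main()
{
    std::string input = "2a3123";
    for (std::size_t i = 0; i + 1 < input.size(); i += 2) {
        // std::stoi with base 16 parses one two-character hex pair;
        // it throws std::invalid_argument on non-hex input
        int value = std::stoi(input.substr(i, 2), nullptr, 16);
        std::cout << value << ' '; // 42 49 35
    }
    std::cout << '\n';
}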