Get Hex Value of UChar in ICU4C - c++

I'm working with ICU in a C++ library.
How can I get the Unicode Hex value of a UChar? For example, 'a' should be equal to 0x0041 (http://www.unicode.org/charts/PDF/U0000.pdf).

How about something simple like
std::cout << std::hex << std::setw(4) << std::setfill('0')
<< static_cast<int>('a') << '\n';
Note that this prints 0061, not 0041: U+0061 is the correct code point for 'a', while U+0041 is 'A'.

Related

What does cout << std::ios::hex do?

This question comes from a bug that I got into recently. I was trying to save some integer values to file as hex. As an example, this is what I should do:
cout << std::hex << value << endl; // (1)
But by mistake, I wrote the following:
cout << std::ios::hex << value << endl; // (2)
The compiler does not complain, but obviously the result is not correct. I tried a couple more values at random, and it seems that (2) actually gives a partially correct result, except that it prepends 800. I don't understand where the 800 is coming from, and I can't find a good reference anywhere. Can anybody explain what's happening under the hood?
cout << std::hex << 255 << endl; // output: ff
cout << std::ios::hex << 255 << endl; // output: 800ff
cout << std::hex << 135 << endl; // output: 87
cout << std::ios::hex << 135 << endl; // output: 80087
cout << std::hex << 11 << endl; // output: b
cout << std::ios::hex << 11 << endl; // output: 800b
This is actually std::ios_base::hex. It's an implementation-defined bitmask value. Internally, the stream stores the current state of its formatting in a value of type fmtflags.
In your implementation, hex is the flag 0x800. Other flags will indicate whether it's in scientific notation mode, whether boolalpha is on, etc. etc.
The std::hex function sets the std::ios_base::hex flag in fmtflags.
So your output is the integer value of this flag (in hex since you sent std::hex previously).
std::hex is a manipulator, i.e., it is a function with a specific signature:
std::ios_base& hex(std::ios_base& stream) {
stream.setf(std::ios_base::hex, std::ios_base::basefield);
return stream;
}
There are some special output operators defined for stream to process manipulators. For the version operating on references to std::ios_base there is (ignoring that the operator is actually a function template):
std::ostream& operator<< (std::ostream& out, std::ios_base&(*manip)(std::ios_base&));
When used with a stream, the manipulator function is being called and it sets a specific format flag, in this case std::ios_base::hex (which is how std::ios::hex is actually defined). Since std::ios_base::hex is a member of a group of flags (the others are std::ios_base::dec and std::ios_base::oct) setting it also needs to clear any potential other flag in the group. Thus, setf() is called with a mask (std::ios_base::basefield) to clear any of the other potentially set flags.
The format flags std::ios_base::fmtflags is a bitmask type. The value std::ios_base::hex is one of the values. When formatting it you'll get some number, most likely a power of 2 (however, it doesn't have to be a power of 2). The value you see is simply 0x800 (i.e. 2048) printed using hex notation: setting any of the formatting flags (other than the width()) is sticky, i.e., they remain until the flag is unset. If you want to see the value 2048 (for the implementation you are using) you'd use
std::cout << std::dec << std::ios_base::hex << "\n"; // 2048
std::cout << std::hex << std::ios_base::hex << "\n"; // 800
std::cout << std::showbase << std::ios_base::hex << "\n"; // 0x800
The last line sets the flag showbase which indicates the base of an integer value with a prefix:
no prefix => decimal
a leading 0x => hexadecimal
a leading 0 (but no x) => octal
std::hex is a manipulator function that, when applied to a stream using operator<<,
sets the basefield of the stream str to hex as if by calling str.setf(std::ios_base::hex, std::ios_base::basefield)
std::ios::hex (aka std::ios_base::hex) is the actual bitmask value that gets passed to the setf method. Its value is implementation defined, and it seems to be 0x800 in your case.

std::cout gives different output from qDebug

I am using Qt, and I have an unsigned char *bytePointer and want to print out a number-value of the current byte. Below is my code, which is meant to give the int-value and the hex-value of the continuous bytes that I receive from a machine attached to the computer:
int byteHex=0;
byteHex = (int)*bytePointer;
qDebug() << "\n int: " //this is the main issue here.
<< *bytePointer;
std::cout << " (hex: "
<< std::hex
<< byteHex
<< ")\n";
}
This gives perfect results, and I get actual numbers, however this code is going into an API and I don't want to use Qt-only functions, such as qDebug. So when I try this:
int byteHex=0;
byteHex = (int)*bytePointer;
std::cout << "\n int: " //I changed qDebug to std::cout
<< *bytePointer;
std::cout << " (hex: "
<< std::hex
<< byteHex
<< ")\n";
}
The output does give the hex-values perfectly, however the int-values return symbols (like ☺, └, §, to list a few).
My question is: How do I get std::cout to give the same output as qDebug?
EDIT: for some reason the symbols only occur with a certain Qt setting. I have no idea why it happened but it's fixed now.
As others pointed out in the comments, you switch the output to hex here but never set it back:
std::cout << " (hex: "
<< std::hex
<< byteHex
<< ")\n";
You will need to apply this afterwards:
std::cout << std::dec;
Standard output streams will output any character type as a character, not a numeric value. To output the numeric value, convert to a non-character integer type:
std::cout << int(*bytePointer);

custom std::hex manipulator that works for unsigned char

Please consider the following:
unsigned char a(65);
unsigned char z(90);
std::cout << std::hex << a << ", " << z <<std::endl;
Output:
A, Z
But desired output is:
41, 5a
To achieve this I'd like to avoid having to convert values like this, say:
std::cout << std::hex << int(a) << ", " << int(z) <<std::endl;
and instead have some magical manipulator that I can include beforehand:
std::cout << uchar_hex_manip << a << ", " << z << std::endl;
So my question is, how can I define 'uchar_hex_manip' to work as required?
UPDATE: I appreciate all the comments and suggestions so far but I have already said I want to avoid converting the values and no-one seems to have acknowledged that fully. The 'a << ", " << z' I mentioned above is representative of the values to be later streamed in - the actual use case of this in our application is that there is something more complex than that going on where for various reasons it is ideal not to have to shoe-horn in some casts for specific cases.
If you want to print a char as hex, you will need to convert it to an int:
std::cout << std::hex << static_cast<int>('a');
should do the trick.
The reason std::hex doesn't work on char (or unsigned char) is that the stream output operator for char is defined to print the character itself. There is no manipulator to change this behaviour (and although @soon suggests writing your own class, that's a lot of work to avoid a cast).

cout print hex instead of decimal

Has anyone seen a simple std::cout print a value in hex format when it is supposed to print a plain decimal (like an integer)?
For example, I have this line:
std::cout << "_Agent [" << target << "] is still among " << ((target->currWorker)->getEntities().size()) << " entities of worker[" << target->currWorker << "]" << std::endl;
which would print :
_Agent [0x2c6d530] is still among 0x1 entities of worker[0x2c520f0]
Note:
1- the output is sometimes decimal and sometimes hex
2- the behaviour is the same even if I change ((target->currWorker)->getEntities().size()) to (int)((target->currWorker)->getEntities().size())
any hints?
thanks
You probably set std::cout to print hex earlier in your code and forgot to reset it. For example:
std::cout<<std::hex<<12;
/*blah blah blah*/
std::cout<<12; //this will print in hex form still
so you have to do the following
std::cout<<std::dec<<12;
to print in decimal form.
Try to find a line like std::cout << std::showbase << std::hex; somewhere in your code. It sets std::cout to print integers in hexadecimal with the 0x base prefix.
To switch back to decimal, add std::cout << std::dec; before the current cout statement.
You can learn more in a reference on C++ I/O manipulators and format flags.

How to set up C++ Number Formatting to a certain precision?

I understand that you can use iomanip to set the precision for floats (e.g. have 2.0000 as opposed to 2.00).
Is there a way to do this for integers?
I would like a hex number to display as 000e8a00 rather than just e8a00 or 00000000 rather than 0.
Is this possible in C++, using the standard libraries?
With manipulators:
std::cout << std::setfill('0') << std::setw(8) << std::hex << 0 << std::endl;
Without manipulators:
std::cout.fill('0');
std::cout.width(8);
std::cout.setf(std::ios::hex, std::ios::basefield);
std::cout << 42 << std::endl;
You can also do this with boost::format, which I find often saves typing:
std::cout << boost::format("%08x\n") % 0xe8a00;
It also allows for some nice code reuse, if you have multiple places you need to do the same formatting:
boost::format hex08("%08x");
std::cout << hex08 % 0xe8aa << std::endl;