For this code:
#include <codecvt>
#include <iomanip>
#include <iostream>
#include <locale>
#include <sstream>
#include <string>

int main()
{
    std::wstring wstr = L"é";
    std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
    std::stringstream ss;
    ss << std::hex << std::setfill('0');
    for (auto c : myconv.to_bytes(wstr))
    {
        ss << std::setw(2) << static_cast<unsigned>(c);
    }
    std::string ssss = ss.str();
    std::cout << "ssss = " << ssss << std::endl;
}
Why does this print ffffffc3ffffffa9
instead of c3a9?
Why does it prepend ffffff to each byte?
If you want to run it on ideone: https://ideone.com/qZtGom
c is of type char, which is signed on most systems.
Converting a char to an unsigned causes the value to be sign-extended.
Examples:
char(0x23) aka 35   --> unsigned(0x00000023)
char(0x80) aka -128 --> unsigned(0xFFFFFF80)
char(0xC3) aka -61  --> unsigned(0xFFFFFFC3)
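A minimal sketch demonstrating the sign extension (assuming a platform where char is signed and int is 32 bits):

#include <iostream>

int main()
{
    char c = '\xC3'; // -61 when char is signed
    std::cout << std::hex << static_cast<unsigned>(c) << '\n'; // prints ffffffc3
}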
You can cast it twice:
ss << std::setw(2) << static_cast<int>(static_cast<unsigned char>(c));
The first cast gives you an unsigned type with the same bit pattern, and since unsigned char is the same size as char, there is no sign extension.
But if you just output static_cast<unsigned char>(c), the stream will treat it as a character and print something that depends on your locale, etc.
The second cast gives you an int, which the stream will output correctly.
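Putting it together, a fixed version of the original program (the same code as above, with the double cast applied) should print c3a9:

#include <codecvt>
#include <iomanip>
#include <iostream>
#include <locale>
#include <sstream>
#include <string>

int main()
{
    std::wstring wstr = L"é";
    std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
    std::stringstream ss;
    ss << std::hex << std::setfill('0');
    for (auto c : myconv.to_bytes(wstr))
    {
        // unsigned char first (no sign extension), then int (numeric output)
        ss << std::setw(2) << static_cast<int>(static_cast<unsigned char>(c));
    }
    std::cout << "ssss = " << ss.str() << std::endl; // ssss = c3a9
}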
Related
I wrote a UTF-8 to UTF-16 conversion where I get the code units as char16_t.
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

int main()
{
    std::string u8 = u8"ʑʒʓʔ";
    // UTF-8 to UTF-16/char16_t
    std::u16string u16_conv = std::wstring_convert<
        std::codecvt_utf8_utf16<char16_t>, char16_t>{}
        .from_bytes(u8);
    std::cout << "UTF-8 to UTF-16 conversion produced "
              << u16_conv.size() << " code units:\n";
    for (char16_t c : u16_conv)
        std::cout << std::hex << std::showbase << c << ' ';
}
Output:
UTF-8 to UTF-16 conversion produced 4 code units:
0x291 0x292 0x293 0x294
I now need to pass the code units to a stringstream, if possible, and I don't know how to split each one into two bytes, like so:
0x02,0x91,0x02,0x92,0x02,0x93,0x02,0x94
Any suggestions? Maybe converting it to a uint8_t vector first?
A straightforward way is to add each byte to a std::stringstream in a loop:
std::stringstream ss;
for (char16_t c : u16_conv)
{
    ss << (char)(c >> 8); // high byte first
    ss << (char)c;        // then low byte
}
std::string str = ss.str();
for (char c : str) {
    std::cout << std::hex << std::showbase << (int)(unsigned char)c << ' ';
}
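If you would rather have the bytes in a container first, as you suggested, here is a sketch of the std::vector<uint8_t> variant (to_bytes is a hypothetical helper, using the same big-endian byte order):

#include <cstdint>
#include <vector>

std::vector<std::uint8_t> to_bytes(const std::u16string& s)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t c : s)
    {
        bytes.push_back(static_cast<std::uint8_t>(c >> 8)); // high byte
        bytes.push_back(static_cast<std::uint8_t>(c));      // low byte
    }
    return bytes;
}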
I am attempting to extract a hash-digest in hexadecimal via a stringstream, but I cannot get it to work when iterating over data.
Using std::hex I can do this easily with normal integer literals, like this:
#include <sstream>
#include <iostream>
std::stringstream my_stream;
my_stream << std::hex;
my_stream << 100;
std::cout << my_stream.str() << std::endl; // Prints "64"
However, when I try to push in data from a digest, it just gets interpreted as characters and pushed into the stringstream. Here is the function:
#include <sstream>
#include <sha.h> // Crypto++ library required

std::string hash_string(const std::string& message) {
    using namespace CryptoPP;
    std::stringstream buffer;
    byte digest[SHA256::DIGESTSIZE]; // 32 bytes or 256 bits
    static SHA256 local_hash;
    local_hash.CalculateDigest(digest,
                               reinterpret_cast<byte*>(const_cast<char*>(message.data())),
                               message.length());
    // PROBLEMATIC PART
    buffer << std::hex;
    for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
        buffer << *(digest + i);
    }
    return buffer.str();
}
The type byte is just a typedef of unsigned char, so I do not see why this would not work correctly. Printing the return value using std::cout gives the ASCII mess of normal character interpretation. Why does it work in the first case, and not in the second?
Example:
std::string my_hash = hash_string("hello");
std::cout << my_hash << std::endl; // Prints: ",≥M║_░ú♫&Φ;*┼╣Γ₧←▬▲\▼ºB^s♦3bôïÿ$"
First, the std::hex format modifier applies to integers, not to characters. Since you are trying to print unsigned char, the format modifier is not applied. You can fix this by casting to int instead. In your first example, it works because the literal 100 is interpreted as an integer. If you replace 100 with e.g. static_cast<unsigned char>(100), you would no longer get the hexadecimal representation.
Second, std::hex is not enough, since you likely want to pad each byte to a 2-digit hex value (i.e. F should be printed as 0F). You can fix this by also applying the format modifiers std::setfill('0') and std::setw(2).
Applying these modifications, your code would then look like this. Note that std::setw only affects the next insertion and is reset afterwards, so it must be applied inside the loop, while std::hex and std::setfill are sticky:
#include <iomanip>
...
buffer << std::hex << std::setfill('0');
for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
    buffer << std::setw(2) << static_cast<int>(digest[i]);
}
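For reference, calling the fixed function should now give the usual hex digest; for "hello" that is the well-known SHA-256 value below (verify against your own build):

std::string my_hash = hash_string("hello");
std::cout << my_hash << std::endl;
// 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824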
I am using C++ code to read some binary output from an electronic board through USB. The output is stored in an unsigned char buffer. When I try to print the values or write them to an output file, I get garbage instead of hex or binary values, as shown here:
햻"햻"㤧햻"㤧햻"햻"㤧
This is the output file declaration:
f_out.open(outfilename, ios::out);
if (false == f_out.is_open()) {
    printf("Error: Output file could not be opened.\n");
    return (false);
}
This is the output command:
xem->ReadFromPipeOut(0xA3, 32, buf2);
f_out.write((char*)buf2, 32);
//f_out << buf2;
"xem" is a class for the USB communication. ReadFromPipeOut method, reads the output from the board and stores it on the buffer buf2. This is the buffer definition inside the main:
unsigned char buf2[32];
Why do you expect hex output? You ask to write chars, it writes chars.
To output hex values, you can do this:
f_out << std::hex;
for (auto v : buf2)
    f_out << +v << ' ';
To get numbers in the output, values should be output as integers, not as characters. +v promotes the unsigned char to int thanks to integral promotion. You can be more explicit about it and use static_cast<unsigned int>(v).
unsigned char buf[3] = {0x12, 0x34, 0x56};
std::cout << std::hex;
for (auto v : buf)
    std::cout << static_cast<unsigned int>(v) << ' ';
// Output: 12 34 56
To output numbers as binary, use std::bitset from <bitset>:
for (auto v : buf)
    std::cout << std::bitset<8>(v) << ' ';
(no need for std::hex and static_cast here)
To reverse the order:
for (auto it = std::rbegin(buf); it != std::rend(buf); ++it)
    std::cout << std::bitset<8>(*it) << ' ';
Note that the order of bytes in a multi-byte integer depends on endianness. On a little-endian machine the order is reversed.
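A minimal sketch illustrating this (the printed order assumes a little-endian machine such as x86):

#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    std::uint32_t value = 0x12345678;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value); // copy the in-memory representation
    std::cout << std::hex;
    for (auto v : bytes)
        std::cout << +v << ' '; // 78 56 34 12 on little-endian
}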
Let's assume I have this very simple example:
vector<unsigned char> bytes {0xFF, 0xFF, 0xFD};
for (const char & v: bytes) {
    cout << hex << setfill('0') << setw(2) << uppercase << static_cast<unsigned>(v) << " ";
}
cout << endl;
This gives:
FFFFFFFF FFFFFFFF FFFFFFFD
However, I would like to have it short, like:
FF FF FD
So why do I get so many extra "FFFFFF"s?
for (const char & v: bytes)
You're implicitly converting each element in bytes to a char, which seems to be signed on your platform. Then, when you cast to unsigned, the char undergoes sign extension and you end up with large hex values.
Change the above to one of the following
for (const unsigned char & v: bytes)
for (auto const& v: bytes)
for (auto v: bytes) // since it's only a char copying might be better
You get the desired result if you keep the char unsigned in the loop:
for (const unsigned char & v: bytes) {
    //     ^^^^^^^^
    cout << hex << setfill('0') << setw(2) << uppercase << static_cast<unsigned>(v) << " ";
}
auto or auto& would work as well, because vector elements are unsigned.
The reason you get FFs is that char on your system is signed, meaning that the values get sign-extended on conversion to integers.
As others have said, this happens because you're converting the elements to a signed char. I prefer writing the loop this way to prevent such mistakes:
vector<unsigned char> bytes{ 0xFF, 0xFF, 0xFD };
for (size_t i = 0; i < bytes.size(); i++) {
    cout << hex << setfill('0') << setw(2) << uppercase << static_cast<unsigned>(bytes[i]) << " ";
}
Using iterators would also be a good idea.
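For example, a sketch of the iterator version (same formatting as above; the elements are unsigned char, so the cast cannot sign-extend):

for (auto it = bytes.cbegin(); it != bytes.cend(); ++it) {
    cout << hex << setfill('0') << setw(2) << uppercase
         << static_cast<unsigned>(*it) << " ";
}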
I'm using Cryptopp to generate a random string.
This is the code:
const unsigned int BLOCKSIZE = 16 * 8;
byte pcbScratch[ BLOCKSIZE ];
// Construction
// Using an ANSI-approved cipher
CryptoPP::AutoSeededX917RNG<CryptoPP::DES_EDE3> rng;
rng.GenerateBlock( pcbScratch, BLOCKSIZE );
// Output
std::cout << "The generated random block is:" << std::endl;
string str = "";
for( unsigned int i = 0; i < BLOCKSIZE; i++ )
{
    std::cout << "0x" << std::setbase(16) << std::setw(2) << std::setfill('0');
    std::cout << static_cast<unsigned int>( pcbScratch[i] ) << " ";
    str += pcbScratch[i];
}
std::cout << std::endl;
std::cout << str << std::endl;
I've added a new variable to the code: string str = "".
Then, in the for loop, I append each byte to the string.
But my output is dirty! I see only strange ASCII characters.
How can I build the string properly?
Thank you.
You will want to use some output encoding, e.g.
base64
hex
because what you are seeing is the raw binary data, interpreted as if it were text. Random characters are the consequence.
AFAICT (from a quick Google) you should be able to use something like this:
#include <base64.h> // Crypto++
using namespace CryptoPP;

string base64encoded;
StringSource(str, true, new Base64Encoder(new StringSink(base64encoded)));
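Note that, if I recall the Crypto++ API correctly, Base64Encoder inserts line breaks every 72 characters by default; it takes an insertLineBreaks parameter you can set to false if you want the output on a single line.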
Appending arbitrary bytes (chars) to the end of a string is going to result in it containing some non-printable characters:
http://en.wikipedia.org/wiki/Control_character
You don't mention what you wanted or expected. Did you want the string to be the same as what got sent to std::cout? If so, you can use a stringstream via #include <sstream>:
std::stringstream ss;
for( unsigned int i = 0; i < BLOCKSIZE; i++ )
{
    ss << "0x" << std::setbase(16) << std::setw(2) << std::setfill('0');
    ss << static_cast<unsigned int>(pcbScratch[i]);
}
str = ss.str();
You can also use Crypto++'s built in HexEncoder:
std::cout << "The generated random block is:" << std::endl;
string str = "0x";
StringSource ss(pcbScratch, BLOCKSIZE, true,
    new HexEncoder(
        new StringSink(str),
        true, // uppercase
        2,    // grouping
        " 0x" // separator
    )  // HexEncoder
);     // StringSource
The StringSource 'owns' the HexEncoder, so there's no need to call delete.
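With the parameters above, str should come out looking like 0xAB 0xCD 0xEF ... (uppercase hex pairs separated by " 0x", with the leading "0x" coming from the initial value of str), matching the format printed by the cout loop; the exact bytes are random, of course.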