I'm sure I'm missing something here, but I am comparing the contents of a regular string literal (in a UTF-8 encoded source file) with a u8 string literal, and on Windows the u8 literal doesn't contain the expected UTF-8 encoded data, while on Linux it does.
Details:
The .cpp file is UTF-8 encoded
C++17 is enabled
Compiling with VS 2019 on Windows
Compiling with GCC 9.2.1 on Linux
Here's the code:
#include <iostream>
#include <string>
struct HexCharStruct {
unsigned char c;
HexCharStruct(unsigned char _c) : c(_c) { }
};
inline std::ostream& operator<<(std::ostream& o, const HexCharStruct& hs) {
return (o << std::hex << (int)hs.c);
}
inline HexCharStruct hex(unsigned char _c) {
return HexCharStruct(_c);
}
int main( int argc, char** argv ) {
std::string s1 = "🎂";
std::string s2 = u8"🎂";
std::cout << "s1: ";
for (const char& c : s1)
std::cout << hex(c) << " ";
std::cout << "\ns2: ";
for (const char& c : s2)
std::cout << hex(c) << " ";
return 0;
}
Here are the hex values printed on Windows and Linux for s1 and s2 when I run this:
s1 (Windows): f0 9f 8e 82
s1 (Linux): f0 9f 8e 82
s2 (Windows): c3 b0 c5 b8 c5 bd e2 80 9a
s2 (Linux): f0 9f 8e 82
The UTF-8 byte values for 🎂 are f0 9f 8e 82, so everything is as expected except for s2 on Windows. Can anyone explain this?
The Microsoft compiler assumes source files are ANSI-encoded, where the exact encoding depends on the localized version of Windows in use. On U.S. and Western European Windows the encoding is assumed to be Windows-1252.
When the compiler assumes Windows-1252, it decodes the UTF-8 bytes in the source using the wrong encoding, interprets them as four Windows-1252 characters, and then re-encodes those characters as UTF-8. A quick demo (Python):
>>> '🎂'.encode('utf8') # bytes in the file
b'\xf0\x9f\x8e\x82'
>>> b'\xf0\x9f\x8e\x82'.decode('Windows-1252') # What the compiler reads.
'ðŸŽ‚'
>>> 'ðŸŽ‚'.encode('utf8') # What the compiler generates for u8 string.
b'\xc3\xb0\xc5\xb8\xc5\xbd\xe2\x80\x9a'
To use UTF-8 sources with MSVC, two options are to save the source as UTF-8 with a BOM or to add the /utf-8 compiler switch.
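For example, a command-line build with the switch might look like this (the file name main.cpp is just a placeholder for illustration):
cl /EHsc /utf-8 main.cpp
Alternatively, re-saving the file as "UTF-8 with BOM" in the editor lets the compiler detect the encoding without any extra switches.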
I am trying to write a C++ program for my Computer Machine Organization class in which I perform a memory dump, in hex, of some addresses stored in memory. I don't really understand what a memory dump is, and I am pretty new to writing C++. My questions are:
How can I create a method that takes two arguments specifying addresses in memory?
How can I further modify those arguments to specify a word address that is exactly 4 bytes long?
How can I then convert those addresses into hex values?
I know that this is a lot, but thank you for any suggestions.
For anyone who needs it, here is my code so far:
#include <stdio.h>
// Create something to do the methods on
char array[3] = {'a', 'b', 'c'};
void mdump(char start, char end){
// Create pointers to get the address of the starting and ending characters
char* pointer1 = (char *)& start;
char* pointer2 = (char *)& end;
// Check to see if starting pointer is in lower memory than ending pointer
if(pointer1 < pointer2){
printf("Passed");
}
else{
printf("Failed");
}
// Modify both the arguments so that each of them are exactly 4 bytes
// Create a header for the dump
// Iterate through the addresses, from start pointer to end pointer, and produce lines of hex values
// Declare a struct to format the values
// Add code that creates printable ASCII characters for each memory location (print "cntrl-xx" for values 0-31, or map them into a blank)
// Print the values in decimal and in ASCII form
}
int main(){
mdump(array[0], array[2]);
return 0;
}
How to write a Hex dump tool while learning C++:
Start with something simple:
#include <iostream>
int main()
{
char test[32] = "My sample data";
// output character
std::cout << test[0] << '\n';
}
Output:
M
Live demo on coliru
Print the hex-value instead of the character:
#include <iostream>
int main()
{
char test[32] = "My sample data";
// output a character as hex-code
std::cout << std::hex << test[0] << '\n'; // Uh oh -> still a character
std::cout << std::hex << (unsigned)(unsigned char)test[0] << '\n';
}
Output:
M
4d
Live demo on coliru
Note:
The stream output operator for char is intended to print a character (of course). There is another stream output operator for unsigned which fits better. To make sure that one is used, the char has to be converted to unsigned.
But be prepared: the C++ standard doesn't mandate whether char is signed or unsigned; this decision is left to the compiler vendor. To be on the safe side, the char is first converted to unsigned char and then converted to unsigned.
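To see why the detour matters: on a platform where char is signed, converting a negative char straight to unsigned sign-extends it first. A small illustration (my own example, not part of the series above):
#include <iostream>
int main()
{
    char c = static_cast<char>(0x90);   // a byte >= 0x80; negative when char is signed
    std::cout << std::hex << (unsigned)c << '\n';                 // typically ffffff90 (sign-extended)
    std::cout << std::hex << (unsigned)(unsigned char)c << '\n';  // 90
}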
Print the address of the variable with the character:
#include <iostream>
int main()
{
char test[32] = "My sample data";
// output an address
std::cout << &test[0] << '\n'; // Uh oh -> wrong output stream operator
std::cout << (const void*)&test[0] << '\n';
}
Output:
My sample data
0x7ffd3baf9b70
Live demo on coliru
Note:
There is one stream output operator for const char* which is intended to print a (zero-terminated) string. That is not what is intended here. Hence, the (ugly) trick with the cast to const void* is necessary, which selects another stream output operator that fits better.
What if the data is not a 2 digit hex?
#include <iomanip>
#include <iostream>
int main()
{
// output character as 2 digit hex-code
std::cout << (unsigned)(unsigned char)'\x6' << '\n'; // Uh oh -> output not with two digits
std::cout << std::hex << std::setw(2) << std::setfill('0')
<< (unsigned)(unsigned char)'\x6' << '\n';
}
Output:
6
06
Live demo on coliru
Note:
There are I/O manipulators which can be used to modify the formatting of (some) stream output operators.
Now, put it all together (in loops) et voilà: a hex dump.
#include <iomanip>
#include <iostream>
int main()
{
char test[32] = "My sample data";
// output an address
std::cout << (const void*)&test[0] << ':';
// output the contents
for (char c : test) {
std::cout << ' '
<< std::hex << std::setw(2) << std::setfill('0')
<< (unsigned)(unsigned char)c;
}
std::cout << '\n';
}
Output:
0x7ffd345d9820: 4d 79 20 73 61 6d 70 6c 65 20 64 61 74 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Live demo on coliru
Make it nice:
#include <algorithm>
#include <iomanip>
#include <iostream>
int main()
{
char test[32] = "My sample data";
// hex dump
const size_t len = sizeof test;
for (size_t i = 0; i < len; i += 16) {
// output an address
std::cout << (const void*)&test[i] << ':';
// output the contents
for (size_t j = 0, n = std::min<size_t>(len - i, 16); j < n; ++j) {
std::cout << ' '
<< std::hex << std::setw(2) << std::setfill('0')
<< (unsigned)(unsigned char)test[i + j];
}
std::cout << '\n';
}
}
Output:
0x7fffd341f2b0: 4d 79 20 73 61 6d 70 6c 65 20 64 61 74 61 00 00
0x7fffd341f2c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Live demo on coliru
Make it a function:
#include <algorithm>
#include <iomanip>
#include <iostream>
void hexdump(const char* data, size_t len)
{
// hex dump
for (size_t i = 0; i < len; i += 16) {
// output an address
std::cout << (const void*)&data[i] << ':';
// output the contents
for (size_t j = 0, n = std::min<size_t>(len - i, 16); j < n; ++j) {
std::cout << ' '
<< std::hex << std::setw(2) << std::setfill('0')
<< (unsigned)(unsigned char)data[i + j];
}
std::cout << '\n';
}
}
int main()
{
char test[32] = "My sample data";
std::cout << "dump test:\n";
hexdump(test, sizeof test);
std::cout << "dump 4 bytes of test:\n";
hexdump(test, 4);
std::cout << "dump an int:\n";
int n = 123;
hexdump((const char*)&n, sizeof n);
}
Output:
dump test:
0x7ffe900f4ea0: 4d 79 20 73 61 6d 70 6c 65 20 64 61 74 61 00 00
0x7ffe900f4eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
dump 4 bytes of test:
0x7ffe900f4ea0: 4d 79 20 73
dump an int:
0x7ffe900f4e9c: 7b 00 00 00
Live demo on coliru
Note:
(const char*)&n may look a bit adventurous. In fact, pointer conversions are something that should ideally not be necessary at all. However, for the dump tool this is the easiest way to access the bytes of arbitrary data. (Inspecting an object's bytes through a char pointer is one of the rare cases that the standard explicitly allows.)
An even nicer hexdump can be found in
SO: How would I create a hex dump utility in C++?
(which I had recommended to the OP beforehand).
I have written some code that loads some files containing a list of words (one word per line). Each word is added to a multiset. Later I try to search the multiset with multiset.find("aWord"), where I look for the word and substrings of the word in the multiset.
This code works fine if I compile it with Qt on a Windows system.
But it doesn't work if I compile it with Qt on my Mac!
My goal is to make it work from Qt on my Mac.
I am working on a MacBook Air (13", early 2018) with a
macOS Mojave version 10.14.4 installation
Build version 18E226
local 18.5.0 Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT
2019; root:xnu-4903.251.3~3/RELEASE_X86_64 x86_64
Using a qt installation:
QTKit:
Version: 7.7.3
Obtained from: Apple
Last Modified: 13/04/2019 12.11
Kind: Intel
64-Bit (Intel): Yes
Get Info String: QTKit 7.7.3, Copyright 2003-2012, Apple Inc.
Location: /System/Library/Frameworks/QTKit.framework
Private: No
And an Xcode installation:
Xcode 10.2
Build version 10E125
I have tried to print out:
every string that I am searching for
and every string I should find in the multiset, in hex format
and concluded that some of the letters do not match in their hex values, despite the fact that I think my whole system runs UTF-8 and the file is also UTF-8 encoded.
Dictionary.h
#ifndef DICTIONARY_H
#define DICTIONARY_H
#include <iostream>
#include <vector>
#include <set>
class Dictionary
{
public:
Dictionary();
void SearchForAllPossibleWordsIn(std::string searchString);
private:
std::multiset<std::string, std::less<std::string>> mDictionary;
void Initialize(std::string folder);
void InitializeLanguage(std::string folder, std::string languageFileName);
};
#endif // DICTIONARY_H
Dictionary.cpp
#include "Dictionary.h"
#include <vector>
#include <set>
#include <iostream>
#include <fstream>
#include <exception>
Dictionary::Dictionary()
{
Initialize("../Lektion10Projekt15-1/");
}
void Dictionary::Initialize(std::string folder)
{
InitializeLanguage(folder,"da-utf8.wl");
}
void Dictionary::InitializeLanguage(std::string folder, std::string languageFileName)
{
std::ifstream ifs;
ifs.open(folder+languageFileName,std::ios_base::in);
if (ifs.fail()) {
std::cerr <<"Error! Class: Dictionary. Function: InitializeLanguage(...). return: ifs.fail to load file '" + languageFileName + "'" << std::endl;
}else {
std::string word;
while (!ifs.eof()) {
std::getline(ifs,word);
mDictionary.insert(word);
}
}
ifs.close();
}
void Dictionary::SearchForAllPossibleWordsIn(std::string searchString)
{
std::vector<std::string> result;
for (unsigned int a = 0 ; a <= searchString.length(); ++a) {
for (unsigned int b = 1; b <= searchString.length()-a; ++b) {
std::string substring = searchString.substr(a,b);
if (mDictionary.find(substring) != mDictionary.end())
{
result.push_back(substring);
}
}
}
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i) {
std::cout << result[i] << std::endl;
}
}
}
main.cpp
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
I have tried to change the following line in main.cpp
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
to the following (note: the first word in the word list is byggearbejderen):
std::ifstream ifs;
ifs.open("../Lektion10Projekt15-1/da-utf8.wl",std::ios::in);
if (ifs.fail()) {
std::cerr <<"Error!" << std::endl;
}else {
std::getline(ifs,searchword);
}
ifs.close();
myDictionary.SearchForAllPossibleWordsIn(searchword);
And then, in main.cpp, I added some printouts with the expected string and substrings as hex values:
std::cout << " cout as hex test:" << std::endl;
myDictionary.SearchForAllPossibleWordsIn(searchword);
std::cout << "Suposet search resul for ''bygearbejderen''" << std::endl;
for (char const elt: "byggearbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "byggearbejderen" << std::endl;
for (char const elt: "arbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "arbejderen" << std::endl;
for (char const elt: "ren")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "ren" << std::endl;
for (char const elt: "en")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "en" << std::endl;
for (char const elt: "n")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "n" << std::endl;
I also added the same kind of printout for the result in Dictionary.cpp:
std::cout << "result of seartchword as hex" << std::endl;
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i)
{
for (char const elt: result[i] )
{
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
}
std::cout << result[i] << std::endl;
}
}
which gave the following output:
result of seartchword as hex
ffffffef ffffffbb ffffffbf 62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 0d byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 0d arbejderen
72 65 6e 0d ren
65 6e 0d en
6e 0d n
Suposet search resul for ''bygearbejderen''
62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 00 byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 00 arbejderen
72 65 6e 00 ren
65 6e 00 en
6e 00 n
where I noticed that some values were different.
I don't know why this is the case on macOS but not on Windows. I do not know if there are any encoding settings in my environment that I need to change or set correctly.
I would like my main.cpp to look like this:
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
resulting in the following output:
byggearbejderen
arbejderen
ren
en
n
Line endings in text files are different on Windows than they are on a Mac. Windows uses the CR/LF pair (ASCII codes 13 and 10, respectively); old Macs used the CR character alone, and Linux uses just the LF. If you create a text file on Windows and then copy it to your Mac, the line endings might not be handled correctly.
If you look at the last character in your output, you'll see it is a 0d, which would be the CR character. I don't know how you generated that output, but it is possible that the getline on the Mac is treating that as a normal character, and including it in the string that has been read in.
The simplest solution is to either process that text file beforehand to get the line endings correct, or strip the CR off the end of the words after they are read in.
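For the second approach, a minimal sketch (my own illustration; the helper name is made up) that strips a trailing CR after each std::getline:
#include <string>

// Remove a trailing '\r' left over from Windows-style CR/LF line endings.
inline void stripTrailingCR(std::string& word)
{
    if (!word.empty() && word.back() == '\r')
        word.pop_back();
}
In Dictionary::InitializeLanguage this would be called right after std::getline(ifs, word) and before mDictionary.insert(word).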
I tried to apply XTR-DH for key agreement, starting from this example:
//////////////////////////////////////////////////////////////////////////
// Alice
// Initialize the Diffie-Hellman class with a random prime and base
AutoSeededRandomPool rngA;
DH dhA;
dhA.Initialize(rngA, 128);
// Extract the prime and base. These values could also have been hard coded
// in the application
Integer iPrime = dhA.GetGroupParameters().GetModulus();
Integer iGenerator = dhA.GetGroupParameters().GetSubgroupGenerator();
SecByteBlock privA(dhA.PrivateKeyLength());
SecByteBlock pubA(dhA.PublicKeyLength());
SecByteBlock secretKeyA(dhA.AgreedValueLength());
// Generate a pair of integers for Alice. The public integer is forwarded to Bob.
dhA.GenerateKeyPair(rngA, privA, pubA);
//////////////////////////////////////////////////////////////////////////
// Bob
AutoSeededRandomPool rngB;
// Initialize the Diffie-Hellman class with the prime and base that Alice generated.
DH dhB(iPrime, iGenerator);
SecByteBlock privB(dhB.PrivateKeyLength());
SecByteBlock pubB(dhB.PublicKeyLength());
SecByteBlock secretKeyB(dhB.AgreedValueLength());
// Generate a pair of integers for Bob. The public integer is forwarded to Alice.
dhB.GenerateKeyPair(rngB, privB, pubB);
//////////////////////////////////////////////////////////////////////////
// Agreement
// Alice calculates the secret key based on her private integer as well as the
// public integer she received from Bob.
if (!dhA.Agree(secretKeyA, privA, pubB))
return false;
// Bob calculates the secret key based on his private integer as well as the
// public integer he received from Alice.
if (!dhB.Agree(secretKeyB, privB, pubA))
return false;
// Just a validation check. Did Alice and Bob agree on the same secret key?
if (!VerifyBufsEqualp(secretKeyA.begin(), secretKeyB.begin(), dhA.AgreedValueLength()))
return false;
return true;
And here my code :
//Alice
AutoSeededRandomPool aSRPA;
XTR_DH xtrA(aSRPA, 512, 256);
Integer iPrime = xtrA.GetModulus();
Integer i_qnumber = xtrA.GetSubgroupOrder();
Integer iGeneratorc1 = xtrA.GetSubgroupGenerator().c1;
Integer iGeneratorc2 = xtrA.GetSubgroupGenerator().c2;
SecByteBlock privateA(xtrA.PrivateKeyLength());
SecByteBlock publicA(xtrA.PublicKeyLength());
SecByteBlock secretKeyA(xtrA.AgreedValueLength());
xtrA.GenerateKeyPair(aSRPA, privateA, publicA);
//Bob
AutoSeededRandomPool aSRPB;
XTR_DH xtrB(iPrime, i_qnumber, iGeneratorc1); // Use c1 or c2 or both ???
SecByteBlock privateB(xtrB.PrivateKeyLength());
SecByteBlock publicB(xtrB.PublicKeyLength());
SecByteBlock secretKeyB(xtrB.AgreedValueLength());
xtrB.GenerateKeyPair(aSRPB, privateB, publicB);
// Agreement
// Alice calculates the secret key based on her private integer as well as the
// public integer she received from Bob.
if (!xtrA.Agree(secretKeyA, privateA, publicB))
return false;
// Bob calculates the secret key based on his private integer as well as the
// public integer he received from Alice.
if (!xtrB.Agree(secretKeyB, privateB, publicA))
return false;
// Just a validation check. Did Alice and Bob agree on the same secret key?
if (!VerifyBufsEqualp(secretKeyA.begin(), secretKeyB.begin(), xtrA.AgreedValueLength()))
return false;
return true;
I got this error
Severity Code Description Project File Line Suppression State
Error C2664 'CryptoPP::XTR_DH::XTR_DH(CryptoPP::XTR_DH &&)': cannot convert argument 3 from 'CryptoPP::Integer' to 'const CryptoPP::GFP2Element &' ConsoleApplication1 d:\tugas akhir\code\consoleapplication1\consoleapplication1\consoleapplication1.cpp 91
My questions are:
The generator has two components, c1 and c2. Do I need both to construct xtrB, or just one?
I have tried taking the values of p, q and g from xtrA and using them to initialize xtrB, but they are too long for an integer. What is the solution?
Thanks in advance.
XTR_DH xtrB(iPrime, i_qnumber, iGeneratorc1); // Use c1 or c2 or both ???
You should use the following constructor from XTR-DH | Constructors:
XTR_DH (const Integer &p, const Integer &q, const GFP2Element &g)
There are two ways to setup xtrB. First, the way that uses the constructor (and artificially small parameters):
$ cat test.cxx
#include "cryptlib.h"
#include "osrng.h"
#include "xtrcrypt.h"
#include <iostream>
int main()
{
using namespace CryptoPP;
AutoSeededRandomPool aSRP;
XTR_DH xtrA(aSRP, 170, 160);
const Integer& iPrime = xtrA.GetModulus();
const Integer& iOrder = xtrA.GetSubgroupOrder();
const GFP2Element& iGenerator = xtrA.GetSubgroupGenerator();
XTR_DH xtrB(iPrime, iOrder, iGenerator);
std::cout << "Prime: " << std::hex << xtrB.GetModulus() << std::endl;
std::cout << "Order: " << std::hex << xtrB.GetSubgroupOrder() << std::endl;
std::cout << "Generator" << std::endl;
std::cout << " c1: " << std::hex << xtrB.GetSubgroupGenerator().c1 << std::endl;
std::cout << " c2: " << std::hex << xtrB.GetSubgroupGenerator().c2 << std::endl;
return 0;
}
And then:
$ g++ -DNDEBUG -g2 -O3 -fPIC -pthread test.cxx ./libcryptopp.a -o test.exe
$ ./test.exe
Prime: 2d4c4f9f4de9e32e84a7be42f019a1a4139e0fe7489h
Order: 89ab07fa5115443f51ce9a74283affaae2d7748fh
Generator
c1: 684fedbae519cb297f3448d5e564838ede5ed1fb81h
c2: 39112823212ccd7b01f10377536f51bf855752c7a3h
Second, the way that stores the domain parameters in an ASN.1 object (and artificially small parameters):
$ cat test.cxx
#include "cryptlib.h"
#include "osrng.h"
#include "files.h"
#include "xtrcrypt.h"
#include <iostream>
int main()
{
using namespace CryptoPP;
AutoSeededRandomPool prng;
XTR_DH xtrA(prng, 170, 160);
xtrA.DEREncode(FileSink("params.der").Ref());
XTR_DH xtrB(FileSource("params.der", true).Ref());
std::cout << "Prime: " << std::hex << xtrB.GetModulus() << std::endl;
std::cout << "Order: " << std::hex << xtrB.GetSubgroupOrder() << std::endl;
std::cout << "Generator" << std::endl;
std::cout << " c1: " << std::hex << xtrB.GetSubgroupGenerator().c1 << std::endl;
std::cout << " c2: " << std::hex << xtrB.GetSubgroupGenerator().c2 << std::endl;
return 0;
}
And then:
$ g++ -DNDEBUG -g2 -O3 -fPIC -pthread test.cxx ./libcryptopp.a -o test.exe
$ ./test.exe
Prime: 2ee076b3254c1520151bbe0391a77971f92e277ba37h
Order: f7674a8c2dd68d32c3da8e74874a48b9adf00fcbh
Generator
c1: 2d469e63b474ac45578a0027a38864f303fad03ba9h
c2: 1d5e5714bc19ef25eee0535584176889df8f26c4802h
And finally:
$ dumpasn1 params.der
0 94: SEQUENCE {
2 22: INTEGER 02 EE 07 6B 32 54 C1 52 01 51 BB E0 39 1A 77 97 1F 92 E2 77 BA 37
26 21: INTEGER 00 F7 67 4A 8C 2D D6 8D 32 C3 DA 8E 74 87 4A 48 B9 AD F0 0F CB
49 21: INTEGER 2D 46 9E 63 B4 74 AC 45 57 8A 00 27 A3 88 64 F3 03 FA D0 3B A9
72 22: INTEGER 01 D5 E5 71 4B C1 9E F2 5E EE 05 35 58 41 76 88 9D F8 F2 6C 48 02
: }
In practice you probably want to use something like this, which validates the parameters after loading them. You should always validate your security parameters.
// Load the domain parameters from somewhere
const Integer& iPrime = ...;
const Integer& iOrder = ...;
const GFP2Element& iGenerator = ...;
// Create the key agreement object using the parameters
XTR_DH xtrB(iPrime, iOrder, iGenerator);
// Verify the parameters using the key agreement object
if(xtrB.Validate(aSRP, 3) == false)
throw std::runtime_error("Failed to validate parameters");
You are probably going to use something like the second method shown above. That is, you are going to generate your domain parameters once, and then both parties will use them. Below both parties xtrA and xtrB use params.der:
int main()
{
using namespace CryptoPP;
AutoSeededRandomPool prng;
XTR_DH xtrA(FileSource("params.der", true).Ref());
XTR_DH xtrB(FileSource("params.der", true).Ref());
if(xtrA.Validate(prng, 3) == false)
throw std::runtime_error("Failed to validate parameters");
if(xtrB.Validate(prng, 3) == false)
throw std::runtime_error("Failed to validate parameters");
...
}
I'm running some updates through Undefined Behavior Sanitizer. The sanitizer is producing a message I don't quite understand:
kalyna.cpp:1326:61: runtime error: load of address 0x0000016262c0 with insufficient space for an object of type 'const uint32_t'
0x0000016262c0: note: pointer points here
20 8b c1 1f a9 f7 f9 5c 53 c4 cf d2 2f 3f 52 be 84 ed 96 1b b8 7a b2 85 e0 96 7d 5d 70 ee 06 07
^
The code in question attempts to make cache timing attacks harder by touching addresses within the range of a cache line. Line 1326 is the line with reinterpret_cast:
// In KalynaTab namespace
uint64_t S[4][256] = {
...
};
...
// In library's namespace
const int cacheLineSize = GetCacheLineSize();
volatile uint32_t _u = 0;
uint32_t u = _u;
for (unsigned int i=0; i<256; i+=cacheLineSize)
u &= *reinterpret_cast<const uint32_t*>(KalynaTab::S+i);
Why is the sanitizer claiming that a uint32_t u does not have sufficient space to hold a uint32_t?
Or maybe I am not parsing the error message correctly. Is that what the sanitizer is complaining about? If I am parsing it incorrectly, then what is the sanitizer complaining about?
$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
$ gcc --version
gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
The identifier S does not convert to a pointer of the type you think it does. As a result, your pointer arithmetic is throwing you way out of range of your data, which is best shown by example:
#include <iostream>
#include <cstdint>
uint64_t S[4][256];
int main()
{
std::cout << static_cast<void*>(S+0) << '\n';
std::cout << static_cast<void*>(S+1) << '\n';
std::cout << static_cast<void*>(S+2) << '\n';
std::cout << static_cast<void*>(S+3) << '\n';
std::cout << '\n';
std::cout << static_cast<void*>(*S+0) << '\n';
std::cout << static_cast<void*>(*S+1) << '\n';
std::cout << static_cast<void*>(*S+2) << '\n';
std::cout << static_cast<void*>(*S+3) << '\n';
}
Output (obviously platform dependent)
0x1000020b0
0x1000028b0
0x1000030b0
0x1000038b0
0x1000020b0
0x1000020b8
0x1000020c0
0x1000020c8
Note the stride of the first sequence of numbers: 0x800 per row. That makes sense, since each row is made up of 0x100 entries of 8 bytes each (the uint64_t elements). The type of the pointer being used in the pointer arithmetic is uint64_t (*)[256].
Now note the stride of the second sequence, which peers into only S[0]. The stride is 8 bytes, one per element. The type of the converted pointer in this calculation is uint64_t *.
In short, your pointer arithmetic assumes S converts to uint64_t*, and it doesn't. Like all array-to-pointer conversions, it converts to a pointer to the first element, with that element's type. The element type of this array of arrays is uint64_t[256], so the converted pointer type is uint64_t (*)[256].
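If the intent of the original loop was to touch the table one cache line at a time, one way to express that byte-wise stride (a sketch of the idea only, reusing the question's u, cacheLineSize and KalynaTab::S; not necessarily the library's actual fix) is to do the offset arithmetic on a byte pointer and bound the loop by the table's size in bytes:
const char* table = reinterpret_cast<const char*>(KalynaTab::S);
for (unsigned int i = 0; i < sizeof(KalynaTab::S); i += cacheLineSize)
    u &= *reinterpret_cast<const uint32_t*>(table + i);   // offsets are now in bytes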
I have a USB string descriptor in a uint8_t array. For example:
0000:12 03 34 00 45 00 36 00 31 00 42 00 43 00 30 00 ..4.E.6.1.B.C.0.
0010:30 00 0.
(The first two bytes are the length and descriptor type; the remaining bytes are the uint16_t characters.)
I would like to print this on the terminal with as little hassle as possible, and preferably without having to screw around with all the other printing (which happens like cout << "Hello, world" << endl;)
In particular, I would like to do:
cout << "Serial number is: " << some_cast_or_constructor( buf + 2, len - 2 ) << endl;
and for the string descriptor above, get the following on a terminal:
Serial number is: 4E61BC00
Is this possible, or do I have to delve into Unicode arcana?
[edit to add:]
Per @PaulMcKenzie, I tried this program:
#include <iostream>
#include <fstream>
#include <exception>
#include <string>
#include <locale>
int
main( int argc, char **argv )
{
char buf[] = { 34, 00, 45, 00, 36, 00, 31, 00, 42, 00, 43, 00, 30, 00, 30, 00 };
std::wcout << "Hello" << std::wstring( (const wchar_t *)buf, sizeof(buf) ) << std::endl;
return 0;
}
The output:
user:/tmp$ g++ foo.cc
user:/tmp$ ./a.out
Hello??????????
user:/tmp$
In your source code, I see two errors:
1- In your USB raw data (at the top), the values are hexadecimal, but in your buf[] the values are decimal. It should be written:
char buf[] = { 0x34, 0x00, 0x45, 0x00, 0x36, 0x00, 0x31, 0x00, 0x42,
0x00, 0x43, 0x00, 0x30, 0x00, 0x30, 0x00 };
2- In your print statement, the length is equal to sizeof(buf), but that counts char (1 byte), not wchar_t (2 bytes). It should be written:
std::wcout << "Hello" << std::wstring( (const wchar_t *)buf, (sizeof(buf) >> 1) ) << std::endl;
This code then gives the expected result on a Windows PC... but make sure there is no big/little-endian conversion needed before treating the bytes as wchar_t on your computer.
Could you check sizeof(wchar_t) under Linux? The post
'Difference and conversions between wchar_t for Linux and for
Windows' suggests that wchar_t is a 32-bit value.
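Since wchar_t is 4 bytes on typical Linux systems, a straight reinterpret_cast of the 2-byte descriptor data cannot work there. A small sketch of a portable alternative (my own illustration; it assumes the descriptor holds little-endian UTF-16 code units and ignores surrogate pairs, which is fine for ASCII serial numbers):
#include <cstddef>
#include <cstdint>
#include <string>

// Widen each 2-byte little-endian code unit into a wchar_t explicitly,
// instead of reinterpret-casting the raw buffer.
std::wstring fromUtf16le(const std::uint8_t* data, std::size_t len)
{
    std::wstring out;
    for (std::size_t i = 0; i + 1 < len; i += 2)
        out.push_back(static_cast<wchar_t>(data[i] | (data[i + 1] << 8)));
    return out;
}
Something like std::wcout << L"Serial number is: " << fromUtf16le(buf + 2, len - 2) << std::endl; would then print the characters regardless of sizeof(wchar_t).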
If you've reached this question because you're having trouble with Unicode, wide characters and similar on Linux, the quickest way I found to move forward is to use libiconv. The <codecvt> header file that you'll read about in C++ docs is not yet implemented in GNU libstdc++ (as of October 2016).
Here is a quick sample program that demonstrates libiconv:
#include <iostream>
#include <locale>
#include <cstdint>
#include <iconv.h>
#include <string.h>
int
main( int, char ** )
{
const char a[] = "ABC";
const wchar_t b[] = L"ABC";
const char c[] = u8"ABC";
const char16_t d[] = u"ABCDEF";
const char32_t e[] = U"ABC";
iconv_t utf16_to_utf32 = iconv_open( "UTF-32", "UTF-16" );
wchar_t wcbuf[32];
char *inp = (char *)d;
size_t inl = sizeof(d);
char *outp = (char *)wcbuf;
size_t outl = sizeof(wcbuf);
iconv( utf16_to_utf32, &inp, &inl, &outp, &outl );
std::wcout << "sizeof(a) = " << sizeof(a) << ' ' << a << std::endl
<< "sizeof(b) = " << sizeof(b) << ' ' << b << std::endl
<< "sizeof(c) = " << sizeof(c) << ' ' << c << std::endl
<< "sizeof(d) = " << sizeof(d) << ' ' << d << std::endl
<< "sizeof(e) = " << sizeof(e) << ' ' << e << std::endl
<< "Converted char16_t to UTF-32: " << std::wstring( wcbuf, (wchar_t *)outp - wcbuf ) << std::endl;
iconv_close( utf16_to_utf32 );
return 0;
}
Resulting output:
user@debian:~/code/unicode$ ./wchar
sizeof(a) = 4 ABC
sizeof(b) = 16 ABC
sizeof(c) = 4 ABC
sizeof(d) = 14 0x7ffefdae5a40
sizeof(e) = 16 0x7ffefdae5a30
Converted char16_t to UTF-32: ABCDEF
user@debian:~/code/unicode$
Note that std::wcout doesn't print char16_t or char32_t properly. However, you can use iconv to convert UTF-16 (which is apparently what you get from u"STRING") to UTF-32 (which is apparently compatible with wchar_t on a late-model Linux system).