Replacing order of items in byte array - C++

I'm writing code for an Arduino in C++.
I have a byte array with hex byte values, for example:
20 32 36 20 E0 EC 20 F9 F0 E9 E9 E3 F8 5C 70 5C 70 5C 73 20 E3 E2 EC 20 F8 E0 E5 E1 EF 20 39 31 5C
There are four ASCII digits in these bytes:
HEX 0x32 is the digit 2 in ASCII code
HEX 0x36 is the digit 6 in ASCII code
HEX 0x39 is the digit 9 in ASCII code
and so on....
https://www.ascii-codes.com/cp862.html
So the hex values 32, 36 represent the number 26, and 39, 31 represent 91.
I want to find these numbers and reverse each group, so that (in this example) 62 and 19 are represented instead of 26 and 91.
The output would thus have to look like this:
20 36 32 20 E0 EC 20 F9 F0 E9 E9 E3 F8 5C 70 5C 70 5C 73 20 E3 E2 EC 20 F8 E0 E5 E1 EF 20 31 39 5C
The numbers won't always be two digits; they could be anything from 0 to 1000.
I also know that each group of such numbers is preceded by the hex value 20, if that helps.
I have done this in C# (with some help from Stack Overflow users :-) ):
string result = Regex.Replace(HexMessage1,
    @"(?<=20\-)3[0-9](\-3[0-9])*(?=\-20)",
    match => string.Join("-", Transform(match.Value.Split('-'))));
private static IEnumerable<string> Transform(string[] items)
{
    // Either terse Linq:
    // return items.Reverse();
    // Or good old for loop:
    string[] result = new string[items.Length];
    for (int i = 0; i < items.Length; ++i)
        result[i] = items[items.Length - i - 1];
    return result;
}
Can someone help me make it work in C++?

Loop over the array, element by element, looking for 0x32 or 0x39. If found, check the next byte (if within bounds) to see if it matches 0x36 or 0x31 (respectively). If it does then swap the current and the next byte. Continue the loop, skipping over the current and the next byte.
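Since the numbers can be one to four digits (0-1000), a more general form of the same idea is to reverse every maximal run of ASCII digit bytes (0x30-0x39) in place. Here is a minimal sketch of that approach (plain C++ with no Arduino-specific calls; the function name and the assumption that every digit run should be reversed are mine):
#include <stddef.h>

// Reverse each run of consecutive ASCII digit bytes (0x30..0x39) in place.
// Assumes digit groups are bounded by non-digit bytes, such as the 0x20
// separators mentioned in the question.
void reverseDigitRuns(unsigned char *buf, size_t len) {
    size_t i = 0;
    while (i < len) {
        if (buf[i] >= 0x30 && buf[i] <= 0x39) {
            size_t start = i;
            while (i < len && buf[i] >= 0x30 && buf[i] <= 0x39)
                ++i;
            // Reverse buf[start .. i-1]
            for (size_t a = start, b = i - 1; a < b; ++a, --b) {
                unsigned char tmp = buf[a];
                buf[a] = buf[b];
                buf[b] = tmp;
            }
        } else {
            ++i;
        }
    }
}
Calling reverseDigitRuns(mem, len) on the example array turns 20 32 36 20 ... 20 39 31 5C into 20 36 32 20 ... 20 31 39 5C.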

Decrypt rotating XOR with 10-byte key across packet bytes in C++

Trying to figure out how to write something in C++ that will decrypt packets of varying sizes that have been encrypted with a rotating XOR using a known 10-byte key (the key can be ASCII or hex, whatever is easier).
For example:
XOR 10-byte key in hex: 41 30 42 44 46 4c 58 53 52 54
XOR 10-byte key in ascii: A0BDFLXSRT
Here is "This is my message in clear text" in hex that needs to be decrypted with above key
15 58 2B 37 66 25 2B 73 3F 2D 61 5D 27 37 35 2D 3F 36 72 3D 2F 10 21 28 23 2D 2A 73 26 31 39 44
I need a way to apply my XOR key over the top of these bytes like this in a rotating fashion:
41 30 42 44 46 4c 58 53 52 54 41 30 42 44 46 4c 58 53 52 54 41 30 42 44 46 4c 58 53 52 54 41 30
15 58 2B 37 66 25 2B 73 3F 2D 61 5D 27 37 35 2D 3F 36 72 3D 2F 10 21 28 23 2D 2A 73 26 31 39 44
When I XOR these together, I get the data in readable format again:
"This is my message in clear text"
I've seen some examples that take the entire XOR key and apply it to each byte, but that's not what I need. It needs to somehow rotate over the bytes until the end of the data is reached.
Is there anyone who can assist?
Use the % (modulus) operator!
using byte_t = unsigned char;

std::vector< byte_t > xor_key;
std::vector< byte_t > cipher_text;
std::string plain_text;

plain_text.reserve( cipher_text.size( ) );
for( std::size_t i = 0; i < cipher_text.size( ); ++i )
{
    auto const cipher_byte = cipher_text[ i ];
    // i % xor_key.size( ) will "rotate" the index
    auto const key_byte = xor_key[ i % xor_key.size( ) ];
    // xor like usual!
    plain_text += static_cast< char >( cipher_byte ^ key_byte );
}
You just use a loop and a modulus operation to XOR one byte of the key with one byte of the cipher text.
void decrypt(char *cypher, int length, char *key) {
    for (int i = 0; i < length; i++)
        cypher[i] ^= key[i % 10];
}
By indexing the key with the modulus (remainder) of 10, the index will always "rotate" through the values 0 to 9.
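As a quick sanity check, here is that function run against the key and ciphertext from the question (a self-contained sketch; the hard-coded length of 32 matches this particular message):
#include <cstdio>

void decrypt(char *cypher, int length, char *key) {
    for (int i = 0; i < length; i++)
        cypher[i] ^= key[i % 10];
}

int main() {
    // Ciphertext bytes from the question
    char msg[] = "\x15\x58\x2B\x37\x66\x25\x2B\x73\x3F\x2D"
                 "\x61\x5D\x27\x37\x35\x2D\x3F\x36\x72\x3D"
                 "\x2F\x10\x21\x28\x23\x2D\x2A\x73\x26\x31\x39\x44";
    char key[] = "A0BDFLXSRT";   // the 10-byte ASCII key
    decrypt(msg, 32, key);
    printf("%.32s\n", msg);      // prints: This is my message in clear text
    return 0;
}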

Regex / Python3 - re.findall() - Find all occurrences between opcodes

Background
I'm reverse engineering a TCP stream that uses a Type-Length-Value approach to encoding data.
Example:
TCP Payload: b'0000001f001270622e416374696f6e4e6f74696679425243080310840718880e20901c'
---------------------------------------------------------------------------------------
Type: 00 00 # New function call
Length: 00 1f # Length of Value (Length of Function + Function + Data)
Value: 00 12 # Length of Function
Value: 70 62 2e 41 63 74 69 6f 6e 4e 6f 74 69 66 79 42 52 43 # Function ->(hex2ascii)-> pb.ActionNotifyBRC
Value: 08 03 10 84 07 18 88 0e 20 90 1c # Data
However, the Data is a data object that can include multiple variables with variable data lengths.
Data: 08 05 10 04 10 64 18 c8 01 20 ef 0f
----------------------------------------------
Opcode : Value
08 : 05 # var1 : 1 byte
10 : 04 # var2 : 1 byte
18 : c8 01 # var3 : 1-10 bytes
20 : ef 0f # var4 : 1-10 bytes
Currently I am parsing the Data using the following Python3 code:
############################### NOTES ###############################
# Opcodes sometimes rotate starting positions but the general order is always held:
# Data: 20 ef 0f 08 05 10 04 10 64 18 c8 01
#####################################################################
import re
import binascii
def dataVariable(data, start, end):
    p = re.compile(start + b'(.*?)' + end)
    return p.findall(data + data)

data = bytearray.fromhex('08051004106418c80120ef0f')
var3 = dataVariable(data, b'\x18', b'\x20')
print("Variable 3:", end=' ')
for item in set(var3):
    print(binascii.hexlify(item), end=' ')
----------------------------------------------------------------------------
[Output]: Variable 3: b'c801'
So far all good...
Problem
If an Opcode appears in the previous variable's Value, the code is no longer reliable.
Data: 08 05 10 04 10 64 18 c8 20 01 20 ef 0f
----------------------------------------------
Opcode : Value
08 : 05
10 : 04
18 : c8 20 01 # The Value includes the next opcode (20)
20 : ef 0f
----------------------------------------------------------------------------
[Output]: Variable 3: b'c8'
[Output]: Variable 4: b'0120ef0f'
I was expecting an output of:
[Output]: Variable 3: b'c8' b'c82001'
[Output]: Variable 4: b'0120ef0f' b'ef0f'
It seems like there is an issue with my regular expression?
Update
To further clarify, var3 and var4 are representing integers.
I have managed to figure out how the length of the Value was being encoded. The most significant bit was being used as a flag to inform me that another byte was coming. You can then strip the MSB of each byte, swap the endianness and convert to decimal.
data -> binary representation -> strip MSB and swap endianness -> decimal representation
ac d7 05 -> 10101100 11010111 00000101 -> 0001 01101011 10101100 -> 93100
e4 a6 04 -> 11100100 10100110 00000100 -> 0001 00010011 01100100 -> 70500
90 e1 02 -> 10010000 11100001 00000010 -> 10110000 10010000 -> 45200
dc 24 -> 11011100 00100100 -> 00010010 01011100 -> 4700
f0 60 -> 11110000 01100000 -> 00110000 01110000 -> 12400
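For reference, the length encoding described in this update (low 7 bits of each byte, most significant bit as a continuation flag, least significant group first) can be decoded without any string manipulation; it is the same scheme Protocol Buffers calls a varint. A minimal C++ sketch under that reading of the format (the function name is mine):
#include <cstdint>
#include <cstdio>
#include <cstddef>

// Decode one MSB-flagged, little-endian 7-bit-group integer starting at
// buf[pos]; pos is advanced past the bytes consumed.
uint64_t decodeVarint(const unsigned char *buf, size_t len, size_t &pos) {
    uint64_t value = 0;
    int shift = 0;
    while (pos < len) {
        unsigned char b = buf[pos++];
        value |= static_cast<uint64_t>(b & 0x7F) << shift;  // keep low 7 bits
        if ((b & 0x80) == 0)  // MSB clear: this was the last byte
            break;
        shift += 7;
    }
    return value;
}

int main() {
    const unsigned char data[] = { 0xAC, 0xD7, 0x05, 0xDC, 0x24 };
    size_t pos = 0;
    printf("%llu\n", (unsigned long long)decodeVarint(data, sizeof data, pos)); // 93100
    printf("%llu\n", (unsigned long long)decodeVarint(data, sizeof data, pos)); // 4700
    return 0;
}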
You may use
def dataVariable(data, start, end):
    p = re.compile(b'(?=(' + start + b'.*' + end + b'))')
    res = []
    for x in p.findall(data):
        cur = b''
        for i, m in enumerate([x[j:j+1] for j in range(len(x))]):
            if i == 0:
                continue
            if m == end and cur:
                res.append(cur)
            cur = cur + m
    return res
See the Python demo:
data = bytearray.fromhex('08051004106418c8200120ef0f0f')  # => b'c82001' b'c8'
#data = bytearray.fromhex('185618205720')  # => b'56182057' b'2057' b'5618'
var3 = dataVariable(data, b'\x18', b'\x20')
print("Variable 3:", end=' ')
for item in set(var3):
    print(binascii.hexlify(item), end=' ')
The output is Variable 3: b'c8' b'c82001' for the '08051004106418c8200120ef0f0f' input and b'56182057' b'2057' b'5618' for the '185618205720' input.
The pattern is of (?=(...)) type to find all overlapping matches. If you do not need the overlapping feature, remove these parts from the regex.
The point here is:
- match all substrings starting with start and going up to the last end, using the start + b'.*' + end pattern
- iterate through the match, dropping the first start byte, and append the bytes accumulated so far to the resulting list whenever the end byte is found (thus collecting all inner substrings inside the match).

ElGamal encryption example?

I apologise in advance for the n00bishness of asking this question, but I've been stuck for ages and I'm struggling to figure out what to do next. Essentially, I am trying to perform ElGamal encryption on some data. I have been given the public part of an ephemeral key pair and a second static key, as well as some data. If my understanding is correct, this is all I need to perform the encryption, but I'm struggling to figure out how to do it using Crypto++.
I've looked endlessly for examples, but I can find literally zero on Google. Ohloh is less than helpful as I just get back endless pages of the cryptopp ElGamal source files, which I can't seem to be able to figure out (I'm relatively new to using Crypto++ and until about 3 days ago hadn't even heard of ElGamal).
The closest I've been able to find as an example comes from the CryptoPP package itself, which is as follows:
bool ValidateElGamal()
{
cout << "\nElGamal validation suite running...\n\n";
bool pass = true;
{
FileSource fc("TestData/elgc1024.dat", true, new HexDecoder);
ElGamalDecryptor privC(fc);
ElGamalEncryptor pubC(privC);
privC.AccessKey().Precompute();
ByteQueue queue;
privC.AccessKey().SavePrecomputation(queue);
privC.AccessKey().LoadPrecomputation(queue);
pass = CryptoSystemValidate(privC, pubC) && pass;
}
return pass;
}
However, this doesn't really seem to help me much as I'm unaware of how to plug in my already computed values. I am not sure if I'm struggling with my understanding of how Elgamal works (entirely possible) or if I'm just being an idiot when it comes to using what I've got with CryptoPP. Can anyone help point me in the right direction?
I have been given the public part of an ephemeral key pair and a second static key, as well as some data.
We can't really help you here because we know nothing about what is supposed to be done.
The ephemeral key pair is probably for simulating key exchange, and the static key is long term for signing the ephemeral exchange. Other than that, it's anybody's guess as to what's going on.
Would you happen to know what the keys are? Is the ephemeral key a Diffie-Hellman key and the static key an ElGamal signing key?
If my understanding is correct, this is all I need to perform the encryption, but I'm struggling to figure out how using Crypto++.
For the encryption example, I'm going to cheat a bit and use an RSA encryption example and port it to ElGamal. This is about as difficult as copy and paste because both RSA encryption and ElGamal encryption adhere to the PK_Encryptor and PK_Decryptor interfaces. See the PK_Encryptor and PK_Decryptor classes for details. (And keep in mind, you might need an ElGamal or Nyberg-Rueppel (NR) signing example.)
Crypto++ has a cryptosystem built on ElGamal. The cryptosystem will encrypt a large block of plain text under a symmetric key, and then encrypt the symmetric key under the ElGamal key. I'm not sure what standard it follows, though (likely IEEE's P1363). See SymmetricEncrypt and SymmetricDecrypt in elgamal.h.
The key size is artificially small so the program runs quickly. ElGamal is a discrete log problem, so its key size should be 2048-bits or higher in practice. 2048-bits is blessed by ECRYPT (Asia), ISO/IEC (Worldwide), NESSIE (Europe), and NIST (US).
If you need to save/persist/load the keys you generate, then see Keys and Formats on the Crypto++ wiki. The short answer is to call decryptor.Save() and decryptor.Load(); and stay away from the {BER|DER} encodings.
If you want, you can use a standard string rather than a SecByteBlock. The string will be easier if you are interested in printing stuff to the terminal via cout and friends.
Finally, there's now a page on the Crypto++ Wiki covering the topic with the source code for the program below. See Crypto++'s ElGamal Encryption.
#include <iostream>
using std::cout;
using std::cerr;
using std::endl;

#include <cassert>
#include <cstring>

#include <cryptopp/osrng.h>
using CryptoPP::AutoSeededRandomPool;

#include <cryptopp/secblock.h>
using CryptoPP::SecByteBlock;

#include <cryptopp/elgamal.h>
using CryptoPP::ElGamal;
using CryptoPP::ElGamalKeys;

#include <cryptopp/cryptlib.h>
using CryptoPP::DecodingResult;

int main(int argc, char* argv[])
{
    ////////////////////////////////////////////////
    // Generate keys
    AutoSeededRandomPool rng;

    cout << "Generating private key. This may take some time..." << endl;

    ElGamal::Decryptor decryptor;
    decryptor.AccessKey().GenerateRandomWithKeySize(rng, 512);
    const ElGamalKeys::PrivateKey& privateKey = decryptor.AccessKey();

    ElGamal::Encryptor encryptor(decryptor);
    const ElGamalKeys::PublicKey& publicKey = encryptor.AccessKey();

    ////////////////////////////////////////////////
    // Secret to protect
    static const int SECRET_SIZE = 16;
    SecByteBlock plaintext( SECRET_SIZE );
    memset( plaintext, 'A', SECRET_SIZE );

    ////////////////////////////////////////////////
    // Encrypt

    // Now that there is a concrete object, we can validate
    assert( 0 != encryptor.FixedMaxPlaintextLength() );
    assert( plaintext.size() <= encryptor.FixedMaxPlaintextLength() );

    // Create cipher text space
    size_t ecl = encryptor.CiphertextLength( plaintext.size() );
    assert( 0 != ecl );
    SecByteBlock ciphertext( ecl );

    encryptor.Encrypt( rng, plaintext, plaintext.size(), ciphertext );

    ////////////////////////////////////////////////
    // Decrypt

    // Now that there is a concrete object, we can check sizes
    assert( 0 != decryptor.FixedCiphertextLength() );
    assert( ciphertext.size() <= decryptor.FixedCiphertextLength() );

    // Create recovered text space
    size_t dpl = decryptor.MaxPlaintextLength( ciphertext.size() );
    assert( 0 != dpl );
    SecByteBlock recovered( dpl );

    DecodingResult result = decryptor.Decrypt( rng, ciphertext, ciphertext.size(), recovered );

    // More sanity checks
    assert( result.isValidCoding );
    assert( result.messageLength <= decryptor.MaxPlaintextLength( ciphertext.size() ) );

    // At this point, we can set the size of the recovered
    // data. Until decryption occurs (successfully), we
    // only know its maximum size
    recovered.resize( result.messageLength );

    // SecByteBlock is overloaded for proper results below
    assert( plaintext == recovered );

    // If the assert fires, we won't get this far.
    if( plaintext == recovered )
        cout << "Recovered plain text" << endl;
    else
        cout << "Failed to recover plain text" << endl;

    return !(plaintext == recovered);
}
You can also create the Decryptor from a PrivateKey like so:
ElGamalKeys::PrivateKey k;
k.GenerateRandomWithKeySize(rng, 512);
ElGamal::Decryptor d(k);
...
And an Encryptor from a PublicKey:
ElGamalKeys::PublicKey pk;
privateKey.MakePublicKey(pk);
ElGamal::Encryptor e(pk);
You can save and load keys to and from disk as follows:
ElGamalKeys::PrivateKey privateKey1;
privateKey1.GenerateRandomWithKeySize(prng, 2048);
privateKey1.Save(FileSink("elgamal.der", true /*binary*/).Ref());
ElGamalKeys::PrivateKey privateKey2;
privateKey2.Load(FileSource("elgamal.der", true /*pump*/).Ref());
privateKey2.Validate(prng, 3);
ElGamal::Decryptor decryptor(privateKey2);
// ...
The keys are ASN.1 encoded, so you can dump them with something like Peter Gutmann's dumpasn1:
$ ./cryptopp-elgamal-keys.exe
Generating private key. This may take some time...
$ dumpasn1 elgamal.der
0 556: SEQUENCE {
4 257: INTEGER
: 00 C0 8F 5A 29 88 82 8C 88 7D 00 AE 08 F0 37 AC
: FA F3 6B FC 4D B2 EF 5D 65 92 FD 39 98 04 C7 6D
: 6D 74 F5 FA 84 8F 56 0C DD B4 96 B2 51 81 E3 A1
: 75 F6 BE 82 46 67 92 F2 B3 EC 41 00 70 5C 45 BF
: 40 A0 2C EC 15 49 AD 92 F1 3E 4D 06 E2 89 C6 5F
: 0A 5A 88 32 3D BD 66 59 12 A1 CB 15 B1 72 FE F3
: 2D 19 DD 07 DF A8 D6 4C B8 D0 AB 22 7C F2 79 4B
: 6D 23 CE 40 EC FB DF B8 68 A4 8E 52 A9 9B 22 F1
: [ Another 129 bytes skipped ]
265 1: INTEGER 3
268 257: INTEGER
: 00 BA 4D ED 20 E8 36 AC 01 F6 5C 9C DA 62 11 BB
: E9 71 D0 AB B7 E2 D3 61 37 E2 7B 5C B3 77 2C C9
: FC DE 43 70 AE AA 5A 3C 80 0A 2E B0 FA C9 18 E5
: 1C 72 86 46 96 E9 9A 44 08 FF 43 62 95 BE D7 37
: F8 99 16 59 7D FA 3A 73 DD 0D C8 CA 19 B8 6D CA
: 8D 8E 89 52 50 4E 3A 84 B3 17 BD 71 1A 1D 38 9E
: 4A C4 04 F3 A2 1A F7 1F 34 F0 5A B9 CD B4 E2 7F
: 8C 40 18 22 58 85 14 40 E0 BF 01 2D 52 B7 69 7B
: [ Another 129 bytes skipped ]
529 29: INTEGER
: 01 61 40 24 1F 48 00 4C 35 86 0B 9D 02 8C B8 90
: B1 56 CF BD A4 75 FE E2 8E 0B B3 66 08
: }
0 warnings, 0 errors.

Visual Studio Character Sets 'Not set' vs 'Multi byte character set'

I've been working with a legacy application and I'm trying to work out the difference between applications compiled with Multi byte character set and Not Set under the Character Set option.
I understand that compiling with Multi byte character set defines _MBCS which allows multi byte character set code pages to be used, and using Not set doesn't define _MBCS, in which case only single byte character set code pages are allowed.
In the case that Not Set is used, I'm assuming then that we can only use the single byte character set code pages found on this page: http://msdn.microsoft.com/en-gb/goglobal/bb964654.aspx
Therefore, am I correct in thinking that if Not Set is used, the application won't be able to encode and write or read Far Eastern languages, since those are defined in double byte character set code pages (and of course Unicode)?
Following on from this, if Multi byte character set is defined, are both single and multi byte character set code pages available, or only multi byte character set code pages? I'm guessing it must be both for European languages to be supported.
Thanks,
Andy
Further Reading
The answers on these pages didn't answer my question, but helped in my understanding:
About the "Character set" option in visual studio 2010
Research
So, just as working research... With my locale set as Japanese
Effect on hard coded strings
char *foo = "Jap text: テスト";
wchar_t *bar = L"Jap text: テスト";
Compiling with Unicode
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Compiling with Multi byte character set
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Compiling with Not Set
*foo = 4a 61 70 20 74 65 78 74 3a 20 83 65 83 58 83 67 == Shift-Jis (Code page 932)
*bar = 4a 00 61 00 70 00 20 00 74 00 65 00 78 00 74 00 3a 00 20 00 c6 30 b9 30 c8 30 == UTF-16 or UCS-2
Conclusion:
The Character Set setting doesn't have any effect on hard-coded strings: char strings seem to be encoded with the locale's code page, while wchar_t strings use UCS-2 or UTF-16.
Using encoded strings in W/A versions of Win32 APIs
So, using the following code:
char *foo = "C:\\Temp\\テスト\\テa.txt";
wchar_t *bar = L"C:\\Temp\\テスト\\テw.txt";
CreateFileA(foo, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
CreateFileW(bar, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
Compiling with Unicode
Result: Both files are created
Compiling with Multi byte character set
Result: Both files are created
Compiling with Not set
Result: Both files are created
Conclusion:
Both the A and W versions of the API expect the same encoding regardless of the character set chosen. From this, perhaps we can assume that all the Character Set option does is switch which version of the API gets used. So the A version always expects strings in the encoding of the current code page and the W version always expects UTF-16 or UCS-2.
Opening files using W and A Win32 APIs
So using the following code:
char filea[MAX_PATH] = {0};
OPENFILENAMEA ofna = {0};
ofna.lStructSize = sizeof(ofna);
ofna.hwndOwner = NULL;
ofna.lpstrFile = filea;
ofna.nMaxFile = MAX_PATH;
ofna.lpstrFilter = "All\0*.*\0Text\0*.TXT\0";
ofna.nFilterIndex = 1;
ofna.lpstrFileTitle = NULL;
ofna.nMaxFileTitle = 0;
ofna.lpstrInitialDir = NULL;
ofna.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST;

wchar_t filew[MAX_PATH] = {0};
OPENFILENAMEW ofnw = {0};
ofnw.lStructSize = sizeof(ofnw);
ofnw.hwndOwner = NULL;
ofnw.lpstrFile = filew;
ofnw.nMaxFile = MAX_PATH;
ofnw.lpstrFilter = L"All\0*.*\0Text\0*.TXT\0";
ofnw.nFilterIndex = 1;
ofnw.lpstrFileTitle = NULL;
ofnw.nMaxFileTitle = 0;
ofnw.lpstrInitialDir = NULL;
ofnw.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST;

GetOpenFileNameA(&ofna);
GetOpenFileNameW(&ofnw);
and selecting either:
C:\Temp\テスト\テopena.txt
C:\Temp\テスト\テopenw.txt
Yields:
When compiled with Unicode
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
When compiled with Multi byte character set
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
When compiled with Not Set
*filea = 43 3a 5c 54 65 6d 70 5c 83 65 83 58 83 67 5c 83 65 6f 70 65 6e 61 2e 74 78 74 == Shift-Jis (Code page 932)
*filew = 43 00 3a 00 5c 00 54 00 65 00 6d 00 70 00 5c 00 c6 30 b9 30 c8 30 5c 00 c6 30 6f 00 70 00 65 00 6e 00 77 00 2e 00 74 00 78 00 74 00 == UTF-16 or UCS-2
Conclusion:
Again, the Character Set setting doesn't have a bearing on the behaviour of the Win32 API. The A version always seems to return a string with the encoding of the active code page and the W one always returns UTF-16 or UCS-2. I can actually see this explained a bit in this great answer: https://stackoverflow.com/a/3299860/187100.
Ultimate Conclusion
Hans appears to be correct when he says that the define doesn't really have any magic to it, beyond changing the Win32 APIs to use either W or A. Therefore, I can't really see any difference between Not Set and Multi byte character set.
No, that's not really the way it works. The only thing that happens is that the macro gets defined; it doesn't otherwise have a magic effect on the compiler. It is very rare to actually write code that uses #ifdef _MBCS to test this macro.
You almost always leave it up to a helper function to make the conversion, like WideCharToMultiByte(), OLE2A() or wcstombs(), which are conversion functions that always consider multi-byte encodings, as guided by the code page. _MBCS is a historical accident, relevant only 25+ years ago when multi-byte encodings were not common yet. Much like using a non-Unicode encoding is a historical artifact these days as well.
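For illustration, an explicit conversion of the kind this answer recommends might look like the following (a minimal sketch; the sample string and buffer size are mine):
#include <windows.h>
#include <stdio.h>

int main(void) {
    const wchar_t *wide = L"テスト";
    char narrow[64] = {0};
    // Convert from UTF-16 to the active ANSI code page (CP_ACP),
    // the same code page the A-suffixed APIs use.
    int len = WideCharToMultiByte(CP_ACP, 0, wide, -1,
                                  narrow, sizeof(narrow), NULL, NULL);
    if (len == 0) {
        printf("conversion failed: %lu\n", GetLastError());
        return 1;
    }
    printf("%s\n", narrow);
    return 0;
}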
In the reference it is stated that:
By definition, the ASCII character set is a subset of all multibyte-character sets. In many multibyte character sets, each character in the range 0x00 - 0x7F is identical to the character that has the same value in the ASCII character set. For example, in both ASCII and MBCS character strings, the 1-byte NULL character ('\0') has value 0x00 and indicates the terminating null character.
As you guessed, enabling _MBCS means Visual Studio also supports the single-byte ASCII character set.
A second reference indicates that single-byte character sets are supported even when _MBCS is enabled:
MBCS/Unicode portability: Using the Tchar.h header file, you can build single-byte, MBCS, and Unicode applications from the same sources. Tchar.h defines macros prefixed with _tcs, which map to str, _mbs, or wcs functions, as appropriate. To build MBCS, define the symbol _MBCS. To build Unicode, define the symbol _UNICODE. By default, _MBCS is defined for MFC applications. For more information, see Generic-Text Mappings in Tchar.h.
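That Tchar.h mapping is easy to see in a small program; the same source compiles as narrow or wide depending on the project's Character Set (a sketch, with the path string chosen to echo the tests above):
#include <tchar.h>
#include <stdio.h>

int main(void) {
    // With Unicode, TCHAR is wchar_t and _T("...") is L"...";
    // with Multi byte character set or Not Set, TCHAR is char
    // and _T("...") is a plain narrow literal.
    const TCHAR *path = _T("C:\\Temp\\test.txt");
    _tprintf(_T("%s\n"), path);  // maps to wprintf or printf
    return 0;
}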

Binary File interpretation

I am reading in a binary file (in C++), and the header is something like this (printed in hexadecimal):
43 27 41 1A 00 00 00 00 23 00 00 00 00 00 00 00 04 63 68 72 31 FFFFFFB4 01 00 00 04 63 68 72 32 FFFFFFEE FFFFFFB7
when printed out using:
std::cout << hex << (int)mem[c];
Is there an efficient way to store 23, which is the 9th byte, into an integer without using stringstream? Or is stringstream the best way?
Something like
int n = mem[8];
I want to store 23 in n, not 35.
You did store 23 in n. You only see 35 because you are outputting it with a routine that converts it to decimal for display. If you could look at the binary data inside the computer, you would see that it is in fact a hex 23.
You will get the same result as if you did:
int n=0x23;
(What you might think you want is impossible. What number should be stored in n for 1E? The only corresponding number is 30, which is what you are getting.)
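To see the distinction, a quick check (a sketch; the mem contents are copied from the header bytes above):
#include <iostream>

int main() {
    unsigned char mem[] = { 0x43, 0x27, 0x41, 0x1A, 0x00, 0x00, 0x00, 0x00, 0x23 };
    int n = mem[8];  // n holds the value 0x23, i.e. decimal 35
    std::cout << std::hex << n << '\n';  // prints 23
    std::cout << std::dec << n << '\n';  // prints 35
    return 0;
}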
Do you mean you want to treat the value as binary-coded decimal? In that case, you could convert it using something like:
unsigned char bcd = mem[8];
unsigned char ones = bcd % 16;
unsigned char tens = bcd / 16;
if (ones > 9 || tens > 9) {
    // handle error
}
int n = 10*tens + ones;