Edits and Updates
3/24/2013:
My output hash from Python now matches the hash from C++ after converting to UTF-16 and stopping before hitting any 'e' or 'm' bytes. However, the decrypted results do not match. I know that my SHA1 hash is 20 bytes = 160 bits and that RC4 keys can vary in length from 40 to 2048 bits, so perhaps there is some default salting going on in WinCrypt that I will need to mimic (CryptGetKeyParam with KP_LENGTH or KP_SALT should tell me).
3/24/2013:
CryptGetKeyParam KP_LENGTH is telling me that my key length is 128 bits. I'm feeding it a 160-bit hash, so perhaps it's just discarding the last 32 bits... or 4 bytes. Testing now.
3/24/2013:
Yep, that was it. If I discard the last 4 bytes of my SHA1 hash in Python... I get the same decryption results.
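For reference, the length check looks roughly like this in C++ (a sketch only; hKey is the key handle returned by CryptDeriveKey in the skeleton below, and error handling is omitted):
DWORD keyLenBits = 0;
DWORD cbLen = sizeof(keyLenBits);
if (CryptGetKeyParam(hKey, KP_LENGTH, (BYTE*)&keyLenBits, &cbLen, 0)) {
    //reports 128 for an RC4 key derived with the default flags,
    //i.e. only the first 16 bytes of the SHA1 digest are used
    printf("derived key length: %lu bits\n", keyLenBits);
}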
Quick Info:
I have a C++ program that decrypts a data block. It uses the Windows Cryptographic Service Provider, so it only works on Windows. I would like it to work on other platforms.
Method Overview:
In Windows Crypto API
An ASCII-encoded password (a byte string) is converted to a wide-character representation and then hashed with SHA1 to make a key for an RC4 stream cipher.
In Python PyCrypto
An ASCII-encoded byte string is decoded to a Python string. It is truncated based on empirically observed bytes which cause mbstowcs to stop converting in C++. This truncated string is then encoded as UTF-16, effectively padding it with 0x00 bytes between the characters. This new truncated, padded byte string is passed to a SHA1 hash, and the first 128 bits of the digest are passed to a PyCrypto RC4 object.
Problem [SOLVED]
I can't seem to get the same results with Python 3.x w/ PyCrypto
C++ Code Skeleton:
HCRYPTPROV hProv = 0x00;
HCRYPTHASH hHash = 0x00;
HCRYPTKEY hKey = 0x00;
wchar_t sBuf[256] = {0};
CryptAcquireContextW(&hProv, L"FileContainer", L"Microsoft Enhanced RSA and AES Cryptographic Provider", 0x18u, 0);
CryptCreateHash(hProv, 0x8004u, 0, 0, &hHash);
//0x8004u is SHA1 flag
int len = mbstowcs(sBuf, iRec->desc, sizeof(sBuf));
//iRec is my "Record" class
//iRec->desc is 33 bytes within header of my encrypted file
//this will be used to create the hash key. (So this is the password)
CryptHashData(hHash, (const BYTE*)sBuf, len, 0);
CryptDeriveKey(hProv, 0x6801, hHash, 0, &hKey);
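//0x6801 is CALG_RC4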
DWORD dataLen = iRec->compLen;
//iRec->compLen is the length of encrypted datablock
//it's also compressed that's why it's called compLen
CryptDecrypt(hKey, 0, 0, 0, (BYTE*)iRec->decrypt, &dataLen);
// iRec is my record that I'm decrypting
// iRec->decrypt is where I store the decrypted data
//&dataLen is how long the encrypted data block is.
//I get this from file header info
Python Code Skeleton:
from Crypto.Cipher import ARC4
from Crypto.Hash import SHA
#this is the Decipher method from my record class
def Decipher(self):
    #get string representation of the 33-byte password
    key_string = self.desc.decode('ASCII')
    #so far, these characters fail; possibly others, but
    #for now I will make it a list
    stop_chars = ['e', 'm']
    #slice off anything beyond where mbstowcs will stop
    for char in stop_chars:
        wc_stop = key_string.find(char)
        if wc_stop != -1:
            #slice operation
            key_string = key_string[:wc_stop]
    #make it "wide character"
    #this is equivalent to padding the bytes with 0x00
    #slice off the two-byte "Byte Order Mark" 0xff 0xfe
    wc_byte_string = key_string.encode('utf-16')[2:]
    #slice off the trailing 0x00
    wc_byte_string = wc_byte_string[:len(wc_byte_string) - 1]
    #hash the "wchar" byte string
    #this is the equivalent of sBuf in the C++ code above
    #as determined by writing sBuf to a file in tests
    my_key = SHA.new(wc_byte_string).digest()
    #create a PyCrypto cipher object
    RC4_Cipher = ARC4.new(my_key[:16])
    #store the decrypted data... these results NOW MATCH
    self.decrypt = RC4_Cipher.decrypt(self.datablock)
Suspected [EDIT: Confirmed] Causes
1. mbstowcs conversion of the password meant that the "original data" fed to the SHA1 hash was not the same in Python and C++. mbstowcs was stopping conversion at 0x65 ('e') and 0x6D ('m') bytes, so the hashed data was a wide-character encoding of only part of the original 33-byte password.
2. RC4 can have variable-length keys. In the Enhanced WinCrypt service provider, the default length is 128 bits. Leaving the key length unspecified meant that only the first 128 bits of the 160-bit SHA1 digest of the "original data" were used as the key.
How I investigated
edit: based on my own experimenting and the suggestions of @RolandSmith, I now know that one of my problems was mbstowcs behaving in a way I wasn't expecting. It seems to stop writing to sBuf on "e" (0x65) and "m" (0x6d) (probably others). So the password "Monkey" in my description (ASCII-encoded bytes) would look like "M o n k" in sBuf, because mbstowcs stopped at the "e" and placed 0x00 between the bytes based on the 2-byte wchar_t typedef on my system. I found this by writing the results of the conversion to a text file.
BYTE pbHash[256]; //buffer we will store the hash digest in
DWORD dwHashLen; //store the length of the hash
DWORD dwCount;
dwCount = sizeof(DWORD); //how big is a dword on this system?
//see above "len" is the return value from mbstowcs that tells how
//many multibyte characters were converted from the original
//iRec->desc and placed into sBuf. In some cases it's 3, 7, 9
//and always seems to stop on "e" or "m"
fstream outFile4("C:/desc_mbstowcs.txt", ios::out | ios::trunc | ios::binary);
outFile4.write((const CHAR*)sBuf, int(len));
outFile4.close();
//now get the hash size from CryptGetHashParam
//and get the actual hash from the hash object hHash
//write it to a file.
if(CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)&dwHashLen, &dwCount, 0)) {
if(CryptGetHashParam(hHash, 0x0002, pbHash, &dwHashLen,0)){
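//0x0002 is HP_HASHVAL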
fstream outFile3("C:/test_hash.txt", ios::out | ios::trunc | ios::binary);
outFile3.write((const CHAR*)pbHash, int(dwHashLen));
outFile3.close();
}
}
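A stand-alone check (not part of the original tool, just a sanity test) is also a quick way to see what mbstowcs does with a given byte sequence under the current locale; a return value of (size_t)-1 means it hit an invalid multibyte sequence:
#include <clocale> /* setlocale */
#include <cstdlib> /* mbstowcs */
#include <cstdio>  /* printf */
int main() {
    setlocale(LC_CTYPE, "");
    const char desc[] = "Monkey"; //stand-in for iRec->desc
    wchar_t wbuf[256] = {0};
    size_t n = mbstowcs(wbuf, desc, 256);
    if (n == (size_t)-1)
        printf("invalid multibyte sequence\n");
    else
        printf("converted %ld wide characters\n", (long)n);
    return 0;
}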
References:
wide characters cause problems depending on environment definition
Difference in Windows Cryptography Service between VC++ 6.0 and VS 2008
convert a utf-8 to utf-16 string
Python - converting wide-char strings from a binary file to Python unicode strings
PyCrypto RC4 example
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.ARC4-module.html
Hashing a string with Sha256
http://msdn.microsoft.com/en-us/library/windows/desktop/aa379916(v=vs.85).aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/aa375599(v=vs.85).aspx
You can test the size of wchar_t with a small test program (in C):
#include <stdio.h> /* for printf */
#include <stddef.h> /* for wchar_t */
int main(int argc, char *argv[]) {
printf("The size of wchar_t is %ld bytes.\n", sizeof(wchar_t));
return 0;
}
You could also use printf() calls in your C++ code to write e.g. iRec->desc and the converted string in sBuf to the screen, if you can run the C++ program from a terminal. Otherwise use fprintf() to dump them to a file.
To better mimic the behavior of the C++ program, you could even use ctypes to call mbstowcs() in your Python code.
Edit: You wrote:
One problem is definitely with mbstowcs. It seems that it's transferring an unpredictable (to me) number of bytes into my buffer to be hashed.
Keep in mind that mbstowcs returns the number of wide characters converted. In other words, a 33-byte buffer in a multi-byte encoding can contain anything from 5 (UTF-8 6-byte sequences) up to 33 characters, depending on the encoding used.
Edit2: You are using 0 as the dwFlags parameter for CryptDeriveKey. According to its documentation, the upper 16 bits should contain the key length. You should check CryptDeriveKey's return value to see if the call succeeded.
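For example (a sketch only, reusing hProv, hHash and hKey from your skeleton; CALG_RC4 is 0x6801):
//request the key length explicitly (upper 16 bits of dwFlags) and check the result
if (!CryptDeriveKey(hProv, CALG_RC4, hHash, 128 << 16, &hKey)) {
    printf("CryptDeriveKey failed: 0x%08lX\n", GetLastError());
}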
Edit3: You could test mbstowcs in Python (I'm using IPython here):
In [1]: from ctypes import *
In [2]: libc = CDLL('libc.so.7')
In [3]: monkey = c_char_p(u'Monkey')
In [4]: test = c_char_p(u'This is a test')
In [5]: wo = create_unicode_buffer(256)
In [6]: nref = c_size_t(250)
In [7]: libc.mbstowcs(wo, monkey, nref)
Out[7]: 6
In [8]: print wo.value
Monkey
In [9]: libc.mbstowcs(wo, test, nref)
Out[9]: 14
In [10]: print wo.value
This is a test
Note that in Windows you should probably use libc = cdll.msvcrt instead of libc = CDLL('libc.so.7').
Related
I have a question about char16_t handling and SHA-256 generation with OpenSSL.
The thing is, I'm currently writing code that should deal with password hashing. I've generated a 256-bit hash, and I want to store it in the database in a UTF-16 encoded character field. In my C++ code, I use char16_t to store such data. However, there is a problem: char16_t can be more than 16 bits, depending on the machine it ends up on. And if I use memcpy() to copy bytes from my SHA-256 hash, it may turn out to be a mess on some machines.
What should I do in this situation? Read bytes differently, store hashes in the database differently, maybe something else?
SHA256 generates 256 essentially random bits (32 bytes) of data. It will not always generate valid UTF-16 data.
You need to somehow encode the 32 bytes as UTF-16 text (which will take more than 32 bytes) to store them in your database, or you can convert the database field to a proper 256-bit binary type.
One of the easier-to-implement ways to store it in your DB as a string would be to map each byte to a char16_t 1-to-1 (storing the 32 data bytes interleaved with 32 zero high bytes):
#include <cassert>
#include <cstdint>
#include <iterator> // std::size (C++17)

// writing side
unsigned char sha256_hash[256/8];
get_hash(sha256_hash);
// encoding
char16_t db_data[256/8];
for (std::size_t i = 0; i < std::size(db_data); ++i) {
    db_data[i] = char16_t(sha256_hash[i]);
}
write_to_db(db_data);

// reading side (a separate snippet)
char16_t db_data[256/8];
read_from_db(db_data);
// decoding
unsigned char sha256_hash[256/8];
for (std::size_t i = 0; i < std::size(sha256_hash); ++i) {
    assert((std::uint16_t) db_data[i] <= 0xFF);
    sha256_hash[i] = (unsigned char) db_data[i];
}
Be careful if you are using null-terminated strings though. You will need an extra character for the null terminator and map the 0 byte to something else (0x100 would be a good choice).
But if you have additional requirements (like the stored value being readable text), you might consider base64 or a hexadecimal encoding.
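For example, a hexadecimal encoding is easy to write by hand (a minimal sketch; base64 needs a library or a bit more code):
#include <cstddef>
#include <string>

// Hex-encode a digest: every byte becomes two ASCII characters, so a 32-byte
// SHA-256 digest becomes a 64-character string that is valid in any text column.
std::string to_hex(const unsigned char *hash, std::size_t len) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(len * 2);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(digits[hash[i] >> 4]);
        out.push_back(digits[hash[i] & 0x0F]);
    }
    return out;
}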
I have code that parses OpenPGP packets and I have n, e of the public key packet as well as s of the signature packet as byte arrays.
In order to verify a signature, I first acquire a context with CryptAcquireContext (I also tried PROV_RSA_FULL instead of PROV_RSA_AES):
HCRYPTPROV hCryptProv;
CryptAcquireContext(&hCryptProv, nullptr, nullptr, PROV_RSA_AES, CRYPT_VERIFYCONTEXT);
then create a hash
HCRYPTHASH hHash;
CryptCreateHash(hCryptProv, CALG_SHA1, 0, 0, &hHash); // as the digest algorithm of the signature was 2 => SHA1
and populate it using CryptHashData. This works so far, as does parsing and importing the public key using CryptImportKey.
typedef struct _RSAKEY
{
BLOBHEADER blobheader;
RSAPUBKEY rsapubkey;
BYTE n[4096 / 8];
} RSAKEY;
static int verify_signature_rsa(HCRYPTPROV hCryptProv, HCRYPTHASH hHash, public_key_t &p_pkey, signature_packet_t &p_sig)
{
int i_n_len = mpi_len(p_pkey.key.sig.rsa.n); // = 512; p_pkey.key.sig.rsa.n is of type uint8_t n[2 + 4096 / 8];
int i_s_len = mpi_len(p_sig.algo_specific.rsa.s); // = 256; p_sig.algo_specific.rsa.s is of type uint8_t s[2 + 4096 / 8]
HCRYPTKEY hPubKey;
RSAKEY rsakey;
rsakey.blobheader.bType = PUBLICKEYBLOB; // 0x06
rsakey.blobheader.bVersion = CUR_BLOB_VERSION; // 0x02
rsakey.blobheader.reserved = 0;
rsakey.blobheader.aiKeyAlg = CALG_RSA_KEYX;
rsakey.rsapubkey.magic = 0x31415352;// ASCII for RSA1
rsakey.rsapubkey.bitlen = i_n_len * 8; // = 4096
rsakey.rsapubkey.pubexp = 65537;
memcpy(rsakey.n, p_pkey.key.sig.rsa.n + 2, i_n_len); // skip the first two bytes, which are the MPI length
std::reverse(rsakey.n, rsakey.n + i_n_len); // need to convert to little endian for WinCrypt
CryptImportKey(hCryptProv, (BYTE*)&rsakey, sizeof(BLOBHEADER) + sizeof(RSAPUBKEY) + i_n_len, 0, 0, &hPubKey); // no error
std::unique_ptr<BYTE[]> pSig(new BYTE[i_s_len]);
memcpy(pSig.get(), p_sig.algo_specific.rsa.s + 2, i_s_len); // skip the first two bytes, which are the MPI length
std::reverse(p_sig.algo_specific.rsa.s, p_sig.algo_specific.rsa.s + i_s_len); // need to convert to little endian for WinCrypt
if (!CryptVerifySignature(hHash, pSig.get(), i_s_len, hPubKey, nullptr, 0))
{
DWORD err = GetLastError(); // err=2148073478 -> INVALID_SIGNATURE
CryptDestroyKey(hPubKey);
return -1;
}
CryptDestroyKey(hPubKey);
return 0;
}
CryptVerifySignature fails with GetLastError() decoding to INVALID_SIGNATURE.
On https://www.rfc-editor.org/rfc/rfc4880#section-5.2.2 I read
With RSA signatures, the hash value is encoded using PKCS#1 encoding
type EMSA-PKCS1-v1_5 as described in Section 9.2 of RFC 3447. This
requires inserting the hash value as an octet string into an ASN.1
structure.
Is that needed or is that automatically done by CryptVerifySignature? If not, how to do that?
The PKCS#1 padding is not likely to be the problem. The fact that an OID for the hash algorithm is included by default points to PKCS#1 v1.5 type signatures, so I think you can rest assured that the right padding is used.
More confirmation can be found in the CryptSignHash documentation:
By default, the Microsoft RSA providers use the PKCS #1 padding method for the signature. The hash OID in the DigestInfo element of the signature is automatically set to the algorithm OID associated with the hash object. Using the CRYPT_NOHASHOID flag will cause this OID to be omitted from the signature.
Looking through the API documentation, the following caught my eye:
The native cryptography API uses little-endian byte order while the .NET Framework API uses big-endian byte order. If you are verifying a signature generated by using a .NET Framework API, you must swap the order of signature bytes before calling the CryptVerifySignature function to verify the signature.
This does mean that the API is not PKCS#1 v1.5 compliant as the byte order is explicitly specified therein. This is therefore certainly something to be aware of and could be part of a solution.
The error was in this line
std::reverse(p_sig.algo_specific.rsa.s, p_sig.algo_specific.rsa.s + i_s_len); // need to convert to little endian for WinCrypt
which should read
std::reverse(pSig.get(), pSig.get() + i_s_len); // need to convert to little endian for WinCrypt
because reversing the source buffer after the copy has already been made does nothing to the copy (pSig) that is actually passed to CryptVerifySignature.
I have a std::string output. Using utf8proc, I would like to transform it into a valid UTF-8 string.
http://www.public-software-group.org/utf8proc-documentation
typedef int int32_t;
#define ssize_t int
ssize_t utf8proc_reencode(int32_t *buffer, ssize_t length, int options)
Reencodes the sequence of unicode characters given by the pointer buffer and length as UTF-8. The result is stored in the same memory area where the data is read. Following flags in the options field are regarded: (Documentation missing here) In case of success the length of the resulting UTF-8 string is returned, otherwise a negative error code is returned.
WARNING: The amount of free space being pointed to by buffer, has to exceed the amount of the input data by one byte, and the entries of the array pointed to by str have to be in the range of 0x0000 to 0x10FFFF, otherwise the program might crash!
So first, how do I add an extra byte at the end? Then how do I convert from std::string to int32_t *buffer?
This does not work:
std::string g = output();
fprintf(stdout,"str: %s\n",g.c_str());
g += " "; //add an extra byte??
g = utf8proc_reencode((int*)g.c_str(), g.size()-1, 0);
fprintf(stdout,"strutf8: %s\n",g.c_str());
You very likely don't actually want utf8proc_reencode() - that function takes a valid UTF-32 buffer and turns it into a valid UTF-8 buffer, but since you say you don't know what encoding your data is in, you can't use that function.
So, first you need to figure out what encoding your data is actually in. You can use http://utfcpp.sourceforge.net/ to test whether you already have valid UTF-8 with utf8::is_valid(g.begin(), g.end()). If that's true, you're done!
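A minimal check might look like this (a sketch, assuming the header-only utf8.h from the utfcpp project linked above):
#include <string>
#include "utf8.h" // utfcpp, header-only

bool looks_like_utf8(const std::string &s) {
    return utf8::is_valid(s.begin(), s.end());
}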
If false, things get complicated...but ICU ( http://icu-project.org/ ) can help you; see http://userguide.icu-project.org/conversion/detection
Once you somewhat reliably know what encoding your data is in, ICU can help again with getting it to UTF-8. For example, assuming your source data g is in ISO-8859-1:
#include <vector>
#include <unicode/ucnv.h> // ICU C converter API

UErrorCode err = U_ZERO_ERROR; // check this after every call...
// CONVERT FROM ISO-8859-1 TO UChar
UConverter *conv_from = ucnv_open("ISO-8859-1", &err);
std::vector<UChar> converted(g.size()*2); // *2 is usually more than enough
int32_t conv_len = ucnv_toUChars(conv_from, &converted[0], converted.size(), g.c_str(), g.size(), &err);
converted.resize(conv_len);
ucnv_close(conv_from);
// CONVERT FROM UChar TO UTF-8
g.resize(converted.size()*4);
UConverter *conv_u8 = ucnv_open("UTF-8", &err);
int32_t u8_len = ucnv_fromUChars(conv_u8, &g[0], g.size(), &converted[0], converted.size(), &err);
g.resize(u8_len);
ucnv_close(conv_u8);
after which your g is now holding UTF-8 data.
I am using this simple function for decrypting an AES-encrypted string:
unsigned char *aes_decrypt(EVP_CIPHER_CTX *e, unsigned char *ciphertext, int *len)
{
int p_len = *len, f_len = 0;
unsigned char *plaintext = (unsigned char*)malloc(p_len + 128);
memset(plaintext,0,p_len);
EVP_DecryptInit_ex(e, NULL, NULL, NULL, NULL);
EVP_DecryptUpdate(e, plaintext, &p_len, ciphertext, *len);
EVP_DecryptFinal_ex(e, plaintext+p_len, &f_len);
*len = p_len + f_len;
return plaintext;
}
The problem is that len comes back with a value that does not match the length of the entire decoded string. What could be the problem?
When you say "string", I assume you mean a zero-terminated textual string. The encryption process is dependent on a cipher block size, and oftentimes padding. What's actually being encoded and decoded is up to the application... it's all binary data to the cipher. If you're textual string is smaller than what's returned from the decrypt process, your application needs to determine the useful part. So for example if you KNOW your string inside the results is zero-terminated, you can get the length doing a simple strlen. That's risky of course if you can't guarantee the input... probably better off searching the results for a null up to the decoded length...
If you are using a cipher in ECB, CBC or some other chaining mode, the plaintext must be padded to a length that is a multiple of the cipher block length. See the PKCS#5 standard, for example. High-level functions like those in OpenSSL can perform the padding transparently for the programmer. So the encrypted text can be larger than the plaintext by up to an additional cipher block.
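As a rough illustration of the size difference (just the arithmetic, not library code):
// With PKCS#5/PKCS#7 padding the ciphertext is rounded up to the next full block,
// so it is always 1..block_size bytes longer than the plaintext. For example, a
// 13-byte plaintext with a 16-byte AES block gives 16 bytes of ciphertext, and
// EVP_DecryptFinal_ex() strips the 3 padding bytes again on decryption.
size_t padded_len(size_t plain_len, size_t block_size) {
    return (plain_len / block_size + 1) * block_size;
}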
I am trying to decrypt a piece of a file with WinCrypt and I cannot seem to make this function decrypt correctly. The bytes are encrypted with the RC2 implementation in C# and I am supplying the same password and IV to both the encryption and decryption process (encrypted in C#, decrypted in C++).
All of my functions along the way are returning true until the final "CryptDecrypt" function. Instead of me typing out any more, here is the function:
static char* DecryptMyFile(char *input, char *password, int size)
{
HCRYPTPROV provider = NULL;
if(CryptAcquireContext(&provider, NULL, MS_ENHANCED_PROV, PROV_RSA_FULL, 0))
{printf("Context acquired.");}
else
{
if (GetLastError() == NTE_BAD_KEYSET)
{
if(CryptAcquireContext(&provider, 0, NULL, PROV_RSA_FULL, CRYPT_NEWKEYSET))
{printf("new key made.");}
else
{
printf("Could not acquire context.");
}
}
else
{printf("Could not acquire context.");}
}
HCRYPTKEY key = NULL;
HCRYPTHASH hash = NULL;
if(CryptCreateHash(provider, CALG_MD5, 0, 0, &hash))
{printf("empty hash created.");}
else
{printf("could not create hash.");}
if(CryptHashData(hash, (BYTE *)password, strlen(password), 0))
{printf("data buffer is added to hash.");}
else
{printf("error. could not add data buffer to hash.");}
if(CryptDeriveKey(provider, CALG_RC2, hash, 0, &key))
{printf("key derived.");}
else
{printf("Could not derive key.");}
DWORD dwKeyLength = 128;
if(CryptSetKeyParam(key, KP_EFFECTIVE_KEYLEN, reinterpret_cast<BYTE*>(&dwKeyLength), 0))
{printf("success");}
else
{printf("failed.");}
BYTE IV[8] = {0,0,0,0,0,0,0,0};
if(CryptSetKeyParam(key, KP_IV, IV, 0))
{printf("worked");}
else
{printf("faileD");}
DWORD dwCount = size;
BYTE *decrypted = new BYTE[dwCount + 1];
memcpy(decrypted, input, dwCount);
decrypted[dwCount] = 0;
if(CryptDecrypt(key,0, true, 0, decrypted, &dwCount))
{printf("succeeded");}
else
{printf("failed");}
return (char *)decrypted;
}
input is the data passed to the function, encrypted. password is the same password used to encrypt the data in C#. size is the size of the data while encrypted.
All of the above functions return true until CryptDecrypt, and I cannot figure out why that one fails. At the same time, I'm not sure how the CryptDecrypt function could possibly modify my "decrypted" variable, since I am not passing a reference to it.
Any help or advice on why this is not working would be greatly appreciated. This is my first endeavour with WinCrypt and the first time I've used C++ in years.
If it is of any further help, this is my encryption routine (in C#):
public static byte[] EncryptString(byte[] input, string password)
{
PasswordDeriveBytes pderiver = new PasswordDeriveBytes(password, null);
byte[] ivZeros = new byte[8];
byte[] pbeKey = pderiver.CryptDeriveKey("RC2", "MD5", 128, ivZeros);
RC2CryptoServiceProvider RC2 = new RC2CryptoServiceProvider();
//using an empty initialization vector for convenience.
byte[] IV = new byte[8];
ICryptoTransform encryptor = RC2.CreateEncryptor(pbeKey, IV);
MemoryStream msEncrypt = new MemoryStream();
CryptoStream csEncrypt = new CryptoStream(msEncrypt, encryptor, CryptoStreamMode.Write);
csEncrypt.Write(input, 0, input.Length);
csEncrypt.FlushFinalBlock();
return msEncrypt.ToArray();
}
I have confirmed that my hash value in C++ is identical to my key in C#, created by PasswordDeriveBytes.CryptDeriveKey
First, as in my comment, use GetLastError() so you know why it failed. I'll assume that you get NTE_BAD_DATA; all the other errors are much easier to deal with, since they basically mean you missed some step in the API call sequence.
The typical reason why CryptDecrypt would fail with NTE_BAD_DATA is that you're decrypting the last block of a block cipher (as you are) and the decrypted padding bytes are incorrect. This can happen if the input is truncated (not all encrypted bytes were saved to the file) or if the key is incorrect.
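Concretely, capture the error code right after the failing call (a sketch, using the variables from your function):
if (!CryptDecrypt(key, 0, true, 0, decrypted, &dwCount))
{
    // NTE_BAD_DATA (0x80090005) on a final block usually means a wrong key,
    // a wrong effective key length, or truncated ciphertext.
    printf("CryptDecrypt failed: 0x%08lX\n", GetLastError());
}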
I would suggest you take this methodically, since there are so many places where this can fail that all manifest only at CryptDecrypt time:
1. Ensure that the file you encrypt in C# can be decrypted in C#. This eliminates any file-save truncation issues.
2. Try to encrypt and decrypt with a fixed, hard-coded key first (not password-derived). This ensures that your key setup and IV initialization are correct (as well as the padding mode and cipher chaining mode).
3. Ensure that the password derivation process arrives at the same hash. Things like ANSI vs. Unicode or a terminating 0 can wreak havoc on the MD5 hash and result in wildly different keys from apparently the same password.
Some people have discovered issues when moving between operating systems.
The CryptDeriveKey call uses a "default key length" based on the operating system and algorithm chosen. For RC2, the default generated key length is 40 bits on Windows 2000 and 128 bits on Windows 2003. This results in a "BAD DATA" return code when the generated key is used in a CryptDecrypt call.
Presumably this is related to "garbage" appearing at the end of the final buffer after trying to apply a 128-bit key to decrypt a 40-bit encrypted stream. The error code typically indicates bad padding bytes - but the root cause may be a key generation issue.
To generate a 40-bit encryption key, use (40 << 16) in the dwFlags field of the CryptDeriveKey call.
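In code (a sketch, with provider, hash and key as in the question's function):
// Request a 40-bit RC2 key explicitly so the same ciphertext decrypts regardless
// of the provider's default; the key length goes in the upper 16 bits of dwFlags.
if (!CryptDeriveKey(provider, CALG_RC2, hash, (40 << 16), &key))
    printf("Could not derive 40-bit key: 0x%08lX\n", GetLastError());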