I'm trying to implement AES cryptography using the AES machine instructions (basing it on Intel's white paper) available on my Sandy Bridge. Unfortunately, I've come to a halt in the phase of generating the round keys for decryption. Specifically, the instruction aesimc (applying the Inverse Mix Columns operation) returns an incorrect result.
Their white paper includes a worked example.
With the input:
48 69 28 53 68 61 79 29 5B 47 75 65 72 6F 6E 5D
I get the following using _mm_aesimc_si128():
2D BF F9 31 99 CD 3A 37 B7 C7 81 FD 7D E0 3D 8E
It should have returned:
62 7A 6F 66 44 B1 09 C8 2B 18 33 0A 81 C3 B3 E5
Not the same result. Why is this the case?
If you want to reproduce it, I tested it with the code below (remember the arguments -maes -msse4 when compiling):
#include <wmmintrin.h>
#include <iostream>
using namespace std;

void print_m128i(__m128i data) {
    unsigned char *ptr = (unsigned char*) &data;
    for (int i = 0; i < 16; i++) {
        int val = (int) ptr[i];
        if (val < 0x10) {   // zero-pad single hex digits
            cout << "0";
        }
        cout << uppercase << hex << val << " ";
    }
    cout << endl;
}

int main() {
    unsigned char *data = (unsigned char*)
        "\x48\x69\x28\x53\x68\x61\x79\x29\x5B\x47\x75\x65\x72\x6F\x6E\x5D";
    __m128i num = _mm_loadu_si128((__m128i*) data);
    __m128i num2 = _mm_aesimc_si128(num);
    print_m128i(num2);
    return 0;
}
EDIT: The example in Intel's white paper was wrong, or rather, printed in the opposite byte order. As Hans suggested, my chip is little-endian, so byte-swapping is necessary in both directions.
The bytes are backwards. You want 0x5d to be the least significant byte, so it has to come first: this is a little-endian chip. In VS, use Debug + Windows + Registers, right-click and tick SSE to see the register values.
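For anyone reproducing this, here is a minimal sketch (mine, not from the white paper) that byte-swaps around the intrinsic so the printed result matches the paper's listing; _mm_shuffle_epi8 needs SSSE3, which -msse4 already enables:

#include <wmmintrin.h>   // _mm_aesimc_si128
#include <tmmintrin.h>   // _mm_shuffle_epi8 (SSSE3)

static __m128i reverse_bytes(__m128i v) {
    // index 15 goes to byte 0, 14 to byte 1, ... (full 16-byte reversal)
    const __m128i rev = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                     8, 9, 10, 11, 12, 13, 14, 15);
    return _mm_shuffle_epi8(v, rev);
}

// usage inside main(): swap in, apply AESIMC, swap back out
// __m128i num2 = reverse_bytes(_mm_aesimc_si128(reverse_bytes(num)));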
Can someone tell me what's wrong with my code?
It works fine in my test example, but when I use it in the production code it decrypts the string but appends padding, apparently to maintain some kind of block size.
I didn't post my encrypt/decrypt methods because they would make this post too big, and they work fine: the test example encrypts and decrypts properly. ini.GetValue is an INI retrieval method; there is nothing wrong with it, and you can see the Base64 size is the same as in the example code, so I believe it works fine. I never had any problems with it before, when I used it without encryption; it returns a const char*. The problem is visible below: the production code ciphertext has 2 null bytes appended to it, which I find strange because both pieces of code are pretty much identical. I'm not good at C++, so I'm probably overlooking some basic char array stuff.
The encryption code I use is AES-256-CBC from OpenSSL 1.1.1
Look at my outputs to see what's wrong.
Good looking example code:
Ciphertext is:
000000: 7a e1 69 61 65 bb 74 ad 1a 68 8a ae 73 70 b6 0e z.iae.t..h..sp..
000010: 4f c9 45 9b 44 ca e2 be e2 aa 16 14 cd b1 79 7b O.E.D.........y{
000020: 86 a5 92 26 e6 08 3e 55 61 4e 60 03 50 f3 e4 c1 ...&..>UaN`.P...
000030: fe 5a 2c 0b df c9 1b d8 92 1f 48 75 0d f8 c2 44 .Z,.......Hu...D
Base64 (size=88):
000000: 65 75 46 70 59 57 57 37 64 4b 30 61 61 49 71 75 euFpYWW7dK0aaIqu
000010: 63 33 43 32 44 6b 2f 4a 52 5a 74 45 79 75 4b 2b c3C2Dk/JRZtEyuK+
000020: 34 71 6f 57 46 4d 32 78 65 58 75 47 70 5a 49 6d 4qoWFM2xeXuGpZIm
000030: 35 67 67 2b 56 57 46 4f 59 41 4e 51 38 2b 54 42 5gg+VWFOYANQ8+TB
000040: 2f 6c 6f 73 43 39 2f 4a 47 39 69 53 48 30 68 31 /losC9/JG9iSH0h1
000050: 44 66 6a 43 52 41 3d 3d DfjCRA==
b cip len = 64
a cip len = 16
plain b = 0
plain a = 3
Decrypted text is:
wtf
Decrypted base64 is:
wtf
000000: 77 74 66 00 wtf.
Bad production code example:
Base64 (size=88)
000000: 6a 7a 48 30 46 71 73 54 45 47 4d 76 2f 67 76 59 jzH0FqsTEGMv/gvY
000010: 4d 73 34 54 2f 39 58 32 6c 37 54 31 4d 6d 56 61 Ms4T/9X2l7T1MmVa
000020: 36 45 4f 38 52 64 45 57 42 6b 65 48 71 31 31 45 6EO8RdEWBkeHq11E
000030: 39 2b 77 37 47 4e 49 4a 47 4a 71 42 55 74 54 70 9+w7GNIJGJqBUtTp
000040: 30 36 58 46 31 4d 66 45 79 44 45 71 5a 69 58 54 06XF1MfEyDEqZiXT
000050: 79 45 53 6b 65 41 3d 3d yESkeA==
Ciphertext is:
000000: 8f 31 f4 16 ab 13 10 63 2f fe 0b d8 32 ce 13 ff .1.....c/...2...
000010: d5 f6 97 b4 f5 32 65 5a e8 43 bc 45 d1 16 06 47 .....2eZ.C.E...G
000020: 87 ab 5d 44 f7 ec 3b 18 d2 09 18 9a 81 52 d4 e9 ..]D..;......R..
000030: d3 a5 c5 d4 c7 c4 c8 31 2a 66 25 d3 c8 44 a4 78 .......1*f%..D.x
000040: 00 00 ..
b cip len = 65
a cip len = 17
crypt miss-match
plain b = 16
crypt write fail
plain a = 16
000000: 77 74 66 09 09 09 09 09 09 09 09 05 05 05 05 05 wtf.............
Here is my code. As you can see, the two versions look very similar, so I don't understand what the problem is.
Here is a little helper function I use for hexdump output.
void Hexdump(void* ptr, int buflen)
{
    unsigned char* buf = (unsigned char*)ptr;
    int i, j;
    for (i = 0; i < buflen; i += 16) {
        myprintf("%06x: ", i);
        for (j = 0; j < 16; j++)
            if (i + j < buflen)
                myprintf("%02x ", buf[i + j]);
            else
                myprintf("   "); // three spaces keep the columns aligned
        myprintf(" ");
        for (j = 0; j < 16; j++)
            if (i + j < buflen)
                myprintf("%c", isprint(buf[i + j]) ? buf[i + j] : '.');
        myprintf("\n");
    }
}
char* base64(const unsigned char* input, int length) {
    const auto pl = 4 * ((length + 2) / 3);
    auto output = reinterpret_cast<char*>(calloc(pl + 1, 1)); //+1 for the terminating null that EVP_EncodeBlock adds on
    const auto ol = EVP_EncodeBlock(reinterpret_cast<unsigned char*>(output), input, length);
    if (pl != ol) { myprintf("b64 calc %d,%d\n", pl, ol); }
    return output;
}

unsigned char* decode64(const char* input, int length) {
    const auto pl = 3 * length / 4;
    auto output = reinterpret_cast<unsigned char*>(calloc(pl + 1, 1));
    const auto ol = EVP_DecodeBlock(output, reinterpret_cast<const unsigned char*>(input), length);
    if (pl != ol) { myprintf("d64 calc %d,%d\n", pl, ol); }
    return output;
}
Here is the test example that works fine.
/* enc test */
/* Message to be encrypted */
unsigned char* plaintext = (unsigned char*)"wtf";
/*
* Buffer for ciphertext. Ensure the buffer is long enough for the
* ciphertext which may be longer than the plaintext, depending on the
* algorithm and mode.
*/
unsigned char* ciphertext = new unsigned char[128];
/* Buffer for the decrypted text */
unsigned char decryptedtext[128];
int decryptedtext_len, ciphertext_len;
/* Encrypt the plaintext */
ciphertext_len = encrypt(plaintext, strlen((char*)plaintext), ciphertext);
/* Do something useful with the ciphertext here */
myprintf("Ciphertext is:\n");
Hexdump((void*)ciphertext, ciphertext_len);
myprintf("Base64 (size=%d):\n", strlen(base64(ciphertext, ciphertext_len)));
Hexdump((void*)base64(ciphertext, ciphertext_len), 4 * ((ciphertext_len + 2) / 3));
/* Decrypt the ciphertext */
decryptedtext_len = decrypt(ciphertext, ciphertext_len, decryptedtext);
/* Add a NULL terminator. We are expecting printable text */
decryptedtext[decryptedtext_len] = '\0';
/* Show the decrypted text */
myprintf("Decrypted text is:\n");
myprintf("%s\n", decryptedtext);
myprintf("Decrypted base64 is:\n");
myprintf("%s\n", decode64(base64(decryptedtext, decryptedtext_len), 4 * ((decryptedtext_len + 2) / 3)));
Hexdump(decode64(base64(decryptedtext, decryptedtext_len), 4 * ((decryptedtext_len + 2) / 3)), 4 * ((decryptedtext_len + 2) / 3));
/* enc test end */
Here is the bad production code:
//Decrypt the username
const char* b64buffer = ini.GetValue("Credentials", "SavedPassword", "");
int b64buffer_length = strlen(b64buffer);
myprintf("Base64 (size=%d)\n", b64buffer_length);
Hexdump((void*)b64buffer, b64buffer_length);
int decryptedtext_len;
int decoded_size = 3 * b64buffer_length / 4;
unsigned char* decryptedtext = new unsigned char[decoded_size];
//unsigned char* ciphertext = decode64(b64buffer, b64buffer_length); // had this before, same problem as the line below; it worked without allocating new memory, but I prefer to fix this version up
unsigned char* ciphertext = new unsigned char[decoded_size];
memcpy(ciphertext, decode64(b64buffer, b64buffer_length), decoded_size); // same problem as the commented-out line above
myprintf("Ciphertext is:\n");
Hexdump((void*)ciphertext, decoded_size);
/* Decrypt the ciphertext */
decryptedtext_len = decrypt(ciphertext, decoded_size - 1, decryptedtext);
/* Add a NULL terminator. We are expecting printable text */
decryptedtext[decryptedtext_len] = '\0';
Hexdump(decryptedtext, decryptedtext_len);
strcpy(password_setting, (char*)decryptedtext); //save decrypted password back
delete[] decryptedtext;
delete[] ciphertext;
In the example that works, you get ciphertext_len directly from the encryption function. When you display the ciphertext, you use this length.
In the "bad production code", you calculate decoded_size from the length of the Base64 data. However, Base64 encoded data always has a length that is a multiple of 4. If the original data size is not a multiple of 3, then there are one or two padding characters added to the string. In both of your examples, you have two of these characters, the '=' at the end of the Base64 data.
When calculating the length of the decrypted data, you need to account for these bytes. If there are no '=' characters at the end of the string, use the length that you calculated (3 * N / 4). If there is one '=' character, reduce that calculated length by 1, and if there are two '=' characters, reduce the calculated length by 2. (There will not be 3 padding characters.)
Edit: Here is my fix: (sspoke)
char* base64(const unsigned char* input, int length) {
    const auto pl = 4 * ((length + 2) / 3);
    auto output = reinterpret_cast<char*>(calloc(pl + 1, 1)); //+1 for the terminating null that EVP_EncodeBlock adds on
    const auto ol = EVP_EncodeBlock(reinterpret_cast<unsigned char*>(output), input, length);
    if (pl != ol) { printf("encode64 size mismatch %d,%d\n", pl, ol); }
    return output;
}

unsigned char* decode64(const char* input, int* length) {
    // The old code reported the wrong size because it didn't take the '=' padding into account.
    const auto pl = 3 * *length / 4;
    auto output = reinterpret_cast<unsigned char*>(calloc(pl + 1, 1));
    const auto ol = EVP_DecodeBlock(output, reinterpret_cast<const unsigned char*>(input), *length);
    if (pl != ol) { printf("decode64 size mismatch %d,%d\n", pl, ol); }
    // Little bug fix I added: shrink the reported length because EVP_DecodeBlock
    // does not account for the '=' padding characters. -sspoke
    if (*length > 3 && input[*length - 1] == '=' && input[*length - 2] == '=')
        *length = ol - 2;
    else if (*length > 2 && input[*length - 1] == '=')
        *length = ol - 1;
    else
        *length = ol;
    return output;
}
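For completeness, a minimal sketch of how the production code would then call the fixed decode64 (reusing the decrypt, Hexdump, and decryptedtext from the code above); the true byte count comes back through the pointer, so that is what gets passed on:

int b64buffer_length = strlen(b64buffer);
unsigned char* ciphertext = decode64(b64buffer, &b64buffer_length); // b64buffer_length is now 64, not 66
Hexdump((void*)ciphertext, b64buffer_length);
decryptedtext_len = decrypt(ciphertext, b64buffer_length, decryptedtext);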
I have a chunk of data which is supposed to be zlib-compressed (I was not 100% sure).
I first tried to uncompress it with gzip by prepending "1F 8B 08 00 00 00 00 00", just like in the accepted answer of this thread (https://unix.stackexchange.com/questions/22834/how-to-uncompress-zlib-data-in-unix). It worked out, and it was probably the right approach, because the output contained a lot of human-readable strings.
I then tried to implement this in a C++ program using zlib, but it seems that zlib generates a different output. Am I missing something? zlib and gzip should be basically the same (apart from the headers and trailers), shouldn't they? Or do I have a simple error in my code below? (The chunk of data is shortened for the sake of simplicity.)
unsigned char* decompressed;
unsigned char* dataChunk = /*...*/;
printHex(dataChunk, 160);
int error = inflateZlib(dataChunk, 160, decompressed, 1000);
printHex(decompressed, 160);
//zerr(error);

void printHex(unsigned char* data, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        std::cout << std::hex << (uint16_t)data[i] << " ";
    }
    std::cout << std::dec << "\n-\n";
}

int inflateZlib(unsigned char* data, size_t length, unsigned char* decompressed, size_t maxDecompressed)
{
    decompressed = new unsigned char[maxDecompressed];

    z_stream infstream;
    infstream.zalloc = Z_NULL;
    infstream.zfree = Z_NULL;
    infstream.opaque = Z_NULL;
    infstream.avail_in = (uInt)(length);          // size of input
    infstream.next_in = (Bytef *)data;            // input char array
    infstream.avail_out = (uInt)maxDecompressed;  // size of output
    infstream.next_out = (Bytef *)decompressed;   // output char array

    // the actual DE-compression work.
    int ret = inflateInit(&infstream);
    zerr(ret);
    ret = inflate(&infstream, Z_NO_FLUSH);
    zerr(ret);
    inflateEnd(&infstream);
    return ret;
}
This produces the following output:
78 9c bd 58 4b 88 23 45 18 ee 3c 67 e3 24 93 cc ae 8a f8 42 10 c4 cb 1a 33 a3 7b f0 60 e6 e0 e6 e0 49 90 bd 29 4d 4d 77 25 dd 99 ee ea de aa ee 4c 32 82 2c e8 c1 93 ac 47 c5 45 f 82 8 5e 16 f ba 78 18 45 d0 83 7 95 15 5c d0 c3 aa b0 b2 ee 65 5c f0 e4 c5 bf aa 1f a9 ea 74 cf 64 7 31 c3 24 9d fa fe bf ea ab ff 59 15 ab 62 6a b5 5d 9b 8c 18 2a 5b 15 47 d3 b4 92 55 35 b5 ba b7 3d c6 46 b0 a3 35 3 1c 50 64 61 93 7a a4 67 d5 0 e1 c2 d8 e4 92 75 fe 56 b3 ca a6 76 c2 f0 1c 8f
-
0 0 6 c0 83 50 0 0 16 b0 78 9c bd 58 4b 88 23 45 18 ee 3c 67 e3 24 93 cc ae 8a f8 42 10 c4 cb 1a 33 a3 7b f0 60 e6 e0 e6 e0 49 90 bd 29 4d 4d 77 25 dd 99 ee ea de aa ee 4c 32 82 2c e8 c1 93 ac 47 c5 45 f 82 8 5e 16 f ba 78 18 45 d0 83 7 95 15 5c d0 c3 aa b0 b2 ee 65 5c f0 e4 c5 bf aa 1f a9 ea 74 cf 64 7 31 c3 24 9d fa fe bf ea ab ff 59 15 ab 62 6a b5 5d 9b 8c 18 2a 5b 15 47 d3 b4 92 55 35 b5 ba b7 3d c6 46 b0 a3 35 3 1c 50 64 61 93 7a a4 67 d5 0 e1 c2 d8 e4 92 75
-
which is not what I want.
Whereas gzip:
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x78\x9c\xbd\x58\x4b\x88\x23\x45\x18\xee\x3c\x67\xe3\x24\x93\xcc\xae\x8a\xf8\x42\x10\xc4\xcb\x1a\x33\xa3\x7b\xf0\x60\xe6\xe0\xe6\xe0\x49\x90\xbd\x29\x4d\x4d\x77\x25\xdd\x99\xee\xea\xde\xaa\xee\x4c\x32\x82\x2c\xe8\xc1\x93\xac\x47\xc5\x45\xf\x82\x8\x5e\x16\xf\xba\x78\x18\x45\xd0\x83\x7\x95\x15\x5c\xd0\xc3\xaa\xb0\xb2\xee\x65\x5c\xf0\xe4\xc5\xbf\xaa\x1f\xa9\xea\x74\xcf\x64\x7\x31\xc3\x24\x9d\xfa\xfe\xbf\xea\xab\xff\x59\x15\xab\x62\x6a\xb5\x5d\x9b\x8c\x18\x2a\x5b\x15\x47\xd3\xb4\x92\x55\x35\xb5\xba\xb7\x3d\xc6\x46\xb0\xa3\x35\x3\x1c\x50\x64\x61\x93\x7a\xa4\x67\xd5\x0\xe1\xc2\xd8\xe4\x92\x75\xfe\x56\xb3\xca\xa6\x76\xc2\xf0\x1c\x8f" | gzip -dc | hexdump -C
produces:
gzip: stdin: unexpected end of file
00000000 68 03 64 00 05 77 69 6e 67 73 61 02 68 03 6c 00 |h.d..wingsa.h.l.|
00000010 00 00 01 68 04 64 00 06 6f 62 6a 65 63 74 6b 00 |...h.d..objectk.|
00000020 0c 74 65 74 72 61 68 65 64 72 6f 6e 31 68 05 64 |.tetrahedron1h.d|
00000030 00 06 77 69 6e 67 65 64 6c 00 00 00 06 6c 00 00 |..wingedl....l..|
00000040 00 05 68 02 64 00 08 63 6f 6c 6f |..h.d..colo|
0000004b
which is what I want.
I was able to decode the data you provided by using zlib 1.2.8 and the inflateInit2 function with 32 for windowBits. I used 32 based on this information from the zlib documentation:
windowBits can also be zero to request that inflate use the window size in the zlib header of the compressed stream.
and
Add 32 to windowBits to enable zlib and gzip decoding with automatic header detection
Here's the full code. I stripped out error checking since I don't have a zerr function. It doesn't appear you're using Visual C++, so you will want to remove the #pragma to avoid a warning as well.
#include <iostream>
#include <iomanip>
#include <cstdint>
#include <cctype>
#include "zlib.h"

#pragma comment(lib, "zdll.lib")

const size_t block_size = 16;

void printLine(unsigned char* data, size_t offset, size_t n)
{
    if(n)
    {
        std::cout << std::setw(8) << std::setfill('0') << std::right << offset << " ";
        for(size_t x = 0; x < block_size; ++x)
        {
            if(x % (block_size/2) == 0) std::cout << " ";
            uint16_t d = x < n ? data[x] : 0;
            std::cout << std::hex << std::setw(2) << d << " ";
        }
        std::cout << "|";
        for(size_t x = 0; x < block_size; ++x)
        {
            int c = (x < n && isalnum(data[x])) ? data[x] : '.';
            std::cout << static_cast<char>(c);
        }
        std::cout << "|\n";
    }
}

void printHex(unsigned char* data, size_t n)
{
    const size_t blocks = n / block_size;
    const size_t remainder = n % block_size;
    for(size_t i = 0; i < blocks; i++)
    {
        size_t offset = i * block_size;
        printLine(&data[offset], offset, block_size);
    }
    size_t offset = blocks * block_size;
    printLine(&data[offset], offset, remainder);
    std::cout << "\n";
}

int inflateZlib(unsigned char* data, uint32_t length, unsigned char* decompressed, uint32_t maxDecompressed)
{
    z_stream infstream;
    infstream.zalloc = Z_NULL;
    infstream.zfree = Z_NULL;
    infstream.opaque = Z_NULL;
    infstream.avail_in = length;
    infstream.next_in = data;
    infstream.avail_out = maxDecompressed;
    infstream.next_out = decompressed;
    inflateInit2(&infstream, 32);
    inflate(&infstream, Z_FINISH);
    inflateEnd(&infstream);
    return infstream.total_out;
}

int main()
{
    unsigned char dataChunk[] =
        "\x1f\x8b\x08\x00\x00\x00\x00\x00\x78\x9c\xbd\x58\x4b\x88\x23\x45"
        "\x18\xee\x3c\x67\xe3\x24\x93\xcc\xae\x8a\xf8\x42\x10\xc4\xcb\x1a"
        "\x33\xa3\x7b\xf0\x60\xe6\xe0\xe6\xe0\x49\x90\xbd\x29\x4d\x4d\x77"
        "\x25\xdd\x99\xee\xea\xde\xaa\xee\x4c\x32\x82\x2c\xe8\xc1\x93\xac"
        "\x47\xc5\x45\xf\x82\x8\x5e\x16\xf\xba\x78\x18\x45\xd0\x83\x7\x95"
        "\x15\x5c\xd0\xc3\xaa\xb0\xb2\xee\x65\x5c\xf0\xe4\xc5\xbf\xaa\x1f"
        "\xa9\xea\x74\xcf\x64\x07\x31\xc3\x24\x9d\xfa\xfe\xbf\xea\xab\xff"
        "\x59\x15\xab\x62\x6a\xb5\x5d\x9b\x8c\x18\x2a\x5b\x15\x47\xd3\xb4"
        "\x92\x55\x35\xb5\xba\xb7\x3d\xc6\x46\xb0\xa3\x35\x03\x1c\x50\x64"
        "\x61\x93\x7a\xa4\x67\xd5\x00\xe1\xc2\xd8\xe4\x92\x75\xfe\x56\xb3"
        "\xca\xa6\x76\xc2\xf0\x1c\x8f";

    unsigned char decompressed[1000] = {};
    printHex(dataChunk, sizeof(dataChunk));
    uint32_t len = inflateZlib(dataChunk, sizeof(dataChunk), decompressed, sizeof(decompressed));
    printHex(decompressed, len);
    return 0;
}
I think you might want to define decompressed differently. In your code it is an uninitialized pointer, and since inflateZlib takes that pointer by value, the buffer it allocates with new never reaches the caller, so printHex reads through a dangling pointer. Give the caller its own buffer:
unsigned char decompressed[1000];
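If you would rather keep the allocation inside inflateZlib, a sketch (my variant, not from the original answer) is to pass the output pointer by reference so the caller sees the new buffer:

#include <zlib.h>

int inflateZlib(unsigned char* data, size_t length,
                unsigned char*& decompressed, size_t maxDecompressed)
{
    decompressed = new unsigned char[maxDecompressed];   // now visible to the caller

    z_stream infstream{};                                // zero-init: zalloc/zfree/opaque = Z_NULL
    infstream.avail_in  = (uInt)length;
    infstream.next_in   = (Bytef*)data;
    infstream.avail_out = (uInt)maxDecompressed;
    infstream.next_out  = (Bytef*)decompressed;

    inflateInit2(&infstream, 32);    // 32 = auto-detect zlib or gzip header
    inflate(&infstream, Z_FINISH);
    inflateEnd(&infstream);
    return (int)infstream.total_out; // number of bytes actually decompressed
}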
I have the following code, which writes 6 floats to disk in binary form and reads them back:
#include <iostream>
#include <cstdio>

int main()
{
    int numSegs = 2;
    int numVars = 3;

    float * data = new float[numSegs * numVars];
    for (int i = 0; i < numVars * numSegs; ++i) {
        data[i] = i * .23;
        std::cout << data[i] << std::endl;
    }

    FILE * handle = std::fopen("./sandbox.out", "wb");
    long elementsWritten =
        std::fwrite(data, sizeof(float), numVars*numSegs, handle);
    if (elementsWritten != numVars*numSegs){
        std::cout << "Error" << std::endl;
    }
    fclose(handle);

    handle = fopen("./sandbox.out", "rb");
    float * read = new float[numSegs * numVars];
    fseek(handle, 0, SEEK_SET);
    fread(read, sizeof(float), numSegs*numVars, handle);
    for (int i = 0; i < numVars * numSegs; ++i) {
        std::cout << read[i] << std::endl;
    }
}
It outputs:
0
0.23
0.46
0.69
0.92
1.15
0
0.23
0.46
0.69
0.92
1.15
When I load the file in a hex editor, I get:
00 00 00 00 1f 85 6b 3e 1f 85 eb 3e d7 a3 30 3f
1f 85 6b 3f 33 33 93 3f -- -- -- -- -- -- -- --
I want to be able to calculate the float value from the hex bytes directly. For example: 1f 85 6b 3e becomes 0.23 and 1f 85 eb 3e becomes 0.46.
I've tried a few "binary to float" calculators on the web. When I put the hexadecimal representation of the number, 0x1f856b3e, into both calculators I get back 5.650511E-20. But I thought the value should be 0.23, since I provided bytes 5-8 to the calculator and these bytes represent the second float written to disk.
What am I doing wrong?
This is an endianness issue. If you, for example, switch:
1f 85 6b 3e
to:
3e 6b 85 1f
it will result in 0.23 when you convert it using one of your converters. For example, I used the IEEE 754 Converter; the Floating Point to Hex Converter also lets you do double as well as single precision conversions.
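If you'd rather do the conversion in code than in a web converter, here is a small sketch (assuming a little-endian host, i.e. the same kind of machine that wrote the file): copy the four bytes exactly as they appear on disk into a float.

#include <cstring>
#include <iostream>

int main()
{
    const unsigned char bytes[4] = { 0x1f, 0x85, 0x6b, 0x3e }; // bytes 5-8 as they appear in the dump

    float f;
    std::memcpy(&f, bytes, sizeof f); // reinterpret the same 4 bytes as an IEEE-754 float
    std::cout << f << "\n";           // prints 0.23
    return 0;
}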
I am trying to XOR some already encrypted files.
I know that the XOR key is 0x14 (20 in decimal).
My code works except for one thing: all the '4' characters are gone.
Here is my function for the XOR:
void xor(string &nString) // Time to undo what we did from above :D
{
    const int KEY = 0x14;
    int strLen = (nString.length());
    char *cString = (char*)(nString.c_str());
    for (int i = 0; i < strLen; i++)
    {
        *(cString + i) = (*(cString + i) ^ KEY);
    }
}
Here is part of my main:
ifstream inFile;
inFile.open("ExpTable.bin");
if (!inFile) {
cout << "Unable to open file";
}
string data;
while (inFile >> data) {
xor(data);
cout << data << endl;
}
inFile.close();
Here is a part of the encrypted file:
$y{bq //0 move
%c|{ //1 who
&c|qfq //2 where
'saufp //3 guard
x{wu`}{z //4 location
But x{wu`}{z is coming back as //location. It's not displaying the 4.
Note the space in front of the x: that's supposed to decode to 4.
What am I missing? Why is it not showing any of the 4s? (<space> = 4 and 4 = <space>.)
UPDATE
This is the list of all the specific conversions:
HEX(enc) ASCII(dec)
20 4
21 5
22 6
23 7
24 0
25 1
26 2
27 3
28 <
29 =
2a >
2b ?
2c 8
2d 9
2e :
2f ;
30 $
31 %
32 &
33 '
34
35 !
36 "
37 #
38 ,
39 -
3a .
3b /
3c (
3d )
3e *
3f +
40 T
41 U
42 V
43 W
44 P
45 Q
46 R
47 S
48 \
49 ]
4a ^
4b _
4c X
4d Y
4e Z
4f [
50 D
51 E
52 F
53 G
54 #
55 A
56 B
57 C
58 L
59 M
5a N
5b O
5c H
5d I
5e J
5f K
60 t
61 u
62 v
63 w
64 p
65 q
66 r
67 s
68 |
69 }
6a
6b
6c x
6d y
6e z
6f {
70 d
71 e
72 f
73 g
75 a
76 b
77 c
78 l
79 m
7a n
7b o
7c h
7d i
7e j
7f k
1d /tab
1e /newline
Get rid of all casts.
Don't use >> for input: formatted extraction skips whitespace, and '4' XOR 0x14 is 0x20, a space, so those bytes are eaten as delimiters before your xor function ever sees them.
That should fix your problems.
Edit:
// got bored, wrote some (untested) code
ifstream inFile;
inFile.open("ExpTable.bin", ios::in | ios::binary);
if (!inFile) {
    cerr << "Unable to open ExpTable.bin: " << strerror(errno) << "\n";
    exit(EXIT_FAILURE);
}

char c;
while (inFile.get(c)) {
    cout.put(c ^ '\x14');
}
inFile.close();
Are you sure that it is printing '//location'? I think it would print '// location' -- note the space after the double-slash. You are XORing 0x34 with 0x14. The result is 0x20, which is a space character. Why would you want to xor everything with 0x14 anyway?
** edit ** ignore the above; I missed part of your question. The real answer:
Are you entirely sure that the character before the x is a 0x20? Perhaps it's some unprintable character that looks like a space? I would check the hex value.
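A quick sketch for that last check (mine, not part of the answer): read the file in binary and dump every byte in hex, so you can see whether the byte before the x really is 0x20.

#include <cstdio>
#include <fstream>

int main()
{
    std::ifstream in("ExpTable.bin", std::ios::binary);
    char c;
    while (in.get(c))
        std::printf("%02x ", static_cast<unsigned char>(c)); // raw bytes, before any XOR
    return 0;
}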
Every time I try to read a file from the hard drive and cast the data into a structure, I end up with the data not mapping properly. Is there a requirement with reinterpret_cast() that the number of bytes in a structure be a multiple of 4? If not, what am I doing wrong? If so, how do I get around it?
My structure looks like this (the file stores records in 50-byte chunks):
class stlFormat
{
public:
    float normalX, normalY, normalZ;
    float x1, y1, z1;
    float x2, y2, z2;
    float x3, y3, z3;
    char byte1, byte2;
};
Rest of my code:
void main()
{
    int size;
    int numTriangles;
    int * header = new int [21]; // size of header

    ifstream stlFile ("tetrahedron binary.STL", ios::in|ios::binary|ios::ate);
    size = stlFile.tellg(); // get the size of file
    stlFile.seekg(0, ios::beg); //read the number of triangles in the file
    stlFile.read(reinterpret_cast<char*>(header), 84);
    numTriangles = header[20];

    stlFormat * triangles = new stlFormat [numTriangles]; //create data array to hold vertex data
    stlFile.seekg (84, ios::beg); //read vertex data and put them into data array
    stlFile.read(reinterpret_cast<char*>(triangles), (numTriangles * 50));

    cout << "number of triangles: " << numTriangles << endl << endl;
    for (int i = 0; i < numTriangles; i++)
    {
        cout << "triangle " << i + 1 << endl;
        cout << triangles[i].normalX << " " << triangles[i].normalY << " " << triangles[i].normalZ << endl;
        cout << triangles[i].x1 << " " << triangles[i].y1 << " " << triangles[i].z1 << endl;
        cout << triangles[i].x2 << " " << triangles[i].y2 << " " << triangles[i].z2 << endl;
        cout << triangles[i].x3 << " " << triangles[i].z3 << " " << triangles[i].z3 << endl << endl;
    }

    stlFile.close();
    getchar();
}
Just for you, John, although it's rather incomprehensible. It's in hex format.
73 6f 6c 69 64 20 50 61 72 74 33 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
04 00 00 00 ec 05 51 bf ab aa aa 3e ef 5b f1 be
00 00 00 00 00 00 00 00 f3 f9 2f 42 33 33 cb 41
80 e9 25 42 9a a2 ea 41 33 33 cb 41 00 00 00 00
00 00 00 00 00 00 00 00 00 00 ab aa aa 3e ef 5b
71 3f 33 33 4b 42 00 00 00 00 f3 f9 2f 42 33 33
cb 41 80 e9 25 42 9a a2 ea 41 00 00 00 00 00 00
00 00 f3 f9 2f 42 00 00 ec 05 51 3f ab aa aa 3e
ef 5b f1 be 33 33 cb 41 00 00 00 00 00 00 00 00
33 33 cb 41 80 e9 25 42 9a a2 ea 41 33 33 4b 42
00 00 00 00 f3 f9 2f 42 00 00 00 00 00 00 00 00
80 bf 00 00 00 00 33 33 cb 41 00 00 00 00 00 00
00 00 33 33 4b 42 00 00 00 00 f3 f9 2f 42 00 00
00 00 00 00 00 00 f3 f9 2f 42 00 00
Most likely, float has an alignment of four bytes on your system. This means that, because you use it in your structure, the compiler will make sure the start of the structure when allocated using normal methods will always be a multiple of four bytes. Since the raw size of your structure is 4*12+2 = 50 bytes, it needs to be rounded up to the next multiple of four bytes - otherwise, the second element of arrays of this structure would be unaligned. So your struct ends up 52 bytes, throwing off your parsing.
If you need to parse a binary format, it's often a good idea to either use compiler-specific directives to disable alignment, or read one field at a time, to avoid these problems.
For example, on MSVC++, you can use __declspec(align(1)). Edit: actually, __declspec(align(X)) can only increase alignment restrictions. Oops. You'll need to either load one field at a time, or make the padding part of the binary format.
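Another compiler-specific directive that does work here is #pragma pack (MSVC, GCC, and Clang all accept it); a minimal sketch, with a static_assert as a compile-time check that the record really is 50 bytes:

#pragma pack(push, 1)          // no padding inside or after the struct
class stlFormat
{
public:
    float normalX, normalY, normalZ;
    float x1, y1, z1;
    float x2, y2, z2;
    float x3, y3, z3;
    char byte1, byte2;
};
#pragma pack(pop)

static_assert(sizeof(stlFormat) == 50, "stlFormat must match the 50-byte file record");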
I used my favorite text editor (editpadpro) to save the file you posted in the OP as a binary file called "c:\work\test.bin", edited your code to the following, and it (apparently) produced the correct (expected) output. Please try it out.
#include <cstdlib>
#include <iostream>
#include <fstream>
using namespace std;

#pragma pack( push, 1 )
class stlFormat
{
public:
    float normalX, normalY, normalZ;
    float x1,y1,z1;
    float x2,y2,z2;
    float x3,y3,z3;
    char byte1, byte2;
};
#pragma pack( pop )

struct foo
{
    char c, d, e;
};

void main()
{
    size_t sz = sizeof(foo);
    int size;
    int numTriangles;
    int * header = new int [21]; // size of header

    ifstream stlFile ("c:\\work\\test.bin", ios::in|ios::binary|ios::ate);
    size = stlFile.tellg(); // get the size of file
    stlFile.seekg(0, ios::beg); //read the number of triangles in the file
    stlFile.read(reinterpret_cast<char*>(header), 84);
    numTriangles = header[20];

    stlFormat * triangles = new stlFormat [numTriangles]; //create data array to hold vertex data
    stlFile.seekg (84, ios::beg); //read vertex data and put them into data array
    stlFile.read(reinterpret_cast<char*>(triangles), (numTriangles * 50));

    cout << "number of triangles: " << numTriangles << endl << endl;
    for (int i = 0; i < numTriangles; i++)
    {
        cout << "triangle " << i + 1 << endl;
        cout << triangles[i].normalX << " " << triangles[i].normalY << " " << triangles[i].normalZ << endl;
        cout << triangles[i].x1 << " " << triangles[i].y1 << " " << triangles[i].z1 << endl;
        cout << triangles[i].x2 << " " << triangles[i].y2 << " " << triangles[i].z2 << endl;
        cout << triangles[i].x3 << " " << triangles[i].z3 << " " << triangles[i].z3 << endl << endl;
    }

    stlFile.close();
    getchar();
}
Instead of fiddling with padding and differences between platforms, maybe have a look at serialization to/from binary files? It might be somewhat less performant than reading data straight into memory, but it's way more extensible.
You should be aware that you are throwing portability out the window with that kind of code: your files may be incompatible with new versions of your program if you compile with a different compiler or for a different system.
That said, you might fix this by using sizeof(int[21]) and numTriangles * sizeof(stlFormat) rather than hard-coded sizes in bytes. The reason is, as others noted, the alignment bytes your compiler may or may not add.
If this is a program that other people may use or files might be shared, look up serialization.
IMO you really ought to be explicitly reading the triangles directly (deserialization) instead of casting bytes. Doing so will help you avoid portability and performance problems. If you're doing a lot of calculations with those triangles after you read them, the performance hit for using a non-standard memory layout can be non-trivial.
Replace the line "stlFile.read(reinterpret_cast<char*>(triangles), (numTriangles * 50));" with this:
for (int i = 0; i < numTriangles; i++)
{
    stlFile.read((char*)&triangles[i].normalX, sizeof(float));
    stlFile.read((char*)&triangles[i].normalY, sizeof(float));
    stlFile.read((char*)&triangles[i].normalZ, sizeof(float));
    stlFile.read((char*)&triangles[i].x1, sizeof(float));
    stlFile.read((char*)&triangles[i].y1, sizeof(float));
    stlFile.read((char*)&triangles[i].z1, sizeof(float));
    stlFile.read((char*)&triangles[i].x2, sizeof(float));
    stlFile.read((char*)&triangles[i].y2, sizeof(float));
    stlFile.read((char*)&triangles[i].z2, sizeof(float));
    stlFile.read((char*)&triangles[i].x3, sizeof(float));
    stlFile.read((char*)&triangles[i].y3, sizeof(float));
    stlFile.read((char*)&triangles[i].z3, sizeof(float));
    stlFile.read(&triangles[i].byte1, 1);
    stlFile.read(&triangles[i].byte2, 1);
}
It takes a little more code and a little more time to read in the triangles, but you'll avoid a few potential headaches.
Note that writing triangles also requires similar code to avoid inadvertently writing out some padding.
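A matching sketch for the write side (assuming an ofstream outFile opened in binary mode), so the two padding bytes never reach the file:

for (int i = 0; i < numTriangles; i++)
{
    // the 12 floats are contiguous with no internal padding, so one write covers them
    outFile.write((char*)&triangles[i].normalX, 12 * sizeof(float));
    outFile.write(&triangles[i].byte1, 1);
    outFile.write(&triangles[i].byte2, 1);
}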
I think the problem is not so much the reading of each individual triangle as that the triangle array isn't laid out as you think. There appear to be 50 bytes in each struct, but the allocated memory is almost certainly laid out as if the structs were 52 bytes. Consider reading in each struct individually.
Two more points:
First, there is no such thing as void main in C++. Use int main().
Second, you seem to be leaking memory. You'd be better off in general using the vector facility.
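To make both suggestions concrete, a quick sketch (mine, not from the answer) that assumes the same stlFormat class and an already open stlFile, reads one 50-byte record at a time, and lets std::vector own the memory:

#include <vector>

std::vector<int> header(21);
stlFile.read(reinterpret_cast<char*>(header.data()), 84);
int numTriangles = header[20];

std::vector<stlFormat> triangles(numTriangles);    // freed automatically, no leak
for (stlFormat& t : triangles)
    stlFile.read(reinterpret_cast<char*>(&t), 50); // 50 file bytes per record; the trailing padding stays untouched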
Storing a whole struct at once isn't portable unless you take great care with compiler-specific flags, and even then different compilers and architectures might not agree on the same binary format. Storing one field (e.g. a floating-point number) at a time is better, but still isn't portable because of endianness issues and possibly different data types (e.g. what is sizeof(long) on your system?).
In order to save integers safely and portably, you have to format them a byte at a time into a char buffer that is then written out to a file. For example:
char buf[100]; // Extra space for more values (instead of only 4 bytes)
// Write a 32 bit integer value into buf, using big endian order
buf[0] = value >> 24; // The most significant byte
buf[1] = value >> 16;
buf[2] = value >> 8;
buf[3] = value; // The least significant byte
Similarly, reading back has to be done a byte at a time:
// Converting the pointer to unsigned to avoid sign extension issues
unsigned char* ubuf = reinterpret_cast<unsigned char*>(buf);
value = ubuf[0] << 24 | ubuf[1] << 16 | ubuf[2] << 8 | ubuf[3];
If little endian order is desired, invert the indexing order of buf and ubuf.
Because no pointer casting between integer types and char (or vice versa) is done, the code is fully portable. Doing the same for floating-point types requires extra caution and a pointer cast so that the value can be handled as an integer and bit shifting works; I won't cover that in detail here.
While this solution seems extremely painful to use, you only need to write a few helper functions to make it tolerable. Alternatively, especially if the exact format used does not matter to you, you can use an existing serialization library. Boost.Serialization is a rather nice library for that.