How to convert a binary byte into a printable numeric value?

I have to convert a 128-bit Crypto++ AES ciphertext into a printable numeric string.
I am currently using the following code for the conversion, but bitset is too slow for my case. Does anyone know a more efficient way of doing this?
string output = "";
for (std::size_t i = 0; i < 16; ++ i) {
output += bitset<8>(ciphertext[i]).to_string();
}
How to convert a binary byte into a printable numeric value? Thanks a lot!

There are plenty of clever methods to compute a binary string from a number, but it doesn't really matter. Whatever method you use, you can use it to fill a table once:
std::string bytes[256];
for (int c = 0; c < 256; ++c) { // an unsigned char counter would wrap around at 255 and never terminate
bytes[c] = bitset<8>(c).to_string();
}
And then bytes[c] will give you the string for a particular byte.
Using the precomputed strings, the four lines of code from your post become:
string output = "";
for (std::size_t i = 0; i < 16; ++ i) {
output += bytes[ciphertext[i]];
}
Also, your code likely involves some allocations during your loop. The best way to avoid those depends entirely on how you use the output string, but at the minimum output.reserve(16*8) can't hurt.
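Putting the table and the reserve() call together could look roughly like the sketch below. It assumes the ciphertext is available as an array of unsigned char; makeBinaryTable and toBinaryString are names made up for this illustration and are not part of Crypto++.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <string>

// Build the 256-entry table once; entry b holds the 8-character
// binary representation of byte value b.
// (makeBinaryTable and toBinaryString are hypothetical names.)
std::array<std::string, 256> makeBinaryTable() {
    std::array<std::string, 256> table;
    for (int b = 0; b < 256; ++b) {  // int counter: no wrap-around at 255
        table[b] = std::bitset<8>(b).to_string();
    }
    return table;
}

// Convert len bytes into a string of '0'/'1' characters.
std::string toBinaryString(const unsigned char* data, std::size_t len) {
    // The table is built on the first call and reused afterwards.
    static const std::array<std::string, 256> table = makeBinaryTable();
    std::string output;
    output.reserve(len * 8);  // one allocation up front
    for (std::size_t i = 0; i < len; ++i) {
        output += table[data[i]];
    }
    return output;
}
```

For a 16-byte ciphertext the call would be toBinaryString(ciphertext, 16), which yields a 128-character string.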

I would do
unsigned char ct_b[16]; // ciphertext bytes; unsigned so values >= 0x80 are not sign-extended
char ct_h[33]; // 2 hex digits per byte + NUL
snprintf(ct_h, 33,
"%02x%02x%02x%02x%02x%02x%02x%02x"
"%02x%02x%02x%02x%02x%02x%02x%02x",
ct_b[ 0], ct_b[ 1], ct_b[ 2], ct_b[ 3],
ct_b[ 4], ct_b[ 5], ct_b[ 6], ct_b[ 7],
ct_b[ 8], ct_b[ 9], ct_b[10], ct_b[11],
ct_b[12], ct_b[13], ct_b[14], ct_b[15]);
This will certainly be faster than what you have, at the expense of a good bit more repetition. It does produce hexadecimal rather than binary, but it's very likely that hex is what you really want.
(In case you haven't seen string constant concatenation before: The absence of a comma after the first half of the string constant is intentional.)
(Please tell me you aren't using ECB.)
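If the repetition is a concern, the same lowercase hex output can be produced with a small loop and a 16-entry digit table. This is only a sketch of an alternative, not a claim about which version is faster; toHexString is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Hex-encode len bytes using a 16-entry digit table.
// (toHexString is a hypothetical helper name.)
std::string toHexString(const unsigned char* data, std::size_t len) {
    static const char digits[] = "0123456789abcdef";
    std::string output;
    output.reserve(len * 2);  // two hex digits per byte
    for (std::size_t i = 0; i < len; ++i) {
        output += digits[data[i] >> 4];    // high nibble
        output += digits[data[i] & 0x0F];  // low nibble
    }
    return output;
}
```

Because the input is unsigned char, bytes of 0x80 and above come out as two digits rather than being sign-extended.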

string output = "";
for (std::size_t i = 0; i < 16; ++ i) {
output += bitset<8>(ciphertext[i]).to_string();
}
There's also the Crypto++ source/sink method if you are interested:
string output;
ArraySource as(ciphertext, sizeof(ciphertext),
true /*pump*/,
new HexEncoder(
new StringSink(output)
) // HexEncoder
); // ArraySource

Related

MATLAB to C++: Coder: not consistent array dimension concatenation

Adapting the code from this Coder-compatible solution to read CSV data, I ran into the following issue during the runtime issue check of MATLAB Coder:
Error using cat>>check_non_axis_size (line 283)
Dimensions of arrays being concatenated are not consistent.
Error in cat>>cat_impl (line 102)
check_non_axis_size(isempty, i, sizes{i}, varargin{:});
Error in cat (line 22)
result = cat_impl(@always_isempty_matrix, axis, varargin{:});
Error in readCsv (line 28)
coder.ceval('sscanf', [token, NULL], ['%lf', NULL], coder.wref(result(k)));
my adaptation:
function result = readCsv(filepath, rows, columns)
NULL = char(0);
fid = fopen(filepath, 'r');
% read entire file into char array
remainder = fread(fid, '*char');
% preallocation for speedup
result = coder.nullcopy(zeros(columns,rows));
k = 1;
while ~isempty(remainder)
% comma, newline
delimiters = [',', char(10)];
% strtok ignores leading delimiter,
% returns chars upto, but not including,
% the next delimiter
[token,remainder] = strtok(remainder, delimiters);
% string to double conversion
% no need to worry about return type / order
% since we only look at one token at a time
if coder.target('MATLAB')
result(k) = sscanf(token, '%f');
else
coder.ceval('sscanf', [token, NULL], ['%lf', NULL], coder.wref(result(k)));
end
k = k + 1;
end
% workaround for filling column-major but breaks on single-line csv
result = reshape(result,rows, [])';
disp(k)
fclose(fid);
The .csv in question is a 200x51 matrix.
Testing in MATLAB works as expected: the .csv is read 1:1, as with csvread().
The error pops up during code generation and, as far as I understand, is an issue with writing the result of sscanf into the preallocated result array, but only for the C code.
Addendum: a line with only integer values (1,1,1,...,0) works fine; a line with actual floats (6.7308,38.7101,...,40.5999,0) breaks with the aforementioned error.

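remainder = fread(fid, [1, Inf], '*char');
It turns out the sizeA argument is not optional in this case. Without it, fread returns a column vector, so any token longer than one character is a column and cannot be concatenated horizontally with NULL. With [1, Inf] the data is read as a row vector and [token, NULL] works.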

Modulo vs if statement in a loop

I have a loop like this:
for(i = 0; i < arrayLength; i++)
{
result[i / 8] = SETBIT(result[i / 8], ++(*bitIndex), array[i]);
*bitIndex %= 8;
}
I wonder what is better, performance-wise, if I use the above style, or this style:
for(i = 0; i < arrayLength; i++)
{
result[i / 8] = SETBIT(result[i / 8], ++(*bitIndex), array[i]);
if(*bitIndex == 8) *bitIndex = 0;
}
This is in C, compiled with GCC, an explanation would be appreciated as well.
Thanks

It doesn't make any sense to talk about optimization...
without a specific target system in mind
before you have ensured that all compiler optimizations are enabled and work
before you have performed some kind of benchmarking
In your case, for most systems, there will be no difference between the two examples. What you really should focus on is to write readable code, instead of doing the opposite. Consider rewriting your program into something like this:
for(i = 0; i < arrayLength; i++)
{
(*bitIndex)++;
SETBIT(&result[i / 8], *bitIndex, array[i]);
*bitIndex %= 8;
}
This will likely yield exactly the same binary executable as what you already had, give or take a few CPU ticks.

In all honesty, there will only be a fraction of a second difference between them. But I believe the first would be SLIGHTLY better for simple READING style. However, this question references performance in which I still think the first would have the EVER SO SLIGHT edge for the fact the compiler is reading/computing less info.

You don't say why bitIndex is a pointer. But, assuming that bitIndex is a parameter to the function containing the given loop, I would pull the *bitIndex stuff out of the loop, so:
unsigned bi = *bitIndex ;
for(i = 0 ; i < arrayLength ; i++)
{
result[i / 8] = SETBIT(result[i / 8], ++bi, array[i]) ;
bi %= 8 ;
}
*bitIndex = bi ;
[I assume that the ++bi is correct, and that SETBIT() requires 1..8, where bi is 0..7.]
As noted elsewhere, the compiler has the % 8 for breakfast and replaces it by & 0x7 (for unsigned, but not for signed). With gcc -O2, the above produced a loop of 48 bytes of code (15 instructions). Fiddling with *bitIndex in the loop was 50 bytes of code (16 instructions), including a read and two writes of *bitIndex.
How much actual difference this makes is anybody's guess... it could be that the memory read and writes are completely subsumed by the rest of the loop.
If bitIndex is a pointer to a local variable, then the compiler will pull the value into a register for the duration of the loop all on its own -- so it thinks it's worth doing !
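To make the equivalence concrete, here is a small sketch of the wrapping variants discussed above. The function names are invented for illustration. For a counter that only ever reaches 8, the modulo, the branch and the mask all give the same result; the mask is only a valid replacement for unsigned values, because % keeps the sign of a negative left operand.

```cpp
// Ways of wrapping a 1..8 counter back to 0 when it reaches 8.
// (All function names here are hypothetical.)

// Modulo form, as in the first loop of the question.
unsigned wrapModulo(unsigned bi) { return bi % 8; }

// Branch form, as in the second loop of the question.
unsigned wrapBranch(unsigned bi) { return bi == 8 ? 0 : bi; }

// What a compiler can substitute for "% 8" on an unsigned value.
unsigned wrapMask(unsigned bi) { return bi & 0x7u; }

// With a signed operand the remainder is negative for negative input
// (division truncates toward zero since C++11), so a bare mask would
// give a different answer and the compiler has to do more work.
int signedRemainder(int x) { return x % 8; }
```

Which form ends up faster still depends on the target and the optimization level, so measuring remains the only reliable answer.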

Converting letters to numbers in C++

PROBLEM SOLVED: thanks everyone!
I am almost entirely new to C++ so I apologise in advance if the question seems trivial.
I am trying to convert a string of letters to a set of 2 digit numbers where a = 10, b = 11, ..., Y = 34, Z = 35 so that (for example) "abc def" goes to "101112131415". How would I go about doing this? Any help would really be appreciated. Also, I don't mind whether capitalization results in the same number or a different number. Thank you very much in advance. I probably won't need it for a few days but if anyone is feeling particularly nice how would I go about reversing this process? i.e. "101112131415" --> "abcdef" Thanks.
EDIT: This isn't homework, I'm entirely self taught. I have completed this project before in a different language and decided to try C++ to compare the differences and try to learn C++ in the process :)
EDIT: I have roughly what I want, I just need a little bit of help converting this so that it applies to strings, thanks guys.
#include <iostream>
#include <sstream>
#include <string>
int returnVal (char x)
{
return (int) x - 87;
}
int main()
{
char x = 'g';
std::cout << returnVal(x);
}

A portable method is to use a table lookup:
const unsigned int letter_to_value[] =
{10, 11, 12, /* ... */ 35};
// ...
letter = toupper(letter);
const unsigned int index = letter - 'A';
value = letter_to_value[index];
cout << value;

Each character has its own ASCII value. Try converting your characters to their ASCII values and then work with the difference.
Example:
int x = 'a';
cout << x;
will print 97; and
int x = 'a';
cout << x - 87;
will print 10.
Hence, you could write a function like this:
int returnVal(char x)
{
return (int)x - 87;
}
to get the required output.
And your main program could look like:
int main()
{
string s = "abcdef";
for (unsigned int i = 0; i < s.length(); i++)
{
cout << returnVal(s[i]);
}
return 0;
}

This is a simple way to do it, if a bit messy.
map<char, int> vals; // maps a character to an integer
int g = 1; // if a needs to be 10 then set g = 10
string alphabet = "abcdefghijklmnopqrstuvwxyz";
for(char c : alphabet) { // kooky krazy for loop
vals[c] = g;
g++;
}

What Daniel said, try it out for yourself.
As a starting point though, casting:
int i = (int)string[0] + offset;
will get you the number for a character; stringstream will be useful too.

How would I go about doing this?
By trying to do something first, and looking for help only if you feel you cannot advance.
That being said, the most obvious solution that comes to mind is based on the fact that characters (i.e. 'a', 'G') are really numbers. Suppose you have the following:
char c = 'a';
You can get the number associated with c by doing:
int n = static_cast<int>(c);
Then, add some offset to 'n':
n += 10;
...and cast it back to a char:
c = static_cast<char>(n);
Note: The above assumes that characters are consecutive, i.e. the number corresponding to 'a' is equal to the one corresponding to 'z' minus the amount of letters between the two. This usually holds, though.

This can work
int Number = 123; // number to be converted to a string
string Result; // string which will contain the result
ostringstream convert; // stream used for the conversion
convert << Number; // insert the textual representation of 'Number' in the characters in the stream
Result = convert.str(); // set 'Result' to the contents of the stream
You should add these headers:
#include <sstream>
#include <string>

Many answers will tell you that characters are encoded in ASCII and that you can convert a letter to an index by subtracting 'a'.
This is not proper C++. It is acceptable when your program requirements include a specification that ASCII is in use. However, the C++ standard alone does not require this. There are C++ implementations with other character sets.
In the absence of knowledge that ASCII is in use, you can use translation tables:
#include <limits.h>
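// Define a table to translate from characters to desired codes.
// Note: the [index] = value designators are C99 syntax, not standard C++.
static unsigned int Translate[UCHAR_MAX + 1] = // one entry per possible unsigned char value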
{
['a'] = 10,
['b'] = 11,
…
};
Then you may translate characters to numbers by looking them up in the table:
unsigned char x = something;
int result = Translate[x];
Once you have the translation, you could print it as two digits using printf("%02d", result);.
Translating in the other direction requires reading two characters, converting them to a number (interpreting them as decimal), and performing a similar translation. You might have a different translation table set up for this reverse translation.
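As a sketch of both directions in plain C++, one option is to use an explicit alphabet string as the table, so that nothing depends on the character encoding. The names kAlphabet, encodeLetters and decodeLetters are hypothetical; the sketch assumes that non-letters (such as the space in "abc def") are simply skipped and that upper and lower case map to the same number.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Position in this string + 10 is the letter's code (a = 10 ... z = 35).
// (kAlphabet, encodeLetters and decodeLetters are hypothetical names.)
const std::string kAlphabet = "abcdefghijklmnopqrstuvwxyz";

// Letters become two-digit codes; any other character is skipped.
std::string encodeLetters(const std::string& text) {
    std::string digits;
    for (char ch : text) {
        const char lower = static_cast<char>(
            std::tolower(static_cast<unsigned char>(ch)));
        const std::size_t pos = kAlphabet.find(lower);
        if (pos != std::string::npos) {
            digits += std::to_string(pos + 10);
        }
    }
    return digits;
}

// Reverse direction: read two digits at a time and map them back.
// Pairs that are not valid codes (outside 10..35) are skipped.
std::string decodeLetters(const std::string& digits) {
    std::string text;
    for (std::size_t i = 0; i + 1 < digits.size(); i += 2) {
        if (!std::isdigit(static_cast<unsigned char>(digits[i])) ||
            !std::isdigit(static_cast<unsigned char>(digits[i + 1]))) {
            continue;
        }
        const int code = (digits[i] - '0') * 10 + (digits[i + 1] - '0');
        if (code >= 10 && code <= 35) {
            text += kAlphabet[code - 10];
        }
    }
    return text;
}
```

The subtraction of '0' is safe on any implementation, since the C++ standard requires the digits 0 to 9 to have consecutive values; no such guarantee exists for letters, which is why they go through the alphabet string.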

Just do this:
(s[i] - 'A' + 1)
Basically, we convert a char to a number by subtracting 'A' from it and then adding 1, so that 'A' maps to 1. (Add 10 instead of 1 if 'A' should map to 10, as in the question.)

C/C++ function for generating a hash for passwords (using MD5 or another algorithm)?

I'm looking for a function for C/C++ that behaves identically to PHP's md5() function -- pass in a string, return a one-way hash of that string. I'm also open to other algorithms than md5() if they are as secure (or more secure), reasonably fast, and ideally one-way.
The reason I'm searching for said function is for the same purpose I would use PHP's md5() function: to store a one-way hash of a user's password in a database rather than the actual text of the user's password (in case the database's data is ever compromised, the user's passwords would still be relatively secret).
I've spent around two hours searching now. All the code I've found either was for getting an MD5 of file data (instead of just a string), wouldn't compile, was for another programming language, or required an entire library (such as Crypto++, OpenSSL, hashlib++) to be added to my project, some of which are very large (is that really necessary when all I want is just one one-way string hashing function?).
Seeing as how this is a common need, I'm assuming someone has already written and made available exactly what I'm looking for.. can someone point me to it?
Thanks in advance.

Seriously, use a library (OpenSSL is a good choice). They're well-tested, and you can just drop them into your project without having to worry if you get the code right or not. Don't worry about the size of the library, any functions you don't use will not be included in your final executable.
I'd also recommend avoiding MD5, as it has known weaknesses, in favor of something stronger such as SHA-256 or bcrypt (the password hash built on Blowfish).
But whichever algorithm and implementation you go with, do not forget to salt your inputs!

Wikipedia's MD5 simple implementation has easy code and is very fast.
I would recommend it over the above solutions (for MD5 if it must be MD5) because it does not require an external library and the code does not contain #ifdefs
/*
* Simple MD5 implementation
*
* Compile with: gcc -o md5 -O3 -lm md5.c
*
* NOTE: this code only works on little-endian machines.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
// Constants are the integer part of the sines of integers (in radians) * 2^32.
const uint32_t k[64] = {
0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee ,
0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501 ,
0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be ,
0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821 ,
0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa ,
0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8 ,
0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed ,
0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a ,
0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c ,
0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70 ,
0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05 ,
0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665 ,
0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039 ,
0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1 ,
0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1 ,
0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391 };
// leftrotate function definition
#define LEFTROTATE(x, c) (((x) << (c)) | ((x) >> (32 - (c))))
// These vars will contain the hash
uint32_t h0, h1, h2, h3;
void md5(uint8_t *initial_msg, size_t initial_len) {
// Message (to prepare)
uint8_t *msg = NULL;
int new_len;
uint32_t bits_len;
int offset;
uint32_t *w;
uint32_t a, b, c, d, i, f, g, temp;
// Note: All variables are unsigned 32 bit and wrap modulo 2^32 when calculating
// r specifies the per-round shift amounts
const uint32_t r[] = {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20,
4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21};
// Initialize variables - simple count in nibbles:
h0 = 0x67452301;
h1 = 0xefcdab89;
h2 = 0x98badcfe;
h3 = 0x10325476;
// Pre-processing: adding a single 1 bit
//append "1" bit to message
/* Notice: the input bytes are considered as bits strings,
where the first bit is the most significant bit of the byte. */
// Pre-processing: padding with zeros
//append "0" bit until message length in bit ≡ 448 (mod 512)
//append length mod (2 pow 64) to message
for(new_len = initial_len*8 + 1; new_len%512!=448; new_len++);
new_len /= 8;
msg = (uint8_t*)calloc(new_len + 64, 1); // also appends "0" bits
// (we alloc also 64 extra bytes...)
memcpy(msg, initial_msg, initial_len);
msg[initial_len] = 128; // write the "1" bit
bits_len = 8*initial_len; // note, we append the len
memcpy(msg + new_len, &bits_len, 4); // in bits at the end of the buffer
// Process the message in successive 512-bit chunks:
//for each 512-bit chunk of message:
for(offset=0; offset<new_len; offset += (512/8)) {
// break chunk into sixteen 32-bit words w[j], 0 ≤ j ≤ 15
w = (uint32_t *) (msg + offset);
#ifdef DEBUG
printf("offset: %d %x\n", offset, offset);
int j;
for(j =0; j < 64; j++) printf("%x ", ((uint8_t *) w)[j]);
puts("");
#endif
// Initialize hash value for this chunk:
a = h0;
b = h1;
c = h2;
d = h3;
// Main loop:
for(i = 0; i<64; i++) {
if (i < 16) {
f = (b & c) | ((~b) & d);
g = i;
} else if (i < 32) {
f = (d & b) | ((~d) & c);
g = (5*i + 1) % 16;
} else if (i < 48) {
f = b ^ c ^ d;
g = (3*i + 5) % 16;
} else {
f = c ^ (b | (~d));
g = (7*i) % 16;
}
temp = d;
d = c;
c = b;
b = b + LEFTROTATE((a + f + k[i] + w[g]), r[i]);
a = temp;
}
// Add this chunk's hash to result so far:
h0 += a;
h1 += b;
h2 += c;
h3 += d;
}
// cleanup
free(msg);
}
int main(int argc, char **argv) {
if (argc < 2) {
printf("usage: %s 'string'\n", argv[0]);
return 1;
}
char *msg = argv[1];
size_t len = strlen(msg);
// benchmark
int i;
for (i = 0; i < 1000000; i++) {
md5((uint8_t*)msg, len);
}
//var char digest[16] := h0 append h1 append h2 append h3 //(Output is in little-endian)
uint8_t *p;
// display result
p=(uint8_t *)&h0;
printf("%2.2x%2.2x%2.2x%2.2x", p[0], p[1], p[2], p[3]);
p=(uint8_t *)&h1;
printf("%2.2x%2.2x%2.2x%2.2x", p[0], p[1], p[2], p[3]);
p=(uint8_t *)&h2;
printf("%2.2x%2.2x%2.2x%2.2x", p[0], p[1], p[2], p[3]);
p=(uint8_t *)&h3;
printf("%2.2x%2.2x%2.2x%2.2x", p[0], p[1], p[2], p[3]);
puts("");
return 0;
}

There is a reference implementation for MD5 in C at the bottom of RFC 1321, which doesn't require any extra libraries.

Here is a site that has the MD5 algorithm in many languages:
http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html
Also, if you use Visual C++, you can use .NET, which has built-in cryptography support. Here is some documentation:
http://msdn.microsoft.com/en-us/library/system.security.cryptography.md5.aspx#Y0
Hope that helps!

SHA-1 is easy. Pseudocode here: http://en.wikipedia.org/wiki/SHA-1
HOWEVER, you need to salt your passwords. This means you save a few bytes of random data in front of the password and hashed password.
General form (salt is fixed length):
salt + sha1(salt + password) = hash
Update from a decade later: DO NOT USE. SHA-1 should be aged out now. The collision attack doesn't matter. SHA-1 is currently too fast and a dictionary attack is within range, salt or no salt.

See crypt(). It can do MD5 when passed a specific salt.

The Boost library has a fairly good implementation of the SHA-1 hash function. You can find the source for it here.

Parsing version numbers to real numbers

I would like to determine if one version number is greater than another. The version number could be any of the following:
4
4.2
4.22.2
4.2.2.233
...as the version number is beyond my control, so I couldn't say how many dots could actually exist in the number.
Since the number is not really a real number, I can't simply say,
Is 4.7 > 4.2.2
How can I go about converting a number, such as 4.2.2 into a real number that could be checked against another version number?
I would preferably like a ColdFusion solution, but the basic concept would also be fine.

This is ripped from the plugin update code in Mango Blog, and updated a little bit. It should do exactly what you want. It returns 1 when argument 1 is greater, -1 when argument 2 is greater, and 0 when they are exact matches. (Note that 4.0.1 will be an exact match to 4.0.1.0)
It uses the CF list functions, instead of arrays, so you might see a small performance increase if you switched to arrays instead... but hey, it works!
function versionCompare( version1, version2 ){
var len1 = listLen(arguments.version1, '.');
var len2 = listLen(arguments.version2, '.');
var i = 0;
var piece1 = '';
var piece2 = '';
if (len1 gt len2){
arguments.version2 = arguments.version2 & repeatString('.0', len1-len2);
}else if (len2 gt len1){
arguments.version1 = arguments.version1 & repeatString('.0', len2-len1);
}
for (i=1; i lte listLen(arguments.version1, '.'); i=i+1){
piece1 = listGetAt(arguments.version1, i, '.');
piece2 = listGetAt(arguments.version2, i, '.');
if (piece1 neq piece2){
if (piece1 gt piece2){
return 1;
}else{
return -1;
}
}
}
//equal
return 0;
}
Running your example test:
<cfoutput>#versionCompare('4.7', '4.2.2')#</cfoutput>
prints:
1

If version 4 actually means 4.0.0, and version 4.2 actually means 4.2.0, you could easily convert the version to a simple integer.
Suppose that every part of the version is between 0 and 99; then you could calculate an 'integer version' from X.Y.Z like this:
Version = X*100*100 + Y*100 + Z
If the ranges are bigger or smaller you could use factors higher or lower than 100.
Comparing the version then becomes easy.
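The basic concept can be sketched in C++ (not ColdFusion) as follows. versionToInteger is a hypothetical name, and the sketch relies on the assumptions stated above: at most three parts, each between 0 and 99, with missing parts counting as 0.

```cpp
#include <sstream>
#include <string>

// Pack up to three dot-separated parts into X*100*100 + Y*100 + Z.
// Assumes every part is a number in the range 0..99; missing parts
// count as 0 and parts beyond the third are ignored.
// (versionToInteger is a hypothetical name.)
long versionToInteger(const std::string& version) {
    std::istringstream parts(version);
    std::string part;
    long result = 0;
    int count = 0;
    while (count < 3 && std::getline(parts, part, '.')) {
        result = result * 100 + std::stol(part);
        ++count;
    }
    for (; count < 3; ++count) {
        result *= 100;  // pad missing parts with zeros
    }
    return result;
}
```

With this, "4.7" becomes 40700 and "4.2.2" becomes 40202, so an ordinary integer comparison gives the expected order. The approach breaks down as soon as a part exceeds the assumed range or a fourth part matters.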

Parse each number separately and compare them iteratively.
if (majorVersion > 4 ||
(majorVersion == 4 && minorVersion > 2) ||
(majorVersion == 4 && minorVersion == 2 && revision > 2))
{
// do something useful
}
// fail here
That's obviously not CF code, but you get the idea.

A version number is basically a period delimited array of numbers, so you can parse both versions into number arrays, and then compare each element in the first array to the corresponding element in the second array.
To get the array, do:
<cfset theArrayofNumbers = listToArray(yourVersionString, ".")>
and then you can do your comparisons.

You can split the string containing the version by periods, then start at the first index and compare down until one is greater than the other (or if they are equal, one contains a value the other does not).
I'm afraid I've never written in coldfusion but that would be the basic logic I'd follow.
This is a rough unoptimized example:
bool IsGreater(string one, string two)
{
    string[] v1 = one.Split('.');
    string[] v2 = two.Split('.');
    // Only compare the parts that both versions have.
    int count = Math.Min(v1.Length, v2.Length);
    for (int x = 0; x < count; x++)
    {
        int a = Convert.ToInt32(v1[x]);
        int b = Convert.ToInt32(v2[x]);
        if (a < b)
            return false;
        if (a > b)
            return true;
    } // If they are the same, move on to the next part.
    // If you're here, they were equal for the shortest version's part count.
    // The first one is greater only if it has additional subversions.
    return v1.Length > v2.Length;
}

There is no general way to convert multiple-part version numbers into real numbers, if there is no restriction on the size of each part (e.g. is 4.702.0 > 4.7.2?).
Normally you would define a custom comparison function by creating a sequence or array of version-number parts or components, so 4.7.2 is represented as [4, 7, 2] and 4.702.0 is [4, 702, 0]. Then you compare each element of the two arrays until they don't match:
left = [4, 7, 2]
right = [4, 702, 0]
# check index 0
# left[0] == 4, right[0] == 4
left[0] == right[0]
# equal so far
# check index 1
# left[1] == 7, right[1] == 702
left[1] < right[1]
# so left < right
I don't know about ColdFusion, but in some languages you can do a direct comparison with arrays or sequences. For example, in Python:
>>> left = [4, 7, 2]
>>> right = [4, 702, 0]
>>> left < right
True
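C++ offers the same kind of direct comparison: std::vector compares element by element with operator<. Below is a minimal sketch with hypothetical names parseVersion and compareVersions. It assumes every part is a non-negative integer, and it drops trailing zeros so that 4.0.1 and 4.0.1.0 compare as equal.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split "4.7.2" into {4, 7, 2}. Assumes every part is a non-negative integer.
// (parseVersion and compareVersions are hypothetical names.)
std::vector<int> parseVersion(const std::string& version) {
    std::vector<int> parts;
    std::istringstream in(version);
    std::string piece;
    while (std::getline(in, piece, '.')) {
        parts.push_back(std::stoi(piece));
    }
    // Drop trailing zeros so that 4.0.1 and 4.0.1.0 compare as equal.
    while (parts.size() > 1 && parts.back() == 0) {
        parts.pop_back();
    }
    return parts;
}

// Returns 1 if a is greater, -1 if b is greater, 0 if they are equal.
int compareVersions(const std::string& a, const std::string& b) {
    const std::vector<int> left = parseVersion(a);
    const std::vector<int> right = parseVersion(b);
    if (left < right) return -1;  // element-by-element comparison
    if (right < left) return 1;
    return 0;
}
```

A shorter vector that is a prefix of a longer one compares as smaller, which matches the usual reading that 4 comes before 4.0.1.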
