Computing a (Non-MD5) 128 bit hash with salt - crystal-lang

For a piece of code I am writing, I want to create a 128-bit hash, like the one in the MurmurHash3 library (https://pypi.python.org/pypi/mmh3/2.5.1).
Note: I also want to add a salt to the hash, which I already have as a string.
I was looking around and it was suggested to truncate a SHA256 hash to 128 bits, but is there a way to get SHA256 using Crystal?
I know it supports MD5 and SHA1 in its libraries, but could I even use the OpenSSL library in the code? Would this require the OS to be running OpenSSL?
EDIT:
There is an OpenSSL::Digest module in Crystal (https://crystal-lang.org/api/0.24.1/OpenSSL/Digest.html), but how can I generate a hash that will eventually be truncated to 128 bits?

You can use the OpenSSL module to generate a SHA256 digest, or a digest for any other algorithm supported by OpenSSL. However, I would not suggest truncating: a digest cut down to 128 bits has a higher chance of collision than the full 256-bit digest. Have you thought about porting Murmur to Crystal? I am sure many people would love to see the library. My other suggestion is to just use the full 256 bits, as it's more secure anyway.
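
For reference, here is a minimal sketch of the two options discussed (full SHA256 versus a digest truncated to 128 bits). It is written in Python with the standard hashlib module rather than Crystal's OpenSSL::Digest, and the way the salt is combined (simply prepended to the message) is an assumption, as are the function names.

```python
import hashlib


def salted_sha256(message: bytes, salt: bytes) -> bytes:
    """Full 32-byte SHA-256 digest of salt + message (salt prepended: an assumption)."""
    return hashlib.sha256(salt + message).digest()


def salted_hash128(message: bytes, salt: bytes) -> bytes:
    """First 16 bytes (128 bits) of the salted SHA-256 digest."""
    return salted_sha256(message, salt)[:16]


if __name__ == "__main__":
    salt = "my-salt".encode("utf-8")        # the salt is a string in the question
    data = "hello world".encode("utf-8")
    print(salted_sha256(data, salt).hex())   # 64 hex characters
    print(salted_hash128(data, salt).hex())  # 32 hex characters
```

The truncated value is simply a prefix of the full digest, so both can be produced from one hash computation.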

Related

Cryptographically secure RNG in C++ for RSA PKCS#1 (key generation)

I'm trying to re-implement the RSA key generation in C++ (as a hobby/learning playground) and by far my biggest problem seems to be generating a random number in range x,y which is also cryptographically secure (the primes p and q, for example).
I suppose using mt19937 or std::rand with a secure random seed (e.g. /dev/urandom or OpenSSL RAND_bytes etc) would not be considered 'cryptographically secure' in this case (RSA)?
ISAAC looked promising but I have zero clue on how to use it since I wasn't able to find any documentation at all.
Notably, this is also my first C++ project (I've done some C, Rust etc before... So C++ at least feels somewhat familiar and I'm not a complete newbie, mind you).
I suppose using mt19937 or std::rand with a secure random seed (e.g. /dev/urandom or OpenSSL RAND_bytes etc) would not be considered 'cryptographically secure' in this case (RSA)?
No, those are not cryptographically secure for basically any purpose.
ISAAC looked promising but I have zero clue on how to use it since I wasn't able to find any documentation at all.
Well, it has stood the test of time, I suppose. But I'd simply use a C++ library such as Crypto++ or Botan and implement only the RSA key pair generation yourself, borrowing one of the library's secure random generators. With a bit of luck it also has a bignum library, so you don't have to implement that either.
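
To illustrate the idea of borrowing a secure generator instead of seeding a general-purpose PRNG, here is a small sketch in Python (not C++), using the standard secrets module, which draws from the operating system's CSPRNG. The helper names are hypothetical, and the candidate returned still needs a primality test before it could serve as p or q.

```python
import secrets


def random_in_range(low: int, high: int) -> int:
    """Uniform random integer in [low, high], drawn from the OS CSPRNG."""
    if low > high:
        raise ValueError("low must not exceed high")
    return low + secrets.randbelow(high - low + 1)


def random_prime_candidate(bits: int) -> int:
    """Random odd integer with exactly `bits` bits (NOT yet tested for primality)."""
    if bits < 2:
        raise ValueError("need at least 2 bits")
    n = secrets.randbits(bits)
    # Force the top bit (so the size is exact) and the low bit (so it is odd).
    n |= (1 << (bits - 1)) | 1
    return n


if __name__ == "__main__":
    print(random_in_range(10, 20))
    print(random_prime_candidate(512).bit_length())
```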

Crypto++ DES implementation and the key length

I've been looking for a crypto library for C++ for a while, and finally came across Crypto++. The library seemed OK until I tried to use 3DES. The problem is that the key length for the DES algorithm implemented in this library is 64 bits (instead of the usual 56 bits). I spent some time searching for an explanation, and all I finally got was a couple of words from the Crypto++ documentation:
The DES implementation in Crypto++ ignores the parity bits (the least significant bits of each byte) in the key.
Does this mean that if I have a usual 56-bit key and want to decrypt some data with this library I have to "expand" my key by inserting a meaningless extra bit after every 7 bits of my key data? Or is there another way to use 56-bit keys with this wonderful library?
A usual DES key is 8 bytes; it is just that the LSB (least significant bit) of each byte, the parity bit, is ignored in almost all current implementations of DES.
But that raises the question of whether DES and/or 3DES (with 112-bit and/or 168-bit keys) should be used. The answer for DES: no; for 3DES: only for legacy compatibility. Both of these have been superseded by AES.
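
If you really do hold only the 56 key bits, the "expansion" asked about in the question can be sketched as below. This is an illustrative Python helper (the name expand_des_key is hypothetical, not part of Crypto++); it assumes the 56 bits are packed big-endian into 7 bytes and fills each parity position with an odd-parity bit, although an implementation that ignores parity would accept any value there.

```python
def expand_des_key(key56: bytes) -> bytes:
    """Expand 7 bytes (56 key bits) into an 8-byte DES key with odd-parity LSBs."""
    if len(key56) != 7:
        raise ValueError("expected exactly 7 bytes (56 bits)")
    bits = int.from_bytes(key56, "big")
    out = bytearray()
    for i in range(8):
        # Take the next group of 7 key bits, most significant group first.
        seven = (bits >> (49 - 7 * i)) & 0x7F
        # Odd parity: choose the LSB so the byte has an odd number of 1 bits.
        parity = 1 - (bin(seven).count("1") & 1)
        out.append((seven << 1) | parity)
    return bytes(out)


if __name__ == "__main__":
    print(expand_des_key(bytes(7)).hex())
```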

Use Cityhash for file integrity

I want to use a hash function to check file integrity, and encryption is unnecessary here, so I think the non-cryptographic hash CityHash may be a good choice, since all I want is speed and few collisions.
However, the source only provides a CityHash function that takes a fixed-length string as input and returns a hash code. How can I use that function to hash a file?
Can I divide the file into several chunks, calculate each chunk's hash code, and XOR the hash codes together? Will that affect the collision rate or the speed? Do you have any other good ideas?
This is not an appropriate application for CityHash, and it will exhibit poor collision resistance when used this way.
If you want a quick file integrity checksum, use a CRC family function, like CRC16. If you want something more extensive, the speed of cryptographic hashes such as SHA1 should be more than sufficient. (Almost any modern CPU can hash data basically as fast as it can read it from memory.)
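
Both suggestions can be applied to a file of any size by feeding it to the hash incrementally, rather than hashing chunks separately and XORing the results (an XOR of per-chunk hashes does not change when two chunks swap places). A minimal Python sketch, using hashlib and zlib from the standard library; CRC-32 stands in for the CRC family here, and the function name and chunk size are assumptions:

```python
import hashlib
import zlib


def file_checksums(path: str, chunk_size: int = 1 << 20):
    """Return (SHA-1 hex digest, CRC-32) of a file, read in fixed-size chunks."""
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            sha1.update(chunk)            # running SHA-1 state
            crc = zlib.crc32(chunk, crc)  # running CRC-32 value
    return sha1.hexdigest(), crc
```

The result is identical to hashing the whole file in one call, so the chunk size only affects memory use.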

Encrypt data block with C/C++, platform independent

Say I have a byte array of varying length and a pass-phrase. What is the quickest way to encrypt it in a platform-independent way?
PS. I can make a SHA1 digest of the pass-phrase, but how do I apply it to the byte array? Doing a simple repeated XOR makes it too obvious.
PS2. Sorry, crypto guys, if I'm asking too obvious stuff...
A hash (like SHA1) creates a one-way result; you cannot decrypt a hash. XORing the data is not secure by any means, so don't do that.
If you need to be able to decrypt the data, then I suggest using something like Twofish, a symmetric-key block cipher that is not restricted by licensing or patents (so you can find platform-independent reference code).
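
To see why the repeated XOR is not secure, here is a short Python sketch (the pass-phrase and message are made up for illustration): anyone who knows or guesses a stretch of plaintext at least as long as the key can XOR it with the ciphertext and read the key straight out.

```python
import hashlib
from itertools import cycle


def repeating_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with the key repeated as often as needed (insecure; demo only)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# The scheme from the question: SHA-1 of the pass-phrase used as a 20-byte XOR key.
key = hashlib.sha1(b"my pass-phrase").digest()
plaintext = b"A header whose contents an attacker can guess easily..."
ciphertext = repeating_xor(plaintext, key)

# Known-plaintext attack: ciphertext XOR plaintext yields the repeated key.
recovered = repeating_xor(ciphertext, plaintext)[:len(key)]
print(recovered == key)
```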

Fastest and LightWeight Hashing Algorithm for Large Files & 512 KB Chunks [C,Linux,MAC,Windows]

I'm working on a project which involves computing hashes for files. The project is like a file backup service, so when a file gets uploaded from client to server, I need to check whether that file is already available on the server. I generate a CRC-32 hash for the file and then send the hash to the server to check whether it's already available.
If the file is not on the server, I send the file as 512 KB chunks (for dedupe), and I have to calculate a hash for each 512 KB chunk. The files may sometimes be a few GB in size, and multiple clients will connect to the server. So I really need a fast and lightweight hashing algorithm for files. Any ideas?
P.S.: I have already noticed some hashing algorithm questions on Stack Overflow, but the answers don't quite compare the hashing algorithms for exactly this kind of task. I bet this will be really useful for a bunch of people.
Actually, CRC32 has neither the best speed nor the best distribution.
This is to be expected: CRC32 is pretty old by today's standards, and was created in an era when CPUs were not 32/64 bits wide and did not execute out of order (OoO-Ex); distribution properties were also less important than error detection. All these requirements have changed since.
To evaluate the speed and distribution properties of hash algorithms, Austin Appleby created the excellent SMHasher package.
A short summary of results is presented here.
I would advise to select an algorithm with a Q.Score of 10 (perfect distribution).
You say you are using CRC-32 but want a faster hash. CRC-32 is very basic and pretty fast. I would think the I/O time would be much longer than the hash time.
You also want a hash that will not have collisions, that is, two different files or 512 KB chunks getting the same hash value. You could look at any of the cryptographic hashes, like MD5 (do not use for secure applications) or SHA1.
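
As a rough sketch of how that could look for the 512 KB chunk scheme from the question, the Python snippet below reads the file once and produces both a whole-file SHA-1 and one SHA-1 per chunk. The function name and return shape are assumptions, not part of any particular backup tool.

```python
import hashlib

CHUNK_SIZE = 512 * 1024  # 512 KB, as in the question


def chunk_digests(path: str, chunk_size: int = CHUNK_SIZE):
    """Return (whole-file SHA-1 hex digest, list of per-chunk SHA-1 hex digests)."""
    whole = hashlib.sha1()
    per_chunk = []
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            whole.update(chunk)                               # whole-file hash
            per_chunk.append(hashlib.sha1(chunk).hexdigest())  # dedupe key for this chunk
    return whole.hexdigest(), per_chunk
```

The client could send the whole-file digest first and fall back to the per-chunk digests only when the server does not already have the file.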
If you are only using CRC-32 to check whether a file is a duplicate, you are going to get false duplicates, because different files can have the same CRC-32. You had better use SHA-1; CRC-32 and MD5 are both too weak.
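
How quickly a 32-bit hash runs into accidental false duplicates can be estimated with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n items and a b-bit hash. The Python sketch below assumes hash values are uniformly distributed and says nothing about deliberately constructed collisions, which is the separate weakness of MD5.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items uniform hashes collide."""
    space = 2.0 ** hash_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


if __name__ == "__main__":
    for bits in (32, 128, 160):  # CRC-32, MD5, SHA-1 output sizes
        print(bits, collision_probability(100_000, bits))
```

With 100,000 stored files or chunks, the 32-bit case already comes out above one half, while the 160-bit case is negligible.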