Compare AES with MD4 or another hash function - C++

Could somebody compare the safety of AES-CMAC with only 5 rounds, MD4, or some other hash function for checksums, both in terms of speed and safety? We want a fast and reliable checksum for detecting random errors in streams (no bad guys involved). Ideally also with a standard C/C++ implementation.

CRCs are explicitly designed to detect transmission errors; unlike hashes they provide concrete guarantees on what errors they will definitely detect. If it's transmission errors rather than an adversary you're concerned about, use a CRC.
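If you do go the CRC route, here is a minimal sketch (not from the original thread) using zlib's crc32(), which is one widely available C/C++ option; a table-driven CRC-32 you write yourself or Boost.CRC would work just as well:

    // Minimal sketch: CRC-32 of a buffer via zlib. Assumes zlib is
    // installed; compile with -lz. The buffer is a stand-in for a
    // chunk of the stream being checked.
    #include <zlib.h>
    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<unsigned char> data(1024, 0x42);   // example stream chunk
        uLong crc = crc32(0L, Z_NULL, 0);              // required initial value
        crc = crc32(crc, data.data(), data.size());    // can be called per chunk
        std::printf("CRC-32: %08lx\n", crc);
    }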

Related

Hashing large keys without collisions

I am writing a checkers engine. I am aware of standard (Zobrist) hashing schemes; however, because of the nature of my engine, they are unsuitable. Any collision of any sort will result in catastrophic errors.
To fix this, I would like to use the entire (unique) representation of the board state as the key. This should not really be a problem, since the state is determined by six 32-bit unsigned integers. This worked in Python without any problems besides speed. In C++, I'm using std::unordered_map.
Every way I've tried to implement this has failed. I've tried an std::pair of boost::uint_type128 as the key. Again, there needs to be a guarantee that there won't be collisions.
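One way to do this, sketched below on the assumption that the six 32-bit words fully determine the position: wrap them in a small struct used directly as the key. unordered_map compares keys with operator==, so a hash collision only slows a lookup down; it can never return the wrong entry. BoardState and BoardStateHash are illustrative names, not from the question:

    // Sketch: full board state as the unordered_map key.
    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>

    struct BoardState {
        std::array<std::uint32_t, 6> words{};
        bool operator==(const BoardState& other) const { return words == other.words; }
    };

    struct BoardStateHash {
        std::size_t operator()(const BoardState& s) const {
            std::size_t h = 0;
            for (std::uint32_t w : s.words)
                h = h * 1000003u ^ w;   // simple mix; any decent combiner works
            return h;
        }
    };

    int main() {
        std::unordered_map<BoardState, int, BoardStateHash> table;
        BoardState start{};             // hypothetical starting position
        table[start] = 0;
    }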

Are there faster hash functions for unordered_map/set in C++?

The default function is std::hash. I wonder if there are better hash functions for saving computational time, for integer keys as well as string keys?
I tried City Hash from Google for both integer and string keys, but its performance is a little worse than std::hash.
std::hash functions already perform well. I think you should try open-source hash functions.
Check this out https://github.com/Cyan4973/xxHash. I quote from its description: "xxHash is an Extremely fast Hash algorithm, running at RAM speed limits. It successfully completes the SMHasher test suite which evaluates collision, dispersion and randomness qualities of hash functions. Code is highly portable, and hashes are identical on all platforms (little / big endian)."
Also see this thread from another question on this site: Fast Cross-Platform C/C++ Hashing Library. FNV, Jenkins, and MurmurHash are known to be fast.
You need to explain 'better' in what sense. The fastest hash function would be to simply use the value itself, but that is useless. A more specific answer would depend on your memory constraints and what probability of collision you are willing to accept.
Also note that the built-in hash functions are implemented differently for different types, so I expect the hash functions for int and string to already be optimised, in the general sense, for time complexity and collision probability.
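For completeness, a hedged sketch of what plugging an external hash such as xxHash into unordered_map looks like; XXHasher is an illustrative name, and whether it actually beats std::hash for your keys is something you would have to measure:

    // Sketch: custom hasher wrapping xxHash's XXH64 for string keys.
    // Assumes xxhash.h (https://github.com/Cyan4973/xxHash) is on the
    // include path and the library is linked.
    #include <cstddef>
    #include <string>
    #include <unordered_map>
    #include "xxhash.h"

    struct XXHasher {
        std::size_t operator()(const std::string& s) const {
            return static_cast<std::size_t>(XXH64(s.data(), s.size(), 0 /* seed */));
        }
    };

    int main() {
        std::unordered_map<std::string, int, XXHasher> counts;
        counts["hello"] = 1;
    }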

Which error correction code should I use for GF(32)?

I searched for comparisons between Reed-Solomon, Turbo, and LDPC codes, but they all seem to focus on efficiency. I'm more interested in the commercial licensing of available libraries, ease of use, and GF(32), i.e. a code with 32 symbols only (available Reed-Solomon implementations work for GF(256) and above).
Efficiency (speed) is not relevant. The messages consist of 24 symbols.
Can you provide a quick comparison on the most well-known Reed-Solomon, Turbo and LDPC codes for this case in which speed is not relevant?
Thanks.
Basically, Reed-Solomon is optimal, meaning that you can correct exactly up to (n-k)/2 errors (k = length of your message, n = length of message + EC symbols), while Turbo codes and LDPC are near-optimal, meaning that you can correct up to (n-k-e)/2 where e is a small constant, so in ideal cases you are very close to (n-k)/2 (that's why it's called near-optimal: it's close to the Shannon limit). Turbo codes and LDPC have similar error-correction power, and there are lots of variants depending on your needs (you can find lots of literature reviews and presentations).
What the different variants of LDPC or Turbo codes do is optimize the algorithm to fit certain characteristics of the erasure channel (i.e., the data) so as to reduce the constant e (and thus approach the Shannon limit). So the best variant in your case depends on the details of your erasure channel. Also, to my knowledge, they are all in the public domain now (maybe not yet for the Turbo code patents, but if not, they will be soon).
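For a concrete sense of scale in this case (a back-of-the-envelope figure, assuming a standard Reed-Solomon construction, not something stated in the answer above): over GF(32) a Reed-Solomon codeword can be at most 2^5 - 1 = 31 symbols long, so a 24-symbol message leaves at most n - k = 31 - 24 = 7 parity symbols, which corrects up to floor(7/2) = 3 symbol errors, or up to 7 erasures if the error positions are known.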

Example of compiler optimizations that can be 'easily' done on C++ code but not C code

This question talks of an optimization of the sort function that cannot be readily achieved in C:
Performance of qsort vs std::sort?
Are there more examples of compiler optimizations which would be impossible or at least difficult to achieve in C when compared to C++?
As @sehe mentioned in a comment, it's about the abstractions more than anything else. In other words, if the language allows the coder to express intent better, then the compiler can emit code which implements that intent in a more optimal fashion.
A simple example is std::fill. Sure, for basic types you could use memset, but let's say it's an array of 32-bit unsigned longs. std::fill knows that the array size is a multiple of 32 bits. And depending on the compiler, it might even be able to assume that the array is properly aligned on a 32-bit boundary as well.
All of this combined may allow the compiler to emit code which sets the value 32 bits at a time, with no run-time checks to make sure that it is valid to do so. If we are lucky, the compiler will recognize this and replace it with a particularly efficient architecture-specific version of the code.
(In reality, GCC and probably the other mainstream compilers already do this for just about anything that could be considered equivalent to a memset, including std::fill.)
Often, memset is implemented in a way that has run-time checks for these types of things in order to choose the optimal code path. While this difference is probably negligible, the idea is that by better expressing the intent of "filling" an array with a specific value, we allow the compiler to make slightly better choices.
Other more complicated language features do a good job of using the expression of intent to get larger gains, but this is the simplest example.
To be clear, my point is not that std::fill is "better" than memset; rather, this is an example of how C++ allows better expression of intent to the compiler, giving it more information at compile time, so some optimizations become easier to implement.
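As a concrete illustration of that point (a sketch, not a claim about what any particular compiler emits for it):

    // std::fill vs memset on an array of 32-bit values. Both typically
    // compile to the same optimized fill; the difference is that
    // std::fill states the intent directly and also handles non-byte
    // patterns that memset cannot express.
    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <iterator>

    void fill_examples() {
        std::uint32_t buf[1024];

        std::memset(buf, 0, sizeof(buf));                 // byte-wise: fine for 0
        std::fill(std::begin(buf), std::end(buf), 0u);    // same effect, typed

        // memset cannot do this; std::fill can, and the compiler may vectorize it.
        std::fill(std::begin(buf), std::end(buf), 0xDEADBEEFu);
    }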
It depends a bit on what you think of as the optimization here. If you're thinking of it purely as "std::sort vs. qsort", then there are thousands of other similar optimizations. Using a C++ template supports inlining in situations where essentially the only reasonable alternative in C is to use a pointer to a function, and nearly no known compiler will inline the call made through it. Depending on your viewpoint, this is either a single optimization, or an entire (open-ended) family of them.
Another possibility is using template meta-programming to turn something into a compile-time constant that would normally have to be computed at run time in C. In theory you could usually do this in C by embedding a magic number via a #define, but that can lose context, flexibility, or both: in C++ you can define a constant at compile time, carry out an arbitrary calculation from that input, and produce a compile-time constant used by the rest of the code. Given the much more limited calculations you can carry out in a #define, that's not possible nearly as often.
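A small sketch of that idea, shown here with a constexpr function (C++14), which is the modern way to express the compile-time calculation that template meta-programming was traditionally used for; next_pow2 and the constants are illustrative, not from the answer:

    // The table size below is computed from an arbitrary calculation at
    // compile time; a #define can only express trivial expressions.
    #include <cstddef>

    constexpr std::size_t next_pow2(std::size_t n) {   // C++14 constexpr loop
        std::size_t p = 1;
        while (p < n) p *= 2;
        return p;
    }

    constexpr std::size_t kEntries   = 1000;
    constexpr std::size_t kTableSize = next_pow2(kEntries);  // evaluated at compile time

    int table[kTableSize];   // array size must be a compile-time constant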
Yet another possibility is function overloading and template specialization. These are separate, but give the same basic result: using code that's specialized to a particular type. In C, to keep the number of functions you deal with halfway reasonable, you frequently end up writing code that (for example) converts all integers to a long, then does math on that. Templates, template specialization, and overloading make it relatively easy to use code that keeps the smaller types their native sizes, which can give a substantial speed increase (especially when it can enable vectorizing the math).
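A minimal sketch of that point; sum is an illustrative function, and the claim is only that each instantiation keeps its argument type's native width rather than widening everything to long, as a single C function would have to:

    #include <cstdint>
    #include <vector>

    // Instantiated separately for each T, so int16_t math stays int16_t
    // (and can be vectorized as such) while double gets its own version.
    template <typename T>
    T sum(const std::vector<T>& v) {
        T total = T{};
        for (T x : v) total += x;
        return total;
    }

    // sum(std::vector<std::int16_t>{...}) and sum(std::vector<double>{...})
    // each produce specialized, inlinable code.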
One last obvious possibility stems from simply providing quite a few pre-built data structures and algorithms, and allowing such things to be packaged for relatively easy, efficient re-use. I doubt I could even count the number of times I wrote code in C using what I knew were relatively inefficient data structures and/or algorithms, simply because it wasn't worth the time to find (or adapt) a more efficient one for the task at hand. Yes, if it really became a major bottleneck, I'd go to the trouble of finding or writing something better -- but doing a bit of comparing, it's still fairly common to see the speed double when the same task is written in C++.
I should add, however, that all of these are undoubtedly possible with C, at least in theory. If you approach this from the viewpoint of something like language complexity theory and theoretical models of computation (e.g., Turing machines), there's no question that C and C++ are equivalent. With enough work writing specialized versions of each function, you could theoretically do all of the same things with C as you can with C++.
From a viewpoint of what code you can plan on really writing in a practical project, the story changes very quickly -- the limit on what you can do mostly comes down to what you can reasonably manage, not anything like the theoretical model of computation represented by the language. Levels of optimization that are almost entirely theoretical in C are not only practical, but quite routine in C++.
Even the qsort vs std::sort example is invalid. If a C implementation wanted, it could put an inline version of qsort in stdlib.h, and any decent C compiler could handle inlining the comparison function. The reason this usually isn't done is that it's massively bloated and of dubious performance benefit -- issues C++ folks tend not to care about...

Fast implementation of MD5 in C++

First of all, to be clear, I'm aware that a huge number of MD5 implementations exist in C++. The problem is that I'm wondering whether there is any comparison of which implementation is faster than the others. Since I'm using this MD5 hash function on files larger than 10 GB, speed is indeed a major concern here.
I think the point avakar is trying to make is: with modern processing power, the IO speed of your hard drive is the bottleneck, not the calculation of the hash. Getting a more efficient algorithm will not help you, as that is not (likely) the slowest point.
If you are doing anything special (thousands of rounds, for example) then it may be different, but if you are just calculating a hash of a file, you need to speed up your IO, not your math.
I don't think it matters much (on the same hardware; but indeed GPGPUs are different, and perhaps faster, hardware for that kind of problem). The main part of MD5 is a fairly complex loop of arithmetic operations. What does matter is the quality of compiler optimizations.
And what does also matter is how you read the file. On Linux, mmap and madvise and readahead could be relevant. Disk speed is probably the bottleneck (use an SSD if you can).
And are you sure you want MD5 specifically? There are simpler and faster hash algorithms (MD4, etc.). Still, your problem is more I/O-bound than CPU-bound.
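If you do go down the mmap/madvise route mentioned above, a hedged Linux-only sketch might look like the following; hash_update is a hypothetical stand-in for whatever MD5/MD4 update function you actually use, and error handling is minimal:

    // Map the file, tell the kernel the access will be sequential, and
    // feed the mapping to the hash. Linux/POSIX only.
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstddef>

    void hash_update(const unsigned char* data, std::size_t len);   // hypothetical

    bool hash_file(const char* path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return false;

        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return false; }

        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { close(fd); return false; }

        madvise(p, st.st_size, MADV_SEQUENTIAL);   // hint: sequential read

        hash_update(static_cast<const unsigned char*>(p), st.st_size);

        munmap(p, st.st_size);
        close(fd);
        return true;
    }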
I'm sure there are plenty of CUDA/OpenCL adaptations of the algorithm out there which should give you a definite speedup. You could also take the basic algorithm, think a bit, and get a CUDA/OpenCL implementation going yourself.
Block ciphers are perfect candidates for this type of implementation.
You could also get a C implementation of it and grab a copy of the Intel C compiler and see how good that is. The vectorization extensions in Intel CPUs are amazing for speed boosts.
A table of GPU hash-rate estimates is available here: http://www.golubev.com/gpuest.htm. It looks like your bottleneck will probably be your hard drive IO.