HMAC is a well-known algorithm used in authentication mechanisms. It builds on message digest algorithms such as MD5, SHA1 and SHA256. Python provides modules that allow you to create HMAC-shadowed passwords for use in your applications. The implementation is based on hashlib and hmac, and both rely on the OpenSSL library, wrapped through the Python C API. I have measured the behaviour of those algorithms in a threaded environment to find out which one is the most convenient for our sites, that is, which one avoids heavy CPU load when creating user accounts in batch processes. This matters to us because we use a VPS whose cost is based on resource usage.
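As a minimal sketch of what computing such a digest looks like with the standard hmac and hashlib modules, the snippet below produces an HMAC for the same input under several algorithms. The key, the password and the helper name `hmac_hexdigest` are made up for illustration; they are not taken from our application.

```python
import hmac

def hmac_hexdigest(key: bytes, message: bytes, algorithm: str = "sha256") -> str:
    """Return the hex-encoded HMAC of `message` under `key`.

    `algorithm` is any digest name known to hashlib, e.g. "sha384".
    """
    return hmac.new(key, message, digestmod=algorithm).hexdigest()

# Hypothetical values, for illustration only.
key = b"server-side-secret"
password = b"correct horse"

for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    digest = hmac_hexdigest(key, password, name)
    # Each hex character encodes 4 bits of the digest.
    print(f"{name:>6}: {len(digest) * 4} bits  {digest[:16]}...")
```

The digest length grows with the algorithm (128 bits for MD5 up to 512 bits for SHA512), which is part of why the stronger ones cost more CPU per call.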
The first observation is that, for all available algorithms, CPU usage appears to grow exponentially as the algorithm's complexity increases. Since hashlib does not require any locking from the caller, using it in a threaded environment is transparent. The module exposes no resource-locking or thread-blocking routines, so it does not require synchronisation in your own code as long as each thread works on its own hash object. OpenSSL probably uses some kind of locking internally, but it is transparent and did not visibly affect the runtime.
It is well known that the probability of hash collisions with weaker algorithms such as MD5 and SHA1 is higher than with SHA512 and similar ones. We probably just need an algorithm that performs better than the alternatives while still offering a safe digest, with SHA256 as the minimum requirement, so that we can store passwords for a long time without the risk of some password cracker breaking our security system. We know that MD5 and SHA1 are no longer safe.
If you’re willing to spend about 2,000 USD and a week or two picking up CUDA, you can put together your own little supercomputer cluster which will let you try around 700,000,000 passwords a second. At that rate you’ll be cracking those passwords at the rate of more than one per second. [Source bcrypt].
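To put that figure in perspective, here is a rough back-of-the-envelope calculation. Only the rate of 700,000,000 guesses per second comes from the quote; the 36-symbol alphabet (lowercase letters plus digits), the password lengths and the function name `exhaustive_search_seconds` are my own assumptions for illustration.

```python
def exhaustive_search_seconds(alphabet_size: int, length: int,
                              guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` symbols."""
    return alphabet_size ** length / guesses_per_second

RATE = 700_000_000  # guesses per second, taken from the quote above

# Assumed password policy: lowercase letters and digits (36 symbols).
for length in (6, 8, 10):
    seconds = exhaustive_search_seconds(36, length, RATE)
    print(f"{length} chars: {seconds:,.0f} s  (~{seconds / 3600:,.1f} h)")
```

This is a worst-case estimate for a fast, unsalted digest; real attacks usually try likely passwords first and finish much sooner.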
In a threaded environment, the weaker algorithms show similar performance and behaviour. All of them seem to behave in much the same way: thread creation is time-consuming, and once all threads have been created, the time required to complete the task begins to increase.
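A measurement of this kind can be sketched roughly as follows. This is not the exact script I used; the thread count, the number of digests per thread, the key and the function names `worker` and `time_threaded` are assumptions chosen to keep the example short.

```python
import hmac
import threading
import time

def worker(key, messages, algorithm, results, index):
    """Compute one HMAC per message and record this thread's elapsed time."""
    start = time.perf_counter()
    for message in messages:
        hmac.new(key, message, digestmod=algorithm).digest()
    results[index] = time.perf_counter() - start

def time_threaded(algorithm, n_threads=8, per_thread=2000):
    """Return (wall-clock seconds, list of per-thread seconds)."""
    key = b"benchmark-key"  # hypothetical key
    messages = [f"password-{i}".encode() for i in range(per_thread)]
    results = [0.0] * n_threads  # each thread writes only its own slot
    threads = [
        threading.Thread(target=worker,
                         args=(key, messages, algorithm, results, i))
        for i in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start, results

if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
        wall, per_thread = time_threaded(name)
        spread = max(per_thread) - min(per_thread)
        print(f"{name:>6}: wall {wall:.3f} s, per-thread spread {spread:.3f} s")
```

The per-thread spread is a crude way to see how homogeneous the completion times are for each algorithm; the absolute numbers depend heavily on the machine and the Python version.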
The performance of the stronger algorithms also seems to be similar across the group, but they are more time-consuming than the weaker ones. The higher algorithms, such as SHA384 and SHA512, behave more stably, with more homogeneous times to complete the task. The best option in terms of stability and strength seems to be SHA384, which performed better than SHA512 in my measurements and is reliable enough to store strong passwords. The bcrypt advice should be taken seriously by institutions that require strong password storage, but ordinary applications such as blogs do not require that kind of strength.
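For the batch account creation described in this post, storing and checking HMAC-SHA384 digests could look roughly like the sketch below. The server key, the usernames and the function names are hypothetical, and generating the initial passwords with the standard secrets module is my own choice for the example.

```python
import hmac
import secrets

SERVER_KEY = b"replace-with-a-real-secret"  # hypothetical key

def shadow_password(password: str, key: bytes = SERVER_KEY) -> str:
    """Return the hex HMAC-SHA384 digest to store instead of the password."""
    return hmac.new(key, password.encode("utf-8"), digestmod="sha384").hexdigest()

def verify_password(candidate: str, stored: str, key: bytes = SERVER_KEY) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(shadow_password(candidate, key), stored)

def create_accounts(usernames):
    """Return ({user: initial password}, {user: stored digest})."""
    initial, stored = {}, {}
    for user in usernames:
        password = secrets.token_urlsafe(12)  # random initial password
        initial[user] = password
        stored[user] = shadow_password(password)
    return initial, stored

initial, stored = create_accounts(["alice", "bob"])
print(verify_password(initial["alice"], stored["alice"]))  # True
print(verify_password("wrong", stored["alice"]))           # False
```

Only the digests would be kept on the server; the initial passwords are handed to the users and then discarded.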
We will probably adopt SHA384 as our best option. We need that process to finish quickly, and we do not have a TRNG installed on the machine to obtain enough entropy. We are thinking of buying an Entropy Key device to improve the performance of cryptographic routines, such as randomising initial passwords, and of the server in general. We are also thinking of buying some devices from Soekris Engineering to enhance the performance of certain tasks. It is well known that cryptographic tasks, such as TLS/SSL connections over TCP (HTTPS, for example), can lead to heavy CPU usage. I hope that all of those tools will serve us well.