The rustls project:
Compared to previous reports, the benchmark tool can now perform the same benchmarks in many threads simultaneously. Each thread runs the same benchmarking operation as before, and threads do not contend with each other except via the internals of the TLS library.
As before, the benchmarking is performed by measuring a TLS client “connecting” to a TLS server over a memory buffer – there is no network latency, system calls, or other overhead that would be present in a typical networked application. This arrangement actually should be the worst case for multithreaded testing: every thread should be working all the time (rather than waiting for IO) and therefore contention on any locks in the library under test should be maximal.
A fantastic showing by rustls: much higher handshakes per second than the other libraries tested, plus low latency with little variance. Testing was performed on an 80‑core Ampere Altra ARM server against BoringSSL and two versions of OpenSSL.
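To make the multithreaded setup described above more concrete, here is a minimal sketch of such a harness using only the Rust standard library. It is not the rustls project's benchmark tool. `simulated_handshake`, `run_benchmark`, and `BenchResult` are hypothetical names introduced for illustration, and the workload is a stand-in CPU loop rather than a real TLS handshake. Every thread runs the same operation back to back with no shared state, so any contention would have to come from inside the code under test.

```rust
use std::hint::black_box;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one in-memory TLS handshake: a pure CPU loop
/// with no I/O and no shared state. A real benchmark would drive a TLS
/// client and server over a memory buffer here instead.
fn simulated_handshake(seed: u64) -> u64 {
    let mut x = seed | 1;
    for _ in 0..1_000 {
        x = x
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
    }
    x
}

/// Aggregate results across all worker threads (illustrative type).
#[derive(Debug)]
struct BenchResult {
    total_ops: usize,
    ops_per_sec: f64,
    median_latency: Duration,
    max_latency: Duration,
}

/// Runs `work` `iters_per_thread` times on each of `threads` threads.
/// Threads never wait on I/O or on each other, so they are busy all the time.
fn run_benchmark(threads: usize, iters_per_thread: usize, work: fn(u64) -> u64) -> BenchResult {
    assert!(threads > 0 && iters_per_thread > 0);
    let start = Instant::now();

    let handles: Vec<_> = (0..threads)
        .map(|t| {
            thread::spawn(move || {
                // Each thread records its own latencies; nothing is shared.
                let mut latencies = Vec::with_capacity(iters_per_thread);
                for i in 0..iters_per_thread {
                    let begin = Instant::now();
                    // black_box keeps the optimizer from discarding the work.
                    black_box(work(black_box((t * iters_per_thread + i) as u64)));
                    latencies.push(begin.elapsed());
                }
                latencies
            })
        })
        .collect();

    // Merge per-thread latencies once every worker has finished.
    let mut all: Vec<Duration> = Vec::new();
    for handle in handles {
        all.extend(handle.join().expect("worker thread panicked"));
    }
    let elapsed = start.elapsed();
    all.sort();

    BenchResult {
        total_ops: all.len(),
        ops_per_sec: all.len() as f64 / elapsed.as_secs_f64(),
        median_latency: all[all.len() / 2],
        max_latency: all[all.len() - 1],
    }
}
```

Calling `run_benchmark(n, 1_000, simulated_handshake)` for increasing `n` and comparing `ops_per_sec` gives a rough picture of scaling. With a contention-free workload like this one, throughput should grow roughly with the thread count until the cores are exhausted, while a library that takes internal locks would be expected to fall short of that.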