BLAKE2s Hashing Accelerator: A Solo Tapeout Journey

Julia Desmazes:

Some people have hobbies, I find that cryptographic hashing accelerators are fascinating engineering challenges. Please hear me out. Each algorithm offers slightly different design tradeoffs, presents unique opportunities for optimization, is easy to validate, and each serves as a perfect excuse to voluntarily subject yourself to a Friday evening of debugging hash result mismatches.

Now that the stage is set, let me introduce today’s star: BLAKE2. BLAKE2 is a family of cryptographic hash functions that takes arbitrary amounts of data and compresses it down to a variable-sized digest (hash). This hash is then used as a digital signature for message authentication codes and integrity protection mechanisms.

The BLAKE2 family comes in two primary variants:

  • BLAKE2b, designed for 64-bit platforms
  • BLAKE2s, the 32-bit variant

What makes BLAKE2 particularly interesting is that it was originally designed and optimized for high software performance. It’s fundamentally a software-first algorithm, in contrast to AES, which maps so clearly to hardware that your architecture practically writes itself as you read the spec.

This article will mostly focus on the why behind the design choices and not an in-depth presentation of the design itself.

Chip design is a mystery to me and, I imagine, to most people. Julia’s post sheds a lot of light on the process and its challenges, and ends with a finalised design that will be manufactured as part of the Tiny Tapeout project.
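
For a software reference point, Python's standard `hashlib` module ships both BLAKE2 variants, which makes the "variable-sized digest" and message authentication points from the quote easy to poke at. This is only a rough sketch of the algorithm's interface, not anything taken from Julia's design; the message and key below are made-up placeholders.

```python
import hashlib

# Placeholder input, chosen only for illustration.
message = b"hello tapeout"

# BLAKE2s (32-bit variant): digests of up to 32 bytes.
full = hashlib.blake2s(message).digest()
# The digest length is a parameter of the hash, not a truncation of the full one.
short = hashlib.blake2s(message, digest_size=16).digest()

# BLAKE2b (64-bit variant): digests of up to 64 bytes.
wide = hashlib.blake2b(message).digest()

# Keyed mode turns the hash into a message authentication code.
# The key here is a hypothetical example value.
tag = hashlib.blake2s(message, key=b"example key", digest_size=16).hexdigest()

print(len(full), len(short), len(wide))
print(tag)
```

Because the requested digest size is mixed into the hash's parameters, the 16-byte digest is a different value from the first 16 bytes of the 32-byte one.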