This article discusses several open-source implementations of the HKDF scheme for Android. Since HKDF is a relatively simple algorithm, it makes for a good case study of cryptographic code. The primary audience is applied cryptographers and software engineers working on protocols.

Key derivation functions (KDFs) are an important building block for practical protocols. They take as input the input keying material (IKM), an optional salt, and a context string (info). Their output is a pseudorandom key of specified length. For most variants we can assume that the resulting key material is indistinguishable from random, which makes it easy to use as input for other cryptographic algorithms. Below is a typical example.

// Derive encryption key from master key
enc_aes_key = KDF(ikm=master_key, salt=salt, info="encryption_key", l=32)

KDFs are used in protocols for many purposes. First, we can use KDFs to transform key material to the key length required by specific algorithms. For example, reducing a 512-bit key to 128 bits for AES. Second, we can use KDFs to derive independent keys from the same key material. A common use case is domain separation where we want to ensure independent keys for different purposes, e.g. for encryption and integrity, to avoid accidental key re-use and other subtle interactions. For this we use different info parameters to bind the keys to the different contexts. Finally, we can use KDFs to combine key input from multiple sources (e.g. Signal’s X3DH) and to implement ratchet operations (e.g. Signal’s Double Ratchet).

The HMAC-based KDF (HKDF) is a versatile KDF scheme devised by Krawczyk and built using the Hashed Message Authentication Code (HMAC) primitive. It is handy because HMAC is commonly available on many platforms and very fast. It also comes with well-understood security guarantees for the output key material. HKDF works in two steps: during the extract phase, the IKM and salt are hashed into an intermediate pseudorandom key PRK. The expand phase then combines the PRK with the info parameter and iteratively produces output blocks. It does so by taking the previous round’s output and hashing it again together with the info and a new counter value. These blocks then form the output key material (OKM).

// extract
PRK = HMAC(salt, IKM)

// expand
T_0 = []
T_1 = HMAC(PRK, T_0 + info + "0x01")  // `+` is concatenation
T_2 = HMAC(PRK, T_1 + info + "0x02")
OKM = T_1 + T_2 + ...
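To make the two phases concrete, here is a minimal, self-contained Java sketch using the standard javax.crypto.Mac API. It is hard-coded to SHA-256 and is my own illustration; the class and method names are not taken from any of the implementations discussed below.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

public class SimpleHkdf {
    private static final String ALG = "HmacSHA256";
    private static final int HASH_LEN = 32; // output length of SHA-256 in bytes

    // extract: PRK = HMAC(salt, IKM)
    static byte[] extract(byte[] salt, byte[] ikm) throws GeneralSecurityException {
        if (salt == null || salt.length == 0) {
            // RFC 5869: a missing salt defaults to HashLen zero bytes; an empty
            // salt yields the same HMAC result because HMAC keys are zero-padded
            salt = new byte[HASH_LEN];
        }
        Mac mac = Mac.getInstance(ALG);
        mac.init(new SecretKeySpec(salt, ALG));
        return mac.doFinal(ikm);
    }

    // expand: T_i = HMAC(PRK, T_{i-1} + info + i), OKM = T_1 + T_2 + ...
    static byte[] expand(byte[] prk, byte[] info, int length) throws GeneralSecurityException {
        if (length > 255 * HASH_LEN) {
            throw new GeneralSecurityException("requested output too long");
        }
        Mac mac = Mac.getInstance(ALG);
        mac.init(new SecretKeySpec(prk, ALG));
        byte[] okm = new byte[length];
        byte[] t = new byte[0]; // T_0 is the empty string
        int pos = 0;
        for (int i = 1; pos < length; i++) {
            mac.update(t);
            mac.update(info);
            mac.update((byte) i); // single-byte counter starting at 0x01
            t = mac.doFinal();    // doFinal resets the Mac for the next round
            int n = Math.min(t.length, length - pos);
            System.arraycopy(t, 0, okm, pos, n);
            pos += n;
        }
        return okm;
    }
}
```

Running this against the test vectors from RFC 5869 reproduces the expected output key material.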

Password hashing and key stretching are a special case of key derivation. Here the password serves as low-entropy input keying material and, in turn, the function is intentionally designed to require high computational or memory costs. I discussed password hashing on Android in a previous blog post.

KDFs on Android

Despite the ubiquity of HKDF, it is not part of the cryptography API on Android (or the JDK in general). Luckily, its specification, RFC 5869, is relatively simple and all the underlying cryptographic primitives are readily available. Therefore, it is tempting to simply write a custom implementation as part of the overall protocol. This led to the diverse set of open-source implementations that we can discuss here.

The motivation for this blog post is to critically discuss how cryptographic algorithms are implemented – partially inspired by Filippo’s discussion of Restic. In particular, I compare different approaches to common challenges and weigh overall design decisions. Also, I hope to inspire some appreciation for the intricate task that is implementing cryptography for the real world.

We will look at the HKDF implementations from the following projects and reference them by nicknames:

- SignalHkdf: the Signal app for Android
- MozillaHkdf: Mozilla’s Sync implementation for Android
- AwsHkdf: the AWS SDK
- GoogleTinkHkdf: Google’s Tink library
- PatrickFavHkdf: the stand-alone HKDF library by patrickfav
- SlothHkdf: my own Sloth prototype

We want implementations that are secure, correct, fast, complete, and readable. The good news is that all covered implementations seem to be fit for purpose – however, this blog post is not a security review and thus provides no definitive answer in this regard. Where implementations are incomplete or idiosyncratic, they appear to work fine within the constraints of the surrounding protocol.

Common themes

In this section, I discuss common challenges that pertain not only to HKDF, but also when implementing other cryptographic algorithms.

Test vectors

Specifications, and in particular RFCs, typically come with test vectors. Hence, they are the obvious choice to include in unit tests. I have implemented a small test harness that runs the test vectors from the RFC against all implementations.

The implementations SignalHkdf, MozillaHkdf, and AwsHkdf are hard-coded to SHA-256 and hence fail test cases 4–7, which use SHA-1 for the underlying HMAC primitive. This is fine as they are only used within the constraints of their surrounding protocols. In fact, this limitation might be a feature, because SHA-1 is considered broken.

Test Case 6 turned out to be interesting because it uses a zero-length salt value, i.e. byte[0]. This is different from not providing a salt value at all, i.e. a null value. Some implementations, e.g. SignalHkdf, do not handle this case correctly and throw a java.lang.IllegalArgumentException: Empty key. I discuss this in more detail when talking about SecretKeySpec in the respective subsection below. Ideally, this special case should have been included in the RFC test vectors for both hash functions so that it is easy to add to tests even when supporting only one of them.

I also added additional test cases. For instance, the specification requires that the output key material must not exceed 255*HashLen bytes where HashLen is the length of the output of the underlying HMAC scheme. However, both SignalHkdf and MozillaHkdf allow arbitrary length output. We discuss this check in more detail in the next subsection.

Calculating iteration count

The specification requires that we perform the expansion step no more than 255 times, i.e. the length of the OKM is bounded by 255*HashLen where HashLen is the HMAC output length. Implementations approach this logic differently.

The implementations SignalHkdf, MozillaHkdf, and PatrickFavHkdf convert output size and hash length to floating point numbers and then perform a division. The result is then rounded up and converted back to int to get the explicit iteration count.

int iterations = (int) Math.ceil(((double)len) / ((double)BLOCKSIZE));

While straightforward, it is not obvious that this is correct. A concerned reviewer might point out that conversions between floating point and integer numbers can lose information. In this case we are actually fine for the block sizes of SHA-256 and SHA-1, because int has 32-bit precision which can be fully represented in the 53-bit “significand” of the double type. However, using this approach with other type combinations like int + float or long + double will yield wrong results. A more reliable approach is a dedicated ceilDivision method for integer types:

// adapted from com.lambdapioneer.sloth.utils.ceilOfIntegerDivision:15
internal fun ceilOfIntegerDivision(a: Int, b: Int): Int {
    check(b > 0)
    return if (a % b == 0) { a / b } else { (a / b) + 1 }
}
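To make the pitfall concrete, here is a small stand-alone Java demonstration: 2^24 + 1 is the first integer that a float cannot represent exactly, so a float-based ceiling computation is already off by one there.

```java
public class CeilDivisionDemo {
    // Integer-only ceiling division: exact for all a >= 0 and b > 0
    static int ceilDiv(int a, int b) {
        if (b <= 0) throw new IllegalArgumentException("b must be positive");
        return (a % b == 0) ? a / b : (a / b) + 1;
    }

    public static void main(String[] args) {
        int a = 16_777_217; // 2^24 + 1, not exactly representable as float

        int viaFloat = (int) Math.ceil((float) a / (float) 1);
        int viaInt = ceilDiv(a, 1);

        System.out.println(viaFloat); // prints 16777216 (wrong: the cast to float lost the +1)
        System.out.println(viaInt);   // prints 16777217 (correct)
    }
}
```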

Clever implementers avoid the explicit division altogether. Both AwsHkdf and GoogleTinkHkdf check the requested output size against the upper bound derived with multiplication. The loop then simply compares the current OKM length with the targeted output length. Note that this comparison is safe against integer overflow, because we know the upper bound of getMacLength().

if (size > 255 * mac.getMacLength()) {
    throw new GeneralSecurityException("size too large");
}

Code structure

The RFC organizes the algorithm into separate extract and expand methods. This structure aids understanding and many implementations have followed this lead. It makes for clean code (see e.g. SlothHkdf, also code below) that is easy to compare against the specification. In addition, one can also use the intermediate results from the test vectors for more thorough unit tests.

// com.lambdapioneer.sloth.crypto.Hkdf:32
internal fun extract(salt: ByteArray?, ikm: ByteArray): ByteArray {

// com.lambdapioneer.sloth.crypto.Hkdf:38
internal fun expand(prk: ByteArray, info: ByteArray, l: Int): ByteArray {

However, clean code can interfere with performance goals and requires a more careful and subtle implementation to avoid extra work. In the HKDF case, one risk is creating many new instances of the Mac engine instead of reusing previous ones. SlothHkdf solves this by caching an instance in a member field that is used in both methods.

In GoogleTinkHkdf everything is implemented in one method. This makes efficient re-use of objects trivial. Compared to SlothHkdf above, the avoidance of instance-level state also makes the code thread-safe by default.

public static byte[] computeHkdf(
    String macAlgorithm, final byte[] ikm, final byte[] salt,
    final byte[] info, int size)
    throws GeneralSecurityException {

Nevertheless, no one size fits all. For example, consider a protocol that expands multiple times (e.g. with different info) from the same IKM. Then the ability to re-use the intermediate PRK result from the extract step can potentially save some hash operations.
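In the pseudocode notation from above, that looks like this (the info strings are illustrative):

```
PRK = extract(salt, IKM)
enc_key = expand(PRK, info="encryption", l=32)
mac_key = expand(PRK, info="integrity", l=32)
```

Here the extract step, and its HMAC invocation, runs only once for both keys.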

Salty SecretKeySpec

The salt parameter for HKDF is optional and it can be either not provided (null) or empty (byte[0]). For the first case, HKDF specifies that the salt should be set to an all-zero array with the output length of the underlying hash function. All mentioned implementations handle this case correctly. However, for some, e.g. SignalHkdf, an empty salt value causes an exception to be thrown, as the implementation feeds the salt value directly into SecretKeySpec whose constructor requires a non-empty key.

I assume that this is an extra safety check which is meant to protect against developer mistakes, since it is not strictly required. In fact, the HMAC spec is defined for empty keys: “append zeros to the end of K to create a B byte string” (see RFC 2104, Section 2). We can easily verify that both relevant implementations do so: OpenJDK’s HmacCore sets the 0x00 bytes using a clever if-statement, and Android’s implementation, or rather the underlying hmac.c, achieves this via padding.
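We can also check this empirically. The sketch below is my own; the rawKey helper is a hypothetical stand-in that bypasses SecretKeySpec’s non-empty check. It shows that HMAC-SHA256 with an empty key produces the same tag as with a 32-byte all-zero key, since both are zero-padded to the same 64-byte block:

```java
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class EmptyKeyHmacDemo {
    // Hypothetical helper: a raw SecretKey without SecretKeySpec's non-empty check
    static SecretKey rawKey(final byte[] key) {
        return new SecretKey() {
            public String getAlgorithm() { return "HmacSHA256"; }
            public String getFormat() { return "RAW"; }
            public byte[] getEncoded() { return key.clone(); }
        };
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] message = "test vector input".getBytes();

        Mac withEmptyKey = Mac.getInstance("HmacSHA256");
        withEmptyKey.init(rawKey(new byte[0])); // empty key, zero-padded to 64 bytes

        Mac withZeroKey = Mac.getInstance("HmacSHA256");
        withZeroKey.init(new SecretKeySpec(new byte[32], "HmacSHA256")); // 32 zero bytes

        // Both keys pad to the same all-zero block, so the tags are identical
        System.out.println(Arrays.equals(
            withEmptyKey.doFinal(message), withZeroKey.doFinal(message))); // prints true
    }
}
```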

This leaves implementations with two choices. The simple solution is to handle the empty salt case in the same way as they already handle the null case. However, technically the empty case’s behavior then relies on a different spec, which is worth pointing out in a comment. Alternatively, because the underlying implementations are in fact happy with empty keys, we can create our own SecretKeySpec that does not perform the superfluous check for zero-length keys.

// com.lambdapioneer.sloth.crypto.Hkdf:58
private class PermissiveSecretKeySpec(
    private val key: ByteArray,
    private val algorithm: String,
) : KeySpec, SecretKey {
    override fun getAlgorithm() = algorithm
    override fun getFormat() = "RAW"
    override fun getEncoded() = key.clone()
}

Cloning secrets and zero-ing memory

There are, of course, many more aspects worth discussing. I will briefly mention them here for completeness, but might expand on them in a separate blog post.

For instance, there are some pitfalls regarding varying versions of the API, and, more interestingly, between the JDK used for local unit tests and the Android runtime used by the real app and instrumentation tests. Concretely, OpenJDK v8 overwrites the encoded key returned by the SecretKey inside HmacCore.engineInit (see below), whereas this is not the case for the Android implementation. As a result, implementations of SecretKeySpec that simply return the underlying ByteArray as a reference in getEncoded() work fine in instrumentation tests, but will produce wrong results when run as local unit tests.

// com.sun.crypto.provider.HmacCore:95
protected void engineInit(Key key, AlgorithmParameterSpec params)
    throws InvalidKeyException, InvalidAlgorithmParameterException {
    // ...
    Arrays.fill(secret, (byte)0);
    secret = null;

Because of this, the default SecretKeySpec implementation clones its byte[] key member in getEncoded() so that it always returns a fresh copy. However, it also clones the raw key data already in its constructor. As a result, the key might reside in memory at three locations: (i) the call site, (ii) the member field of SecretKeySpec, and (iii) the caller of getEncoded(). Even if (iii) erases the key (as engineInit above does) and (i) does the same, there is no way to zero the secret within SecretKeySpec. As a result, code that does so at the call site might give a wrong impression of security.

// javax.crypto.spec.SecretKeySpec:91
public SecretKeySpec(byte[] key, String algorithm) {
    this.key = key.clone();  // NOTE: first clone

// javax.crypto.spec.SecretKeySpec:183
public byte[] getEncoded() {
    return this.key.clone(); // NOTE: second clone
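The consequence is easy to demonstrate with a short sketch: erasing the caller’s array after constructing the key does not touch the internal copy.

```java
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class CloneDemo {
    public static void main(String[] args) {
        byte[] key = {1, 2, 3, 4};
        SecretKeySpec spec = new SecretKeySpec(key, "HmacSHA256");

        Arrays.fill(key, (byte) 0); // attempt to erase the secret at the call site

        // The constructor cloned the array, so the secret lives on inside the spec
        System.out.println(Arrays.toString(spec.getEncoded())); // prints "[1, 2, 3, 4]"
    }
}
```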

While SecretKeySpec inherits the Destroyable interface via SecretKey, it does not effectively implement either of its methods. Nevertheless, this pattern could, once generally adopted, provide a scalable solution for effective secret management in Java/Kotlin. Other languages like Rust and C++ allow for a more natural approach to clearing memory by using RAII patterns. See, for example, the Rust zeroize::ZeroizeOnDrop derive macro.

Discussion of selected implementations

All implementations appear fit for purpose within their respective applications. I also want to make clear that the aim of this post is to educate and not to provide a review/ranking of the chosen implementations.

The HKDF library by patrickfav appears to be a great choice for simply importing HKDF functionality as a dependency. It passes all test vectors and provides plenty of flexibility regarding the underlying hashing algorithms and separate extract/expand calls. However, compared to the other implementations, it introduces a rather complex API with abstractions (e.g. HkdfMacFactory or Expander) that do not appear strictly necessary.

Personally, I like the implementation in Google’s Tink library a lot. It implements the algorithm correctly and efficiently in a single method of about 30 lines of code. While compact, the code is clear and the interesting edge cases for the salt are highlighted. In particular, the expand operation is written in an idiomatic Java-style buffer-processing loop. If one is already using Tink, this is perfect.

The Mozilla Sync implementation is slightly odd. It contains weird and unnecessary logic for converting int to byte by first encoding and decoding them as hex strings. Also, many of its internal methods are public when they could be private.
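For comparison, converting the round counter to a byte needs no string round-trip in Java; a plain cast (and a mask for the reverse direction) suffices:

```java
public class CounterCastDemo {
    public static void main(String[] args) {
        int counter = 200; // HKDF counters range from 1 to 255

        byte asByte = (byte) counter;  // direct cast keeps the low 8 bits
        int roundTrip = asByte & 0xFF; // mask recovers the unsigned value

        System.out.println(roundTrip); // prints 200
    }
}
```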

In the AWS implementation we see many attempts to erase byte arrays by filling them with 0x00 bytes. However, as discussed above, this might give a false sense of security, as SecretKeySpec creates non-zeroable copies itself. Also, some parts of the code, e.g. explicitly initializing a new byte array to 0x00 (which Java already guarantees), indicate that the code might have been translated from a different language without much attention.

I wrote my own HKDF implementation as part of the Sloth prototype. Since Sloth is a library, I wanted to avoid introducing additional dependencies. It’s the only Kotlin implementation in this selection and uses its own PermissiveSecretKeySpec to deal with empty salt values. However, the expand method could benefit from optimization, as it currently creates more temporary byte arrays than necessary.


Even simple cryptographic algorithms are not simple to implement. We have discussed typical challenges such as understanding the underlying cryptographic API, code structure, and numerical operations. Once an implementation passes the specification’s test vectors, it is usually correct for most real-world input. However, its edge cases typically require extra effort to analyze and test.

Knowledge of how to implement cryptographic algorithms is rare and there is a lack of literature. Papers like this one discussing the details of a Curve25519 implementation remain a rarity, with no clear venues for publication.

P.S. This has been the longest blog post I have written so far and its research has taken multiple days. Please reach out if you find mistakes or have feedback.

Credits: cover photo by Nick on Unsplash.