Working with customers developing embedded systems one often needs compact cryptographic primitives to implement the security functions required by the applications running on the systems. With the prevalence of Internet of Things (IoT) and increased connectivity in automotive space, that need keeps increasing. A term quite often used in this context is lightweight algorithms.
I have found that one way to really gain familiarity with a cryptographic primitive is to implement it. To that effect I’m continuously developing and releasing open hardware implementations of ciphers, hash functions, block cipher modes. In this post we will take a look a what lightweight algorithms means as well as looking at a fairly recent lightweight block cipher called PRINCE, which I have implemented. A block cipher with some properties that I think makes it a viable alternative to the block cipher AES for some applications.
AES – Long Live the King
In the world of block ciphers, AES has been king for quite a few years. After winning the first open, international crypto algorithm competition, the algorithm Rijndael became the Advanced Encryption Standard (AES). NIST organized the competition and released AES in 2001 as FIPS-1967 (PDF). Since then AES has been implemented in anything from key fobs to high speed bulk encryption and Gbps network systems. AES is used in open source security solutions as well as Common Criteria certified equipment used by governments around the world. AES is quite probably the block cipher that has been analyzed and attacked more than any other cipher.
Looking at AES from a hardware perspective, one of the really cool features of AES is its flexibility. AES can be implemented in a very compact version that operates on the 16 byte state one byte at a time. Going in the opposite direction, AES can be implemented with full parallel, single cycle rounds and pipelines for each of the 10 to 14 rounds (depending on the key size – 128, 192 or 256 bits). Depending on the block cipher mode, the decipher functionality can easily be removed, the key expansion can be changed into an on-the-fly key expansion etc. I’ve implemented all these variants some time or another with my open AES core.
Internet of Things (IoT) and Lightweight Algorithms
Even if you often hear that the S in IoT stands for security, the reality is that IoT really triggered a new wave of research into security for low cost, connected systems. This includes algorithm research and development. The established algorithms, and especially AES was seen as being too complex, too expensive, too heavyweight.
Considering the flexibility and especially the ability of AES to be implemented in such a compact way, I’ve never really understood the need for new block ciphers. Today you can for instance buy ARM Cortex-M0+ based Microcontrollers with an integrated AES core for less than one Euro. A good example is the Silicon Labs EFM32ZG108F8-QFN24 that sells for under 0,8 Euros in single unit volume.
Figure 1: The Silicon Labs EFM32. A good example of a very low cost MCU with integrated AES engine.
Since AES is so well trusted, a new cipher must provide a lot of benefits to be a viable block cipher alternative to AES. I’m not saying that IoT security requirements can easily be met with current primitives. On the contrary. I often find that for things like key agreement/key exchange, hashing and message authentication the more established algorithms used in desktop, server, cloud solutions rarely fits the cost and resource constraints. But the IoT crypto research focus to a large extent instead seems to be on lightweight block ciphers. The CryptoLUX Lightweight Cryptography Lounge lists a number of algorithms in different categories:
- 36 block ciphers
- 8 hash functions
- 8 stream ciphers
- 12 Authenticated Encyption with Assiated Data (AEAD) ciphers
These pages were last updated in 2017 and the research has of course continued since then. But I would be surprised if the ratio between the categories have changed dramatically since then.
Lightweight Block Ciphers
So then, what is a lightweight algorithm? There seems to be a several factors being used to measure lightness:
- Size of an implementation. Code size in terms for SW. FPGA resources or cells/gates.
- Resources required – typically amount memory or registers to store data and intermediate state.
- Complexity of the implementation – What types of operations are needed and size of operands. For example multiplication, and bit rotations.
- Energy required to perform a security operation.
For block ciphers, lightness in general seems to be when a cipher requires less of something compared to AES.
The venerable block cipher XTEA could be considered a lightweight cipher. It has a compact implementation and a small state. But the number of iterations is quite high. RC4 (apart from being a stream cipher, and utterly broken) is an example of a very compact and simple cipher in terms of code size and operand sizes and operands required. But the state is 256 bytes.
Looking at several of the lightweight block ciphers I observe some common themes:
- Small block sizes. Typically 64 bits, but block size of 32 or 48 bits is sometimes also supported. This may reduce the message expansion for small messages due to less padding. But on the other hand the amount of data that can be protected under a given key will be much less compared to AES.
- Short keys. 64, 96 up to 128 bits. But rarely support for 256 bit keys. This reduce the strength of the security provided and reduce the lifetime for the key being used.
- Simple round functions, but with many rounds. 16, 32 and sometimes more than 64 rounds. Even if rounds can be coalesced in a hardware implementation, the latency for performing a complete block operation can be quite high. In combination with a small block size the overall throughput can be rather low.
Limitations like these may make a lightweight cipher less useful. If the IoT system must add support for, and even frequently need to perform key exchange operations with its host, the gain in resource requirements may be lost.
The Block Cipher PRINCE
One lightweight block cipher that I see have some really interesting properties, and at least partially less issues is [PRINCE] (pdf). PRINCE (Wikipedia), presented in 2012, is a low-latency, lightweight block cipher. The key size is 128 bits and has a block size of 64 bits. The number of iterations are 11-13 (depending on how you want to count the middle, partial rounds). This compares pretty good to AES that has 10 rounds when using a 128 bit key. The difference is that the rounds in PRINCE are much less complex. In hardware it is easy to coalesce the rounds to reduce latency and achieve better performance compared to AES.
Figure 2: The structure of the PRINCE block cipher with forward, middle and reverse rounds. (The figure is from the PRINCE paper.)
The cipher is based on a single key Even-Mansour-like scheme construction] called an FX-construction. This provides whitening directly in the block cipher itself, something AES lacks. The derivation of subkeys for the FX-construction is simple. And the round key schedule in PRINCE is also very simple, much less complex than the round key schedule in AES. Another interesting feature of PRINCE is the α-reflection property. The cipher is designed in such a way that the difference between encryption and decryption is the order of the subkeys and an XOR operation with the constant α:
Figure 3: Equality between Decryption and Encryption by order of subkeys and α. (The figure is from the PRINCE paper.)
As the text above points out this means that the same code or hardware implementing the cipher can be used for both encryption and decryption. For AES the decryption functionality can possibly be removed if the block cipher modes used only require support for encryption. The commonly used mode CTR only needs encryption in the underlying block cipher. But the α-reflection property in PRINCE makes the cipher less complex for all modes.
In terms of security there are a number of papers published, and there has even been a number of challenges. So far the cipher seems to hold up against attacks.
One interesting real-world usage of PRINCE is for on-device memory protection in the LPC55S6x Cortex-M33 based microcontrollers from NXP.
I have implemented PRINCE as an open, BSD-licensed core. My implementation has a total of three pipeline stages. Compared to an AES-implementation with one cycle/round, the PRINCE core provides 1.5x throughput. On a Xilinx Artix-7 FPGA device the PRINCE core requires 1587 LUTs and 646 registers. My compact but slow (40+ cycles) AES implementation] requires 2298 slices (which for Artix-7 should be 9192 LUTs) and 2989 registers. The difference in resources required is quite dramatic. Bear in mind that the resources also includes the implementation of a simple API, and the larger block size of AES requires more registers and some more logic for the register access multiplexers and API decode logic. For the just the core functionality the difference would be even larger.
PRINCE has a block size of only 64 bits, which limits the amount of data that can be protected under a given key. It also makes PRINCE less useful as a primitive for implementing a block cipher based Message Authentication Code (MAC) algorithm. But for encryption, compared to AES, PRINCE has some really interesting properties, such as the low latency, simple implementation, the α-reflection that allows reuse of the encipher implementation for decryption, really low resource usage and the whitening provided by the FX-construction. PRINCE is a really interesting lightweight block cipher with properties that translates to reduced resource usage and cost.
Written by Joachim Strömbergson. Embedded and Hardware security specialist at Assured AB
(c) 2020 Assured AB