Tuesday, February 5, 2013

Time is money (in CBC ciphersuites)

While protocols are not always nicely written, deviating from them has a big disadvantage. You cannot blame someone else if there is a problem. It has a small advantage though, you avoid monoculture and an attack that is applicable to the protocol may not applicable to you. What happened in the following story is something in between.

But let's start from the beginning. Few months ago I received a paper from Nadhem Alfardan and Kenny Paterson about a new timing attack on TLS CBC ciphersuites. That paper got on my low priority queue since I'm not that impressed by timing attacks. They are nicely done on the lab, but fall short in the field. Nevertheless at some point this attack got my attention because it claimed it could recover few bits of the plaintext when using GnuTLS over ethernet.

Let's see what is the actual scenario though. In order for the attack to work the client must operate as follows. It connects to a server, it sends some data (which will be encrypted), the attacker will intercept them, and terminate the client's connection abnormally. The client will then reconnect and resend the same data, again and again.

That is not the typical scenario of TLS, but it is not so unrealistic either (think of a script that does that). The main idea of the attack is that the described attacker can distinguish between different padding values of TLS messages using the available timing information between receipt of a message and server reply (which may just be the TCP tear down messages).

How is that done with GnuTLS? The attacker takes the input ciphertext and forms a message of 20 AES blocks. Then takes the part of the message he's interested at and copies the matching block and its predecessor as the last two encrypted blocks (which contain the MAC and padding).

Then on every different user connection it XORs the last byte of the penultimate block with a byte (say Delta to be consistent with the paper) of his choice. On every client reconnection he repeats that using a different Delta and waits for the server response (a TLS server notifies the client of decryption failure).

In the paper they plot a diagram that shows that certain values of Delta require different time to receive a reply. I re-implemented the attack for verification, and the measurements on the server can be seen in the following diagram. Note that the last plaintext byte is zero (also note that some attack results in the paper are due to a bug the authors discovered which is now fixed, the discussion here refers to the fixed version of GnuTLS).
Figure 1. Median server timings for AES-CBC-SHA1, for a varying Delta, on GnuTLS 3.1.6 or earlier (view on the server)
It is clear that different values of Delta cause different response time. What we see in these timing results is the differences due to the sizes of the data being applied to the hash algorithm. In GnuTLS the bigger the Delta, the larger the pad (remember that the last byte of plaintext was zero), and less time is spent in hashing data.

The small increase in processing time seen on every block, is the correctness check of the pad (which takes more time as the pad increases).  In that particular CPU I used, the differences per block seem to be around 200 nanoseconds, and the difference between the slower and the fastest block is around 1 microsecond.

The question is, can an attacker notice such small differences over ethernet? In the paper the authors claim yes (and present some nice figures). I repeated the test, not over the ethernet, but over unix domain sockets, i.e., only delays due to kernel processing are added. Let's see the result.

Figure 2. Median server timings for AES-CBC-SHA1, for a varying Delta, on GnuTLS 3.1.6 or earlier (attacker view)

Although it is more noisy the pattern is still visible. Could we avoid that pattern from being visible? Let's first follow the RFC padding removal precisely and check again.

Figure 1. Median server timings for AES-CBC-SHA1, for a varying Delta, on GnuTLS 3.1.6 with TLS conformant padding check applied (view on the server)

That's much better. Obvious patterns disappeared. However, there is no happy end yet. In the same paper the authors present another attack on OpenSSL which uses the above padding method. Surprisingly that attack can recover more data than before (but at a slower pace). That's the Full Plaintext recovery attack on the paper and it uses the fact that TLS 1.2 suggests to assume 0 bytes of padding. Zero bytes of padding is an invalid value (padding in TLS is from 1 to 256 bytes) and thus cannot be set by a legitimate message.

This can be exploited and then invalid padding can be distinguished from valid padding, by selecting message size in a way that if 0 is used as pad, the additional byte that will be hashed will result to a full block processed by the hash algorithm. That is we would expect that if a correct pad is guessed less processing would occur. Let's visualize it on an example.

Figure 4. Median server timings for AES-CBC-SHA1, for a varying Delta, on a TLS compliant implementation

Notice at the lonely point on the bottom left. We can see that when Delta is zero the processing is 200ms faster. For the selected message, the zero Delta resulted to a correct pad. How to fix that? That is pretty tricky.

In order to fix this pattern the code that is doing the CBC pad removal has to be aware of the internal block sizes in the hash, as well as any internal padding used by the hash. In a typical layered implementation (or at least in GnuTLS) that isn't easy. The internal hash block size wasn't available, because one shouldn't need to know that. The paper suggests a fix, that assumes a block size of 64, which is correct for all the HMAC algorithms in TLS, but that isn't future proof (e.g. SHA3 hasn't got the same block size).

So what to do there? The fix for GnuTLS is now similar to the hack described in the paper. The hash block size and knowledge of the internal padding are taken into account to achieve a pretty uniform processing time (see below). Any differences are now into the level of 10's of nanoseconds.
Figure 4. Median server timings for AES-CBC-SHA1, for a varying Delta, after work-around is applied


Now we're very close to a happy end. What is important though, is to prevent a similar attack from occurring next year. This isn't a new attack, even the TLS 1.1 protocol acknowledges that timing channel, but does nothing to solve it. TLS implementations now need to have 2-3 pages of code just to remove the CBC padding, and that makes clear that the TLS spec is broken and needs to be updated.

Together with Alfredo Pironti we plan to propose proposed a new padding mechanism which one of its targets is to eliminate that CBC padding issue (the other target is to allow arbitrary padding to any ciphersuite). The way the timing channels in CBC padding are eliminated, is by considering pad as part of the transferred data (i.e., it is properly authenticated). As such the pad check code does not provide any side-channel information because it operates only after the MAC is verified.

PS. Kudos go to Oscar Reparaz for long discussions on side channel attacks.

PS2. I should note that Nadhem and Kenny were very kind to notify me of the bugs and the attack. While this may seem like the obvious thing to do, it is not that common with other academic authors (I'm very tempted to place a link here).