Thursday, September 28, 2017

CHES 2017 Taipei, Taiwan

CHES 2017 was held September 25th - 28th in Taipei, Taiwan. This being my first trip to CHES, I was glad to see a mix of academics and people in industry who had ties with cryptographic hardware on embedded systems.

Although I have a limited frame of reference, I feel the standard of the conference was quite high - the presenters all knew what they were talking about in great detail, and were able to accurately describe the contribution they had made to their respective fields.

My favourite talks were in the 'Side-Channel Analysis' and the 'Emerging Attacks' sessions, as the talks in these two sessions in particular were engaging and close to the work I have been doing during my PhD.

However, my obligatory post-conference blog post will be on 'Sliding right into disaster: Left-to-right sliding windows leak', a joint work by Daniel J. Bernstein, Joachim Breitner, Daniel Genkin, Leon Groot Bruinderink, Nadia Heninger, Tanja Lange, Christine van Vredendaal, and Yuval Yarom (I wasn't aware so many people could work on one paper at the same time!).

The contribution of the paper was showing that although the Right-to-Left sliding window didn't leak a great deal of information, the Left-to-Right sliding window leaked just enough to recover the full key (in some cases).

For a brief recap, RSA uses modular exponentiation, and in many implementations the 'sliding window' method is used for efficiency. This can be done either Left-to-Right or Right-to-Left, and although they are very similar, they have very slight differences: the Right-to-Left method tends to be easier to program, uses the same number of multiplications as Left-to-Right, but requires a bit more storage. Both are used in practice: the paper shows that the Libgcrypt crypto library uses the Left-to-Right method (and hence they provide an attack against this implementation).

One way to think about it is that if you want to compute x^25, you would convert the exponent 25 into binary, manipulate this bitstring in some way (depending on whether you are going Left-to-Right or Right-to-Left, and also on the size of your window w), and then parse the bitstring: for every non-zero bit, perform a multiply; for every zero bit, perform a square (or something to that effect).

In this manipulated bitstring in the Right-to-Left method, due to the way the bitstring is created, we are guaranteed to have w - 1 zero bits after a non-zero bit. From a leakage point of view, this doesn't provide much information.

However, in the Left-to-Right method, two non-zero bits can be as close as adjacent. This allows us to infer certain details about the bitstring by applying certain rules to what we know (the leakage) and, in some cases, to work out the value of the key.
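As a rough illustration (not code from the paper or from Libgcrypt), the two recodings can be sketched in Python. The function names are made up for this post, and the exponentiation loop is simplified: it squares once per digit and multiplies on every non-zero digit, which is the pattern a side channel would observe.

```python
def recode_right_to_left(n: int, w: int) -> list:
    """Sliding-window digits of n (most significant first), built from the low end."""
    digits = []  # collected least significant first
    while n > 0:
        if n & 1:
            digits.append(n & ((1 << w) - 1))  # odd window value
            digits.extend([0] * (w - 1))       # forces w - 1 zeros next to it
            n >>= w
        else:
            digits.append(0)
            n >>= 1
    while digits and digits[-1] == 0:          # drop padding above the top digit
        digits.pop()
    return digits[::-1]


def recode_left_to_right(n: int, w: int) -> list:
    """Sliding-window digits of n (most significant first), built from the high end."""
    bits = bin(n)[2:]
    digits, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            digits.append(0)
            i += 1
        else:
            j = min(i + w, len(bits))
            while bits[j - 1] == "0":          # a window must end in a 1 bit
                j -= 1
            digits.extend([0] * (j - i - 1))
            digits.append(int(bits[i:j], 2))   # digit sits at the window's last bit
            i = j
    return digits


def sm_pattern(digits) -> str:
    """Square/multiply trace: 'S' for every digit, plus 'M' if it is non-zero."""
    return "".join("SM" if d else "S" for d in digits)


def exp_with_digits(x: int, digits, m: int) -> int:
    """x^n mod m from the digits; pow(x, d, m) stands in for a precomputed table."""
    acc = 1
    for d in digits:
        acc = acc * acc % m
        if d:
            acc = acc * pow(x, d, m) % m
    return acc


# 11 = 0b1011 with w = 3:
#   Left-to-Right -> [0, 0, 5, 1]  (two adjacent non-zero digits)
#   Right-to-Left -> [1, 0, 0, 3]  (non-zero digits kept w - 1 zeros apart)
```

Both recodings give the same result for the exponentiation, but only the Left-to-Right trace can contain two multiplications separated by a single squaring.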

If we are able to recover >50% of the key bits this way, we can implement an efficient Heninger-Shacham attack to recover the remaining bits.

The paper was presented by Leon Groot Bruinderink, and he explained it in such a way that I found it easy to understand how the attack works, and how one would protect against this kind of attack (not using Left-to-Right would be a start). The authors also contacted Libgcrypt with details of the attack, and it has been fixed in version 1.7.8.

Aside from the papers, CHES has been pretty amazing: the venue was a 5 star hotel in the centre of Taipei, the food was absolutely incredible (even the banquet, which introduced me to the wonders of sea cucumber), and the excursion to the Taipei Palace Museum was exceptionally educational (which as we all know is the best kind of fun).

I would definitely recommend CHES to anyone interested in the more practical side of cryptography, although if it ever takes place in Taiwan again, I strongly suggest you Youtube how to use chopsticks. Unfortunately I never learnt, and after a humiliating trip to the ShiLin Night Market, am now featured on several locals' phones in a video named 'The Tourist who couldn't eat Beef Noodle Soup'.

Tuesday, September 5, 2017

Crypto 2017 - How Microsoft Wants to Fix the Internet (Spoiler: Without Blockchains)

In the second invited talk at Crypto, Cédric Fournet from Microsoft Research presented the recent efforts of Project Everest (Everest VERified End-to-end Secure Transport), which seems to be an attempt to fix implementing TLS once and for all. Appropriately for such a gigantic task, more than a dozen researchers on three continents (and the UK) work on making it verifiable and efficient at the same time.

As with every self-respecting talk in the area, it started with the disaster porn that is the history of TLS (our lives depend on it, but it's complicated, there have been 20 years of failures, insert your favourite attack here). However, the Crypto audience hardly needs preaching to (just a reminder that the work isn't done with a proof sketch; performance and error handling also matter), so the talk swiftly moved on to the proposed solutions.

The story starts in 2013 with miTLS, which was the first verified standard-compliant implementation. However, it still involved hand-written proofs and was more of an experimental platform. Enter Everest: They want to tick all the boxes by providing verified drop-in replacements for the HTTPS ecosystem. It covers the whole range from security definitions to code with good performance and side-channel protection.

As an example, Cédric presented Poly1305, a MAC that uses arithmetic modulo $2^{130}-5$ and forms part of the upcoming TLS 1.3 specification. Unsurprisingly, bugs have already been found in OpenSSL's implementation. Project Everest have implemented Poly1305 with ~3,000 lines of code in Low*, a subset of F* (a functional language) that allows both C-style programming (with pointer arithmetic) and verification. Compiling this code with KreMLin (another output of Project Everest) results in machine code that is as fast as hand-written C implementations. The same holds for ChaCha20 and Curve25519.
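To give a feel for the arithmetic involved, here is a minimal Python sketch of Poly1305 following the public description in RFC 8439. It is not the Low* code from Project Everest. It is neither verified nor constant-time (Python's big integers hide exactly the limb-level carry handling where implementation bugs tend to live), and the function name is chosen here purely for illustration.

```python
P = (1 << 130) - 5  # the prime 2^130 - 5

# Clamping mask applied to r, as specified in RFC 8439
R_CLAMP = 0x0FFFFFFC0FFFFFFC0FFFFFFC0FFFFFFF


def poly1305_mac(key: bytes, msg: bytes) -> bytes:
    """One-time MAC: evaluate a polynomial in r modulo P, then add s modulo 2^128."""
    if len(key) != 32:
        raise ValueError("Poly1305 needs a 32-byte one-time key")
    r = int.from_bytes(key[:16], "little") & R_CLAMP
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        # Each block of up to 16 bytes gets an extra 0x01 byte appended on top
        block = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + block) * r % P
    return ((acc + s) % (1 << 128)).to_bytes(16, "little")
```

A fast C or assembly implementation has to split these 130-bit values into machine-word limbs and propagate carries by hand, which is the part that verification is meant to get right.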

However, hand-written assembly is still faster. The project aims to catch up on this with Vale, which was published at Usenix this year. Vale promises extensible, automated assembly verification and side-channel protection.

So what is the way forward? TLS 1.3 is on the horizon, bringing various improvements at the cost of a considerable re-design. This requires new implementations, and the project hopes to be first to market with an efficient and verifiable one.

For the rest of the talk, Cédric gave more details on how F* naturally lends itself to security games, and how they proved concrete security bounds for the TLS 1.2 and 1.3 record ciphersuites.

All in all, I think it was a great example of an invited talk at Crypto, putting cryptography in a bigger context and bringing work that isn't necessarily on a cryptographer's radar to our attention.

Tuesday, August 22, 2017

Crypto 2017 - LPN Decoded

Crypto 2017 kicked off this morning in Santa Barbara. After a quick eclipse-watching break, the lattice track ended with the presentation of LPN Decoded by Andre Esser, Robert Kübler, and Alexander May.

Learning Parity with Noise (LPN) is used as the underlying hardness problem in many cryptographic protocols. It is equivalent to decoding random linear codes, and thus offers strong security guarantees. The authors of this paper propose a memory-efficient algorithm for the LPN problem. They also propose the first quantum algorithm for LPN.

Let's first recall the definition of the LPN problem. Let $s \in \mathbb{F}_2^k$ be a secret, and suppose we are given samples $(\mathbf{a}_i, b_i)$, where $b_i = \langle \mathbf{a}_i,s \rangle + e_i$ for some $e_i \in \{0,1\}$ with $\Pr(e_i=1)= \tau < 1/2$. The task is to recover $s$ from the samples $(\mathbf{a}_i,b_i)$. We see that the two LPN parameters are $k$ and $\tau$. Notice that this can be seen as a special case of the Learning with Errors (LWE) problem; in fact, LWE originated as an extension of LPN.

If $\tau$ is zero, we can draw $k$ independent samples and solve for $s$ by Gaussian elimination. This can also be extended to an algorithm for $\tau < 1/2$, by computing a guess $s'$ and testing whether $s'=s$. This works well for small $\tau$, for example $\tau = 1/\sqrt{k}$, used in some public key encryption schemes. Call this approach Gauss.
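The guess-and-test idea can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the authors' implementation: vectors in $\mathbb{F}_2^k$ are packed into Python integers, the helper names are invented for this post, and a guess is accepted when its disagreement rate on fresh samples is closer to $\tau$ than to $1/2$.

```python
import random


def parity(x: int) -> int:
    """Parity of the bits of x; parity(a & s) is the inner product <a, s> over GF(2)."""
    return bin(x).count("1") & 1


def lpn_sample(s: int, k: int, tau: float, rng: random.Random):
    """One LPN sample (a, b) with b = <a, s> + e and Pr[e = 1] = tau."""
    a = rng.getrandbits(k)
    e = 1 if rng.random() < tau else 0
    return a, parity(a & s) ^ e


def solve_gf2(samples, k: int):
    """Gauss-Jordan elimination on exactly k samples; returns s, or None if singular."""
    rows = list(samples)
    for col in range(k):
        pivot = next((i for i in range(col, k) if (rows[i][0] >> col) & 1), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pa, pb = rows[col]
        for i in range(k):
            if i != col and (rows[i][0] >> col) & 1:
                rows[i] = (rows[i][0] ^ pa, rows[i][1] ^ pb)
    # Row `col` now reads "bit col of s equals b"
    return sum(rows[col][1] << col for col in range(k))


def gauss_lpn(oracle, k: int, tau: float, n_test: int = 200, max_tries: int = 10_000):
    """Repeat: draw k samples, solve as if error-free, test the guess on fresh samples."""
    threshold = n_test * (tau + 0.5) / 2  # midway between rate tau and rate 1/2
    for _ in range(max_tries):
        guess = solve_gf2([oracle() for _ in range(k)], k)
        if guess is None:
            continue
        fresh = [oracle() for _ in range(n_test)]
        errors = sum(parity(a & guess) != b for a, b in fresh)
        if errors < threshold:
            return guess
    return None
```

Every iteration throws its samples away, which is why this approach needs so many oracle calls: a batch of $k$ samples is error-free only with probability $(1-\tau)^k$.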

For larger constant $\tau$, the best current algorithm is BKW. However, although BKW has the best running time, it cannot be implemented for even medium size LPN parameters because of its memory consumption. Further, BKW has a bad running time dependency on $\tau$. Both algorithms also require many LPN oracle calls. 

The authors take these two as a starting point. They describe a Pooled Gauss algorithm, which only requires a polynomial number of samples. From those, they look for error-free samples, similar to Gauss. The resulting algorithm has the same space and running time complexity, but requires significantly fewer oracle calls. It has the additional advantage of giving rise to a quantum version, where Grover search can be applied to save a square root factor in the running time.

They then describe a second hybrid algorithm, where a dimension reduction step is added. Thus Well-Pooled Gauss proceeds in two steps. First, reduce the dimension $k$ to $k'$ (for example, using methods such as BKW). Then, decode the smaller instance via Gaussian elimination. The latter step is improved upon by using the MMT algorithm.

For full results, see the original paper. Their conclusion is that Decoding remains the best strategy for small $\tau$. It also has quantum optimisations and is memory-efficient. The Hybrid approach has in fact no advantage over this for small values of $\tau$. For larger values however, they manage to solve for the first time an LPN instance of what they call medium parameters - $k = 243$, $\tau = 1/8$ - in 15 days.

Tuesday, May 2, 2017

Eurocrypt 2017 - Parallel Implementations of Masking Schemes and the Bounded Moment Leakage Model

Side-channel analysis made its way into Eurocrypt this year thanks to two talks, the first of which was given by François-Xavier Standaert on a new model in which to prove the security of implementations. When talking about provable security in the context of side-channel analysis, there is one prominent model that comes to mind: the d-probing model, where the adversary is allowed to probe d internal variables (somewhat related to d wires inside an actual implementation) and learn them. Another very famous model, introduced ten years later, is the noisy leakage model, in which the adversary is allowed probes on all intermediate variables (or wires) but the learnt values are affected by errors due to noise. To complete the picture, it was proved that security in the probing model implies security in the noisy leakage one.

The work of Barthe, Dupressoir, Faust, Grégoire, Standaert and Strub is motivated precisely by the analysis of these two models in relation to how they deal with parallel implementations of cryptosystems. On one hand, the probing model admits very simple and elegant descriptions and proof techniques, but it is inherently oriented towards serial implementations; on the other hand, the noisy leakage model naturally includes parallel implementations in its scope but, admitting the existence of noise in leakage functions, it lacks simplicity. The latter is particularly important when circuits are analysed with automated tools and formal methods, because these can rarely deal with errors.

The contribution of the paper can then be summarised as the definition of a new model that tries to combine the advantages of both previous models: the Bounded Moment leakage Model (BMM). The authors show how it relates to the probing model and give constructions that are secure in their model. In particular, they prove that BMM is strictly weaker than the probing model, in that security in the latter implies security in the former, and they give a counterexample showing that the converse does not hold. The informal definition of the model given during the talk is the following:
An implementation is secure at order o in the BMM if all mixed statistical moments of order up to o of its leakage vectors are independent of any sensitive variable manipulated.
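A toy calculation (not taken from the paper) shows what the definition asks for. Assume a single secret bit is split into two Boolean shares that are manipulated in the same cycle, and assume the leakage for that cycle is simply the sum of the two share values. Both the leakage function and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction


def parallel_leakage(secret_bit: int) -> list:
    """Leakage for each equally likely mask r, with shares (secret ^ r, r) leaking as a sum."""
    return [(secret_bit ^ r) + r for r in (0, 1)]


def raw_moment(values, order: int) -> Fraction:
    """Exact raw moment E[L^order] over equally likely leakage values."""
    return Fraction(sum(v ** order for v in values), len(values))


# First moment: 1 for both secret values, so the mean leakage reveals nothing.
# Second moment: 2 when the secret is 0, but 1 when the secret is 1.
```

In this toy setting the masking is secure at order 1 but not at order 2: the mean of the leakage is independent of the secret, while the second moment is not.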

A parallel multiplication algorithm and a parallel refreshing algorithm are given as practical examples in which the reduction between the models holds. The reduction is stated as follows:
A parallel implementation is secure at order o in the BMM if its serialisation is secure at order o in the probing model.
The falsity of the converse is shown in a slightly different setting, namely the one of continuous leakage: the adversary does not just learn values carried by some wires by probing them, but such an operation can be repeated as many times as desired and the probes can be moved adaptively. Clearly this is a much stronger adversary in that accumulation of knowledge over multiple probing sessions is possible, which is used as a counterexample to show that security in the continuous BMM does not imply security in the continuous probing model. The refreshing scheme mentioned above can easily be broken in the latter after a number of iterations linear in the number of shares, but not in the former as adapting the position of the probes does not help: an adversary in the BMM can already ask for leakage on a bounded function of all the shares.

Both slides and paper are already available.

Eurocrypt 2017: On dual lattice attacks against small-secret LWE and parameter choices in HElib and SEAL

This morning, Martin gave a great talk on lattice attacks and parameter choices for Learning With Errors (LWE) with small and sparse secrets. The work presents new attacks on LWE instances, yielding revised security estimates. This leads to a revised exponent of the dual lattice attack by a factor of 2L/(2L+1), for log q = Θ(L*log n). The paper exploits the fact that most lattice-based FHE schemes use short and sparse secrets. We will write q to denote the LWE modulus throughout.

Let's first have a look at the set-up. Remember LWE consists of distinguishing between pairs (A, As+e) and (A,b). In the first instance, A is selected uniformly at random and e is selected from a special (usually Gaussian) distribution. In the second one, both A and b are uniformly random. Selecting s, as this work shows, is perhaps trickier than previously thought. Theory says that, in order to preserve security, selecting a short and sparse secret s means the dimension must be increased to n*log_2(q). Practice says just ignore that and pick a small secret anyway. More formally, HElib typically picks a secret s such that exactly h=64 entries are in {-1,1} and all the rest are 0. SEAL picks uniformly random secrets in {-1,0,1}.

We also recall that the dual lattice attack consists of finding a short vector w such that wA = 0 (mod q), then checking if
<w, As+e> = <wA, s> + <w,e> = <w,e>
is short. If we are in the presence of an LWE sample, e is short, so the inner product is short. Short*short = short, as any good cryptographer can tell you.
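The identity behind the distinguisher, <w, As+e> = <wA, s> + <w,e> (mod q), can be checked numerically on a toy instance. In the sketch below the short vector w is planted by construction (the last row of A is chosen so that wA = 0 mod q) instead of being found by lattice reduction as in the real attack. The parameters and helper names are arbitrary choices for illustration.

```python
import random


def centered(x: int, q: int) -> int:
    """Representative of x mod q in the range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x


def planted_dual_instance(m: int, n: int, q: int, rng: random.Random):
    """Random m x n matrix A over Z_q together with a short w satisfying wA = 0 (mod q)."""
    w = [rng.choice((-1, 0, 1)) for _ in range(m - 1)] + [1]
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m - 1)]
    # Last row cancels the contribution of all the other rows
    A.append([-sum(w[i] * A[i][j] for i in range(m - 1)) % q for j in range(n)])
    return A, w


def lwe_rhs(A, s, e, q: int):
    """The vector As + e (mod q)."""
    return [(sum(a * x for a, x in zip(row, s)) + err) % q for row, err in zip(A, e)]


def dual_score(w, c, q: int) -> int:
    """Centered value of <w, c> mod q; small for LWE samples, spread out for uniform c."""
    return centered(sum(wi * ci for wi, ci in zip(w, c)), q)
```

For an LWE right-hand side the score equals <w,e> exactly (as long as it stays below q/2 in absolute value), whereas for a uniformly random c it is spread over the whole range.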

The improvements presented in this paper rely on three main observations. Firstly, a revised dual lattice attack is presented. This step is done by adapting BKW-style algorithms in order to increase efficiency and can be done in general, i.e. it does not depend on either the shortness or the sparseness of the secret. It is achieved by applying BKZ to the target basis, then re-randomising the result and applying BKZ again, with a different block size.

The second optimisation exploits the fact that we have small secrets. We observe that we can relax the condition on w somewhat. Indeed, if s is short, then finding w such that v = wA (mod q) is short instead of 0 is good enough, since <w, As+e> = <v,s> + <w,e>. Therefore, we look for vectors (w,v) in the lattice

L = {(y,x): yA = x (mod q)}.

Now in small secret LWE instances, ||s||<||e|| and so we may allow ||v||>||w|| such that
||<v,s>|| ≈ ||<w,e>||.

Finally, the sparsity of the small secret is exploited. This essentially relies on the following observation: when s is very sparse, most of the columns of A become irrelevant, so we can just ignore them.

The final algorithm SILKE is the combination of the three above steps. The steps are the following.
  • Perform BKZ twice with different block sizes to produce many short vectors
  • Scale the normal form of the dual lattice
  • If sparse, ignore the presumed zero columns, correct for mistakes by checking shifted distribution

As usual, Martin wins Best Slides Award for including kittens.

Wednesday, April 12, 2017

Is Your Banking App Secure?

Last week I was in Malta for Financial Cryptography and Data Security 2017 to present my recent work on securing the PKCS#11 cryptographic API.

One talk that stood out for me was by researchers from the University of Birmingham, who looked for vulnerabilities in the mobile apps provided by major UK banks.

Sadly, they found major weaknesses in apps from 5 of the 15 banks they investigated.

Several apps use certificate pinning, where the app hard-codes a certificate from a trusted CA and only accepts public keys that are signed by the pinned certificate.
This is good practice, as an attacker can add their own certificate to the phone's trust store, but it won't be accepted by the app.
However, two Android apps (for Natwest and Co-op) accepted any public key signed by the pinned certificate, without checking the domain name!
So the attack works as follows:

  1. Purchase a certificate for a domain you own from the trusted CA
  2. The app will accept your public key with this certificate
  3. Man-in-the-middle all the encrypted traffic between the user and their bank.
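The missing step is hostname verification. The affected apps were Android apps, so the following Python sketch is only an analogy using the standard `ssl` module, and the function name is made up for illustration: it pins a CA and still insists that the server certificate matches the domain being contacted.

```python
import ssl
from typing import Optional


def make_pinned_context(pinned_ca_pem: Optional[str] = None) -> ssl.SSLContext:
    """Client-side TLS context that trusts only the pinned CA and checks the hostname."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Both are already the defaults for PROTOCOL_TLS_CLIENT; set explicitly because
    # switching off the first one reproduces the bug described above.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    if pinned_ca_pem is not None:
        # Trust only the pinned CA certificate instead of the system trust store
        ctx.load_verify_locations(cadata=pinned_ca_pem)
    return ctx
```

With `check_hostname` disabled, any certificate issued by the pinned CA would be accepted, including one for a domain owned by the attacker.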

Curiously, the authors note: "Co-op [...] hired two penetration testing companies to test their apps, both of which had missed this vulnerability". It seems odd that such an obvious mistake wasn't picked up in testing.

The group also found that several banks - Santander, First Trust and Allied Irish - served adverts to their app users over unencrypted HTTP, meaning an attacker could spoof these ads and mount a phishing scam, perhaps by displaying a fake 'security warning' and directing users to re-enter their account details on a malicious page. It was pointed out in the talk that we're much more likely to 'feel safe' within an app (and hence trust all the content we see) than, say, visiting a webpage using a laptop, so this kind of in-app phishing scam could be very effective.

There are even more exploits described in the paper.

It was refreshing to hear that the vulnerable banks responded well to the disclosures made by the Birmingham group and patched their apps as a result. But I'm a little baffled that these basic errors were ever made in such security critical applications.

Wednesday, March 29, 2017

PKC 2017: Kenny Paterson accepting bets on breaking TLS 1.3

A member of the TLS 1.3 working group, Kenny Paterson is willing to bet a beer that the 0-RTT handshake of TLS 1.3 will get broken in the first two years.

In his invited talk, Kenny managed to fill a whole hour on the history of SSL/TLS without even mentioning symmetric cryptography beyond keywords, thus staying within the topic of the conference. Despite all versions of SSL being broken to at least some degree, the standardised TLS became the most important security protocol on the Internet.

The core part of TLS is the handshake protocol, which establishes the choice of ciphers and the session key. Kenny highlighted the high complexity stemming from the many choices (e.g., using a dedicated key exchange protocol or not) and the possible interaction with other protocols in TLS. Together with further weaknesses of the specification, this created the space for the many attacks we have seen. On the upside, these attacks reflect increased attention from academics, which comes together with increased attention from society as a whole. Both have laid the ground for improvements in both the deployment and future versions of TLS. For example, the support of forward secrecy has increased from 12 percent to 86 percent according to SSL Pulse.

Turning to concrete attacks, most important in the area of PKC is the Bleichenbacher attack, published already at Crypto 1998 (a human born then would be considered a full adult at the conference venue now). Essentially, it exploits that RSA with the padding used in TLS is not CCA-secure, and it recovers the session key after roughly $2^{20}$ interactions with a server. Nevertheless, the TLS 1.0 specification published shortly after Bleichenbacher's publication incorporates the problematic padding (recommending mitigation measures), and later versions retain it for compatibility. The DROWN attack shows the danger of this by exploiting the fact that many servers still offer SSLv2 (about 8% of Alexa top 200k) and that it is common to use the same key for several protocol versions. An attacker can recover the session key of a TLS session by replaying a part of it in an SSLv2 session that uses the same key.

On a more positive note, Kenny presented the upcoming TLS 1.3, which has been under development since 2014. It addresses a lot of the weaknesses of previous versions by involving academics from an early stage and doing away with a lot of the complexity (including reducing options and removing ciphers). It furthermore aims to decrease the latency of the handshake by allowing the parties to send encrypted data as early as possible, reducing the number of round trips to one in many cases. The goal of low latency has also led to the inclusion of a QUIC-inspired 0-RTT mode, which provides zero round trip time, that is, the client can send data already in the first message when resuming a session. However, 0-RTT data is not fully forward-secure and is therefore confined to a separate API. Nevertheless, Kenny predicts that its mere availability will be too tempting for developers, hence the bet offered.

Concluding, he sees three major shifts in TLS thus far: from RSA to elliptic-curve Diffie-Hellman, to Curve25519, and away from SHA-1 in certificates. A fourth shift might happen with the introduction of post-quantum algorithms such as Google's deployment of New Hope. Less optimistically, he expects that implementation vulnerabilities will continue to come up.

Update: An earlier version of this post mentioned the non-existent Curve255199 instead of Curve25519, and it attributed New Hope to Google.