The most common technique for building end-to-end secure messaging is the double-ratchet (DR) protocol. At Wickr we’ve spent quite some effort analyzing (and improving on) the DR (and its many variants). Despite this work, we have so far opted for a decidedly different approach with our secure messaging protocol. I wanted to share some of our reasoning behind this decision.
A first reason not to make the switch was the realization that many of the new security properties one could add to the standard DR are, in fact, already present in the Wickr Messaging Protocol (WMP). Having said that, at first glance the DR does seem to have two remaining advantages over the WMP. But it turns out that these too do not present a compelling reason for Wickr to adopt a DR-based solution. The first advantage concerns a security property Wickr has consciously chosen not to pursue. The second comes at what we feel is a very steep price. Not only does it create some serious engineering and usability issues, it also introduces a new type of vulnerability into the protocol, one we believe to be unacceptable in some of the scenarios in which Wickr is being deployed. Finally, we found that other features of Wickr’s wider system (i.e. beyond just its core messaging protocol) make this second advantage a less valuable property to have, even if we could avoid the new attack and the basic usability issues it seems to introduce. In the rest of this post I’ll go into more detail on each part of this decision process.
What We’ve Looked At.
The Wickr Messaging Protocol (WMP) and the Double-Ratchet (DR) share most of their core security goals. Essentially, both aim to provide end-to-end authenticity, forward secrecy (FS) and post-compromise security (PCS). FS ensures that messages received before a device compromise remain secret. Conversely, PCS ensures that even if a device’s state (including all its key material) has leaked, continued regular use of the protocol will heal the channel, i.e. return all future communication to a secure state. These shared goals have motivated us at Wickr to take a much closer look at the DR.
To clarify, we haven’t only been looking at the standard variant of the DR as used by practically all deployed DR-based messaging platforms (i.e. X3DH + Signal’s DR). Rather, we’ve also been looking at a whole host of extensions to that approach (several of which Wickr introduced here at EUROCRYPT 2019). Of course, we are only interested in truly practical variants that can be feasibly implemented and used under reasonable engineering and efficiency constraints. But that has still left us with quite a wide selection of extensions to consider. Concretely, besides the original (now widely used) version we’ve also explored practical variants including ones that do the following:
- Get faster PCS especially in one-sided conversations and when one party stays offline for long periods of time.
- Ensure authenticity for incoming messages despite device state leakage.
- Guarantee outgoing messages are secret despite device state having leaked.
- Provide FS and PCS against attackers armed with a powerful quantum computer.
Something to note here is that the WMP already has the first three of those properties. So augmenting the DR with them would only put it on par with the WMP in those respects, not improve on it.
One thing I should mention here is that the DR, especially in its original form, also targets a third basic security property, sometimes called (cryptographic) deniability. The study of deniability is a storied, subtle and complicated topic within privacy and cryptography. Grossly simplified, a deniable protocol is one which creates no (cryptographic) evidence that participants ever took part and/or performed a given action (such as sending a given message or being a member of a given group).
At Wickr we believe that deniability is, in principle, an interesting and valuable topic of study and security goal for a system to target. However, we believe that cryptographic deniability is somewhat less interesting, at least for the ways that Wickr is being deployed. Instead, we believe that a host of other, non-cryptographic, features of a communication system will almost always end up providing significantly stronger practical protection when it comes to separating the real-world identities of users from their digital actions. Things like:
- Hiding IP addresses (e.g. via onion-routing, proxies, VPNs, etc.)
- Masking digital footprints (e.g. OS version, software version, etc.)
- Using pseudonymous accounts not (necessarily) tied to external identifying information such as emails or phone numbers.
- Allowing users to have full control over every aspect of the system including the back-end servers (e.g. as opposed to using a single global public network).
- Emphasizing ephemerality at every stage in the system (such as the client software, delivery server, etc.).
Thus, Wickr has focused its efforts on these and more aspects to enable anonymous free communication for its users while (the much narrower) cryptographic deniability property for the underlying messaging protocol has not been a priority for Wickr.
Hijacking Double-Ratchet Channels.
Despite the common goals of FS and PCS (not to mention the potential for strengthening the DR), we have found that there remains a fundamental trade-off between the WMP and any DR-like approach in terms of what is (or even could be) achievable.
The trade-off stems from the fact that inherent to the DR technique is that sending (i.e. encrypting and authenticating) a message requires knowing some secret state which continues to evolve. Concretely, in Signal’s original DR this secret state primarily consists of the state of the asymmetric ratchet (and the symmetric ratchets).
On the one hand, the DR clearly benefits from this technique. In particular, it gives the session authenticity properties for incoming messages. After all, if Alice and Bob have an ongoing DR session and Alice receives a message that decrypts and authenticates properly, then she knows that whoever sent that message must have known the (presumably) secret state belonging to Bob. What’s more, because this secret state evolves, the DR can also provide PCS properties. Moreover, since that evolution is one-way (i.e. it can’t be unwound), the DR also provides FS. So far so good.
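To make the "one-way evolution" idea concrete, here is a minimal sketch of a symmetric key-derivation chain. It is a toy under stated assumptions: the function name `kdf_step`, the labels `b"chain"` and `b"message"`, and the starting secret are all made up for illustration and are not taken from Signal's or Wickr's specifications. It only illustrates the forward-secrecy half of the story; healing after a compromise additionally needs fresh secret input, which this chain does not have.

```python
import hashlib
import hmac
from typing import Tuple


def kdf_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Advance a toy symmetric ratchet by one step.

    Returns (next_chain_key, message_key). Using HMAC-SHA256 with two fixed
    labels is an illustrative choice, not a quote from any specification.
    """
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    return next_chain_key, message_key


# Hypothetical starting state shared by Alice and Bob.
state = hashlib.sha256(b"toy shared secret").digest()

message_keys = []
for _ in range(3):
    # After each step the old chain key is overwritten and gone.
    state, mk = kdf_step(state)
    message_keys.append(mk)
```

Someone who steals `state` after the loop holds only the latest chain key. Recovering the earlier message keys would mean inverting HMAC-SHA256, which is the sense in which the evolution "can't be unwound".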
However, there’s a price being paid here. Suppose Eve manages to obtain Bob’s current secret state (e.g. by getting an image of his device’s memory). Of course, Eve can now impersonate Bob to Alice. The same would hold had they been using the WMP. In fact, no protocol in the world could prevent that, because Bob himself must be able to talk to Alice based on nothing more than that state.
But what happens if Eve beats Bob to the punch and sends a message to Alice announcing the next evolution of “Bob’s” state? At this point Bob has been effectively kicked off the channel. Alice has no way of telling that the announcement comes from Eve instead of Bob, so she will accept it. But only Eve actually knows the new evolved state. So Bob has no way to prepare messages for Alice in a way that she would (recognize and) accept. In other words, Eve has transparently hijacked the channel and Bob has no recourse.
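The hijack can be simulated with a deliberately simplified, one-directional model. Everything here is an assumption made for illustration: the tiny Diffie-Hellman group (`P`, `G`), the helper names `mix`, `tag` and `send`, and the fact that Alice keeps a single fixed key pair. A real double ratchet is considerably more involved, but the core dependency is the same: whoever announces the next evolution first determines the state Alice will expect.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters: far too small and unvetted for real use.
P = 2**127 - 1
G = 3


def dh_keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)


def mix(state: bytes, dh_secret: int) -> bytes:
    """Evolve the state by mixing in a fresh DH secret."""
    return hmac.new(state, dh_secret.to_bytes(16, "big"), hashlib.sha256).digest()


def tag(state: bytes, payload: bytes) -> bytes:
    """Authenticate a payload under the current state."""
    return hmac.new(state, b"auth" + payload, hashlib.sha256).digest()


def send(state: bytes, receiver_pub: int, payload: bytes):
    """Evolve the sender state and return (new_state, wire_message)."""
    priv, pub = dh_keypair()
    new_state = mix(state, pow(receiver_pub, priv, P))
    return new_state, (pub, payload, tag(new_state, payload))


class Receiver:
    """Alice: accepts a message only if it matches the next evolved state."""

    def __init__(self, state: bytes) -> None:
        self.state = state
        self.priv, self.pub = dh_keypair()

    def receive(self, sender_pub: int, payload: bytes, mac: bytes) -> bool:
        candidate = mix(self.state, pow(sender_pub, self.priv, P))
        if not hmac.compare_digest(mac, tag(candidate, payload)):
            return False
        self.state = candidate  # Alice follows the announced evolution.
        return True


shared = hashlib.sha256(b"toy session").digest()
alice = Receiver(shared)
bob_state = shared
eve_state = bob_state  # Eve copies Bob's state from a memory image.

# Eve sends first, announcing an evolution whose secret only she knows.
eve_state, msg = send(eve_state, alice.pub, b"hi, it's Bob")
eve_accepted = alice.receive(*msg)

# Bob, unaware, sends from his now-stale state and is rejected.
bob_state, msg = send(bob_state, alice.pub, b"really Bob")
bob_accepted = alice.receive(*msg)

# Eve can keep the conversation going.
eve_state, msg = send(eve_state, alice.pub, b"still 'Bob'")
eve_still_accepted = alice.receive(*msg)
```

In this sketch Bob cannot catch up even if he observes Eve's announcement on the wire, because computing the new state requires either Eve's or Alice's private key.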
Channel Recovery With Wickr.
So how are things different for the WMP? Well, in contrast to the DR, the WMP separates the secret state it uses to provide privacy from the state it uses to provide authenticity. (The former is based on Diffie-Hellman keys while the latter is based on signature keys.) This gives the WMP the flexibility to continuously evolve only the privacy-relevant part of Bob’s secret state. The net effect is both a pro and a con with respect to the DR’s situation.
On the plus side, Eve no longer has the ability to prevent Bob from authenticating messages to Alice. Specifically, regardless of how many evolutions (or other messages) Eve has sent on behalf of Bob, he will always be able to re-establish a compatible state with Alice so that he can resume private communication with her. So, while Eve can, under certain circumstances, impersonate Bob, she can no longer permanently kick him off the channel.
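The effect of separating the two kinds of state can be sketched as follows. This is not the WMP itself: the toy group, the helper names, and above all the use of HMAC as a stand-in for a real signature scheme are assumptions made so the example runs with the standard library alone. The payload is also left unencrypted; only the key agreement and authentication are modelled.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters: not secure, for illustration only.
P = 2**127 - 1
G = 3


def sign(auth_key: bytes, data: bytes) -> bytes:
    # Stand-in for a digital signature. A real design would use a signature
    # scheme, so that the verifier holds only a public key.
    return hmac.new(auth_key, data, hashlib.sha256).digest()


def derive_key(dh_secret: int) -> bytes:
    return hashlib.sha256(dh_secret.to_bytes(16, "big")).digest()


def send(auth_key: bytes, receiver_pub: int, payload: bytes):
    """Build a message with a fresh ephemeral key, authenticated by auth_key."""
    eph_priv = secrets.randbelow(P - 2) + 1
    eph_pub = pow(G, eph_priv, P)
    msg_key = derive_key(pow(receiver_pub, eph_priv, P))
    sig = sign(auth_key, eph_pub.to_bytes(16, "big") + payload)
    return eph_pub, payload, sig, msg_key


class Receiver:
    """Alice: authenticates against a long-term key that never evolves."""

    def __init__(self, sender_auth_key: bytes) -> None:
        self.sender_auth_key = sender_auth_key
        self.priv = secrets.randbelow(P - 2) + 1
        self.pub = pow(G, self.priv, P)

    def receive(self, eph_pub: int, payload: bytes, sig: bytes):
        expected = sign(self.sender_auth_key, eph_pub.to_bytes(16, "big") + payload)
        if not hmac.compare_digest(sig, expected):
            return None
        return derive_key(pow(eph_pub, self.priv, P))


bob_auth_key = secrets.token_bytes(32)
alice = Receiver(bob_auth_key)
stolen_auth_key = bob_auth_key  # Eve has copied Bob's device state.

# Eve can impersonate Bob as often as she likes...
eve_accepted = []
for i in range(5):
    eph_pub, payload, sig, _ = send(stolen_auth_key, alice.pub, b"forged %d" % i)
    eve_accepted.append(alice.receive(eph_pub, payload, sig) is not None)

# ...but Bob is never locked out: his next message still verifies, and it
# uses fresh ephemeral key material that Eve never saw.
eph_pub, payload, sig, bob_msg_key = send(bob_auth_key, alice.pub, b"really Bob")
alice_msg_key = alice.receive(eph_pub, payload, sig)
```

Because acceptance depends only on the long-term authentication key, nothing Eve sends changes what Alice expects from Bob's next message.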
On the negative side, since the authentication-relevant part of Bob’s secret state is not evolved, the PCS guarantees are weakened. In particular, Eve retains the ability to authenticate messages as if they came from Bob’s compromised device. Crucially though, after a short window (approx. 1 hr) she is no longer able to read any messages sent to Bob. (And she loses the ability to read messages sent by Bob the instant she loses access to his device. In contrast, in the usual DR variant she can also read those messages up to the point that at least one message has been sent by Bob to Alice and then one from Alice back to Bob.)
Of course, Eve always being able to forge messages from Bob is not an ideal state of affairs. So, to mitigate it, the WMP uses different authentication secrets (i.e. signature keys) on each of the devices belonging to Bob’s account. Moreover, compromising any such device does not give Eve the capability to add new devices to his account (as that requires either a password or a master key, neither of which is stored on Bob’s device). Thus, whenever Bob wants, he can remove (or wipe) devices to revoke Eve’s ability to authenticate messages from that device. (Wickr allows contacts and all stored messages to be automatically transferred between devices at setup time. So when Bob, say, wipes and re-installs Wickr on a device he suspects was compromised, he effectively ends up with the exact same view of his communication history except that, under the hood, the secret state has been completely refreshed.)
After a fair amount of thought on the topic, we’ve decided that creating the potential for Eve to permanently hijack communication with no recourse for Bob is not a price we want our users to pay for the added authentication-healing capabilities a DR-based approach would give Wickr. This is especially so given the mitigating features I outlined above, not to mention the following issue.
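The per-device mitigation can be sketched as a small account model. The class and method names (`Account`, `add_device`, `revoke_device`) are hypothetical, HMAC again stands in for per-device signature keys, and the master secret check is a simplification of "requires either a password or master key"; none of this reflects Wickr's actual data model.

```python
import hashlib
import hmac
import secrets


def sign(device_key: bytes, payload: bytes) -> bytes:
    # Stand-in for a per-device signature; a real system would use
    # public-key signatures rather than a shared MAC key.
    return hmac.new(device_key, payload, hashlib.sha256).digest()


class Account:
    """Toy account record: one authentication key per registered device."""

    def __init__(self, master_secret: bytes) -> None:
        # Only a hash of the master secret is kept, and it lives with the
        # account, not on any of the devices.
        self._master_hash = hashlib.sha256(master_secret).digest()
        self.device_keys = {}

    def add_device(self, master_secret: bytes, device_id: str) -> bytes:
        supplied = hashlib.sha256(master_secret).digest()
        if not hmac.compare_digest(supplied, self._master_hash):
            raise PermissionError("adding a device requires the master secret")
        key = secrets.token_bytes(32)
        self.device_keys[device_id] = key
        return key

    def revoke_device(self, device_id: str) -> None:
        self.device_keys.pop(device_id, None)

    def verify(self, device_id: str, payload: bytes, sig: bytes) -> bool:
        key = self.device_keys.get(device_id)
        if key is None:
            return False
        return hmac.compare_digest(sig, sign(key, payload))


master = secrets.token_bytes(32)
account = Account(master)
phone_key = account.add_device(master, "phone")
laptop_key = account.add_device(master, "laptop")

stolen_key = phone_key  # Eve compromises the phone.
forged_before = account.verify("phone", b"forged", sign(stolen_key, b"forged"))

# The stolen device state does not let Eve enroll a device of her own.
try:
    account.add_device(stolen_key, "eve-device")
    eve_enrolled = True
except PermissionError:
    eve_enrolled = False

# Bob revokes the phone; Eve's forgeries stop verifying, the laptop still works.
account.revoke_device("phone")
forged_after = account.verify("phone", b"forged", sign(stolen_key, b"forged"))
laptop_ok = account.verify("laptop", b"hello", sign(laptop_key, b"hello"))
```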
Recovering From State Loss.
Introducing the possibility of hijacking is not the only problem with basing authentication on an evolving secret state. In fact, there is a much more mundane problem that a normal user will quite likely end up running into. What if Bob loses the state on his device? Say he lost his phone, or he updated the OS on the phone and had to wipe it clean first? If Alice authenticates him based only on this evolving secret state, he has effectively lost his account when something like that happens.
So how is this dealt with in practice? Well, the almost universal solution seems to be letting Bob re-initialize a completely new secret state which is then (automatically) announced to Alice. Indeed, as far as I can tell, in almost all DR deployments out there, Alice will be (at best) briefly notified about such an announcement and communication will otherwise continue uninterrupted. (In fact, up until relatively recently at least, neither WhatsApp nor iMessage seemed to notify Alice at all.)
The key point here though is that, in practice, this means that such implementations don’t really have PCS with respect to authenticity after all. In fact, an adversary that controls either the key distribution server or whatever mechanism Bob uses to authenticate himself to that server (usually either a phone number or an email address) will always have the ability to force such a re-initialization with Alice. In other words, the one security advantage a DR-based approach would have over the WMP that we found interesting ends up being weakened, for the sake of usability, to be effectively (at best) as secure as what Wickr already provides. For all these reasons Wickr will, for the foreseeable future, be sticking with the WMP approach to messaging.
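The forced re-initialization can be illustrated with a hypothetical client that silently trusts whatever key the distribution server currently serves. The `Directory` and `AutoAcceptClient` classes are invented for this sketch, HMAC stands in for identity signatures, and no real messenger's behaviour is being reproduced here.

```python
import hashlib
import hmac
import secrets


def sign(identity_key: bytes, payload: bytes) -> bytes:
    # Stand-in for an identity signature (assumption for this sketch).
    return hmac.new(identity_key, payload, hashlib.sha256).digest()


class Directory:
    """Toy key distribution server: maps a user to their current key."""

    def __init__(self) -> None:
        self.identity = {}

    def publish(self, user: str, key: bytes) -> None:
        self.identity[user] = key


class AutoAcceptClient:
    """Hypothetical client that accepts any re-initialization it is served."""

    def __init__(self, directory: Directory) -> None:
        self.directory = directory
        self.known = {}
        self.notices = []

    def verify(self, user: str, payload: bytes, sig: bytes) -> bool:
        current = self.directory.identity.get(user)
        if current is None:
            return False
        if self.known.get(user) not in (None, current):
            # At best a brief notice; communication continues regardless.
            self.notices.append(user + " re-initialized their keys")
        self.known[user] = current
        return hmac.compare_digest(sig, sign(current, payload))


directory = Directory()
bob_key = secrets.token_bytes(32)
directory.publish("bob", bob_key)
alice = AutoAcceptClient(directory)

bob_ok = alice.verify("bob", b"hi Alice", sign(bob_key, b"hi Alice"))

# An adversary who controls the directory (or Bob's phone number or email)
# publishes a key of their own under Bob's name.
attacker_key = secrets.token_bytes(32)
directory.publish("bob", attacker_key)
attacker_ok = alice.verify("bob", b"new phone!", sign(attacker_key, b"new phone!"))

# The genuine Bob is now the one who fails verification.
bob_ok_after = alice.verify("bob", b"hi again", sign(bob_key, b"hi again"))
```

However carefully the session state evolved beforehand, Alice's trust in this model ends up resting on whoever controls the directory entry.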