To mark the special day of January the 30th I release this article addressing my previously made observation that most crypto messaging schemes are cheaply compromised by non-cryptographic means, i.e. by seizing servers, buying companies, server-side wiretapping, social network analysis and other means.
The point of this article is to introduce a completely end-to-end encrypted messaging scheme that is resistant to the aforementioned non-cryptographic attacks. Namely, where servers can't see messages and cannot even know who is talking to whom and touch no data they can decrypt (dumb core). Both servers and end-user devices are disposable and message exchange patterns can not be derived from passive traffic observations in the general case. Still, the system is simple and practical to implement. I will use an instant chat application as an example, although the explained network essentially provides an end-to-end generic data pipe.
This work is dedicated to the memory of Willem van Oranje.
This work is an improvement on a [previous sketch][medium] of a secure messaging network responding to non-cryptographic threats (NCTs). The primary threat I consider is sophisticated wiretapping. The adversary is expected to go far beyond a regular man-in-the-middle attack by implementing dragnet surveillance, equipment seizure, factory-stage or in-transit equipment backdooring and sophisticated social network analysis. That makes any "central" equipment (servers, routers, cables) convenient sweet spots for wiretapping. In the past year, we witnessed several high-profile cases of such attacks, like:
- the incident of NSA wiretapping Google's internal data links,
- various hurdles several secure mail services faced recently,
- the practice of server seizure and non-discriminate data copying by law enforcement,
- generally, the fact that dragnet surveilance is becoming an established practice even in Western democracies.
Outside of the NSA context, I may mention several cases of protest activists leaders being kidnapped, tortured and killed during the recent Maidan protests in Ukraine. Although I do not have first hand knowledge of abduction techniques, [other events][sms] suggest that authorities tracked GSM phones of likely protesters based on their location and movement. I heard a similar story from a participant of 2011 Minsk protest: all protesters were listed based on their GSM phone location data and later harassed. Given that authorities have the data on protesters' locations and calls, singling out informal leaders, be they masked or not, is a purely technical operation, as well as kidnapping. In this context, the particular content of a victim's communication is less significant than the very fact of it. A finding that a person is in the center of a certain social network is a sufficient trigger for violent actions. Kicking out "hubs" is an effective and ancient technique for social network fragmentation and [neutralization][frontinusI-I-4].
To address these non-cryptographic threats (NCT), a messaging network must have its message exchange patterns hidden, and neither content nor senders/recipients must be known to the network's infrastructure elements (i.e. servers). The approach may seem somewhat unusual as normally the objective is to encrypt communication of certain nodes. Our objective is to hide the very fact of communication, assuming the adversary has enormous wiretapping and subversion abilities.
The key requirement is that we use disposable (nearly stateless) servers. I introduce the term "disposable" as servers are not entirely stateless. Still, their (transient) state can be booted from the network. Further on, disposability enables resilience, as a loss of any number of servers creates no disruption and no security risk as long as the rest can handle the load.
Servers are divided into two orthogonal categories: storage and switching. No long-term storage is generally assumed for "switching" servers, which are ideally ROM booted. Meanwhile, "storage" servers hold chunks of client-encrypted data they have no keys for. The content held by storage servers might be safely replicated and even made public. Servers must obey protocol to form a federated network; there is no dependence on any central resource like a DNS name or a database or a central owner. I assume that servers employ consistent hashing to spread the load among themselves, along with redundancy techniques, like [k-quorums][cassandra].
Logic, presentation and most of crypto functions are offloaded to end nodes. I assume that participants can have sufficient control of their terminal devices to guarantee their own security. Recently, we see a shift to rich client side apps; even Web apps are increasingly able to process and store data autonomously. In the era of multicore pocket devices that is not a big surprise. CPU and storage footprint of a secure messaging app is well within capabilities of a today's average phone. The only functions that are left to be implemented by servers of the core are storage and message switching.
The primary function of our system is message delivery. Messages are
entirely encrypted end-to-end, including addresses. So, how the
switching core delivers messages from sender to recipient without
knowing either? I resort to a listen/forward scheme based on 64 bit
destination labels. The scheme is conceptually similar to
subscribe-publish schemes except that labels are neither static nor
even guessable by third parties. A label is a throwaway address for a
unidirectional channel. A label is derived from a shared secret
initially established through a Diffie-Hellman exchange.
Any outgoing message is an opaque cryptogram prepended with
the current label of this sender-to-recipient channel.
A message is directed to one
of switching servers. From that point on, it is forwarded to the
subscriber, i.e. the recipient. Even if a message is intercepted, only
the recepient may decode it using it's private key.
// relax; explain the general idea; details are in the next sec
The "cloud" of switching servers can be organized to guarantee an arbitrary entry point for any client based on DHT-like principles of message forwarding. Hence, not only the content of the conversation is encrypted but also tracking the link between the sender and the receiver ideally requires a compromise of the entire chain of intermediary forwarding servers. Such a generic secure two-way data pipe pair is the basic primitive any further services could be build on.
part - approach (1-2 sent) - definitions - action
In this section I give some technical detail of the system's key components including semi-formal definitions of crypto and routing routines.
I use a private-public key pair as the only requirement to join,
instead of "account" or "address". A public key is essentially a
participant's id; messages are sent "from" one key pair "to" another.
Every two communicating participants establish a shared secret
through a Diffie-Hellman exchange. The shared secret both identifies
and enables their communication channel: secretAB = DiffieHellman (AlicePrivKey, BobPubKey) = DiffieHellman (AlicePubKey, BobPrivKey)
.
Each hour, the shared secret is mixed with a timestamp and the
recipient's public key, then hashed to obtain a forwarding label:
fwdLabelBobToAlice = SHA256(thisUTCHourTS xor secretAB xor AlicePubKey) & (2^64-1)
. A label is used by the receiver to subscribe
at the switching "cloud" and by the sender to send messages. Note that
labels are asymmetric: fwdLabelBobToAlice != fwdLabelAliceToBob
. A
receiver refreshes its subscription for every incoming channel, i.e.
for every of its contacts, every hour. Note that unsolicited messages
are impossible to send as forwarding labels are not guessable.
Suppose there is only one server doing all the switching. Then,
functions of the switching node mostly boil down to maintaining a
subscription table fwdLabel : connection
. Each incoming
subscription creates an entry referencing its source connection. Each
incoming message gets rerouted to all the connections that subscribed
to its forwarding label. In the case of no subscriptions a message
gets dropped. Each subscription expires in an hour.
A distributed system needs some degree of coordination and resilience
to maintain the subscription table in a distributed fashion. I resort
to a recursive forwarding scheme along the lines of distributed
hashtable (DHT) and consistent hashing schemes. Every server is
associated with k
points in the key space: {p_1...p_k}
. The
essential part of the process is that every subscription or message
must reach the node responsible for the respective key range, i.e. the
server associated with a point closest to the forwarding label:
min{|fwdLabel-p|}
. Speaking of subscriptions, every switching server
forwards it to the closest peer, marking a local "breadcrumb"
subscription in the table. That way, a subscription reaches the
responsible server, leaving a "breadcrumb trail" of subscription table
records that leads back to the original subscriber (Alice the
Receiver). Let's call this a "descent" path, as a subscription
greedily advances to minimize the distance function. Bob the Sender
uploads a message with the same label fwdLabelBobToAlice
to any
switching server. Then, the message is forwarded the same way by some
descent path, except no breadcrumbs are left. Once the responsible
server is reached (descent is over), the message ascends along the
breadcrumb trail to Alice the Subscriber. Essentially, the responsible
server is the rendez-vouz point for subscriptions and messages,
joining the ascent and descent paths of the travel. By definition,
both ascent and descent paths can have no cycles, so no packet/TTL
marking is needed. Note that the descent path is "calculated" based
on hash/label values, while the ascent path is "memorized" in
subscription tables.
An important choice is whether to queue (the SMTP way) or to drop (the IP way) messages once no receiver is present. The former option makes the switching core more complicated and resource-hungry; the latter option requires explicit receiver's acknowledgements and message retransmission by the sender. As the system lacks any persistent association between clients and servers, also forward labels may expire, and there is a good reason for that, then the only viable option is drop-and-retransmission. That makes acknowledgements the third basic message type after subscriptions and messages. An acknowledgement is sent by the reverse channel back to the sender. Note that acknowledgements also enable end-to-end congestion control.
Thus, the basic principles led us to an end-to-end architecture where complexity is at the edges and the dumb core only does message switching. Essentially, such a network provides encrypted cryptonymous end-to-end data pipes.
While my primary objective is to answer various non-cryptographic threats to a secure messaging system, it is time to make a 360' overview of various practical issues and secondary threats.
In the presented system, association of sender and receiver requires subverting the entire switching server chain. As the actual path of a message changes hourly and is not predictable in advance, that essentially requires control of the majority of switching servers. Still, perfect cryptographic opaqueness of the system may be workarounded by statistical traffic analysis and message correlation. The only practical approach I can see here is to "hide a leaf in the forest", i.e. to ensure that secondary information (packet sizes and timings) is too chaotic to derive any meaningful conclusions. There is a toolbox of techniques that may be of some help here, that mostly boil down to either randomizing or rounding up of message sizes and timings (e.g. buffering, padding, random delays and jitter, non-fatal message misdirection).
Another set of pratical issues is related to the system's resilience to DDoS and failure. One significant advantage of the presented system is the impossibility of DDoS being directed at a particular participant. As forwarding labels are only known to the sender and the receiver (just Alice and Bob) and an entry point is arbitrary (Alice may connect to any switching server) then there is no practical possibility to direct a stream of fake messages to jam a particular recipient. The only remaining option is to DDoS the entire system. Also, only a switching server needs to know most of other switching servers; a regular participant only needs to know his/her entry point. That makes a system-level DDoS somewhat tricky. In case switching servers are able to recuperate after a server loss, which is the very point of consistent hashing schemes, then DDoS of a single server does not give much as well.
Addressing the faulty node problem, the wait-and-see approach is to wait a hour till forward label changes likely excluding the faulty node from the ascend/descent path. In case some node completely disconnects, the previous node in the descend path may resubmit all the affected subscriptions to new responsible nodes. A somewhat more serious approach to improve resilience is to extend the forwarding scheme by k-quorums of responsible servers instead of just a single one. Finally, the entire switching core may be diagnosed end-to-end (traceroute-like) to detect and exclude faulty nodes.
The fact that every Alice-to-Bob channel is served separately may
raise the question of computational costs of the entire scheme and the
load imposed on the switching core. Assuming an average user has 1000
contacts, that means 1000 subscriptions refreshed each hour, i.e.
about one 64-bit subscription entry per user issued every 4 seconds.
The amount of state consumed in the core by a single user is around
k*64*logN
bits, where k
is the number of contacts and N is the
number of switching servers, assuming some classic DHT forwarding
scenario. That practically amounts to tens of kilobytes. 64*k
is
the necessary amount of subscription refresh traffic sent hourly
(practically, tens of bits per second). That amount of traffic is
easy to serve even by mobile devices and is equally easy to hide.
Note that the core does no crypto, except probably for TLS, and keeps
no data but the forwarding table.
Speaking of practical issues, participation incentives should probably precede all of the above. All successful federated protocols have robust incentives. I may only note that if switching servers have a peering/QoS agreement among them, then every switching server may provide a certain number of anonymous entry points of certain bandwidth. Another observation is that bandwidth costs tend to decay exponentially for several decades already, so even a time-unlimited grant (like "1Mbits forever") has somewhat predictable lifetime costs. Business-wise, the main shortcoming of the explained scheme is that it lacks any lock-in or price discrimination factors.
Tor is a widely known onion routing system: end-user traffic is encrypted several times over and forwarded by a random sequence of servers to achieve anonymity. Tor mostly protects against IP metadata attacks as it hides the destination IP address at the source and the source IP at the destination.
The BitMessage protocol is a proposed solution for a secure messaging transport which is generally modeled after BitCoin. BitCoin is architected after a digital notary scheme mostly focusing on non-repudiation. That explains the everybody-keeps-everything approach, which is less than convenient for a secure messaging system.
Telegram is a secure mobile chat app. In fact, Telegram is far from either secure or anonymous in the strict sense, as new users are sent SMS activation codes and their contacts are discovered in the phone's address book (and further uploaded to the server). Telegram employs crypto as a selling point mostly, but the system itself hardly provides any guarantees on top of regular TLS, save for the end-to-end crypto extension (like in XMPP).
My objective was to design a minimalist messaging architecture focused on non-cryptographic threats to security and anonymity that use the network infrastructure as a vector of attack. What mostly remains unaddressed is "meatspace threats", like over-the-shoulder surveilance or good old face-to-face threats and subversion, aka "the traitor threat" which is generally unfixable by technological means. Equally, the proposed design does not protect from a Lybian-style blackout of communications, although a non-functioning communication network is perfectly secure in some sense.
Again, the objective is to neutralize various threats and asymmetries created by the fact that an attacker may be able to eavesdrop the most of related infrastructure, may seize equipment en masse, may collect and analyze enormous amounts of data. That kind of a threat, mostly considered paranoid in the near past, now became the most common one.