Part of my series countering common misconceptions in space journalism.
This blog is a follow-on to my original post on Starlink. Starlink is an emerging high-performance satellite-based internet routing network developed by SpaceX. Its ultimate purpose is to become the de facto internet backbone provider, connect billions more people to the internet, and revolutionize access to space.
The usual disclaimers apply. I have no relevant inside knowledge of Starlink operations. I’m not an expert in networking, and unlike Starlink’s staff I haven’t spent years working only on this problem. In fact, I’m usually deeply confused at the best of times. But I had a cool idea and I wanted to share it.
As of September 2020, the plan is for tens of thousands of Starlink satellites to orbit in about nine separate low Earth orbits, each orbit with a different altitude and inclination. Each sub-constellation in a single orbit forms, in effect, a northward and a southward comoving sheet of satellites that, once laser links are implemented, can communicate readily with nearby satellites. Thus, each orbit forms a “double cover” of the Earth up to some latitude.
Initially, Starlink satellites will communicate with each other via ground stations and user terminals. Later versions will implement laser links to other nearby satellites in the same orbit, permitting high data flows within each 2D data sheet. With nine separate orbit families, each having a north- and south-moving sheet, routing can get quite complicated.
For the purposes of this blog, each unidirectional microwave or laser link can be thought of as a discrete channel linking many nodes, each of which functions as a router. Thus, each satellite is a routing node connecting dozens of incoming and outgoing data links. Each discrete channel would have self-contained error correction, encoding, and dynamic link quality assessment. The user terminals and high-bandwidth gateways can also be thought of as nodes in the same sense.
At the packet level, the question of Starlink routing boils down to “how does a packet traverse the network?” This question is complicated by the following factors, expanded in the rest of this blog:
- Satellites are moving, so links between nodes are constantly changing.
- Connections between satellites are ephemeral and subject to unexpected degradation, so packets may get lost.
- Ideally, operation is decentralized, private, and maximally agnostic, in addition to preserving low latency.
This is quite a wish list.
This blog focuses on the security and privacy aspects of the Starlink network. If these were not concerns it would be relatively trivial to perform some global simulation of the constellation, centrally calculate optimal routes, push that info out to the satellites themselves, and have them execute precomputed behavior. With TLS, the packet contents would be encrypted from end to end but, crucially, network topology and packet metadata would be exposed.
In many ways, internet metadata is more of a privacy concern than the packet contents themselves. The metadata encodes the graph, social and otherwise, of anyone on Earth: a unique, time-evolving fingerprint that often reveals more about a person than anything they actually send.
Exposing metadata, even to SpaceX, is a bad idea. Intelligence agencies can use it to spy, while competitors can use it to undermine SpaceX’s technological advantages. If SpaceX has access to it, it can be legally compelled to filter content in various jurisdictions, undermining freedom and net neutrality and greatly increasing legal risk. Since SpaceX would probably prefer to spend money on engineers rather than on lawyers, building a network that lacks access to metadata, for anyone, is a good idea. Such a network by default preserves privacy, ensures security for e-commerce, and discourages hackers from meddling.
The ideal for the network is that it function as a decentralized, autonomous, transparent communications medium with no side channel leakage, no persistence, minimal state, forward secrecy, and metadata protection, even in the case where the satellites themselves are compromised or logging data. Yes, the threat model has to include the possibility that SpaceX’s own hardware or software could misbehave. This makes the routing problem a variant of the Byzantine Generals Problem, though blockchain is not required here.
Since the threat model has to include the possibility of radio or laser signals being intercepted by third party receivers, appropriately encrypted signals should be statistically indistinguishable from white noise.
While the Tor network was set up to provide some of these privacy-preserving features, it lacks the performance, reliability, functionality, and geographic connection necessary to work for Starlink. However, a well executed Starlink network should preserve privacy just as well as Tor. Starlink also has to provide maximum bandwidth, minimum latency, and minimal packet header overhead.
This blog won’t go into detail on the Border Gateway Protocol, the usual system the internet uses to pass packets between networks. That said, the geographic size and speed of the Starlink network do present some interesting challenges for operators of content delivery networks. I’m sure Cloudflare has thought about this!
Instead, I’m going to focus on the nuts and bolts of how packets might actually traverse the interior of the Starlink network without sacrificing speed or privacy.
How to actually make it work?
At its core, the question boils down to how a packet can traverse the network from ground terminal to multiple satellites to ground terminal, exploiting some kind of map between IP addresses, physical locations, and network topology, in a robust, decentralized way. What is the barest minimum quantity of logic that can do this?
In other words, how much contextual information can we obscure in a packet header and still have functionality?
The time scale for a packet to travel around the entire Earth is about a tenth of a second. On this time scale, the satellites move about half a mile, while a ground station at the equator would move about 50 meters due to the Earth’s rotation. To a good approximation, at the speed of light the satellites are not moving, so we don’t have to worry about packets getting splinched.
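Those numbers are easy to sanity check. Here is a rough sketch using rounded textbook constants. The 550 km altitude is my assumption for a typical shell, and the light path is taken along the ground rather than through the actual constellation.

```python
# Back-of-the-envelope check of the timescales quoted above.
# Constants are rounded textbook values; ALTITUDE is an assumption.
import math

C = 299_792_458.0            # speed of light in vacuum, m/s
EARTH_RADIUS = 6_371_000.0   # mean Earth radius, m
MU = 3.986004418e14          # Earth's gravitational parameter, m^3/s^2
SIDEREAL_DAY = 86_164.1      # one rotation of the Earth, s
ALTITUDE = 550_000.0         # assumed orbital altitude, m

# Time for light to travel once around the Earth at ground level.
t_around = 2 * math.pi * EARTH_RADIUS / C

# Speed of a satellite in a circular orbit at the assumed altitude.
v_sat = math.sqrt(MU / (EARTH_RADIUS + ALTITUDE))

# Surface speed at the equator due to the Earth's rotation.
v_ground = 2 * math.pi * EARTH_RADIUS / SIDEREAL_DAY

sat_shift = v_sat * 0.1        # metres a satellite moves in 0.1 s
ground_shift = v_ground * 0.1  # metres an equatorial station moves in 0.1 s

print(f"light around the Earth: {t_around:.3f} s")
print(f"satellite moves {sat_shift:.0f} m ({sat_shift / 1609.344:.2f} mi) in 0.1 s")
print(f"equatorial ground station moves {ground_shift:.1f} m in 0.1 s")
```

With these inputs the satellite shift lands a little under half a mile and the ground station shift a little under 50 meters, consistent with the round figures above.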
In order to find the right beam to the right destination ground station, a packet doesn’t need to know anything other than the rough GPS location of the destination ground station. Two decimal places of latitude and longitude, which pins the location down to about a kilometer, is plenty.
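To put a number on "rough", here is a small sketch of what rounding to two decimal places costs in position accuracy. The function names are mine and purely illustrative, and the Earth is treated as a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, rounded

def quantize(lat, lon, places=2):
    """Round a latitude/longitude pair (degrees) to a fixed number of decimals."""
    return round(lat, places), round(lon, places)

def worst_case_error_km(lat, places=2):
    """Upper bound on the position error introduced by the rounding, in km."""
    half_step = math.radians(0.5 * 10 ** -places)  # half of one rounding step
    north_south = EARTH_RADIUS_KM * half_step
    east_west = EARTH_RADIUS_KM * half_step * math.cos(math.radians(lat))
    return math.hypot(north_south, east_west)

coarse = quantize(40.712776, -74.005974)  # a point in New York City
error = worst_case_error_km(coarse[0])
print(coarse, f"worst-case rounding error about {error:.2f} km")
```

At New York's latitude the worst case comes out well under a kilometer, which should be far finer than any single beam needs.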
In principle, even this information can be obfuscated from nearly all the hardware on the packet’s path. All the hardware needs to know is which direction is fastest, and that can include any buffer backlogs on that particular router.
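As a sketch of what "which direction is fastest" could mean inside one router, the toy below scores each outgoing link by estimated flight time plus the time to drain its current backlog. The `OutputLink` class, its fields, and all the numbers are hypothetical, invented only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class OutputLink:
    name: str                 # label for the outgoing channel (hypothetical)
    propagation_ms: float     # estimated remaining flight time via this link
    queued_bytes: int         # bytes already waiting in this link's buffer
    rate_bytes_per_ms: float  # how fast the link drains its buffer

    def estimated_delay_ms(self) -> float:
        """Flight time plus the time needed to drain the current backlog."""
        return self.propagation_ms + self.queued_bytes / self.rate_bytes_per_ms

def pick_fastest(links):
    """Return the link with the smallest estimated total delay."""
    return min(links, key=OutputLink.estimated_delay_ms)

links = [
    # Geometrically shorter, but with a deep queue.
    OutputLink("north-east laser", 14.0, 9_000_000, 1_250_000),
    # Slightly longer path, nearly empty queue.
    OutputLink("east laser", 16.0, 500_000, 1_250_000),
]
best = pick_fastest(links)
print(best.name, round(best.estimated_delay_ms(), 2))
```

Here the congested north-east link loses to the slightly longer east link, and the router never had to look at where the packet is ultimately going.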
Each packet is encrypted and given a header by the origin SpaceX hardware which encodes the rough geolocation of the end point, determined either by caching or look up. This geolocation is split into varying pieces and encrypted using time- and space-evolving keys. These keys remain valid for, say, one second. Each satellite has a key that is updated continuously depending on the local time and the location of the satellite. Provided the time window is valid, a decryption operation unlocks 2 or 3 bits of salient detail, and no more. This is enough for the router to shunt a pointer in its buffer to the appropriate output channel in real time.
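Here is a toy sketch of that idea, not a real design. I assume a single shared `network_secret`, integer-degree location cells, and a sender that can predict which cell and time window a hop will fall in. None of that comes from SpaceX, and a shared secret would not survive the compromised-satellite threat model above. The masking is a simple HMAC-derived pad rather than a vetted cipher.

```python
import hashlib
import hmac
import struct

WINDOW_SECONDS = 1  # key lifetime suggested in the text

def hop_key(network_secret: bytes, cell: tuple, unix_time: float) -> bytes:
    """Derive a key from the current time window and a coarse location cell."""
    window = int(unix_time // WINDOW_SECONDS)
    context = struct.pack(">qhh", window, cell[0], cell[1])
    return hmac.new(network_secret, context, hashlib.sha256).digest()

def seal_direction(key: bytes, packet_id: bytes, direction: int) -> bytes:
    """Mask a 3-bit direction and append a short tag so stale keys are rejected."""
    pad = hmac.new(key, b"mask" + packet_id, hashlib.sha256).digest()[0]
    masked = bytes([(direction ^ pad) & 0b111])
    tag = hmac.new(key, b"tag" + packet_id + masked, hashlib.sha256).digest()[:8]
    return masked + tag

def reveal_direction(key: bytes, packet_id: bytes, field: bytes):
    """Return the 3-bit direction, or None if the key is wrong or expired."""
    masked, tag = field[:1], field[1:]
    expected = hmac.new(key, b"tag" + packet_id + masked, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(tag, expected):
        return None
    pad = hmac.new(key, b"mask" + packet_id, hashlib.sha256).digest()[0]
    return (masked[0] ^ pad) & 0b111

secret = b"hypothetical shared network secret"
cell = (34, -118)        # integer-degree cell, roughly over Los Angeles
now = 1_600_000_000.25   # arbitrary Unix timestamp
field = seal_direction(hop_key(secret, cell, now), b"pkt-0001", 0b001)

same_window = reveal_direction(hop_key(secret, cell, now + 0.5), b"pkt-0001", field)
next_window = reveal_direction(hop_key(secret, cell, now + 1.0), b"pkt-0001", field)
print(same_window, next_window)
```

A satellite in the right place during the right second recovers three bits and nothing else. One second later the derived key has changed and the same field no longer verifies. A real scheme would also have to handle packets that straddle a window boundary.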
For example, a packet originates in Los Angeles destined for New York. The first satellite is connected to several thousand user terminals and a handful of gateways. It determines (only) that the packet is traveling north east and routes it in that direction. Even if it logs the information about the packet, all it will know is the time it arrived, how long it was, and which direction it went.
The last satellite in the chain is able to read the rough geolocation of the destination terminal, and assign the packet to the correctly-oriented beam. But it doesn’t know where the packet came from other than “south west”.
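For the first hop of this example, three bits are enough to name one of eight compass sectors. The sketch below computes the great-circle bearing from Los Angeles to New York and quantizes it. The city coordinates are rounded, the eight-sector encoding is my assumption, and a real constellation would route along its orbital planes rather than a perfect great circle.

```python
import math

OCTANTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Great-circle initial bearing from point 1 to point 2, clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def octant(bearing_deg):
    """Quantize a bearing into one of eight 45-degree sectors (a 3-bit code)."""
    index = int((bearing_deg + 22.5) // 45) % 8
    return index, OCTANTS[index]

los_angeles = (34.05, -118.24)
new_york = (40.71, -74.01)

bearing = initial_bearing_deg(*los_angeles, *new_york)
code, label = octant(bearing)
print(f"bearing {bearing:.1f} degrees -> sector {code:03b} ({label})")
```

The bearing comes out at roughly 66 degrees, which falls in the north-east sector, so the first satellite needs nothing more than the code for "NE" to pick an output link.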
Every user terminal pointed at that satellite receives every data packet sent in that beam, but only the intended recipient will have the necessary keys to decrypt the packet, convert it back to regular internet traffic and either send it off on some ground-based fiber, into a server, or whatever.
Even if the packet is intercepted, within a second its geolocation keys expire and it is no longer distinguishable from white noise.
Even if an extremely well-motivated adversary were able to collect a log of every packet, performing a timing attack would require that each separate log’s clock offset be corrected. In practice, executing a zero-day on the target’s device is more likely.
In the end, such a scheme is not that different from how postal addresses work. Each step in the delivery only needs to know one line of the address, and typically only your local postal worker will have the granular knowledge of names and addresses necessary to ensure that the right people get the right letters. A sufficiently zealous privacy-minded user could enclose their message in four separate envelopes, each bearing the next line of the address, to be opened only on delivery to successively more localized geographic regions.
Finally, I am certain that there is a countably infinite number of ways to implement real-time decentralized routing on a satellite internet network, and that optimality is not equivalent to being the first one I thought of. How would you like to see the system work?