Why the foundations are more important than what is built upon them—the janion: the building block of privacy solutions—janionic properties: inscrutability, versatility, duality—the bootstrapping of janioning software—janionic exchange: nodal and pseudonymous public key pairs, closed and open routes, routing keys, intermediate nodes, inbound and outbound nestings, high-frequency subnets, directed and random janionets, drop-off points, forked routing—network properties: nodal myopia, self-monitoring and self-adjusting, consensus—active attacks: delaying, injection, deletion, transformation—passive attacks: collation, tracing, flow rate, timing, penetration.
Focusing on the Foundations
Now the future is not looking bright. But if the four dimensions of information hiding could be implemented, then even though the contest would still one of David versus Goliath, we would, at least, have a fighting chance. So how easy is it to implement the four dimensions of information hiding? That depends on the structure of the world, particularly the online world, in which we live. So rather than considering particular techniques for information hiding, let’s take a step back and ask the more fundamental question of what infrastructure is needed so that privacy solutions can be easily implemented in a variety of different ways? This underlying infrastructure has nothing in itself to do with information hiding, but it can greatly facilitate or frustrate our efforts—when the big bad wolf wants to spy on the domestic arrangements of Kermit and Miss Piggy perhaps a “house of straw” built high-up on top of a rocky pillar will serve them better than a “house of brick” built on the quicksands of today’s Internet. The development of an infrastructure that facilitates the implementation of information hiding is what we call “janography”.
Focusing on janography rather than on information hiding has two benefits: (1) it does not tie our hands as to the “how” of information hiding; and (2) because it is peripheral to, and clearly separated from, the “how” its development is less likely to be proscribed by Big Brother.
In essence, the battle for freedom is being waged on two fronts. Big Brother hopes that his steadily increasing monitoring of individuals will go unnoticed by society and will eventually reach a tipping point where privacy becomes impossible. Big Brother’s opponents hope that their steadily improving janographic infrastructure will go unnoticed by Big Brother and will eventually reach a tipping point where the elimination of privacy becomes impossible. If we can get this janographic infrastructure sufficiently well established, then Big Brother will be powerless to stop the implementation of privacy solutions that are based upon it. We’ll have Big Brother by the “short ’n curlies”, and we’ll be able to squeeze to our collective heart’s content!
Let’s give a name to the building block out of which we can construct privacy solutions that satisfy the four dimensions. Let’s call it a “janion”. A janion is something inside of which we can hide information. Any particular janion may, or may not, contain hidden information. And what properties should a janion possess?
Inscrutability—it must be possible to hide information inside a janion without distorting any of its natural properties. Janions that contain hidden information should be indistinguishable from janions that do not.
Versatility—a janion must be sufficiently flexible so that it can be used as the basis for many different privacy solutions. It should be possible to hide information inside a janion in many different ways, since we want to decouple the infrastructure that facilitates the hiding of information from the information hiding techniques that make use of it.
Duality—a janion must also be produced as a by-product of computer processes that have nothing whatsoever to do with privacy. And it must be impractical to ban these processes, or to enforce their modification in such a way so that they no longer produce janions.
In other words a janion—like the Roman God “Janus” after which it is named—must face in two ways: while being flexible enough to allow information to be hidden inside it in a variety of different ways, it must not draw attention to itself; it must be the embodiment of the ordinary, the commonplace, the unremarkable—it must be something that you’d expect to find on any computer.
Imagine a world containing trillions of janions residing on computers everywhere. In such a world Big Brother has a problem. Some of these janions are being used by people to hide information. But it’s impossible to tell which janions are and which janions are not being used for this purpose. Hence, on the basis of the hidden information alone, it is not possible to tell which people are hiding information and which are not.
Thus far we have a world where individuals can keep private matters private—we have the world of H2Eye. But if Big Brother inspects our computers he will find the software that we use to hide information inside janions. So, in order to satisfy the software hiding requirement of H3Eye, we require certain properties of the software that we use to manipulate janions:
The software that is used to manipulate janions must be stored on a computer in the form of a janion, and it must be possible to extract the software from the janion and convert it into a runnable form using dual-purpose software: software whose primary, and ostensibly only, function is totally unrelated to the nurturing and promotion of privacy.
Bootstrapping makes it impossible for an adversary to determine whether or not we possess the capacity to manipulate janions, including the capacity to hide information inside janions.
One of the characteristics of present day governments is their capacity to automatically construct extensive networks, consisting of nodes that represent individuals, and connecting links that represent communications between those individuals. Even though the contents of the individual communications may not always be known, these networks are very easily used to manipulate, suppress, and control those individuals who would otherwise draw attention to governmental failings and corruption. In this manner, governments seek to subvert, undermine, and impair the democratic accountability that they owe to their citizens.
Now unless janions that don’t contain hidden information are routinely exchanged in large numbers, the exchange of janions would, in and of itself, betray the likely presence of hidden communications between senders and receivers, allowing governments to create networks of people who communicate with one another and who also have an interest in privacy. So to implement the communications aspect of H3Eye we need to ensure that janions that don’t contain hidden information are exchanged in large numbers.
However, even if janions are exchanged in large numbers for reasons unrelated to the protection of people’s privacy, these exchanges still allow governments to create networks that detail who is communicating with whom. We have failed to be “eternally vigilant”, and the price we have paid for our inattention is that governments have accrued the powers they need to spy upon us with impunity. But to wrest back those powers is likely to prove well-nigh impossible, at least in the short term, so a more subtle approach is needed. Now the monitoring of communications by governments does not, in and of itself, diminish our freedoms; instead it is the capacity of governments to extract useful information from the data they collect—it is this information which acts as the oxygen of tyranny. What we propose is a method of janionic exchange that makes the construction of any kind of “informative” network as a result of communications monitoring impossible.
Now, take a deep breath; let it out; and then employ those speed-reading skills—when we applied our “Susie Test” to the following section the answer we received was “Yuck”!
A mechanism that moves janions around a network according to the following principles:
- A network consists of nodes. Each node has a “nodal identifier”, a “physical address”, and an associated “nodal public-key pair”. The triplets consisting of nodal identifier, physical address, and nodal public-key are made available to all nodes and are held in “distributed directories”. The “nodal private key” associated with a node is known only to that node. In addition, a node may have one or more “pseudonymous public-key pairs”: the private key of a pair is known only to the node, as before; the public key, while circulated in public, is not associated with the identifier or the physical address of the node and cannot be used to identify it.
- A “route” consists of an ordered collection of network nodes. A route is “closed” if the last node in the route is the same as the first; otherwise, it is “open” (almost all the routes used for janionic exchange will be closed). The “originating” node is the first node of an open route, and the common first and last nodes of a closed route. All other nodes are “intermediate”.
- Associated with each intermediate node on a route is a set of “instructions” that tell the node what operations to perform and provide the data that is needed to perform these operations. Mandatory data elements that appear in all instructions are the physical address of the next node on the route and a one-time “routing key”. An optional data element is a “drop-off identifier”.
- Associated with each intermediate node on a route is a “report” that is created by the node, and which details the results of the operations that it has performed.
- An “outbound nesting” for a route begins by taking the “instructions” for the last intermediate node and encrypting them using the nodal public key of that node. Then each intermediate node is taken in turn in reverse order and its instructions are appended to the outbound nesting before being encrypted using its nodal public key. An outbound nesting is constructed at the originating node of a route when the route is being planned.
- An “inbound nesting” for a route begins by taking the “report” for the first intermediate node and encrypting it using the routing key allocated to that node. Then each intermediate node is taken in turn in forward order and its report is appended to the inbound nesting before being encrypted with the routing key allocated to the node. An inbound nesting is constructed in stages, with each intermediate node that comprises a route making a contribution.
- A janionet consists of a janion, an outbound nesting, and an inbound nesting, each of which has a fixed-size.
- A random janionet is prepared automatically by selecting at random from the network a fixed number of intermediate nodes which are to form the associated route. A directed janionet is prepared by the operator of a node for the purposes of communicating with some other node.
- A janionet is sent from the originating node to the first intermediate node on the route. That node decrypts the outbound nesting of the janionet using its nodal private key, and extracts its instructions and the outbound nesting for the next node. The default instructions for a node are as follows. If the instructions contain the inverse routing keys of previous nodes, these are applied to reconstruct the janion dispatched by the originating node. Then the node attempts to decrypt the janion using its nodal private key to determine if the janion contains a message for the node. If the node wishes to make a reply it modifies the janion accordingly. The node then encrypts the janion (the modified version in the case of a reply, the decrypted version in the case of a message with no reply, and the version received from its predecessor node in the absence of a message) using the routing key found in the instructions. It updates the inbound nesting by appending the timestamp at which the janionet was received and encrypts the combination using the routing key. It creates a new janionet consisting of the encrypted janion, the updated inbound nesting, and the extracted outbound nesting. This janionet is then held in a storage area, the janionic pool, for a period of time selected at random from some probability distribution (or as otherwise directed by the instructions) before being forwarded to the next node on the route.
- A janion can be encrypted using the nodal public keys of multiple intermediate nodes when no replies are expected. Different messages for multiple intermediate nodes can be placed inside different hidden volumes within the same janion by encrypted them to the nodal public keys of different nodes, allowing each node to reply by modifying its own hidden volume.
- A node may be given instructions to clone, replace, or destroy the janionet, or to create one or more new janionets.
- A node may be instructed to send the janionet along multiple routes.
- A node may be instructed to hold a janion tagged with an associated identifier in its janionic pool for a specified period of time and then destroy it.
- A node may be instructed to search its janionic pool for a janion corresponding to a specified identifier and then forward that janion along a specified route.
- A node may be instructed to ask the nodal operator to take some manual action (for example, to remove a janion from the janionic pool and to forward it elsewhere by snail mail).
- Each node records statistics on the rates at which janionets are absorbed and emitted. When janionic flow rates are disturbed by the origination of directed janionets, by the actions of an adversary, or by network malfunction, a node will adjust the rate at which it originates random janionets so that the stochastic properties of the janionic flow rates into and out of the node remain unchanged.
- When a node is added to the network the nodal operator asks the node to automatically negotiate membership of one or more “high-frequency subnets”. In addition to the standard random exchanges with nodes selected at random from the network as a whole, the node will also make random exchanges at much higher frequencies along routes chosen at random from the high-frequency subnets. At intervals, a node will randomly remove some nodes from each subnet and add a similar number of nodes selected at random from the network as a whole.
Sending and Receiving
Now you know why Susie said, “Yuck!” Don’t you hate it when people trot out these formal, “snooze-inducing” definitions? So, let’s consider a few examples to breathe a little life into what is rapidly becoming a torrid text.
Let A be an originating node. Let B, C, D, and E be intermediate nodes, with respective routing keys of b, c, d, and e. Then a closed route and its instructions can be represented by:
- A => Bb => Cc => Dd => Ee => A
Suppose node A wishes to communicate with node C. It encrypts the janion using C’s nodal public key and sends it along the following route: A => Bb => C-bc => Dd => Ee => A. Node C’s instructions contain the inverse of node B’s routing key. So when node C receives “b(janion)” from node B it first computes “-b.b(janion)”, to give “janion”, which it can then decrypt using its nodal private key. It then applies routing key “c” so that it passes “c.decrypt(janion)” on to node D. The originating node receives e.d.c.decrypt(janion). By applying the inverse of keys “e”, “d”, and “c” node A can verify that node C has received and successfully decrypted the message.
Note that though the sender knows the identity of the recipient, the recipient does not know the identity of the sender, unless the sender wishes to sign the message—clearly an advantage in a whistle-blowing context.
Send and Reply
Suppose node A wishes to communicate with node C, and node C wishes to make a reply. The only difference from the previous example is that instead of passing “c.decrypt(janion)” on to node D, it passes c(reply) instead. By applying the inverse of keys “e”, “d”, and “c” node A can read node C’s reply. As above, the sender is unknown to the recipient, but the two can still communicate.
Two nodes A and L can communicate without knowing each other’s physical location. All they need to do is to exchange pseudonymous public keys, and then agree a drop-off identifier and a drop-off node. Node A encrypts the janion using node L’s pseudonymous public key and sends it along the following route: A => Bb => C-bc1 => Dd => Ee => A. The instructions given to node C tell the node to put the original janion into its janionic pool, tagged with a drop-off identifier for a specified period of time and then delete it. Before the janion is deleted, node L sends a janion along route L => Mm => Cc2 => Nn => L. The instructions node L gives to node C are to replace the janion it receives from node L with the janion in the janionic pool that matches a drop-off identifier given to node C by node L. If this identifier matches that given to node C by node A, then node L will receive “nc(janion)” from which it can extract the message using the inverse of routing keys “n” and “c2” and its pseudonymous private key.
Any node that is prepared to act as a drop-off point will allow other nodes to publish the information needed to initiate such exchanges: a pseudonymous public key, a drop-off identifier, and a drop-off node, as well as any other information that the originator of the communication wishes. Any nodal operator can then initiate a communication. Ideally every node on the network would offer this facility, and senders and receivers would move the drop-off node around the network in a random manner, changing it with each new communication. If every node can function as a drop-off point and drop-off points are selected at random, then tracking down communications becomes much more difficult as it becomes necessary for an adversary to identify pairs of intersecting routes.
The instructions for a particular intermediate node may request it to send a janion on one or more routes in addition to the main route. So if the main route is A => Bb => Cc => Dd => Ee => A, then node C’s instructions may require it to send the janion along route Cc => Xx => Yy => A as well. Multiple forks, forks within forks, and random forks are all possible. Any or all of the forked routes can be open as well as closed. Forks can be used for a number of purposes, as we’ll see shortly.
Each intermediate node will know from which node it received a janion. And the outbound nesting will tell it to which node it should forward a janion. But that’s all it knows about the route along which a janion is travelling. It knows nothing about the other nodes. In particular, it knows nothing about the originating node.
The Self-Monitoring Network
The janionic network can monitor itself and automatically determine if a node is malfunctioning. It can also determine if a compromised node or an adversary with access to the links between nodes is perturbing the flow of janions in some manner. This self-monitoring is possible due to:
- Closed Routes
- Nodal Myopia
- Inbound Nestings
- Routing Keys
Closed routing allows any node to examine the workings of any other node, as what any intermediate node does will be fed back to the originating node. Nodal myopia minimizes the knowledge that a compromised node has of the routing, which makes it impossible for a compromised node to avoid carrying out its instructions and escape detection. The encryption of inbound nestings and janions at each node using a one-time routing key ensures that the originating node has feedback to analyse, which helps it to determine which node, or which link, is malfunctioning.
An adversary with access to a node or to a link between nodes can do various things to perturb the network. If the originating node suspects that a route has been compromised, it can send probes along multiple routes that pass through each of the intermediate nodes on the problematic route in turn, and thereby determine which node has been compromised. The compromised node can then be blacklisted by the originating node.
Furthermore, the originating node can send a message signed using its nodal private key across the network using a non-encrypted janion to inform other nodes of the compromised node. These nodes can then perform their own tests by way of confirmation. They can then, in turn, send out warning messages signed with their nodal private keys. As a result a “network consensus” regarding the compromised node will develop, and it can be blacklisted by the network as a whole, rather than just by individual nodes. It’s impossible to spoof such warnings without adding new nodes to the network. And any compromised nodes that send out denials would be in danger of disclosing their real purpose. Hence, as long as less than 50% of the network has been compromised, a correct network consensus regarding the status of an active compromised node can be obtained (compromised nodes can of course be silent and just record information, rather than perturbing the network).
Let’s get mathematical—sorry Susie! Let’s assume that our network has n nodes, that each route possesses m intermediates nodes, and that on average r janions are originated per day by each network node. Assume that by default all routes are closed, so that the number of janions emitted by a node is the same as the number absorbed.
Let’s assume that the number of network nodes is large and that the number of janions emitted or absorbed by a node during a particular time interval has a Poisson distribution. This is a good choice since it means that the numbers emitted in non-overlapping time intervals will be stochastically independent. Hence, if r janions are originated on average by a node each day, then the probability distribution of the number originated during a time interval t has a mean of rt, a variance of rt, and the probability that exactly k janions will be originated during the time interval equals:
The total number of janions emitted or absorbed by a node in a time interval t will have a Poisson distribution with a mean and variance of r(m+1)t.
By way of example, consider a network of one billion nodes (after all, the Internet now has one billion users), with 9 intermediate nodes per route, and with each node originating 100 janions per day (imagine that each Internet user sends 100 emails per day to randomly selected email addresses). Hence, on average 1000 janions will be emitted and absorbed by each node every day, with a standard deviation of about 32. Assuming that the Poisson distribution is approximately normal, then about 95% of the time the number of emitted or absorbed janions will lie within two standard deviations of the mean. Hence, we can say that on 19 days out of every 20, the number of janions emitted or absorbed by a particular node will lie between 936 and 1064.
Given the large amount of variation in the numbers of janions emitted and absorbed it will clearly be easy to slip in the occasional directed janion without it being detected. This is exactly what we need from a good janographic infrastructure: even though it is possible to monitor communications, it is not possible to extract any useful information from that monitoring; every node behaves in exactly the same manner as every other node; and each node’s default behaviour is entirely devoid of “intentionality”.
An adversary with access to a node or to a link between nodes can do various things to perturb the network:
- Delaying Attack
- Injection Attack
- Deletion Attack
- Transformation Attack
In principle, an adversary could systematically delay janions sent by a node and then examine another node to see if statistical variations in the arrival times of janions at the latter confirm that both nodes frequently lie on the same route. However, this type of attack will not work because both nodes that are the subject of the attack can detect it as it develops and can take action to undermine it.
Let’s assume that node A frequently originates janions that are targeted at node B with far greater frequencies than if the routes were chosen at random. Let’s assume that an adversary suspects that this might be the case and monitors all janions emerging from node A, delays them in some random manner, and then monitors the arrival times of all janions at node B. Now the adversary may be able to estimate how long it takes on average for a janion to travel from node A to node B in the absence of any perturbations. If so, then as soon as node A emits a janion he can determine the expected time of arrival, and set counting windows on each side of that arrival time. With no delays then on average both windows will have the same count. But if he delays a janion, then the second window will on average have a greater count than the first.
Let’s suppose that on average each node originates 100 janions per day, and that on average node A sends one directed janion per day to node B. Since the adversary doesn’t know which of the janions emitted from node A might be targeted at node B he has to delay all of them, or delay a random sample of them. Now node A knows the statistical distribution of transfer times for janions around a closed route. If will therefore see a statistical anomaly begin to arise as the average transfer time increases. And for every janion targeted at node B that the adversary could use to glean some information, node A will have 100 times as many. Hence, node A will have solid evidence that something is wrong long before the adversary. Furthermore, node A can examine the inbound nesting to determine the exact arrival times of janions at intermediate nodes, and can thereby determine that the problem lies with all links exiting node A. Even better, node A can send probes that have zero holding times to intermediate nodes, allowing it to detect the precise statistical distribution of the delays inserted by the adversary.
Node A can defeat the adversary by changing the instructions to the first intermediate node on each route, telling that node not to hold a janion for a specific random delay, but to hold it until a specific time of day has passed. Once node A knows the magnitude of the longest delay being inserted by the adversary, it can set the emission time from the first intermediate node to be equal to the time of emission from node A plus that longest delay; hence, the time between emission from node A and emission from the first intermediate node becomes fixed.
Let’s assume, for the sake of argument, that node A does nothing. Let’s see what action node B can take in these circumstances. Now node B will be keeping track of the numbers and arrival times of janions (as part of its instructions node A can inform node B at what time it dispatched each janion). Hence, node B and the adversary both have access to the same information and can both examine the same statistics. As soon as the data begin to look statistically unusual, but before they become statistically significant, node B can send closed circuit probes with zero holding times along routes that pass through node A (assuming that node A is prepared to disclose its identity). Since the adversary does not know the origin of janions he will apply the delays to these probes and immediately reveal his presence. Node B can then inform node A to take action, and can inform the network of the presence of the adversary. Once a network consensus has been obtained, if node A has not addressed the matter, all nodes can either blacklist node A, or inform other nodes to override its instructions by holding its janions until a specified time has passed rather than for a random interval.
Since an adversary knows nothing about the route apart from the next node he cannot send a janion to either the originator or to an intended recipient. So, inserted janions can’t be used for tracking purposes. As each node monitors the numbers and the nodes from which it receives janions, and can share that information with other nodes, a node that suddenly started producing large numbers of janions without good reason would be detected and blacklisted.
An adversary who suspects node A of communicating with node B may try to delete some or all of the janions exiting from node A in the expectation of seeing a dip in the number of janions arriving at node B. This attack can be defeated in the same manner as the delaying attack considered above.
Let’s suppose a compromised node modifies the outbound nesting. Since the node doesn’t know the originating node, its effect is to delete the janion, and insert a new one at random (an attack which has already been covered).
If a node modifies a janion then the originating node will be able to determine that it has done so. For example, suppose that on route A => Bb => Cc => Dd => Ee => A node D takes “cb(janion)” from node C and changes it to “modified” before passing it along the route. Then the janion received by node A will equal “ed(modified)” instead of “edcb(janion)”. The originating node can then probe the nodes individually to determine the compromised node, and if the originating node had been attempting to send a message to node E, for example, it could then send the message along a different route.
An adversary with access to a node or to a link between nodes can do various things to examine the behaviour of the network without introducing any detectable perturbations:
- Collation Attack
- Tracing Attack
- Flow Rate Attack
- Timing Attack
- Penetration Attack
Two or more compromised nodes may record the information that is available to them as a janion passes through, share it, and then try to infer the behaviour of some other node.
For example, suppose that on route A => Bb => C-bc => Dd => Ee => A nodes B and D are compromised, and suspect that node A is communicating with node C. Now, neither node knows the instructions given to node C: when node B had the outbound nesting it was still encrypted with C’s nodal public key; and when node D received the outbound nesting, node C’s instructions had already been removed from it. Neither node has access to node C’s report: it did not exist when node B had the inbound nesting; and it had already been encrypted using node C’s routing key by the time it was passed to node D. Node B knows the original janion and “b(janion)”. Node D knows “c(message)”. But it’s not possible to distinguish “c(message)” from “cb(janion)” without knowing routing key “c”. Hence, nodes B and D have no way of knowing whether node A is communicating with node C.
The most obvious way to determine if two nodes are communicating would seem to be to trace the flow of janions leaving the first node and see if a disproportionate number of them enter the second node.
Let’s examine a particular node. Janions arrive at the node at random, and depart from the node at random. Because the respective sizes of janions, outbound nestings, and inbound nestings are fixed, it is not possible to distinguish the emitted from the absorbed based on size. Because each janion, outbound nesting, and inbound nesting is encrypted / decrypted between absorption and subsequent emission it is not possible to distinguish the emitted from the absorbed based on their contents. Because janions are held in a node’s janionic pool for random times before they are forwarded and because a node will also be originating its own janions, the best that can he said of a janion emitted by a node is that it is very likely to be one of the janions absorbed by the node within some period of time prior to its emission. For example, an adversary might be able to say the probability is 95% that the janion that was emitted by node A at 16:00 is one the 250 janions absorbed by node A between 10:00 and 16:00.
Now it’s possible to track all the janions emitted by node A and determine which nodes they enter. And for each of these nodes it’s possible to determine which of the janions that are subsequently emitted by the node might be node A’s janion. And it’s possible to repeat the process node by node. Let’s say the number of candidate janions per node is c (250 in the above example). Then if we follow the flow through n nodes, we will have encountered c**n different nodes (ignoring cross-backs). Tracing these flows very quickly results in a combinatorial explosion. For example, suppose that the network contains a billion nodes and that the number of candidate janions per node is 250, then after following the flows from node to node for four iterations an adversary will have encountered nearly 4 billion nodes—even allowing for cross-backs an adversary will have encountered almost every node in the network. In other words, if an adversary was hoping to determine whether janions emitted by node A subsequently pass through node B, the answer will be that all janions emitted by node A will appear to pass through node B even when no communication between node A and node B is taking place.
Flow Rate Attack
A nodal operator who originates directed janions will increase the total traffic emitted by the originating node and absorbed by the target node. This increase in nodal traffic could allow an adversary who monitored one of the nodes to determine that it was actively communicating with some other node.
If r random janions that pass along closed routes containing m intermediate nodes are originated on average by every network node each day, then the probability distribution of the number emitted or absorbed by any node during a time interval t has a Poisson distribution with a mean of r(m+1)t and a variance of r(m+1)t. Suppose that in addition to these random flows node A originates k directed janions per day targeted at node B. Now an adversary who is monitoring node A (node B) will detect an excess of kt janions being emitted (absorbed) during time t. How long would an adversary have to wait until the probability of the excess occurring by chance had dropped to no more than, say, 1 in 40? Assuming that the Poisson distribution is approximately normal, then, as only the upper tail is relevant, the excess must equal twice the standard deviation of the distribution of the number of random janions emitted during the same time period:
- kt = 2sqrt(r(m+1)t)
- t = 4r(m+1)/k**2
If each network node originates 100 janions per day, if there are 9 intermediate nodes on each route, and if 10 directed janions are originated by node A per day, then an adversary would have to wait 40 days for the probability of the number of excess janions being emitted to drop to 1 in 40.
Clearly the janionic infrastructure hides the flow of directed janions very well in the short to medium term. However, if the network is self-monitoring and self-adjusting then long before an adversary could detect a statistical anomaly, the originating and target nodes will have adjusted the flow rates of the random janions that they originate to ensure that the default flow rate statistics are re-established.
If communications are bidirectional or the routes selected for directed janions are closed, then each node will emit as many extra janions as it absorbs. Hence, each node can automatically reduce the rate at which it originates random janions—by one random janion for each directed janion—so that the total average flow rate through each node returns to its default value.
If communications are predominantly unidirectional and the routes are open (with node A at one end of a route and node B at the other), then the originating node needs to reduce the number of outgoing random janions while keeping the number of incoming random janions unchanged, and visa-versa for the target node. The originating node can reduce its outgoing random janions by the number needed to restore the average outgoing flow rate to its default value. It can then add instructions to its outgoing random janions so as to create as many closed forked routes as are needed to raise the average flow rate of incoming janions back to its default value. The target node can emit the same number of outgoing random janions as before, but it can send an appropriate number of them on open routes, so as to lower the average flow rate of incoming janions to its default value.
An adversary could gain evidence that node A is communicating with node B by correlating the emission of janions by node A with the absorption of janions by node B.
If r random janions that pass along closed routes containing m intermediate nodes are originated on average by every network node each day, then the probability distribution of the number emitted or absorbed by any node during a time interval t has a Poisson distribution with a mean of r(m+1)t and a variance of r(m+1)t. Since the number of janions absorbed by node B in any time interval has a Poisson distribution, the numbers absorbed in non-overlapping time intervals will be stochastically independent.
If a particular starting time is selected at random, and the count of the number of janions absorbed by node B that occurs in a time interval t that succeeds the starting time is subtracted from the count occurring in a time interval t that precedes the starting time, then the resulting probability distribution will have a mean of zero and a variance of 2r(m+1)t. Now suppose that an adversary knows that it takes up to time t for a janion to get from an originating node to a target node. Suppose the adversary waits until node A emits a janion. Suppose he monitors the count of the janions absorbed by node B during a time interval t, and then again during a second contiguous time interval t. If all the janions originated by node A are random and the network is large, then the probability that any of them will pass along a route that contains node B is negligible (see next section). Hence, the mean and variance of the difference in the counts between the two windows should be the same as when the starting point is selected at random. However, if s directed janions per day are emitted by node A and targeted at node B, then the probability that the janion monitored by an adversary is directed will be s/r(m+1), so that on average the difference in the counts between the two timing windows will be s/r(m+1), as the earlier window will always contain the count associated with a directed janion. If the adversary repeats this process, at the maximum rate of 1/t times per day for k days, then the excess count will be ks/tr(m+1) and the variance will be 2kr(m+1).
Assuming that the adversary wants to wait until the probability of the excess occurring by chance has dropped to no more than 1 in 40, and that the probability distribution of the difference in the window counts is approximately normal, then, as only the upper tail is relevant, the excess must equal twice the standard deviation of the probability distribution of the difference in the counts between the two windows:
- ks/tr(m+1) = 2sqrt(2kr(m+1))
- k = 8t**2(r(m+1))**3/s**2
If 10 directed janions are originated by node A per day and the width of the counting window is 6 hours, then an adversary would have to wait for about 13,700 years to gain solid evidence of communication between the nodes. The reason why this correlation proves so difficult to detect is that the adversary doesn’t know which 10 of the 1000 janions emitted by node A per day are targeted at node B. In other words, only 1 in 100 of the measurements he makes contains any useful information. And even then as the transit time for janions across the network is high many random janions will arrive at the target node during each of the counting windows—for every excess janion detected some 25,000 random janions will arrive in each of the counting windows.
On the other hand, if nodal holding times were negligible and janions took at most one second to get to the target node, then an adversary would only have to wait for about 15 minutes to get the evidence he needed. This difference in the amount of effort that an adversary must expend between these two scenarios illustrates the importance of long holding times and demonstrates why suspected routes can easily be confirmed on fast, low-latency networks, such as those associated with Internet browsing.
This type of timing attack will only work if the probability distribution of holding times is not uniform. Suppose that node A sends out a directed janion at about the same time each day and instructs the intermediate nodes to select random holding times so that the janion is equally likely to arrive at its target at any time during the next 24 hours. In this case, no matter how the counting windows are chosen they will record on average the same number of directed janions. If a lower latency is required, node A could emit janions about once per hour with a uniform distribution of transfer times spread over an hour. Many of these janions would of course be dummies. Alternatively node A could send out a small number of important messages with fast transit times, and let the network delay the less important messages so as to balance the statistics.
A target node can also detect statistical anomalies. If the sending node includes the time at which a janion is emitted as part of its instructions, then the target node can perform its own statistical analysis on the distribution of transit times. And it can do this far more efficiently than any adversary (100 times more efficiently in the above example). Hence the target node can detect the development of a statistical anomaly long before an adversary could do so. The target node can then request the sending node to adjust the instructions that are given to the intermediate nodes, and can subsequently confirm that these adjustments cause the anomaly to disappear.
A penetration attack occurs when an adversary progressively compromises more and more of a network’s nodes. Effectively, this attack removes nodes from the network, so that it shrinks. Now, with an active penetration attack it is possible to determine which nodes have been compromised and then blacklist them, so that, even though the network shrinks, at any time the proportion of compromised nodes that have not yet been discovered will always be small. Unfortunately, with a passive attack it is not possible to determine which nodes have been compromised. Hence, as more and more nodes are compromised, the average number of uncompromised intermediate nodes on a janion’s route decreases. At some point janions will occasionally be sent along routes that consist entirely of compromised nodes. Eventually all routes along which janions travel will have been compromised.
For all the other attacks considered so far it has not been possible for an adversary to determine the routes along which janions travel. So we need to ask (1) at what level of penetration does it become possible to routinely detect routes, and (2) does knowledge of routes allow an adversary to determine which nodes are communicating with one another.
So the first question is how likely is it that an adversary will be able to determine the route followed by a particular janion—how likely is it that all nodes other than the originating and target nodes have been compromised? Suppose that a fraction f of the network has been compromised and that janions are sent along closed routes that contain m intermediate nodes. Then the probability, p, of the entire route being compromised is given by:
- p = f**(m-1)
- f = exp(ln(p)/(m-1))
Let’s assume that there are 9 intermediate nodes. Then with 92% penetration the probability of detecting each route is 1 in 2; with 75% it is 1 in 10; with 56% it is 1 in 100; with 42% it is 1 in 1000; and with 18% it is 1 in a million.
Now we can ask for a given degree of network penetration, how long will it take before the routes corresponding to directed janions are detected? Suppose, as above, that the probability that any particular route is compromised is p, and that node A originates k directed janions targeted at node B every day. As time passes the probability that at least one directed janion will have been detected increases. If we wish to ensure that this probability rises no higher than q, then for how many days, d, can we continue to send directed janions? The number of routes followed by directed janions in d days equals kd. The probability that none of these routes has been compromised is (1-p)**kd. Hence, the probability that at least one route will have been compromised is given by
- q = 1-(1-p)**kd
- d = (1/k)ln(1-q)/ln(1-p)
Let’s assume that we want the probability that one or more routes have been compromised to be no more than 1 in 100, and that node A originates one directed janion per day. Then with 56% penetration an adversary will have to wait for 1 day; with 42% for 10 days; and with 18% for 27 years.
So we now have an answer to the first part of our question: at what degree of network penetration can an adversary frequently detect the route along which a directed janion is travelling? The next question to ask is does this matter? Effectively, we have a situation where an adversary can see that a janion travels from node A to node B and back again. Now an adversary can never prove that a janion is directed without compromising one of these two nodes. However, the size of the network determines the frequency with which randomly selected routes will contain both nodes A and B. If the frequency with which an adversary finds nodes A and B on the same route is much higher than it should be by chance, then he can conclude that the routes are selected intentionally for the purposes of communication.
So how likely is it that two nodes will be found by chance on the same route in a network containing n nodes. The probability that a random janion emitted by node A will pass immediately through node B is 1/n. If it passes first through some node X and then through node B, the probability of this occurring is 1/n**2. However, as there are n possible choices for node X, the probability that the janion passes through node B after two hops is still 1/n (and the same for multiple hops). Hence, the probability that node B will be found amongst the intermediate nodes by chance equals m/n.
Over a period of k days, an average of r(m+1)k janions will have been emitted by node A, and the probability that node B will appear on none of these routes equals (1-m/n)**r(m+1)k. Hence, the probability, p, that node B will appear on the same route as node A one or more times is given by:
- p = 1 - (1-m/n)**r(m+1)k
- k = (1/r(m+1))ln(1-p)/ln(1-m/n)
- n = m/(1-exp(ln(1-p)/r(m+1)))
Suppose that the probability p is 0.5, that the network has a billion nodes, and that there are 9 intermediate nodes per route. Then the time that must elapse before there is at least a 50% chance that node B will appear on the same route as node A equals 211 years. This is fine if node A only needs to communicate with node B once every 211 years!
Suppose we only wished to wait one day before there was at least a 50% chance of node B appearing on the same route as node A? Then the network should contain about 13,000 nodes. Hence, if node A and node B wish to communicate every day, they need to belong to a high-frequency subnet containing about 13,000 nodes. If they do, then an adversary will have no evidence to conclude that the janions they exchange are directed, rather than random. Note that as the network is progressively compromised an adversary will be able to estimate the size of the subnet and the frequency of communications within the subnet, so the size and the frequency need to be selected for consistency before penetration becomes too deep.
The size of the subnet places restrictions on the number of nodes that a particular node can communicate with on a frequent basis, but as can be seen from the example above this is far from being restrictive. The nodes belonging to the subnet should not be fixed, but should shift over time. If node A wants to start communicating with node C and it has not done so in the past, then an adversary may already have determined that node C does not belong to the subnet. If the subnet were fixed, then the sudden appearance of high-frequency exchanges with node C would provide evidence for communications. If subnet membership is fluid, then it may just have happened that node C has been added to the subnet purely by chance.
The analysis above is of the broad-brush variety, with various details omitted, and our rather rusty mathematics has doubtless led to some errors. Now, we’re not suggesting that such a network should be constructed, and, in any case, a detailed simulation of its properties would be needed before such a task was attempted.
Instead, our objective has been to illustrate that even if an adversary can monitor all nodes, and has compromised all nodes, except for those that are actually communicating with one another, it is still possible to construct a network in which it is plausible to deny that any communication is taking place.
We’re also interested in promoting certain principles. We like the ability of random flows to disguise directed ones. These random flows need not be wasteful. In a fully distributed network, directories, information, and searching would be distributed amongst the nodes. So the task for some “clever person” is to find an efficient mechanism for using random flows to implement distributed services!
We like the idea of a self-monitoring and self-adjusting network: one that can detect the perturbations caused by network users and external adversaries, and which can then adjust the network flows to prevent any statistical anomalies from developing. We like the ideas of closed circuits and nodal myopia that allow any node to secretly test that a particular node is performing as it should, without the node in question having any idea that it is being tested. And we like the idea of a network being able to arrive at a consensus, with each node voting with its own nodal private key.
We like the idea of route-based rather than point-based communications. At present, a downloaded web page or an email has a definite destination. With route-based communications, the best that can be said is that the web page or email may have had a particular destination. And we like the idea of drop-off nodes, so that two parties never have to communicate directly, but can use a node that is randomly varied with each exchange between the parties.