RIPA Part III - Again

Why Mr. Reid might be a terrorist—why you can’t “encounter” what you can’t see—why human intelligence is an optional extra for government ... Masochistium Clickium Hic!

On Good Excuses

With regard to RIPA Part III the government is being either naïve or disingenuous. The legislation will not work as intended for the simple reason that it is predicated on two assumptions, neither of which is valid.

The first is the assumption that “criminals and terrorists” cannot produce a good excuse as to why they have in their possession encrypted material for which they do not possess the decryption key. For example, suppose that following the introduction of RIPA Part III we were to encrypt this blog entry using standard PGP encryption, email it to the home address of one Mr. Reid, and then tip off the anti-terrorist squad that Mr. Reid was a terrorist operative (not an unreasonable assumption given the subversive answers that the same Mr. Reid invariably gives to Radio 4 listeners!). Mr. Reid would doubtless delete our spam email, but it would still be physically present on his computer disk, and accessible by means other than his email program. Should the anti-terrorist squad follow up on this “reliable source”, a basic forensic scan of Mr. Reid’s disk would reveal the telltale PGP headers, indicating that Mr. Reid was indeed in possession of encrypted material. And he would be unable to make the plaintext available. The question is: would Her Majesty, and the rest of us, then have the pleasure of seeing Mr. Reid do time with, say, some psychotic, chair-leg-wielding, and racially prejudiced cell-mate for two years?

While such a prospect would doubtless bring pleasure to many, it would not be fair, for Mr. Reid, miserable sinner though he may be, is not responsible for the emails people send him, and we could hardly expect a government minister to be possessed of the “intelligence”—be it intra- or extra-cranial—to locate and securely delete emails that his email program already tells him have been deleted. Assuming that Mr. Reid cares to extend a similar courtesy to the population at large, then what is there to prevent the “criminals and terrorists” from using the email storage area as a safe repository for encrypted material? Or what if Mr. Reid regularly downloads some newsgroup, say “How to smile, and smile, and be a villain”? If some of the posts are encrypted, are we to oblige Mr. Reid to decrypt them? And if not, then might not “criminals and terrorists” avail themselves of this facility?

In short, with the increased use of encryption there are simply too many sources from which the guilty and the innocent may wittingly and unwittingly download encrypted material to their computers.

On “now you see it, now you don’t”

The second assumption is this: “Even though we may not have the keys needed to derive the plaintext from the encrypted material, we will always be able to detect the presence of the encrypted material.”

Paragraph seven of the government’s summary begins with the following sentence, “Over the last two to three years, investigators have begun encountering encrypted and protected data with increasing frequency.” Ay, there's the rub, investigators have begun “encountering” encrypted data. Part III of the Act rests on the singularly risible assumption that “criminals and terrorists” will continue to allow investigators to “encounter” encrypted material.

But this will not be the case. Software programs that are easy to use, that are available on the Internet for free, and that have already been downloaded by millions of people make it possible for data to be encrypted in such a manner that it is undetectable by the analytical techniques available to forensic science. People who use such software will never be caught by the proposed legislation should it be implemented.

Software of this type provides an “aleatory defence” by making encrypted material indistinguishable from random and pseudo-random data. For example, a USB memory stick containing this type of encrypted file system looks exactly like a USB memory stick that has been securely erased. Similarly, hidden volume filesystems make it impossible to detect whether a hidden volume is, or is not, present in any particular instance, so that investigators can only demand the encryption key to the outer volume.

In the absence of RIPA Part III “criminals and terrorists” have been content to use methods of encryption that shout out loud and clear “encrypted material—come and get it” by the presence of characteristic headers—as is the case with PGP. With RIPA Part III in place these same “criminals and terrorists” will simply move over to non-disclosing software, whose encrypted output investigators will never “encounter”. For more information on non-disclosing software see our blog entry at "No Keys" Campaign.

In summary, short of banning the use of personal computers there are no technical methods available to law enforcement authorities to prevent material from being encrypted in such a manner that it either cannot be discovered or the owner can plausibly deny knowledge of the means to decrypt it.

On Human Intelligence

There is, however, a reliable and well-proven method of tackling the “criminals and terrorists” should the government ever be minded to use it. It’s not glamorous, and it doesn’t lend itself so easily to mendacious “spinnery”. It’s called “human intelligence”. Neither the security services nor the government possess it at present. In the case of the former the deficit might be remedied by additional financial resources; in the case of the latter, we are sad to say that the only word that comes to mind is “irredeemable”.

Tiffium & Morphium – Bigus Brutium-Absentium Zonium

Why your Browser is cheating on you

You’ve installed that proxy chain, you’ve done everything correctly, by the book—you’ve called up your favourite search engine, entered your favourite topic, and soon you’re clicking away on one link after another, sure in the knowledge that Big Brother doesn’t know what sites you’re visiting—right?—wrong!—some of those clicks will be putting a smile on Big Brother’s face. ... Masochistium Clickium Hic!

Putting your life on the line

For many people in the West privacy is something of a fashion accessory. But just imagine for a moment that you live somewhere else. Just imagine that you live in China. Your name is Ms. Li Yinping. You live in Majia Village, Shouguang City, Weifang Region. You’re a member of the Falun Gong, a Chinese religious movement, one that is greatly despised by the regime. If they find out that you’re a member then you stand a good chance of being tortured, and quite possibly executed. So imagine that the search you are about to make could cost you your life if you make a mistake, if you haven’t set up your proxy correctly (needless to say, if you’re actually living in China, don’t perform this search, just read what follows, and then scrub your browser cache when you’re finished!)

Now you’ve heard a rumour about Jiang Zemin's regime rigging an event involving self-immolation in Tiananmen Square to discredit the Falun Gong. First set up your browser and proxy so that all your browsing activities will make use of the proxy and any DNS requests you make will be resolved remotely (for example, if you’re using Tor, point your browser at Privoxy, and then point Privoxy at your Tor client).

Now start your browser and go to Google—let’s assume you’ve managed to get a line out to “www.google.com”, and you’re not restricted to “www.google.cn”. Now type the following into the Google search box “Self-Immolation Tiananmen Square”. Let’s examine a few links from the page of hits returned by Google. If the page you see is similar to the one we see, then you’ll find a link called “Falun Dafa Clearwisdom.net”. Click on it. When the html page downloads you’ll see something beginning with “After July 20, 1999, Jiang Zemin's faction launched a far-reaching campaign of disinformation to justify its persecution of Falun Gong”. Press the “back” button on your browser to return to Google. Let’s try the link entitled “[PDF] Investigation of the So-Called Self-Immolation in Tiananmen Square”. Click on it. When the pdf file downloads you’ll see something beginning with “Ever since the so-called self-immolation incident occurred in Tiananmen Square, the Chinese authorities' persecution of Falun Gong – a popular Qigong practice in China outlawed by the Jiang regime – has clearly intensified”. Go back to Google again.

Now what happens behind the scenes when you click on a link? When you requested the first page, your browser passed the DNS lookup request along the proxy chain, the IP address of the site was returned, the browser sent a request along the proxy chain to return the page, and finally the page was displayed in your browser. And similarly for the second request, the browser passed the DNS lookup request along the proxy chain, the IP address of the web site was returned, the browser sent a request along the proxy chain to return the file, and finally the file was displayed in your browser. Correct?

Did you miss the Sleight of Hand?

No, not correct! If you had a sentinel installed (and if privacy is important to you then you should never browse without one), then by now you’d have that sinking feeling in your stomach. And with good reason. When you clicked on the first link the sentinel would have sat there sphinx-like, not uttering a word. But when you clicked on the second link, corresponding to “www.upholdjustice.org/English.2/s_i_investigation.pdf”, the sentinel would have awakened from its slumbers and would have reported something like this:

Sentinel Output

IP: 166.111.232.19.2760
   >>
IP: 222.212.39.104.53
Data: A? www.upholdjustice.org

IP: 222.212.39.104.53
   >>
IP: 166.111.232.19.2760
Data: 1/0/0 A IP: 207.44.152.163

The first set of data represents a DNS request originating from port 2760 on your computer with IP address 166.111.232.19. The request is sent to port 53, the standard DNS port, on the computer acting as your ISP’s DNS server, with IP address 222.212.39.104. The request asks the DNS server to find the IP address of “www.upholdjustice.org”. The second set of data represents the reply from the DNS server, indicating that the IP address you requested is 207.44.152.163.

So the second DNS request did not go through your proxy. Instead it went to the DNS server at your local ISP—let’s call it “www.shouguang.cn”. Now this DNS server will do more than just look up the IP address corresponding to the web address. Like many DNS servers around the world it will, in addition, determine whether the web address lies on Big Brother’s blacklist. And, in the present case, “www.upholdjustice.org” is not a web site that any “patriotic” Chinese citizen would wish to visit!
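
If you don’t have a sentinel to hand, a crude one can be improvised. The sketch below is our illustration, not the tool that produced the output above: it uses the third-party Python library scapy, needs packet-capture privileges, and simply prints every DNS query that leaves the machine.

# A crude "sentinel": watch for DNS queries leaving this machine and print
# who asked which server for what. Requires the third-party scapy library
# ("pip install scapy") and packet-capture privileges.

from scapy.all import sniff, IP, UDP, DNS, DNSQR

def report(pkt):
    # qr == 0 marks a DNS query (as opposed to a response)
    if pkt.haslayer(IP) and pkt.haslayer(DNS) and pkt[DNS].qr == 0 and pkt.haslayer(DNSQR):
        print(f"{pkt[IP].src}:{pkt[UDP].sport} >> "
              f"{pkt[IP].dst}:{pkt[UDP].dport}  "
              f"A? {pkt[DNSQR].qname.decode()}")

# Any name printed here that you expected to be resolved at the far end of
# the proxy chain is a leak of exactly the kind shown in the output above.
sniff(filter="udp port 53", prn=report, store=False)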

But I thought Tor would warn me?

If you’re using the Tor network as your proxy then you will have been told that Tor warns you when DNS look-ups are done locally. Let’s take a look. Right-click on the TorCP icon, then select “Tools”, followed by “Message History”. A pop-up window called “Recent Log Messages” will appear. Now if Tor had detected a local DNS look-up then you would find in the log a message similar to the following:

[Warn] fetch_from_buf_socks(): Your application (using socks4 on port 14839) is giving Tor only an IP address. Applications that do DNS resolves themselves may leak information. Consider using Socks4A (e.g. via privoxy or socat) instead.

But, even though you have been using Privoxy as suggested, you will find no warning message in this case. So even though the DNS look-up has been local, Tor has not detected it. As far as Tor is concerned everything is working perfectly!
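
The distinction Tor is drawing in that warning is easy to see directly. In the sketch below, which uses the third-party PySocks library and assumes a Tor SOCKS listener on 127.0.0.1:9050 (Tor’s default port), the first socket resolves the hostname locally and hands the proxy a bare IP address, while the second passes the hostname along the chain to be resolved remotely; the hostname is purely illustrative.

import socks  # third-party PySocks: pip install pysocks

HOST = "www.upholdjustice.org"
PROXY_ADDR, PROXY_PORT = "127.0.0.1", 9050

# Leaky: with rdns=False the hostname is resolved locally (a DNS query goes
# straight to the ISP's resolver) and only the resulting IP reaches the
# proxy, which is the situation Tor's warning describes.
leaky = socks.socksocket()
leaky.set_proxy(socks.SOCKS4, PROXY_ADDR, PROXY_PORT, rdns=False)

# Safe: with rdns=True the hostname itself travels to the proxy (SOCKS4a /
# SOCKS5 behaviour) and is resolved at the far end of the chain.
safe = socks.socksocket()
safe.set_proxy(socks.SOCKS5, PROXY_ADDR, PROXY_PORT, rdns=True)

for label, sock in (("leaky", leaky), ("safe", safe)):
    try:
        sock.connect((HOST, 80))
        print(label, "connected")
    except OSError as exc:
        print(label, "failed:", exc)
    finally:
        sock.close()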

More to Adobe than meets the eye!

Let’s examine what’s gone wrong:

Conversation Piece

Browser: “Hey Adobe, you there?”

Adobe: “Yep, ready and waiting.”

Browser: “Well, User wants to display ‘www.upholdjustice.org/English.2/s_i_investigation.pdf’.”

Adobe: “Okay! I’m on my way…now, let’s see…this Windows machine must have an Internet connection…let’s have a look…okay, the default Internet connection is to ISP ‘www.shouguang.cn’…and here’s the address of its DNS server. Hi there DNS server. Need the IP address of ‘www.upholdjustice.org’.”

DNS Server: “The IP address you need is…wait for it…yes, it’s 207.44.152.163.”

Adobe: “Ta DNS.”

DNS Server: “Hey Big Brother, did you know that someone at IP 166.111.232.19 is trying to download something from ‘www.upholdjustice.org’?”

Big Brother: “No I didn’t. But I do now!”

Adobe: “Hmm…this browser seems to have some proxy settings…perhaps I should use them instead…hey there Proxy, can you fetch the contents of ‘www.upholdjustice.org/English.2/s_i_investigation.pdf’?”

Proxy: “Sure can do…coming…coming…here it is.”

Adobe: “Ta Proxy.”

Adobe: “Hey Browser. I’ve got what you were looking for. Just move over for a moment so that I can squeeze into your window and display this pdf.”

Browser: “Hey User. Deed done. Here’s that pdf you were looking for.”

Well, that’s only a guess as to what’s happening. What we do know from the sentinel is that a local DNS look-up has been performed, but that the network traffic involved in fetching the pdf file passes through the proxy. Yet the proxy does not seem to be getting the IP address alone, since if that were the case Tor would have produced a warning message.

Exactly what’s happening here we can’t be sure of without knowing the internal workings of Adobe. It seems that your Internet browser doesn’t fetch the pdf for you; it simply sub-contracts the task to Adobe. Apart from passing the request to Adobe at the beginning, and providing a window for Adobe to display the file at the end, your browser has done nothing. Adobe, like many software programs these days, is Internet savvy. Adobe seems to be bypassing the proxy when it comes to doing the DNS look-up. However, rather than just passing the IP address to the proxy, it seems to be passing the full request, so that the proxy does a remote DNS lookup before it fetches the pdf file (that’s the only explanation that seems to be consistent with the lack of a warning message from Tor).

So just think back on all those pdf files that you’ve downloaded over the years. Did you ever download anything that did not have “Approved by Big Brother” stamped on it? Well, for most people who use proxies the situation is not too bad: first, pdf files come up in web searches far less frequently than html pages (and if you select the html version of a pdf, when it’s available, then all will be well); second, most sites that—how shall we put it discreetly—contain material that Big Brother would not approve of are less likely than their “kosher” cousins to make use of pdfs. Nonetheless, if you’re doing a little research on privacy or on your regime’s shortcomings, then—as the above example illustrates—it won’t be long before you download an “inappropriate” pdf, thereby inviting Big Brother to “re-educate” you.

It was obvious, wasn’t it?

Once you think about it, it’s all rather obvious. If you had been using Adobe directly you would not have fallen into the trap: if you’d opened up Adobe, then before you started to type a web address into its Internet search box, you would have paused, and asked yourself, “How can I make Adobe use my proxy?” You’d have been looking to see if Adobe had any proxy options among its preferences, and, if not, then you’d be well on your way to socksifying it.

Now this problem is likely to occur whenever a browser calls some Internet enabled program in the background and then displays the results in the browser window. The vast majority of people who use proxies will assume that once they have set up their browser to use a proxy correctly, then anything that they do with the browser will also use the proxy correctly. And as web-based computing—a la Google—is becoming more and more common, the browser is becoming the interface for more and more applications, so this problem is likely to grow.

The only satisfactory solution is to ensure that you either (1) use a sentinel; or (2) use a firewall to block all outbound connections to IP addresses other than that of your proxy.
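
In the spirit of option (2), a rough connection-level audit is easy to script, though it only lists offenders rather than blocking them, and it cannot see UDP traffic such as DNS, so it complements rather than replaces a sentinel. The sketch below uses the third-party psutil library; the allowed addresses are an assumption (a local Tor and Privoxy chain) and should be adjusted to your own set-up.

import psutil  # third-party: pip install psutil

# Addresses we expect outbound traffic to go to: a local Tor SOCKS port and
# a local Privoxy port. These are assumptions; adjust them to your own chain.
ALLOWED = {("127.0.0.1", 9050), ("127.0.0.1", 8118)}

for conn in psutil.net_connections(kind="inet"):
    if conn.status == psutil.CONN_ESTABLISHED and conn.raddr:
        remote = (conn.raddr.ip, conn.raddr.port)
        if remote not in ALLOWED and not conn.raddr.ip.startswith("127."):
            print("bypassing the proxy:", "pid", conn.pid, "->", remote)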

How to download PDFs using a Proxy

Now different versions of Adobe may have different preferences, so we’ll deal here with Adobe Reader v7. Open up Adobe and then select in turn the following menu items:

  • Edit => Preferences => Internet => Internet Settings

The “Internet Properties” window that appears is the standard set of Internet property tabs that you get when you select “Internet Options” from the Control Panel. This version of Adobe has no preferences to directly use a proxy.

It’s possible to change the settings of your default Internet connection so as to use a proxy. Select the “Connection” tab, select the default Internet connection, click on the “Settings” button, tick the “Use a proxy server for this connection” check box under the “Proxy server” sub-heading, and then fill in the “Address” and “Port” fields with the values used by your proxy server.

Now, we’ve found that setting up a proxy in this manner works with other Internet enabled software, but for some reason it doesn’t work with Adobe. When we tried it our sentinel still recorded the local DNS look-up going out the door to our ISP’s DNS server. Adobe seems to ignore the proxy settings and just use the default Internet connection, as is. But try it out, it might work for you—just make sure you’ve got a means to verify that it is working correctly.

The alternative is to socksify Adobe, using a product like “SocksCap”, available here, or “FreeCap”, available here (we’ll explain more about socksification another day, but these products are easy to install and to use). We’ve tried “SocksCap” with Adobe, and it works fine, with our sentinel showing no DNS leakage. Just start up Adobe from within the SocksCap window, then open your browser and start browsing (but, as always, use a sentinel to verify that everything is working as intended).

Postscript

And as to Ms. Li Yinping? Yes, there was a real Ms. Li Yinping. And yes, she used to live in Majia Village, Shouguang City, Weifang Region, People’s Republic of China. But Li was a member of the Falun Gong, just an ordinary member, peacefully practising her religion. She couldn’t even be called a dissident for she had never protested against the regime or its edicts. But in June 2001 she was arrested by the local police, and after being tortured for several days with electric batons she died. She remains just another statistic amongst the millions of people who have been tortured and executed for displeasing the regime since the founding of The People’s Republic of China.

Tiffium & Morphium – Bigus Brutium-Absentium Zonium

Dual-Purpose Software

Typology of disclosure: overt, sequestered, covert, invisible—ethical considerations—manipulating the aleatory pool. ... Masochistium Clickium Hic!

Introduction

There is one aspect of aleatography that we have yet to address. We need a means of hiding the software that we use to extract software from within an aleation. We also need a means to create and distribute pure aleations without giving the impression that we have any interest in privacy. These objectives can be achieved through the use of dual-purpose software.

Dual-purpose software performs two very distinct functions: the primary function can be anything whatsoever, as long as it is not associated with privacy and is not likely to cause offence to any Big Brother; the secondary function is one that is associated with privacy, either because it helps to develop the aleatographic infrastructure or because it implements some information hiding technique.

The great advantage of dual-purpose software is the defence of plausible deniability that it affords to anyone who possesses it. Since most people will use the software for its primary purpose alone, and may well not even know of the software’s secondary purpose, there is no reason to suspect anyone who possesses it of having an “unhealthy” interest in privacy.

Typology of Disclosure

We can divide dual-purpose software into two categories according to whether or not it discloses its secondary purpose: disclosing and non-disclosing.

We can divide disclosing dual-purpose software into four categories according to the manner in which information about its secondary function is disclosed to potential users: overt, sequestered, covert, and invisible.

Overt

If disclosure is “overt” then both the primary and secondary purposes of the software are proclaimed for all the world to hear. The home page might start off by saying, “This software has two distinct purposes. You can use it to create crossword puzzles, or you can use it to hide information within images.” With overt dual-purpose software it is very likely that most users will understand that it can perform two unrelated functions.

Sequestered

If disclosure is “sequestered” then information about the secondary purpose of the software is made available to its users, but in such a manner that the average user is unlikely to find it. For example, the information may be buried in the depths of the documentation under an obscure sub-heading; and to initiate the secondary function it may be necessary to click on a button with some enigmatic label, having first ticked a certain check-box that lies buried within some collection of option tabs. With sequestered disclosure the vast majority of people using the software will be unaware of its secondary purpose.

Covert

If disclosure is “covert” then information about the secondary function of the software will not be found within the software itself; and to initiate the secondary function it will be necessary, for example, to enter a specific code into a specific field that as far as the primary function is concerned serves some other purpose. The documentation needed to initiate and make use of the secondary function will not be available on the site from which the software is downloaded but will be distributed amongst privacy forums or, perhaps, only to select groups of individuals. No ordinary user will be aware of, or able to initiate, the secondary function. In the absence of documentation, it would be necessary to disassemble the executable code in order to determine that a secondary function exists.

Invisible

If disclosure is “invisible” then information about the secondary function of the software is never explicitly documented. Instead, the reader can infer from a description of how the software works that it could be used to support some secondary function. Unlike the other three methods of disclosure, this method protects the author of the software from accusations that he is writing software to support aleatography or information hiding.

Comparison

These disclosure mechanisms serve different purposes. In regimes that are merely restrictive, overt dual-purpose software is the best choice: no one can prove that a user who possesses the software is making use of its secondary function, yet the existence of that secondary function will be widely known.

Within proscriptive regimes, sequestered and covert dual-purpose software are far more useful. It is entirely plausible that an individual who possesses such software has no knowledge of its secondary function. On the other hand, fewer individuals are likely to discover that secondary function.

Of course, it’s possible to incorporate the same secondary function into different software products that are made available from different websites, where one product makes overt disclosure and the other does not. The website offering the product with overt disclosure could then mention the existence of its counterpart, and where to obtain it, for the benefit of those individuals living under proscriptive regimes.

Of particular interest is non-disclosing dual-purpose software. If the secondary function can be automated, then there is not even a need for documentation from which the existence of a secondary function might be inferred. If some automatic, non-disclosing dual-purpose software tool became popular, then its secondary function would be executed very frequently. We see this type of software as playing a very useful role in the creation and dissemination of pure aleations.

Ethical Considerations

Now, just because Big Brother is entirely lacking in morals doesn’t mean that we have to follow suit! The first issue is that certain kinds of dual-purpose software could put some users at risk. With dual-purpose aleatory software there should be no problem, but with dual-purpose information hiding software there might well be. What if the software is unwittingly downloaded by someone living under a proscriptive regime, and is subsequently found by Big Brother? If disclosure is sequestered or covert then its discovery would not in itself arouse suspicion, so there should be no difficulty. However, if disclosure is overt, then Big Brother may well conclude that the person who downloaded the software was aware of its secondary purpose. So if you’re making software with overt disclosure available for download, then we suggest you succinctly display the information about its secondary function alongside a check box that the user is required to tick before the download starts.

The second issue concerns non-disclosing dual-purpose software that performs its secondary function automatically. The secondary function should not do anything that would compromise the user through the use of information that the software may gain in carrying out its primary function, and its use of computer and network resources should not be excessive.

Manipulating the Aleatory Pool

The single most valuable secondary function that dual-purpose software can perform is to create and maintain an “aleatory pool”. An aleatory pool is a collection of one or more aleations. Typically these will be stored in some working directory on the hard disk. The aleatory pool is created and manipulated by the software as part of its primary function, so the primary function needs to be one that can make use of random data. The user can insert ciphertext that masquerades as an aleation into the aleatory pool, or remove it, using the standard file copy functions provided by the operating system. The aleatory pool can be used for (1) storage; (2) transformation; and (3) communication.

A user can copy ciphertext obtained from some other source into the aleatory pool. As the ciphertext will be indistinguishable from the aleations produced by the software, the user has the perfect storage location for encrypted material.
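
By way of illustration, here is a minimal sketch of such a pool: a directory of fixed-size files filled with operating-system randomness, into which a headerless ciphertext file of the same size can be dropped. The directory name, file size, and file count are arbitrary choices of ours; a real dual-purpose program would create and consume such files in the course of its primary function.

import os
import shutil
import secrets

POOL_DIR = "aleatory_pool"      # illustrative name
FILE_SIZE = 64 * 1024           # 64 KiB per aleation (arbitrary)
POOL_COUNT = 16                 # arbitrary

# Create the pool: files of operating-system randomness.
os.makedirs(POOL_DIR, exist_ok=True)
for i in range(POOL_COUNT):
    with open(os.path.join(POOL_DIR, f"block_{i:02d}.bin"), "wb") as f:
        f.write(os.urandom(FILE_SIZE))

def store(ciphertext_path: str) -> str:
    """Overwrite a randomly chosen aleation with a headerless ciphertext file.

    The ciphertext must be the same size as an aleation, otherwise it will
    stand out from the rest of the pool.
    """
    name = f"block_{secrets.randbelow(POOL_COUNT):02d}.bin"
    target = os.path.join(POOL_DIR, name)
    assert os.path.getsize(ciphertext_path) == FILE_SIZE
    shutil.copyfile(ciphertext_path, target)
    return target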

If the software uses the aleations to modify some other data, such as an image, in a reversible manner, then by substituting ciphertext for an aleation a user would be able to insert the ciphertext into, and later retrieve it from, the data. The modified data would provide an alternative means of storage and could possibly act as a useful carrier of the ciphertext for the purposes of communication.
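
The reversibility that makes this work can be sketched in a few lines. Below, an aleation (or a ciphertext posing as one) is mixed into some carrier data with XOR, which is its own inverse: applying the block again restores the carrier, and applying the carrier recovers the block. This illustrates the reversibility only; it is not a recommendation of XOR for any particular product.

import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

carrier = bytes(range(256)) * 64        # stand-in for some other piece of data
block = os.urandom(len(carrier))        # an aleation, or ciphertext posing as one

modified = xor_bytes(carrier, block)            # the reversible modification
assert xor_bytes(modified, block) == carrier    # undo it with the block
assert xor_bytes(modified, carrier) == block    # or extract the block with the carrier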

Communication software that creates and manipulates an aleatory pool performs some of the basic functions of a janionic network. Even users who have no interest in privacy are still creating and exchanging aleations. If the software became popular then it might well generate a nascent janionic network consisting of millions of users. Users with an interest in privacy could then replace aleations with ciphertext and have it shipped to a recipient’s aleatory pool as part of the software’s primary function.

Tiffium & Morphium – Bigus Brutium-Absentium Zonium

Aleatography

Aleations: the people-friendly aliens who will consume Big Brother from the inside out—the pure and patterned varieties—why the aleation provides a good approximation to the janion—why we need to separate information hiding and aleatography if we are to conquer Big Brother—why distributed aleatography is best. ... Masochistium Clickium Hic!

Introduction

Now in theory janography is all very well, but without a practical implementation we won’t have Big Brother quaking in his over-sized boots. And to start with we must have a concrete candidate for a janion.

Aleations

If we’re going on a journey, then the best place to start from is where we happen to be right now. Of the four dimensions of information hiding, the only one that has been widely implemented to date is HEye. And these implementations have been based on cryptography. And what do the techniques that are employed to encrypt data have in common? Well, when we remove any identifying headers and footers the resulting ciphertext looks just like “aleatory data”, like random or pseudo-random data. So let’s take aleatory data, or the “aleation”, to be the practical embodiment of a janion.

Pure and Patterned Aleations

Aleatory data comes in two flavours: the pure and the patterned. The former is random or pseudo-random data, through and through. The latter consists of mixed data, with a certain percentage of the data following some non-random pattern, and with the remaining percentage consisting of pure aleatory data. In practice, the patterned aleations that are found on most computers consist of image, audio, and video files. Typically, the higher portion of each byte, or set of bytes, is patterned and represents file content, while the lower portion is just random noise generated by physical processes within the camera or microphone.

Pure aleations are efficient carriers of information, as we can produce ciphers where the length of the ciphertext is comparable to the length of the plaintext that has been encrypted. Patterned aleations, on the other hand, offer poor storage densities as the percentage of “noisy bits” present in a typical media file is very small compared to the percentage that represents content.
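
The density point is easy to see in a sketch. Below, a hidden payload rides in the least-significant bit of each 8-bit sample, one hidden bit per carrier byte, which is 12.5% at the very best; real media formats, with their headers and codecs, do considerably worse. The “samples” here are just raw bytes, so no actual image or audio format is parsed; this illustrates the principle, not any particular tool.

import os

def embed(samples: bytes, payload: bytes) -> bytes:
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    assert len(bits) <= len(samples), "carrier too small"
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit           # overwrite the noise bit
    return bytes(out)

def extract(samples: bytes, length: int) -> bytes:
    bits = [samples[i] & 1 for i in range(length * 8)]
    return bytes(sum(bits[j * 8 + i] << i for i in range(8)) for j in range(length))

samples = os.urandom(4096)                       # stand-in for raw media samples
secret = b"meet at the usual place"
assert extract(embed(samples, secret), len(secret)) == secret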

Inscrutability

By definition, aleatory data, whether or not it is masquerading as ciphertext, looks the same to any statistical technique that might be used to characterize it, and so the criterion of “inscrutability” is satisfied.

Versatility

If the only constraint on an information hiding technique is that it should produce aleatory data, then we have a vast number of techniques to choose from, so the criterion of “versatility” is satisfied.

Duality

Many computer processes produce pure aleations, and it would not be practical to ban these processes, or to modify them so that they do not. However, it would be very helpful if there were rather more pure aleations on the average computer than is the case at present, particularly in those directories that are used to store personal information. But this shortcoming is one that we can do something about.

As far as patterned aleations are concerned then we are spoilt for choice. The computer world is overflowing with image, audio, and video files. Almost everyone has, or can reasonably be expected to have, such material in directories that contain personal information. Patterned aleations therefore satisfy the criterion of “duality” extremely well.

Aleatory Software

We can easily store any software tools that are needed to manipulate aleations inside aleations. Then we can use dual-purpose software to extract the aleatory software and convert it into runnable code.

Aleatory Exchange

Pure aleations are rarely exchanged. Such transfers could, of course, be hidden inside encrypted tunnels, but such tunnels are not the norm for Internet communications, and the presence of encrypted communications is easily detected by monitoring software. Now, the occasional and short duration use of SSL while downloading payment pages on merchant sites is to be expected. But the frequent use of SSL, its use for lengthy periods, or its use while downloading non-payment pages would soon be flagged as anomalous behaviour. So, we need to take active measures to increase the exchange of pure aleations.

Patterned aleations are extremely widely exchanged and are well suited for the purposes of aleatory exchange provided that the volume of plaintext that needs to be hidden inside them is relatively small.

A Good Starting Point

Aleatory data provides a good approximation to what we have required of a janion. There is a trade-off between pure and patterned aleations: the former has some weaknesses when it comes to duality and aleatory exchange, but offers excellent storage densities. The latter is excellent on all janioning criteria, though it only offers a low storage density.

We should therefore adopt a strategy of promoting the creation and exchange of pure aleations to extend the janographic infrastructure. And provided we can think up a good reason why a program needs to produce aleations, we should be able to produce them with impunity, even within those regimes that proscribe encryption. When it comes to using information hiding techniques an individual can then choose between pure and patterned aleations, depending on the level of risk involved should the hidden information be discovered.

Divide and Conquer

It is important to separate the development of aleatography from the development of the various techniques that might make use of it for information hiding. Proscribing particular information hiding techniques is easy to do. Proscribing aleatography—it would be tantamount to banning Internet access and all personal computing—is not practical in any country that hopes to develop and maintain a modern economy, so with the exception of a few maverick states, such as North Korea, the roll-out of a janographic infrastructure that is based on aleations should meet with few obstacles. However, if the same individuals and organizations are too closely involved in both activities then the banning of information hiding by a particular regime might also curtail the development of aleatography.

Distributed Aleatography

Tyranny flourishes in a hierarchical environment; freedom flourishes in a distributed one. The Internet as a janographic infrastructure illustrates this point very well. The Internet was not developed with privacy in mind, but its distributed nature, one that spans the fiefdoms of the world’s Big Brothers, makes it very useful for constructing privacy solutions. Had governments any inkling of what this once military/academic network would become then they would have strangled it at birth. As the Internet has now become essential to the successful functioning of a modern economy, it cannot be destroyed; but its use can be monitored, and its use for certain purposes and by certain individuals can be prevented—witness the successful attempt of China to curtail access to those web sites that it disapproves of. By developing aleatography we will be steadily reducing the capacity of regimes to extract useful information from the web traffic that they monitor, thereby strengthening the Internet still further.

It’s important that the development of aleatography is done in a distributed and uncoordinated manner. While banning aleatography would be very difficult, targeting the individuals who develop it would not. Because aleatography relies on dual-purpose software, it does not advertise or draw attention to itself. It is one thing for Big Brother to know that aleatography exists; but it is quite another for Big Brother to appreciate the threat that aleatography poses to his very existence. If it is developed independently by individuals and by small groups it can grow and spread in a relentless and invisible manner, hidden even from Big Brother’s Sauron-like eye. If we have a fair wind in our sails, then by the time Big Brother fully appreciates the nature of the threat it will already be too late for him to do anything about it.

Tiffium & Morphium – Bigus Brutium-Absentium Zonium

Janography

Why the foundations are more important than what is built upon them—the janion: the building block of privacy solutions—janionic properties: inscrutability, versatility, duality—the bootstrapping of janioning software—janionic exchange: nodal and pseudonymous public key pairs, closed and open routes, routing keys, intermediate nodes, inbound and outbound nestings, high-frequency subnets, directed and random janionets, drop-off points, forked routing—network properties: nodal myopia, self-monitoring and self-adjusting, consensus—active attacks: delaying, injection, deletion, transformation—passive attacks: collation, tracing, flow rate, timing, penetration. ... Masochistium Clickium Hic!

Focusing on the Foundations

Now the future is not looking bright. But if the four dimensions of information hiding could be implemented, then even though the contest would still be one of David versus Goliath, we would, at least, have a fighting chance. So how easy is it to implement the four dimensions of information hiding? That depends on the structure of the world, particularly the online world, in which we live. So rather than considering particular techniques for information hiding, let’s take a step back and ask the more fundamental question: what infrastructure is needed so that privacy solutions can be easily implemented in a variety of different ways? This underlying infrastructure has nothing in itself to do with information hiding, but it can greatly facilitate or frustrate our efforts—when the big bad wolf wants to spy on the domestic arrangements of Kermit and Miss Piggy perhaps a “house of straw” built high up on top of a rocky pillar will serve them better than a “house of brick” built on the quicksands of today’s Internet. The development of an infrastructure that facilitates the implementation of information hiding is what we call “janography”.

Focusing on janography rather than on information hiding has two benefits: (1) it does not tie our hands as to the “how” of information hiding; and (2) because it is peripheral to, and clearly separated from, the “how” its development is less likely to be proscribed by Big Brother.

In essence, the battle for freedom is being waged on two fronts. Big Brother hopes that his steadily increasing monitoring of individuals will go unnoticed by society and will eventually reach a tipping point where privacy becomes impossible. Big Brother’s opponents hope that their steadily improving janographic infrastructure will go unnoticed by Big Brother and will eventually reach a tipping point where the elimination of privacy becomes impossible. If we can get this janographic infrastructure sufficiently well established, then Big Brother will be powerless to stop the implementation of privacy solutions that are based upon it. We’ll have Big Brother by the “short ’n curlies”, and we’ll be able to squeeze to our collective heart’s content!

Janion

Let’s give a name to the building block out of which we can construct privacy solutions that satisfy the four dimensions. Let’s call it a “janion”. A janion is something inside of which we can hide information. Any particular janion may, or may not, contain hidden information. And what properties should a janion possess?

Janionic Properties

Inscrutability—it must be possible to hide information inside a janion without distorting any of its natural properties. Janions that contain hidden information should be indistinguishable from janions that do not.

Versatility—a janion must be sufficiently flexible so that it can be used as the basis for many different privacy solutions. It should be possible to hide information inside a janion in many different ways, since we want to decouple the infrastructure that facilitates the hiding of information from the information hiding techniques that make use of it.

Duality—a janion must also be produced as a by-product of computer processes that have nothing whatsoever to do with privacy. And it must be impractical to ban these processes, or to enforce their modification in such a way so that they no longer produce janions.

In other words a janion—like the Roman god “Janus” after whom it is named—must face two ways: while being flexible enough to allow information to be hidden inside it in a variety of different ways, it must not draw attention to itself; it must be the embodiment of the ordinary, the commonplace, the unremarkable—it must be something that you’d expect to find on any computer.

Imagine a world containing trillions of janions residing on computers everywhere. In such a world Big Brother has a problem. Some of these janions are being used by people to hide information. But it’s impossible to tell which janions are and which janions are not being used for this purpose. Hence, on the basis of the hidden information alone, it is not possible to tell which people are hiding information and which are not.

Janioning Software

Thus far we have a world where individuals can keep private matters private—we have the world of H2Eye. But if Big Brother inspects our computers he will find the software that we use to hide information inside janions. So, in order to satisfy the software hiding requirement of H3Eye, we require certain properties of the software that we use to manipulate janions:

Bootstrapping

The software that is used to manipulate janions must be stored on a computer in the form of a janion, and it must be possible to extract the software from the janion and convert it into a runnable form using dual-purpose software: software whose primary, and ostensibly only, function is totally unrelated to the nurturing and promotion of privacy.

Bootstrapping makes it impossible for an adversary to determine whether or not we possess the capacity to manipulate janions, including the capacity to hide information inside janions.
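
As a sketch of what such a bootstrap might look like: the function below derives a key from a passphrase and decrypts a headerless, random-looking blob back into a runnable script. It uses the third-party Python “cryptography” package, and the file layout, iteration count, and file names are illustrative conventions of ours rather than features of any existing tool. A wrong passphrase simply yields more random-looking bytes, which is precisely the point.

import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def recover_tool(blob_path: str, passphrase: str, out_path: str) -> None:
    blob = open(blob_path, "rb").read()
    # The layout (16-byte salt, 16-byte nonce, body) is a private convention,
    # not a header: to an observer all of it is just random-looking bytes.
    salt, nonce, body = blob[:16], blob[16:32], blob[32:]
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 200_000)
    plain = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor().update(body)
    open(out_path, "wb").write(plain)

# e.g. recover_tool("aleatory_pool/block_07.bin", "a long passphrase", "tool.py")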

Janionic Exchange

One of the characteristics of present day governments is their capacity to automatically construct extensive networks, consisting of nodes that represent individuals, and connecting links that represent communications between those individuals. Even though the contents of the individual communications may not always be known, these networks are very easily used to manipulate, suppress, and control those individuals who would otherwise draw attention to governmental failings and corruption. In this manner, governments seek to subvert, undermine, and impair the democratic accountability that they owe to their citizens.

Now unless janions that don’t contain hidden information are routinely exchanged in large numbers, the exchange of janions would, in and of itself, betray the likely presence of hidden communications between senders and receivers, allowing governments to create networks of people who communicate with one another and who also have an interest in privacy. So to implement the communications aspect of H3Eye we need to ensure that janions that don’t contain hidden information are exchanged in large numbers.

However, even if janions are exchanged in large numbers for reasons unrelated to the protection of people’s privacy, these exchanges still allow governments to create networks that detail who is communicating with whom. We have failed to be “eternally vigilant”, and the price we have paid for our inattention is that governments have accrued the powers they need to spy upon us with impunity. But to wrest back those powers is likely to prove well-nigh impossible, at least in the short term, so a more subtle approach is needed. Now the monitoring of communications by governments does not, in and of itself, diminish our freedoms; instead it is the capacity of governments to extract useful information from the data they collect—it is this information which acts as the oxygen of tyranny. What we propose is a method of janionic exchange that makes the construction of any kind of “informative” network as a result of communications monitoring impossible.

Now, take a deep breath; let it out; and then employ those speed-reading skills—when we applied our “Susie Test” to the following section the answer we received was “Yuck”!

Janionic Network

A mechanism that moves janions around a network according to the following principles:

  • A network consists of nodes. Each node has a “nodal identifier”, a “physical address”, and an associated “nodal public-key pair”. The triplets consisting of nodal identifier, physical address, and nodal public-key are made available to all nodes and are held in “distributed directories”. The “nodal private key” associated with a node is known only to that node. In addition, a node may have one or more “pseudonymous public-key pairs”: the private key of a pair is known only to the node, as before; the public key, while circulated in public, is not associated with the identifier or the physical address of the node and cannot be used to identify it.
  • A “route” consists of an ordered collection of network nodes. A route is “closed” if the last node in the route is the same as the first; otherwise, it is “open” (almost all the routes used for janionic exchange will be closed). The “originating” node is the first node of an open route, and the common first and last nodes of a closed route. All other nodes are “intermediate”.
  • Associated with each intermediate node on a route is a set of “instructions” that tell the node what operations to perform and provide the data that is needed to perform these operations. Mandatory data elements that appear in all instructions are the physical address of the next node on the route and a one-time “routing key”. An optional data element is a “drop-off identifier”.
  • Associated with each intermediate node on a route is a “report” that is created by the node, and which details the results of the operations that it has performed.
  • An “outbound nesting” for a route begins by taking the “instructions” for the last intermediate node and encrypting them using the nodal public key of that node. Then each intermediate node is taken in turn in reverse order and its instructions are appended to the outbound nesting before being encrypted using its nodal public key. An outbound nesting is constructed at the originating node of a route when the route is being planned (a code sketch of both nestings follows this list).
  • An “inbound nesting” for a route begins by taking the “report” for the first intermediate node and encrypting it using the routing key allocated to that node. Then each intermediate node is taken in turn in forward order and its report is appended to the inbound nesting before being encrypted with the routing key allocated to the node. An inbound nesting is constructed in stages, with each intermediate node that comprises a route making a contribution.
  • A janionet consists of a janion, an outbound nesting, and an inbound nesting, each of which has a fixed size.
  • A random janionet is prepared automatically by selecting at random from the network a fixed number of intermediate nodes which are to form the associated route. A directed janionet is prepared by the operator of a node for the purposes of communicating with some other node.
  • A janionet is sent from the originating node to the first intermediate node on the route. That node decrypts the outbound nesting of the janionet using its nodal private key, and extracts its instructions and the outbound nesting for the next node. The default instructions for a node are as follows. If the instructions contain the inverse routing keys of previous nodes, these are applied to reconstruct the janion dispatched by the originating node. Then the node attempts to decrypt the janion using its nodal private key to determine if the janion contains a message for the node. If the node wishes to make a reply it modifies the janion accordingly. The node then encrypts the janion (the modified version in the case of a reply, the decrypted version in the case of a message with no reply, and the version received from its predecessor node in the absence of a message) using the routing key found in the instructions. It updates the inbound nesting by appending the timestamp at which the janionet was received and encrypts the combination using the routing key. It creates a new janionet consisting of the encrypted janion, the updated inbound nesting, and the extracted outbound nesting. This janionet is then held in a storage area, the janionic pool, for a period of time selected at random from some probability distribution (or as otherwise directed by the instructions) before being forwarded to the next node on the route.
  • A janion can be encrypted using the nodal public keys of multiple intermediate nodes when no replies are expected. Different messages for multiple intermediate nodes can be placed inside different hidden volumes within the same janion by encrypting them with the nodal public keys of different nodes, allowing each node to reply by modifying its own hidden volume.
  • A node may be given instructions to clone, replace, or destroy the janionet, or to create one or more new janionets.
  • A node may be instructed to send the janionet along multiple routes.
  • A node may be instructed to hold a janion tagged with an associated identifier in its janionic pool for a specified period of time and then destroy it.
  • A node may be instructed to search its janionic pool for a janion corresponding to a specified identifier and then forward that janion along a specified route.
  • A node may be instructed to ask the nodal operator to take some manual action (for example, to remove a janion from the janionic pool and to forward it elsewhere by snail mail).
  • Each node records statistics on the rates at which janionets are absorbed and emitted. When janionic flow rates are disturbed by the origination of directed janionets, by the actions of an adversary, or by network malfunction, a node will adjust the rate at which it originates random janionets so that the stochastic properties of the janionic flow rates into and out of the node remain unchanged.
  • When a node is added to the network the nodal operator asks the node to automatically negotiate membership of one or more “high-frequency subnets”. In addition to the standard random exchanges with nodes selected at random from the network as a whole, the node will also make random exchanges at much higher frequencies along routes chosen at random from the high-frequency subnets. At intervals, a node will randomly remove some nodes from each subnet and add a similar number of nodes selected at random from the network as a whole.
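
The two nestings are easier to follow in code than in prose. In the sketch below, symmetric Fernet keys from the third-party Python “cryptography” package stand in for both the nodal public keys and the one-time routing keys; a real implementation would use public-key encryption for the outbound nesting. Only the layering is being illustrated: reverse order on the way out, forward order on the way back.

import json
import time
from cryptography.fernet import Fernet  # third-party: pip install cryptography

route = ["B", "C", "D", "E"]            # intermediate nodes, in route order
nodal_key = {n: Fernet(Fernet.generate_key()) for n in route}    # stand-in for nodal public keys
routing_key = {n: Fernet(Fernet.generate_key()) for n in route}  # one-time routing keys

def build_outbound(route, instructions):
    # Start with the last node's instructions and wrap each earlier node's
    # instructions around the result, working backwards along the route.
    nesting = b""
    for node in reversed(route):
        layer = json.dumps(instructions[node]).encode() + b"||" + nesting
        nesting = nodal_key[node].encrypt(layer)
    return nesting

def extend_inbound(nesting, node, report):
    # Each node appends its report and encrypts the whole thing with its
    # routing key, working forwards along the route.
    return routing_key[node].encrypt(json.dumps(report).encode() + b"||" + nesting)

instructions = {n: {"next": "A" if n == "E" else route[route.index(n) + 1]}
                for n in route}
outbound = build_outbound(route, instructions)

inbound = b""
for node in route:                       # as the janionet travels the route
    inbound = extend_inbound(inbound, node, {"node": node, "received": time.time()})

# Back at the originating node: peel the inbound nesting in reverse order.
for node in reversed(route):
    blob = routing_key[node].decrypt(inbound)
    report, inbound = blob.split(b"||", 1)
    print(node, report.decode())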

Sending and Receiving

Now you know why Susie said, “Yuck!” Don’t you hate it when people trot out these formal, “snooze-inducing” definitions? So, let’s consider a few examples to breathe a little life into what is rapidly becoming a turgid text.

Let A be an originating node. Let B, C, D, and E be intermediate nodes, with respective routing keys of b, c, d, and e. Then a closed route and its instructions can be represented by:

  • A => Bb => Cc => Dd => Ee => A

Send Only

Suppose node A wishes to communicate with node C. It encrypts the janion using C’s nodal public key and sends it along the following route: A => Bb => C-bc => Dd => Ee => A. Node C’s instructions contain the inverse of node B’s routing key. So when node C receives “b(janion)” from node B it first computes “-b.b(janion)”, to give “janion”, which it can then decrypt using its nodal private key. It then applies routing key “c” so that it passes “c.decrypt(janion)” on to node D. The originating node receives e.d.c.decrypt(janion). By applying the inverse of keys “e”, “d”, and “c” node A can verify that node C has received and successfully decrypted the message.

Note that though the sender knows the identity of the recipient, the recipient does not know the identity of the sender, unless the sender wishes to sign the message—clearly an advantage in a whistle-blowing context.
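
The key algebra above can be checked with a toy model in which each routing key is a fixed XOR keystream, so that applying a key a second time is the same as applying its inverse (the “-b” of the notation), and node C’s private-key decryption is reduced to a placeholder transformation. Everything here is illustrative.

import os

SIZE = 32
keys = {k: os.urandom(SIZE) for k in "bcde"}        # one-time routing keys

def apply(key: str, data: bytes) -> bytes:
    # XOR with a fixed keystream: applying the key twice undoes it, so the
    # same function serves for "b" and for its inverse "-b".
    return bytes(x ^ y for x, y in zip(data, keys[key]))

def decrypt(data: bytes) -> bytes:
    # placeholder for node C decrypting with its nodal private key
    return bytes(x ^ 0xFF for x in data)

janion = os.urandom(SIZE)                # as encrypted by A to C's public key

at_B = apply("b", janion)                           # B applies routing key b
at_C = apply("c", decrypt(apply("b", at_B)))        # C: -b, then decrypt, then c
at_D = apply("d", at_C)                             # D applies d
back_at_A = apply("e", at_D)                        # E applies e; janionet returns to A

# A inverts e, d and c, and verifies that C really did decrypt the janion.
assert apply("c", apply("d", apply("e", back_at_A))) == decrypt(janion)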

Send and Reply

Suppose node A wishes to communicate with node C, and node C wishes to make a reply. The only difference from the previous example is that, instead of passing “c.decrypt(janion)” on to node D, node C passes “c(reply)”. By applying the inverse of keys “e”, “d”, and “c” node A can read node C’s reply. As above, the sender is unknown to the recipient, but the two can still communicate.

Drop-Off Points

Two nodes A and L can communicate without knowing each other’s physical location. All they need to do is to exchange pseudonymous public keys, and then agree a drop-off identifier and a drop-off node. Node A encrypts the janion using node L’s pseudonymous public key and sends it along the following route: A => Bb => C-bc1 => Dd => Ee => A. The instructions given to node C tell the node to put the original janion, tagged with a drop-off identifier, into its janionic pool for a specified period of time and then delete it. Before the janion is deleted, node L sends a janion along route L => Mm => Cc2 => Nn => L. The instructions node L gives to node C are to replace the janion it receives from node L with the janion in the janionic pool that matches a drop-off identifier given to node C by node L. If this identifier matches that given to node C by node A, then node L will receive “n.c2(janion)” from which it can extract the message using the inverse of routing keys “n” and “c2” and its pseudonymous private key.

Any node that is prepared to act as a drop-off point will allow other nodes to publish the information needed to initiate such exchanges: a pseudonymous public key, a drop-off identifier, and a drop-off node, as well as any other information that the originator of the communication wishes. Any nodal operator can then initiate a communication. Ideally every node on the network would offer this facility, and senders and receivers would move the drop-off node around the network in a random manner, changing it with each new communication. If every node can function as a drop-off point and drop-off points are selected at random, then tracking down communications becomes much more difficult as it becomes necessary for an adversary to identify pairs of intersecting routes.

Network Properties

Forked Routing

The instructions for a particular intermediate node may request it to send a janion on one or more routes in addition to the main route. So if the main route is A => Bb => Cc => Dd => Ee => A, then node C’s instructions may require it to send the janion along route Cc => Xx => Yy => A as well. Multiple forks, forks within forks, and random forks are all possible. Any or all of the forked routes can be open as well as closed. Forks can be used for a number of purposes, as we’ll see shortly.

Nodal Myopia

Each intermediate node will know from which node it received a janion. And the outbound nesting will tell it to which node it should forward a janion. But that’s all it knows about the route along which a janion is travelling. It knows nothing about the other nodes. In particular, it knows nothing about the originating node.

The Self-Monitoring Network

The janionic network can monitor itself and automatically determine if a node is malfunctioning. It can also determine if a compromised node or an adversary with access to the links between nodes is perturbing the flow of janions in some manner. This self-monitoring is possible due to:

  • Closed Routes
  • Nodal Myopia
  • Inbound Nestings
  • Routing Keys

Closed routing allows any node to examine the workings of any other node, as whatever any intermediate node does will be fed back to the originating node. Nodal myopia minimizes the knowledge that a compromised node has of the routing, making it impossible for a compromised node to avoid carrying out its instructions without being detected. The encryption of inbound nestings and janions at each node using a one-time routing key ensures that the originating node has feedback to analyse, which helps it to determine which node, or which link, is malfunctioning.

Network Consensus

An adversary with access to a node or to a link between nodes can do various things to perturb the network. If the originating node suspects that a route has been compromised, it can send probes along multiple routes that pass through each of the intermediate nodes on the problematic route in turn, and thereby determine which node has been compromised. The compromised node can then be blacklisted by the originating node.

Furthermore, the originating node can send a message signed using its nodal private key across the network using a non-encrypted janion to inform other nodes of the compromised node. These nodes can then perform their own tests by way of confirmation. They can then, in turn, send out warning messages signed with their nodal private keys. As a result a “network consensus” regarding the compromised node will develop, and it can be blacklisted by the network as a whole, rather than just by individual nodes. It’s impossible to spoof such warnings without adding new nodes to the network. And any compromised nodes that send out denials would be in danger of disclosing their real purpose. Hence, as long as less than 50% of the network has been compromised, a correct network consensus regarding the status of an active compromised node can be obtained (compromised nodes can of course be silent and just record information, rather than perturbing the network).

Network Statistics

Let’s get mathematical—sorry Susie! Let’s assume that our network has n nodes, that each route possesses m intermediate nodes, and that on average r janions are originated per day by each network node. Assume that by default all routes are closed, so that the number of janions emitted by a node is the same as the number absorbed.

Let’s assume that the number of network nodes is large and that the number of janions emitted or absorbed by a node during a particular time interval has a Poisson distribution. This is a good choice since it means that the numbers emitted in non-overlapping time intervals will be stochastically independent. Hence, if r janions are originated on average by a node each day, then the probability distribution of the number originated during a time interval t has a mean of rt, a variance of rt, and the probability that exactly k janions will be originated during the time interval equals:

  • (rt)**k.exp(-rt)/k!

The total number of janions emitted or absorbed by a node in a time interval t will have a Poisson distribution with a mean and variance of r(m+1)t.

By way of example, consider a network of one billion nodes (after all, the Internet now has one billion users), with 9 intermediate nodes per route, and with each node originating 100 janions per day (imagine that each Internet user sends 100 emails per day to randomly selected email addresses). Hence, on average 1000 janions will be emitted and absorbed by each node every day, with a standard deviation of about 32. Assuming that the Poisson distribution is approximately normal, then about 95% of the time the number of emitted or absorbed janions will lie within two standard deviations of the mean. Hence, we can say that on 19 days out of every 20, the number of janions emitted or absorbed by a particular node will lie between 936 and 1064.
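
For readers who like to check the arithmetic, the following Python fragment reproduces the figures above using nothing beyond the numbers already quoted.

  import math

  r, m = 100, 9
  mean = r * (m + 1)            # 1000 janions emitted (and absorbed) per node per day
  sd = round(math.sqrt(mean))   # Poisson variance equals the mean, so sd is about 32
  print(f"mean {mean}, sd {sd}, ~95% band {mean - 2 * sd} to {mean + 2 * sd}")
  # -> mean 1000, sd 32, ~95% band 936 to 1064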

Given the large amount of variation in the numbers of janions emitted and absorbed it will clearly be easy to slip in the occasional directed janion without it being detected. This is exactly what we need from a good janographic infrastructure: even though it is possible to monitor communications, it is not possible to extract any useful information from that monitoring; every node behaves in exactly the same manner as every other node; and each node’s default behaviour is entirely devoid of “intentionality”.

Active Attacks

An adversary with access to a node or to a link between nodes can do various things to perturb the network:

  • Delaying Attack
  • Injection Attack
  • Deletion Attack
  • Transformation Attack

Delaying Attack

In principle, an adversary could systematically delay janions sent by a node and then examine another node to see if statistical variations in the arrival times of janions at the latter confirm that both nodes frequently lie on the same route. However, this type of attack will not work because both nodes that are the subject of the attack can detect it as it develops and can take action to undermine it.

Let’s assume that node A originates janions targeted at node B far more frequently than it would if the routes were chosen at random. Let’s assume that an adversary suspects that this might be the case, monitors all janions emerging from node A, delays them in some random manner, and then monitors the arrival times of all janions at node B. Now the adversary may be able to estimate how long it takes on average for a janion to travel from node A to node B in the absence of any perturbations. If so, then as soon as node A emits a janion he can determine the expected time of arrival and set counting windows on each side of that arrival time. With no delays, on average both windows will have the same count. But if he delays a janion, then the second window will on average have a greater count than the first.

Let’s suppose that on average each node originates 100 janions per day, and that on average node A sends one directed janion per day to node B. Since the adversary doesn’t know which of the janions emitted from node A might be targeted at node B, he has to delay all of them, or delay a random sample of them. Now node A knows the statistical distribution of transfer times for janions around a closed route. It will therefore see a statistical anomaly begin to arise as the average transfer time increases. And for every janion targeted at node B that the adversary could use to glean some information, node A will have 100 times as many from which to detect the delays. Hence, node A will have solid evidence that something is wrong long before the adversary. Furthermore, node A can examine the inbound nesting to determine the exact arrival times of janions at intermediate nodes, and can thereby determine that the problem lies with all links exiting node A. Even better, node A can send probes that have zero holding times to intermediate nodes, allowing it to detect the precise statistical distribution of the delays inserted by the adversary.

Node A can defeat the adversary by changing the instructions to the first intermediate node on each route, telling that node not to hold a janion for a specific random delay, but to hold it until a specific time of day has passed. Once node A knows the magnitude of the longest delay being inserted by the adversary, it can set the emission time from the first intermediate node to be equal to the time of emission from node A plus that longest delay; hence, the time between emission from node A and emission from the first intermediate node becomes fixed.

Let’s assume, for the sake of argument, that node A does nothing. Let’s see what action node B can take in these circumstances. Now node B will be keeping track of the numbers and arrival times of janions (as part of its instructions node A can inform node B at what time it dispatched each janion). Hence, node B and the adversary both have access to the same information and can both examine the same statistics. As soon as the data begin to look statistically unusual, but before they become statistically significant, node B can send closed circuit probes with zero holding times along routes that pass through node A (assuming that node A is prepared to disclose its identity). Since the adversary does not know the origin of janions he will apply the delays to these probes and immediately reveal his presence. Node B can then inform node A to take action, and can inform the network of the presence of the adversary. Once a network consensus has been obtained, if node A has not addressed the matter, all nodes can either blacklist node A, or inform other nodes to override its instructions by holding its janions until a specified time has passed rather than for a random interval.

Injection Attack

Since an adversary knows nothing about the route apart from the next node, he cannot send a janion to either the originator or an intended recipient. So inserted janions can’t be used for tracking purposes. As each node monitors how many janions it receives and which nodes they come from, and can share that information with other nodes, a node that suddenly started producing large numbers of janions without good reason would be detected and blacklisted.

Deletion Attack

An adversary who suspects node A of communicating with node B may try to delete some or all of the janions exiting from node A in the expectation of seeing a dip in the number of janions arriving at node B. This attack can be defeated in the same manner as the delaying attack considered above.

Transformation Attack

Let’s suppose a compromised node modifies the outbound nesting. Since the node doesn’t know the originating node, the effect is equivalent to deleting the janion and inserting a new one at random (attacks that have already been covered).

If a node modifies a janion then the originating node will be able to determine that it has done so. For example, suppose that on route A => Bb => Cc => Dd => Ee => A node D takes “cb(janion)” from node C and changes it to “modified” before passing it along the route. Then the janion received by node A will equal “ed(modified)” instead of “edcb(janion)”. The originating node can then probe the nodes individually to determine the compromised node, and if the originating node had been attempting to send a message to node E, for example, it could then send the message along a different route.
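
The check is easy to picture if we model each routing-key transformation as a simple wrapper, a stand-in for the real encryption, purely for illustration:

  def wrap(key, payload):
      # stand-in for encryption with a routing key
      return f"{key}({payload})"

  honest = wrap("e", wrap("d", wrap("c", wrap("b", "janion"))))   # edcb(janion)
  tampered = wrap("e", wrap("d", "modified"))                     # ed(modified)

  expected = honest   # node A knows every routing key, so it can predict the honest result
  for received in (honest, tampered):
      print(received, "->", "route intact" if received == expected else "tampering detected")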

Passive Attacks

An adversary with access to a node or to a link between nodes can do various things to examine the behaviour of the network without introducing any detectable perturbations:

  • Collation Attack
  • Tracing Attack
  • Flow Rate Attack
  • Timing Attack
  • Penetration Attack

Collation Attack

Two or more compromised nodes may record the information that is available to them as a janion passes through, share it, and then try to infer the behaviour of some other node.

For example, suppose that on route A => Bb => C-bc => Dd => Ee => A nodes B and D are compromised, and suspect that node A is communicating with node C. Now, neither node knows the instructions given to node C: when node B had the outbound nesting it was still encrypted with C’s nodal public key; and when node D received the outbound nesting, node C’s instructions had already been removed from it. Neither node has access to node C’s report: it did not exist when node B had the inbound nesting; and it had already been encrypted using node C’s routing key by the time it was passed to node D. Node B knows the original janion and “b(janion)”. Node D knows “c(message)”. But it’s not possible to distinguish “c(message)” from “cb(janion)” without knowing routing key “c”. Hence, nodes B and D have no way of knowing whether node A is communicating with node C.

Tracing Attack

The most obvious way to determine if two nodes are communicating would seem to be to trace the flow of janions leaving the first node and see if a disproportionate number of them enter the second node.

Let’s examine a particular node. Janions arrive at the node at random, and depart from the node at random. Because the respective sizes of janions, outbound nestings, and inbound nestings are fixed, it is not possible to distinguish the emitted from the absorbed based on size. Because each janion, outbound nesting, and inbound nesting is encrypted or decrypted between absorption and subsequent emission, it is not possible to distinguish the emitted from the absorbed based on their contents. Because janions are held in a node’s janionic pool for random times before they are forwarded, and because a node will also be originating its own janions, the best that can be said of a janion emitted by a node is that it is very likely to be one of the janions absorbed by the node within some period of time prior to its emission. For example, an adversary might be able to say the probability is 95% that the janion that was emitted by node A at 16:00 is one of the 250 janions absorbed by node A between 10:00 and 16:00.

Now it’s possible to track all the janions emitted by node A and determine which nodes they enter. And for each of these nodes it’s possible to determine which of the janions that are subsequently emitted by the node might be node A’s janion. And it’s possible to repeat the process node by node. Let’s say the number of candidate janions per node is c (250 in the above example). Then if we follow the flow for i hops, we will have encountered on the order of c**i different nodes (ignoring cross-backs). Tracing these flows very quickly results in a combinatorial explosion. For example, suppose that the network contains a billion nodes and that the number of candidate janions per node is 250, then after following the flows from node to node for four iterations an adversary will have encountered nearly 4 billion nodes—even allowing for cross-backs an adversary will have encountered almost every node in the network. In other words, if an adversary was hoping to determine whether janions emitted by node A subsequently pass through node B, the answer will be that all janions emitted by node A will appear to pass through node B even when no communication between node A and node B is taking place.
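
The explosion is easy to demonstrate with a few lines of Python, using the figures from the example above:

  c = 250              # candidate janions per node
  n = 1_000_000_000    # network size
  for hops in range(1, 5):
      reached = c ** hops
      print(f"{hops} hop(s): ~{reached:,} candidates ({min(reached / n, 1):.1%} of the network)")
  # After 4 hops nearly 4 billion candidates have been encountered -- in effect
  # every node in a billion-node network.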

Flow Rate Attack

A nodal operator who originates directed janions will increase the total traffic emitted by the originating node and absorbed by the target node. This increase in nodal traffic could allow an adversary who monitored one of the nodes to determine that it was actively communicating with some other node.

If r random janions that pass along closed routes containing m intermediate nodes are originated on average by every network node each day, then the number emitted or absorbed by any node during a time interval t has a Poisson distribution with a mean of r(m+1)t and a variance of r(m+1)t. Suppose that in addition to these random flows node A originates k directed janions per day targeted at node B. Now an adversary who is monitoring node A (node B) will detect an excess of kt janions being emitted (absorbed) during time t. How long would an adversary have to wait until the probability of the excess occurring by chance had dropped to no more than, say, 1 in 40? Assuming that the Poisson distribution is approximately normal, then, as only the upper tail is relevant, the excess must equal twice the standard deviation of the distribution of the number of random janions emitted during the same time period:

  • kt = 2sqrt(r(m+1)t)

or

  • t = 4r(m+1)/k**2

If each network node originates 100 janions per day, if there are 9 intermediate nodes on each route, and if 10 directed janions are originated by node A per day, then an adversary would have to wait 40 days for the probability of the number of excess janions being emitted to drop to 1 in 40.
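
The waiting-time formula is trivial to evaluate; the fragment below reproduces the 40-day figure and shows how quickly heavier use of directed janions would be exposed:

  def days_to_detect_flow_excess(r, m, k):
      """Days until k directed janions per day stand two standard deviations
      above the random flow of r*(m+1) janions per day."""
      return 4 * r * (m + 1) / k ** 2

  print(days_to_detect_flow_excess(r=100, m=9, k=10))    # -> 40.0 days
  print(days_to_detect_flow_excess(r=100, m=9, k=100))   # -> 0.4 days: heavy use is exposed quickly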

Clearly the janionic infrastructure hides the flow of directed janions very well in the short to medium term. However, if the network is self-monitoring and self-adjusting then long before an adversary could detect a statistical anomaly, the originating and target nodes will have adjusted the flow rates of the random janions that they originate to ensure that the default flow rate statistics are re-established.

If communications are bidirectional or the routes selected for directed janions are closed, then each node will emit as many extra janions as it absorbs. Hence, each node can automatically reduce the rate at which it originates random janions—by one random janion for each directed janion—so that the total average flow rate through each node returns to its default value.
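
As a toy illustration (the function name is ours), the closed-route rule amounts to nothing more than:

  def random_janions_to_originate(default_rate, directed_today):
      # one random janion is dropped for each directed janion originated
      return max(default_rate - directed_today, 0)

  print(random_janions_to_originate(default_rate=100, directed_today=10))   # -> 90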

If communications are predominantly unidirectional and the routes are open (with node A at one end of a route and node B at the other), then the originating node needs to reduce the number of outgoing random janions while keeping the number of incoming random janions unchanged, and vice versa for the target node. The originating node can reduce its outgoing random janions by the number needed to restore the average outgoing flow rate to its default value. It can then add instructions to its outgoing random janions so as to create as many closed forked routes as are needed to raise the average flow rate of incoming janions back to its default value. The target node can emit the same number of outgoing random janions as before, but it can send an appropriate number of them on open routes, so as to lower the average flow rate of incoming janions to its default value.

Timing Attack

An adversary could gain evidence that node A is communicating with node B by correlating the emission of janions by node A with the absorption of janions by node B.

As noted above, if r random janions that pass along closed routes containing m intermediate nodes are originated on average by every network node each day, then the number emitted or absorbed by any node during a time interval t has a Poisson distribution with a mean and variance of r(m+1)t. Since the number of janions absorbed by node B in any time interval has a Poisson distribution, the numbers absorbed in non-overlapping time intervals will be stochastically independent.

If a particular starting time is selected at random, and the count of the number of janions absorbed by node B that occurs in a time interval t that succeeds the starting time is subtracted from the count occurring in a time interval t that precedes the starting time, then the resulting probability distribution will have a mean of zero and a variance of 2r(m+1)t.

Now suppose that an adversary knows that it takes up to time t for a janion to get from an originating node to a target node. Suppose the adversary waits until node A emits a janion. Suppose he monitors the count of the janions absorbed by node B during a time interval t, and then again during a second contiguous time interval t. If all the janions originated by node A are random and the network is large, then the probability that any of them will pass along a route that contains node B is negligible (see next section). Hence, the mean and variance of the difference in the counts between the two windows should be the same as when the starting point is selected at random. However, if s directed janions per day are emitted by node A and targeted at node B, then the probability that the janion monitored by an adversary is directed will be s/r(m+1), so that on average the difference in the counts between the two timing windows will be s/r(m+1), because whenever the monitored janion is directed it will arrive within the earlier window. If the adversary repeats this process, at the maximum rate of 1/t times per day for k days, then the excess count will be ks/tr(m+1) and the variance will be 2kr(m+1).

Assuming that the adversary wants to wait until the probability of the excess occurring by chance has dropped to no more than 1 in 40, and that the probability distribution of the difference in the window counts is approximately normal, then, as only the upper tail is relevant, the excess must equal twice the standard deviation of the probability distribution of the difference in the counts between the two windows:

  • ks/tr(m+1) = 2sqrt(2kr(m+1))

or

  • k = 8t**2(r(m+1))**3/s**2

If 10 directed janions are originated by node A per day and the width of the counting window is 6 hours, then an adversary would have to wait for about 13,700 years to gain solid evidence of communication between the nodes. The reason why this correlation proves so difficult to detect is that the adversary doesn’t know which 10 of the 1000 janions emitted by node A per day are targeted at node B. In other words, only 1 in 100 of the measurements he makes contains any useful information. And even then, because the transit time for janions across the network is long, many random janions will arrive at the target node during each of the counting windows—for every excess janion detected, some 25,000 random janions will arrive in each of the counting windows.

On the other hand, if nodal holding times were negligible and janions took at most one second to get to the target node, then an adversary would only have to wait for about 15 minutes to get the evidence he needed. This difference in the amount of effort that an adversary must expend between these two scenarios illustrates the importance of long holding times and demonstrates why suspected routes can easily be confirmed on fast, low-latency networks, such as those associated with Internet browsing.
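
Both scenarios can be reproduced directly from the formula above; the fragment below expresses the counting window in days:

  def days_to_correlate(t_days, r, m, s):
      """Days of repeated two-window measurements before the excess count
      reaches two standard deviations."""
      return 8 * t_days ** 2 * (r * (m + 1)) ** 3 / s ** 2

  slow = days_to_correlate(t_days=6 / 24, r=100, m=9, s=10)
  fast = days_to_correlate(t_days=1 / 86400, r=100, m=9, s=10)
  print(f"6-hour window: ~{slow / 365:,.0f} years")          # roughly 13,700 years
  print(f"1-second transit: ~{fast * 24 * 60:.0f} minutes")  # roughly 15 minutes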

This type of timing attack will only work if the probability distribution of holding times is not uniform. Suppose that node A sends out a directed janion at about the same time each day and instructs the intermediate nodes to select random holding times so that the janion is equally likely to arrive at its target at any time during the next 24 hours. In this case, no matter how the counting windows are chosen they will record on average the same number of directed janions. If a lower latency is required, node A could emit janions about once per hour with a uniform distribution of transfer times spread over an hour. Many of these janions would of course be dummies. Alternatively node A could send out a small number of important messages with fast transit times, and let the network delay the less important messages so as to balance the statistics.

A target node can also detect statistical anomalies. If the sending node includes the time at which a janion is emitted as part of its instructions, then the target node can perform its own statistical analysis on the distribution of transit times. And it can do this far more efficiently than any adversary (100 times more efficiently in the above example). Hence the target node can detect the development of a statistical anomaly long before an adversary could do so. The target node can then request the sending node to adjust the instructions that are given to the intermediate nodes, and can subsequently confirm that these adjustments cause the anomaly to disappear.

Penetration Attack

A penetration attack occurs when an adversary progressively compromises more and more of a network’s nodes. Effectively, this attack removes nodes from the network, so that it shrinks. Now, with an active penetration attack it is possible to determine which nodes have been compromised and then blacklist them, so that, even though the network shrinks, at any time the proportion of compromised nodes that have not yet been discovered will always be small. Unfortunately, with a passive attack it is not possible to determine which nodes have been compromised. Hence, as more and more nodes are compromised, the average number of uncompromised intermediate nodes on a janion’s route decreases. At some point janions will occasionally be sent along routes that consist entirely of compromised nodes. Eventually all routes along which janions travel will have been compromised.

For all the other attacks considered so far it has not been possible for an adversary to determine the routes along which janions travel. So we need to ask (1) at what level of penetration it becomes possible to routinely detect routes, and (2) whether knowledge of routes allows an adversary to determine which nodes are communicating with one another.

So the first question is how likely is it that an adversary will be able to determine the route followed by a particular janion—how likely is it that all nodes other than the originating and target nodes have been compromised? Suppose that a fraction f of the network has been compromised and that janions are sent along closed routes that contain m intermediate nodes. Then the probability, p, of the entire route being compromised is given by:

  • p = f**(m-1)

or

  • f = exp(ln(p)/(m-1))

Let’s assume that there are 9 intermediate nodes. Then with 92% penetration the probability of detecting each route is 1 in 2; with 75% it is 1 in 10; with 56% it is 1 in 100; with 42% it is 1 in 1000; and with 18% it is 1 in a million.
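
These penetration figures follow directly from f = p**(1/(m-1)):

  m = 9
  for p in (0.5, 0.1, 0.01, 0.001, 1e-6):
      f = p ** (1 / (m - 1))
      print(f"route compromised 1 time in {1 / p:,.0f}: ~{f:.0%} penetration needed")
  # -> roughly 92%, 75%, 56%, 42% and 18%, the figures quoted above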

Now we can ask: for a given degree of network penetration, how long will it take before the routes corresponding to directed janions are detected? Suppose, as above, that the probability that any particular route is compromised is p, and that node A originates k directed janions targeted at node B every day. As time passes the probability that at least one directed janion will have been detected increases. If we wish to ensure that this probability rises no higher than q, then for how many days, d, can we continue to send directed janions? The number of routes followed by directed janions in d days equals kd. The probability that none of these routes has been compromised is (1-p)**kd. Hence, the probability that at least one route will have been compromised is given by:

  • q = 1-(1-p)**kd

or

  • d = (1/k)ln(1-q)/ln(1-p)

Let’s assume that we want the probability that one or more routes have been compromised to be no more than 1 in 100, and that node A originates one directed janion per day. Then with 56% penetration an adversary will have to wait for 1 day; with 42% for 10 days; and with 18% for 27 years.
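
Again, the waiting times follow directly from the formula:

  import math

  def days_before_route_detected(p, q=0.01, k=1):
      # d = ln(1-q) / (k * ln(1-p)), as derived above
      return math.log(1 - q) / (k * math.log(1 - p))

  for p in (0.01, 0.001, 1e-6):   # 56%, 42% and 18% penetration respectively
      print(f"p = {p}: about {days_before_route_detected(p):,.0f} days")
  # -> about 1 day, 10 days, and 10,050 days (roughly 27 years)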

So we now have an answer to the first part of our question: at what degree of network penetration can an adversary frequently detect the route along which a directed janion is travelling? The next question to ask is does this matter? Effectively, we have a situation where an adversary can see that a janion travels from node A to node B and back again. Now an adversary can never prove that a janion is directed without compromising one of these two nodes. However, the size of the network determines the frequency with which randomly selected routes will contain both nodes A and B. If the frequency with which an adversary finds nodes A and B on the same route is much higher than it should be by chance, then he can conclude that the routes are selected intentionally for the purposes of communication.

So how likely is it that two nodes will be found by chance on the same route in a network containing n nodes? The probability that a random janion emitted by node A will pass immediately through node B is 1/n. If it passes first through some node X and then through node B, the probability of this occurring is 1/n**2. However, as there are n possible choices for node X, the probability that the janion passes through node B after two hops is still 1/n (and the same for multiple hops). Hence, the probability that node B will be found amongst the intermediate nodes by chance equals m/n.

Over a period of k days, an average of r(m+1)k janions will have been emitted by node A, and the probability that node B will appear on none of these routes equals (1-m/n)**r(m+1)k. Hence, the probability, p, that node B will appear on the same route as node A one or more times is given by:

  • p = 1 - (1-m/n)**r(m+1)k

or

  • k = (1/r(m+1))ln(1-p)/ln(1-m/n)

or

  • n = m/(1-exp(ln(1-p)/(kr(m+1))))

Suppose that the probability p is 0.5, that the network has a billion nodes, that there are 9 intermediate nodes per route, and that, as before, each node originates 100 janions per day. Then the time that must elapse before there is at least a 50% chance that node B will appear on the same route as node A equals 211 years. This is fine if node A only needs to communicate with node B once every 211 years!

What if we wished to wait only one day before there was at least a 50% chance of node B appearing on the same route as node A? Then the network should contain about 13,000 nodes. Hence, if node A and node B wish to communicate every day, they need to belong to a high-frequency subnet containing about 13,000 nodes. If they do, then an adversary will have no evidence to conclude that the janions they exchange are directed, rather than random. Note that as the network is progressively compromised an adversary will be able to estimate the size of the subnet and the frequency of communications within the subnet, so the size and the frequency need to be selected for consistency before penetration becomes too deep.
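
Both of the last two results can be checked with a few lines of Python, using the same figures as before (r = 100 janions per day, m = 9 intermediate nodes, p = 0.5):

  import math

  r, m, p = 100, 9, 0.5

  def years_until_chance_coincidence(n):
      # k = (1/r(m+1)) * ln(1-p) / ln(1-m/n), converted from days to years
      return math.log(1 - p) / ((r * (m + 1)) * math.log(1 - m / n)) / 365

  def subnet_size_for_daily_coincidence(k=1):
      # n = m / (1 - exp(ln(1-p) / (kr(m+1))))
      return m / (1 - math.exp(math.log(1 - p) / (k * r * (m + 1))))

  print(years_until_chance_coincidence(1_000_000_000))   # -> about 211 years
  print(subnet_size_for_daily_coincidence())             # -> about 13,000 nodes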

The size of the subnet places restrictions on the number of nodes that a particular node can communicate with on a frequent basis, but as can be seen from the example above this is far from being restrictive. The nodes belonging to the subnet should not be fixed, but should shift over time. If node A wants to start communicating with node C and it has not done so in the past, then an adversary may already have determined that node C does not belong to the subnet. If the subnet were fixed, then the sudden appearance of high-frequency exchanges with node C would provide evidence for communications. If subnet membership is fluid, then it may just have happened that node C has been added to the subnet purely by chance.

Summary

The analysis above is of the broad-brush variety, with various details omitted, and our rather rusty mathematics has doubtless led to some errors. Now, we’re not suggesting that such a network should be constructed, and, in any case, a detailed simulation of its properties would be needed before such a task was attempted.

Instead, our objective has been to illustrate that even if an adversary can monitor all nodes, and has compromised all nodes, except for those that are actually communicating with one another, it is still possible to construct a network in which it is plausible to deny that any communication is taking place.

We’re also interested in promoting certain principles. We like the ability of random flows to disguise directed ones. These random flows need not be wasteful. In a fully distributed network, directories, information, and searching would be distributed amongst the nodes. So the task for some “clever person” is to find an efficient mechanism for using random flows to implement distributed services!

We like the idea of a self-monitoring and self-adjusting network: one that can detect the perturbations caused by network users and external adversaries, and which can then adjust the network flows to prevent any statistical anomalies from developing. We like the ideas of closed circuits and nodal myopia that allow any node to secretly test that a particular node is performing as it should, without the node in question having any idea that it is being tested. And we like the idea of a network being able to arrive at a consensus, with each node voting with its own nodal private key.

We like the idea of route-based rather than point-based communications. At present, a downloaded web page or an email has a definite destination. With route-based communications, the best that can be said is that the web page or email may have had a particular destination. And we like the idea of drop-off nodes, so that two parties never have to communicate directly, but can use a node that is randomly varied with each exchange between the parties.

Tiffium & Morphium – Bigus Brutium-Absentium Zonium