Introduction: NbXOR80 a Text Based Encryption
I've been interested in encryption for most of my life but it's only over the last 16 years or so that good information has started to become more available for the casual study of strong encryption. Recently several prominent individuals have publicly encouraged developers to create new open-source methods of encryption that are not manipulated by government agencies. This work will, in part, be based on my previous instructable (here).
My initial hopes for an XOR type mechanism that works in a non-bitwise fashion panned out pretty nicely. The only problem is that “substitution ciphers” are far too simple to crack. I read a recently published paper that incorporated many of my ideas that they called “ALPHA-qwerty Cipher” but they did not move beyond the “substitution cipher” stage of my previous work. They did not credit me in their work so I’m going to assume they come up with the idea on their own.
Add a Teacher Note to share how you incorporated it into your lesson.
Step 1: Why Text-based?
Modern encryption algorithms create binary data that is often unprintable and can be difficult to move through certain channels unless it is encoded using a method that produces only text (mime, base64, uuencode, etc.). That encoding often adds 30% volume and requires an extra step unless the particular application used handles that for you. The other issue is that often an app or software must be installed to perform the algorithm.
My goals are as follows:
- Completely text-based operation (printable characters 101-key US keyboard)
- Strong - identify weaknesses and incorporate best practices where possible
- Portable - no installation required
- Free & opensource
I have included several interactive demonstration webpages where the code is highly formatted with comments (simply "view source" in your browser). I've tried to keep everything as readable as possible. The final webpage is the opposite as I have crammed the code to fit on one page. I also removed any mention of encryption as to make it's purpose unclear. My hope is that it will more easily pass through the Internet undetected especially where governments restrict access.
Step 2: The Avalanche Effect
Strong ciphers are designed to have something called the avalanche effect. What this means is that a minor change, like removing a period from the message text, will cause a substantial change in the output. Cryptographers want this feature in their algorithms in order to frustrate any attempt to statistically analyze the ciphertext for clues about the original message and/or the key used to encrypt it. This among other qualities are what make a cipher either strong or weak.
A wonderful analogy for what I came up with in terms of how to propagate small changes into big results and then reverse them is shown in the picture above. I had started with shifting the elements and repetitively using the one-time-pad method but this did not produce the result I was after.
input plaintext: This is a test. This is only a test.
input plaintext: This is a test, this is only a test.
As you can see a small change in input makes only a small change in output. This is bad because substitution ciphers like this can be broken as statistical information leaks out into the ciphertext. If the password was perfectly random and used only once and kept perfectly secret and.... Well, that's not what I wanted at all. If you want to understand more about one-time-pad check out the wikipedia page. Apparently the KGB used this method during the cold war but I wouldn't take that to mean it is a secure method at present.
I've included a webpage that you can use to demonstrate this XOR padding yourself here.
Step 3: Stirring and Mixing
The above diagrams show the concept behind two processes that do create a noticeable avalanche effect in the output. Much like the corn syrup analogy from the previous picture, these two methods appear to hopelessly cover over the data but can be reversed to produce the initial state.
I have already decided that I will only work in 80 digit blocks of data therefore these diagrams are only representative. I wrote a program to do this for me as 80 rounds of 80 calculations will be required. The results were outstanding and even changing "," to "." resulted in completely different outputs. You can use the attached webpage here to try it out yourself. Copy one of the results into the top and rerun the program to see how it reverses out.
Please don’t confuse any of this for encryption because anyone who knows about this algorithm can easily undo what has been done. That’s not the point at the moment. All I wanted to do was provide substantial change in the output corresponding to small changes in the input. Let’s just call the output plaintext for now. The information is all there still, it's just that it has become diffused throughout.
Step 4: Let's Build a Key Hash Algorithm
A hashing function is one that accepts data and produces an unpredictable, unique, fixed-length identifier (the hash).
A strong hashing algorithm is one where:
- cannot be reversed to indicate anything about the original input
- produces a unique code for every specific document (and no others)
- has strong avalanche effect
- cannot be intentional tailored by manipulating the input data
So I'm about to make my first (possibly disastrous) assumption. I'm going to assume that the output from all four operations joined together (XOR'd) will result in the aforementioned qualities. In other words I assume the output to be sufficiently noisy that it is indistinguishable from random and that very little if any statistical information can be gleaned from it and that it is collision resistant due to the nature of the diffusion(s).
I'm not planning to use this as a publicly view-able hash so I'm not overly concerned but I see this as the first possible weakness. In order to hedge my bets against this I'm planning to add an additional round grabbing two extra keys along the way. You can see the process as illustrated above.
If I were to look for weakness in this hash I would try to determine if there was a relationship that could be solved by comparing the four methods of mixing. If such a relationship could be found it might be possible to exclude certain results if something else is true. I have no idea what that might be but that is likely where I would start. I've included a straight Hash and the Key Hash so that you can examine it for yourself using different inputs.
Step 5: Demonstration of One Line Encryption, Parsing and Full Operating Mode
In these two diagrams you can see the overall scheme for encryption or decryption of the message. I've incorporated a mode of operation called "Propagating Cipher Block Chaining". This method requires decryption of the first line before the next line and so-on.
This leaves the first line the most likely target of attack so I've reserved the first three characters for a random set of characters. This will be discarded upon decryption but in the mean time it will add further noise to the first line of the message. What that means is that there will be 857375 ways to send the same message where the ciphertext will look completely different than any of the others. Cryptographers sometimes refer to this additional random input as "salt" but it is usually used to prevent hash functions from being broken by dictionary attacks or pre-calculated tables. In this case I'm using salt to further confound any attempts to analyze the ciphertext for clues about the message.
I started by programming the one-liner mode and then graduated to a full message. Parse and salt was a hard one for me to figure out. You can see the demonstration elements or the final form attached here.
Step 6: Is There a Crypto-analyst in the House? Disclaimer
Calling all crypto-types out there. I'm very interested to know what you think of this cipher. If you have a conspiracy theory about the computer systems at the NSA I would prefer that you keep that to yourself. I'm most interested in programs that you have written to glean statistical information from the ciphertext. Mathematical formulas are also acceptable. If your comment includes the word key-space I will read it with much interest. I have as much interest in crypto-analysis as I do in cryptography.
DISCLAIMER: Use this algorithm at your own risk. I am not responsible for lost or intercepted data/messages.
I have tried to err on the side of caution. That being said please realize that I am not a trained cryptographer nor a mathematician, just a hobbyist with unusual fascinations and study habits. I suggest that before you trust this method wait to see if people effectively discredit it or not. My intent here is to start a conversation about what is secure and what is insecure. There are plenty of justifiable reasons to secure one's own communication with someone else. I'm not trying to encourage or discourage "activity" of any kind in presenting these ideas for discussion. I would like to encourage budding crypto-types and otherwise promote human rights.
Step 7: Specification and Porting the Code
I've included the specification and a fully formatted, heavily commented version of the complete code. I might consider porting the code to another programming language but I'm not sure which one. Perhaps you want to port the code to something else and I'm happy to help if I can. I tried this on several browsers including the one on my tablet and it seems to work but I have not exhaustively tested to see if it is universal to all browsers.