Encoding/decoding is the process of transforming data from one format into another.
Cryptography is the art of concealing information so that only the intended recipient can read it.
All cryptography involves some form of encoding and/or decoding. But not all forms of encoding and decoding
are cryptography. In cryptography, the message is called the plaintext and the encrypted message is called
the ciphertext. An encryption key is used to encode the plaintext into the ciphertext, and a decryption key is used
to decode the ciphertext back into the plaintext.
There are many forms of encoding and decoding that do not have encryption or decryption keys. For example,
base64 and uuencode were ways of transforming 8-bit objects like pictures and applications so they could be
sent via the 7-bit email format
over the Internet. There is no password or hidden rule involved in the encoding/decoding process.
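As a minimal sketch of keyless encoding, the following uses the base64 module from the Python standard library. The sample bytes are made up for illustration; the point is only that anyone can reverse the encoding without a key.

```python
import base64

# Arbitrary 8-bit data (made-up bytes standing in for the start of a picture).
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])

# Encoding produces only plain ASCII characters, safe for 7-bit transport.
encoded = base64.b64encode(raw)

# Decoding needs no password or key; the rule is public.
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"))
print(decoded == raw)
```

Because the transformation is public and keyless, base64 hides nothing; it only changes the representation of the data.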
There is also a difference between "kid sister" encryption and "government" encryption.
Bruce Schneier, a very famous cryptologist, defined kid sister encryption as being about preventing
your kid sister from reading your diary. Government encryption is preventing the government from reading your diary.
Examples of kid sister encryption are ROT13 and atbash. Examples of government encryption are AES and bcrypt.
The big difference between kid sister encryption and government encryption is that with kid sister encryption, there
exists a known, guaranteed technique to recover the plaintext from the ciphertext without a priori knowledge of the key. With
government encryption, no practical technique is known for recovering the plaintext from the ciphertext without a priori
knowledge of the key. In other words, with government encryption, there is no known practical strategy to hack the
encryption/decryption algorithm.
In symmetric key cryptography, the encryption and decryption keys are the same. This is the kind of cryptography most
students think about.
Most students are introduced to symmetric key cryptography through a kind of kid sister cipher known as the substitution
cipher. In a substitution cipher, each character in a message is replaced by another. ROT13 and atbash are substitution ciphers. The difference
between ROT13 and atbash is in the key. The key of ROT13 is that A-M is mapped to N-Z and vice versa. In contrast, with atbash,
A is mapped to Z, B to Y, etc.
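The two mappings described above can be sketched in a few lines of Python. The function names rot13 and atbash are chosen here for illustration; only the letter mappings come from the text.

```python
import string

UPPER = string.ascii_uppercase
LOWER = string.ascii_lowercase

# ROT13: A-M map to N-Z and N-Z map back to A-M.
ROT13_TABLE = str.maketrans(
    UPPER + LOWER,
    UPPER[13:] + UPPER[:13] + LOWER[13:] + LOWER[:13],
)

# Atbash: A maps to Z, B to Y, and so on (the alphabet reversed).
ATBASH_TABLE = str.maketrans(UPPER + LOWER, UPPER[::-1] + LOWER[::-1])


def rot13(text):
    """Apply the ROT13 substitution; non-letters pass through unchanged."""
    return text.translate(ROT13_TABLE)


def atbash(text):
    """Apply the atbash substitution; non-letters pass through unchanged."""
    return text.translate(ATBASH_TABLE)


print(rot13("HELLO"))
print(atbash("HELLO"))
```

Both ciphers happen to be their own inverse, so applying the same function twice returns the original message. This is one reason the same "key" serves for both encryption and decryption.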
However, government symmetric key ciphers also exist. AES and Twofish are examples of government symmetric key ciphers.
Symmetric key cryptography is principally used to encrypt larger messages. This is because one is able to retrieve the
plaintext from ciphertext in symmetric key cryptosystems (as opposed to hash-based cryptosystems) and because symmetric key
cryptosystems are typically faster in execution than asymmetric key cryptosystems.
In asymmetric key cryptography, the encryption and decryption keys are different. To illustrate how this would work,
imagine a door lock as pictured below. In this door lock, one rotates the lock 180 degrees to lock it. When one does
this, one uses a specific key. To unlock the door, one rotates the lock 180 degrees in the same direction. To do this,
one uses a different key that is independent of the locking key.
The special advantage of asymmetric key cryptography is that a person who can decode the message does not thereby
learn how to encode it.
The most common use of asymmetric key cryptography is in authentication/digital signature mechanisms. The person signing
generates both the encryption and decryption keys. The decryption key is then released to the public. The person signing
proves to the world they are the person signing by encrypting the message. Because the message is decryptable with the
decryption key, the person signing proves they are who they say they are.
In signature systems, the encryption key is called the private key and the decryption key is called the public key. For signature
systems to work, a trusted third party must be introduced to hold the public keys and to authenticate that the public keys are
from the people claimed; otherwise, an impostor can introduce a public key and claim to be someone else. This trusted
third party is often the weak link in public key infrastructure and is often the target of attacks.
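The signing idea can be sketched with textbook RSA and deliberately tiny numbers. Everything specific here (the primes 61 and 53, the exponents, and the function names sign and verify) is an illustrative assumption. Real signature schemes sign a hash of the message, use padding, and use keys thousands of bits long.

```python
# Toy RSA parameters: far too small to be secure, chosen only for illustration.
p, q = 61, 53
n = p * q          # public modulus (3233)
e = 17             # public key exponent, released to the world
d = 2753           # private key exponent: (e * d) % ((p - 1) * (q - 1)) == 1


def sign(message_number):
    """'Encrypt' the message with the private key to produce a signature."""
    return pow(message_number, d, n)


def verify(message_number, signature):
    """'Decrypt' the signature with the public key and compare to the message."""
    return pow(signature, e, n) == message_number


message = 65                      # a message already encoded as a number below n
signature = sign(message)

print(verify(message, signature))      # signature matches this message
print(verify(message + 1, signature))  # signature does not match a different message
```

Anyone holding only e and n can check the signature, but producing a valid one requires d. That is the asymmetry the text describes.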
Many real-world cryptosystems employ both asymmetric and symmetric key cryptography. The asymmetric keys are used to establish
the connection between the communicating parties, and then a symmetric key is used to encrypt the bulk of the traffic.
In hash-based cryptography, only the encryption direction exists. There is no decryption key.
The special advantage of hash-based cryptography is that people cannot recover the plaintext from the ciphertext. Hash-based
cryptography is often used in password systems. The passwords are stored in encrypted form in a repository accessible
by administrators and potentially the public. For example, when one logs in, one must have permission to access
the password database; otherwise, one cannot authenticate one's own password.
Even though administrators and the public can see the encrypted passwords, they cannot reverse-engineer the encryption
to reveal the passwords.
In hash-based cryptography, passwords are authenticated by the user re-encrypting their password. If the two encrypted
passwords match, they are the same password.
One (formerly) common attack on hash-based cryptographic systems used a rainbow table. Common passwords were encrypted
prior to the attack (this precomputed list is the rainbow table), and the encrypted passwords in the target system were then
compared to the rainbow table.
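The precomputation idea can be sketched as a plain lookup table. This is a simplification: a true rainbow table compresses the table using chains of hashes, which is not shown here. The choice of unsalted SHA-256, the password list, and the user names are all assumptions made for illustration.

```python
import hashlib


def sha256_hex(password):
    """Unsalted hash, standing in for the target system's password hashing."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Step 1 (before the attack): hash a list of common passwords once.
common_passwords = ["123456", "password", "qwerty", "letmein"]
precomputed = {sha256_hex(p): p for p in common_passwords}

# Step 2 (the attack): compare stolen hashes against the precomputed table.
stolen_hashes = {
    "alice": sha256_hex("letmein"),      # a common password
    "bob": sha256_hex("Xq9!vT2#pL"),     # not in the table
}
cracked = {
    user: precomputed[h] for user, h in stolen_hashes.items() if h in precomputed
}

print(cracked)
```

The expensive hashing work is done once and reused against every account, which is what makes the attack cheap when passwords are unsalted.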
Rainbow tables are defeated using salts. A salt is a random element added to the password prior to encryption. The salted
password is then hashed and the salt is stored alongside it. When authenticating the password, the server encrypts the
password and the salt and compares it to the encrypted salted password stored in the system.
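A minimal sketch of salted password storage and checking, using only the Python standard library. The use of PBKDF2 with SHA-256, the iteration count, the 16-byte salt, and the function names are assumptions for illustration, not a description of any particular system.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor; real systems tune this value


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password, salt, stored_digest):
    """Re-hash the submitted password with the stored salt and compare."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = hash_password("hunter2")      # done once, when the password is set
print(check_password("hunter2", salt, stored))
print(check_password("hunter3", salt, stored))
```

Because each account gets its own random salt, the same password produces a different stored value on every account, so a table computed ahead of time no longer matches.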
Salts, however, create a vulnerability in that the password cannot be encrypted by the client, who does not possess the salt.
Only the server possesses the salt, and hence it is possible for a third party to intercept the transmitted password. This problem can be
overcome by encrypting the communication link between the client and server prior to the transmission of the password.
Some account information must effectively be public information, because people other than the owner of an account need
access to it. For example, payroll needs to know your bank account information to pay you.
However, this also creates a vulnerability, because malicious actors can use that public account information to
harm you. For example, if hackers have your encrypted password, they can keep trying to hack it until they succeed.
One way of handling this problem is to clearly separate the information everyone needs to know from the information
only a few people need to know. When the restricted information is needed, it is accessed by someone with temporarily
elevated permissions. For example, everyone needs to know which accounts exist. But the password only needs to be known
when authentication is actually taking place.
This separation of public information from information available on a need-to-know basis is implemented with a shadow file.
Most Unix-like operating systems keep an account file that is publicly readable, and a shadow file containing
the salted, encrypted passwords that can only be read with administrator access.
The standard attacks on symmetric key cryptosystems are the brute force and dictionary attacks.
Brute force: In a brute force attack, the attacker begins with all possible
1 character passwords, then tries all possible 2 character passwords, ... until finally getting the password right.
The brute force attack is defeated if (1) it is costly to perform the encryption and (2) the password is long and/or
complex. The main problem with brute force attacks is the time it takes to perform them. For a sufficiently long and
complex password, an exhaustive search can take longer than the age of the universe.
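The search order described above (all 1 character guesses, then all 2 character guesses, and so on) can be sketched as follows. As a stand-in for a real decryption attempt, the sketch assumes the attacker can test a guess by comparing unsalted SHA-256 hashes; the function names and the lowercase-only alphabet are also assumptions.

```python
import hashlib
import itertools
import string


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def search_space(alphabet_size, max_length):
    """Number of guesses needed to try every password up to max_length."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))


def brute_force(target_hash, alphabet=string.ascii_lowercase, max_length=3):
    """Try every password of length 1, then 2, ... up to max_length."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            guess = "".join(letters)
            if sha256_hex(guess) == target_hash:
                return guess
    return None


print(brute_force(sha256_hex("cab")))
print(search_space(26, 3))    # lowercase letters, up to 3 characters
print(search_space(94, 12))   # 94 printable ASCII symbols, up to 12 characters
```

The guess count grows exponentially with length, which is why long passwords and slow (costly) hashing together make the attack impractical.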
Dictionary: In a dictionary attack, the attacker throws a list of password
guesses at the ciphertext. If one of the passwords successfully decrypts the ciphertext, the attack is successful.
The dictionary attack leverages psychology to make the problem of password guessing tractable. Large password dictionaries
exist on the Internet, such as rockyou, a list of passwords actually obtained from a company.
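A dictionary attack can be sketched as a loop over a word list plus a few simple variants. The three-word list, the variant rules (capitalizing, appending one digit), and the use of an unsalted SHA-256 comparison in place of a real decryption attempt are all assumptions made for illustration.

```python
import hashlib


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def candidates(wordlist):
    """Yield each word plus a few variants people commonly choose."""
    for word in wordlist:
        for base in (word, word.capitalize()):
            yield base
            for digit in "0123456789":
                yield base + digit


def dictionary_attack(target_hash, wordlist):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidates(wordlist):
        if sha256_hex(guess) == target_hash:
            return guess
    return None


wordlist = ["password", "dragon", "monkey"]   # tiny stand-in for a real list
print(dictionary_attack(sha256_hex("Dragon1"), wordlist))
print(dictionary_attack(sha256_hex("Xq9!vT2#pL"), wordlist))
```

Only 66 guesses are tried here, compared with the thousands a brute force search needs even for very short passwords. The attack succeeds only when the password resembles something in the list.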
Kid sister attacks: Kid sister symmetric cryptosystems are defeatable
through various attack forms. One example is the letter frequency attack. In English, certain letters are more commonly
used than others. Thus, the most frequent letter appearing in a ciphertext is likely to be the letter "e". A
letter appearing by itself is likely to be "a" or "i". Another attack is the "known plaintext
attack", where the attacker obtains a plaintext together with its ciphertext and then uses the mapping from the plaintext
to the ciphertext to reverse engineer the cryptosystem. These kinds of attacks do not work on government cryptosystems.
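The letter frequency attack can be sketched against a simple shift cipher (ROT13 is the shift-by-13 case). The sample message and function names are made up for illustration. The sketch assumes the message is long enough, and ordinary enough, that "E" really is its most frequent letter.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def shift_text(text, shift):
    """Shift each uppercase letter by `shift` places; leave other characters."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def guess_shift(ciphertext):
    """Assume the most frequent ciphertext letter stands for 'E'."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


plaintext = "MEET ME HERE WHEN THE BELL RINGS AT SEVEN"
ciphertext = shift_text(plaintext, 13)          # the attacker sees only this

recovered_shift = guess_shift(ciphertext)
print(recovered_shift)
print(shift_text(ciphertext, -recovered_shift))
```

The attacker never needs the key: the statistics of the language leak through the substitution, which is exactly what a government-grade cipher is designed to prevent.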
Tools exist to automate brute force and dictionary attacks. Examples of such tools are John the Ripper and hashcat.
Asymmetric Key Cryptography
There are two standard ways of compromising asymmetric key cryptosystems. The first is the man-in-the-middle attack,
where the attacker intercepts the messages before encryption has taken place. The attacker then pretends to each party to be the
other and substitutes their own public key for the keys the parties are attempting to exchange.
The second way of compromising asymmetric key cryptosystems is for the attacker to impersonate the
trusted third party. In cryptographic theory, the trusted third party is supposed to be an agent beyond
reproach. In real life, there is no such thing, and attackers able to subvert the trusted third party
can substitute their own public keys for the ones in the trusted third party's repository.
Hash-Based Cryptography
Hash-based systems are attacked in a manner similar to symmetric key cryptosystems, i.e., via brute force and
dictionary attacks. One complication with many hash-based attacks is the attacker often does not have access
to shadow files. The attacker thus needs to find an exploit to obtain the shadow file, which the attacker then
merges with the account file before attacking it with tools like John the Ripper or hashcat.
The one time pad is a theoretically unbreakable form of symmetric key cryptography. In the one time pad, the
key is at least as long as the plaintext and is never reused.
Each bit of the plaintext is XORed against the corresponding bit of the key.
In the ideal one time pad, the key itself is randomly generated and both parties have access to duplicates
of the machine that generates the key. In practice, pseudorandom one time pads are often used. For example,
two spies might agree to use the book Gulliver's Travels as the one time pad, and do encrypting and decrypting
starting from the first letter of the text, until the last letter. Because such a key is not truly random, the
result no longer has the theoretical unbreakability of the ideal one time pad.
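The XOR step of the one time pad can be sketched as follows. The sketch works on bytes rather than individual bits (XORing whole bytes XORs each of their bits), and it assumes the standard library secrets module is an acceptable source of randomness for the pad. The message and function name are made up for illustration.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the corresponding byte of the key."""
    if len(key) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))   # random key, used once, then discarded

ciphertext = xor_bytes(message, pad)      # encrypt
recovered = xor_bytes(ciphertext, pad)    # decrypt: the very same operation

print(recovered == message)
```

Because XOR undoes itself, encryption and decryption are the same function with the same key, which makes the one time pad a symmetric key system.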
Steganography involves hiding a message inside something else. For example, you can create a hidden
message in a picture on a website. Your spies just go to your website, and download the picture
to obtain instructions. Everyone else visiting your website thinks it is an ordinary website.
In the images below, the one on the left is the original picture. The one on the right contains
a hidden message. If you use the application openstego
with 128-bit AES and the password "drseuss", a hidden message will be revealed.
The video shows me decrypting the image to reveal the hidden message.
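One common way to hide data inside a picture is to overwrite the least significant bit of each pixel value, which changes each value by at most 1. The sketch below illustrates that general idea on a made-up byte sequence standing in for pixel data. It is not a description of how openstego works internally, and it omits the password-based encryption step mentioned above.

```python
def hide(cover, secret):
    """Store the bits of `secret` in the lowest bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small to hold the secret")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the lowest bit, then set it
    return bytes(stego)


def reveal(stego, length):
    """Read `length` hidden bytes back out of the lowest bits."""
    secret = bytearray()
    for i in range(length):
        byte = 0
        for value in stego[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        secret.append(byte)
    return bytes(secret)


cover = bytes(range(256))          # stand-in for raw pixel values
stego = hide(cover, b"hi spy")

print(reveal(stego, 6))
print(max(abs(a - b) for a, b in zip(cover, stego)))
```

The recipient must know that a message is present and how long it is; everyone else sees data that differs from the original by an amount too small to notice.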