There are two basic kinds of encryption algorithms in use today:
Private key cryptography, which uses the same key to encrypt and decrypt the message. This type is also known as symmetric key cryptography.
Public key cryptography, which uses a public key to encrypt the message and a private key to decrypt it. The name public key comes from the fact that you can make the encryption key public without compromising the secrecy of the message or the decryption key. Public key systems are also known as asymmetric key cryptography.
Private key cryptography is most often used for protecting information stored on a computer's hard disk, or for encrypting information carried by a communications link between two different machines. Public key cryptography is most often used for creating digital signatures on data, such as electronic mail, to certify the data's origin and integrity.
This analysis gives rise to a third kind of system:
Hybrid public/private cryptosystems. In these systems, slower public key cryptography is used to exchange a random session key, which is then used as the basis of a private key algorithm. (A session key is used only for a single encryption session and is then discarded.) Nearly all practical public key cryptography implementations are actually hybrid systems.
The following list summarizes the private key systems in common use today.
A simple cryptography algorithm which is used, among other things, to obscure the content of risqué jokes on various Usenet groups. The ROT13 encryption algorithm has no key, and it is not secure.
The original UNIX encryption program which is modeled on the German Enigma encryption machine. crypt uses a variable-length key. Some programs can automatically decrypt crypt-encrypted files without prior knowledge of the key or the plaintext. crypt is not secure. (This program should not be confused with the secure one-way crypt program that UNIX uses for encrypting passwords.)
The Data Encryption Standard (DES), an encryption algorithm developed in the 1970s by the National Bureau of Standards and Technology (since renamed the National Institute of Standards and Technology, or NIST) and IBM. DES uses a 56-bit key.[8]
[8] Technically, we should refer to it as the DEA: Data Encryption Algorithm. Standard-conforming implementations are certified by NIST, and usually require a hardware implementation. However, nearly everyone refers to it as the DES, so we will too.
A block cipher originally developed by Ronald Rivest and kept as a trade secret by RSA Data Security. This algorithm was revealed by an anonymous Usenet posting in 1996 and appears to be reasonably strong (although there are some particular keys that are weak). RC2 is sold with an implementation that allows keys between 1 and 2048 bits. The RC2mail key length is often limited to 40 bits in software that is sold for export.[9]
[9] Unfortunately, a 40-bit key is vulnerable to a brute force attack.
A stream cipher originally developed by Ronald Rivest and kept as a trade secret by RSA Data Security. This algorithm was revealed by an anonymous Usenet posting in 1994 and appears to be reasonably strong (although there are some particular keys that are weak). RC4 is sold with an implementation that allows keys between 1 and 2048 bits. The RC4 key length is often limited to 40 bits in software that is sold for export.[10]
[10] Unfortunately, a 40-bit key is vulnerable to a brute force attack.
A block cipher developed by Ronald Rivest and published in 1994. RC5 allows a user-defined key length, data block size, and number of encryption rounds.
The International Data Encryption Algorithm (IDEA), developed in Zurich, Switzerland by James L. Massey and Xuejia Lai and published in 1990. IDEA uses a 128-bit key, and is believed to be quite strong. IDEA is used by the popular program PGP (described later in this chapter) to encrypt files and electronic mail. Unfortunately,[11] wider use of IDEA may be hampered by a series of software patents on the algorithm which is currently held by Ascom-Tech AG, in Solothurn, Switzerland. Ascom-Tech supposedly will allow IDEA to be used royalty free in implementations of PGP outside the U.S., but concerned users should verify the terms with Ascom-Tech or their licensees directly.
[11] Although we are generally in favor of intellectual property protection, we are opposed to the concept of software patents, in part because they hinder the development and use of innovative software by individuals and small companies.
A classified (SECRET) algorithm developed by the National Security Agency (NSA). Reportedly, a Top Secret security clearance is required to see the algorithm's source code and design specifications. Skipjack is the algorithm used by the Clipper encryption chip. It uses an 80-bit key.
The following list summarizes the public key systems in common use today:
A system for exchanging cryptographic keys between active parties. Diffie-Hellman is not actually a method of encryption and decryption, but a method of developing and exchanging a shared private key over a public communications channel. In effect, the two parties agree to some common numerical values, and then each party creates a key. Mathematical transformations of the keys are exchanged. Each party can then calculate a third session key that cannot easily be derived by an attacker who knows both exchanged values.
Several versions of this protocol exist, involving a differing number of parties and different transformations. Particular care must be exercised in the choice of some of the numbers and calculations used or the exchange can be easily compromised. If you are interested, consult the references for all the gory details.
The Diffie-Hellman algorithm is frequently used as the basis for exchanging cryptographic keys for encrypting a communications link. The key may be any length, depending on the particular implementation used. Longer keys are generally more secure.
The well-known public key cryptography system developed by (then) MIT professors Ronald Rivest and Adi Shamir, and by USC professor Leonard Adleman. RSA can be used both for encrypting information and as the basis of a digital signature system. Digital signatures can be used to prove the authorship and authenticity of digital information. The key may be any length, depending on the particular implementation used. Longer keys are generally considered to be more secure.
Another algorithm based on exponentiation and modular arithmetic. ElGamal may be used for encryption and digital signatures in a manner similar to the RSA algorithm. Longer keys are generally considered to be more secure.
The Digital Signature Algorithm, developed by NSA and adopted as a Federal Information Processing Standard (FIPS) by NIST. Although the DSA key may be any length, only keys between 512 and 1024 bits are permitted under the FIPS. As specified, DSA can only be used for digital signatures, although it is possible to use DSA implementations for encryption as well. The DSA is sometimes referred to as the DSS, in the same manner as the DEA is usually referred to as the DES.
Table 6.1 lists all of the private and public key algorithms we've discussed
Algorithm | Description |
---|---|
Private Key Algorithms: | |
ROT13 | Keyless text scrambler; very weak. |
crypt | Variable key length stream cipher; very weak.[12] |
DES | 56-bit block cipher; patented, but freely usable (but not exportable). |
RC2 | Variable key length block cipher; proprietary. |
RC4 | Variable key length stream cipher; proprietary. |
RC5 | Variable key length block cipher; proprietary. |
IDEA | 128-bit block cipher; patented. |
Skipjack | 80-bit stream cipher; classified. |
Public Key Algorithms: | |
Diffie-Hellman | Key exchange protocol; patented. |
RSA | Public key encryption and digital signatures; patented |
ElGamal | Public key encryption and digital signatures; patented. |
DSA | Digital signatures only; patented. |
[12] Actually, crypt is a fair cipher for files of length less than 1024 bytes. Its recurrence properties only surface when used on longer inputs, thus providing more information for decrypting.
.
The following sections provide some technical information about a few of the algorithms mentioned above. If you are only interested in using encryption, you can skip ahead to the section called "Encryption Programs Available for UNIX" later in this chapter.
ROT13 is a simple substitution cipher[13] that is traditionally used for distributing potentially objectionable material on the Usenet, a worldwide bulletin board system. It is a variation on the Caesar Cipher - an encryption method used by Caesar's troops thousands of years ago. In the ROT13 cipher, each letter of the alphabet is replaced with a letter that is 13 letters further along in the alphabet (with A following Z). Letters encrypt as follows:
[13] Technically, it is an encoding scheme - the "rotation" is fixed, and it does a constant encoding from a fixed alphabet.
Plaintext | Ciphertext |
---|---|
A | N |
B | O |
. . . | |
M | Z |
N | A |
. . . | |
Z | M |
ROT13 used to be the most widely used encryption system in the UNIX world. However, it is not secure at all. Many news and mail-reading programs automatically decrypt ROT13-encoded text with a single keystroke. Some people are known to be able to read ROT13 text without any machine assistance whatsoever.
For example, here is a ROT13 message:
Jung tbrf nebhaq, pbzrf nebhaq.
And here is how the message decrypts:
What goes around, comes around.
If you are not blessed with the ability to read ROT13 files without computer assistance, you can use the following command to either encrypt or decrypt files with the ROT13 algorithm:[14]
[14] On some versions of UNIX, you will need to remove the "[ ]" symbols.
% tr "[a-z][A-Z]" "[n-z][a-m][N-Z][A-M]" < filename
Needless to say, do not use ROT13 as a means of protecting your files! The only real use for this "encryption" method is the one to which it is put on the Usenet: to keep someone who does not want to be exposed to material (such as the answer to a riddle, a movie spoiler in a review, or an offensive joke) from reading it inadvertently.
One of the most widely used encryption systems today is the Data Encryption Standard (DES), developed in the 1970s and patented by researchers at IBM. The DES was an outgrowth of another IBM cipher known as Lucifer. IBM made the DES available for public use, and the federal government issued Federal Information Processing Standard Publication (FIPS PUB) Number 46 in 1977 describing the system. Since that time, the DES has been periodically reviewed and reaffirmed (most recently in December 30, 1993), until 1998 as FIPS PUB 46-2. It has also been adopted as an American National Standard (X3.92-1981/R1987).
The DES performs a series of bit permutation, substitution, and recombination operations on blocks containing 64 bits of data and 56 bits of key (eight 7-bit characters). The 64 bits of input are permuted initially, and are then input to a function using static tables of permutations and substitutions (called S-boxes). The bits are permuted in combination with 48 bits of the key in each round. This process is iterated 16 times (rounds), each time with a different set of tables and different bits from the key. The algorithm then performs a final permutation, and 64 bits of output are provided. The algorithm is structured in such a way that changing any bit in the input has a major effect on almost all of the output bits. Indeed, the output of the DES function appears so unrelated to its input that the function is sometimes used as a random number generator.
Although there is no standard UNIX program that performs encryption using the DES, some vendors' versions of UNIX include a program called des which performs DES encryption. (This command may not be present in international versions of the operating system, as described in the next section.)
The DES was mandated as the encryption method to be used by all federal agencies in protecting sensitive but not classified information.[15] The DES is heavily used in many financial and communication exchanges. Many vendors make DES chips that can encode or decode information fast enough to be used in data-encrypting modems or network interfaces. Note that the DES is not (and has never been) certified as an encryption method that can be used with U.S. Department of Defense classified material.
[15] Other algorithms developed by the NSA are designed for use with classified information.
Export control rules restrict the export of hardware or software implementations of the DES, even though the algorithm has been widely published and implemented many times outside the United States. If you have the international version of UNIX, you may find that your system lacks a des command. If you find yourself in this position, don't worry; good implementations of the DES can be obtained via anonymous FTP from almost any archive service, including the Usenet comp.sources archives.
For more information about export of cryptography, see "Encryption and U.S. Law," later in this chapter.
FIPS PUB 81 explains how the DES algorithm can be used in four modes:
Electronic Code Book (ECB)
Cipher Block Chaining (CBC)
Cipher Feedback (CFB)
Output Feedback (OFB)
Each mode has particular advantages in some circumstances, such as when transmitting text over a noisy channel, or when it is necessary to decrypt only a portion of a file. The following provides a brief discussion of these four methods; consult FIPS PUB 81 or a good textbook on cryptography for details.
ECB Mode. In electronic code book (ECB) mode, each block of the input is encrypted using the same key, and the output is written as a block. This method performs simple encryption of a message, a block at a time. This method may not indicate when portions of a message have been inserted or removed. It works well with noisy transmission channels - alteration of a few bits will corrupt only a single 64-bit block.
CBC Mode. In cipher block chaining (CBC) mode, the plaintext is first XOR'ed with the encrypted value of the previous block. Some known value (usually referred to as the initialization vector, or IV) is used for the first block. The result is then encrypted using the key. Unlike ECB mode, long runs of repeated characters in the plaintext will be masked in the output. CBC mode is the default mode for Sun Microsystems' des program.
CFB Mode. In cipher feedback (CFB) mode, the output is fed back into the mechanism. After each block is encrypted, part of it is shifted into a shift register. The contents of this shift register are encrypted with the user's key value using (effectively) ECB mode, and this output is XOR'd with the data stream to produce the encrypted result. This method is self synchronizing, and enables the user to decrypt only a portion of a large database by starting a fixed distance before the start of the desired data.
OFB Mode. In output feedback (OFB) mode, the output is also fed back into the mechanism. A register is initialized with some known value (again, the IV). This register is then encrypted with (effectively) ECB mode using the user's key. The result of this is used as the key to encrypt the data block (using an XOR operation), and it is also stored back into the register for use on the next block. The algorithm effectively generates a long stream of key bits that can be used to encrypt/decrypt communication streams, with good tolerance for small bit errors in the transmission. This mode is almost never used in UNIX-based systems.
All of these modes require that byte and block boundaries remain synchronized between the sender and recipient. If information is inserted or removed from the encrypted data stream, it is likely that all of the following data from the point of modification can be rendered unintelligible.
Ever since DES was first proposed as a national standard, some people have been suspicious of the algorithm. DES was based on a proprietary encryption algorithm developed by IBM called Lucifer, which IBM had submitted to the National Bureau of Standards (NBS)[16] for consideration as a national cryptographic standard. But whereas Lucifer had a key that was 112 bits long, the DES key was shortened to 56 bits at the request of the National Security Agency. The NSA also requested that certain changes be made in the algorithm's S-boxes. Many people suspected that NSA had intentionally weakened the Lucifer algorithm, so the final standard adopted by NBS would not pose a threat to the NSA's ongoing intelligence collection activities. But nobody had any proof.
[16] NBS later became the National Institute of Standards and Technology.
Today the DES is more than 20 years old, and the algorithm is definitely showing its age. Recently Michael Weiner, a researcher at Bell Northern Research, published a paper detailing how to build a machine capable of decrypting messages encrypted with the DES by conducting an exhaustive key search. Such a machine could be built for a few million dollars, and could break any DES-encrypted message in about a day. We can reasonably assume that such machines have been built by both governments and private industry.
In June 1994, IBM published a paper describing the design criteria of the DES. The paper claims that the choices of the DES key size, S-boxes, and number of rounds were a direct result of the conflicting goals of making the DES simple enough to fit onto a single chip with 1972 chip-making technology, and the desire to make it resistant to differential cryptanalysis.
These two papers, coupled with many previously published analyses, appear to have finally settled a long-running controversy as to whether or not NSA had intentionally built in weaknesses to the DES. The NSA didn't build a back door into DES that would have allowed it to forcibly decrypt any DES-encrypted transmission: it didn't need to. Messages encrypted with DES can be for