For Your Eyes Only: Terms from Crypto

When you visit your bank's website or enter a credit-card number, you've probably noticed that in the browser's address box, the URL begins with https:

Those first letters in a web address specify the protocol, or the transmission method that the browser is using to send and receive information. In https, the "S" stands for "secure," and the security technology your browser uses for that "S" represents one of the great inventions in the history of secrets. In this piece I'll walk you through some of the terms of that rich field.

Cryptography ("hidden+writing"), or the science of transmitting secrets, goes back to the ancients, although the word itself doesn't appear in English till about 1646. This discipline concerns itself with taking clear text, or plaintext, and converting it into ciphertext ("number+text," since many secrets were coded as numbers), also called a cryptogram ("hidden+word"). The conversion into ciphertext is encryption; converting back to plaintext is decryption. Fun fact: according to the OED, decrypt is older (1936) than encrypt (1950).

One type of cryptography you might know is a substitution cipher, where each letter in the plaintext is swapped for a different letter. A Caesar cipher, supposedly used by Julius Caesar himself, does this using an offset, such as replacing each letter in the plaintext with the letter three places later in the alphabet, wrapping around from Z back to A. For example, if you start with this alphabet:

ABCDEFGHIJKLMNOPQRSTUVWXYZ

A three-letter offset produces this new alphabet:

DEFGHIJKLMNOPQRSTUVWXYZABC

Using this encryption scheme, THE LAZY DOG becomes WKH ODCB GRJ.
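To make the offset idea concrete, here's a minimal Python sketch of a Caesar cipher. The function name `caesar_shift` is my own invention for illustration, and the sketch assumes the message uses only the letters A to Z (anything else, such as a space, passes through unchanged).

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_shift(text: str, offset: int) -> str:
    """Shift each A-Z letter by `offset` places, wrapping around at Z."""
    k = offset % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]      # offset 3 gives "DEF...ABC"
    table = str.maketrans(ALPHABET, shifted)   # maps A->D, B->E, and so on
    return text.upper().translate(table)


ciphertext = caesar_shift("THE LAZY DOG", 3)
plaintext = caesar_shift(ciphertext, -3)       # shifting back decrypts

print(ciphertext)  # WKH ODCB GRJ
print(plaintext)   # THE LAZY DOG
```

Notice that the same number, 3, is used to encrypt and (in reverse) to decrypt. Whoever knows the offset can read the message.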

Cryptanalysis is the complement to cryptography: it's the science of trying to break codes. If you've ever done one of those secret-message puzzles in the newspaper, you've practiced cryptanalysis. You'll also know that a simple cipher like the one above is vulnerable to frequency analysis, in which a code-breaker (1932) uses information about how often letters appear in a language to guess at the letter substitutions. (In English, the letter e appears most often.) Needless to say, over time a variety of stratagems were invented for making the encryption process more complex.
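Here's a rough sketch of frequency analysis against a Caesar cipher, in Python. It makes one big assumption: that the most common letter in the ciphertext stands for E. That holds for the sample message I made up here, but it can easily fail on short or unusual texts. The names `caesar_shift` and `guess_offset` are hypothetical helpers, not anything standard.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, offset: int) -> str:
    """Shift each A-Z letter by `offset` places, wrapping around at Z."""
    k = offset % 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.upper().translate(table)


def guess_offset(ciphertext: str) -> int:
    """Guess the offset by assuming the most common letter stands for E."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


# The code-breaker sees only `secret`, not the offset used to make it.
secret = caesar_shift("MEET ME HERE WHEN THE EVENING BELL RINGS", 3)
offset = guess_offset(secret)
recovered = caesar_shift(secret, -offset)

print(offset)     # 3
print(recovered)  # MEET ME HERE WHEN THE EVENING BELL RINGS
```

A real code-breaker would compare the frequencies of many letters, not just the top one, and would try the next most likely guess if the result looked like gibberish.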

An alternative to encrypting a message is to hide it in plain sight—a technique known as steganography ("covered+writing"). In this approach, the secret message is embedded in something else that otherwise looks innocuous, like a letter, or these days, a digital image. Here's an example: an "hourglass cipher" used by the British general Sir Henry Clinton during the American Revolutionary War, where the cutout reveals the secret message embedded in a letter:

Hourglass cipher as an example of steganography
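For the digital-image case, one common approach (my choice for illustration; it is not what Clinton's cutout did) is to tuck each bit of the message into the lowest bit of successive bytes of the image data. The Python sketch below uses a plain run of bytes as a stand-in for pixel values, and the function names `hide` and `reveal` are made up.

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Store each bit of `message` in the lowest bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | bit  # overwrite only the lowest bit
    return bytes(stego)


def reveal(stego: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the lowest bits."""
    out = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for byte in stego[start:start + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(200, 256)) * 2         # 112 stand-in "pixel" bytes
stego = hide(cover, b"ATTACK AT DAWN")     # 14 bytes = 112 hidden bits

print(reveal(stego, 14))                   # b'ATTACK AT DAWN'
```

Each byte changes by at most 1, which in a real photograph would be a color shift too small to see. As with the cutout, the recipient has to know where to look.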

In theory, only someone who knows how the message was encrypted (or hidden)—that is, who has a key—can decrypt it. This introduces something that's referred to as the key-distribution problem. In cryptographic terms, the key is a shared secret. If I want you to decrypt my messages, I have to somehow share the key with you in a way that others can't intercept.

As computers started to become more prevalent in the 1970s, people began thinking about the need for crypto and privacy in computer communication. Traditional means of cryptography would not work, since there was no practical way for people to distribute keys securely over an open channel. In 1976, the mathematicians Whitfield Diffie and Martin Hellman published a paper in which they introduced a radical new idea and a new term: a public key cryptosystem. In this scheme, the key is split into two parts, a public key and a private key. Anyone can use the public key to encrypt a message, but only someone who has the private key can decrypt it. A popular analogy is that anyone can put a letter in a mailbox, but only someone who has a key can retrieve the contents.

After Diffie and Hellman proposed their idea, three M.I.T. scientists, Ron Rivest, Adi Shamir, and Leonard Adleman, published an algorithm that successfully implemented public-key encryption, and that has since come to be known as RSA encryption, after the inventors' initials. (The Rivest/Shamir/Adleman paper also introduced the conventional characters Alice and Bob as the people who are trying to exchange secrets.)
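If you're curious what the public/private split looks like in practice, here's a toy version of the RSA arithmetic in Python. The primes are absurdly small so that you can follow the numbers; real keys use primes hundreds of digits long, and real systems add padding and other safeguards that this sketch leaves out. All the variable names are my own.

```python
# Toy key generation from two tiny primes.
p, q = 61, 53
n = p * q                  # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, chosen to share no factor with phi
d = pow(e, -1, phi)        # private exponent: the inverse of e mod phi (Python 3.8+)

public_key = (e, n)        # anyone may know this
private_key = (d, n)       # only the recipient knows this

message = 65               # a message encoded as a number smaller than n

# Anyone can encrypt with the public key...
ciphertext = pow(message, public_key[0], public_key[1])

# ...but only the private key reverses it.
decrypted = pow(ciphertext, private_key[0], private_key[1])

print(d)           # 2753
print(ciphertext)  # 2790
print(decrypted)   # 65
```

The security rests on the fact that working out `d` from `e` and `n` requires knowing `p` and `q`, and factoring a very large `n` back into its primes is believed to be impractical.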

This invention introduced a classic terminological conundrum: how to refer to the old technology. The terms symmetric key and symmetric encryption emerged as retronyms to describe traditional cryptography, where a single shared key is used both to encrypt and decrypt messages. In contrast, the terms asymmetric key and asymmetric encryption came to be used for schemes that encrypt with one key and decrypt with a different one. You can vividly see how these terms took off by looking at this Google Ngram graph:

Google Ngram graph showing how "symmetric" and "asymmetric" took off after public-key encryption was invented in 1976.

Let's return to https. When you see this in your browser's address box, it means that the traffic with the website is encrypted. By now, you'd probably guess that this is done using public-key (asymmetric) encryption.

Yes, but there's some more to this story. Public-key cryptography has also helped to solve another security problem, namely that of authentication, or being sure about who you're communicating with. When you do online business with your bank, how do you know you're not talking with an imposter? This can also be done using public and private keys.

Here's a simplified explanation that highlights some other terms you might have heard. In order to offer secure communications, your bank gets a digital certificate from an established certificate authority (CA). A digital certificate is often compared to a passport—a document that can be trusted to identify you. To get a certificate, the bank generates a public/private key pair and sends its public key to the CA. After validating the bank's identity using traditional physical means, the CA issues a digital certificate that includes the bank's public key and the CA's own digital signature—a message that's encrypted ("signed") using the CA's private key. In the passport analogy, your government acts as the CA; when the government has validated you, it issues you a document that identifies you, and that has the government's own signature (stamp) on it.

On your computer, there's a collection of public keys for recognized certificate authorities. When you connect to your bank, your browser and the bank's website go through a negotiation referred to using the evocative term handshaking. During this process, the bank sends its certificate to your browser as proof of the bank's identity. Your browser extracts the CA's digital signature from the certificate (signed with the CA's private key) and verifies it by using the CA's public key that the browser already has on file. If that goes well, your browser then uses the bank's public key—which is also in the bank's certificate—to start an encrypted session with the bank, and you and the bank can then communicate securely.
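The certificate check can be sketched in Python using the same kind of toy RSA arithmetic. Treat this as an illustration of the idea only: real certificates and real handshakes involve standardized formats, padding, and many more steps. The bank name, the certificate layout, and every function name here are assumptions of mine, and the primes are picked purely because they're easy to write down.

```python
import hashlib

E = 65537  # public exponent shared by both toy key pairs


def make_key_pair(p: int, q: int):
    """Return ((e, n), (d, n)) built from the primes p and q."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (E, n), (d, n)


def digest(data: bytes, n: int) -> int:
    """Hash the data to an integer smaller than the modulus n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private_key) -> int:
    d, n = private_key
    return pow(digest(data, n), d, n)


def verify(data: bytes, signature: int, public_key) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest(data, n)


# Toy keys from well-known Mersenne primes; never do this for real keys.
ca_public, ca_private = make_key_pair(2**61 - 1, 2**89 - 1)
bank_public, bank_private = make_key_pair(2**107 - 1, 2**127 - 1)

# The CA issues a certificate: the bank's name and public key, plus a signature.
cert_body = f"name=Example Bank;key={bank_public}".encode()
certificate = {"body": cert_body, "signature": sign(cert_body, ca_private)}

# The browser checks the signature with the CA public key it already has on file.
trusted = verify(certificate["body"], certificate["signature"], ca_public)

# A certificate that has been altered no longer matches the signature.
forged_body = cert_body.replace(b"Example Bank", b"Evil Bank")
forged_ok = verify(forged_body, certificate["signature"], ca_public)

# With a trusted certificate, the browser encrypts a session secret to the bank.
session_secret = 123456789
encrypted_secret = pow(session_secret, bank_public[0], bank_public[1])
bank_reads = pow(encrypted_secret, bank_private[0], bank_private[1])

print(trusted, forged_ok)            # True False
print(bank_reads == session_secret)  # True
```

The point to notice is that the browser never needs the CA's private key. It only needs the public key it already has, which is enough to confirm that the signature could have come from no one but the CA.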

Cryptography is a complex field, of course. Not only was cryptography transformed in the 1970s, but as you've seen, so was the vocabulary of the field. Until 1976, the term "public key" was an oxymoron, and no one had ever talked about private keys or digital signatures or digital certificates. But these days, you might see some significant history of the field right there in your browser.