Public Key Infrastructure (PKI) and Public Key encryption are essential components of Transport Layer Security (TLS) as well as digital signing. These technologies are also used for any application in which encryption and authenticity are key components, such as WS-Security, a standard for encrypting XML messages.
1. Symmetric Key Encryption
Most ciphers (encryption schemes) use the same key to encrypt data, and later, to decrypt the same data. This is known as symmetric key encryption, and the key itself must be kept secret, because compromising the key means that the encrypted data could be compromised.
1.1. Symmetric Encryption Process
- The encryption key is fed into the cipher algorithm.
- The cipher algorithm encrypts the cleartext data.
- The result is the encrypted data (ciphertext).
1.2. Symmetric Decryption Process
- The original key (the symmetric key) is fed back into the cipher algorithm.
- The cipher algorithm decrypts the ciphertext (encrypted data).
- The result is the original cleartext data.
Only the original key can be used to decrypt the data. If a different key is used, the result will be unusable garbage.
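As a rough illustration of the two processes above, here is a minimal sketch of a symmetric cipher in Python. The construction (a keystream derived from SHA-256, XORed with the data) and the names keystream and toy_cipher are assumptions made for this example only. It is a toy, not a cipher you should use to protect real data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

cleartext = b"Meet me at the workshop at noon"
ciphertext = toy_cipher(b"Madagascar", cleartext)

print(toy_cipher(b"Madagascar", ciphertext))  # original key: the cleartext comes back
print(toy_cipher(b"Zanzibar", ciphertext))    # different key: unreadable bytes
```

The same key goes in on both sides, which is exactly what makes the cipher "symmetric".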
1.3. Using a Symmetric Cipher to Protect Communication
Let’s say that Alice and Bob wish to communicate privately, using a symmetric cipher.
- Alice and Bob must agree on a cipher. Since the cipher itself can’t be used to compromise the data, this can be done over a public channel – for example, phone or email.
- Alice encrypts her message (data) using Alice’s key
- Alice uses a public channel to send the encrypted data to Bob
- Bob receives the encrypted message.
- Bob must know Alice’s key in order to decrypt her message.
Let’s say that Tim wants to “attack” the encryption process. Here are the rules for Tim:
- He can listen to or watch anything sent between Alice and Bob (anything in a Public channel)
- He does not have access to Alice’s workshop, nor Bob’s workshop, and can’t see anything processed or stored there.
We stated that agreeing on the cipher is something that can be done in public – even if Tim knows the cipher, he can’t do anything with it. Tim must have all three pieces: cipher, ciphertext (encrypted data), and key, in order to be able to read Alice’s message.
When Alice sends her ciphertext message to Bob, we must assume that Tim now has a copy. He also probably knows the cipher, but he doesn’t know the key.
And here is the problem. Alice must somehow send the key to Bob without Tim being able to copy it.
We’ve already stated that Tim can see or overhear anything Alice sends or says to Bob. So Alice can’t “tell” Bob the key. She can’t e-mail it to him. It can’t be based on any public information, nor anything Tim can independently look up – it can’t be “Alice’s old phone number”, nor “the name of a large island-state off the eastern coast of Africa”, because Tim can look these up.
The key must be pre-shared. Meaning, Alice must go to Bob’s workshop (or vice-versa) before she sends the message, and she must tell Bob the key in person: “If I ever have to send you an encrypted message, the key will be ‘Madagascar'”.
Today, using a variety of products, it’s possible to create an encrypted ZIP (archive) file, that follows these same rules, and has the exact same issue. You create a password when you encrypt the file containing the message. You can send the zip file securely, because it’s encrypted, but then you must somehow securely share the password! I can’t tell you how many times someone sends me a zip file, and says “I’ll send you the password in another e-mail”. We have to assume that Tim, the attacker, has unlimited resources and sees everything. If Tim can find one specific e-mail with the zip file attached to it, he can certainly find the second specific e-mail with the password in it!
Assuming Bob and Alice get their key management issues worked out… If Bob wishes to respond to Alice, the process is the same:
- Bob uses the shared key to encrypt his message
- Bob transmits the encrypted message to Alice
- Alice decrypts the message using the shared key
2. Public (Asymmetric) Key Encryption
With Public Key ciphers, a “public” key can be well known, and is used by anyone to encrypt data, while the corresponding “secret” key, known only to the owner of the key pair, is the only key that can decrypt the data. Encrypting with the public key is a one-way process: it can’t be reversed without the secret key. This is known as asymmetric key encryption, since two different keys are used. Because the public key can’t be used to undo the encryption, disclosing it doesn’t compromise the encrypted data, and the whole scheme is designed so that the public key can be freely given away!
2.1. Encryption and Decryption Using a Public Key Cipher
To encrypt data:
- The public key is fed into the cipher algorithm.
- The cipher algorithm processes the cleartext data.
- The result is the encrypted data (ciphertext).
To decrypt data:
- The secret key is fed into the cipher algorithm.
- The cipher algorithm processes the ciphertext (encrypted data).
- The result is the original cleartext data.
If you attempt to decrypt the ciphertext using the public key, the result is garbage. If you attempt to decrypt the ciphertext using the wrong secret key, the result is garbage.
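To make this concrete, here is a minimal sketch using textbook RSA, one well-known public key cipher (the text above doesn't name a specific cipher, so RSA is an assumption on my part). The tiny primes, the exponent 17, the message value 65, and the helper name apply_key are all chosen purely for illustration. Real keys use primes hundreds of digits long, plus padding that this sketch leaves out. It needs Python 3.8 or later for the modular inverse.

```python
# Toy key pair built from two tiny primes (illustrative values only).
p, q = 61, 53
n = p * q                      # modulus, shared by both halves of the key pair
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: the public key is (e, n)
d = pow(e, -1, phi)            # secret exponent: the secret key is (d, n)

def apply_key(value: int, exponent: int, modulus: int) -> int:
    """The RSA operation: modular exponentiation with one half of the key pair."""
    return pow(value, exponent, modulus)

message = 65                               # must be smaller than n
ciphertext = apply_key(message, e, n)      # encrypt with the public key
recovered = apply_key(ciphertext, d, n)    # decrypt with the secret key
wrong = apply_key(ciphertext, e, n)        # try to "decrypt" with the public key

print(ciphertext, recovered, wrong)
```

Here recovered equals the original message, while wrong does not: running the ciphertext through the public key a second time just scrambles it further.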
2.2. Public Key uses “One Way” Encryption
How does this “one way” encryption process work?
It’s a lot more complicated than this, but here is a simplified explanation.
P (the Public Key) and S (the Secret Key) are related to each other through a big math problem. In principle, S could be worked out from P by solving that problem, but with realistic key sizes the problem is so hard that doing so is considered computationally infeasible for today’s computers.
When you encrypt a message using P, you’re removing information pertaining to S. If you decrypt the message using P, all you get is what you already know: Information pertaining to P.
ONLY the person holding S can decrypt the message by restoring the missing information.
As this implies, it’s also possible to ENcrypt using S, and subsequently DEcrypt using P. What appears at first to be a curiosity is actually a very important part of public-key encryption. Digital signing uses the Secret key to authenticate that a message was in fact sent by the person from whom it appears to have been sent. We’ll get into this in more detail later!
2.3. Using a Public Key Cipher to Protect Communication
Public Key encryption solves the problem of requiring a pre-shared key.
With a Public Key cipher, Alice and Bob can securely exchange keys and communicate, without the attacker (Tim) being able to compromise the communication scheme.
Alice wants to send Bob a message:
- Bob sends Alice his public key.
- Alice uses Bob’s public key to encrypt her message.
- Alice sends the ciphertext (encrypted message) to Bob.
- Bob uses his secret key to decrypt the message.
If Bob responds to Alice, the process works in reverse – Bob uses Alice’s public key to encrypt his message.
The only things that Tim sees are the encrypted messages, and Alice’s and Bob’s respective Public keys.
The Secret keys used to decrypt each message are never disclosed to Tim, and the public keys are of no use to Tim, since he can’t use them to decrypt any messages.
3. Digital Signatures
Suppose that Tim tries to trick Bob by pretending to be Alice. Since anyone could have Bob’s public key (including Tim), he could attempt to trick Bob into disclosing information about himself or Alice, by pretending to be Alice. Tim could also construct a malicious message that crashes Bob’s computer, or allows Tim to remotely control Bob’s computer when he decrypts and then attempts to open the message.
How do we know that a particular message, “M”, that appears to be sent by Alice, was ACTUALLY sent by Alice?
Bob can request that Alice digitally sign all of her messages.
- Alice performs a cryptographic checksum (“C”) of the cleartext message (“M”), or C(M).
- Alice uses Bob’s PUBLIC key (“PB”) to encrypt the message E(M,PB).
- Alice uses her own SECRET key (“SA”) to encrypt the checksum E(C(M),SA)
- Alice sends the signed, encrypted message E(M,PB) + E(C(M),SA) to Bob
- Bob decrypts the message E(M,PB) using his own SECRET key (“SB”), or E(E(M,PB),SB) = M (original message)
- Bob performs his own cryptographic checksum of the message C1(M)
- Bob uses Alice’s PUBLIC key (PA) to decrypt the checksum sent by Alice, or E(E(C(M),SA),PA) = C(M) (Alice’s original checksum)
- If C(M) = C1(M) (Alice’s and Bob’s checksums match), then everything is OK…. ONLY Alice (or someone with Alice’s Secret key) could have sent the message.
- If C(M) <> C1(M) (Alice’s and Bob’s checksums DON’T match), then either the message was altered, or the message was NOT sent by Alice.
The checksum encrypted using Alice’s Secret key, E(C(M),SA), acts as a guarantee that a) Alice sent the message, because only a checksum encrypted with her SECRET key will decrypt correctly with her PUBLIC key, and b) that the message hasn’t been altered, because Bob can perform his own cryptographic checksum (C1) of M, to see if it matches C.
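The sign-and-verify half of these steps can be sketched as follows. This is a toy under stated assumptions: it uses textbook RSA with a key built from two well-known Mersenne primes, SHA-256 as the checksum C, and hypothetical function names (checksum, sign, verify). It skips the step where M itself is encrypted with Bob’s public key, and real signature schemes add padding that is omitted here.

```python
import hashlib

# Alice's toy key pair (assumed values; both primes are well-known Mersenne primes).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                                # Alice's public key PA is (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))        # Alice's secret key SA is (D, N)

def checksum(message: bytes) -> int:
    """C(M): a SHA-256 hash, reduced so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """E(C(M), SA): Alice transforms the checksum with her secret key."""
    return pow(checksum(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Bob recovers C(M) with Alice's public key and compares it to his own C1(M)."""
    return pow(signature, E, N) == checksum(message)

msg = b"Please unlock the lab door at 9am"
sig = sign(msg)
print(verify(msg, sig))                                # untouched message
print(verify(b"Please unlock the vault at 9am", sig))  # altered message
```

Note that verify only ever needs Alice’s public key, so anyone can check her signature, but only the holder of D can produce one.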
When Bob responds to Alice, the process is reversed:
- Bob uses Alice’s Public key to encrypt the message
- Bob uses his own Secret key to sign the message
- Bob transmits the message to Alice
- Alice uses her own Secret key to decrypt the message
- Alice uses Bob’s Public key to verify that Bob is the sender, and that the message is unaltered.
Because each key has two purposes based on who is sending and who is receiving, this can get quite confusing. Refer to the following summary to help keep things straight:
If Alice is sending to Bob:
- Alice uses Bob’s PUBLIC key to ENCRYPT
- Alice uses Alice’s SECRET key to SIGN
- Bob uses Bob’s SECRET key to DECRYPT
- Bob uses Alice’s PUBLIC key to VERIFY
If Bob is sending to Alice:
- Bob uses Alice’s PUBLIC key to ENCRYPT
- Bob uses Bob’s SECRET key to SIGN
- Alice uses Alice’s SECRET key to DECRYPT
- Alice uses Bob’s PUBLIC key to VERIFY
4. Public Key Infrastructure
Everything we’ve seen about Alice and Bob indicates that they trust each other, or at least know each other, such that Bob can authenticate Alice (he can ascertain that she is who she claims to be), and Alice can authenticate Bob.
Let’s say that Bob and Alice don’t know each other.
Tim could create a key pair, pretend to be Alice, and Bob would never know that he’s not communicating with “the REAL” Alice.
Public Key Infrastructure (PKI) provides a framework for mutual authentication and trust.
4.1. Third Party Trust
The role of third-party trust is to allow people who don’t know each other to trust each other.
In the case of Alice and Bob, let’s say that STEVE is a Third-Party Trust provider…
- Alice and Bob don’t trust each other.
- Alice trusts Steve (a third-party), and Steve trusts Alice. Likewise, Bob and Steve also trust each other.
- Steve can “certify” to Alice that Bob can be trusted. In essence, Steve is lending some of Alice’s trust of himself, to Bob. Because Alice trusts Steve, and Steve trusts Bob, Steve tells Alice that SHE can trust Bob on Steve’s authority.
- Steve can also “certify” to Bob that he can trust Alice. Bob trusts Steve, who trusts Alice, and therefore Bob can trust Alice on Steve’s authority.
- Bob and Alice can now trust each other, on Steve’s authority.
Steve can also authenticate Alice and Bob to each other – that is, Steve can prove either Alice’s identity to Bob, or Bob’s identity to Alice.
Let’s say Tim (“Alice”) is at it again, pretending to be Alice (the REAL Alice)
“Alice” sends Bob a message. Because Steve can verify the REAL Alice, Bob asks Steve if he recognizes the signature on the message, to which Steve replies, “That’s not the REAL Alice’s signature, therefore, the message wasn’t sent by the REAL Alice.”
Because Steve can’t verify “Alice”, Bob knows NOT to trust “Alice”.
4.2. Multi-Level Trust
Let’s say that Steve gets promoted.
Alice and Bob only trust each other because of Steve, so with Steve in a new role, how do we keep everything from being thrown into chaos?
Steve’s replacement is Victor. Steve tells both Alice and Bob that they can trust Victor.
Alice trusts Victor because Steve trusts Victor. Victor trusts Alice because Steve trusts Alice.
Likewise, Bob and Victor can trust each other on Steve’s authority.
Now, Alice and Bob can continue to trust each other on Victor’s authority, because Steve has delegated that trust to Victor.
We have established a multi-level trust. Alice trusts Bob because she trusts Steve who trusts Victor who trusts Bob. Likewise, Bob trusts Steve who trusts Victor who trusts Alice.
This entire process can be extended out to any number of levels.
For example, if Steve gets another promotion, and Victor gets Steve’s old job, he might hire Mark to be the new Victor (who is himself, the new Steve). On Steve’s authority, Bob trusts Victor, who trusts Mark, who trusts Alice, and vice-versa.
4.3. Certification and Digital Certificates
A digital certificate is a small data file, typically a kilobyte or two, that states the “subject” (the certificate holder’s identity) and information about the subject, including the subject’s Public key. It also carries the issuer’s digital signature (a hash of the certificate contents encrypted with the issuer’s Secret key, referred to here as the issuer’s “fingerprint”), which can be used to validate that the stated issuer (trust provider) REALLY DID issue the certificate.
So, for example, Bob can send his digital certificate to Alice, which is a convenient way for Alice to validate that Bob is who he claims to be (through Victor’s “fingerprint”), and also contains Bob’s Public key, that Alice can then use to encrypt and send some information to Bob.
Referring to our previous example, it’s convenient to use Victor as a proxy for Steve. On the other hand, Steve must be very careful. If Victor ends up trusting Tim, then both Alice and Bob will trust Tim on STEVE’S authority, via his proxy, Victor!
If Tim applies for a digital certificate, signed by Victor, and Tim turns out not to be who he claims to be, then this dilutes everyone’s trust for Steve, because Steve signed Victor’s certificate.
For this reason, all third-party trust providers have a verification or certification process, where they verify your name, company, address, and online presence before issuing a certificate.
“Extended Validation” certificates go even farther, beyond identity, to ensure that the applicant has proper authority, direct control over the domain name and web servers in question, and other factors.
When Tim applies for a certificate, claiming to be Bob, Victor quickly notices the deception because Tim is unable to verify any of Bob’s information.
When the trust provider issues a certificate named “Bob”, it’s their reputation at stake, and their responsibility to take reasonable measures to ensure that Bob is who he claims to be.
Alice, knowing this, can base her level of trust for Bob and his certificate on the integrity of the trust provider who issued it.
4.4. Certificate Authorities and Certificate Chains
A certificate is issued by a Certificate Authority (CA). In practice, a CA is represented by a CA certificate (together with its secret key) that is permitted to issue and sign other certificates.
When a certificate is created, the purposes for which the certificate may be used are enumerated within the certificate itself. A “normal” certificate, called an identity certificate, is typically valid for the following purposes:
- Identity (Authenticate the identity of the subject)
- Encryption (Use the certificate’s public key to encrypt data that the subject can subsequently decrypt)
- Digital signing (Use the certificate’s public key to verify the authenticity of a message “signed” with the subject’s secret key)
A CA certificate must be valid for all of these purposes, PLUS the explicit stated purpose of being a CA:
- Certificate Authority (Issue and sign other certificates)
A CA can be self-signed (issued by itself), or issued by another CA.
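One way to picture the purposes listed above is as a set of flags carried inside each certificate. The sketch below is a simplified model, not a real certificate format: the Certificate class, its field names, and the purpose strings are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Certificate:
    subject: str
    issuer: str
    purposes: frozenset

    def is_ca(self) -> bool:
        """Only a certificate carrying the CA purpose may issue other certificates."""
        return "certificate_authority" in self.purposes

    def is_self_signed(self) -> bool:
        """A self-signed certificate names itself as its own issuer."""
        return self.subject == self.issuer

IDENTITY = frozenset({"identity", "encryption", "digital_signing"})
CA = IDENTITY | {"certificate_authority"}    # everything above, plus the CA purpose

steve = Certificate("Steve", "Steve", CA)      # CA issued by itself
victor = Certificate("Victor", "Steve", CA)    # CA issued by another CA
bob = Certificate("Bob", "Victor", IDENTITY)   # ordinary identity certificate

for cert in (steve, victor, bob):
    print(cert.subject, cert.is_ca(), cert.is_self_signed())
```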
In the multi-level trust example, Steve starts out directly issuing identity certificates (presumably using his CA). When he delegates this authority to Victor, he issues Victor a CA certificate. Victor’s cert, issued by Steve, is capable of issuing Bob’s and Alice’s identity certificates.
- Steve → Victor → Bob
- Steve → Victor → Alice
A multi-level trust like this is known as a certificate chain: the identity cert is issued by an intermediate CA, each intermediate CA is issued by the CA above it, and the highest-level CA, known as the root certificate, is issued by itself (self-signed).
In our example above, Victor is the intermediate CA, and Steve is the root CA:
- Alice (Identity, Issued by Victor) → Victor (Intermediate, Issued by Steve) → Steve (Root, Issued by Steve)
- Bob (Identity, Issued by Victor) → Victor (Intermediate, Issued by Steve) → Steve (Root, Issued by Steve)
Where identity certificates are normally only issued for 1 to 2 years, and have to be continuously renewed, root certificates are typically good for 10 or 20 years.
It’s often convenient to use an intermediate CA, even within a single Trust Provider, because a new CA certificate can be created to support newer cipher and hash algorithms, or longer key lengths, without revoking the root or issuing a new root. This flexibility allows your browser, or an application server, to trust a single root cert, under a variety of conditions, supporting a variety of technologies.
Intermediate certificates might be valid for 5 to 10 years.
The certificate chain consists of the identity certificate, plus all intermediate CA certificates, and the matching root CA certificate.
All application servers, web browsers, and Java come with a certificate “trust” store, that contains the list of known, trusted root CAs. Some of the larger root CAs:
- Microsoft (Microsoft issues its own certificates)
- Verisign (Now, Symantec)
- Wells Fargo (They are their own trust provider, and they are so large that Microsoft and other vendors have included the Wells Fargo CA in their lists of trusted CAs)
There are several hundred trust providers, and each one uses at least one CA.
When you connect to a secure website, your browser will read the identity certificate as well as the certificate chain (if provided), and follow the chain until it hits a root (self-signed) certificate that matches one of the certificates in its trusted store. If it can’t find a matching, trusted certificate, your browser will report an error!
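The chain-following logic can be sketched roughly like this. The dictionaries, the function name chains_to_trusted_root, and the depth limit are assumptions for illustration. Most importantly, this sketch only follows the Issuer names. A real browser also verifies each certificate’s signature, validity dates, and purposes at every step.

```python
# Hypothetical certificates sent by the server, keyed by subject.
presented = {
    "Bob": {"issuer": "Victor"},
    "Victor": {"issuer": "Steve"},
}
# The browser's trust store: trusted, self-signed root certificates.
trust_store = {"Steve": {"issuer": "Steve"}}

def chains_to_trusted_root(subject, presented, trust_store, max_depth=10):
    """Follow issuer links from `subject` until a trusted root is reached."""
    current = subject
    for _ in range(max_depth):
        if current in trust_store:
            return True
        cert = presented.get(current)
        if cert is None or cert["issuer"] == current:
            # Missing link, or a self-signed cert that is not in the trust store.
            return False
        current = cert["issuer"]
    return False

print(chains_to_trusted_root("Bob", presented, trust_store))

# A chain that ends in Tim's own self-signed certificate is rejected.
forged = {"Bob": {"issuer": "Tim"}, "Tim": {"issuer": "Tim"}}
print(chains_to_trusted_root("Bob", forged, trust_store))
```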
4.5. Certificate Signing Request
How does Bob get a certificate?
- Bob creates a key pair (Public, Secret)
- Bob takes his brand new Public key, and creates a Certificate Signing Request (CSR) with the following information:
- Public Key
- Identity (Bob’s Name)
- Location (Geographical location)
- Optional Contact Info (Phone #, E-mail, Etc…)
- Bob signs the CSR with the Secret key he just created.
- Bob sends the CSR to a Certificate Authority (CA)
- The CA verifies Bob’s information
- The CA issues and signs Bob’s Certificate
- The CA sends the final certificate to Bob, along with instructions, and a copy of the certificate trust chain.
- Bob installs the final certificate, his new Secret key, any intermediate certificates, and the root CA certificate into a keystore
- Bob configures the web server to use his keystore, and configures the web server’s identity as “Bob” (the cert subject).
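The first few steps (create a key pair, build the request, sign it, and have the CA check the signature) might look something like the sketch below. Real CSRs use a standard binary format and are produced by tools on the server. Here the request is a plain dictionary, the key is toy textbook RSA built from two well-known Mersenne primes, and create_csr, ca_checks_csr, and the location value are hypothetical names and values for illustration only.

```python
import hashlib
import json

# Bob's toy key pair (assumed values; both primes are well-known Mersenne primes).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                      # public key (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))        # secret key, never leaves Bob's server

def digest(fields: dict, modulus: int) -> int:
    """Hash the request fields, reduced to fit under the toy modulus."""
    encoded = json.dumps(fields, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % modulus

def create_csr(identity: str, location: str) -> dict:
    """Bundle Bob's details with his public key, then sign with his secret key."""
    fields = {"identity": identity, "location": location, "public_key": [E, N]}
    return {"fields": fields, "signature": pow(digest(fields, N), D, N)}

def ca_checks_csr(csr: dict) -> bool:
    """The CA confirms the requester holds the secret key matching the public key."""
    e, n = csr["fields"]["public_key"]
    return pow(csr["signature"], e, n) == digest(csr["fields"], n)

csr = create_csr("Bob", "Bob's workshop")
print(ca_checks_csr(csr))
```

Signing the CSR proves that whoever sent the request actually holds the Secret key for the Public key inside it. The CA still has to verify Bob’s identity separately.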
A keystore is simply a small database containing one or more certificates, and any corresponding secret keys.
A trust database is a keystore that contains a list of trusted ROOT CAs, so that the web server / browser can validate a given trust chain, to match a specified root certificate to one of its trusted CAs. A typical trust database contains 100+ root CA certificates from various companies.
For most web / application servers, a separate keystore is created, for storing the secret key, the identity certificate, and ideally, the entire trust chain.
Most server platforms create an empty keystore at the time the key pair is generated, and also generate the CSR at the same time. Once the signed certificate is received, it’s imported into the new keystore. Importing the trust chain is usually a separate but necessary step.
Now that Bob’s web server is properly configured, when Alice connects, she’s able to verify Bob’s identity, and establish secure communications with Bob.
Alice’s browser downloads Bob’s certificate and certificate chain. Alice parses Bob’s certificate, looks at the Issuer, and then checks to see if she has the Issuer’s certificate in the chain or in her Trust Store. She repeats this process for each Intermediate certificate, until she reaches a Trusted Root certificate, or until she can’t match the specified Issuer against a known, trusted source.
If Alice trusts the CA certificate at the end of the chain, then she can trust that Bob’s certificate is genuine. If Alice does NOT trust the specified CA certificate, then Alice must assume Bob’s certificate is forged. Perhaps Tim created a fake “Steve” certificate. The fingerprint would not match, and Alice would be able to detect the deception!
4.6. Self-Signed Identity Certificate
Sometimes, it’s expedient or convenient to create a “self-signed” identity certificate.
This type of certificate is like a normal identity certificate, except that it’s issued by itself!
The “Subject” of the certificate is identical to the “Issuer”.
This type of certificate can be created by various utilities, and most web servers and app server platforms come with one installed by default.
The obvious issue with this type of certificate is that it bypasses the entire third-party trust mechanism.
This is the digital equivalent of Bob standing up in a crowded room, and shouting, “You should trust me, because I proclaim that you can trust me”. Only Bob can ascertain whether he can be trusted or not, along with only those who explicitly choose to trust Bob.
Unless Alice is VERY sure of Bob’s identity, the result is that she could be easily misled by Tim pretending to be Bob, because Steve and Victor are no longer part of the process that authenticates Bob, verifying that he is who he claims to be.
Self-signed certificates have some very valid uses:
- Encryption without authentication. You need to encrypt data without the need for authentication, for example, between two servers on your internal network.
- Long validity periods. Most certificates expire in 1 to 2 years, but a self-signed certificate can be set for literally any validity period. A cert could be good for 10 years, or 100 years. This is convenient for devices that are deployed to the field, that may not be remotely manageable / serviceable, but you still need the benefits of authentication and encryption. Automated Teller Machines (ATMs) and eVoting machines are obvious examples: these devices are intentionally deployed in a “locked down” state, and remote management is limited or difficult. These types of machines are designed to be serviced by a field tech, and deploying a field tech to 100,000 machines once per year would be cost-prohibitive.
- Testing. You have a test lab, and you want to explore / verify functionality without incurring any actual expense for a “real” certificate.
- Providing company intranet resources. Self-signed certificates are an inexpensive way to provide authentication and encryption to internal (to your company) employees, where you have control over laptops and other mobile devices, to install the self-signed certificate within the mobile device’s browser trust store.
- B2B Peering. For Business to Business (B2B) connections, where one company’s servers connect to the other (or vice-versa, or both), the two companies can establish a mutual trust using self-signed certificates. In addition to saving money, this also precludes the potentially costly and risky process of having to update B2B certificates every 1 to 2 years.
All web and application servers that face customers, users, or third parties should use REGISTERED certificates from a well-known trust provider, in order to provide a guarantee to those users that they are connected to the right place (they can authenticate your servers) and that their data is protected.
4.7. Subject Alternate Naming
Sounding more complex than it really is, Subject Alternate Naming (SAN; formally the X.509 “Subject Alternative Name” extension) is a way of creating nicknames for a certificate, that are as valid as the primary name.
Don’t confuse this with “Storage Area Network”, the more common use of “SAN”. In retrospect the person who created this TLA (Three Letter Acronym) conflict should be shot.
A simple rule of thumb: If you’re talking about storage, disk drives, latency, or IO, SAN probably means “Storage Area Network”. If you’re talking about identity, certificates, or authentication, SAN probably means “Subject Alternate Naming”.
Obviously, when we mention SAN in this document, we mean “Subject Alternate Naming”.
So what is “SAN” and how do we get it?
Let’s say that Bob’s full name is Robert Robertson. SAN allows Bob to get a single certificate with multiple names, and each one acts like a nickname:
- Bob (Primary name)
- Robert (SAN)
- Bob Robertson (SAN)
- Robert Robertson (SAN)
This is convenient, as now he can introduce himself in different contexts, and still use the same certificate for authentication and encryption.
Subject Alternate Naming is specified during the CSR process. When the Certificate Signing Request is created, any SANs are specified as part of the request, and must conform to all of the same rules as the primary name – e.g. you couldn’t request an “Alternate Name” of “www.microsoft.com”, because you don’t own that domain. This keeps the Tims of the world from claiming that their nickname (SAN) is “Bob” or “Alice”.
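When someone checks a name against Bob’s certificate, the idea boils down to something like this minimal sketch. The dictionary layout and the function name certificate_covers are assumptions for illustration; a real certificate stores these names in a dedicated extension field.

```python
def certificate_covers(cert: dict, name: str) -> bool:
    """True if `name` is the primary name or one of the alternate names."""
    return name == cert["subject"] or name in cert.get("san", [])

bob_cert = {
    "subject": "Bob",
    "san": ["Robert", "Bob Robertson", "Robert Robertson"],
}

print(certificate_covers(bob_cert, "Robert Robertson"))  # listed as a SAN
print(certificate_covers(bob_cert, "Alice"))             # not listed anywhere
```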
4.8. Star / Wildcard Certificate
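Often called a “star” certificate, a wildcard certificate has a “*” (star) as the first part of the name, thus matching any name that fits the pattern. The star stands in for a single name component only, so “*.bobsdomain.com” does not match “bobsdomain.com” itself.
For example, let’s say Bob has 3 servers, each with its own hostname under bobsdomain.com.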
Bob can get a “star” (wildcard) certificate: *.bobsdomain.com
This star certificate will match all three of Bob’s existing servers, and he can bring up new ones on the fly as the need arises.
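The matching rule can be sketched as below. The function name wildcard_matches and the hostnames www and mail are hypothetical, chosen only for the example. The sketch assumes the common rule that the star covers exactly one name component (the left-most one).

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Match a certificate name such as "*.bobsdomain.com" against a hostname."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False                      # the star stands in for one label only
    if pattern_labels[0] != "*":
        return pattern_labels == host_labels
    # The star must replace a non-empty label; the rest must match exactly.
    return bool(host_labels[0]) and pattern_labels[1:] == host_labels[1:]

print(wildcard_matches("*.bobsdomain.com", "www.bobsdomain.com"))
print(wildcard_matches("*.bobsdomain.com", "mail.bobsdomain.com"))
print(wildcard_matches("*.bobsdomain.com", "bobsdomain.com"))
```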
4.9. Certificate Revocation List (CRL)
Scenario: Victor issues a certificate to Tim, pretending to be a legitimate business. Victor quickly finds out that Tim is up to no good, and he’s using Victor’s endorsement to further his own criminal enterprise!
Victor has the option to “revoke” Tim’s certificate. He does this by publishing the NOW-untrusted certificate on Victor’s Certificate Revocation List (CRL).
Every Certificate Authority maintains a CRL – a list of certificates that WERE issued, but are NOW considered untrustworthy.
Bob or Alice would simply check Victor’s CRL during the secure handshake process with Tim. Finding Tim’s certificate on the CRL, Alice and Bob both deny Tim’s connection.
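In its simplest form, the check amounts to a lookup, as in this minimal sketch. The serial numbers and the function name accept_certificate are made up for illustration; a real CRL is a signed file published by the CA, and the client would verify that signature before trusting the list.

```python
# Hypothetical serial numbers of certificates Victor has revoked.
victors_crl = {1042}          # 1042: the certificate Victor issued to Tim

def accept_certificate(cert: dict, crl: set) -> bool:
    """Reject any certificate whose serial number appears on the issuer's CRL."""
    return cert["serial"] not in crl

tims_cert = {"subject": "Tim", "issuer": "Victor", "serial": 1042}
bobs_cert = {"subject": "Bob", "issuer": "Victor", "serial": 1007}

print(accept_certificate(tims_cert, victors_crl))  # revoked, so refused
print(accept_certificate(bobs_cert, victors_crl))  # not on the list
```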
4.10. PKI in a Nutshell
What is PKI?
Public Key Infrastructure is a way for a trust provider to “certify” someone’s identity and trust, allowing third parties (employees, customers, users) to be able to read a certificate, and use the “infrastructure” part of PKI, to validate the identity and authenticity of various services provided on an untrusted network (e.g., the Internet).
PKI allows any two people to validate each other, using publicly-available information, without sacrificing security.
PKI uses a hierarchical system of Certificate Authorities (CAs) to generate Identity certificates, that can all be validated back to the root CA certificate.
Browsers, app / web servers, and Java all use “trust stores”, which are keystores that are pre-loaded with various well-known PKI trust providers’ root CA certificates. Each browser / web server / app server / Java vendor maintains their own trust databases, and the user or administrator can modify them by adding new certificates that SHOULD be trusted, but were not included by the vendor.
If an identity certificate can be “chained” back to a trusted root, then the identity certificate is assumed to be trusted.
PKI also includes mechanisms for revoking trust, via the CRL (Certificate Revocation List).
4.11. Private PKI vs. Public PKI
At first, this seems redundant!
Many companies implement a “Private” PKI, one that is NOT internet-facing, for the purposes of authenticating and encrypting data within various parts of the enterprise, or within a many-to-one B2B environment (as in, a Service Provider relationship).
In many ways, a “Private” PKI is more secure than a “Public” PKI:
- Tighter controls over internal keys
- Explicit trust (Public PKI is implicitly trusted)
- Tight controls over “Federation”, requires explicit (rather than implicit) agreement from all “Federated” entities. Federation is a system of identity management that is shared among multiple entities.
- Dynamic user / device → certificate mapping. Identity management products, such as Microsoft Active Directory, automatically incorporate the means and mechanisms to automatically generate and manage user certificates dynamically. This allows users and devices to automatically leverage PKI for authentication, mapping each certificate back to a specific user or device for role-based access control (RBAC).
- Tight controls over Multi-Factor Authentication (MFA), where certificates are used as one of the factors.
“Private” PKI is a valid option where tight security controls are required, and can provide valuable flexibility in situations where a regular “Public” PKI would not allow it.
5. Advanced Concepts
Hopefully, you’ve got the basics. Let’s move on to some advanced concepts that combine or expand on these elements.
5.1. Man In The Middle (MITM), or Proxy Attack
We’ve demonstrated that Tim, who we assume is listening to all communications between Bob and Alice, can’t decrypt either’s messages, since Tim does NOT have access to either Secret key.
What if Tim takes it a step further? Tim decides to cut Alice’s communication lines, and inserts his equipment in front of them. Anything Alice sends goes to Tim, who then retransmits the information to Bob. Conversely, Bob, thinking he’s talking to Alice, transmits everything to Tim, who then retransmits the message to Alice.
This is known as a “Man In the Middle” (MITM) or “Proxy” attack.
When Bob sends his Public Key, Tim gets it. Tim then transmits HIS OWN public key to Alice. Alice uses Tim’s public key to encrypt data for “Bob”, but Tim simply decrypts the traffic using his own Secret key, and then uses the “real” Bob’s Public key to re-encrypt the message. Likewise, Tim sends HIS OWN Public key to Bob, pretending to be Alice. Bob encrypts using Tim’s public key, thinking it’s Alice’s.
Tim can effectively see both sides of the supposedly secret communication.
In this situation, it’s critical that both Bob and Alice verify each other’s entire certificate chain, which would quickly prove the deception!
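Here is a small simulation of the attack in one direction (Alice sending to Bob). It is a sketch under stated assumptions: toy textbook RSA with tiny primes, hypothetical helper names (make_keypair, apply_key), and a final check that simply compares the key Alice received against Bob’s genuine key, standing in for full certificate chain verification.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Return (public, secret) toy RSA keys built from two small primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def apply_key(value: int, key) -> int:
    """Encrypt or decrypt by modular exponentiation with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

bob_public, bob_secret = make_keypair(61, 53)
tim_public, tim_secret = make_keypair(89, 97)

# Tim intercepts Bob's public key and hands Alice his own instead.
key_alice_received = tim_public

message = 65
to_bob = apply_key(message, key_alice_received)   # Alice thinks this is for Bob

seen_by_tim = apply_key(to_bob, tim_secret)       # Tim reads the message...
forwarded = apply_key(seen_by_tim, bob_public)    # ...and re-encrypts it for Bob
seen_by_bob = apply_key(forwarded, bob_secret)    # Bob notices nothing unusual

# The defence: check the received key against the one certified for Bob.
key_is_genuine = key_alice_received == bob_public
print(seen_by_tim, seen_by_bob, key_is_genuine)
```

Both Tim and Bob end up with the original message, and only the key check reveals that something is wrong.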
5.2. Replay Attack / Secure Key Exchange
Tim could conceivably record the communication between Bob and Alice. He doesn’t know the exact messages being transmitted, but by recording the entire conversation, he can later pretend to be Alice, in order to trick Bob. All Tim has to do, is replay every one of Alice’s messages, while ignoring Bob’s responses.
If Tim timed it right, he could ask Bob to do something that compromises the infrastructure, and Bob would comply, thinking Alice is asking. For example, if Tim knew that the context of the original communication was Alice asking Bob to unlock a secure door temporarily, Tim could replay Alice’s entire conversation with Bob, and Bob would probably comply by unlocking the secure door for Tim, pretending to be Alice.
Secure Key Exchange prevents this!
- Alice creates a random number “x”, and encrypts it using Bob’s Public key “PB”. The result is E(x,PB), which she sends to Bob along with her Public key.
- Bob then decrypts E(x,PB) using his secret key, “SB”, or E(E(x,PB),SB) = x (the original number)
- Bob encrypts x, known only to Bob and Alice at this point, using Alice’s Public key “PA”, and sends the result, E(x,PA), back to her.
- Alice decrypts Bob’s message: E(E(x,PA),SA) = x1. If x1 = x, then Bob is authentic.
Meanwhile, let’s say Bob, in the same message, sent his OWN arbitrary number, “y”, to validate Alice. Here is the full secure key exchange process:
- Alice creates a random number x
- Alice encrypts x using PB: E(x,PB)
- Alice transmits E(x,PB) + PA to Bob
- Bob decrypts E(x,PB) using SB: E(E(x,PB),SB) = x
- Bob encrypts x using PA: E(x,PA)
- Bob creates a random number y
- Bob encrypts y using PA: E(y,PA)
- Bob transmits the message to Alice: E(x,PA) + E(y,PA)
- Alice decrypts E(x,PA) using SA: E(E(x,PA),SA) = x1
- If x1 = x, then Alice has authenticated Bob
- Alice decrypts E(y,PA) using SA: E(E(y,PA),SA) = y
- Alice encrypts y using PB: E(y,PB)
- Alice transmits E(y,PB) back to Bob
- Bob decrypts E(y,PB) using SB: E(E(y,PB),SB) = y1
- If y1 = y, then Bob has authenticated Alice.
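Actual secure key exchange is a bit more complicated, but you get the idea.
Tim, knowing neither x nor y, can’t perform a “replay” attack against either Bob or Alice: both numbers are freshly generated for every conversation, so yesterday’s recorded answers won’t match today’s challenges.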
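The full exchange above can be walked through in a few lines, again using textbook RSA with tiny primes. This is a sketch under stated assumptions, not a real protocol: the prime values, the challenge range, and the function names (`secure_key_exchange`, `replay_fools_bob`) are invented for the illustration, and `E`, `PA`, `SA`, `PB`, `SB` simply mirror the notation used in the steps.

```python
import secrets

def make_keypair(p, q, e):
    """Textbook RSA from two small primes: returns (public, secret)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)   # pow(e, -1, phi) needs Python 3.8+

def E(value, key):
    """Apply a key: E(m, P) encrypts, E(c, S) decrypts."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

PA, SA = make_keypair(89, 97, 5)    # Alice (toy sizes)
PB, SB = make_keypair(61, 53, 17)   # Bob (toy sizes)

def random_challenge():
    """A fresh random number, small enough for both toy moduli."""
    return secrets.randbelow(3000) + 2

def secure_key_exchange():
    """Run the steps in order; return (Alice trusts Bob, Bob trusts Alice)."""
    x = random_challenge()                    # Alice creates x
    bob_sees_x = E(E(x, PB), SB)              # Alice sends E(x,PB); Bob decrypts
    y = random_challenge()                    # Bob creates y
    reply = (E(bob_sees_x, PA), E(y, PA))     # Bob sends E(x,PA) + E(y,PA)
    x1 = E(reply[0], SA)                      # Alice recovers x1
    alice_sees_y = E(reply[1], SA)            # Alice recovers y
    y1 = E(E(alice_sees_y, PB), SB)           # Alice sends E(y,PB); Bob decrypts
    return x1 == x, y1 == y

def replay_fools_bob(recorded_answer, todays_y):
    """Tim replays an old E(y,PB); it only works if y happens to repeat."""
    return E(recorded_answer, SB) == todays_y
```

A recorded answer such as `E(1234, PB)` passes only if Bob’s new challenge is again 1234, which is why the random numbers must be fresh and unpredictable for every conversation.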
5.3. Statistical Attack / Ephemeral Session Keys
The Germans had what they believed was an unbeatable encryption system in World War II, called “Enigma”. The sheer number of possible settings defied analysis by hand. A great man, the father of computing, Alan Turing, designed an electromechanical machine, the “Bombe”, that could defeat Enigma by rapidly testing candidate settings. Thus began the cat-and-mouse game of better encryption, vs. computational (brute force) cryptanalytics. As computing power increases, older ciphers are no longer secure, because they can be “brute forced” within a matter of minutes or seconds. What took Alan Turing’s Bombe hours to crack could take your cell phone a tiny fraction of that time. (The later “Colossus” machine, often credited here, was built to attack a different German cipher, Lorenz.)
What gave the Allies an advantage was combining the Bombe’s brute force with top-notch cryptanalytics. Many German transmissions contained text that could be easily predicted, such as “Heil H****” or a routine weather report, and that guess could then be leveraged in order to reduce the computations required to “brute force” the exact initial state (part of the daily key) of the Enigma cipher.
Repeating the same, predictable message, over and over allowed the Allies to perform a few basic computations in order to come up with a narrow range of initial positions for the key, allowing the Bombe to quickly brute-force the remaining possibilities.
If Alice and Bob start every conversation with a predictable sequence, such as:
Alice: “Good morning, Bob!”
Bob: “Good morning, Alice!”
Over a long enough period of time, Tim can monitor these predictable sequences and perform a statistical attack, in an attempt to reveal Bob and Alice’s secret keys.
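To see why a predictable opening is dangerous, here is a deliberately weak toy cipher (repeating-key XOR) falling to a known greeting. This is an assumption-laden illustration, not how Enigma or any modern cipher works: the cipher, the key, and the function names are invented for the example. The point is only that known plaintext hands the attacker a head start.

```python
from itertools import cycle

def xor_cipher(data, key):
    """Repeating-key XOR: the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def recover_key(ciphertext, crib):
    """Recover the key from a ciphertext whose opening bytes are known (the crib)."""
    # XORing known plaintext with ciphertext exposes the key stream directly.
    stream = bytes(c ^ p for c, p in zip(ciphertext, crib))
    # The shortest prefix that reproduces the known opening is the key.
    for length in range(1, len(stream) + 1):
        candidate = stream[:length]
        if xor_cipher(crib, candidate) == ciphertext[:len(crib)]:
            return candidate
    return stream

secret_key = b"SECRETKEY"                      # hypothetical key Tim does not know
greeting = b"Good morning, Bob!"               # the predictable opening
message = greeting + b" Unlock the door at noon."
intercepted = xor_cipher(message, secret_key)  # what Tim records

tims_key = recover_key(intercepted, greeting)
tims_copy = xor_cipher(intercepted, tims_key)  # Tim decrypts the whole message
```

Real ciphers are far harder to attack than this, but the principle is the same: every predictable byte is a free hint.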
The defense against this (other than common sense) is to use ephemeral session keys.
Rather than use Bob’s public key to encrypt each message and vice versa, Alice and Bob can agree upon some random number to use as the ephemeral (temporary) session key, a key that lasts only as long as the session, and is periodically renegotiated.
Now Tim has to break a fresh key for every session, and each key protects only a small amount of traffic. Instead of years, Tim will require THOUSANDS of years to decipher Bob’s and Alice’s communications.
An ephemeral session key is basically a random number, agreed by Alice and Bob, rather than using a predictable fixed session key.
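A small sketch of the idea: the same greeting, encrypted under two different randomly generated session keys, produces unrelated ciphertexts. The cipher here (`toy_encrypt`, an XOR against an HMAC-SHA256 keystream) is a stand-in chosen only because it needs nothing beyond the Python standard library; it is an assumption for illustration and not a vetted cipher.

```python
import hashlib
import hmac
import secrets

def toy_encrypt(key, data):
    """XOR data with an HMAC-SHA256 keystream. Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        # Each 32-byte block gets its own pad derived from the key and offset.
        pad = hmac.new(key, offset.to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

toy_decrypt = toy_encrypt  # XOR is its own inverse

def new_session_key():
    """A fresh 256-bit random key, meant to be thrown away after the session."""
    return secrets.token_bytes(32)

greeting = "Good morning, Bob!".encode()
monday_key = new_session_key()
tuesday_key = new_session_key()
monday_ciphertext = toy_encrypt(monday_key, greeting)
tuesday_ciphertext = toy_encrypt(tuesday_key, greeting)
```

Because the two ciphertexts share nothing, watching the same greeting every morning tells Tim nothing about either key, and a key he does manage to break opens only that one session.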
6. Encryption Types and Their Uses
6.1. Virtual Private Network (VPN)
Virtual Private Network (VPN) creates an authenticated and encrypted “tunnel” connection:
- Between two networks
- Between two systems
- Between a system and a network
All individual data connections within the encrypted tunnel are explicitly encrypted.
VPN is used in situations where multiple systems need access to one or more systems on the other end of the tunnel (or vice-versa).
- Working from home, you can connect to work using VPN, and then access multiple systems, applications, and services across the single VPN connection.
- Your company is located in Dallas, TX, and purchases a small company in Houston. You can connect the Dallas and Houston offices together, over the internet, using VPN. Anyone at either office can securely access the other office’s systems, servers, applications, and services.
- You have a vendor who processes some of your data. You could transmit the data directly over the internet using a secure FTP or Web connection, but you want to provide extra security, to protect potentially-sensitive data. A VPN is a low-cost way for you to transmit data to your vendor, over the internet, completely protected. Likewise, you can use the same VPN for them to deliver reports and other data back to you.
VPN tunnels are typically created between two firewalls. In some special cases a server can act as a VPN endpoint, but that’s rare.
VPN can use pre-shared keys (symmetric encryption), or private / public PKI. Each company has their own standards and requirements.
6.2. Secure Sockets Layer (SSL) / Transport Layer Security (TLS)
Transport Layer Security (TLS) is the evolved version of Secure Sockets Layer (SSL). All versions of SSL have been officially deprecated, so the correct term moving forward is TLS.
Unlike VPN, which authenticates and encrypts an entire network connection, TLS protects a single data connection.
What that means is that if you connect to three servers, you must have three TLS connections, one for each server, whereas VPN allows you to make multiple data connections over one VPN connection.
Likewise, most browsers and applications use multiple data connections that operate in parallel, in order to provide greater performance. Each unique data connection requires a separate TLS authentication / negotiation / encryption process.
TLS explicitly uses PKI for authentication, and is by far the largest use of PKI.
The biggest use of TLS is when you connect to a secure website, such as Amazon, your bank, or e-mail.
A recent trend is for traditionally “cleartext” websites such as Google, to force TLS where feasible, in order to protect you from spying and eavesdropping.
TLS is fast and simple, but not very flexible.
In addition to web traffic, TLS can also protect server-to-server e-mail (SMTP), file transfer (FTP), and other services.
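For a concrete feel of what a TLS client sets up before connecting, here is a minimal sketch using Python’s standard `ssl` module. The function names `make_client_context` and `describe` are invented for this example; the `ssl` calls themselves are standard library APIs. Nothing here opens a network connection.

```python
import ssl

def make_client_context():
    """Build a client-side TLS configuration with PKI verification enabled."""
    # The default context loads the system trust store and turns on both
    # certificate-chain validation and hostname checking.
    context = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

def describe(context):
    """Summarize the settings that matter for authentication."""
    return {
        "verifies_certificates": context.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": context.check_hostname,
        "minimum_version": context.minimum_version.name,
    }
```

To protect one data connection, you would wrap a connected socket with `context.wrap_socket(sock, server_hostname=...)`, passing the host name you intended to reach. Each additional connection repeats that handshake.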
6.3. SSL VPN
SSL VPN (Really, TLS VPN) takes a traditional full VPN, and encodes the encrypted traffic within a TLS data connection, so that it looks like encrypted web traffic.
TLS VPN is convenient for situations where you’re working on a client’s network, but you need VPN access to your company’s network. Most companies block the ports and protocols required for “full” VPN, but TLS-based VPN passes through, as if you were browsing a secure website.
With TLS VPN, just like traditional VPN, you can be sitting on a client’s restrictive network, and still have full access to your corporate e-mail, servers, and applications.
The other useful aspect of TLS VPN is that it can act as a secure, reverse-proxy for web-based applications that may not be designed to be internet-facing.
In this mode, you use a browser to connect to the TLS VPN portal. You log in. You are then presented with “secure links” for each of your normally internal-only web-based applications. This is excellent for exposing web email, collaboration (such as eRoom / SharePoint), CRM, and time & expense for home users without the need to provide them a laptop.
6.4. Secure Shell (SSH)
Secure SHell (SSH) is a Unix-oriented suite of tools that leverage an encrypted data connection similar to SSL:
- Shell – provides access to a remote machine’s command prompt
- SFTP – SSH File Transfer Protocol (SFTP is SSH-based, while FTPS or FTP+S is TLS-based) allows files to be transferred over a secure connection, and provides an FTP-like interface.
- SCP – Securely copy a file (no interface)
- Port forwarding – Allows the host to accept an SSH data connection, and forward its data to another host. This behaves just like a VPN, except that each data connection must be specified.
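SSH key pairs have traditionally been DSA- or RSA-based, and newer versions also support elliptic-curve keys such as ECDSA and Ed25519. DSA keys are disabled by default in current OpenSSH releases, so prefer one of the other types for new keys.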
Like TLS, SSH can use a variety of ciphers and hash algorithms.
SSH is useful in the following scenarios:
- Hosting provider access. Most web hosting services use Linux and support SSH and SFTP for managing the content.
- X11 / VNC (remote desktop) access to Unix systems. These mechanisms are typically protected by SSH, where the user creates an SSH connection, and uses a forwarded port to remotely and securely display an X11 or VNC desktop.
- File Transfer
FTP (“File Transfer Protocol”) is a 40-year-old protocol, that should really be shot dead, so the rest of us don’t have to deal with it any longer. FTP was developed in the days when “every system was trusted”, and has some very serious flaws. Namely, FTP uses TWO connections: a control channel, and a data channel. The control channel connects from the client to the server. When the client downloads a file in the original “active” mode, the server tries to connect back to the client to establish the data channel. I’m sure this all worked great in the 70’s and 80’s, before people started using firewalls, but now we use firewalls whose job is to prevent this type of access. There is “passive” mode, and a whole tome of band-aids to keep the elderly and seriously-flawed FTP protocol alive, namely for banks and mainframers, who can’t seem to live without it. FTP with SSL (FTPS or FTP+S) is yet another band-aid, to attempt to remediate a complete mess of a protocol that should GO AWAY.
ON THE OTHER HAND…. SFTP, or SSH-based file transfer, uses a single data connection, with an FTP-like interface. SFTP is far superior to…. well…. everything else used for file transfer.
Banks and other companies could save millions of dollars per year by eliminating FTP and FTPS in favor of SFTP. Prove it? OK! I recently had an issue where a bank discontinued the use of TLS 1.0 (due to the “POODLE” scare) in favor of TLS 1.1 and 1.2. My client’s file transfer servers were still running Windows 2003, which ONLY supported SSLv3 and TLS 1.0. The only answer was to quickly stage a Win2008 server that supports TLS 1.1 and 1.2. Many deadlines were missed. Much drama. How many of my clients using SSH were impacted by all of the SSL / TLS issues? NONE.
That’s nice, but how do you REALLY feel? I FEEL LIKE FTP AND FTPS ARE SLOWLY STEALING MY SOUL EVERY TIME I HAVE TO DEAL WITH AN FTP-RELATED PROBLEM. In contrast, I never deal with SFTP issues, because SFTP isn’t based on FTP. Instead, it’s based on that other thing that I like to call “modern standards”.
6.5. Message / File Encryption
Let’s say that we have a secure network connection between Bob’s workshop and Alice’s workshop.
Any message Alice sends to Bob is explicitly encrypted.
Let’s presume that Bob hires an assistant – Fred.
Bob still needs to be able to securely communicate with Alice, but he doesn’t completely trust Fred.
With network or data encryption, Fred would be able to see Alice’s message, since it’s decrypted as soon as it reaches Bob’s workshop.
With message encryption, the message itself is encrypted with Bob’s personal public key (only Bob holds the matching secret key), and when the SSL or VPN connection decrypts the data stream, Fred still can’t read it because the message (file) itself is still encrypted using a 2nd key pair.
Likewise, Fred knows Alice – he can use his own “Fred” public key to allow Alice to send him messages that Bob can’t read, even though the transport (data connection) is encrypted using “bobsdomain.com” SSL encryption.
6.6. Disk / Storage
Disk or Storage encryption is typically symmetric encryption designed to prevent secret data being read in the event of physical theft.
Storage encryption is designed to be extremely low-latency, supporting high IO rates, while keeping data physically secure.
If someone breaks in to your datacenter and steals your servers, disk encryption keeps your clients’ data safe.
Likewise, all mobile devices should use disk (or device) encryption, to protect sensitive data in the event that the device is stolen.
7. More About SSL (TLS)
Now that SSLv3 has been deprecated, the term “SSL” is really obsolete, but it is still in common use as a synonym for TLS, even though TLS replaces SSL.
SSL = Secure Sockets Layer
TLS = Transport Layer Security
They basically do the same thing, but TLS is more flexible and supports stronger standards.
7.1. SSL vs. TLS, and State-of-the-Art
Secure Sockets Layer (SSL) was originally designed as an implementation-specific way to encrypt and authenticate web traffic. SSLv1 was never released, due to technical problems. The first online transactions were secured by SSLv2, using the RSA RC2 cipher and the MD5 hash algorithm. SSLv3 allowed flexibility for multiple ciphers and hashes, extending the life of SSL.
Meanwhile, Transport Layer Security (TLS) allows almost any network connection to be secured, using a variety of ciphers and hash algorithms.
SSLv3, the last version of SSL, was formally deprecated in June 2015 (RFC 7568), following the “POODLE” attack disclosed in October 2014.
TLS 1.0 and TLS 1.1 are now also considered “weak”, and were formally deprecated in 2021 (RFC 8996). TLS 1.2 is still considered secure when configured with modern ciphers, and TLS 1.3, finalized in 2018 (RFC 8446), is the optimal selection where both sides support it.
RC2 and RC4 have known weaknesses and are designated as “weak”; RC4 in particular is prohibited in TLS (RFC 7465).
DES, with its 56-bit key, can be brute-forced in about a day or less on dedicated hardware. 3DES, or “triple DES”, has held up better, but its 64-bit block size exposes long-lived connections to the “Sweet32” birthday attack, and it has been deprecated in favor of AES.
AES, the “Advanced Encryption Standard”, supports exactly three key lengths: 128, 192, and 256 bits, and is still considered secure. 128 and 256 bits are the most common choices.
MD4 and MD5 (“Message Digest”) hash algorithms are broken: collisions can be generated on ordinary hardware, and unsalted MD5 hashes of passwords can often be reversed using rainbow tables.
SHA-1 (“Secure Hash Algorithm”) is considered weak, and is officially deprecated in favor of SHA-2, a family of hashes with several output sizes (e.g. SHA-256 produces a 256-bit digest).
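The digest widths are easy to confirm with Python’s standard `hashlib` module. This is just a quick check; the helper name `digest_bits` and the sample input are made up for the example.

```python
import hashlib

def digest_bits(algorithm, data=b"Good morning, Bob!"):
    """Hash the data and report the digest width in bits."""
    return hashlib.new(algorithm, data).digest_size * 8

# SHA-1 next to three members of the SHA-2 family.
sizes = {name: digest_bits(name) for name in ("sha1", "sha256", "sha384", "sha512")}

# Changing one character of the input changes the whole digest.
first = hashlib.sha256(b"Good morning, Bob!").hexdigest()
second = hashlib.sha256(b"Good morning, Bob?").hexdigest()
```

A wider digest does not make a broken design safe, which is why SHA-1 was replaced rather than simply lengthened.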
RSA keys are only as strong as their length. Key sizes have evolved from 512, to 1024, to 2048 bits: 512-bit keys have been publicly factored, 1024-bit keys are considered suspect, and 2048 bits is the current practical minimum. RSA’s security rests on the difficulty of factoring the product of two large prime numbers.
DSA has not been publicly cracked at adequate key sizes. DSA is a signature algorithm whose security rests on the difficulty of computing discrete logarithms, and it has historically been used by Secure Shell, a suite of Unix-oriented protocols that bear similarity to TLS.
ECDSA, or “Elliptic Curve” DSA, performs the same kind of signature using points on specific elliptic curves (math formulas), which gives comparable strength with much shorter keys. It is considered state of the art (as of this writing).
Work progresses in the following areas:
- One-time-pad using chaotic systems, such as Lorenz attractors and double n-dimensional pendulums.
- Fractal encryption (no scheme has yet been defined), using self-similarity to define keys and / or encryption schemes.
7.2. Pluggable Ciphers / Hashes
In the early days, protocols were built around a small, fixed set of ciphers, but that was OK, because 1990’s computing tech would have required thousands of years to break them!
Now, the need exists for “pluggable” encryption – the ability to substitute cipher and hash algorithm “plugin modules” as older ones are deprecated.
Symmetric ciphers such as AES, which are considered very secure, suffer from the weakness that they provide no mechanism of their own for exchanging the key securely.
Pluggable encryption allows for secure key exchange, to negotiate a symmetric, ephemeral key, cipher, and hash specification, and subsequent communication leverages these.
The ability to support a symmetric cipher within an asymmetric framework, one of the hallmarks of TLS, allows almost any symmetric cipher to be used, and to renegotiate ciphers, hashes, and session keys on the fly. As new technologies are developed, and older technologies are deprecated, those standards fit within the TLS framework without having to make major changes to the TLS framework itself. Only the new cipher / hash plugins need to be inserted in to the framework.
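The negotiation idea can be sketched as each side publishing what it supports and the first mutually acceptable option winning. This is a simplified model under stated assumptions: real TLS negotiates complete cipher suites inside the handshake, and the function names and option labels below are invented for the example.

```python
def negotiate(client_offers, server_preferences):
    """Return the first server-preferred option that the client also offers."""
    offered = set(client_offers)
    for option in server_preferences:
        if option in offered:
            return option
    return None

def negotiate_session(client, server):
    """Pick a protocol, cipher, and hash that both sides support."""
    chosen = {}
    for kind in ("protocol", "cipher", "hash"):
        pick = negotiate(client[kind], server[kind])
        if pick is None:
            raise ValueError(f"no common {kind}")
        chosen[kind] = pick
    return chosen

# Hypothetical capability lists: an older client and a stricter server.
old_client = {
    "protocol": ["TLS 1.2", "TLS 1.0"],
    "cipher": ["RC4", "3DES", "AES-128"],
    "hash": ["SHA-1", "SHA-256"],
}
strict_server = {
    "protocol": ["TLS 1.2", "TLS 1.1"],
    "cipher": ["AES-256", "AES-128"],
    "hash": ["SHA-256"],
}
```

Retiring a weak cipher then means deleting one entry from a list rather than redesigning the protocol, which is the “plugin” benefit described above.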
7.3. SSL (TLS) Mutual Authentication
SSL (TLS) mutual authentication allows the server and client to identify and authenticate each other.
The client, normally a web browser, always performs minimal authentication of the server. As described above, the server’s certificate must be signed by, or chained to a trusted root CA within the client’s trust store.
The server can be configured to require the client to present a specific identity certificate in order to connect.
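As one possible illustration, Python’s standard `ssl` module expresses “require a client certificate” as a single setting on the server’s context. The function name is invented for this sketch, and the file names in the comments are hypothetical placeholders, so those lines are left commented out.

```python
import ssl

def make_mutual_auth_server_context():
    """Server-side TLS settings that insist on a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Reject any client that does not present a certificate chaining
    # to a CA the server trusts.
    context.verify_mode = ssl.CERT_REQUIRED
    # A real server would also load its own identity and the CA(s) it
    # accepts client certificates from (hypothetical file names):
    # context.load_cert_chain("server-cert.pem", "server-key.pem")
    # context.load_verify_locations("client-ca.pem")
    return context
```

With `CERT_REQUIRED`, the handshake itself fails for anonymous clients, before any application data is exchanged.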
Additionally, clients other than a web browser, such as one server that sends a message to another server, would itself be a client to the other server.
In either case, in addition to basic trust chain validation, either server (or both) can have a set of rules defining specific properties for the other’s certificate:
- Specify by fingerprint. This is the most secure, but also the least flexible, since the fingerprint changes whenever the certificate is renewed or reissued, even if the subject and key stay the same. Specifying the fingerprint uniquely identifies a single certificate that’s allowed to connect.
- Subject. Requires a specific identity name, such as “Bob”. This is the best way to set up a server-to-server connection, as the server’s identity certificate can be explicitly specified. This type of rule allows changes in underlying ciphers, hashes, fingerprints, etc… as long as the subject name stays the same.
- Subject Wildcard. A rule such as “*.bobsdomain.com” might allow any of Bob’s servers, server1.bobsdomain.com (etc…) to connect. This is a convenient way to allow many-to-many access between two organizations or server farms.
- Issuer. Requires that all certificates be issued by a specific CA (issuer) certificate. The issuer can typically be a specific CA, or one of the CAs in the trust chain. This type of rule is an excellent way to create a “Private” PKI trust, where users and devices might be dynamic, and can’t be explicitly enumerated in the rulebase. Example, “Issuer=bobCA.bobsdomain.com”, would allow “Bob” (issuer bobCA) to connect, as well as “Bob’s laptop” and “Bob’s phone”.
On the server side, matching rules typically also assign a user or profile, allowing the web / app server / appliance to apply Role-Based Access Control (RBAC) based on the user’s identity. For example, a client certificate called “Alice” could be mapped to a trusted user named “Alice”, allowing access to intranet resources, while all other certificates get mapped to a “Guest” user, allowing access ONLY to public resources.
In the “Private” PKI scenario, the mapping rule uses the subject name, or some portion of the subject name to map the user ID.
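The rule types above can be modeled as an ordered list that maps a certificate to a user. This is a sketch with assumptions: certificates are plain dictionaries that have already passed trust chain validation, and the rule table, user names, and function names are invented for the example rather than taken from any particular product.

```python
def wildcard_match(pattern, name):
    """'*.bobsdomain.com' matches exactly one extra leading label."""
    if not pattern.startswith("*."):
        return pattern == name
    first, _, rest = name.partition(".")
    return bool(first) and rest == pattern[2:]

# (rule type, expected value, user or profile to assign); first match wins.
RULES = [
    ("fingerprint", "AB:CD:EF:01", "alice"),
    ("subject", "Bob", "bob"),
    ("subject_wildcard", "*.bobsdomain.com", "bob-servers"),
    ("issuer", "bobCA.bobsdomain.com", "bob-devices"),
]

def map_certificate(cert, rules=RULES, default="Guest"):
    """Return the user/profile for an already-validated certificate."""
    for field, expected, user in rules:
        if field == "subject_wildcard":
            if wildcard_match(expected, cert.get("subject", "")):
                return user
        elif cert.get(field) == expected:
            return user
    return default   # everyone else only sees public resources
```

Ordering the rules from most specific (fingerprint) to least specific (issuer) keeps a broad rule from shadowing a narrow one.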
7.4. SSL Offload / Acceleration and Application Delivery Controllers (ADC)
SSL Offload, also called SSL Acceleration (really should now be called TLS Offload), means using a firewall or Application Delivery Controller (ADC) to perform the TLS encryption / decryption.
An ADC sits between the server and the client, and can use special rules to direct traffic to the correct server pool. Additionally, ADCs typically own the TLS session between the client and server.
The client (user) connects to the ADC, and then the ADC establishes a separate connection to the server. Some applications require SSL, meaning that the ADC establishes a second, separate SSL session to the server, similar to a Man-In-The-Middle attack!
In this case, there are two SSL connections:
- The “outer” SSL connection is established between the client and the ADC.
- The ADC evaluates the data within the cleartext stream, and routes it accordingly
- The ADC establishes a second, “inner” SSL connection between the ADC and the web server.
ADCs can create high-availability by pooling servers together, acting as a single, large “virtual server”, or can be used to route traffic based on complex rules which evaluate portions of the URL, in order to route traffic for different services or applications to the appropriate server pool.
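The routing half of an ADC can be sketched as “match the URL prefix, then pick the next server in that pool”. This is a toy model under stated assumptions: the class names, URL prefixes, and server names are invented, and a real ADC also handles health checks, session persistence, and the TLS sessions described above.

```python
from itertools import cycle

class ServerPool:
    """A group of servers presented as one 'virtual server' (round robin)."""
    def __init__(self, servers):
        self._next = cycle(servers)

    def pick(self):
        return next(self._next)

class Adc:
    """Routes a decrypted request to a pool based on its URL path."""
    def __init__(self, routes, default_pool):
        # Check the longest prefix first so specific rules win.
        self.routes = sorted(routes.items(), key=lambda item: len(item[0]), reverse=True)
        self.default_pool = default_pool

    def route(self, path):
        for prefix, pool in self.routes:
            if path.startswith(prefix):
                return pool.pick()
        return self.default_pool.pick()

def example_adc():
    """Hypothetical setup: mail and CRM pools, plus a default web pool."""
    return Adc(
        {"/mail": ServerPool(["mail1", "mail2"]), "/crm": ServerPool(["crm1"])},
        ServerPool(["web1", "web2"]),
    )
```

The ADC can only apply these rules because it terminated the “outer” connection and can read the URL in cleartext.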
8. Troubleshooting SSL Issues
8.1. Browser SSL Issues
- Invalid date: The identity certificate or one of the certificates within the trust chain has expired. Check each certificate’s “Valid until” field, using a utility such as the Firefox browser.
- Host name / certificate mismatch: When you type a web address in to your browser, an internet service known as DNS resolves the name to an IP address. The certificate’s subject must match the DNS name. So, if you connect to “bobsdomain.com” and the certificate’s subject is “www.bobsdomain.com”, you will get a cert / hostname mismatch. Instead, use a wildcard cert, or SAN to specify multiple “nicknames” for the certificate.
- Can’t validate trust / Invalid trust chain / untrusted host: Either the intermediate certificate is incorrectly configured on the web server, or the trust chain is broken. Evaluate each certificate in the chain, in order, to validate that each one points to the next, terminating in the CA’s self-signed root cert.
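The date and chain checks above can be walked through with a small model. It is only a sketch: certificates are plain dictionaries, the field names and CA names are invented, and a real validator also verifies each signature cryptographically, which this example skips.

```python
from datetime import datetime, timezone

def make_cert(subject, issuer, start_year, end_year):
    """Build a simplified certificate record (hypothetical field names)."""
    return {
        "subject": subject,
        "issuer": issuer,
        "not_before": datetime(start_year, 1, 1, tzinfo=timezone.utc),
        "not_after": datetime(end_year, 1, 1, tzinfo=timezone.utc),
    }

def validate_chain(chain, trusted_roots, now=None):
    """chain lists the identity cert first and the root last. Returns a verdict."""
    now = now or datetime.now(timezone.utc)
    for cert in chain:
        if not (cert["not_before"] <= now <= cert["not_after"]):
            return f"invalid date: {cert['subject']}"
    # Each certificate must point to the next one in the chain.
    for cert, issuer in zip(chain, chain[1:]):
        if cert["issuer"] != issuer["subject"]:
            return f"broken chain: {cert['subject']} is not issued by {issuer['subject']}"
    root = chain[-1]
    if root["issuer"] != root["subject"]:
        return "broken chain: chain does not end in a self-signed root"
    if root["subject"] not in trusted_roots:
        return f"untrusted root: {root['subject']}"
    return "ok"
```

A web server that forgets to send its intermediate certificate produces the “broken chain” verdict here, which is the most common cause of the third error in the list.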
Use the Qualys SSL Labs server test to help figure out server configuration issues.
One of the most common browser SSL issues occurs when you connect by IP address rather than host name. Either include the IP address as a SAN for the certificate, or use a hosts file on the client, to override the name.
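Here is a simplified model of the name check a client performs, covering the subject, DNS SANs, wildcards, and IP SANs. It is a sketch with assumptions: the dictionary fields `subject`, `san_dns`, and `san_ip` are invented for the example, and real clients apply additional rules.

```python
import ipaddress

def name_matches(pattern, host):
    """Exact match, or a wildcard covering exactly one leading label."""
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        first, _, rest = host.partition(".")
        return bool(first) and rest == pattern[2:]
    return pattern == host

def certificate_covers(cert, target):
    """Does the certificate cover the name or IP address the user typed?"""
    try:
        address = ipaddress.ip_address(target)
    except ValueError:
        address = None
    if address is not None:
        # IP addresses only ever match IP-type SAN entries.
        return any(ipaddress.ip_address(ip) == address for ip in cert.get("san_ip", []))
    names = [cert["subject"], *cert.get("san_dns", [])]
    return any(name_matches(name, target) for name in names)

# Hypothetical certificates: one bare, one with SAN "nicknames".
bare_cert = {"subject": "www.bobsdomain.com"}
san_cert = {
    "subject": "www.bobsdomain.com",
    "san_dns": ["bobsdomain.com", "*.apps.bobsdomain.com"],
    "san_ip": ["192.0.2.10"],
}
```

The bare certificate fails for both `bobsdomain.com` and the IP address, while the certificate with SANs covers both.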
8.2. Server SSL Issues
When you fire up your app server, let’s say it makes a connection to another server, “app1.bobsdomain.com”.
If you’re experiencing errors when you start your server…
- Can’t validate server identity / can’t validate trust: This means that YOUR server doesn’t trust the remote server. Look at your trust keystore, to ensure that the remote server’s root CA certificate exists in there, or load the remote server’s root CA certificate in to your server’s trust keystore. If you STILL experience errors, look around for another keystore! You would be amazed how often there is a “default” keystore that overrides the application’s trust keystore!
- Can’t validate signature. Your application server doesn’t support the HASH ALGORITHM used to sign the server’s certificate. There’s a lot of this going on right now, as trust providers shift from SHA1 to SHA2, as the hash algorithm for signing their certificates. Java (JRE) 1.4 DOES NOT support SHA2 (It doesn’t. REALLY. Ignore what you just Googled. Even JRE 1.4.2_12 DOES NOT support SHA2. I’ve PERSONALLY TESTED THIS. DUDE? WHY are you still trying?) Instead, upgrade to Java (JRE) 1.5, which DOES support SHA2 certificates. Pointing JAVA_HOME from 1.4 to 1.5 is pretty well a UNIVERSAL FIX for this issue.
- ASN.1 parsing error. This is the old way of saying: “Your trust chain is corrupt”. This is an error you might see in your own web server log, if your trust chain isn’t configured properly. Alternately, maybe you’re pointing to the wrong trust database?
- Invalid protocol. SSLv3, TLS 1.0, TLS 1.1, and TLS 1.2 are the protocols. The server and your client don’t support the same set…. at least ONE of these must match. Try ENABLING TLS 1.2 on your CLIENT.
- Invalid cipher (or) Can’t establish connection. The server probably disabled 3DES and RC4 encryption ciphers. Try enabling AES on your client.
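When chasing protocol, cipher, or trust store mismatches, it helps to dump what your own client is actually configured to use. The sketch below does that for a Python client using the standard `ssl` module; the function name and the report layout are invented for the example, and other platforms (such as Java keystores) need their own tools.

```python
import ssl

def client_tls_report(context=None):
    """Summarize the protocols, AES ciphers, and trust store this client uses."""
    context = context or ssl.create_default_context()
    paths = ssl.get_default_verify_paths()
    return {
        # Protocol range the client is willing to negotiate.
        "minimum_version": context.minimum_version.name,
        "maximum_version": context.maximum_version.name,
        # Enabled cipher suites that use AES.
        "aes_ciphers": sorted(
            c["name"] for c in context.get_ciphers() if "AES" in c["name"]
        ),
        # Where the default trusted root certificates are loaded from.
        "trust_store_file": paths.cafile or paths.openssl_cafile,
        "trust_store_dir": paths.capath or paths.openssl_capath,
    }
```

If the trust store path is not the one you expected, you may have found the “default” keystore problem described above.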
PKI, Public Key Encryption, Digital Signing, Certificate Authorities, and the like SEEM like scary concepts, but hopefully, this article was able to provide practical examples and simplified explanations for these concepts.
If I missed something, or if a concept is unclear, please reach out to me via the comments.