Cryptography in Networking
Cryptography is fundamental to modern networking. You will find it in nearly every system.
Cryptography provides three critical capabilities:
-
Encryption: How to prevent third parties from snooping on transmitted data
-
Authentication: How to verify the identity of the sender and/or recipient
-
Integrity: How to prevent transmitted data from tampering
Foundational Concepts
Before talking about the specific network cryptography protocols, it's critical to understand a couple foundational concepts.
Public Key Cryptography
Also called asymmetric cryptography, public key cryptography differs from symmetric cryptography 1 in that there are two mathematically related, but not identical keys used in cryptographic operations (as opposed to one). We use this type of cryptography for two purposes:
-
Encryption: One key encrypts the data and only the other key can decrypt it. 2
-
Digital signatures: One key signs arbitrary data to produce a signature, and another entity can use the other key to verify the signature was produced by the first key. 3
The benefit over symmetric encryption is that one of the two keys can be shared publicly without introducing any vulnerability. This creates an easy way to securely share data between parties without having to coordinate in advance (i.e., connecting to a remote server). 4
There are many algorithms used to construct the keys, but all are in service of the above premise.
For more details, we recommend taking a moment to review these learning resources:
X.509 Certificates
X.509 is a global standard that builds on top of public key cryptography to introduce the concept of certificates.
Quite simply, a certificate is a public key with some additional associated metadata. As cryptographic keys are random data, adding additional metadata can be critical for accomplishing goals like providing identifying information about the keyholder.
X.509 also introduces the concept of certificate authorities (CAs) which can be used to verify the content of the certificate metadata using digital signatures.
This is a complex standard, so we recommend taking a moment to review these brief learning resources:
TLS
TLS is by far the most prevalent use of cryptography in computer networking. It is the protocol that underlies other protocols like HTTPS 5 which you probably use every time you connect to a site over the internet.
We recommend the following learning resources for a brief overview of how the protocol itself works:
Motivation for TLS
When you use the internet, your traffic is routed across many untrusted computers (routers) before finally reaching your destination. Due to the way the internet works, you have no control what route is taken nor any ability to influence the behavior of these intermediate computers.
Consider the scenario when one of these intermediate computers is nefarious. This creates a man-in-the-middle (MITM) attack:
This attacker can do any of the following:
-
Read the data you transmit (e.g., your passwords and bank credentials)
-
Impersonate the computer that you are trying to reach
-
Inject bad data into the messages sent between you and your destination
Due to the severity of this attack, the industry standard for organizations is a zero trust security model where you assume any intermediate computer could be an attacker and implement the appropriate safeguards.
TLS (and similar protocols) are one such safeguard as it prevents MITM attacks by:
-
Verifying the identity of the server sending you data
-
Encrypting the transmitted data
-
Preventing any mutation of the transmitted data
Certificate Authorities
One critical component of TLS is the certificate metadata. A TLS certificate will provide a set of IP addresses and / or domain names which clients use to verify the identity of the server (i.e., the connection is not being intercepted by a MITM attack).
However, how does one know they can trust the metadata in any given certificate? We resolve that problem by using a certificate authority (CA) which digitally signs the certificate (including the metadata).
But how do we know which certificate authorities to trust?
For public sites, there are a handful of CAs that come pre-trusted by all major operating systems. This allows users to immediately open a browser and visit a website without having to make a trust decision. The most popular public CA for developers is Let's Encrypt as it offers certificate signing services for free. Let's Encrypt follows the ACME standard for automating the certificate management process by using one of the domain verification challenge types. 6
However, you should also use TLS on "private" networks. 7 Unfortunately, you cannot use a public CA to sign certificates in this scenario, so you must run your own private CA. 8 You are then responsible for ensuring that every computer in your network trusts your CA by adding the CA's public key to the computer's trust list. 9
Finally, there is the concept of a self-signed certificate which serves as its own CA. Obviously, this provides no additional trust value and is used either for testing purposes or as the root certificate used by a trusted CA.
Mutual TLS (mTLS)
In standard TLS, only the server presents a certificate for identification. This works well when you want any computer to be able to connect to your server. However, in many situations you only want particular clients to be able to connect.
Using mutual TLS, both the client and server present their own certificates for identity verification. This allows the server to decide if it wants to allow the client to connect based on its identity.
This additional layer of security that should be added whenever possible to limit your attack surface area. In Kubernetes, this is typically accomplished via a service mesh.
TLS Versions
TLS has a few versions in popular use: 1.0
, 1.1
, 1.2
, and 1.3
.
The major difference between versions is which cipher suites they use for their underlying cryptography. As time has gone on, vulnerabilities were discovered in ciphers supported by earlier versions. As a result, later versions of TLS support fewer ciphers and only ones with the maximal security.
TLS 1.0 and 1.1 were deprecated in 2021 due to security issues but are still used by websites that haven't received security updates. These should be considered insecure.
TLS 1.2 is by far the most popular version (with 99.9% website support), but can still be configured insecurely.
The latest TLS protocol 1.3 is recommended for most use cases. TLS 1.3 also comes with forward secrecy which greatly mitigates the blast radius should one of its ciphers be proven insecure in the future.
More details about the TLS versions can be found here.
SSH
The next most popular use for asymmetric cryptography that you are likely familiar with is the Secure Shell Protocol (SSH). This is used to connect with servers and can be used by tools like git.
Largely the motivations are identical to that of TLS. However, SSH was historically not built around the X.509 specification, so it has a separate ecosystem of tooling for historical reasons.
In the Panfactum stack, we use the newer X.509 capabilities provided by SSH servers, most of what has been said about TLS applies for SSH in the Panfactum stack.
VPNs
Virtual Private Networks (VPNs) are another popular use of network cryptography that you may be familiar with, especially in a professional setting.
VPNs were invented to solve the problem of securely connecting to an entire local area networks (LAN) over an untrusted intermediary network (such as the internet). This can be used to connect a remote computer to a private network or even to connect multiple private networks to one another.
Problems and Overuse of VPNs
VPNs are often the default choice when IT teams want to connect a worker's computer to an organization's private network. We believe this is an inappropriate use case for the following reasons:
-
A typical VPN setup will allow access to all networked resources in the private network by default. This is an allow-by-default security posture that leads to security vulnerabilities and runs counter to a zero trust architecture.
-
The most popular VPN solutions have massive surface areas due to their number of supported deployment scenarios. This significantly increases the attack surface. For example, here is the vulnerability list for OpenVPN. 10
-
A typical VPN setup will route all network traffic from the remote machine to the private network. This introduces problems:
-
Organizations incur significant costs for non-work traffic routed though metered networks (e.g., AWS). 11
-
Users experience degraded network performance as internet bandwidth is shared across all VPN users.
-
Users sacrifice privacy unnecessarily by allowing their organization to view their internet traffic.
-
While a VPN can be an attractive solution for site-to-site connections, we recommend using more purpose-built software for exposing private network resources to internal stakeholders. Generally, there are three options:
-
Gate private resources individually with a publicly available authenticating proxies such as Oauth2-proxy, ideally using role-based access control and SSO.
-
Provision an overlay networking solution such as ZeroTier which provides more granular control of access to individual resources and traffic routing.
-
Utilize a point-to-point proxy solution such as ssh tunneling which enables users to expose individual private network resources on local computer ports.
Any of these solutions would offer increased security, improved user experience, and reduced costs compared to a VPN.
WireGuard
If you do need to use a VPN for site-to-site connections, we recommend WireGuard. See this presentation for a good overview of its many benefits over the alternatives.
Footnotes
-
Such as encrypting a file with a password ↩
-
A good example would be using
ssh
to connect to a server. ↩ -
A good example would be signing your git commits ↩
-
Hence the name "public key cryptography" ↩
-
HTTP over TLS ↩
-
In the Panfactum stack, we use cert-manager to complete the CA signing process using these standards. ↩
-
"Private" is in quotes here because it is private in the technical sense (i.e., is not accessibly via a public IP) but not in the semantic sense (i.e., guaranteed that no one is intercepting the traffic). Consider infrastructure hosted by a cloud provider. Not only does the organization and its employees have access to your traffic, but so do all the devices facilitating the provider's network, each device having its own manufacturer and exploitable software stack. ↩
-
How to do this varies heavily with your particular operating system and setup. In the Panfactum stack, we accomplish this using the Linkerd2 service mesh and trust-manager. ↩
-
Compare that to the vulnerability list for something like ZeroTier. ↩
-
For some clients, we have seen VPNs and related charges account for over 25% of their entire AWS bill! ↩