AWS Virtual Private Cloud (VPC)
This module configures the following infrastructure resources for a Virtual Private Cloud:
- A VPC
- Subnets with associated CIDR reservations and route tables
- NAT instances with static Elastic IP addresses associated and mapped correctly
- An internet gateway that makes resources with public IPs in the VPC reachable from the internet
- VPC peering, as required, with resources outside the VPC
- Full VPC Flow Logs with appropriate retention and tiering for compliance and cost management
- An S3 Gateway endpoint for free network traffic to and from AWS S3
Usage
CIDR Blocks and Subnets
Choosing the CIDR blocks for your subnets is a critical decision for this module. It is very difficult to change later and will usually require redeploying your entire VPC and all the resources it contains.
We strongly recommend choosing the largest possible CIDR block for your VPC: 10.0.0.0/16 (the default for this module). You want to ensure that you have at least 100 IPs available in each public subnet and 1,000 IPs available in each private / isolated subnet. Choosing a large VPC CIDR gives you the most flexibility. 1
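As a quick sanity check on the sizing guidance above, the following minimal sketch counts addresses per block with Python's standard `ipaddress` module. The helper names and the thresholds dictionary are illustrative and not part of this module. The count follows the convention used in this document's tables (block size minus two); AWS reserves a few more addresses in every subnet, so treat the results as upper bounds.

```python
import ipaddress

# Recommended minimums from the guidance above (per subnet).
RECOMMENDED_MINIMUM_IPS = {"public": 100, "private": 1_000, "isolated": 1_000}


def available_ips(cidr: str) -> int:
    """Block size minus the network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


def is_large_enough(cidr: str, subnet_type: str) -> bool:
    """True if the block meets the recommended minimum for its subnet type."""
    return available_ips(cidr) >= RECOMMENDED_MINIMUM_IPS[subnet_type]


print(available_ips("10.0.0.0/16"))                # 65534
print(is_large_enough("10.0.0.0/24", "public"))    # True
print(is_large_enough("10.0.0.0/24", "private"))   # False
```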
If you use the default CIDR of 10.0.0.0/16, your subnets will be automatically configured as follows: 2 3
SLA Level 1 4
| Name | Type | Availability Zone | CIDR | Available IPs |
|---|---|---|---|---|
| PUBLIC_A | Public | A | 10.0.0.0/24 | 254 |
| PUBLIC_B | Public | B | 10.0.1.0/24 | 254 |
| N/A | Reserved | N/A | 10.0.2.0/24 | 254 |
| PRIVATE_A | Private | A | 10.0.64.0/18 | 16,382 |
| N/A | Reserved | N/A | 10.0.128.0/18 | 16,382 |
| N/A | Reserved | N/A | 10.0.192.0/18 | 16,382 |
| ISOLATED_A | Isolated | A | 10.0.16.0/20 | 4,094 |
| N/A | Reserved | N/A | 10.0.32.0/20 | 4,094 |
| N/A | Reserved | N/A | 10.0.48.0/20 | 4,094 |
| N/A | Reserved | N/A | 10.0.3.0/24 | 254 |
| N/A | Reserved | N/A | 10.0.4.0/22 | 1,022 |
| N/A | Reserved | N/A | 10.0.8.0/21 | 2,046 |
We recommend keeping the Reserved CIDR blocks above in case you want to upgrade your SLA target in the future.
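To see that the SLA Level 1 table above partitions the default VPC cleanly, a layout can be checked for blocks that fall outside the VPC, blocks that overlap, and address space left unallocated. This is a minimal sketch using Python's standard `ipaddress` module; the `check_layout` helper is illustrative and not something this module provides.

```python
import ipaddress
from itertools import combinations

VPC_CIDR = "10.0.0.0/16"

# Every CIDR from the SLA Level 1 table, including the Reserved rows.
SLA_1_BLOCKS = [
    "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24",
    "10.0.4.0/22", "10.0.8.0/21",
    "10.0.16.0/20", "10.0.32.0/20", "10.0.48.0/20",
    "10.0.64.0/18", "10.0.128.0/18", "10.0.192.0/18",
]


def check_layout(vpc_cidr: str, cidrs: list[str]) -> dict:
    """Report blocks outside the VPC, overlapping pairs, and unused addresses."""
    vpc = ipaddress.ip_network(vpc_cidr)
    nets = [ipaddress.ip_network(c) for c in cidrs]
    outside = [str(n) for n in nets if not n.subnet_of(vpc)]
    overlapping = [
        (str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)
    ]
    # Only meaningful when the two lists above are empty.
    unallocated = vpc.num_addresses - sum(n.num_addresses for n in nets)
    return {
        "outside_vpc": outside,
        "overlapping": overlapping,
        "unallocated_addresses": unallocated,
    }


print(check_layout(VPC_CIDR, SLA_1_BLOCKS))
```

For the default layout, all three checks come back empty or zero: the twelve blocks tile the /16 exactly, which is why promoting a Reserved block to a real subnet later does not require touching the existing ones.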
SLA Level 2+
| Name | Type | Availability Zone | CIDR | Available IPs |
|---|---|---|---|---|
| PUBLIC_A | Public | A | 10.0.0.0/24 | 254 |
| PUBLIC_B | Public | B | 10.0.1.0/24 | 254 |
| PUBLIC_C | Public | C | 10.0.2.0/24 | 254 |
| PRIVATE_A | Private | A | 10.0.64.0/18 | 16,382 |
| PRIVATE_B | Private | B | 10.0.128.0/18 | 16,382 |
| PRIVATE_C | Private | C | 10.0.192.0/18 | 16,382 |
| ISOLATED_A | Isolated | A | 10.0.16.0/20 | 4,094 |
| ISOLATED_B | Isolated | B | 10.0.32.0/20 | 4,094 |
| ISOLATED_C | Isolated | C | 10.0.48.0/20 | 4,094 |
| N/A | Reserved | N/A | 10.0.3.0/24 | 254 |
| N/A | Reserved | N/A | 10.0.4.0/22 | 1,022 |
| N/A | Reserved | N/A | 10.0.8.0/21 | 2,046 |
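The layout in the table above can be reproduced by repeatedly splitting the VPC block: quarter it, keep the last three quarters as private subnets, quarter the first quarter again for the isolated subnets, and carve the public subnets out of what remains. The sketch below does this with Python's standard `ipaddress` module. The function name and return shape are illustrative; this is not how the module itself is implemented, only one way to arrive at the same numbers.

```python
import ipaddress


def split_like_default(vpc_cidr: str) -> dict[str, list[str]]:
    """Split a VPC block in the same proportions as the default layout."""
    vpc = ipaddress.ip_network(vpc_cidr)

    # Four equal quarters; the last three become the private subnets.
    quarters = list(vpc.subnets(prefixlen_diff=2))
    private = quarters[1:]

    # Quarter the first quarter; the last three become the isolated subnets.
    sixteenths = list(quarters[0].subnets(prefixlen_diff=2))
    isolated = sixteenths[1:]

    # Cut the first sixteenth into 16 small blocks; the first three are public.
    small = list(sixteenths[0].subnets(prefixlen_diff=4))
    public = small[:3]

    # Merge the leftover small blocks into the fewest possible reserved ranges.
    reserved = list(ipaddress.collapse_addresses(small[3:]))

    return {
        "public": [str(n) for n in public],
        "private": [str(n) for n in private],
        "isolated": [str(n) for n in isolated],
        "reserved": [str(n) for n in reserved],
    }


for subnet_type, cidrs in split_like_default("10.0.0.0/16").items():
    print(subnet_type, cidrs)
```

Applying the same function to a smaller VPC block keeps the proportions but shrinks every subnet, so check the results against the recommended minimum sizes before using them.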
Custom Network Layout
If you are choosing a different network layout, we recommend this site for helping to divide your network.
To configure the network, you will need to manually specify both the `subnets` and `nat_associations` inputs.
You need at least one of each subnet type in at least three availability zones for a highly available deployment (SLA target >= 2).
You need at least two public subnets regardless of the SLA target in order to deploy Panfactum (EKS limitation).
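The two requirements above can be checked before you deploy. The sketch below assumes a simplified, hypothetical record shape (`type` and `az` keys) that is NOT the module's actual `subnets` input schema, and it reads the first requirement as "each subnet type must appear in at least three availability zones".

```python
from collections import defaultdict

SUBNET_TYPES = ("public", "private", "isolated")


def layout_problems(subnets: list[dict], sla_target: int) -> list[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    problems = []

    # Collect the set of availability zones used by each subnet type.
    azs_by_type = defaultdict(set)
    for subnet in subnets:
        azs_by_type[subnet["type"]].add(subnet["az"])

    # Required regardless of the SLA target (EKS limitation).
    public_count = sum(1 for s in subnets if s["type"] == "public")
    if public_count < 2:
        problems.append("need at least two public subnets")

    # Required only for highly available deployments (SLA target >= 2).
    if sla_target >= 2:
        for subnet_type in SUBNET_TYPES:
            if len(azs_by_type[subnet_type]) < 3:
                problems.append(
                    f"{subnet_type}: need subnets in at least three availability zones"
                )
    return problems


# Hypothetical records mirroring the SLA Level 1 default layout.
sla_1_layout = [
    {"type": "public", "az": "a"},
    {"type": "public", "az": "b"},
    {"type": "private", "az": "a"},
    {"type": "isolated", "az": "a"},
]
print(layout_problems(sla_1_layout, sla_target=1))  # []
print(layout_problems(sla_1_layout, sla_target=2))  # one problem per subnet type
```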
Network Address Translation (NAT)
If you are unfamiliar with NAT, you should review the NAT concept documentation.
NAT is the one component of the VPC configuration that we have enhanced beyond the typical AWS-recommended setup.
Specifically, we do NOT use AWS NAT Gateways by default. They are far too expensive for behavior that should ultimately be available for free (and is in other cloud providers). For many organizations, NAT Gateway costs alone can account for 10-50% of total AWS spend.
Instead, we deploy a Panfactum-enhanced version of the fck-nat project. Using this pattern, our module launches self-hosted NAT nodes in EC2 autoscaling groups and reduces the costs of NAT by over 90%.
This setup does come with some limitations:
- Outbound network bandwidth is limited to 5 Gbit/s per AZ (vs 25 Gbit/s for AWS NAT Gateways)
- Outbound network connectivity in each AZ is impacted by the health of a single EC2 node
In practice, these limitations rarely impact an organization, especially as they only impact outbound connections (not inbound traffic):
- If you need > 5 Gbit/s of outbound public internet traffic, you would usually establish a private network tunnel to the destination to improve throughput beyond even 25 Gbit/s.
- The EC2 nodes are extremely stable as NAT only relies on functionality that is native to the Linux kernel (we have never seen a NAT node crash).
- The primary downside is that during NAT node upgrades, outbound network connectivity will be temporarily suspended. This typically manifests as a brief (1-2 min) delay in outbound traffic. Upgrades are typically only necessary every 6 months, so you can still easily achieve high uptime in this configuration.
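To put the upgrade downtime above in perspective, here is a rough back-of-the-envelope calculation. It assumes two upgrades per year (one every 6 months) at the upper end of the 1-2 minute range and no other outages; both the function name and those assumptions are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def outbound_availability(upgrades_per_year: float, minutes_per_upgrade: float) -> float:
    """Fraction of the year during which outbound connectivity is available."""
    downtime_minutes = upgrades_per_year * minutes_per_upgrade
    return 1 - downtime_minutes / MINUTES_PER_YEAR


# Two upgrades per year, two minutes each: four minutes of outbound downtime.
worst_case = outbound_availability(upgrades_per_year=2, minutes_per_upgrade=2)
print(f"{worst_case:.4%}")
```

Under these assumptions, planned upgrades alone leave outbound availability above 99.999% for the year.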
Footnotes
1. If you need to choose a smaller block for some reason (e.g., VPC peering), that is completely fine, but you will want to ensure that it isn't too small. A hard lower limit should be a /19 network, which provides 8,192 (2^13) IP addresses.
2. The public subnets are small because we will only deploy a handful of resources that can be directly reached from the public internet (e.g., load balancers). The private subnets are the largest because that is where the vast majority of the Kubernetes workloads will run.
3. We reserve a few CIDR ranges so that you can provision extra subnets in the future should you need to. This can be an extremely helpful escape hatch that prevents you from needing to mutate existing subnets (causing a service disruption).