Kubernetes StatefulSet
Provides a production-hardened instance of a Kubernetes StatefulSet with the following enhancements:
- Automatic headless service creation
- Standardized resource labels
- Pod and container security hardening
- Persistent volume creation and mounting with automatic integrations with the
- PVC Autoresizer and Velero
- Temporary directory mounting
- ConfigMap and Secret mounting
- Downward-API integrations
- Service account configuration with default permissions
- Integration with the Panfactum bin-packing scheduler
- High-availability scheduling constraints
- Readiness and liveness probe configurations
- Automatic reloading via the Reloader
- Vertical pod autoscaling
- Pod disruption budget
- Toleration switches for the various Panfactum node classes
Usage
Basics
This module follows the basic workload deployment patterns describe in this guide.
Horizontal Autoscaling
By default, this module does not have horizontal autoscaling built-in. If you wish
to add horizontal autoscaling via the HPA
(or similar controller), you should set ignore_replica_count
to true
to prevent
this module from overriding the replica count set via horizontal autoscaling.
Persistence
One of the core use cases for a StatefulSet is the ability to persistent data across pod restarts through the use of Persistent Volume Claims (PVCs).
You can configure the StatefulSet's PVCs via the volume_mounts
input. This input
is a map of names (arbitrary) to configuration values for each volume that should mounted to every
pod in the StatefulSet.
The configuration values are as follows:
storage_class
: The Storage Class to use for the volume. To learn more about the available storage class options, see our guide.initial_size_gb
: The size of the volume when it is first created.increase_gb
: How much the volume will grow every time it is autoscaled by the PVC autoresizer.increase_threshold_percent
: When free storage drops below this percent on the volume, the volume will be autoscaled.size_limit_gb
: The maximum size the volume is allowed to grow to.mount_path
: Absolute path inside each container that the volume is mounted to.backups_enabled
: Whether the PVC snapshots will be created when Velero backups are created (by default hourly).
You can configure the retention policy
of the volumes through the volume_retention_policy
input.
Providers
The following providers are needed by this module:
-
kubectl (2.1.3)
-
kubernetes (2.34.0)
-
pf (0.0.7)
-
random (3.6.3)
-
time (0.10.0)
Required Inputs
The following input variables are required:
containers
Description: A list of container configurations for the pod
Type:
list(object({
name = string # A unique name for the container within the pod
init = optional(bool, false) # Iff true, the container will be an init container
image_registry = string # The URL for a container image registry (e.g., docker.io)
image_repository = string # The path to the image repository within the registry (e.g., library/nginx)
image_tag = string # The tag for a specific image within the repository (e.g., 1.27.1)
image_prepull_enabled = optional(bool, true) # Whether the image will be prepulled to nodes when the nodes are first created (speeds up startup times)
image_pin_enabled = optional(bool, true) # Whether the image should be pinned to every node regardless of whether the container is running or not (speeds up startup times)
command = list(string) # The command to be run as the root process inside the container
working_dir = optional(string, null) # The directory the command will be run in. If left null, will default to the working directory set by the image
image_pull_policy = optional(string, "IfNotPresent") # Sets the container's ImagePullPolicy
minimum_memory = optional(number, 100) #The minimum amount of memory in megabytes
maximum_memory = optional(number, null) #The maximum amount of memory in megabytes
memory_limit_multiplier = optional(number, 1.3) # memory limits = memory request x this value
minimum_cpu = optional(number, 10) # The minimum amount of cpu millicores
maximum_cpu = optional(number, null) # The maximum amount of cpu to allow (in millicores)
privileged = optional(bool, false) # Whether to allow the container to run in privileged mode
run_as_root = optional(bool, false) # Whether to run the container as root
uid = optional(number, 1000) # user to use when running the container if not root
linux_capabilities = optional(list(string), []) # Default is drop ALL
read_only = optional(bool, true) # Whether to use a readonly file system
env = optional(map(string), {}) # Environment variables specific to the container
liveness_probe_command = optional(list(string), null) # Will run the specified command as the liveness probe if type is exec
liveness_probe_port = optional(number, null) # The number of the port for the liveness_probe
liveness_probe_type = optional(string, null) # Either exec, HTTP, or TCP
liveness_probe_route = optional(string, null) # The route if using HTTP liveness_probes
liveness_probe_scheme = optional(string, "HTTP") # HTTP or HTTPS
readiness_probe_command = optional(list(string), null) # Will run the specified command as the ready check probe if type is exec (default to liveness_probe_command)
readiness_probe_port = optional(number, null) # The number of the port for the ready check (default to liveness_probe_port)
readiness_probe_type = optional(string, null) # Either exec, HTTP, or TCP (default to liveness_probe_type)
readiness_probe_route = optional(string, null) # The route if using HTTP ready checks (default to liveness_probe_route)
readiness_probe_scheme = optional(string, null) # Whether to use HTTP or HTTPS (default to liveness_probe_scheme)
ports = optional(map(object({ # Keys are the port names, and the values are the port configuration.
port = number # Port on the backing pods that traffic should be routed to
service_port = optional(number, null) # Port to expose on the service. defaults to port
protocol = optional(string, "TCP") # One of TCP, UDP, or SCTP
expose_on_service = optional(bool, true) # Whether this port should be listed on the StatefulSet's service
})), {})
}))
name
Description: The name of this workload
Type: string
namespace
Description: The namespace the cluster is in
Type: string
volume_mounts
Description: A mapping of names to configuration for PersistentVolumeClaims used by the StatefulSet
Type:
map(object({
storage_class = optional(string, "ebs-standard-retained")
access_modes = optional(list(string), ["ReadWriteOnce"])
initial_size_gb = optional(number, 1) # The initial size of the volume when first created
size_limit_gb = optional(number, null) # The maximum number of GB that this volume will scale to
increase_threshold_percent = optional(number, 20) # Dropping below this percent of free storage will trigger an automatic increase in storage size
increase_gb = optional(number, 1) # The number of GB to increase the volume by when it needs to scale up
mount_path = string # Where in the containers to mount the volume
backups_enabled = optional(bool, true) # True iff velero should make snapshot backups of the volumes
}))
Optional Inputs
The following input variables are optional (have default values):
arm_nodes_enabled
Description: Whether to allow pods to schedule on arm64 nodes
Type: bool
Default: true
az_anti_affinity_required
Description: Whether to prevent pods from being scheduled on the same availability zone
Type: bool
Default: false
az_spread_preferred
Description: Whether to enable topology spread constraints to spread pods across availability zones (with ScheduleAnyways)
Type: bool
Default: false
az_spread_required
Description: Whether to enable topology spread constraints to spread pods across availability zones (with DoNotSchedule)
Type: bool
Default: true
burstable_nodes_enabled
Description: Whether to allow pods to schedule on burstable nodes
Type: bool
Default: false
cilium_required
Description: True iff the Cilium CNI is required to be installed on a node prior to scheduling on it
Type: bool
Default: true
common_env
Description: Key pair values of the environment variables for each container
Type: map(string)
Default: {}
common_env_from_config_maps
Description: Environment variables that are sourced from existing Kubernetes ConfigMaps. The keys are the environment variables names and the values are the ConfigMap references.
Type:
map(object({
config_map_name = string
key = string
}))
Default: {}
common_env_from_secrets
Description: Environment variables that are sourced from existing Kubernetes Secrets. The keys are the environment variables names and the values are the Secret references.
Type:
map(object({
secret_name = string
key = string
}))
Default: {}
common_secrets
Description: Key pair values of secrets to add to the containers as environment variables
Type: map(string)
Default: {}
config_map_mounts
Description: A mapping of ConfigMap names to their mount configuration in the containers of the Pod
Type:
map(object({
mount_path = string # Where in the containers to mount the ConfigMap
optional = optional(bool, false) # Whether the pod can launch if this ConfigMap does not exist
}))
Default: {}
controller_nodes_enabled
Description: Whether to allow pods to schedule on EKS Node Group nodes (controller nodes)
Type: bool
Default: false
controller_nodes_required
Description: Whether the pods must be scheduled on a controller node
Type: bool
Default: false
dns_policy
Description: The DNS policy for the pods
Type: string
Default: "ClusterFirst"
extra_annotations
Description: A map of extra annotations that will be added to the StatefulSet (not the pods)
Type: map(string)
Default: {}
extra_labels
Description: A map of extra labels that will be added to the StatefulSet (not the pods)
Type: map(string)
Default: {}
extra_pod_annotations
Description: Annotations to add to the pods in the Pod
Type: map(string)
Default: {}
extra_pod_labels
Description: Extra pod labels to use
Type: map(string)
Default: {}
extra_tolerations
Description: Extra tolerations to add to the pods
Type:
list(object({
key = optional(string)
operator = string
value = optional(string)
effect = optional(string)
}))
Default: []
host_anti_affinity_required
Description: Whether to prefer preventing pods from being scheduled on the same host
Type: bool
Default: true
ignore_replica_count
Description: Whether to ignore changes to the replica count. Useful when implementing horizontal autoscaling.
Type: bool
Default: false
instance_type_anti_affinity_required
Description: Whether to enable anti-affinity to prevent pods from being scheduled on the same instance type. Defaults to true iff sla_target == 3.
Type: bool
Default: null
linkerd_enabled
Description: True iff the Linkerd sidecar should be injected into the pods
Type: bool
Default: true
linkerd_required
Description: True iff the Linkerd CNI is required to be installed on a node prior to scheduling on it
Type: bool
Default: true
max_unavailable
Description: Controls how many pods are allowed to be unavailable in the StatefulSet under the Pod Disruption Budget
Type: number
Default: 1
mount_owner
Description: The ID of the group that owns the mounted volumes
Type: number
Default: 1000
node_image_cached_enabled
Description: Whether to add the container images to the node image cache for faster startup times
Type: bool
Default: true
node_preferences
Description: Node label preferences for the pods
Type: map(object({ weight = number, operator = string, values = list(string) }))
Default: {}
node_requirements
Description: Node label requirements for the pods
Type: map(list(string))
Default: {}
panfactum_scheduler_enabled
Description: Whether to use the Panfactum pod scheduler with enhanced bin-packing
Type: bool
Default: true
pod_management_policy
Description: The StatefulSets pod management policy
Type: string
Default: "OrderedReady"
pod_version_labels_enabled
Description: Whether to add version labels to the Pod. Useful for ensuring pods do not get recreated on frequent updates.
Type: bool
Default: true
priority_class_name
Description: The priority class to use for pods in the StatefulSet
Type: string
Default: null
pull_through_cache_enabled
Description: Whether to use the ECR pull through cache for the deployed images
Type: bool
Default: true
replicas
Description: The desired number of pods in the StatefulSet
Type: number
Default: 1
restart_policy
Description: The pod restart policy
Type: string
Default: "Always"
secret_mounts
Description: A mapping of Secret names to their mount configuration in the containers of the Pod
Type:
map(object({
mount_path = string # Where in the containers to mount the Secret
optional = optional(bool, false) # Whether the pod can launch if this Secret does not exist
}))
Default: {}
service_ip
Description: If provided, the StatefulSet's service will be statically bound to this IP address. Must be within the Service IP CIDR range for the cluster.
Type: string
Default: null
service_load_balancer_class
Description: Iff service_type == LoadBalancer, the loadBalancerClass to use.
Type: string
Default: "service.k8s.aws/nlb"
service_name
Description: If provided, the StatefulSet's service will have this name. If not provided, will default to name.
Type: string
Default: null
service_public_domain_names
Description: Iff service_type == LoadBalancer, the public domains names that this service will be accessible from.
Type: list(string)
Default: []
service_type
Description: The type of the StatefulSet's Service.
Type: string
Default: "ClusterIP"
spot_nodes_enabled
Description: Whether to allow pods to schedule on spot nodes
Type: bool
Default: true
termination_grace_period_seconds
Description: The number of seconds to wait for graceful termination before forcing termination
Type: number
Default: 30
tmp_directories
Description: A mapping of temporary directory names (arbitrary) to their configuration
Type:
map(object({
mount_path = string # Where in the containers to mount the temporary directories
size_mb = optional(number, 100) # The number of MB to allocate for the directory
node_local = optional(bool, false) # If true, the temporary storage will come from the node rather than a PVC
}))
Default: {}
unhealthy_pod_eviction_policy
Description: Whether to allow unhealthy pods to be evicted. See https://kubernetes.io/docs/tasks/run-application/configure-pdb/#unhealthy-pod-eviction-policy.
Type: string
Default: "AlwaysAllow"
update_type
Description: The type of update that the StatefulSEt should use
Type: string
Default: "RollingUpdate"
volume_retention_policy
Description: The persistentVolumeClaimRetentionPolicy to use of the StatefulSet
Type:
object({
when_deleted = optional(string, "Retain")
when_scaled = optional(string, "Retain")
})
Default:
{
"when_deleted": "Retain",
"when_scaled": "Retain"
}
voluntary_disruption_window_cron_schedule
Description: The times when disruption windows should start
Type: string
Default: "0 0/4 * * *"
voluntary_disruption_window_enabled
Description: Whether to confine voluntary disruptions of pods in this module to specific time windows
Type: bool
Default: false
voluntary_disruption_window_seconds
Description: The length of the disruption window in seconds
Type: number
Default: 900
voluntary_disruptions_enabled
Description: Whether to enable voluntary disruptions of pods in this module.
Type: bool
Default: true
vpa_enabled
Description: Whether to enable the vertical pod autoscaler
Type: bool
Default: true
Outputs
The following outputs are exported:
headless_service_name
Description: The name of the headless service where StatefulSet pods are registered
labels
Description: The labels assigned to all resources in this deployment
match_labels
Description: The labels unique to this deployment that can be used to select the pods in this deployment
service_account_name
Description: The service account used for the pods
service_name
Description: The name of the service for the deployment
Usage
No notes