Redis with Sentinel
This module deploys a highly-available set of Redis nodes.
The nodes are deployed in a single-master, multi-replica configuration. Failover is handled by Redis Sentinel, which is also deployed by this module.
Usage
Credentials
For in-cluster applications, credentials can be sourced from the following Kubernetes Secrets named in the module's outputs:
- superuser_creds_secret: Complete access to the database
- admin_creds_secret: Read and write access to the database (does not include the ability to perform sensitive operations like schema or permission manipulation)
- reader_creds_secret: Read-only access to the database
Each of the above named Secrets contains the following values:
- username: The username to use for authentication
- password: The password to use for authentication
The credentials in each Secret are managed by Vault and rotated automatically before they expire. In the Panfactum Stack, credential rotation will automatically trigger a pod restart for pods that reference the credentials.
The credential lifetime is configured by the vault_credential_lifetime_hours input (defaults to 16 hours). Credentials are rotated 50% of the way through their lifetime. Thus, in the worst case, credentials that a pod receives are valid for vault_credential_lifetime_hours / 2.
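As a rough illustration of the arithmetic above, the sketch below raises the lifetime to 24 hours. The namespace value is a placeholder chosen for this example; only vault_credential_lifetime_hours and namespace are taken from this module's documented inputs.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Credentials are issued with a 24-hour lifetime and rotated at the
  # 50% mark (12 hours). In the worst case, the credentials a pod
  # receives remain valid for 24 / 2 = 12 hours.
  vault_credential_lifetime_hours = 24
}
```

A longer lifetime means less frequent rotation (and fewer rotation-triggered pod restarts) at the cost of a longer validity window for any leaked credential.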
Connecting
The example below shows how to connect to the Redis master using dynamically rotated admin credentials by setting various environment variables in our kube_deployment module.
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"
  ...
}

module "deployment" {
  source = "${var.pf_module_source}kube_deployment${var.pf_module_ref}"
  ...

  common_env_from_secrets = {
    REDIS_USERNAME = {
      secret_name = module.redis.admin_creds_secret
      key         = "username"
    }
    REDIS_PASSWORD = {
      secret_name = module.redis.admin_creds_secret
      key         = "password"
    }
  }

  common_env = {
    REDIS_HOST = module.redis.redis_master_host
    REDIS_PORT = module.redis.redis_port
  }
}
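Sentinel-aware clients discover the current master through the Sentinels rather than through a fixed master address. A minimal sketch of passing the relevant module outputs to a workload might look like the following. It assumes a module "redis" block declared as in the example above; the environment variable names are hypothetical and should be replaced with whatever your client library expects.

```hcl
module "deployment" {
  source = "${var.pf_module_source}kube_deployment${var.pf_module_ref}"

  # Other kube_deployment inputs (including the credential variables
  # from the previous example) are omitted for brevity.

  common_env = {
    # Hypothetical variable names, chosen for illustration only
    REDIS_SENTINEL_HOST = module.redis.redis_sentinel_host # service address of the Sentinels
    REDIS_SENTINEL_PORT = module.redis.redis_sentinel_port # port the Sentinels listen on
    REDIS_MASTER_SET    = module.redis.master_set          # master set name to query for
  }
}
```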
Persistence
Redis provides two mechanisms for persistence: AOF and RDB.
This module uses RDB by default (tuned via redis_save).
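For reference, the Redis save directive takes a "seconds changes" pair: a snapshot is written once at least that many keys have changed within that many seconds. The default of "300 100" therefore snapshots after 300 seconds if at least 100 keys changed. The sketch below assumes the module passes the redis_save string through to that directive unchanged; the namespace value is a placeholder.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Snapshot after 60 seconds if at least 1000 keys changed.
  # More frequent snapshots reduce potential data loss after a full
  # outage but increase disk I/O.
  redis_save = "60 1000"
}
```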
Using AOF (whether independently or concurrently with RDB) negates the ability to do partial resynchronizations after restarts and failovers. Instead, a full copy of the database must be transferred from the current master to restarted or new replicas. This greatly increases the time to recover and incurs a high network cost. In fact, there is arguably no benefit to AOF-based persistence at all with our replicated architecture, as new Redis nodes will always pull their data from the running master, not from their local AOF. The only benefit would be if all Redis nodes simultaneously failed with a non-graceful shutdown (an incredibly unlikely scenario).
Persistence is always enabled in this module for similar reasons. Without persistence, an entire copy of the database would have to be transferred from the master to each replica on every Redis node restart. The cost of storing data on disk is far less than the network costs associated with this transfer. Moreover, persistence should never impact performance as writes are completed asynchronously unless configured otherwise.
Once the Redis cluster is running, the PVC autoresizer (provided by kube_pvc_autoresizer) will automatically expand the EBS volumes once the free space drops below persistence_storage_increase_threshold_percent of the current EBS volume size.
The size of the EBS volume will grow by persistence_storage_increase_gb on every scaling event until a maximum of persistence_storage_limit_gb.
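To make the resizing behavior concrete, here is a sketch with illustrative values. The input names come from this module's documented inputs; the specific numbers and the namespace are assumptions chosen for the example.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Start each node with a 10 GB volume (cannot be changed after creation).
  persistence_size_gb = 10

  # Expand when free space drops below 20% of the current volume size,
  # i.e. below 2 GB on a 10 GB volume, below 3 GB on a 15 GB volume, etc.
  persistence_storage_increase_threshold_percent = 20

  # Each scaling event adds 5 GB: 10 -> 15 -> 20 -> ...
  persistence_storage_increase_gb = 5

  # Never grow a node's volume beyond 50 GB.
  persistence_storage_limit_gb = 50
}
```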
Disruptions
By default, failovers of Redis pods in this module can be initiated at any time. This enables the cluster to automatically perform maintenance operations such as instance resizing, AZ re-balancing, version upgrades, etc. However, every time a Redis pod is disrupted, a short period of downtime might occur if the disrupted pod is the master instance.
While this can generally be mitigated when using a Sentinel-aware client, you may want to provide more control over when these failovers can occur, so we provide the following options:
Disruption Windows
Disruption windows provide the ability to confine disruptions to specific time intervals (e.g., periods of low load) if this is needed to meet your stability goals. You can enable this feature by setting voluntary_disruption_window_enabled to true.
The start of each disruption window is scheduled via voluntary_disruption_window_cron_schedule, and the length of each window is set via voluntary_disruption_window_seconds.
If you use this feature, we strongly recommend that you allow disruptions at least once per day, and ideally more frequently.
For more information on how this works, see the kube_disruption_window_controller submodule.
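A minimal sketch of a disruption window configuration follows. It assumes the schedule uses standard five-field cron syntax, as the default value "0 4 * * *" suggests; the namespace and the chosen schedule are illustrative only.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Only allow voluntary disruptions inside scheduled windows.
  voluntary_disruption_window_enabled = true

  # Open a window at minute 0 of every sixth hour (four times per day),
  # which satisfies the "at least once per day" recommendation.
  voluntary_disruption_window_cron_schedule = "0 */6 * * *"

  # Keep each window open for 30 minutes.
  voluntary_disruption_window_seconds = 1800
}
```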
Custom PDBs
Rather than time-based disruption windows, you may want more granular control of when disruptions are allowed and disallowed.
You can do this by managing your own PodDisruptionBudgets. This module provides outputs that will allow you to match certain subsets of Redis pods.
For example:
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"
  ...
}

resource "kubectl_manifest" "pdb" {
  yaml_body = yamlencode({
    apiVersion = "policy/v1"
    kind       = "PodDisruptionBudget"
    metadata = {
      name      = "custom-pdb"
      namespace = module.redis.namespace
    }
    spec = {
      unhealthyPodEvictionPolicy = "AlwaysAllow"
      selector = {
        matchLabels = module.redis.match_labels_master # Selects only the Redis master (writable) pod
      }
      maxUnavailable = 0 # Prevents any disruptions
    }
  })
  force_conflicts   = true
  server_side_apply = true
}
While this example is constructed via IaC, you can also create / destroy these PDBs directly in your application logic via YAML manifests and the Kubernetes API. This would allow you to create a PDB prior to initiating a long-running operation that you do not want disrupted and then delete it upon completion.
Completely Disabling Voluntary Disruptions
Allowing the cluster to periodically initiate failovers of Redis is critical to maintaining system health. However, there are rare cases where you want to override the safe behavior and disable voluntary disruptions altogether. Setting voluntary_disruptions_enabled to false will set up PDBs that disallow any voluntary disruption of any Redis pod in this module.
This is strongly discouraged. If limiting any and all potential disruptions is of primary importance you should instead:
- Create a one-hour weekly disruption window to allow some opportunity for automatic maintenance operations
- Ensure that spot_nodes_enabled and burstable_nodes_enabled are both set to false
Note that the above configuration will significantly increase the costs of running the Redis cluster (2.5-5x) versus more flexible settings. In the vast majority of cases, this is entirely unnecessary, so this should only be used as a last resort.
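The recommended alternative above could be sketched roughly as follows. The input names are taken from this module's input list (spot_nodes_enabled and burstable_nodes_enabled); the namespace and the choice of Sunday at 04:00 are assumptions made for this example, and the schedule is assumed to use standard five-field cron syntax.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Leave voluntary disruptions enabled, but confine them to a single
  # one-hour window per week so that automatic maintenance can still run.
  voluntary_disruptions_enabled             = true
  voluntary_disruption_window_enabled       = true
  voluntary_disruption_window_cron_schedule = "0 4 * * 0" # 04:00 every Sunday
  voluntary_disruption_window_seconds       = 3600        # one hour

  # Avoid node types that are more likely to be reclaimed or throttled.
  spot_nodes_enabled      = false
  burstable_nodes_enabled = false
}
```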
Extra Redis Configuration
You can add extra Redis configuration flags via the redis_flags module variable.
These flags are passed as command-line arguments to the Redis servers, which ensures they take the highest precedence.
For more information about passing flags through the command line and the available options, see this documentation.
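As a hedged sketch, extra flags might be supplied like this. The exact element format expected by redis_flags is an assumption here: each list entry is written as one "--option value" string, mirroring how the option would appear on the redis-server command line. The two options shown are standard Redis configuration directives; the values and the namespace are illustrative.

```hcl
module "redis" {
  source = "${var.pf_module_source}kube_redis_sentinel${var.pf_module_ref}"

  namespace = "example" # placeholder value, not from the original docs

  # Assumed format: one "--option value" string per list element.
  redis_flags = [
    "--slowlog-log-slower-than 5000", # log commands slower than 5000 microseconds
    "--tcp-keepalive 60"              # send TCP keepalives to clients every 60 seconds
  ]
}
```

Because command-line arguments override other configuration sources, take care not to set options here that the module already manages through its own inputs.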
Providers
The following providers are needed by this module:
Required Inputs
The following input variables are required:
namespace
Description: The namespace to deploy the Redis instances into
Type: string
Optional Inputs
The following input variables are optional (have default values):
arm_nodes_enabled
Description: Whether the database pods can be scheduled on arm64 nodes
Type: bool
Default: true
burstable_nodes_enabled
Description: Whether the database pods can be scheduled on burstable nodes
Type: bool
Default: false
controller_nodes_enabled
Description: Whether to allow pods to schedule on EKS Node Group nodes (controller nodes)
Type: bool
Default: false
creds_syncer_logging_enabled
Description: Whether to enable logging for the creds-syncer pods
Type: bool
Default: false
disabled_commands
Description: Commands that are disabled in Redis. This can be used to provide global protection against unsafe commands.
Type: list(string)
Default:
[
"FLUSHDB",
"FLUSHALL"
]
helm_version
Description: The version of the bitnami/redis helm chart to use
Type: string
Default: "20.5.0"
instance_type_anti_affinity_required
Description: Whether to enable anti-affinity to prevent pods from being scheduled on the same instance type. Defaults to true iff sla_target == 3.
Type: bool
Default: null
lfu_cache_enabled
Description: Whether redis will be deployed as an LFU cache
Type: bool
Default: false
minimum_memory_mb
Description: The minimum memory in MB to use for the Redis nodes
Type: number
Default: 25
monitoring_enabled
Description: Whether to allow monitoring CRs to be deployed in the namespace
Type: bool
Default: false
node_image_cached_enabled
Description: Whether to add the container images to the node image cache for faster startup times
Type: bool
Default: true
panfactum_scheduler_enabled
Description: Whether to use the Panfactum pod scheduler with enhanced bin-packing
Type: bool
Default: true
persistence_backups_enabled
Description: Whether to enable backups of the Redis durable storage.
Type: bool
Default: true
persistence_size_gb
Description: How many GB to initially allocate for persistent storage (will grow automatically as needed). Cannot be changed after cluster creation.
Type: number
Default: 1
persistence_storage_increase_gb
Description: The number of GB to increase storage by if free space drops below the threshold
Type: number
Default: 1
persistence_storage_increase_threshold_percent
Description: Dropping below this percent of free storage will trigger an automatic increase in storage size
Type: number
Default: 20
persistence_storage_limit_gb
Description: The maximum number of gigabytes of storage to provision for each redis node
Type: number
Default: null
pull_through_cache_enabled
Description: Whether to use the ECR pull through cache for the deployed images
Type: bool
Default: false
redis_flags
Description: Extra configuration flags to pass to each redis node
Type: list(string)
Default: []
redis_save
Description: Sets the save option for periodic snapshotting
Type: string
Default: "300 100"
replica_count
Description: The number of redis replicas to deploy
Type: number
Default: 3
spot_nodes_enabled
Description: Whether the database pods can be scheduled on spot nodes
Type: bool
Default: true
vault_credential_lifetime_hours
Description: The lifetime of database credentials generated by Vault
Type: number
Default: 16
voluntary_disruption_window_cron_schedule
Description: The times when disruption windows should start
Type: string
Default: "0 4 * * *"
voluntary_disruption_window_enabled
Description: Whether to confine voluntary disruptions of pods in this module to specific time windows
Type: bool
Default: false
voluntary_disruption_window_seconds
Description: The length of the disruption window in seconds
Type: number
Default: 3600
voluntary_disruptions_enabled
Description: Whether to enable voluntary disruptions of pods in this module.
Type: bool
Default: true
vpa_enabled
Description: Whether the VPA resources should be enabled
Type: bool
Default: true
Outputs
The following outputs are exported:
admin_creds_secret
Description: The name of the Kubernetes Secret holding credentials for the admin role in the Redis database
admin_role
Description: The Vault role used to get admin credentials for the created Redis cluster
master_set
Description: The value for the master set to use when configuring Sentinel-aware Redis clients
match_labels
Description: A label selector that matches all Redis pods in the cluster
match_labels_master
Description: A label selector that matches only the Redis master pod in the cluster
namespace
Description: Kubernetes namespace where module resources are created
reader_creds_secret
Description: The name of the Kubernetes Secret holding credentials for the reader role in the Redis database
reader_role
Description: The Vault role used to get read-only credentials for the created Redis cluster
redis_host
Description: A service address that points to all Redis nodes
redis_host_list
Description: A list of domain names for every Redis pod in the cluster
redis_master_host
Description: A service address that points to only the writable redis master
redis_port
Description: The port that the Redis servers listen on
redis_sentinel_host
Description: A service address that points to the Redis Sentinels
redis_sentinel_port
Description: The port that the Sentinel servers listen on
root_name
Description: The name of the root user of the database
root_password
Description: The password for root user of the database
superuser_creds_secret
Description: The name of the Kubernetes Secret holding credentials for the superuser role in the Redis database
superuser_role
Description: The Vault role used to get superuser credentials for the created Redis cluster