kube_nats
Beta
Submodule

NATS JetStream

This module deploys a highly available NATS cluster running in JetStream mode.

Usage

Credentials

For in-cluster applications, credentials can be sourced from the following Kubernetes Secrets named in the module's outputs:

  • superuser_creds_secret: Full access to the system account (does not have access to normal data, so should generally be avoided)
  • admin_creds_secret: Read and write access to the data streams
  • reader_creds_secret: Read-only access to the data streams

Clients authenticate with NATS by presenting a TLS client certificate. Each of the above named Secrets contains the following values:

  • ca.crt: The CA certificate used to verify the server-provided certificate.
  • tls.crt: The certificate that the NATS client should provide to the server for authentication.
  • tls.key: The TLS key that the NATS client will use for securing communications with the server.

The credentials in each Secret are managed by Vault and rotated automatically before they expire. In the Panfactum Stack, credential rotation will automatically trigger a pod restart for pods that reference the credentials.

The credential lifetime is configured by the vault_credential_lifetime_hours input (defaults to 16 hours). Credentials are rotated 50% of the way through their lifetime. Thus, in the worst case, the credentials that a pod receives are valid for only vault_credential_lifetime_hours / 2 hours (8 hours with the default).
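
As a minimal sketch of the arithmetic above, the fragment below raises the credential lifetime to 24 hours. With rotation at the halfway point, credentials would be rotated after roughly 12 hours, and the credentials a pod receives would be valid for at least 12 hours. The namespace value and the 24-hour figure are illustrative assumptions, not recommendations.

```hcl
module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"

  # Hypothetical namespace, chosen only for this example
  namespace = "example-nats"

  # Lifetime of 24 hours: rotation at ~12 hours,
  # worst-case remaining validity of 24 / 2 = 12 hours
  vault_credential_lifetime_hours = 24
}
```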

Connecting

The example below shows how to connect to the NATS cluster using dynamically rotated admin credentials by mounting the client certificates in our kube_deployment module.

module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"
  ...
}

module "deployment" {
  source = "${var.pf_module_source}kube_deployment${var.pf_module_ref}"
  ...
  secret_mounts = {
    "${module.nats.admin_creds_secret}" = {
      mount_path = "/etc/nats-certs"
    }
  }
  common_env = {
    NATS_HOST = module.nats.host
    NATS_PORT = module.nats.client_port
    
    # It is not strictly necessary to set these, but these are used by the NATS CLI,
    # so it can be helpful to have these set in case you need to debug.
    NATS_URL = "tls://${module.nats.host}:${module.nats.client_port}"
    NATS_KEY = "/etc/nats-certs/tls.key"
    NATS_CERT = "/etc/nats-certs/tls.crt"
    NATS_CA = "/etc/nats-certs/ca.crt"
  }
}

Note that you also must configure the client to use the certificates. For example, if using the nats NPM package:

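import { connect } from "nats";

// Connect over TLS, presenting the client certificate mounted from the Secret.
const nc = await connect({
  // `servers` accepts "host:port" strings. The separate `port` option is
  // documented as mutually exclusive with `servers`, so it is not used here.
  servers: `${process.env.NATS_HOST}:${process.env.NATS_PORT}`,
  tls: {
    keyFile: process.env.NATS_KEY,   // client private key (tls.key)
    certFile: process.env.NATS_CERT, // client certificate (tls.crt)
    caFile: process.env.NATS_CA,     // CA used to verify the server (ca.crt)
  },
});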

Persistence

With NATS JetStream, persistence is configured via Streams, which allow you to control how messages are stored and what the retention limits are. You can have many different streams on the same NATS instance.

This module only creates the NATS server, not the internal streams. Your services should perform any necessary stream setup before launching (similar to database migrations for other data stores).

That said, there are a few global storage settings to be aware of when first creating the cluster:

  • persistence_initial_storage_gb (cannot be changed after NATS cluster creation)
  • persistence_storage_limit_gb
  • persistence_storage_increase_threshold_percent
  • persistence_storage_increase_gb
  • persistence_storage_class_name (cannot be changed after NATS cluster creation)

Once the NATS cluster is running, the PVC autoresizer (provided by kube_pvc_autoresizer) will automatically expand the EBS volumes once the free space drops below persistence_storage_increase_threshold_percent of the current EBS volume size. The size of the EBS volume will grow by persistence_storage_increase_gb on every scaling event until it reaches a maximum of persistence_storage_limit_gb.
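
To make the resizing behavior concrete, here is a sketch with illustrative values (the numbers and the namespace are assumptions for this example only). Each volume would start at 10 GB. Once free space falls below 20% of the current size (under 2 GB on a 10 GB volume), the autoresizer would grow the volume by 5 GB to 15 GB, and repeat on later scaling events until the 50 GB limit is reached.

```hcl
module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"

  # Hypothetical namespace, chosen only for this example
  namespace = "example-nats"

  # Can only be set on cluster creation
  persistence_initial_storage_gb = 10
  persistence_storage_class_name = "ebs-standard-retained" # the module default

  # Expand when free space drops below 20% of the current volume size
  persistence_storage_increase_threshold_percent = 20

  # Grow by 5 GB per scaling event: 10 -> 15 -> 20 -> ... -> 50
  persistence_storage_increase_gb = 5

  # Never grow a volume beyond 50 GB
  persistence_storage_limit_gb = 50
}
```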

Disruptions

By default, shutdown of NATS pods in this module can be initiated at any time. This enables the cluster to automatically perform maintenance operations such as instance resizing, AZ re-balancing, version upgrades, etc. However, every time a NATS pod is disrupted, clients connected to that instance will need to re-establish a connection with the NATS cluster.

While this generally does not cause issues, you may want to provide more control over when these failovers can occur, so we provide the following options:

Disruption Windows

Disruption windows provide the ability to confine disruptions to specific time intervals (e.g., periods of low load) if this is needed to meet your stability goals. You can enable this feature by setting voluntary_disruption_window_enabled to true.

The start of each disruption window is scheduled via voluntary_disruption_window_cron_schedule, and the length of each window is set via voluntary_disruption_window_seconds.

If you use this feature, we strongly recommend that you allow disruptions at least once per day, and ideally more frequently.

For more information on how this works, see the kube_disruption_window_controller submodule.
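
A minimal sketch of a disruption window configuration follows. It assumes a standard five-field cron expression and opens two 30-minute windows per day, at 04:00 and 16:00. The schedule, window length, and namespace are illustrative values, and the time zone in which the schedule is evaluated is not stated here, so verify it for your cluster.

```hcl
module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"

  # Hypothetical namespace, chosen only for this example
  namespace = "example-nats"

  # Confine voluntary disruptions to the windows defined below
  voluntary_disruption_window_enabled = true

  # Windows start at 04:00 and 16:00 every day
  voluntary_disruption_window_cron_schedule = "0 4,16 * * *"

  # Each window stays open for 30 minutes (1800 seconds)
  voluntary_disruption_window_seconds = 1800
}
```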

Custom PDBs

Rather than time-based disruption windows, you may want more granular control of when disruptions are allowed and disallowed.

You can do this by managing your own PodDisruptionBudgets with a selector that matches the NATS pods you want to protect.

For example:

module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"
  ...
}

resource "kubectl_manifest" "pdb" {
  yaml_body = yamlencode({
    apiVersion = "policy/v1"
    kind       = "PodDisruptionBudget"
    metadata = {
      name      = "custom-pdb"
      # Placeholder: the namespace that was passed to the NATS module
      namespace = local.nats_namespace
    }
    spec = {
      unhealthyPodEvictionPolicy = "AlwaysAllow"
      selector = {
        # Placeholder: the Outputs section below lists no label outputs,
        # so supply the labels that identify your NATS pods
        matchLabels = local.nats_pod_labels
      }
      maxUnavailable = 0 # Prevents any voluntary disruptions of the selected pods
    }
  })
  force_conflicts   = true
  server_side_apply = true
}

While this example is constructed via IaC, you can also create / destroy these PDBs directly in your application logic via YAML manifests and the Kubernetes API. This would allow you to create a PDB prior to initiating a long-running operation that you do not want disrupted and then delete it upon completion.

Completely Disabling Voluntary Disruptions

Allowing the cluster to periodically initiate replacement of NATS pods is critical to maintaining system health. However, there are rare cases where you may want to override the safe behavior and disable voluntary disruptions altogether. Setting voluntary_disruptions_enabled to false will set up PDBs that disallow any voluntary disruption of any NATS pod in this module.

This is strongly discouraged. If limiting any and all potential disruptions is of primary importance, you should instead:

  • Create a one-hour weekly disruption window to allow some opportunity for automatic maintenance operations
  • Ensure that spot_nodes_enabled and burstable_nodes_enabled are both set to false

Note that the above configuration will significantly increase the costs of running the NATS cluster (2.5-5x) versus more flexible settings. In the vast majority of cases, this is entirely unnecessary, so this should only be used as a last resort.
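
The recommended alternative could look roughly like the sketch below. It uses the spot_nodes_enabled and burstable_nodes_enabled inputs listed under Optional Inputs and assumes a standard five-field cron expression in which the final field 0 means Sunday. The chosen day, start time, and namespace are illustrative assumptions.

```hcl
module "nats" {
  source = "${var.pf_module_source}kube_nats${var.pf_module_ref}"

  # Hypothetical namespace, chosen only for this example
  namespace = "example-nats"

  # Keep voluntary disruptions enabled, but confine them to one window per week
  voluntary_disruptions_enabled             = true
  voluntary_disruption_window_enabled       = true
  voluntary_disruption_window_cron_schedule = "0 4 * * 0" # Sundays at 04:00
  voluntary_disruption_window_seconds       = 3600        # one hour

  # Avoid node types that are more likely to be interrupted
  spot_nodes_enabled      = false
  burstable_nodes_enabled = false
}
```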

Providers

The following providers are needed by this module:

  • helm (2.12.1)

  • kubectl (2.1.3)

  • kubernetes (2.34.0)

  • pf (0.0.7)

  • random (3.6.3)

  • vault (4.5.0)

Required Inputs

The following input variables are required:

namespace

Description: The namespace to deploy the NATS instances into

Type: string

Optional Inputs

The following input variables are optional (have default values):

arm_nodes_enabled

Description: Whether the NATS pods can be scheduled on arm64 nodes

Type: bool

Default: true

burstable_nodes_enabled

Description: Whether the NATS pods can be scheduled on burstable nodes

Type: bool

Default: false

cert_manager_namespace

Description: The namespace where cert-manager is deployed.

Type: string

Default: "cert-manager"

controller_nodes_enabled

Description: Whether to allow pods to schedule on EKS Node Group nodes (controller nodes)

Type: bool

Default: false

fsync_interval_seconds

Description: Interval in seconds at which data will be synced to disk on each node. Setting this to 0 will force an fsync after each message (which will lower overall throughput dramatically).

Type: number

Default: 10

helm_version

Description: The version of the bitnami/nats Helm chart to use

Type: string

Default: "8.5.1"

instance_type_anti_affinity_required

Description: Whether to enable anti-affinity to prevent pods from being scheduled on the same instance type. Defaults to true iff sla_target >= 2.

Type: bool

Default: null

log_level

Description: The log level for the NATS pods. Must be one of: info, debug, trace

Type: string

Default: "info"

max_connections

Description: The maximum number of client connections to the NATS cluster

Type: number

Default: 64000

max_control_line_kb

Description: The maximum length of a protocol line including combined length of subject and queue group (in KB).

Type: number

Default: 4

max_outstanding_catchup_mb

Description: The maximum in-flight bytes for stream catch-up.

Type: number

Default: 128

max_payload_mb

Description: The maximum size of a message payload (in MB).

Type: number

Default: 8

minimum_memory_mb

Description: The minimum memory in MB to use for the NATS nodes

Type: number

Default: 50

monitoring_enabled

Description: Whether to allow monitoring CRs to be deployed in the namespace

Type: bool

Default: false

node_image_cached_enabled

Description: Whether to add the container images to the node image cache for faster startup times

Type: bool

Default: true

panfactum_scheduler_enabled

Description: Whether to use the Panfactum pod scheduler with enhanced bin-packing

Type: bool

Default: true

persistence_backups_enabled

Description: Whether to enable backups of the NATS durable storage.

Type: bool

Default: true

persistence_initial_storage_gb

Description: How many GB to initially allocate for persistent storage (will grow automatically as needed). Can only be set on cluster creation.

Type: number

Default: 1

persistence_storage_class_name

Description: The StorageClass to use for the PVs used to store filesystem data. Can only be set on cluster creation.

Type: string

Default: "ebs-standard-retained"

persistence_storage_increase_gb

Description: The amount of GB to increase storage by if free space drops below the threshold

Type: number

Default: 1

persistence_storage_increase_threshold_percent

Description: Dropping below this percent of free storage will trigger an automatic increase in storage size

Type: number

Default: 20

persistence_storage_limit_gb

Description: The maximum number of gigabytes of storage to provision for each NATS node

Type: number

Default: null

ping_interval_seconds

Description: Interval in seconds at which pings are sent to clients, leaf nodes, and routes.

Type: number

Default: 20

pull_through_cache_enabled

Description: Whether to use the ECR pull through cache for the deployed images

Type: bool

Default: false

spot_nodes_enabled

Description: Whether the NATS pods can be scheduled on spot nodes

Type: bool

Default: true

vault_credential_lifetime_hours

Description: The lifetime (in hours) of the NATS credentials generated by Vault

Type: number

Default: 16

vault_internal_pki_backend_mount_path

Description: The mount path of the PKI backend for internal certificates.

Type: string

Default: "pki/internal"

vault_internal_url

Description: The internal URL of the Vault cluster.

Type: string

Default: "http://vault-active.vault.svc.cluster.local:8200"

voluntary_disruption_window_cron_schedule

Description: The times when disruption windows should start

Type: string

Default: "0 4 * * *"

voluntary_disruption_window_enabled

Description: Whether to confine voluntary disruptions of pods in this module to specific time windows

Type: bool

Default: false

voluntary_disruption_window_seconds

Description: The length of the disruption window in seconds

Type: number

Default: 3600

voluntary_disruptions_enabled

Description: Whether to enable voluntary disruptions of pods in this module.

Type: bool

Default: true

vpa_enabled

Description: Whether the VPA resources should be enabled

Type: bool

Default: true

write_deadline_seconds

Description: The maximum number of seconds the server will block when writing messages to consumers.

Type: number

Default: 55

Outputs

The following outputs are exported:

admin_creds_secret

Description: The name of the Kubernetes Secret holding certificate credentials for the admin role in the NATS cluster

client_port

Description: The port that NATS clients should connect to.

cluster_port

Description: The port that NATS uses for internal cluster communication.

host

Description: The NATS cluster hostname to connect to.

metrics_port

Description: The port that Prometheus metrics are served on.

reader_creds_secret

Description: The name of the Kubernetes Secret holding certificate credentials for the reader role in the NATS cluster

superuser_creds_secret

Description: The name of the Kubernetes Secret holding certificate credentials for the superuser role in the NATS cluster
