kube_job

Stable

Submodule

Kubernetes Job

Provides a production-hardened instance of a Kubernetes Job with the following enhancements:

Standardized resource labels
Pod and container security hardening
Temporary directory mounting
ConfigMap and Secret mounting
Downward-API integrations
Service account configuration with default permissions
Integration with the Panfactum bin-packing scheduler
High-availability scheduling constraints
Readiness and liveness probe configurations
Automatic reloading via the Reloader
Vertical pod autoscaling
Pod disruption budget
Toleration switches for the various Panfactum node classes

Usage

Basics

This module follows the basic workload deployment patterns described in this guide.

Scheduling

This module has inputs that map one-for-one to the standard Kubernetes Job API.

Providers

The following providers are needed by this module:

kubectl (2.1.3)
kubernetes (2.34.0)
pf (0.0.7)
random (3.6.3)

Required Inputs

The following input variables are required:

containers

Description: A list of container configurations for the Pod

Type:

list(object({
    name                    = string                           # A unique name for the container within the pod
    init                    = optional(bool, false)            # Iff true, the container will be an init container
    image_registry          = string                           # The URL for a container image registry (e.g., docker.io)
    image_repository        = string                           # The path to the image repository within the registry (e.g., library/nginx)
    image_tag               = string                           # The tag for a specific image within the repository (e.g., 1.27.1)
    image_pin_enabled       = optional(bool, true)             # Whether the image should be pinned to every node regardless of whether the container is running or not (speeds up startup times)
    image_prepull_enabled   = optional(bool, true)             # Whether the image will be prepulled to nodes when the nodes are first created (speeds up startup times)
    command                 = list(string)                     # The command to be run as the root process inside the container
    working_dir             = optional(string, null)           # The directory the command will be run in. If left null, will default to the working directory set by the image
    image_pull_policy       = optional(string, "IfNotPresent") # Sets the container's ImagePullPolicy
    minimum_memory          = optional(number, 100)            # The minimum amount of memory in megabytes
    maximum_memory          = optional(number, null)           # The maximum amount of memory in megabytes
    memory_limit_multiplier = optional(number, 1.3)            # memory limits = memory request x this value
    minimum_cpu             = optional(number, 10)             # The minimum amount of cpu millicores
    maximum_cpu             = optional(number, null)           # The maximum amount of cpu to allow (in millicores)
    privileged              = optional(bool, false)            # Whether to allow the container to run in privileged mode
    run_as_root             = optional(bool, false)            # Whether to run the container as root
    uid                     = optional(number, 1000)           # The user ID to use when running the container if not root
    linux_capabilities      = optional(list(string), [])       # Default is drop ALL
    readonly                = optional(bool, true)             # Whether to use a readonly file system
    env                     = optional(map(string), {})        # Environment variables specific to the container
  }))

name

Description: The name of this Job

Type: string

namespace

Description: The namespace the Job should be deployed to

Type: string
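
A minimal invocation only needs the three required inputs. The sketch below is illustrative rather than canonical: the `source` path, the module label `db_migrate`, the namespace, and the image are placeholders chosen for this example, and it assumes the providers listed above are already configured in the calling module.

```hcl
module "db_migrate" {
  # Placeholder path: replace with however you reference Panfactum modules
  source = "../kube_job"

  name      = "db-migrate"
  namespace = "example" # hypothetical namespace

  containers = [
    {
      name             = "migrate"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo 'running migration'"]
    }
  ]
}
```

All other container fields fall back to the defaults shown in the type above (for example, a read-only root file system and a non-root user).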

Optional Inputs

The following input variables are optional (have default values):

active_deadline_seconds

Description: Specifies the duration in seconds, relative to the startTime, that the Job may be continuously active before the system tries to terminate it. The value must be a positive integer.

Type: number

Default: 86400

arm_nodes_enabled

Description: Whether to allow Pods to schedule on arm64 nodes

Type: bool

Default: true

backoff_limit

Description: Specifies the number of retries before marking the Job failed.

Type: number

Default: 1

burstable_nodes_enabled

Description: Whether to allow Pods to schedule on burstable nodes

Type: bool

Default: true

cilium_required

Description: True iff the Cilium CNI is required to be installed on a node prior to scheduling on it

Type: bool

Default: true

common_env

Description: Key-value pairs of environment variables to set on every container

Type: map(string)

Default: {}

common_env_from_config_maps

Description: Environment variables that are sourced from existing Kubernetes ConfigMaps. The keys are the environment variable names and the values are the ConfigMap references.

Type:

map(object({
    config_map_name = string
    key             = string
  }))

Default: {}

common_env_from_secrets

Description: Environment variables that are sourced from existing Kubernetes Secrets. The keys are the environment variable names and the values are the Secret references.

Type:

map(object({
    secret_name = string
    key         = string
  }))

Default: {}

common_secrets

Description: Key-value pairs of secrets to add to the containers as environment variables

Type: map(string)

Default: {}
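
The four `common_*` environment inputs can be combined in one invocation. The following is a sketch under assumptions: the ConfigMap `report-settings`, the Secret `report-db-creds`, the variable `api_token`, and the module `source` path are all hypothetical names introduced for illustration.

```hcl
variable "api_token" {
  type      = string
  sensitive = true
}

module "nightly_report" {
  source = "../kube_job" # placeholder path

  name      = "nightly-report"
  namespace = "example"

  containers = [
    {
      name             = "report"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "env | sort"]
    }
  ]

  # Literal values shared by every container
  common_env = {
    LOG_LEVEL = "info"
  }

  # Map key = environment variable name, value = reference to an existing ConfigMap key
  common_env_from_config_maps = {
    REPORT_BUCKET = {
      config_map_name = "report-settings"
      key             = "bucket"
    }
  }

  # Map key = environment variable name, value = reference to an existing Secret key
  common_env_from_secrets = {
    DB_PASSWORD = {
      secret_name = "report-db-creds"
      key         = "password"
    }
  }

  # Sensitive values supplied directly from the calling module
  common_secrets = {
    API_TOKEN = var.api_token
  }
}
```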

concurrency_policy

Description: Specifies how to treat concurrent executions of a Job.

Type: string

Default: "Forbid"

config_map_mounts

Description: A mapping of ConfigMap names to their mount configuration in the containers of the Job

Type:

map(object({
    mount_path = string                     # Where in the containers to mount the ConfigMap
    optional   = optional(bool, false)      # Whether the Pod can launch if this ConfigMap does not exist
    sub_paths  = optional(list(string), []) # Only mount these keys of the ConfigMap (will mount at `${mount_path}/${sub_path}`)
  }))

Default: {}

controller_nodes_enabled

Description: Whether to allow pods to schedule on EKS Node Group nodes (controller nodes)

Type: bool

Default: false

disruptions_enabled

Description: Whether to enable disrupting the Pods in the middle of execution.

Type: bool

Default: false

dns_policy

Description: The DNS policy for the Pod

Type: string

Default: "ClusterFirst"

extra_annotations

Description: A map of extra annotations that will be added to the Job (not the pods)

Type: map(string)

Default: {}

extra_labels

Description: A map of extra labels that will be added to the Job (not the pods)

Type: map(string)

Default: {}

extra_pod_annotations

Description: Annotations to add to the pods in the Job

Type: map(string)

Default: {}

extra_pod_failure_policy_rules

Description: Specifies when a Pod is marked as failed. See the Kubernetes Pod failure policy docs.

Type:

list(object({
    action            = optional(string, "Ignore")
    on_pod_conditions = optional(list(object({ type = string })), [])
    on_exit_codes = optional(object({
      container_name = optional(string)
      operator       = optional(string, "In")
      values         = list(number)
    }), null)
  }))

Default: []
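
As an illustration of the rule shape, the sketch below fails the whole Job when a container exits with a specific code and ignores Pods that were disrupted. The exit code `42`, the container name `worker`, and the module `source` path are assumptions made for this example; the action names (`FailJob`, `Ignore`) and the `DisruptionTarget` condition come from the upstream Kubernetes Pod failure policy API. How these rules interact with any rules the module defines itself is not covered here.

```hcl
module "import_job" {
  source = "../kube_job" # placeholder path

  name      = "data-import"
  namespace = "example"

  containers = [
    {
      name             = "worker"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "exit 0"]
    }
  ]

  extra_pod_failure_policy_rules = [
    {
      # Treat exit code 42 as non-retriable (hypothetical convention)
      action = "FailJob"
      on_exit_codes = {
        container_name = "worker"
        operator       = "In"
        values         = [42]
      }
    },
    {
      # Do not count Pods that were evicted or preempted against the backoff limit
      action            = "Ignore"
      on_pod_conditions = [{ type = "DisruptionTarget" }]
    }
  ]
}
```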

extra_pod_labels

Description: Extra pod labels to use

Type: map(string)

Default: {}

extra_tolerations

Description: Extra tolerations to add to the Pods

Type:

list(object({
    key      = optional(string)
    operator = string
    value    = optional(string)
    effect   = optional(string)
  }))

Default: []
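
Tolerations beyond the built-in node-class switches can be supplied with `extra_tolerations`. In this sketch the taint `dedicated=batch:NoSchedule` is a hypothetical taint on your own nodes, and the module `source` path is a placeholder.

```hcl
module "batch_job" {
  source = "../kube_job" # placeholder path

  name      = "batch-task"
  namespace = "example"

  containers = [
    {
      name             = "task"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo done"]
    }
  ]

  extra_tolerations = [
    {
      # Matches a hypothetical taint: dedicated=batch:NoSchedule
      key      = "dedicated"
      operator = "Equal"
      value    = "batch"
      effect   = "NoSchedule"
    }
  ]
}
```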

linkerd_enabled

Description: True iff the Linkerd sidecar should be injected into the pods

Type: bool

Default: true

linkerd_required

Description: True iff the Linkerd CNI is required to be installed on a node prior to scheduling on it

Type: bool

Default: true

mount_owner

Description: The ID of the group that owns the mounted volumes

Type: number

Default: 1000

node_image_cached_enabled

Description: Whether to add the container images to the node image cache for faster startup times

Type: bool

Default: true

node_preferences

Description: Node label preferences for the Pods

Type: map(object({ weight = number, operator = string, values = list(string) }))

Default: {}

node_requirements

Description: Node label requirements for the Pods

Type: map(list(string))

Default: {}
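
The two node-selection inputs might be used together as sketched below. This assumes the map keys in both inputs are node label names, which the descriptions above imply but do not state outright. The labels used are the well-known Kubernetes labels; the instance types, weight, and module `source` path are illustrative.

```hcl
module "render_job" {
  source = "../kube_job" # placeholder path

  name      = "render"
  namespace = "example"

  containers = [
    {
      name             = "render"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "uname -m"]
    }
  ]

  # Hard requirement: only schedule on nodes whose label has one of these values
  node_requirements = {
    "kubernetes.io/arch" = ["amd64"]
  }

  # Soft preference: favor these instance types when they are available
  node_preferences = {
    "node.kubernetes.io/instance-type" = {
      weight   = 50
      operator = "In"
      values   = ["m6i.large", "m6i.xlarge"]
    }
  }
}
```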

panfactum_scheduler_enabled

Description: Whether to use the Panfactum Pod scheduler with enhanced bin-packing

Type: bool

Default: true

pod_completions

Description: Specifies the desired number of successfully finished Pods the Job should be run with.

Type: number

Default: 1

pod_parallelism

Description: Specifies the maximum desired number of Pods the Job should run at any given time.

Type: number

Default: 1
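
To illustrate how `pod_completions` and `pod_parallelism` combine with the retry and deadline inputs, the sketch below asks for ten successful Pods, at most three running at a time, with a one-hour overall deadline. The specific numbers, names, and `source` path are assumptions for the example.

```hcl
module "fanout_job" {
  source = "../kube_job" # placeholder path

  name      = "fanout"
  namespace = "example"

  containers = [
    {
      name             = "worker"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo processing"]
    }
  ]

  pod_completions = 10 # The Job is complete after 10 Pods succeed
  pod_parallelism = 3  # Never run more than 3 Pods at once

  backoff_limit           = 3    # Retries before the Job is marked failed
  active_deadline_seconds = 3600 # Terminate the Job if it is still active after 1 hour
}
```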

pod_version_labels_enabled

Description: Whether to add version labels to the Pod. Useful for ensuring pods do not get recreated on frequent updates.

Type: bool

Default: true

priority_class_name

Description: The priority class to use for Pods in the Job

Type: string

Default: null

pull_through_cache_enabled

Description: Whether to use the ECR pull through cache for the deployed images

Type: bool

Default: true

secret_mounts

Description: A mapping of Secret names to their mount configuration in the containers of the Job

Type:

map(object({
    mount_path = string                     # Where in the containers to mount the Secret
    optional   = optional(bool, false)      # Whether the Pod can launch if this Secret does not exist
    sub_paths  = optional(list(string), []) # Only mount these keys of the secret (will mount at `${mount_path}/${sub_path}`)
  }))

Default: {}
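
`config_map_mounts` and `secret_mounts` share the same shape, so they can be sketched together. The ConfigMap `app-config`, the Secret `tls-certs`, the mount paths, and the module `source` path are hypothetical; both objects are assumed to already exist in the namespace unless marked optional.

```hcl
module "sync_job" {
  source = "../kube_job" # placeholder path

  name      = "sync"
  namespace = "example"

  containers = [
    {
      name             = "sync"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "cat /etc/app/settings.yaml"]
    }
  ]

  # Map key = name of an existing ConfigMap
  config_map_mounts = {
    "app-config" = {
      mount_path = "/etc/app"
      sub_paths  = ["settings.yaml"] # Only this key, mounted at /etc/app/settings.yaml
    }
  }

  # Map key = name of an existing Secret
  secret_mounts = {
    "tls-certs" = {
      mount_path = "/etc/tls"
      optional   = true # The Pod can still launch if the Secret is missing
    }
  }
}
```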

spot_nodes_enabled

Description: Whether to allow Pods to schedule on spot nodes

Type: bool

Default: true

starting_deadline_seconds

Description: Optional deadline in seconds for starting the Job if it misses its scheduled time for any reason. Missed Job executions are counted as failed.

Type: number

Default: 900

termination_grace_period_seconds

Description: The number of seconds to wait for graceful termination before forcing termination

Type: number

Default: 30

tmp_directories

Description: A mapping of temporary directory names (arbitrary) to their configuration

Type:

map(object({
    mount_path = string                # Where in the containers to mount the temporary directories
    size_mb    = optional(number, 100) # The number of MB to allocate for the directory
    node_local = optional(bool, false) # If true, the temporary storage will come from the host node rather than a PVC
  }))

Default: {}
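
Because containers default to a read-only root file system (`readonly = true`), writable scratch space generally has to come from `tmp_directories`. A sketch, with the directory names, paths, sizes, and module `source` path chosen purely for illustration:

```hcl
module "export_job" {
  source = "../kube_job" # placeholder path

  name      = "export"
  namespace = "example"

  containers = [
    {
      name             = "export"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo data > /tmp/out.txt"]
    }
  ]

  tmp_directories = {
    # Map keys are arbitrary names
    scratch = {
      mount_path = "/tmp"
      size_mb    = 500
    }
    cache = {
      mount_path = "/var/cache/app"
      node_local = true # Use storage from the host node instead of a PVC
    }
  }
}
```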

ttl_seconds_after_finished

Description: Limits the lifetime of a Job that has finished execution (either Complete or Failed). After this time, it is eligible to be automatically deleted.

Type: number

Default: 600

vpa_enabled

Description: Whether to enable the Vertical Pod Autoscaler

Type: bool

Default: true

wait_for_success

Description: True iff you want to wait for Job success before completing the apply

Type: bool

Default: true

Outputs

The following outputs are exported:

labels

Description: The default labels assigned to all resources in this Job

match_labels

Description: The labels unique to this Job that can be used to select any pods in this Job

service_account_name

Description: The service account used for the pods
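
The outputs can be consumed from the calling module like any other module outputs. The sketch below re-exports the service account name and builds a label-selector string from `match_labels`; it assumes `match_labels` is a map of strings, which the description above suggests but does not state. The module label `cleanup` and the `source` path are placeholders.

```hcl
module "cleanup" {
  source = "../kube_job" # placeholder path

  name      = "cleanup"
  namespace = "example"

  containers = [
    {
      name             = "cleanup"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo cleaning"]
    }
  ]
}

# Useful when granting extra permissions to the Job's Pods elsewhere
output "cleanup_service_account" {
  value = module.cleanup.service_account_name
}

# A selector of the form "key1=value1,key2=value2", e.g. for `kubectl get pods -l`
output "cleanup_pod_selector" {
  value = join(",", [for k, v in module.cleanup.match_labels : "${k}=${v}"])
}
```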

Maintainer Notes

No notes