Kubernetes Job
Provides a production-hardened instance of a Kubernetes Job with the following enhancements:
- Standardized resource labels
- Pod and container security hardening
- Temporary directory mounting
- ConfigMap and Secret mounting
- Downward-API integrations
- Service account configuration with default permissions
- Integration with the Panfactum bin-packing scheduler
- High-availability scheduling constraints
- Readiness and liveness probe configurations
- Automatic reloading via the Reloader
- Vertical pod autoscaling
- Pod disruption budget
- Toleration switches for the various Panfactum node classes
Usage
Basics
This module follows the basic workload deployment patterns described in this guide.
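As a rough orientation, a minimal invocation that sets only the three required inputs might look like the sketch below. The module source path, the Job name, the namespace, and the busybox image are placeholders chosen for illustration, not values taken from this module. The providers listed under Providers are assumed to be configured elsewhere.

```hcl
module "example_job" {
  # Placeholder: point this at wherever the module lives in your setup
  source = "../kube_job"

  name      = "example-job" # hypothetical Job name
  namespace = "example"     # hypothetical namespace that already exists

  containers = [
    {
      name             = "main"
      image_registry   = "docker.io"
      image_repository = "library/busybox"
      image_tag        = "1.36.1"
      command          = ["/bin/sh", "-c", "echo hello"]
    }
  ]
}
```

Every other input falls back to the defaults documented under Optional Inputs.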
Scheduling
This module has inputs that map one-for-one to the standard Kubernetes Job API.
Providers
The following providers are needed by this module:
kubectl (2.1.3)
kubernetes (2.34.0)
pf (0.0.7)
random (3.6.3)
Required Inputs
The following input variables are required:
containers
Description: A list of container configurations for the Pod
Type:
list(object({
name = string # A unique name for the container within the pod
init = optional(bool, false) # Iff true, the container will be an init container
image_registry = string # The URL for a container image registry (e.g., docker.io)
image_repository = string # The path to the image repository within the registry (e.g., library/nginx)
image_tag = string # The tag for a specific image within the repository (e.g., 1.27.1)
image_pin_enabled = optional(bool, true) # Whether the image should be pinned to every node regardless of whether the container is running or not (speeds up startup times)
image_prepull_enabled = optional(bool, true) # Whether the image will be prepulled to nodes when the nodes are first created (speeds up startup times)
command = list(string) # The command to be run as the root process inside the container
working_dir = optional(string, null) # The directory the command will be run in. If left null, will default to the working directory set by the image
image_pull_policy = optional(string, "IfNotPresent") # Sets the container's ImagePullPolicy
minimum_memory = optional(number, 100) # The minimum amount of memory in megabytes
maximum_memory = optional(number, null) # The maximum amount of memory in megabytes
memory_limit_multiplier = optional(number, 1.3) # memory limits = memory request x this value
minimum_cpu = optional(number, 10) # The minimum amount of cpu millicores
maximum_cpu = optional(number, null) # The maximum amount of cpu to allow (in millicores)
privileged = optional(bool, false) # Whether to allow the container to run in privileged mode
run_as_root = optional(bool, false) # Whether to run the container as root
uid = optional(number, 1000) # The UID to run the container as when not running as root
linux_capabilities = optional(list(string), []) # Default is drop ALL
readonly = optional(bool, true) # Whether to use a readonly file system
env = optional(map(string), {}) # Environment variables specific to the container
}))
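To make the shape of this type concrete, here is a sketch of a containers value with one init container and one main container. It is a fragment that belongs inside the module block. The image coordinates, commands, and environment values are hypothetical. The limit arithmetic in the comment assumes the memory request starts at minimum_memory, which the type comments suggest but do not state outright.

```hcl
containers = [
  {
    name             = "migrate"
    init             = true # runs to completion before the main container starts
    image_registry   = "docker.io"
    image_repository = "example/migrations" # hypothetical image
    image_tag        = "1.0.0"
    command          = ["/bin/migrate", "up"]
  },
  {
    name             = "worker"
    image_registry   = "docker.io"
    image_repository = "example/worker" # hypothetical image
    image_tag        = "1.0.0"
    command          = ["/bin/worker", "--once"]

    minimum_memory          = 200  # MB
    maximum_memory          = 1000 # MB
    memory_limit_multiplier = 1.5  # limit = request x 1.5, so a 200 MB request gives a 300 MB limit
    minimum_cpu             = 50   # millicores

    env = {
      LOG_LEVEL = "info"
    }
  }
]
```

Fields that are omitted, such as readonly and run_as_root, keep their hardened defaults.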
name
Description: The name of this Job
Type: string
namespace
Description: The namespace the Job should be deployed to
Type: string
Optional Inputs
The following input variables are optional (have default values):
active_deadline_seconds
Description: The duration in seconds, relative to the startTime, that the Job may be continuously active before the system tries to terminate it. Must be a positive integer.
Type: number
Default: 86400
arm_nodes_enabled
Description: Whether to allow Pods to schedule on arm64 nodes
Type: bool
Default: true
backoff_limit
Description: Specifies the number of retries before marking the Job failed.
Type: number
Default: 1
burstable_nodes_enabled
Description: Whether to allow Pods to schedule on burstable nodes
Type: bool
Default: true
cilium_required
Description: True iff the Cilium CNI is required to be installed on a node prior to scheduling on it
Type: bool
Default: true
common_env
Description: Key-value pairs of environment variables to set in every container
Type: map(string)
Default: {}
common_env_from_config_maps
Description: Environment variables that are sourced from existing Kubernetes ConfigMaps. The keys are the environment variable names and the values are the ConfigMap references.
Type:
map(object({
config_map_name = string
key = string
}))
Default: {}
common_env_from_secrets
Description: Environment variables that are sourced from existing Kubernetes Secrets. The keys are the environment variable names and the values are the Secret references.
Type:
map(object({
secret_name = string
key = string
}))
Default: {}
common_secrets
Description: Key-value pairs of secrets to add to the containers as environment variables
Type: map(string)
Default: {}
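The four environment-related inputs above (common_env, common_env_from_config_maps, common_env_from_secrets, and common_secrets) can be combined in one module block. The fragment below is a sketch. The ConfigMap name, Secret name, keys, and the var.api_token variable are all hypothetical and would need to exist in your own configuration.

```hcl
# Plain values set on every container
common_env = {
  LOG_LEVEL = "info"
}

# Sensitive values passed in from Terraform
common_secrets = {
  API_TOKEN = var.api_token # hypothetical sensitive variable
}

# Values read from an existing ConfigMap; the map key is the variable name
common_env_from_config_maps = {
  DATABASE_HOST = {
    config_map_name = "app-config" # hypothetical ConfigMap
    key             = "db_host"
  }
}

# Values read from an existing Secret; the map key is the variable name
common_env_from_secrets = {
  DATABASE_PASSWORD = {
    secret_name = "app-db-creds" # hypothetical Secret
    key         = "password"
  }
}
```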
concurrency_policy
Description: Specifies how to treat concurrent executions of a Job.
Type: string
Default: "Forbid"
config_map_mounts
Description: A mapping of ConfigMap names to their mount configuration in the containers of the Job
Type:
map(object({
mount_path = string # Where in the containers to mount the ConfigMap
optional = optional(bool, false) # Whether the Pod can launch if this ConfigMap does not exist
sub_paths = optional(list(string), []) # Only mount these keys of the ConfigMap (will mount at `${mount_path}/${sub_path}`)
}))
Default: {}
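As an illustration of the config_map_mounts type, the fragment below mounts a single key of a ConfigMap. The ConfigMap name and paths are hypothetical. The resulting file location follows the `${mount_path}/${sub_path}` rule stated in the type comment.

```hcl
config_map_mounts = {
  # The map key is the name of an existing ConfigMap (hypothetical here)
  "app-config" = {
    mount_path = "/etc/app"
    sub_paths  = ["settings.yaml"] # mounted at /etc/app/settings.yaml
  }
}
```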
controller_nodes_enabled
Description: Whether to allow pods to schedule on EKS Node Group nodes (controller nodes)
Type: bool
Default: false
disruptions_enabled
Description: Whether the Pods may be disrupted in the middle of execution.
Type: bool
Default: false
dns_policy
Description: The DNS policy for the Pod
Type: string
Default: "ClusterFirst"
extra_annotations
Description: A map of extra annotations that will be added to the Job (not the pods)
Type: map(string)
Default: {}
extra_labels
Description: A map of extra labels that will be added to the Job (not the pods)
Type: map(string)
Default: {}
extra_pod_annotations
Description: Annotations to add to the pods in the Job
Type: map(string)
Default: {}
extra_pod_failure_policy_rules
Description: Specifies when a pod is marked as failed. See failure policy docs
Type:
list(object({
action = optional(string, "Ignore")
on_pod_conditions = optional(list(object({ type = string })), [])
on_exit_codes = optional(object({
container_name = optional(string)
operator = optional(string, "In")
values = list(number)
}), null)
}))
Default: []
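A sketch of two extra_pod_failure_policy_rules follows. The container name and exit code are hypothetical. The action, operator, and condition names used here (FailJob, Ignore, In, DisruptionTarget) are values defined by the Kubernetes pod failure policy API; how this module merges extra rules with any rules it sets itself is not described above, so treat that as unknown.

```hcl
extra_pod_failure_policy_rules = [
  {
    # Fail the whole Job immediately if the "main" container exits with code 42
    action = "FailJob"
    on_exit_codes = {
      container_name = "main" # hypothetical container name
      operator       = "In"
      values         = [42]
    }
  },
  {
    # Do not count Pods that were terminated by a disruption
    action            = "Ignore"
    on_pod_conditions = [{ type = "DisruptionTarget" }]
  }
]
```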
extra_pod_labels
Description: Extra pod labels to use
Type: map(string)
Default: {}
extra_tolerations
Description: Extra tolerations to add to the Pods
Type:
list(object({
key = optional(string)
operator = string
value = optional(string)
effect = optional(string)
}))
Default: []
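For example, a toleration for a custom taint could be expressed as in the fragment below. The taint key and value are hypothetical. The operator and effect strings are standard Kubernetes toleration values.

```hcl
extra_tolerations = [
  {
    key      = "dedicated" # hypothetical taint key
    operator = "Equal"
    value    = "batch" # hypothetical taint value
    effect   = "NoSchedule"
  }
]
```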
linkerd_enabled
Description: True iff the Linkerd sidecar should be injected into the pods
Type: bool
Default: true
linkerd_required
Description: True iff the Linkerd CNI is required to be installed on a node prior to scheduling on it
Type: bool
Default: true
mount_owner
Description: The ID of the group that owns the mounted volumes
Type: number
Default: 1000
node_image_cached_enabled
Description: Whether to add the container images to the node image cache for faster startup times
Type: bool
Default: true
node_preferences
Description: Node label preferences for the Pods
Type: map(object({ weight = number, operator = string, values = list(string) }))
Default: {}
node_requirements
Description: Node label requirements for the Pods
Type: map(list(string))
Default: {}
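The fragment below sketches how node_requirements and node_preferences might be filled in. It assumes the map keys are node label names, which matches the types but is not stated explicitly above. The zone and instance type values are placeholders.

```hcl
# Hard constraint: only schedule on nodes whose label has one of these values
node_requirements = {
  "topology.kubernetes.io/zone" = ["us-east-2a", "us-east-2b"]
}

# Soft constraint: prefer matching nodes, weighted relative to other preferences
node_preferences = {
  "node.kubernetes.io/instance-type" = {
    weight   = 50
    operator = "In"
    values   = ["m6g.large"]
  }
}
```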
panfactum_scheduler_enabled
Description: Whether to use the Panfactum Pod scheduler with enhanced bin-packing
Type: bool
Default: true
pod_completions
Description: Specifies the desired number of successfully finished Pods the Job should be run with.
Type: number
Default: 1
pod_parallelism
Description: Specifies the maximum desired number of Pods the Job should run at any given time.
Type: number
Default: 1
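As a quick illustration of how pod_completions and pod_parallelism interact, the fragment below asks for ten successful Pods while allowing at most two to run at once. The numbers are arbitrary.

```hcl
pod_completions = 10 # the Job is complete once 10 Pods have finished successfully
pod_parallelism = 2  # never run more than 2 Pods at the same time
```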
pod_version_labels_enabled
Description: Whether to add version labels to the Pod. Useful for ensuring pods do not get recreated on frequent updates.
Type: bool
Default: true
priority_class_name
Description: The priority class to use for Pods in the Job
Type: string
Default: null
pull_through_cache_enabled
Description: Whether to use the ECR pull through cache for the deployed images
Type: bool
Default: true
secret_mounts
Description: A mapping of Secret names to their mount configuration in the containers of the Job
Type:
map(object({
mount_path = string # Where in the containers to mount the Secret
optional = optional(bool, false) # Whether the Pod can launch if this Secret does not exist
sub_paths = optional(list(string), []) # Only mount these keys of the secret (will mount at `${mount_path}/${sub_path}`)
}))
Default: {}
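A sketch of a secret_mounts value is shown below. The Secret name and mount path are hypothetical. Setting optional to true lets the Pod launch even if the Secret does not exist.

```hcl
secret_mounts = {
  # The map key is the name of an existing Secret (hypothetical here)
  "app-tls" = {
    mount_path = "/etc/tls"
    optional   = true
  }
}
```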
spot_nodes_enabled
Description: Whether to allow Pods to schedule on spot nodes
Type: bool
Default: true
starting_deadline_seconds
Description: Optional deadline in seconds for starting the Job if it misses its scheduled time for any reason. Missed Job executions are counted as failed.
Type: number
Default: 900
termination_grace_period_seconds
Description: The number of seconds to wait for graceful termination before forcing termination
Type: number
Default: 30
tmp_directories
Description: A mapping of temporary directory names (arbitrary) to their configuration
Type:
map(object({
mount_path = string # Where in the containers to mount the temporary directories
size_mb = optional(number, 100) # The number of MB to allocate for the directory
node_local = optional(bool, false) # If true, the temporary storage will come from the host node rather than a PVC
}))
Default: {}
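Because containers use a read-only file system by default (the readonly field of the containers input), writable scratch space typically has to be declared explicitly. The fragment below is a sketch with hypothetical directory names and paths.

```hcl
tmp_directories = {
  # Uses the defaults: 100 MB, not node-local
  scratch = {
    mount_path = "/tmp"
  }

  # Larger directory backed by storage on the host node
  cache = {
    mount_path = "/var/cache/app" # hypothetical path
    size_mb    = 500
    node_local = true
  }
}
```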
ttl_seconds_after_finished
Description: Limits the lifetime of a Job that has finished execution (either Complete or Failed). After this many seconds, the Job is eligible to be automatically deleted.
Type: number
Default: 600
vpa_enabled
Description: Whether to enable the Vertical Pod Autoscaler
Type: bool
Default: true
wait_for_success
Description: True iff you want to wait for Job success before completing the apply
Type: bool
Default: true
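The lifecycle-related inputs (active_deadline_seconds, backoff_limit, ttl_seconds_after_finished, and wait_for_success) are often tuned together. The fragment below is one possible combination for a short task whose result the apply does not need to wait on. The values are illustrative only.

```hcl
active_deadline_seconds    = 3600  # try to terminate the Job if it is still active after 1 hour
backoff_limit              = 3     # allow 3 retries before marking the Job failed
ttl_seconds_after_finished = 86400 # keep the finished Job for 1 day before it may be deleted
wait_for_success           = false # do not wait for Job success before completing the apply
```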
Outputs
The following outputs are exported:
labels
Description: The default labels assigned to all resources in this Job
match_labels
Description: The labels unique to this Job that can be used to select any pods in this Job
service_account_name
Description: The service account used for the pods
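One way to use the service_account_name output is to grant the Job's Pods additional permissions. The sketch below binds an existing Role to the service account using the kubernetes provider's kubernetes_role_binding resource. The module label example_job, the namespace, and the Role name are hypothetical.

```hcl
resource "kubernetes_role_binding" "job_config_reader" {
  metadata {
    name      = "example-job-config-reader"
    namespace = "example" # should match the namespace input of the Job
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = "config-reader" # hypothetical Role that already exists
  }

  subject {
    kind      = "ServiceAccount"
    name      = module.example_job.service_account_name
    namespace = "example"
  }
}
```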
Maintainer Notes
No notes