Panfactum LogoPanfactum
Infrastructure ModulesDirect ModulesKuberneteskube_monitoring
kube_monitoring
Alpha
Direct
Source Code Link

Panfactum Monitoring Stack

This module provides a deployment of the kube-prometheus-stack. It includes:

  • The Prometheus Operator

  • Grafana

  • Alert Manager

Providers

The following providers are needed by this module:

  • aws (5.70.0)

  • helm (2.12.1)

  • kubectl (2.0.4)

  • kubernetes (2.27.0)

  • pf (0.0.3)

  • random (3.6.0)

  • time (0.10.0)

  • vault (3.25.0)

Required Inputs

The following input variables are required:

eks_cluster_name

Description: The name of the EKS cluster.

Type: string

grafana_domain

Description: The domain on which to expose Grafana.

Type: string

vault_domain

Description: The domain of the Vault instance running in the cluster.

Type: string

Optional Inputs

The following input variables are optional (have default values):

additional_tracked_resource_labels

Description: Kubernetes resource labels to include in metric labels

Type: list(string)

Default: []

additional_tracked_resources

Description: Additional Kubernetes resources to track in kube-state-metrics

Type: list(string)

Default: []

alertmanager_local_storage_initial_size_gb

Description: Number of GB to use for the local alertmanager storage (before autoscaled)

Type: number

Default: 2

alertmanager_log_level

Description: The log level for the alertmanager pods

Type: string

Default: "warn"

alertmanager_storage_class_name

Description: The storage class to use for local alertmanager storage

Type: string

Default: "ebs-standard"

aws_iam_ip_allow_list

Description: A list of IPs that can use the service account token to authenticate with AWS API

Type: list(string)

Default: []

enhanced_ha_enabled

Description: Whether to add extra high-availability scheduling constraints at the trade-off of increased cost

Type: bool

Default: true

grafana_basic_auth_enabled

Description: Whether to enable username and password authentication. Should only be enabled during debugging.

Type: bool

Default: true

grafana_db_recovery_directory

Description: The name of the directory in the backup bucket that contains the Grafana PostgreSQL backups and WAL archives

Type: string

Default: null

grafana_db_recovery_mode_enabled

Description: Whether to enable recovery mode for the Grafana PostgreSQL database

Type: bool

Default: false

grafana_db_recovery_target_time

Description: If provided, will recover the Grafana PostgreSQL database to the indicated target time in RFC 3339 format rather than to the latest data.

Type: string

Default: null

grafana_log_level

Description: The log level for the grafana pods

Type: string

Default: "error"

ingress_enabled

Description: Whether or not to enable the ingress for routing public traffic to prometheus stack components

Type: bool

Default: false

kube_api_server_monitoring_enabled

Description: Whether to enable monitoring of the API server

Type: bool

Default: false

kube_prometheus_stack_version

Description: The version of the kube-prometheus-stack to deploy

Type: string

Default: "58.5.3"

loki_chart_version

Description: The version of the grafana/loki helm chart to deploy

Type: string

Default: "6.6.2"

loki_storage_class_name

Description: The storage class to use for local loki storage

Type: string

Default: "ebs-standard"

metrics_retention_resolution_1h

Description: Number of days 1h metrics resolution should be kept

Type: number

Default: 1825

metrics_retention_resolution_5m

Description: Number of days 5m metrics resolution should be kept

Type: number

Default: 90

metrics_retention_resolution_raw

Description: Number of days the raw metrics resolution should be kept

Type: number

Default: 15

monitoring_enabled

Description: Whether to add active monitoring to the deployed systems

Type: bool

Default: false

monitoring_etcd_enabled

Description: Whether to monitor the Kubernetes API server's etcd instances. Only enable for debugging purposes as it contains a huge amount of metrics.

Type: bool

Default: false

panfactum_scheduler_enabled

Description: Whether to use the Panfactum pod scheduler with enhanced bin-packing

Type: bool

Default: true

prometheus_default_scrape_interval_seconds

Description: The default interval between prometheus scrapes (in seconds)

Type: number

Default: 60

prometheus_local_storage_initial_size_gb

Description: Number of GB to use for the local prometheus storage (before autoscaled)

Type: number

Default: 2

prometheus_log_level

Description: The log level for the prometheus pods

Type: string

Default: "warn"

prometheus_operator_log_level

Description: The log level for the prometheus operator pods

Type: string

Default: "warn"

prometheus_storage_class_name

Description: The storage class to use for local prometheus storage

Type: string

Default: "ebs-standard"

pull_through_cache_enabled

Description: Whether to use the ECR pull through cache for the deployed images

Type: bool

Default: true

thanos_bucket_web_domain

Description: Domain to host the Thanos bucket web UI on. If not provided, will be on the same subdomain as grafana but with the thanos-bucket identifier (thanos-bucket.<grafana-subdomain>)

Type: string

Default: null

thanos_bucket_web_enable

Description: Whether to enable the web dashboard for the Thanos bucket analyzer which can show debugging information about your metrics data

Type: bool

Default: true

thanos_chart_version

Description: The version of the bitnami/thanos helm chart to deploy

Type: string

Default: "15.4.7"

thanos_compactor_disk_storage_gb

Description: Number of GB for the ephemeral thanos compactor disk. See https://thanos.io/tip/components/compact.md/#disk

Type: number

Default: 100

thanos_compactor_storage_class_name

Description: The storage class to use for the thanos compactor local storage

Type: string

Default: "ebs-standard"

thanos_image_version

Description: The version of thanos images to use

Type: string

Default: "v0.35.0"

thanos_log_level

Description: The log level for the thanos pods

Type: string

Default: "warn"

thanos_ruler_storage_class_name

Description: The storage class to use for the thanos ruler local storage

Type: string

Default: "ebs-standard"

thanos_store_gateway_storage_class_name

Description: The storage class to use for the thanos store gateway local storage

Type: string

Default: "ebs-standard"

vpa_enabled

Description: Whether the VPA resources should be enabled

Type: bool

Default: true

Outputs

The following outputs are exported:

bucket_web_url

Description: n/a

grafana_admin_password

Description: n/a

grafana_db_recovery_directory

Description: The name of the directory in the backup bucket that contains the Grafana PostgreSQL backups and WAL archives

namespace

Description: n/a

thanos_query_frontend_url

Description: n/a

Usage

No notes