Airbyte
This module deploys Airbyte onto a Kubernetes cluster with a focus on AWS infrastructure, though it can be adapted for other cloud providers.
Scope and Connectors
This module only deploys the core Airbyte engine components required for the platform to function. It does not include or configure any source or destination connectors, which must be installed separately after deployment. The Airbyte platform provides a connector catalog within its user interface where administrators can install the specific connectors needed for their data integration workflows.
To install connectors:
- After deployment, log in to the Airbyte UI using the credentials provided
- Navigate to the “Sources” or “Destinations” section
- Search for and install the required connectors from the catalog
For custom connector development, this module includes the Connector Builder Server component, which provides a development environment for creating and testing custom connectors to meet specialized integration needs.
If you need to pre-install specific connectors or automate connector configuration, consider implementing additional Terraform modules that interact with the Airbyte API after core deployment is complete.
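As a rough sketch of that approach, a follow-on module could read this module's outputs through a Terragrunt dependency block. The airbyte_connectors directory, its module source path, and its input names are hypothetical and only for illustration. The airbyte_url and namespace outputs are the ones listed in the Outputs section of this page.

```hcl
# airbyte_connectors/terragrunt.hcl
# Hypothetical module placed adjacent to kube_airbyte
include "panfactum" {
  path   = find_in_parent_folders("panfactum.hcl")
  expose = true
}

terraform {
  # Assumption: a first-party module of your own that calls the Airbyte API
  source = "../../modules/airbyte_connectors"
}

# Ensures kube_airbyte is applied first and exposes its outputs
dependency "airbyte" {
  config_path = "../kube_airbyte"
}

inputs = {
  # Hypothetical input names; the output names come from kube_airbyte
  airbyte_url       = dependency.airbyte.outputs.airbyte_url
  airbyte_namespace = dependency.airbyte.outputs.namespace
}
```

Keeping connector configuration in a separate module means the core deployment can be applied and upgraded without touching connector state.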
Usage
- Create a new directory adjacent to your aws_eks module called kube_airbyte.
- Add a terragrunt.hcl file to the directory that looks like this:

    include "panfactum" {
      path   = find_in_parent_folders("panfactum.hcl")
      expose = true
    }

    terraform {
      source = include.panfactum.locals.pf_stack_source
    }

    dependency "vault" {
      config_path = "../kube_vault"
    }

    inputs = {
      vault_domain = dependency.vault.outputs.vault_domain

      # Must be domain available to the cluster
      # Example: airbyte.prod.panfactum.com
      domain = "REPLACE_ME"

      # Must be an email address that you have access to
      # Example: james@panfactum.com
      admin_email = "REPLACE_ME"
    }

- Run pf-tf-init to enable the required providers.
- Run terragrunt apply.
Authentication
The module uses Vault for authentication when ingress is enabled, providing secure access to the Airbyte UI.
Providers
The following providers are needed by this module:
Required Inputs
The following input variables are required:
admin_email
Description: Email for the admin user when auth is enabled
Type: string
domain
Description: The domain to access Airbyte (e.g., airbyte.example.com)
Type: string
vault_domain
Description: The domain where Vault is accessible
Type: string
Optional Inputs
The following input variables are optional (have default values):
airbyte_edition
Description: The edition of Airbyte to deploy (community or enterprise)
Type: string
Default: "community"
airbyte_helm_version
Description: The version of the Airbyte Helm chart to deploy
Type: string
Default: "1.5.1"
airbyte_version
Description: The version of Airbyte to deploy (for image caching)
Type: string
Default: "1.5.1"
arm_nodes_enabled
Description: Whether to allow scheduling on arm nodes
Type: bool
Default: true
aws_iam_ip_allow_list
Description: List of IPs to allow for AWS IAM access
Type: list(string)
Default: []
burstable_nodes_enabled
Description: Whether to allow scheduling on burstable nodes
Type: bool
Default: true
connected_s3_bucket_arns
Description: List of S3 bucket ARNs that Airbyte will use as connector destinations
Type: list(string)
Default: []
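For illustration, the fragment below shows how connected_s3_bucket_arns might be set in the inputs block of the terragrunt.hcl from the Usage section. The bucket names are placeholders, not real buckets.

```hcl
inputs = {
  # ...required inputs from the Usage section go here as well...

  # Placeholder ARNs; replace with the buckets your connectors write to
  connected_s3_bucket_arns = [
    "arn:aws:s3:::example-airbyte-destination",
    "arn:aws:s3:::example-airbyte-staging"
  ]
}
```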
connector_builder_min_cpu_millicores
Description: The minimum amount of cpu millicores for connector builder containers
Type: number
Default: 25
connector_min_builder_memory_mb
Description: Memory request for connector builder containers
Type: number
Default: 300
controller_nodes_enabled
Description: Whether to allow scheduling on controller nodes
Type: bool
Default: false
cron_min_cpu_millicores
Description: The minimum amount of cpu millicores for cron containers
Type: number
Default: 25
cron_min_memory_mb
Description: Memory request for cron containers
Type: number
Default: 368
db_backup_directory
Description: Directory to store database backups (if enabled)
Type: string
Default: "initial"
db_recovery_directory
Description: Directory to restore database from (if recovery mode enabled)
Type: string
Default: null
db_recovery_mode_enabled
Description: Whether to enable recovery mode for the database
Type: bool
Default: false
db_recovery_target_time
Description: Target recovery time for the database (if recovery mode enabled)
Type: string
Default: null
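The recovery-related inputs are likely to be set together. The fragment below is a minimal sketch of that combination. The directory names, the timestamp, and the RFC 3339 timestamp format are all assumptions and should be confirmed against the module source before use.

```hcl
inputs = {
  # ...required inputs from the Usage section go here as well...

  db_recovery_mode_enabled = true

  # Assumption: the directory that earlier backups were written to
  # ("initial" is the default value of db_backup_directory)
  db_recovery_directory = "initial"

  # Assumption: an RFC 3339 timestamp; the expected format is not stated here
  db_recovery_target_time = "2025-01-15T03:00:00Z"

  # Assumption: new backups are written to a different directory
  # than the one being restored from
  db_backup_directory = "recovery-1"
}
```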
helm_timeout_seconds
Description: The timeout in seconds for Helm operations
Type: number
Default: 600
ingress_enabled
Description: Whether to enable the ingress for Airbyte
Type: bool
Default: true
jobs_cpu_min_millicores
Description: The minimum amount of cpu millicores for jobs containers
Type: number
Default: 100
jobs_env_env
Description: Additional environment variables for Airbyte jobs configuration (e.g. SYNC_JOB_MAX_ATTEMPTS, JOB_MAIN_CONTAINER_MEMORY_LIMIT). See https://docs.airbyte.com/operator-guides/configuring-airbyte#jobs
Type: map(string)
Default: {}
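A short sketch of how this map might be populated, using the two variable names mentioned in the description. The values are illustrative only and are not recommendations.

```hcl
inputs = {
  # ...required inputs from the Usage section go here as well...

  jobs_env_env = {
    # The type is map(string), so numeric values are written as strings
    SYNC_JOB_MAX_ATTEMPTS           = "5"
    JOB_MAIN_CONTAINER_MEMORY_LIMIT = "2Gi"
  }
}
```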
jobs_min_memory_mb
Description: Memory request for jobs containers
Type: number
Default: 1024
jobs_sync_job_retries_complete_failures_backoff_base
Description: Defines the exponential base of the backoff interval between failed attempts in which no data was synchronized.
Type: number
Default: 2
jobs_sync_job_retries_complete_failures_backoff_max_interval_s
Description: Defines the maximum backoff interval in seconds between failed attempts in which no data was synchronized.
Type: number
Default: 3600
jobs_sync_job_retries_complete_failures_backoff_min_interval_s
Description: Defines the minimum backoff interval in seconds between failed attempts in which no data was synchronized.
Type: number
Default: 60
jobs_sync_job_retries_complete_failures_max_successive
Description: Defines the max number of successive attempts in which no data was synchronized before failing the job.
Type: number
Default: 3
jobs_sync_job_retries_complete_failures_max_total
Description: Defines the max number of attempts in which no data was synchronized before failing the job.
Type: number
Default: 30
jobs_sync_job_retries_partial_failures_max_successive
Description: Defines the max number of successive attempts in which some data was synchronized before failing the job.
Type: number
Default: 3
jobs_sync_job_retries_partial_failures_max_total
Description: Defines the max number of attempts in which some data was synchronized before failing the job.
Type: number
Default: 30
jobs_sync_max_timeout_days
Description: Defines the number of days a sync job will execute for before timing out.
Type: number
Default: 1
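The retry and timeout inputs above can be tuned as a group. The fragment below is one possible combination, shown only to illustrate how the inputs relate to each other. The specific numbers are arbitrary.

```hcl
inputs = {
  # ...required inputs from the Usage section go here as well...

  # Fail sooner when attempts synchronize no data at all
  jobs_sync_job_retries_complete_failures_max_successive = 2
  jobs_sync_job_retries_complete_failures_max_total      = 10

  # Keep the wait between such attempts within 30 seconds and 15 minutes
  jobs_sync_job_retries_complete_failures_backoff_min_interval_s = 30
  jobs_sync_job_retries_complete_failures_backoff_max_interval_s = 900
  jobs_sync_job_retries_complete_failures_backoff_base           = 3

  # Tolerate more attempts when some data is making it through
  jobs_sync_job_retries_partial_failures_max_successive = 5
  jobs_sync_job_retries_partial_failures_max_total      = 30

  # Let long-running syncs execute for up to three days
  jobs_sync_max_timeout_days = 3
}
```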
license_key
Description: License key for Airbyte Enterprise
Type: string
Default: ""
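Presumably license_key is only relevant when airbyte_edition is set to "enterprise". The sketch below sets both and reads the key from an encrypted file instead of hardcoding it. The secrets.yaml file and its airbyte_license_key entry are hypothetical. sops_decrypt_file and get_terragrunt_dir are built-in Terragrunt functions.

```hcl
locals {
  # Assumption: an sops-encrypted secrets.yaml next to this terragrunt.hcl
  # containing an airbyte_license_key entry
  secrets = yamldecode(sops_decrypt_file("${get_terragrunt_dir()}/secrets.yaml"))
}

inputs = {
  # ...required inputs from the Usage section go here as well...

  airbyte_edition = "enterprise"
  license_key     = local.secrets.airbyte_license_key
}
```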
log_level
Description: The log level for Airbyte components
Type: string
Default: "WARN"
monitoring_enabled
Description: Whether to enable monitoring for Airbyte
Type: bool
Default: false
namespace
Description: The namespace to deploy Airbyte into
Type: string
Default: "airbyte"
node_image_cached_enabled
Description: Whether to enable node image caching
Type: bool
Default: true
panfactum_scheduler_enabled
Description: Whether to enable the Panfactum scheduler
Type: bool
Default: true
pg_initial_storage_gb
Description: The initial storage for PostgreSQL in GB
Type: number
Default: 20
pg_max_cpu_millicores
Description: The maximum amount of cpu to allocate to the postgres pods (in millicores)
Type: number
Default: 10000
pg_max_memory_mb
Description: The maximum amount of memory to allocate to the postgres pods (in Mi)
Type: number
Default: 128000
pg_min_cpu_millicores
Description: The minimum amount of cpu to allocate to the postgres pods (in millicores)
Type: number
Default: 50
pg_min_cpu_update_millicores
Description: The CPU settings for Postgres won’t be updated until the recommendations from the VPA (if enabled) differ from the current settings by at least this many millicores. This prevents autoscaling thrash.
Type: number
Default: 250
pg_min_memory_mb
Description: The minimum amount of memory to allocate to the postgres pods (in Mi)
Type: number
Default: 500
pgbouncer_max_cpu_millicores
Description: The maximum amount of cpu to allocate to the pgbouncer pods (in millicores)
Type: number
Default: 10000
pgbouncer_max_memory_mb
Description: The maximum amount of memory to allocate to the pgbouncer pods (in Mi)
Type: number
Default: 32000
pgbouncer_min_cpu_millicores
Description: The minimum amount of cpu to allocate to the pgbouncer pods (in millicores)
Type: number
Default: 15
pgbouncer_min_memory_mb
Description: The minimum amount of memory to allocate to the pgbouncer pods (in Mi)
Type: number
Default: 25
pod_annotations
Description: Additional pod annotations to add to all pods
Type: map(string)
Default: {}
pod_min_sweeper_memory_mb
Description: Memory request for pod sweeper containers
Type: number
Default: 32
pod_sweeper_min_cpu_millicores
Description: The minimum amount of cpu millicores for pod sweeper containers
Type: number
Default: 10
pull_through_cache_enabled
Description: Whether to enable pull-through cache for container images
Type: bool
Default: true
server_min_cpu_millicores
Description: The minimum amount of cpu millicores for server containers
Type: number
Default: 50
server_min_memory_mb
Description: Memory request for server containers
Type: number
Default: 512
sla_target
Description: SLA target level (1-3) affecting high availability settings
Type: number
Default: 1
spot_nodes_enabled
Description: Whether to allow scheduling on spot nodes
Type: bool
Default: true
temporal_db_max_conns
Description: Maximum number of connections for Temporal database (SQL_MAX_CONNS)
Type: number
Default: 100
temporal_db_max_idle_conns
Description: Maximum number of idle connections for Temporal database (SQL_MAX_IDLE_CONNS)
Type: number
Default: 20
temporal_min_cpu_millicores
Description: The minimum amount of cpu millicores for temporal containers
Type: number
Default: 150
temporal_min_memory_mb
Description: Memory request for temporal containers
Type: number
Default: 512
vpa_enabled
Description: Whether to enable Vertical Pod Autoscaler
Type: bool
Default: true
wait
Description: Whether to wait for resources to be created before completing
Type: bool
Default: true
webapp_min_cpu_millicores
Description: The minimum amount of cpu millicores for webapp containers
Type: number
Default: 50
webapp_min_memory_mb
Description: Memory request for webapp containers
Type: number
Default: 128
worker_min_cpu_millicores
Description: The minimum amount of cpu millicores for worker containers
Type: number
Default: 100
worker_min_memory_mb
Description: Memory request for worker containers
Type: number
Default: 512
worker_replicas
Description: Number of worker replicas
Type: number
Default: 1
workload_api_min_server_memory_mb
Description: Memory request for workload API server containers
Type: number
Default: 325
workload_api_server_min_cpu_millicores
Description: The minimum amount of cpu millicores for workload API server containers
Type: number
Default: 25
workload_launcher_min_cpu_millicores
Description: The minimum amount of cpu millicores for workload launcher containers
Type: number
Default: 25
workload_min_launcher_memory_mb
Description: Memory request for workload launcher containers
Type: number
Default: 350
Outputs
The following outputs are exported:
airbyte_config_secret
Description: The name of the Airbyte configuration secret
airbyte_url
Description: The URL to access Airbyte
database_credentials_secret
Description: The name of the secret containing database credentials
ingress_domain
Description: The domain configured for Airbyte ingress
jobs_labels
Description: Labels applied to the jobs pods
namespace
Description: The namespace where Airbyte is deployed
server_labels
Description: Labels applied to the server pods
server_service_name
Description: The name of the Airbyte server service
server_service_port
Description: The port of the Airbyte server service
service_account_name
Description: The name of the Kubernetes service account used by Airbyte pods
temporal_labels
Description: Labels applied to the temporal pods
temporal_service_name
Description: The name of the Airbyte temporal service
temporal_service_port
Description: The port of the Airbyte temporal service
webapp_labels
Description: Labels applied to the webapp pods
webapp_service_name
Description: The name of the Airbyte webapp service
webapp_service_port
Description: The port of the Airbyte webapp service
worker_labels
Description: Labels applied to the worker pods
Maintainer Notes
None.