A production-ready k3s Terraform module for the OCI Always Free tier.
- HA control plane: 3 control-plane nodes with embedded etcd; survives 1 node failure
- Full stack always deployed: cert-manager, Longhorn, ArgoCD + Image Updater, and kured are always installed; they keep the cluster active and prevent idle reclamation
- Separate public/private subnets: k3s nodes have no public IP; only LBs and the optional bastion are internet-facing
- Envoy Gateway ingress (Gateway API): DaemonSet with system-cluster-critical priority and PodDisruptionBudget maxUnavailable: 1; standard HTTPRoute/Gateway resources; real client IP preservation via NLB transparent mode
- Automatic security updates: unattended-upgrades + kured drain-reboot-uncordon cycle with zero manual intervention (Ubuntu), or zypper patch systemd timers (openSUSE)
- Configurable OS (os_family): Ubuntu 24.04 LTS (default, OCI-native image auto-resolved) or openSUSE Leap 16.0 (custom-imported UEFI image via scripts/import-opensuse-aarch64.sh)
- k3s version pinned at plan time: resolved from the GitHub API during terraform plan, not at boot time
- Cluster-scoped IAM: dynamic group and policy scoped to nodes tagged with the cluster name, not every instance in the compartment
- Idempotent cloud-init: all kubectl operations use apply; re-provisioning is safe
- Monitoring (grafana_hostname): kube-prometheus-stack (Prometheus + Grafana + Alertmanager) always deployed; optional public Grafana UI via grafana_hostname; PrometheusRules for node disk pressure and Longhorn volume health
- Direct SSH via NLB (expose_ssh = true): expose port 22 on the public NLB restricted to my_public_ip_cidr; eliminates the need for OCI Bastion sessions for day-to-day access
- OCI Vault (enable_vault = true): cluster secrets in a free software-protected OCI Vault; fetched at boot via instance_principal, not embedded in user-data
- Boot volume backups (enable_backup = true): weekly full backups, 1-week retention, within the 5-backup Always Free limit
- Object Storage state bucket (enable_object_storage_state = true): versioned OCI Object Storage for Terraform state; S3-compatible endpoint in terraform_state_backend output
- OCI Notifications + Alertmanager (enable_notifications = false): opt-in OCI Notifications topic wired to Alertmanager as a webhook receiver
- MySQL HeatWave (enable_mysql = false): opt-in Always Free MySQL DB in the private subnet; credentials pre-created as a Kubernetes Secret
- External DNS (enable_external_dns = false): automatic Cloudflare DNS record management from HTTPRoute hostnames
- External Secrets (enable_external_secrets = false): sync OCI Vault secrets into Kubernetes Secrets via instance_principal; no credentials to rotate
graph TD
Internet(["🌐 Internet"])
subgraph public["Public Subnet · 10.0.0.0/24"]
NLB["🔀 Public NLB (Always Free)
HTTP :80 · HTTPS :443
optional: kubeapi :6443 · SSH :22"]
end
subgraph private["Private Subnet · 10.0.1.0/24 · no public IPs"]
ILB["⚖️ Internal Flex LB (Always Free)
kubeapi VIP :6443"]
subgraph cp["Control Plane × 3 · A1.Flex (1 OCPU / 6 GB each)
k3s-server · etcd · Envoy Gateway · Longhorn · user workloads"]
CP0["control-plane-0"]
CP1["control-plane-1"]
CP2["control-plane-2"]
end
W["worker-0 · A1.Flex (1 OCPU / 6 GB)
k3s-agent · Envoy Gateway · Longhorn · user workloads"]
end
NAT["🌍 NAT Gateway (Always Free)"]
Bastion["🔐 OCI Bastion Service
optional · Always Free"]
Internet -->|HTTP / HTTPS| NLB
NLB -->|"Envoy Gateway NodePorts :30080 / :30443"| CP0 & CP1 & CP2 & W
NLB -. "kubeapi :6443
expose_kubeapi=true" .-> ILB
NLB -. "SSH :22
expose_ssh=true" .-> CP0 & CP1 & CP2 & W
ILB --> CP0 & CP1 & CP2
W -->|joins via kubeapi| ILB
private -->|outbound| NAT --> Internet
Bastion -. "SSH tunnel
enable_bastion=true" .-> private
All four A1.Flex instances live in a private subnet with no public IPs. Internet traffic enters exclusively through two Always Free load balancers.
k3s naming note: k3s calls control-plane nodes "servers" (k3s server) and workers "agents" (k3s agent). Terraform resources follow k3s conventions (server/worker); in standard Kubernetes terminology these map to control-plane and worker nodes.
Public NLB forwards HTTP/HTTPS directly to Envoy Gateway NodePorts on all four nodes. is_preserve_source = true preserves real client IPs at the hypervisor level. The NLB optionally exposes the Kubernetes API on port 6443, restricted to your IP.
Internal Flex LB provides a stable private VIP across all three control-plane nodes. Workers join via this VIP so the cluster survives any single control-plane loss.
Longhorn runs on all four nodes with defaultReplicaCount=2; each PVC is replicated across two nodes. For critical PVCs that must survive two simultaneous node losses, use the longhorn-replicated-3 StorageClass (gitops/longhorn/storageclasses/). Control-plane NoSchedule taints are removed after cluster init so user workloads schedule across all four identically-sized nodes.
HA ceiling: etcd runs on the 3 control-plane nodes (quorum = 2). The cluster tolerates 1 control-plane failure, the hard limit of a 4-node Always Free topology.
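The quorum arithmetic behind this ceiling can be checked with a few lines of shell. This is a minimal sketch of the majority rule only; the function names are made up for illustration and are not part of the module.

```shell
# etcd needs a majority of members to agree: quorum = floor(n/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4; do
  printf 'members=%s quorum=%s tolerated_failures=%s\n' \
    "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

Three members give a quorum of 2 and tolerate one failure; a fourth member raises the quorum to 3 without adding any extra tolerance.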
# 1. Clone the repo
git clone https://github.com/mbologna/k3s-oci.git
cd k3s-oci
# 2. Copy and edit the variables file
cp example/terraform.tfvars.example example/terraform.tfvars
$EDITOR example/terraform.tfvars
# 3. Init and apply (terraform or tofu both work)
cd example && tofu init && tofu apply
A Justfile is included for common operations (requires just):
just init # tofu init in example/
just plan # tofu plan in example/
just apply # tofu apply in example/
just kubeconfig # fetch kubeconfig via OCI Bastion
just ssh worker # SSH into a node (server1/server2/server3/worker)
just fmt # tofu fmt -recursive
After terraform apply, run:
terraform output kubeconfig_hint
This prints the exact steps for your configuration. If enable_bastion = true (recommended), the fastest path is the included helper script:
cd example && ./get-kubeconfig.sh
export KUBECONFIG=~/.kube/k3s-oci.yaml
kubectl get nodes
enable_bastion defaults to true. It uses OCI Bastion Service, a managed SSH proxy with no VM, no boot volume, and no cost. Without it, nodes are only reachable via OCI serial console (terraform output kubeconfig_hint explains all options).
Direct SSH (no Bastion): set expose_ssh = true to expose port 22 on the public NLB, restricted to my_public_ip_cidr. After apply:
$(terraform output -raw ssh_command)
This is faster than Bastion sessions and avoids session TTLs. When using expose_ssh = true you can set enable_bastion = false to skip the Bastion Service resource entirely.
OCI provides two load balancer products with very different capabilities:
| | OCI Network Load Balancer (NLB) | OCI Flexible Load Balancer |
|---|---|---|
| OSI layer | L4 (TCP passthrough) | L7 (HTTP/HTTPS aware) |
| TLS termination | ❌ Not possible | ✅ Yes |
| Always Free | 1 NLB | 2 × 10 Mbps |
| Used here | nlb.tf: public internet traffic | lb.tf: internal kubeapi HA VIP |
The public-facing load balancer is the NLB. It forwards raw TCP streams with protocol = "TCP", so it has no knowledge of TLS, HTTP headers, or certificates. TLS must be terminated by something behind it.
The Flexible LB could terminate TLS, but the one free allocation is already consumed by the kubeapi HA load balancer. Even if it were available, using OCI to manage certificates would break the automatic cert-manager + Let's Encrypt renewal cycle.
The current flow is: Internet → NLB (TCP passthrough, preserves client IPs) → Envoy Gateway NodePort → TLS terminate → route to app pod.
No domain needed. Requests to the NLB IP are served directly.
# hello-web.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
  namespace: hello-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: hello-web
      containers:
        - name: hello-web
          image: httpd:alpine
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-web
  namespace: hello-web
spec:
  selector:
    app: hello-web
  ports:
    - port: 80
      targetPort: 80
---
# HTTPRoute: no hostname filter = matches all requests on the http listener
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hello-web
  namespace: hello-web
spec:
  parentRefs:
    - name: eg
      namespace: envoy-gateway-system
      sectionName: http
  rules:
    - backendRefs:
        - name: hello-web
          port: 80
kubectl create namespace hello-web
kubectl apply -f hello-web.yaml
NLB_IP=$(cd example && tofu output -raw nlb_ip)
curl http://$NLB_IP/
sslip.io is a public DNS service that resolves <anything>.<ip>.sslip.io directly to <ip>. Combined with cert-manager + Let's Encrypt HTTP-01, this gives a trusted TLS certificate with zero infrastructure cost.
Replace <NLB_IP> with the value of tofu output -raw nlb_ip.
# hello-web-tls.yaml
---
# 1. Certificate: cert-manager issues this via HTTP-01 challenge through Envoy Gateway
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: hello-web-tls
  namespace: envoy-gateway-system # must be in the same namespace as the Gateway
spec:
  secretName: hello-web-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - hello-web.<NLB_IP>.sslip.io
---
# 2. HTTPS listener on the Gateway (add this to gitops/gateway/gateway.yaml for GitOps management)
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
  namespace: envoy-gateway-system
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: All
    - name: https-hello-web
      port: 443
      protocol: HTTPS
      hostname: hello-web.<NLB_IP>.sslip.io
      tls:
        mode: Terminate
        certificateRefs:
          - name: hello-web-tls
      allowedRoutes:
        namespaces:
          from: All
---
# 3. HTTP→HTTPS redirect (add hostname to gitops/gateway/redirect.yaml)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-to-https-redirect
  namespace: envoy-gateway-system
spec:
  parentRefs:
    - name: eg
      sectionName: http
  hostnames:
    - hello-web.<NLB_IP>.sslip.io
  rules:
    - filters:
        - type: RequestRedirect
          requestRedirect:
            scheme: https
            statusCode: 301
---
# 4. HTTPRoute for the app: attaches to the HTTPS listener (plain HTTP is handled by the redirect above)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hello-web
  namespace: hello-web
spec:
  parentRefs:
    - name: eg
      namespace: envoy-gateway-system
      sectionName: https-hello-web
  hostnames:
    - hello-web.<NLB_IP>.sslip.io
  rules:
    - backendRefs:
        - name: hello-web
          port: 80
# Wait for certificate issuance (typically 1–2 minutes)
kubectl wait --for=condition=Ready certificate/hello-web-tls -n envoy-gateway-system --timeout=5m
curl https://hello-web.<NLB_IP>.sslip.io/
With a real domain: set enable_external_dns = true and annotate the HTTPRoute with external-dns.alpha.kubernetes.io/hostname: myapp.example.com. External DNS will create the A record automatically, then cert-manager issues the certificate. Alternatively, set enable_dns01_challenge = true to use DNS-01 (supports wildcard certs and does not require inbound port 80).
Use topologySpreadConstraints to ensure pod replicas land on different nodes:
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: <your-app>
With 4 identically-sized nodes, 2 replicas survive any single node failure. Envoy Gateway runs as a DaemonSet with maxUnavailable: 1, so ingress remains up on the other 3 nodes throughout any single-node drain or failure.
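The "2 replicas survive any single node failure" claim follows from simple arithmetic: with maxSkew: 1 and DoNotSchedule, per-node pod counts differ by at most one, so the busiest node holds ceil(replicas / nodes) pods. The sketch below assumes every node is schedulable and all replicas are already placed; the function names are illustrative only.

```shell
# Upper bound on pods of one app on any single node under maxSkew: 1.
# $1 = replicas, $2 = schedulable nodes
max_pods_per_node() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Replicas guaranteed to remain if the busiest node is lost.
survivors_after_one_loss() {
  echo $(( $1 - ($1 + $2 - 1) / $2 ))
}

echo "2 replicas on 4 nodes: $(survivors_after_one_loss 2 4) left after one node loss"
echo "1 replica on 4 nodes: $(survivors_after_one_loss 1 4) left after one node loss"
```

A single replica leaves nothing running until it is rescheduled, which is why two or more replicas are the practical minimum here.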
kube-prometheus-stack (Prometheus, Grafana, Alertmanager) is always deployed as part of the full stack.
Set grafana_hostname in terraform.tfvars to expose the Grafana UI with HTTPS and a Let's Encrypt certificate:
grafana_hostname = "grafana.example.com" # or leave null for auto sslip.io hostname
When grafana_hostname is null, Grafana is reachable at grafana.<nlb-ip>.sslip.io (no domain purchase required).
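The fallback rule can be pictured as a tiny shell function. This is only an illustration of the naming scheme described above, not the module's actual implementation; grafana_host is a hypothetical helper and 203.0.113.10 is a documentation-range placeholder IP.

```shell
# Prefer an explicit hostname; otherwise derive grafana.<ip>.sslip.io.
# $1 = configured hostname (may be empty), $2 = NLB public IP
grafana_host() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf 'grafana.%s.sslip.io\n' "$2"
  fi
}

grafana_host "" "203.0.113.10"
grafana_host "grafana.example.com" "203.0.113.10"
```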
Retrieve the admin credentials after terraform apply:
terraform output -raw grafana_admin_credentials
The password is generated by Terraform and stored in OCI Vault when enable_vault = true; it is never embedded in cloud-init user-data.
The following PrometheusRules are included out of the box (gitops/monitoring/prometheus-rules.yaml):
| Alert | Condition |
|---|---|
| NodeDiskPressure | Node has disk pressure condition |
| NodeDiskSpaceLow | < 15% free disk on any node |
| NodeDiskSpaceCritical | < 5% free disk on any node |
| LonghornVolumeDegraded | Longhorn volume in degraded state |
| LonghornVolumeFaulted | Longhorn volume in faulted state |
| LonghornNodeStorageWarning | Longhorn node storage > 80% used |
Create a ConfigMap in the monitoring namespace with label grafana_dashboard: "1" — the Grafana sidecar auto-discovers and loads it:
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  my-dashboard.json: |
    { ... } # Grafana dashboard JSON
The gitops/ directory contains ArgoCD Application manifests managed with the App of Apps pattern.
After the cluster is running, bootstrap it:
kubectl apply -n argocd -f gitops/apps/app-of-apps.yaml
ArgoCD will then continuously reconcile every manifest under gitops/apps/.
This repo is designed to be forked. To add your own apps on top of the built-in stack:
- Fork this repo on GitHub.
- Update all repoURL references to point to your fork:
  bash gitops/update-repo-url.sh https://github.com/your-org/your-fork.git
  git add gitops/apps/ && git commit -m "chore: update gitops repoURL"
  git push
- Add your ArgoCD Application manifests to gitops/apps/; ArgoCD syncs them automatically. Each app can point at any Helm chart registry or any Git repository.
Deploying for the first time? Also set gitops_repo_url in terraform.tfvars before running tofu apply, so cloud-init writes the correct fork URL at bootstrap:
gitops_repo_url = "https://github.com/your-org/your-fork.git"
Already have a running cluster? Patch the App of Apps directly:
argocd app set app-of-apps --repo https://github.com/your-org/your-fork.git
Private repos: set gitops_ssh_private_key in terraform.tfvars with your SSH private key. Terraform stores it in OCI Vault automatically and cloud-init creates the argocd-repo-gitops Secret before ArgoCD starts. No manual argocd repo add step needed. For repos with a non-standard directory layout, set gitops_path (default: gitops/apps).
unattended-upgrades applies Ubuntu security patches daily and sets /var/run/reboot-required when a kernel update needs a reboot.
kured watches every node for /var/run/reboot-required and, when found:
- Acquires a cluster-wide lock (only one node reboots at a time)
- Cordons + drains the node
- Reboots
- Waits for the node to return and uncordons it
This keeps the cluster fully patched with zero manual intervention and no concurrent downtime.
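As a rough mental model of that cycle, the sketch below checks for a sentinel file and then walks nodes strictly one at a time. It only simulates the sequence with printed messages; it is not kured's code, the function names are invented, and a temporary file stands in for /var/run/reboot-required.

```shell
# Succeeds when the sentinel file exists (defaults to the Ubuntu path).
needs_reboot() {
  [ -f "${1:-/var/run/reboot-required}" ]
}

# Print the per-node steps; nodes are handled one after another,
# which is the effect of the cluster-wide lock.
reboot_cycle() {
  for node in "$@"; do
    printf '%s: lock -> cordon+drain -> reboot -> uncordon -> unlock\n' "$node"
  done
}

sentinel=$(mktemp)   # stand-in for /var/run/reboot-required
if needs_reboot "$sentinel"; then
  reboot_cycle control-plane-0
fi
rm -f "$sentinel"
```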
Renovate tracks Terraform providers, k3s, all stack component versions (via # renovate: inline comments in vars.tf and gitops/apps/*.yaml), and GitHub Actions. Enable with the Renovate GitHub App or the self-hosted workflow at .github/workflows/renovate.yml (requires a RENOVATE_TOKEN secret with repo scope).
With enable_object_storage_state = true (the default), a versioned OCI Object Storage bucket is created automatically. After terraform apply, get the ready-to-use backend config:
terraform output -json terraform_state_backend
Use it in your terraform { backend "s3" {} } block (requires an OCI Customer Secret Key for S3 credentials):
terraform {
  backend "s3" {
    bucket = "<cluster_name>-terraform-state"
    key = "terraform.tfstate"
    region = "<your-region>" # e.g. eu-frankfurt-1
    endpoint = "https://<namespace>.compat.objectstorage.<region>.oraclecloud.com"
    skip_region_validation = true
    skip_credentials_validation = true
    skip_metadata_api_check = true
    force_path_style = true
  }
}
Generate OCI Customer Secret Keys under Identity → Users → your user → Customer Secret Keys. The bucket name and namespace endpoint are in terraform output terraform_state_backend.
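If you want to sanity-check the endpoint value by hand, the URL pattern shown in the backend block can be assembled from the namespace and region. A small sketch; s3_endpoint is a hypothetical helper and mytenancyns is a made-up namespace.

```shell
# Build the S3-compatible endpoint from Object Storage namespace and region.
# $1 = namespace, $2 = region
s3_endpoint() {
  printf 'https://%s.compat.objectstorage.%s.oraclecloud.com\n' "$1" "$2"
}

s3_endpoint mytenancyns eu-frankfurt-1
```

The authoritative values remain the ones printed by the terraform_state_backend output.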
| Resource | Free allowance | This module |
|---|---|---|
| A1.Flex compute | 4 OCPUs / 24 GB / 4 instances | 3 servers + 1 worker = 4 OCPUs / 24 GB |
| Block storage | 200 GB | 4 × 50 GB = 200 GB |
| Network Load Balancer | 1 NLB | 1 (public, HTTP/HTTPS) |
| Flexible Load Balancer | 2 × 10 Mbps | 1 (private, kubeapi) |
| E2.1.Micro instances | 2 | 0 (bastion uses OCI Bastion Service, managed, no VM) |
| NAT Gateway | 1 per VCN | 1 (outbound-only for private nodes) |
| Object Storage | 20 GB | 2 versioned buckets: Terraform state + Longhorn PVC backups (enable_object_storage_state, enable_longhorn_backup) |
| Vault (shared) | Software keys + 150 secrets | 3 secrets: k3s_token, longhorn_ui_password, grafana_admin_password (enable_vault = true) |
| Volume backups | 5 total | 4 (one per node, weekly, 1-week retention) (enable_backup = true) |
| Notifications | 1M HTTPS + 3K email/month | 1 topic wired to Alertmanager (enable_notifications = false, opt-in) |
| MySQL HeatWave | 1 standalone DB, 50 GB | 1 DB system in private subnet (enable_mysql = false, opt-in) |
⚠️ Idle reclamation : OCI reclaims Always Free instances where CPU, network, and memory stay below 20% for 7 consecutive days. The full stack (Longhorn, ArgoCD, cert-manager, kured) generates enough background activity to keep the cluster alive.
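The rule as stated above is a conjunction: an instance is only a candidate when all three metrics are low. The sketch below encodes that simplified reading with integer percentages; is_idle is an illustrative name, and Oracle's documentation remains the reference for how the metrics are actually measured.

```shell
# True only when CPU, network, and memory are ALL below 20%.
# $1 = CPU %, $2 = network %, $3 = memory % (integers)
is_idle() {
  [ "$1" -lt 20 ] && [ "$2" -lt 20 ] && [ "$3" -lt 20 ]
}

if is_idle 5 3 12; then echo "5/3/12: at risk of reclamation"; fi
if ! is_idle 5 3 45; then echo "5/3/45: memory above 20%, not idle"; fi
```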
| Component | Tolerance | What happens on failure |
|---|---|---|
| Any single node (any role) | ✅ 1 node | Workloads reschedule to remaining 3 nodes; Longhorn (2 replicas) keeps storage up; Envoy Gateway DaemonSet keeps ingress up on remaining nodes |
| 2 nodes simultaneously | | Workloads and ingress continue on 2 surviving nodes; if both failed nodes are control-planes, etcd quorum is lost and the API server stops accepting writes (running pods keep running, no new scheduling) |
| etcd / control-plane quorum | ❌ 2 control-planes | Cluster becomes read-only; recovery requires etcd snapshot restore; see Split-Brain Recovery |
| Worker node | ✅ Full | With taints removed, workloads reschedule to control-planes; no SPOF |
| HTTP/HTTPS ingress | ✅ 3 node losses | Envoy Gateway DaemonSet; NLB health-checks remove unhealthy backends automatically |
| Kubernetes API | ✅ 1 control-plane | ILB routes to remaining 2 control-planes |
| PVC data (Longhorn) | ✅ 1 node | 2 replicas across 4 nodes; 1 replica lost, 1 remains serving. Use longhorn-replicated-3 StorageClass for critical PVCs to survive 2 simultaneous losses |
| cert-manager | | Pod reschedules within minutes; TLS serving unaffected (certs live in Secrets); only new issuance/renewal is paused |
| ArgoCD | | GitOps sync pauses until rescheduled; running workloads unaffected |
| MySQL (if enabled) | ❌ None | Always Free tier = single OCI-managed instance; no HA failover |
Each A1.Flex instance has identical resources (1 OCPU / 6 GB RAM). The k3s role (server vs agent) affects which system processes run, not how much resource is available for workloads.
| What | control-plane-0/1/2 | worker-0 | Scheduling mechanism |
|---|---|---|---|
| etcd | ✅ | ❌ | k3s built-in; servers only |
| Kubernetes API server | ✅ | ❌ | k3s built-in; servers only |
| Envoy Gateway (ingress) | ✅ | ✅ | DaemonSet (1 pod per node) |
| Longhorn (storage daemon) | ✅ | ✅ | DaemonSet (1 pod per node) |
| cert-manager | ✅ | ✅ | Deployment: schedules on any node |
| ArgoCD | ✅ | ✅ | Deployment: schedules on any node |
| kube-prometheus-stack | ✅ | ✅ | Deployment/StatefulSet: any node |
| kured | ✅ | ✅ | DaemonSet (1 pod per node) |
| User workloads | ✅ | ✅ | No restrictions — schedules on all 4 nodes |
Why control-planes run user workloads: k3s ≥ 1.24 automatically taints control-plane nodes with NoSchedule. This setup removes those taints at cluster init so all 4 identically-sized nodes are available. With only one worker, keeping the taint would make it a single point of failure for all user workloads.
Recommendation: use replicas ≥ 2 with topologySpreadConstraints (see gitops/README.md) to spread pods across nodes and survive any single-node failure.
With a hard cap of 4 A1.Flex instances, the binding constraint is etcd quorum: HA etcd needs at minimum 3 nodes (quorum = ⌊n/2⌋+1 = 2). The result is a 3-server HA cluster plus 1 standalone worker that saturates every Always Free resource class with nothing left unused and nothing that costs money.
| Topology | etcd HA | Nodes for workloads | Effective RAM for workloads† | Assessment |
|---|---|---|---|---|
| 3 CP + 1 worker (this module) | ✅ 1-node fault | 4 (taints removed) | ~15 GB | Optimal: HA etcd, all 4 nodes contribute to workloads |
| 1 CP + 3 workers | ❌ CP is total SPOF | 4 | ~18 GB | More capacity but control-plane loss = complete cluster death |
| 2 CP + 2 workers | ❌ Invalid | - | - | 2-node etcd has quorum = 2, so losing either node halts writes; worse than 1 node |
| 4 CP + 0 workers | ✅ 1-node fault | 4 (taints removed) | ~12 GB | Fewer resources for workloads; more etcd overhead |
†etcd + kubeapi consume ~300–500 MB RAM and ~100–200m CPU per control-plane node.
4 × 1 OCPU even split prevents any single etcd node from becoming a hot-spot, creates 4 equal fault domains, and allows workloads to spread evenly.
Always Free also includes 2 AMD E2.1.Micro instances. They are not worth adding:
- Storage budget exhausted: 4 × 50 GB boot volumes already consume the full 200 GB Always Free block storage allowance; two additional instances would require at least 100 GB more
- 1 GB RAM: k3s agent + Longhorn DaemonSet alone consume ~700–800 MB, leaving ~200 MB for user workloads
- 1/8 OCPU: negligible compute; adds operational complexity for near-zero workload benefit
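The storage point is plain multiplication, sketched here so the numbers are easy to re-run with different sizes. fits_free_storage is an illustrative helper; the 50 GB volume size and 200 GB allowance are the figures quoted above.

```shell
# Succeeds when N equal boot volumes fit within the block storage allowance.
# $1 = instance count, $2 = GB per boot volume, $3 = allowance in GB
fits_free_storage() {
  [ $(( $1 * $2 )) -le "$3" ]
}

fits_free_storage 4 50 200 && echo "4 x 50 GB fits within 200 GB"
fits_free_storage 6 50 200 || echo "6 x 50 GB = 300 GB exceeds 200 GB"
```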
| Alternative | Why it was rejected |
|---|---|
| nginx stream proxy in front of Envoy Gateway | Extra latency and complexity; NLB already preserves source IPs directly |
| OCI Bastion VM (E2.1.Micro) | OCI Bastion Service provides managed SSH proxying for free with no VM, no OS to patch, and no boot volume consuming storage budget |
| Boot volumes < 50 GB | OCI hard minimum is 50 GB per shape; 4 × 50 GB = 200 GB exactly exhausts the free block storage allowance |
| Additional NLB for kubeapi | Only 1 NLB is Always Free; the existing NLB conditionally exposes port 6443 via expose_kubeapi = true |
| openSUSE (or other non-Ubuntu Linux) as the base OS | OCI provides no native openSUSE ARM platform image. openSUSE Leap 16.0 is now supported via os_family = "opensuse" + a custom-imported UEFI image. See Choosing an OS below. Other distros remain unsupported. |
The module supports two OS families, selected via os_family:
| os_family | Image | Auto-resolved | SSH user | Auto-updates |
|---|---|---|---|---|
| "ubuntu" (default) | Ubuntu 24.04 LTS (Noble) aarch64 | ✅ Yes (latest OCI-native image) | ubuntu | unattended-upgrades + needrestart |
| "opensuse" | openSUSE Leap 16.0 Minimal VM aarch64 | ❌ No (must import and set os_image_id) | sles | zypper patch systemd timers |
No extra steps needed. The latest Ubuntu 24.04 LTS image for VM.Standard.A1.Flex is resolved automatically at plan time from the tenancy.
OCI has no native openSUSE image. Use the included script to import one before running tofu apply:
./scripts/import-opensuse-aarch64.sh
The script:
- Resolves the latest openSUSE Leap 16.0 Minimal VM Cloud aarch64 QCOW2 from download.opensuse.org
- Streams the image (~271 MiB) directly into a temporary OCI Object Storage bucket; no local disk required
- Imports via the OCI REST API with firmware: UEFI_64 and launchMode: CUSTOM (the OCI CLI's oci compute image import always defaults to BIOS; UEFI_64 is required for VM.Standard.A1.Flex)
- Adds VM.Standard.A1.Flex shape compatibility
- Cleans up the temp Object Storage object
- Prints the image OCID
Then set in terraform.tfvars:
os_family = "opensuse"
os_image_id = "ocid1.image.oc1..." # OCID printed by the script above
Script options:
--compartment-id OCID Compartment OCID (default: tenancy root)
--region REGION OCI region (default: from ~/.oci/config)
--leap-version VERSION openSUSE Leap version (default: 16.0)
--bucket-name NAME Temp bucket name (default: opensuse-image-import-tmp)
--keep-bucket Do not delete the QCOW2 object after import
--image-name NAME Custom display name for the imported image
Prerequisites: OCI CLI configured (~/.oci/config), curl, python3.
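Before running the import, it can help to confirm the listed commands are on PATH. This pre-flight sketch is not part of the repository; missing_tools is a hypothetical helper, and it does not verify that ~/.oci/config is present or valid.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools oci curl python3)
if [ -n "$missing" ]; then
  echo "missing prerequisites:" $missing
else
  echo "all prerequisites found"
fi
```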
Known caveats (verified with Leap 16.0 + VM.Standard.A1.Flex):
| Caveat | Detail |
|---|---|
| Image must be re-imported on new Leap releases | No auto-update path for the base OS image; re-run the script and update os_image_id when a new build is published |
| UEFI_64 required at import time | OCI's oci compute image import CLI hard-codes firmware: BIOS. The script works around this via a direct REST API call |
| Shape compatibility not auto-detected | OCI does not auto-detect the architecture of imported QCOW2 images; the script adds VM.Standard.A1.Flex explicitly |
| Oracle Cloud Agent (OCA) unavailable | No OCI-native monitoring agent on custom images |
Set os_image_id to the OCID of any OCI image. Only Ubuntu and openSUSE are tested. Any other OS will need its own bootstrap logic — fork the repo and adapt files/lib/bootstrap-ubuntu.sh as a starting point.
A split-brain occurs when multiple k3s server nodes each bootstrap an independent etcd cluster (--cluster-init) instead of joining a single shared one. Symptoms: kubectl get nodes shows only 1 node (not 3), or etcd member IDs differ across servers, or the cluster survives a reboot but each server has different state.
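The "member IDs differ across servers" symptom can be checked mechanically once each server's member list has been collected. The sketch below compares two lists of IDs, one per line, ignoring order. The IDs are invented sample values and same_members is an illustrative helper; extracting the ID column from etcdctl output (for example with cut -d, -f1) is an assumption about its default format.

```shell
# Succeeds when both newline-separated ID lists contain the same members.
same_members() {
  a=$(printf '%s\n' "$1" | sort)
  b=$(printf '%s\n' "$2" | sort)
  [ "$a" = "$b" ]
}

# Made-up member IDs as they might be gathered from three servers.
server1_ids="8e9e05c52164694d
91bc3c398fb3c146
fd422379fda50e48"
server2_ids="fd422379fda50e48
8e9e05c52164694d
91bc3c398fb3c146"
server3_ids="1c2d3e4f5a6b7c8d"

if same_members "$server1_ids" "$server2_ids"; then
  echo "server1/server2: same cluster"
fi
if ! same_members "$server1_ids" "$server3_ids"; then
  echo "server1/server3: member lists differ, possible split-brain"
fi
```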
# On each server node (via SSH):
sudo k3s kubectl get nodes # should show all 3 servers
sudo k3s etcd-snapshot ls # should show same snapshots on all servers
/usr/local/bin/k3s etcd-snapshot ls 2>&1 | grep -E "^etcd"
# Check etcd member list (run on each server):
sudo ETCDCTL_API=3 \
ETCDCTL_CACERT=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
ETCDCTL_CERT=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
ETCDCTL_KEY=/var/lib/rancher/k3s/server/tls/etcd/server-client.key \
etcdctl member list
# If IDs differ between servers → split-brain confirmed
# 1. Identify the best snapshot. List snapshots in OCI Object Storage:
# (if enable_etcd_snapshots = true, snapshots are uploaded every 6h)
oci os object list \
--namespace <your-namespace> \
--bucket-name <cluster-name>-terraform-state \
--prefix "etcd-snapshots/<cluster-name>/" \
--query 'sort_by(data, &"time-created")[-1]."name"' --raw-output
# 2. Download the best snapshot to the elected first server:
oci os object get \
--namespace <your-namespace> \
--bucket-name <cluster-name>-terraform-state \
--name etcd-snapshots/<cluster-name>/<snapshot-file> \
--file /tmp/etcd-restore.db
# 3. Stop k3s on ALL server nodes:
sudo systemctl stop k3s
# 4. On the first server: reset etcd and restore from snapshot.
# WARNING: this wipes all current etcd state on this node.
sudo k3s server --cluster-reset \
--cluster-reset-restore-path=/tmp/etcd-restore.db &
# Wait for the reset to complete (watch journalctl -u k3s), then stop it:
sudo pkill -f "k3s server --cluster-reset"
# 5. On the REMAINING server nodes: wipe local etcd data and re-join.
# WARNING: this wipes all etcd state on these nodes (they will re-sync from step 4).
sudo rm -rf /var/lib/rancher/k3s/server/db/
sudo rm -f /var/lib/rancher/k3s/server/token
# 6. Start k3s on the first server first:
sudo systemctl start k3s
sleep 30 # wait for it to become the etcd leader
# 7. Start k3s on the remaining servers (they will join the restored cluster):
sudo systemctl start k3s # (on each remaining server)
# 8. Verify all members rejoined:
sudo k3s kubectl get nodes
# 1. Stop k3s on ALL server nodes.
# 2. On the intended first server ONLY, reset with no snapshot:
sudo k3s server --cluster-reset &
# Wait, then stop it.
# 3. Wipe db/ and token on remaining servers (same as step 5 above).
# 4. Start the first server, wait 30s, then start the others.
# Note: without a snapshot you lose all etcd state from the previous cluster.
# If cloud-init aborts with "leader lock held by running instance" after a
# tofu destroy + tofu apply, the old lock is still in Object Storage.
# Delete it before re-applying, or it will be cleared automatically if the
# holder instance is no longer RUNNING.
oci os object delete \
--namespace <your-namespace> \
--bucket-name <cluster-name>-terraform-state \
--name cluster-init-lock \
--force
The public NLB has prevent_destroy = true so its IP is stable across tofu apply runs.
However, if the NLB is ever recreated (e.g. after tofu state rm + re-apply):
- All sslip.io hostnames change (e.g. grafana.<old-ip>.sslip.io → grafana.<new-ip>.sslip.io)
- Let's Encrypt certificates are invalid for the new hostnames and must be reissued
- With a custom domain + enable_external_dns = true, ExternalDNS updates DNS automatically and cert-manager auto-renews
If using sslip.io defaults, run tofu apply again after NLB recreation: local.grafana_hostname and local.argocd_hostname recompute automatically from the new IP, cloud-init re-creates the Gateway listeners and certificates, and cert-manager reissues via Let's Encrypt.
The first-server TIMECREATED election is stable in practice but not contractually guaranteed when pool instances share the same creation timestamp. In the rare case of a timestamp tie, jq | first returns whichever instance the API response lists first, an ordering that is consistent in practice but not defined. The atomic leader lock (cluster-init-lock in the state bucket) provides the final safety guarantee independent of election ordering.
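For illustration, a timestamp tie can be made deterministic by adding a secondary sort key. The sketch below is not how the module elects its first server; it only shows the idea, using hypothetical "TIMECREATED NAME" input lines and an invented elect_first helper.

```shell
# Input: one "TIMECREATED NAME" pair per line on stdin.
# Sort by timestamp, then by name, so ties always resolve the same way.
elect_first() {
  LC_ALL=C sort -k1,1 -k2,2 | head -n 1 | cut -d' ' -f2
}

printf '%s\n' \
  '2024-05-01T10:00:00Z server-b' \
  '2024-05-01T10:00:00Z server-a' \
  '2024-05-01T10:00:05Z server-c' | elect_first
```

Here the two earliest instances share a timestamp, and the name breaks the tie in favour of server-a regardless of input order.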
MIT. See LICENSE.
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| alertmanager_email | Optional email address to subscribe to the OCI Notifications topic. The subscriber must confirm via an OCI confirmation email. | string | null | no |
| argocd_chart_version | ArgoCD Helm chart version used for the bootstrap install. Must match gitops/apps/argocd.yaml targetRevision. Managed by Renovate. | string | "9.7.0" | no |
| argocd_hostname | Fully-qualified hostname for the ArgoCD UI (e.g. argocd.example.com). When set, a Gateway API HTTPRoute with a cert-manager TLS certificate is created by cloud-init. If null, an sslip.io hostname is derived from the NLB IP. | string | null | no |
| availability_domain | Availability domain name, e.g. 'Uocm:EU-FRANKFURT-1-AD-1' | string | n/a | yes |
| boot_volume_size_in_gbs | Boot volume size in GB for k3s nodes (servers + workers). OCI minimum is 50 GB for all shapes. With 4 k3s nodes at 50 GB each the total is 200 GB (exactly at the Always Free limit). The bastion uses OCI Bastion Service (no VM, no boot volume). | number | 50 | no |
| certmanager_chart_version | cert-manager Helm chart version used for the bootstrap install. Must match gitops/apps/cert-manager.yaml targetRevision. Managed by Renovate. | string | "v1.20.2" | no |
| certmanager_email_address | Email address for Let's Encrypt ACME registration. Must be a real address. | string | n/a | yes |
| cloudflare_api_token | Cloudflare API token. Required when enable_external_dns = true or enable_dns01_challenge = true. Create a scoped token at https://dash.cloudflare.com/profile/api-tokens with Zone:DNS:Edit permissions. | string | null | no |
| cloudflare_zone_id | Cloudflare Zone ID for the managed domain. Required when enable_external_dns = true. | string | null | no |
| cluster_name | Logical name for the cluster. Used in display names and freeform tags. | string | n/a | yes |
| compartment_ocid | OCID of the compartment where all resources are created | string | n/a | yes |
| compute_shape | OCI compute shape for k3s nodes | string | "VM.Standard.A1.Flex" | no |
| dockerhub_password | Docker Hub access token (PAT) for ArgoCD OCI Helm chart pulls. Paired with dockerhub_username. | string | "" | no |
| dockerhub_username | Docker Hub username for ArgoCD to authenticate when pulling OCI Helm charts (e.g. Envoy Gateway from registry-1.docker.io). If empty, anonymous pulls are attempted and may be rate-limited. Create a PAT at https://app.docker.com/settings/personal-access-tokens | string | "" | no |
| enable_backup | Enable weekly boot volume backups for all k3s nodes (Always Free: 5 total backups). With 4 nodes at weekly-1-week-retention there are at most 4 active backups. | bool | true | no |
| enable_bastion | Provision an OCI Bastion Service resource (managed SSH proxy, Always Free, no storage). When enabled, a STANDARD bastion is created and associated with the private subnet. Use example/get-kubeconfig.sh to retrieve kubeconfig via a Bastion session. Strongly recommended; without it, nodes are reachable only via serial console. | bool | true | no |
| enable_dns01_challenge | Configure cert-manager ClusterIssuers to use DNS-01 ACME challenge via Cloudflare instead of HTTP-01. Enables wildcard certificates (*.example.com) and works even without inbound port 80. Requires cloudflare_api_token. | bool | false | no |
| enable_etcd_snapshots | Upload etcd snapshots to the OCI Object Storage state bucket every 6 hours using OCI CLI instance_principal auth (no Customer Secret Keys required). Requires enable_object_storage_state = true. Provides off-node etcd backup for split-brain recovery. | bool | true | no |
| enable_external_dns | Deploy external-dns (kubernetes-sigs) configured for Cloudflare. Automatically creates/updates DNS A records when Services or Ingresses are annotated. Requires cloudflare_api_token and cloudflare_zone_id. | bool | false | no |
| enable_external_secrets | Deploy the External Secrets Operator and create a ClusterSecretStore backed by OCI Vault (instance_principal auth). Requires enable_vault = true. Workloads can then create ExternalSecret resources to sync any OCI Vault secret into a Kubernetes Secret without hard-coding values. | bool | false | no |
| enable_longhorn_backup | Provision a dedicated Always Free OCI Object Storage bucket for Longhorn PVC backups. Cloud-init automatically creates the backup credentials secret and wires the Longhorn BackupTarget when enable_longhorn_backup = true AND user_ocid is set. Shares the 20 GB free allowance with the Terraform state bucket. | bool |
true |
no |
| enable_mysql | Provision an Always Free MySQL HeatWave DB system (single node, 50 GB). Creates a Kubernetes Secret 'mysql-credentials' in the default namespace. | bool |
false |
no |
| enable_notifications | Create an OCI Notifications topic and wire the endpoint to Alertmanager as a webhook receiver (Always Free: 1M HTTPS + 3K email/month). The ONS HTTPS endpoint requires OCI IAM request signing. Alertmanager sends unsigned HTTP POSTs, which OCI rejects with HTTP 401. Enabling this variable creates the ONS topic and records its endpoint in the 'notification_topic_endpoint' output, but alerts will NOT be delivered to ONS without a signing proxy. Workarounds (choose one): (a) Use Alertmanager's native 'email_configs' receiver with an SMTP relay (no proxy needed). (b) Deploy a small signing proxy (e.g. an OCI Function with instance-principal auth) between Alertmanager and the ONS endpoint. (c) Use a third-party webhook receiver (PagerDuty, Slack, etc.) that does not require signing. The 'alertmanager_email' variable provides a direct ONS email subscription; this works correctly and is independent of the signing limitation (OCI delivers email subscriptions internally). | bool | false | no |
| enable_object_storage_state | Provision an Always Free OCI Object Storage bucket for storing Terraform/OpenTofu state (S3-compatible API). See the terraform_state_backend output for the backend configuration snippet. | bool | true | no |
| enable_oci_logging | Enable OCI Logging for cloud-init logs. Ships /var/log/k3s-cloud-init.log to OCI Logging Service via the Unified Monitoring Agent (Always Free: 10 GB/month). | bool | true | no |
| enable_tailscale | Store Tailscale Kubernetes operator OAuth credentials in OCI Vault so the tailscale-operator ExternalSecret can sync them into the cluster without committing secrets to git. Requires enable_vault = true. Pre-requisite: create an OAuth client at https://login.tailscale.com/admin/settings/oauth with scope Devices → Write (devices:core:write) and allowed tag tag:k8s-operator. | bool | false | no |
| enable_vault | Store cluster secrets (k3s_token, longhorn_ui_password, grafana_admin_password) in OCI Vault (Always Free: software keys + 150 secrets). Nodes fetch secrets via OCI CLI instance_principal at boot; plaintext values are removed from cloud-init user-data. | bool | true | no |
| environment | Deployment environment label (e.g. staging, production) | string | "staging" | no |
| etcd_snapshot_retention | Number of etcd snapshots to retain in OCI Object Storage per node. Older snapshots are pruned automatically by the cron job. Must be >= 1 (0 would disable pruning and grow the bucket unbounded). | number | 5 | no |
| expose_kubeapi | Expose the Kubernetes API server via the public NLB (restricted to my_public_ip_cidr) | bool | false | no |
| expose_ssh | Expose SSH (port 22) via the public NLB to all cluster nodes (restricted to my_public_ip_cidr). Eliminates the need for OCI Bastion sessions for day-to-day access. | bool | false | no |
| external_dns_domain_filter | Domain filter for external-dns: only DNS records under this domain are managed (e.g. 'k3s.example.com'). Required when enable_external_dns = true. | string | null | no |
| external_secrets_chart_version | External Secrets Operator Helm chart version used for the bootstrap install. Must match gitops/apps/external-secrets.yaml targetRevision. Managed by Renovate. | string | "2.6.0" | no |
| fault_domains | Fault domains to spread the instance pool across | list(string) | [ | no |
| gateway_api_version | Kubernetes Gateway API CRDs version (experimental channel) installed at bootstrap. Experimental channel is a superset of standard and includes GRPCRoute, TCPRoute, TLSRoute, etc. required by Envoy Gateway. Must exist before ArgoCD syncs gateway-config. | string | "v1.5.1" | no |
| github_ssh_keys_username | GitHub username whose published SSH keys (https://github.com/.keys) are added to every instance's authorized_keys at plan time, in addition to the primary public_key / public_key_path. Leave empty to skip. | string | "" | no |
| gitops_path | Path within gitops_repo_url that ArgoCD uses as the App of Apps source. Default is 'gitops/apps' (k3s-oci native layout). Override when your GitOps repo uses a different directory structure. | string | "gitops/apps" | no |
| gitops_repo_url | Git repository URL for the ArgoCD App of Apps (e.g. https://github.com/your-org/k3s-oci.git). Set this to your fork so ArgoCD pulls from the right repo. | string | "https://github.com/mbologna/k3s-oci.git" | no |
| gitops_ssh_private_key | SSH private key (PEM/OpenSSH format) for ArgoCD to clone the gitops repo. Terraform stores it in OCI Vault; cloud-init fetches it and creates the argocd-repo-gitops Secret before ArgoCD starts. Leave empty only when gitops_repo_url is a public HTTPS repo. | string | "" | no |
| grafana_hostname | Fully-qualified hostname for the Grafana UI (e.g. grafana.example.com). When set, a Gateway API HTTPRoute with a cert-manager TLS certificate is created in gitops/monitoring/. | string | null | no |
| http_lb_port | Public HTTP port on the NLB frontend (default 80). | number | 80 | no |
| https_lb_port | Public HTTPS port on the NLB frontend (default 443). | number | 443 | no |
| ingress_controller_http_nodeport | NodePort on workers that the ingress controller binds for HTTP traffic | number | 30080 | no |
| ingress_controller_https_nodeport | NodePort on workers that the ingress controller binds for HTTPS traffic | number | 30443 | no |
| k3s_server_pool_size | Number of k3s control-plane nodes in the instance pool. Use 3 for HA (etcd quorum). Must be an odd number >= 1. | number | 3 | no |
| k3s_standalone_worker | When true (default), provisions one worker node as a plain oci_core_instance resource. This is the recommended approach for OCI Always Free tenancies: instance pools route requests through OCI Capacity Management which can fail for A1.Flex shapes, whereas a direct oci_core_instance reliably claims the free allocation. Default topology: 3 control-plane nodes (pool) + 1 standalone worker = 4 OCPUs / 24 GB. | bool | true | no |
| k3s_subnet | Subnet name used to derive the flannel interface. Leave 'default_route_table' to let k3s auto-detect. | string | "default_route_table" | no |
| k3s_version | k3s version to install. Use 'stable' or 'latest' to resolve from the k3s channel API at plan-time, or pin to a specific release (e.g. 'v1.35.5+k3s1'). | string | "stable" | no |
| k3s_worker_pool_size | Number of k3s worker nodes managed by the OCI Instance Pool. Set to 0 (default) when using k3s_standalone_worker = true, which is the recommended Always Free topology. The pool is kept to allow future scaling beyond the free tier. | number | 0 | no |
| kube_api_port | Port the k3s API server listens on | number | 6443 | no |
| longhorn_hostname | Fully-qualified hostname for the Longhorn UI (e.g. longhorn.example.com). When set, a Gateway API HTTPRoute with BasicAuth (Envoy Gateway SecurityPolicy) and a cert-manager TLS certificate is created. | string | null | no |
| longhorn_ui_username | Username for Longhorn UI BasicAuth (only used when longhorn_hostname is set). | string | "admin" | no |
| my_public_ip_cidr | Your workstation public IP in CIDR notation (e.g. 1.2.3.4/32). Restricts OCI Bastion Service session creation (enable_bastion = true) and kubeapi access via the public NLB (expose_kubeapi = true). k3s nodes are in a private subnet and are only reachable via OCI Bastion sessions. | string | n/a | yes |
| mysql_admin_username | Admin username for the MySQL HeatWave DB system. | string | "admin" | no |
| mysql_shape | MySQL HeatWave shape. 'MySQL.Free' is the Always Free shape. | string | "MySQL.Free" | no |
| oci_core_vcn_cidr | CIDR block for the VCN | string | "10.0.0.0/16" | no |
| oci_core_vcn_dns_label | DNS label for the VCN (≤15 alphanumeric chars, no hyphens; OCI DNS constraint). | string | "k3svcn" | no |
| oci_identity_dynamic_group_name | Name for the OCI dynamic group granting instances access to the OCI API. Must be unique per tenancy; the default 'k3s-cluster-dynamic-group' collides if you deploy multiple clusters in the same tenancy. Recommended: set to "<cluster_name>-dynamic-group" in your tfvars. | string | "k3s-cluster-dynamic-group" | no |
| oci_identity_policy_name | Name for the OCI IAM policy attached to the dynamic group. Must be unique per tenancy; the default 'k3s-cluster-policy' collides if you deploy multiple clusters in the same tenancy. Recommended: set to "<cluster_name>-policy" in your tfvars. | string | "k3s-cluster-policy" | no |
| os_family | OS distribution for cluster nodes. "ubuntu" (default) uses OCI-native Ubuntu 24.04 and auto-resolves the image. "opensuse" uses openSUSE Leap 16.0 and requires os_image_id (use scripts/import-opensuse-aarch64.sh to import the image and obtain its OCID). | string | "ubuntu" | no |
| os_image_id | OCID of the OS image for A1.Flex nodes. If null and os_family = "ubuntu", the latest Ubuntu 24.04 LTS (Noble) aarch64 image is resolved automatically. Required when os_family = "opensuse"; use scripts/import-opensuse-aarch64.sh to import and capture the OCID. | string | null | no |
| private_subnet_cidr | CIDR for the private subnet (k3s nodes) | string | "10.0.1.0/24" | no |
| private_subnet_dns_label | DNS label for the private subnet (≤15 alphanumeric chars, no hyphens; OCI DNS constraint). | string | "k3sprivate" | no |
| public_key | SSH public key content placed on every instance. Preferred over public_key_path: pass the key string directly for CI pipelines where ~/.ssh does not exist. When null, the key is read from public_key_path at plan time. | string | null | no |
| public_key_path | Path to SSH public key file. Used as fallback when public_key is null. | string | "~/.ssh/id_ed25519.pub" | no |
| public_subnet_cidr | CIDR for the public subnet (load balancers and optional bastion) | string | "10.0.0.0/24" | no |
| public_subnet_dns_label | DNS label for the public subnet (≤15 alphanumeric chars, no hyphens; OCI DNS constraint). | string | "k3spublic" | no |
| region | OCI region identifier (e.g. 'eu-frankfurt-1'). Required when enable_external_secrets = true for the ClusterSecretStore to locate the OCI Vault endpoint. | string | null | no |
| server_memory_in_gbs | RAM in GB per control-plane node. Total RAM must not exceed 24 GB (Always Free). | number | 6 | no |
| server_ocpus | OCPUs per control-plane node. Total OCPUs across all nodes must not exceed 4 (Always Free). | number | 1 | no |
| tailscale_oauth_client_id | Tailscale OAuth client ID. Required when enable_tailscale = true. | string | null | no |
| tailscale_oauth_client_secret | Tailscale OAuth client secret. Required when enable_tailscale = true. | string | null | no |
| tenancy_ocid | OCID of the tenancy | string | n/a | yes |
| trace_enabled | Enable bash trace mode (set -x) in cloud-init scripts. Produces verbose output in /var/log/k3s-cloud-init.log. Useful for debugging bootstrap failures. Do NOT enable in production. | bool | false | no |
| unique_tag_key | Freeform tag key applied to every resource for identification | string | "k3s-provisioner" | no |
| unique_tag_value | Freeform tag value applied to every resource for identification | string | "https://github.com/mbologna/k3s-oci" | no |
| user_ocid | OCID of the OCI user running Terraform (format: ocid1.user.oc1..xxx). Required when enable_longhorn_backup = true to automatically create a Customer Secret Key for S3-compatible access, wire the Longhorn backup credentials Kubernetes Secret, and apply the Longhorn BackupTarget in cloud-init. When null, the Longhorn backup bucket is still created but wiring is manual (follow the longhorn_backup_setup output instructions). | string | null | no |
| worker_memory_in_gbs | RAM in GB per worker node. | number | 6 | no |
| worker_ocpus | OCPUs per worker node. | number | 1 | no |
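
As a rough illustration of how the inputs above fit together, here is a minimal `terraform.tfvars` sketch. It sets only the six inputs marked as required, follows the table's recommendation to give the IAM names a per-cluster prefix, and restates the sizing defaults to show the Always Free arithmetic. Every value is a placeholder: the OCIDs, the cluster name `demo`, and the email address are assumptions, not values from this repository.

```hcl
# terraform.tfvars (illustrative values only; substitute your own)

# Required inputs (the rows marked "yes" in the inputs table)
tenancy_ocid              = "ocid1.tenancy.oc1..exampleuniqueid"     # placeholder OCID
compartment_ocid          = "ocid1.compartment.oc1..exampleuniqueid" # placeholder OCID
availability_domain       = "Uocm:EU-FRANKFURT-1-AD-1"
cluster_name              = "demo"            # hypothetical cluster name
certmanager_email_address = "ops@example.com" # must be a real address in practice
my_public_ip_cidr         = "1.2.3.4/32"

# Recommended: IAM names unique per tenancy ("<cluster_name>-...")
oci_identity_dynamic_group_name = "demo-dynamic-group"
oci_identity_policy_name        = "demo-policy"

# Sizing defaults, restated to make the Always Free budget explicit:
#   OCPUs:        3 servers x 1 + 1 worker x 1 = 4     (limit: 4)
#   RAM:          3 servers x 6 + 1 worker x 6 = 24 GB (limit: 24 GB)
#   Boot volumes: 4 nodes x 50 GB             = 200 GB (limit: 200 GB)
k3s_server_pool_size    = 3
k3s_standalone_worker   = true
k3s_worker_pool_size    = 0
server_ocpus            = 1
server_memory_in_gbs    = 6
worker_ocpus            = 1
worker_memory_in_gbs    = 6
boot_volume_size_in_gbs = 50
```

Because the defaults already sit exactly at the free-tier limits, raising any one sizing value means lowering another to stay within the Always Free allocation.
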

| Name | Description |
|---|---|
| argocd_initial_password_hint | Command to retrieve the ArgoCD initial admin password (run after cluster is up) |
| bastion_ocid | OCID of the OCI Bastion Service resource (null if enable_bastion = false). Use with example/get-kubeconfig.sh or oci bastion session create-managed-ssh. |
| grafana_admin_credentials | Grafana admin credentials (only available after cluster bootstrap) |
| internal_lb_ip | Private IP of the internal load balancer (used by agents to join the cluster) |
| k3s_servers_private_ips | Private IPs of k3s control-plane nodes |
| k3s_standalone_worker_private_ip | Private IP of the standalone worker node (oci_core_instance, not pool-managed) |
| k3s_token | k3s cluster join token (sensitive) |
| k3s_workers_private_ips | Private IPs of k3s worker nodes (instance pool) |
| kubeconfig_hint | How to retrieve kubeconfig after cluster is up |
| longhorn_backup_setup | Longhorn backup bucket info and wiring status. Null if enable_longhorn_backup = false. |
| longhorn_ui_credentials | Longhorn UI credentials (only set when longhorn_hostname is configured) |
| mysql_admin_credentials | MySQL HeatWave admin credentials (sensitive). Null if enable_mysql = false. |
| mysql_endpoint | MySQL HeatWave connection endpoint (hostname:port). Null if enable_mysql = false. |
| notification_topic_endpoint | OCI Notifications HTTPS endpoint for the Alertmanager webhook receiver (null if enable_notifications = false). |
| oci_log_group_id | OCI Log Group OCID for k3s cloud-init logs (null if enable_oci_logging = false) |
| public_nlb_ip | Public IP address of the NLB (point your DNS here) |
| ssh_command | SSH command to connect to a cluster node via the public NLB (null if expose_ssh = false). Routes to any available server. |
| ssh_host_public_key | Shared SSH host public key deployed to all nodes. Add to known_hosts with: ssh-keygen -R && terraform output -raw ssh_host_public_key \| ssh-keyscan -f - >> ~/.ssh/known_hosts (or simply ssh-keyscan >> ~/.ssh/known_hosts after apply). |
| tailscale_vault_secret_names | OCI Vault secret names for the Tailscale operator OAuth credentials (null if enable_tailscale = false). Reference these names in the ExternalSecret (platform//tailscale-operator/oauth-secret.yaml). |
| terraform_state_backend | S3-compatible backend config snippet for storing Terraform state in the provisioned OCI Object Storage bucket. Replace and add S3 credentials (OCI Customer Secret Key). |
| vault_id | OCI Vault OCID (null if enable_vault = false) |
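
The outputs above can be re-exported from a calling configuration. The following is a sketch under stated assumptions rather than the repository's own example: the module label `k3s` and the `source` path are hypothetical, and the input values are placeholders. Only the input and output names are taken from the tables.

```hcl
# Root configuration that calls the module and re-exports a few outputs.
module "k3s" {
  source = "../" # hypothetical path; point this at wherever the module lives

  # Required inputs (placeholder values)
  tenancy_ocid              = "ocid1.tenancy.oc1..exampleuniqueid"
  compartment_ocid          = "ocid1.compartment.oc1..exampleuniqueid"
  availability_domain       = "Uocm:EU-FRANKFURT-1-AD-1"
  cluster_name              = "demo"
  certmanager_email_address = "ops@example.com"
  my_public_ip_cidr         = "1.2.3.4/32"

  # Optional: without this, the ssh_command output is null
  expose_ssh = true
}

# Public IP of the NLB; point your DNS records here.
output "public_nlb_ip" {
  value = module.k3s.public_nlb_ip
}

# Null unless expose_ssh = true.
output "ssh_command" {
  value = module.k3s.ssh_command
}

# The join token is listed as sensitive, so a root output that
# re-exports it must be declared sensitive as well.
output "k3s_token" {
  value     = module.k3s.k3s_token
  sensitive = true
}
```

After `terraform apply`, `terraform output public_nlb_ip` prints the address, and `terraform output -raw k3s_token` reveals the sensitive value on request.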