Internals — Reconciliation & Runtime¶

This document describes how the operator behaves at runtime: the reconciliation lifecycle, child controller pattern, status reporting, and resource cleanup. For the structural overview (CRD hierarchy, configuration model, config rendering), see Architecture. For the full lifecycle task reference (pod state machine, retry semantics, upgrade modes), see Lifecycle.

Reconciliation Lifecycle¶

When a Superset CR is created or updated, the parent controller runs through five sequential phases:

Preflight — Fetch the Superset CR, check the suspend flag
Shared Resources — ServiceAccount
Lifecycle Tasks — Create/update SupersetLifecycleTask child CRs (gates everything below)
Component Reconciliation — Resolve shared spec (top-level + per-component) into flat child specs, create/update/delete child CRs, reconcile networking/monitoring/network policies
Status Aggregation — Read child CR statuses, set conditions and phase

Phase 1: Preflight¶

The controller fetches the Superset CR. If it no longer exists, the reconciler returns gracefully — Kubernetes garbage collection handles cleanup via owner references.

If spec.suspend is true, the controller sets the Suspended condition to True, updates status, and returns immediately. No task pods run, no child CRs are created or updated, and no resources are deleted. This allows users to pause reconciliation without removing the CR.

Phase 2: Shared Resources¶

ServiceAccount — Created if spec.serviceAccount.create is true (the default). Uses the name from spec.serviceAccount.name or falls back to the parent CR name. Owned by the parent CR and garbage-collected on parent deletion.

Phase 3: Lifecycle Tasks¶

The parent controller creates SupersetLifecycleTask child CRs: {parentName}-clone, {parentName}-migrate, {parentName}-rotate, and {parentName}-init. The parent uses a Get+Create/Delete pattern (never CreateOrUpdate) to avoid races with the task controller's status writes. When a task needs to re-run (checksum mismatch), the parent deletes the old CR and creates a fresh one on the next reconcile.

Tasks run sequentially: clone → migrate → rotate → init. Each task can be independently disabled via disabled: true. Clone also supports periodic re-execution via cronSchedule. Checksums cascade downstream: a re-clone forces re-migrate, which forces re-rotate, which forces re-init.

When a task requires drain (requiresDrain: true, the default for clone, migrate, and rotate), the operator deletes all component child CRs before running that task. The parent verifies all component pods have terminated (not just Deployments deleted) before proceeding to task execution. This ensures no application pods access the metastore during schema changes. If maintenancePage is configured, the parent brings up a maintenance Deployment and switches the web-server Service selector before draining. After tasks complete, Phase 4 recreates all components fresh.

Components do not deploy until all enabled lifecycle tasks complete (or lifecycle is explicitly disabled via spec.lifecycle.disabled: true). If a task is in progress or has failed, Reconcile() returns early with a requeue, skipping Phase 4.

For the full lifecycle reference including pod state machine, retry/backoff semantics, upgrade modes, and drain verification, see Lifecycle.

Phase 4: Component Reconciliation¶

For each of the six deployment components, the parent controller:

Checks if the component is enabled (field present in spec)
If disabled, deletes the child CR (cascade-deletes all owned resources)
If enabled:
- Renders component-appropriate superset_config.py from the parent's secretKey/secretKeyFrom, metastore, config, and per-component config fields via RenderConfig()
- Collects secret env vars: when secretKeyFrom, metastore.uriFrom, or metastore.passwordFrom are set, the operator produces env vars with valueFrom.secretKeyRef pointing at the referenced Secret. In dev mode, inline values produce plain value env vars instead.
- Resolves the shared spec (top-level + per-component) into a flat FlatComponentSpec via ResolveChildSpec()
- Computes a config checksum from shared inputs and rendered config
- Creates or updates the child CR with the fully-flattened spec

After components, the controller reconciles cluster-scoped resources: networking (Ingress or HTTPRoute), monitoring (ServiceMonitor), and network policies (one NetworkPolicy per enabled component).

Phase 5: Status Aggregation¶

The controller reads each child CR's status via unstructured GET (using the correct GVK per component type), extracts the ready field (format: "readyReplicas/desiredReplicas"), and aggregates into the parent status.

All components ready	Phase	Available condition
Yes	`Running`	`True`
No	`Degraded`	`False`

Child Controller Pattern¶

Each child CRD (SupersetLifecycleTask, SupersetWebServer, SupersetCeleryWorker, etc.) has its own controller that reconciles the Kubernetes resources for that component.

Scalable components (WebServer, CeleryWorker, CeleryFlower, WebsocketServer, McpServer) manage a Deployment and support replicas, HPA, and PDB. Their specs embed ScalableComponentSpec, which has DeploymentTemplate, PodTemplate, and scaling fields.

Singleton components (SupersetLifecycleTask, CeleryBeat) run exactly one instance. SupersetLifecycleTask manages bare Pods with retry logic (uses PodTemplate only). CeleryBeat manages a Deployment but forces replicas: 1 (has both DeploymentTemplate and PodTemplate but no scaling fields).

All deployment controllers follow the same pattern: reconcile ConfigMap (if applicable), reconcile Deployment, reconcile Service (if the component exposes a port), reconcile scaling (HPA + PDB for scalable components), and update status. The task controller reconciles a ConfigMap and manages bare Pods.

Why ConfigMaps¶

Superset imports superset_config as a standard Python module, which means the config must exist as a .py file on the filesystem. A ConfigMap volume mount is the standard Kubernetes mechanism for projecting files into containers:

Python import requirement — superset_config.py must be a real file on disk; environment variables and downward API projections cannot serve as importable Python modules
Operability — kubectl get cm shows exactly what config each component is running, making debugging straightforward
Clean pod manifests — Without ConfigMaps, the rendered Python config would need to be inlined on the pod spec (as annotations or env vars), making Deployment manifests difficult to read. ConfigMaps keep pod specs focused on container configuration

Ownership and Checksum Flow¶

ConfigMaps are created and owned by the parent Superset controller (not by child CRs). This means:

ConfigMaps survive child CR deletion (e.g., during drain)
The parent is the single writer of config content
Child controllers mount ConfigMaps by conventional name without managing them

The parent computes a ConfigChecksum and passes it to child CRs via spec.configChecksum. Child controllers stamp this as a pod template annotation to trigger rolling restarts when config changes. This design follows the principle that the checksum should be computed by whoever writes the data — since the parent renders and writes the ConfigMap, it is the authority on when content changed. Passing the checksum to child CRs avoids requiring child controllers to watch or read ConfigMaps they don't own.

What Each Component Creates¶

Component	ConfigMap	Workload	Service	HPA	PDB
Migrate (task)	superset_config.py	bare Pod	—	—	—
Init (task)	superset_config.py	bare Pod	—	—	—
WebServer	superset_config.py	Deployment (gunicorn)	port 8088	if set	if set
CeleryWorker	superset_config.py	Deployment (celery worker)	—	if set	if set
CeleryBeat	superset_config.py	Deployment (celery beat)	—	—	—
CeleryFlower	superset_config.py	Deployment (celery flower)	port 5555	if set	if set
WebsocketServer	—	Deployment (node.js)	port 8080	if set	if set
McpServer	superset_config.py	Deployment (fastmcp)	port 8088	if set	if set

CeleryBeat is a singleton — the controller forces replicas: 1 regardless of the spec, and does not create an HPA or PDB.

WebsocketServer is Node.js-based and does not get a superset_config.py ConfigMap.

Deployment Builder¶

All child controllers delegate to buildDeploymentSpec(), which constructs a complete Deployment spec from the flat FlatComponentSpec and a component-specific DeploymentConfig:

type DeploymentConfig struct {
    ContainerName  string                 // e.g., "superset-web-server"
    DefaultCommand []string               // e.g., ["/usr/bin/run-server.sh"]
    DefaultArgs    []string               // optional
    DefaultPorts   []corev1.ContainerPort // e.g., [{Name: "http", Port: 8088}]
    ForceReplicas  *int32                 // non-nil only for beat (=1)
}

Replicas resolution order:

ForceReplicas (beat singleton) — always wins
nil if HPA is configured — HPA manages scaling
spec.Replicas otherwise

Idempotent Reconciliation¶

All resource creation uses controllerutil.CreateOrUpdate(): creates the resource if it doesn't exist, updates it if the spec has drifted. This makes every reconciliation cycle safe to re-run.

Labels and Annotations¶

The operator sets reserved labels on child CRs (SupersetLifecycleTask, SupersetWebServer, etc.) and NetworkPolicies for resource discovery and orphan cleanup.

Operator-Managed Labels¶

Label	Value	Purpose
`app.kubernetes.io/name`	`superset`	Application identity
`app.kubernetes.io/component`	Component type (e.g., `web-server`)	Component type filtering
`superset.apache.org/parent`	Parent Superset CR name	Parent-scoped discovery

These labels are set by the operator on every reconciliation and cannot be overridden — operator-managed labels are applied last, taking precedence over any existing values.

Sub-resources (Deployments, Services, ConfigMaps) created by child controllers use the standard app.kubernetes.io/* labels with app.kubernetes.io/instance set to the child CR name for selector matching.

Orphan Cleanup¶

When a component is disabled, the operator uses label-based discovery to find and delete orphaned child CRs. On each reconcile, it lists all child CRs matching the parent and component type labels, then deletes any whose name does not match the currently desired name. Deleting a child CR cascades to all its owned sub-resources via owner references.

Checksum-Driven Rollouts¶

Config changes must trigger pod restarts for the new config to take effect. The operator achieves this through checksum annotations on the pod template.

How It Works¶

Parent controller computes checksums when building child CRs
Checksums are stored on the child CR spec
Child controller stamps them as pod template annotations
When a checksum changes, the pod template changes, and Kubernetes triggers a rolling restart

Checksum Types¶

Annotation	Source	Scope
`superset.apache.org/config-checksum`	Rendered superset_config.py	Per-component

Per-component isolation: Changing a component's config only changes that component's config checksum -- only its pods restart. Other components are unaffected.

Secret safety: In prod mode, operator-managed secret values (secretKeyFrom, metastore.uriFrom, metastore.passwordFrom, valkey.passwordFrom) are never read by the operator and therefore never appear in checksums, annotations, or ConfigMaps. In dev mode, inline secret values (secretKey, metastore.password, valkey.password) influence the shared config checksum (as a hash, not plaintext) because changes to these values must trigger a rollout.

Garbage Collection¶

The operator uses Kubernetes owner references for automatic cleanup. The parent Superset CR owns child CRDs (SupersetLifecycleTask, SupersetWebServer, etc.), the web-server Service, networking resources, ServiceMonitor, and NetworkPolicies. Each child CR owns its managed resources — deployment CRDs own their Deployment, ConfigMap, Service (except web-server, which is parent-owned), HPA, and PDB; the SupersetLifecycleTask CRDs own their ConfigMap and Pods. Deleting the parent cascades to all child CRs, which cascade to all their owned resources. Removing a component from the parent spec (e.g. deleting spec.celeryWorker) deletes its child CR, cascading to all owned resources.

Maintenance Page (Parent-Owned Service Selector Switch)¶

When spec.lifecycle.maintenancePage is set, the operator serves a maintenance page during drain and lifecycle tasks. This section documents the design decision behind the traffic switchover mechanism.

Problem¶

During drain, component child CRs are deleted. GC cascades this to Deployments and Pods. Without intervention, users experience connection errors instead of a friendly maintenance message.

Solution: Parent-Owned Web-Server Service¶

The parent controller owns the web-server Service directly (not the child CR). During lifecycle drain, the parent:

Creates a maintenance Deployment (parent-owned) running a lightweight HTTP server (nginx:alpine by default or a user-provided image).
Switches the web-server Service's selector to match the maintenance-page pod labels, instantly routing traffic to maintenance pods.
Drains all component child CRs (GC cascades to Deployments and Pods, but the Service is unaffected because it belongs to the parent).
Runs lifecycle tasks (clone → migrate → rotate → init).
After tasks complete and the web-server child CR is recreated, waits for the web-server Deployment to become ready.
Switches the Service selector back to the web-server pod labels.
Deletes the maintenance Deployment and its ConfigMap.

Why Parent-Owned Service¶

Service selector changes propagate in ~1 second via the endpoints controller, giving instant traffic switchover regardless of ingress implementation
Works for all access patterns: Ingress, Gateway API, direct Service
No orphan deletion complexity — the Service is always owned by the parent, so GC of child CRs never affects it
The child SupersetWebServer reconciler skips Service management (the parent handles it), keeping the child controller simple

Note for developers using kubectl port-forward: port-forward establishes a tunnel to a specific pod, not through the Service selector. When that pod is deleted during drain, the tunnel breaks with a "lost connection to pod" error. This does not affect Ingress/Gateway users — they route through EndpointSlices and see seamless transitions. Restart port-forward to reconnect to the maintenance pod.

Alternatives Considered¶

Orphan deletion + selector patch (previous design): Used propagationPolicy: Orphan when deleting the SupersetWebServer child CR to preserve the Service, then patched the selector. Rejected because orphan lifecycle was fragile — race conditions between GC finalization and reconciliation, plus the child had to detect and re-adopt the orphaned Service on recreation.

Separate maintenance Service + Ingress/HTTPRoute backend swap: Architecturally pure (clean separation, no interaction with web-server resources), but rejected because Ingress/HTTPRoute propagation latency varies significantly by controller implementation — from ~1s (Envoy-based) to 1-3 minutes (cloud load balancers like GCP/AWS). This creates an unacceptable error window where users hit the draining backend. Also doesn't work for users without networking configured.

Status and Conditions¶

Parent Status¶

The parent Superset CR reports aggregate status:

status:
  phase: Running
  observedGeneration: 3
  version: "latest"
  components:
    webServer:
      ready: "2/2"
    celeryWorker:
      ready: "4/4"
    celeryBeat:
      ready: "1/1"
  conditions:
    - type: Available
      status: "True"
      reason: AllComponentsReady
    - type: InitComplete
      status: "True"
      reason: InitComplete
    - type: Suspended
      status: "False"

Parent Phase¶

The top-level status.phase reflects the overall instance state:

Phase	Meaning
`Initializing`	First deployment — lifecycle tasks running for the first time
`Upgrading`	Image change detected — lifecycle tasks running against new version
`Draining`	Drain strategy active — components being removed before running tasks
`Running`	All enabled components are ready and lifecycle is complete
`Degraded`	One or more components are not fully ready
`Suspended`	`spec.suspend: true` — all reconciliation paused
`Blocked`	Downgrade detected — lifecycle tasks will not run (manual intervention required)
`AwaitingApproval`	Supervised upgrade mode — waiting for approval annotation before proceeding

Child Status¶

Each child CR reports its own status:

status:
  ready: "2/3"
  observedGeneration: 5
  conditions:
    - type: Ready
      status: "False"
      reason: PartiallyReady
      message: "2 of 3 replicas ready"
    - type: Progressing
      status: "True"
      reason: RolloutInProgress

Ready condition states:

State	Meaning
`True` / `AllReplicasReady`	readyReplicas >= desiredReplicas and > 0
`False` / `PartiallyReady`	Some replicas ready, not all
`False` / `NotReady`	Zero replicas ready

Progressing condition states:

State	Meaning
`True` / `RolloutInProgress`	Deployment is rolling out new pods
`False` / `RolloutComplete`	New ReplicaSet is fully available

Error Handling Summary¶

Scenario	Behavior
Superset CR deleted during reconcile	Graceful return (not found)
Init pod fails	Retry with backoff up to maxRetries, then permanent failure
Init pod times out	Counts as failed attempt, same retry logic
Child CR creation fails	Error propagated, reconcile retried by controller-runtime
Optional CRD missing (Gateway API, ServiceMonitor)	Log and continue — feature disabled gracefully
Referenced Secret values change	Pods see new values only after restart; update `forceReload` to trigger rollout
Component removed from spec	Child CR deleted, cascade cleans up all resources
Suspend enabled	All reconciliation paused, no resources created or deleted