xDP Deployment Models

This page is the architectural reference for how Acceldata xDP can be deployed across regions, tenants, environments, and trust boundaries. It captures the four canonical topologies the platform supports — Models A through D — explains where each one is the right choice, and shows what is actually configured in the xDP codebase to realize them.

Two independent design questions drive every xDP deployment decision:

  1. Control Plane topology — is the platform operated as a federated Control Plane (one global CP orchestrating many dataplanes for many tenants), or as an isolated Control Plane (one or more independently operated CPs, including sovereign installs inside the customer's own trust boundary)? Within the federated Control Plane, the choice is between Model A (multi-dataplane federation) and Model B (multi-tenant single dataplane). Within the isolated Control Plane, the choice is between Model C (one CP fronted by many domain URLs) and Model D (one CP per environment).
  2. Dataplane topology under each Control Plane — how are tenants and dataplanes carved up underneath that CP? Models A and B answer this question for the federated CP; Models A and B can also be applied inside any isolated CP that Models C and D set up.

Most enterprise deployments compose the answers. For example, Model D provides environment isolation (separate isolated CPs for Dev/Stage/Prod), Model A inside each CP provides region-level dataplane separation, and Model B inside each dataplane provides tenant pooling.

Overview

xDP runs as a set of Kubernetes-native workloads. The Control Plane (orchestration, identity, policy) runs as long-lived deployments on a dedicated node pool. The Dataplane (Spark, Trino, Flink, Kafka, Airflow, JupyterHub) runs as workloads scheduled by Yunikorn. Authorization is enforced by Apache Ranger inside xCentral. Metadata flows through xStore.

Three principles inform every model that follows:

  • Control and data are separated — the Control Plane never holds customer data, and the Dataplane never owns identity.
  • Compute is decoupled from storage — clusters are spun up and down freely while data and metadata persist on shared, governed substrates.
  • Policy is enforced at the catalog protocol layer rather than in each engine, which prevents engine-specific bypass.

The Federated Control Plane

A single global Control Plane sits above all managed clusters and presents a unified administrative surface. It runs in HA on a dedicated node pool and consists of the following services:

  • Federation registry — the system of record for every cluster, dataplane, region, and tenant. Every managed entity has a globally unique identity, ownership metadata, and lifecycle state. All other CP services key off this registry.
  • Domain manager — owns the tenancy taxonomy. A domain is a logical scope that groups namespaces, quotas, catalogs, and identity bindings. Domains are hierarchical: a parent organization can hold child domains for business units.
  • Cluster orchestrator — drives cluster lifecycle (create, scale, upgrade, decommission) by issuing reconciled changes to local control agents on each dataplane.
  • Policy and RBAC engine — enforces both global rules (data residency, compliance class) and tenant-scoped rules (which roles a user holds where). Federates with enterprise identity over LDAP, SAML, or OIDC.
  • Cross-cluster scheduler — decides which dataplane should host a workload when multiple are eligible, based on capacity, residency, and cost signals.
  • Observability and audit — aggregates metrics, logs, and the activity trail across all dataplanes. Every administrative action is recorded with actor, timestamp, and outcome for compliance review.
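
To make the cross-cluster scheduler's role concrete, the following is a minimal Python sketch of a placement decision that weighs residency, capacity, and cost. Everything in it is an assumption made for illustration: the `Dataplane` fields, the `pick_dataplane` function, the sample fleet, and the rule itself (filter on residency and free capacity, then take the cheapest). It is not the actual xDP scheduler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataplane:
    """Hypothetical view of one registry entry; field names are illustrative."""
    name: str
    region: str
    free_vcpu: int
    cost_per_vcpu_hour: float


def pick_dataplane(
    candidates: list[Dataplane],
    required_vcpu: int,
    allowed_regions: set[str],
) -> Optional[Dataplane]:
    """Return the cheapest dataplane that satisfies residency and capacity."""
    eligible = [
        dp for dp in candidates
        if dp.region in allowed_regions and dp.free_vcpu >= required_vcpu
    ]
    if not eligible:
        return None  # no dataplane may legally and physically host the workload
    # Tie-break on name so the choice is deterministic.
    return min(eligible, key=lambda dp: (dp.cost_per_vcpu_hour, dp.name))


# Sample fleet with made-up numbers.
FLEET = [
    Dataplane("us-east", "us", free_vcpu=64, cost_per_vcpu_hour=0.040),
    Dataplane("eu-central", "eu", free_vcpu=128, cost_per_vcpu_hour=0.050),
    Dataplane("eu-west", "eu", free_vcpu=16, cost_per_vcpu_hour=0.030),
]

choice = pick_dataplane(FLEET, required_vcpu=32, allowed_regions={"eu"})
print(choice.name if choice else "no eligible dataplane")
```

Residency is treated here as a hard filter rather than a weighted score, on the assumption that a residency rule is never traded off against cost.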

Model A — Multi-Dataplane Federation

In Model A, each dataplane is a physically separate xDP deployment with its own Kubernetes cluster, local control agent, scheduler, authorization layer, and storage backend. The global Control Plane treats each dataplane as a managed entity and orchestrates them in parallel.

When to use Model A

This model is the right choice when isolation requirements cannot be satisfied by Kubernetes namespaces alone. Common drivers:

  • Data residency — GDPR, India's DPDP Act, China's PIPL, and similar regulations restrict where data subjects' records may be stored and processed. A dataplane per region enforces this at the infrastructure level.
  • Network sovereignty — air-gapped environments, customer VPCs, or partner networks where data cannot transit shared infrastructure. The dataplane is deployed inside the trust boundary and connects outbound to the global CP through a regional gateway.
  • Blast radius — an outage, misconfiguration, or compromise of one dataplane cannot affect others. This matters for production-versus-non-production separation and for multi-tenant SaaS deployments where one tenant's load must not impact another.
  • Heterogeneous infrastructure — mixing on-prem, AWS, GCP, and Azure under one administrative plane. Each environment runs the same xDP dataplane software but on its native Kubernetes distribution (OpenShift, EKS, GKE, AKS).
  • Compliance class separation — FedRAMP-bound, HIPAA-bound, and PCI-bound workloads benefit from being on entirely separate infrastructure to simplify audit scoping.

Operational characteristics

Multi-dataplane deployments require active reconciliation work that single-dataplane deployments do not. Image versions, configuration templates, and policy bundles must be propagated to every dataplane and verified for drift. The global Control Plane handles this through a pull-based reconciliation loop — each local agent periodically requests its desired state from the global plane and reports its observed state back. Cross-dataplane queries (for example, a Trino query that joins a US table with an EU table) are handled through the federated catalog rather than data movement, which preserves residency guarantees.
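
The pull-based loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the xDP agent: `GlobalPlane`, `LocalAgent`, `reconcile_once`, and the state keys (`spark_image`, `policy_bundle`) are hypothetical names, and state is modelled as a flat dictionary of versions.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalPlane:
    """Stand-in for the global Control Plane (hypothetical interface)."""
    desired: dict[str, dict[str, str]] = field(default_factory=dict)
    observed: dict[str, dict[str, str]] = field(default_factory=dict)

    def get_desired_state(self, dataplane: str) -> dict[str, str]:
        return dict(self.desired.get(dataplane, {}))

    def report_observed_state(self, dataplane: str, state: dict[str, str]) -> None:
        self.observed[dataplane] = dict(state)


@dataclass
class LocalAgent:
    """Stand-in for the local control agent on one dataplane."""
    name: str
    running: dict[str, str] = field(default_factory=dict)

    def reconcile_once(self, cp: GlobalPlane) -> dict[str, tuple]:
        """One pull cycle: fetch desired state, apply the diff, report back."""
        desired = cp.get_desired_state(self.name)
        drift = {
            key: (self.running.get(key), want)
            for key, want in desired.items()
            if self.running.get(key) != want
        }
        for key, (_, want) in drift.items():
            self.running[key] = want  # a real agent would roll out the change here
        cp.report_observed_state(self.name, self.running)
        return drift


cp = GlobalPlane(desired={"eu-central": {"spark_image": "3.5.1", "policy_bundle": "v42"}})
agent = LocalAgent("eu-central", running={"spark_image": "3.5.0", "policy_bundle": "v42"})
print(agent.reconcile_once(cp))  # drift found on the first cycle
print(agent.reconcile_once(cp))  # nothing left to change on the second
```

Because the agent initiates every exchange, the dataplane needs only outbound connectivity to the global plane. The sketch ignores keys that are running but no longer desired; a fuller version would also handle removals.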

Model B — Multi-Tenant Single-Dataplane Domains

In Model B, a single dataplane hosts many tenants, each isolated through Kubernetes-native primitives rather than separate infrastructure. Each tenant maps to:

  • A dedicated Kubernetes namespace (e.g., tenant-mktg, tenant-fin) with network policies that prevent pod-to-pod traffic across namespaces.
  • A Yunikorn queue with reserved capacity (vCPU and memory), maximum capacity (burst limits), and a fair-share weight. Pre-emption rules ensure that a tenant exceeding its burst cannot starve another tenant's reserved share.
  • A ResourceQuota and LimitRange at the namespace level — a hard ceiling that Kubernetes enforces regardless of scheduler decisions.
  • A logical catalog domain in xStore — domain-mktg, domain-fin — that segregates table metadata and lineage.
  • An SSO group binding that maps enterprise identity directly to the tenant's domain and Yunikorn queue.
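
As a rough illustration of the per-tenant mapping above, the Python sketch below derives the namespace and catalog-domain names and renders two of the Kubernetes objects as plain dictionaries. `TenantSpec`, its fields, and the object names are hypothetical, and the Yunikorn queue and SSO binding are left out. The `ResourceQuota` and `NetworkPolicy` shapes follow the standard Kubernetes API, but the values are examples only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantSpec:
    """Hypothetical tenant description; names and fields are illustrative."""
    short_name: str       # e.g. "mktg"
    max_vcpu: int         # hard ceiling enforced through ResourceQuota
    max_memory_gib: int

    @property
    def namespace(self) -> str:
        return f"tenant-{self.short_name}"

    @property
    def catalog_domain(self) -> str:
        return f"domain-{self.short_name}"


def resource_quota(t: TenantSpec) -> dict:
    """Namespace-level hard ceiling, independent of scheduler decisions."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{t.namespace}-quota", "namespace": t.namespace},
        "spec": {"hard": {
            "limits.cpu": str(t.max_vcpu),
            "limits.memory": f"{t.max_memory_gib}Gi",
        }},
    }


def same_namespace_only_policy(t: TenantSpec) -> dict:
    """NetworkPolicy that admits ingress only from pods in the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": t.namespace},
        "spec": {
            "podSelector": {},  # applies to every pod in the tenant namespace
            "policyTypes": ["Ingress"],
            # A bare podSelector peer matches pods in this namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


mktg = TenantSpec("mktg", max_vcpu=32, max_memory_gib=128)
print(mktg.namespace, mktg.catalog_domain)
print(resource_quota(mktg)["spec"]["hard"])
```

Deriving every name from one short tenant identifier keeps the namespace, quota, and catalog domain aligned, which is what makes onboarding a matter of a single entry.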

When to use Model B

Model B is the right choice when many tenants share the same compliance zone, and their workloads can safely coexist on shared infrastructure. The benefits are operational efficiency (one dataplane to upgrade, patch, monitor) and cost efficiency (pooled capacity, bin-packed scheduling, shared system services). Onboarding a new tenant takes a namespace, quota, and policy entry — minutes rather than days.

Isolation strength

Namespace-based isolation is logical, not physical: tenants still share the cluster's nodes and control plane. A determined adversary who breaks out of a container or obtains cluster-admin access can cross the namespace boundary. For that reason, Model B is generally suitable for trusted internal tenants (different teams within the same organization) but not for hostile multi-tenancy (different paying customers in a shared SaaS deployment). Where stronger isolation is needed within a single dataplane, xDP supports gVisor or Kata Containers as a runtime class on selected node pools, which adds kernel-level isolation at the cost of some performance.

The Isolated Control Plane

In contrast to the federated Control Plane — which presents many dataplanes under a single global administrative surface — an isolated Control Plane is one in which each Control Plane instance is operated as a self-contained unit, without any cross-instance federation. There is no global coordinator above it, no shared identity tier alongside it, and no shared database between it and any other Control Plane. Everything the platform needs to function — the xDP UI, the Control Plane service, the identity provider, the API gateway, the database, and the dataplanes it manages — is provisioned and operated as one bounded deployment.

The isolated Control Plane is the foundation of sovereign deployment. In a sovereign install every component of the stack lives inside the customer's own trust boundary: the platform runs on the customer's Kubernetes, pulls images from a private registry inside the customer's network, authenticates against a customer-deployed identity provider, and writes all metadata, audit logs, and policy state to databases the customer controls. There is no outbound dependency on vendor-hosted infrastructure, and the deployment can be disconnected from the public internet without losing operational capability.

Customers choose an isolated Control Plane over a federated one for the following reasons:

  • Data sovereignty and regulatory residency — data, metadata, and identity must remain inside a defined legal or geographic boundary, with no possibility of leakage through a shared global tier.
  • Air-gapped operation — the deployment must function without outbound connectivity to vendor infrastructure. Image updates, telemetry, and licensing are handled offline through customer-controlled transfer paths.
  • Customer-managed Control Plane — the customer's platform team operates the Control Plane itself on their own change-management cadence, rather than consuming a vendor-managed SaaS.
  • Compliance scope reduction — by isolating each Control Plane (per environment, per business unit, or per compliance class), audits can be scoped narrowly to the systems that genuinely process regulated data.

Two topologies share this foundation. Model C runs a single isolated Control Plane that is fronted by multiple public hostnames; the Control Plane stack is shared across the URLs and only DNS, ingress, and identity-provider configuration differ. Model D runs many isolated Control Planes — typically one per environment, such as QE, Dev, Staging, and Production — that are deliberately kept independent of each other, with no data, audit trail, or policy decision crossing an environment boundary. Both models compose on the dataplane side with Models A and B inside each Control Plane.

Model C — One Control Plane Serving Multiple Domain URLs

In Model C, a single Control Plane installation answers to multiple public domain names, all routing to the same backend services, the same database, the same identity provider, and the same tenant set. Two or more fully-qualified domain names — for example, dp-n.acceldata.tech and dp-n.acceldata.dev — resolve through DNS to the same xDP ingress, terminate TLS at the same gateway, and deliver traffic to one set of Control Plane services. From the platform's point of view there is one Control Plane database, one set of tenants, and one upgrade pipeline. From the user's point of view, either domain reaches the same xDP UI and the same data.

When to use Model C

This model is the right choice when DNS variation is the only difference required between two views of the platform. Common drivers:

  • Brand and vanity domains — partner co-branding (for example, data.partner.com placed next to a primary dp-n.acceldata.tech), mergers and acquisitions, or short-lived launch domains during a marketing event that must point at the same backend as the canonical domain.
  • DNS rebrand and cutover — a new apex domain (for example, .dev to .tech) needs to coexist with the old one for the lifetime of bookmarks, saved links, and OAuth client registrations. Both names continue to work until the old one is retired.
  • Internal-versus-external sharing — an internal-facing domain (dp-n.acceldata.dev) is used for smoke tests and pre-production validation, while the customer-facing domain (dp-n.acceldata.tech) carries production traffic. Both domains hit the same backend, so internal tests reproduce exactly what the customer sees.
  • Region-aware geo-DNS — the same logical Control Plane is fronted by region-specific domains (for example, us.dp-n.acceldata.tech, eu.dp-n.acceldata.tech) for latency or marketing reasons, with traffic routed to one stack via DNS or anycast.

Operational characteristics

Model C is realised at the DNS, ingress, and identity layers — the xDP application binaries do not need to be aware of how many domains front them. A small number of touch points must remain aligned, however, and missing one of them is the most common Model C failure:

  • DNS and TLS — every public hostname needs an A or CNAME record pointing at the Control Plane ingress, and a TLS certificate that covers it. A single certificate with multiple Subject Alternative Names is the simplest path; per-host certificates work equally well when the ingress terminates by SNI.
  • Tenant resolution by hostname — xDP derives the identity realm and the identity host directly from the hostname the browser is using. Every public hostname must therefore have a matching realm in the identity provider and a matching accounts.<rest-of-hostname> host that resolves to the same identity service. As long as the realm name is identical across both domains, the same tenant is reached from either.
  • OIDC client registration — every public FQDN must be registered as a valid redirect URI and web origin on the OIDC client for each realm. A missing entry typically surfaces as a blank login screen rather than an explicit error, which makes interactive validation through a real browser the recommended sign-off step after every domain change.
  • Canonical domain for dataplane callbacks — when xDP installs a new dataplane, the dataplane is stamped with the canonical Control Plane hostname it must dial back to. Alternate domains are fine for interactive UI traffic, but the dataplane-to-Control-Plane control channel continues to use the canonical domain it was installed with. Plan DNS so that the canonical name is treated as permanent for the lifetime of the deployment; alternates can come and go around it.
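
The hostname-related touch points above lend themselves to an automated pre-check. The Python sketch below is one way to do it, under two loudly stated assumptions: that `accounts.<rest-of-hostname>` means "replace the first DNS label with `accounts`", and that the redirect URI is the origin plus a callback path. The `/callback` default is a placeholder, not a documented xDP path, and both function names are hypothetical.

```python
def identity_host(public_host: str) -> str:
    """Assumed rule: replace the first DNS label with 'accounts'."""
    first_label, _, rest = public_host.partition(".")
    if not first_label or not rest:
        raise ValueError(f"expected a multi-label hostname, got {public_host!r}")
    return f"accounts.{rest}"


def missing_oidc_entries(
    public_hosts: list[str],
    redirect_uris: set[str],
    web_origins: set[str],
    callback_path: str = "/callback",  # hypothetical path, adjust to the real client
) -> dict[str, list[str]]:
    """Report, per public hostname, which OIDC client entries are absent."""
    gaps: dict[str, list[str]] = {}
    for host in public_hosts:
        origin = f"https://{host}"
        missing = []
        if f"{origin}{callback_path}" not in redirect_uris:
            missing.append("redirect_uri")
        if origin not in web_origins:
            missing.append("web_origin")
        if missing:
            gaps[host] = missing
    return gaps


hosts = ["dp-n.acceldata.tech", "dp-n.acceldata.dev"]
print([identity_host(h) for h in hosts])
print(missing_oidc_entries(
    hosts,
    redirect_uris={"https://dp-n.acceldata.tech/callback"},
    web_origins={"https://dp-n.acceldata.tech", "https://dp-n.acceldata.dev"},
))
```

A check like this catches a missing registration before a user meets the blank login screen, but it does not replace the interactive browser sign-off.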

Tenant isolation in Model C

Domains in Model C do not isolate tenants from each other — they only present different URLs. Tenants remain separated by realm, by Kubernetes namespace, by Yunikorn queue, and by catalog domain, exactly as in Model A or Model B. Adding a second domain to a Control Plane does not add a second tenant boundary. Where isolation between two views of the platform is genuinely required — separate data, separate identity, separate audit trail — Model D is the appropriate choice.

Trade-offs

Aspect | Behavior in Model C
Tenant isolation | None across domains: same Control Plane database, same tenant set, same identity provider. Tenants are separated by realm exactly as they would be on a single-domain Control Plane.
Blast radius | Shared. A Control Plane outage takes every fronting domain down at once.
Upgrade strategy | Single and atomic. All domains see the new version simultaneously.
Operational footprint | Lowest of the four models: one stack, one database, one upgrade pipeline, regardless of how many domains front it.
Use case fit | Brand consolidation, vanity domains, painless DNS rebrand, partner co-branding inside a single trust zone.

What Model C is not

Model C is not a substitute for environment isolation between Dev, Staging, and Production — every domain shares the same Control Plane stack, the same database, and the same release cadence. Model C is not a way to reduce the regulatory scope of an audit, because a compliance review covers everything behind every domain that fronts the Control Plane. Where multiple independent environments are needed — separate data, separate identity, separate audit trail, separate upgrade timelines — Model D is the appropriate choice.

Model D — Separate xDP Instance per Environment

In Model D, each environment is an entirely independent xDP installation built on the isolated Control Plane foundation described above. There is no shared Control Plane, no shared database, no shared identity provider, and no shared dataplane between environments. The same model is also the canonical shape of a sovereign deployment, where every component lives inside the customer's trust boundary and the platform operates without any outbound dependency on vendor-hosted services.

Each environment — typically qe, dev, staging, and prod — receives a complete xDP installation:

  • Its own Control Plane stack — the xDP UI, Control Plane service, identity provider, API gateway, PostgreSQL, Redis, and authorization services run as a self-contained set of deployments dedicated to that environment.
  • Its own dataplanes — every environment has independent xCentral, xStore, and xCompute clusters. Tenants, catalogs, policies, secrets, and audit logs are scoped to one environment and do not leak across.
  • Its own image registry reference — public or vendor-hosted registries for non-restricted environments, or a private mirror inside the customer trust boundary for sovereign installs.
  • Its own DNS, TLS, and OIDC configuration — distinct hostnames per environment, distinct certificates, distinct realms, distinct OIDC clients, and distinct credentials.

No data, metadata, audit trail, or policy decision crosses environment boundaries. Each Control Plane rolls forward on its own schedule — QE can run a release candidate while Production stays on the previous GA build, and a defect found in QE never has a path into Production.

Figure 3. Model D — four self-contained xDP stacks. Each environment has independent identity, governance, catalog, compute, and registry. No control or data path crosses environments.

When to use Model D

This is the right choice when the isolation requirements concern entire environments — not individual tenants and not individual domain URLs. Common drivers:

  • Production-versus-non-production isolation — code, configuration, and image releases promote QE → Dev → Staging → Production through independent stacks, so a build that has only passed QE cannot reach Production by accident. Promotion gates are explicit and auditable, and an incident in one environment cannot cascade into another.
  • Compliance scope reduction — FedRAMP, HIPAA, PCI, SOC 2, and similar audits are dramatically cheaper to scope when only the Production stack is inside the audit boundary. QE, Dev, and Staging stay outside scope because they share no infrastructure, no identity, and no data with Production.
  • Air-gapped and sovereign deployment — the customer requires that no service inside the deployment makes outbound calls to vendor-hosted infrastructure. The Control Plane, the identity provider, the image registry, and all support services run inside the customer's trust boundary, with the platform delivered as installable artifacts rather than as a shared SaaS endpoint.
  • Customer-managed Control Plane for data sovereignty — large enterprises that want operational control of the Control Plane itself, not just the dataplanes. Each environment's Control Plane is administered by the customer's platform team, on the customer's change-management cadence, and inside the customer's network.
  • Per-business-unit autonomy — different business units inside a holding-company structure each operate their own Control Plane with their own identity provider, billing, compliance posture, and lifecycle, independent of the others.
  • Blue-green platform upgrades — a new Control Plane stack can be stood up in parallel, validated end-to-end, and switched in at the DNS layer once parity is proven. The previous stack is retired only after the new one has carried real traffic.

Operational characteristics

Model D scales the operational footprint linearly with the number of environments. The benefits — strong blast-radius containment, clean compliance boundaries, and sovereign deployability — come with corresponding obligations:

  • N upgrade pipelines — every environment is upgraded on its own cadence, with promotion gates between them. A bug fix lands in QE first, is verified, and only then progresses toward Production.
  • N image registries (or N tenants in one registry) — there is no cross-environment image pull path, which prevents a non-Production image from reaching Production but does require image promotion to be a deliberate operation.
  • N identity configurations — realms, clients, SMTP settings, and TLS certificates are templated per environment. The environments share no users, sessions, or tokens.
  • N monitoring stacks, one observability backend — each environment runs its own metrics and logging agents. A single observability backend can collect from all of them via tagged streams, which gives operators a unified view without sharing trust boundaries.
  • No global view across environments — cross-environment workflows, such as promoting a Spark job artifact from Staging to Production, are implemented externally, typically through Git, a CI/CD pipeline, and per-environment API calls. The platform itself does not move artifacts between environments.
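
Because Model D depends on environments sharing nothing, it can be worth linting the per-environment configuration for accidental overlap. The following Python sketch shows one way to do that. `EnvironmentConfig`, its fields, and the example hostnames and paths are all hypothetical, and a real check would cover whatever settings the install actually templates.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class EnvironmentConfig:
    """Hypothetical per-environment settings that must never be shared."""
    name: str
    cp_hostname: str
    registry_path: str
    identity_realm_url: str
    database_dsn: str


ISOLATED_FIELDS = ("cp_hostname", "registry_path", "identity_realm_url", "database_dsn")


def shared_resources(envs: list[EnvironmentConfig]) -> list[tuple[str, str, str]]:
    """Return (env_a, env_b, field) for every value two environments share."""
    clashes = []
    for a, b in combinations(envs, 2):
        for field_name in ISOLATED_FIELDS:
            if getattr(a, field_name) == getattr(b, field_name):
                clashes.append((a.name, b.name, field_name))
    return clashes


# Example values are placeholders on reserved example domains.
ENVS = [
    EnvironmentConfig("qe", "xdp.qe.example.com", "registry.example.com/xdp-qe",
                      "https://accounts.qe.example.com/realms/main", "postgres://cp-db.qe/xdp"),
    EnvironmentConfig("prod", "xdp.example.com", "registry.example.com/xdp-prod",
                      "https://accounts.example.com/realms/main", "postgres://cp-db.prod/xdp"),
]

print(shared_resources(ENVS))  # an empty list means no overlap was found
```

Run in a CI pipeline against the vendored values files, a check of this kind turns "no shared infrastructure" from a convention into a gate.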

Sovereign variant of Model D

When Model D is run as a sovereign deployment, the same per-environment isolation pattern is preserved, but several install-time decisions are made differently to keep the entire stack inside the customer's trust boundary:

  • Image distribution — every xDP image is pulled from a private container registry inside the customer's network. Image updates are pulled from vendor-published artifacts, scanned, signed, and pushed into the private registry through the customer's own pipeline rather than at runtime by the Control Plane.
  • Identity — the identity provider runs inside the customer environment, either as a customer-deployed instance of the platform's identity provider or as a customer-operated SSO that xDP federates with over OIDC. Authentication never traverses an external service.
  • Install path — the operator downloads the rendered Helm values file, vendors it into version control, and applies it through the customer's own deployment pipeline. The Control Plane never reaches out to install or upgrade dataplanes automatically; every change is a deliberate, audited action.
  • Telemetry and licensing — usage reports, audit logs, and license validation function without an outbound connection. Reports are exported on a schedule and uploaded by the customer through whatever transfer path their security policy permits.

The result is an xDP installation that is operationally identical to a managed deployment from the user's point of view, but architecturally isolated to the point that the customer can disconnect it from the public internet and continue to operate every environment independently.

Tenant and environment isolation

Tenant isolation in Model D is the strongest the platform offers. Each environment has its own Control Plane database, so the rule of one xCentral per tenant per Control Plane applies separately in each environment: the same tenant name in two environments produces two distinct xCentral installations with no shared metadata between them. Within an environment, dataplane-to-dataplane traffic on internal links stays inside the environment's own Kubernetes cluster, and traffic on external links is routed through the environment's own ingress. Traffic from one environment never traverses another environment's network, and a credential issued in one environment is not valid in any other.

Trade-offs

Aspect | Behavior in Model D
Tenant and environment isolation | Strongest: separate Control Plane database, separate identity provider, separate dataplanes, separate audit trail.
Blast radius | Confined to one environment. A QE outage cannot affect Production.
Upgrade strategy | Per-environment rolling, with promotion gates between environments.
Operational footprint | Highest of the four models: N self-contained stacks to operate.
Sovereignty | Full, when paired with a private registry and an identity provider inside the customer trust boundary.
Use case fit | Environment isolation, compliance scope reduction, air-gapped and sovereign customers, customer-managed Control Plane, per-business-unit autonomy.

Choosing Between Models — Comparison

The four models answer two independent questions, so they compose rather than compete. The table below compares them on the axes that drive the decision in practice; the row marked "Combines with" identifies which other models can layer on top.

Dimension | Model A (Multi-dataplane) | Model B (Multi-tenant) | Model C (One CP, many domains) | Model D (One CP per environment)
Control plane topology | Federated CP | Federated CP | Isolated CP (single instance, many URLs) | Isolated CP (one instance per environment)
Sovereign-deployable | Dataplane-level only | Dataplane-level only | Yes: single sovereign CP behind multiple domains | Yes: canonical sovereign deployment shape
What it isolates | Dataplanes (regions, zones) | Tenants on one dataplane | Nothing across domains; domains share one CP | Whole CP stack per environment
Isolation strength | Strong: separate infrastructure | Logical: namespaces and quotas | None at CP: DNS multiplexing only | Strongest: separate everything
Onboarding speed | Days to weeks (new infra) | Minutes (namespace + quota) | Hours (DNS + cert + identity provider entry) | Days to weeks per new environment
Operational footprint | One control surface per dataplane | One control surface for all tenants | Single: same as Models A/B on a single CP | N control surfaces (one per environment)
Cost efficiency | Lower: capacity not pooled | Higher: pooled, bin-packed | Highest: fewest stacks | Lowest: N stacks
Blast radius | Confined to one dataplane | All tenants share the dataplane | All domains share the CP | Confined to one environment
Cross-tenant federation | Through the federated catalog | Native within the dataplane | N/A: single CP | External only (no shared catalog across CPs)
Data sovereignty | Per-region | Per-tenant policy | Per CP: single trust boundary covers all domains | Full: per-environment trust boundary
Compliance scope | Per-dataplane | Per-tenant policy | Whole CP in scope | Per-environment scope (audit only Production)
Upgrade strategy | Per-dataplane rolling | Single upgrade, broader impact | Single, atomic, all domains at once | Per-environment promotion (QE → Dev → Stage → Prod)
Suitable for | Compliance zones, regions, hostile tenants | Internal teams, trusted tenants | Brand domains, vanity URLs, DNS rebrands | Prod-vs-non-prod isolation, FedRAMP, air-gapped, sovereign
Combines with | B inside each dataplane; C or D on the CP | A across dataplanes; C or D on the CP | A and B beneath the CP; not with D in the same CP | A and B inside each environment's CP

How models compose in real deployments

  • Pattern 1 — Single-tenant SaaS — Model A (per-region dataplanes) under one global CP. Model C optional for vanity domains.
  • Pattern 2 — Internal enterprise platform — Model B (many internal teams as tenants on a shared dataplane), one CP, Model C if there are multiple corporate brands.
  • Pattern 3 — Regulated multi-environment — Model D for QE/Dev/Stage/Prod separation, with Model A inside each environment for region split, and Model B inside each region for tenant pooling.
  • Pattern 4 — Sovereign single-customer — Model D with a single environment, fully air-gapped, private registry, Model A or B inside as needed.

For a typical enterprise data platform spanning multiple regions, environments, and teams, the recommended pattern is:

  • Use Model D to separate environments. Run a distinct CP for QE, Dev, Staging, and Prod. Audit scope, image promotion, and incident isolation all become tractable.
  • Use Model A inside each environment's CP to carve dataplanes per region or compliance zone — us-east, eu-central, on-prem-prod, on-prem-non-prod.
  • Use Model B inside each dataplane to pool capacity for trusted internal tenants — one namespace, one Yunikorn queue, one xStore domain per tenant.
  • Layer Model C on a single CP only when DNS variation is the goal — vanity domains, partner branding, or DNS cutover. Do not use Model C as a substitute for environment separation.
  • Deploy xCentral first per CP, before any xStore or xCompute. xCentral is one per tenant per CP and is mandatory for governance.
  • Deploy xStore alongside xCentral in the same CP to federate regional Glue and Hive catalogs into a single namespace tree.
  • Standardize on object storage (S3, GCS, ADLS, Vast S3, Ceph S3, or Apache Ozone) as the canonical data layer, with Iceberg as the table format. This is what makes Models A and D portable across clouds and air-gapped sites.
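
To see what the layered recommendation costs operationally, it can help to model the topology as nested data and count what has to be run. The Python sketch below does that for a Model D over Model A over Model B composition. The class names, the `footprint` function, and the sample environments, regions, and tenants are illustrative assumptions, not xDP objects.

```python
from dataclasses import dataclass, field


@dataclass
class DataplaneTopology:
    """One Model A unit: a dataplane in a region or compliance zone."""
    region: str
    tenants: list[str] = field(default_factory=list)  # Model B: pooled tenants


@dataclass
class ControlPlaneTopology:
    """One Model D unit: an isolated Control Plane for one environment."""
    environment: str
    dataplanes: list[DataplaneTopology] = field(default_factory=list)


def footprint(cps: list[ControlPlaneTopology]) -> dict[str, int]:
    """Count the stacks, dataplanes, and tenant namespaces to operate."""
    return {
        "control_planes": len(cps),
        "dataplanes": sum(len(cp.dataplanes) for cp in cps),
        "tenant_namespaces": sum(
            len(dp.tenants) for cp in cps for dp in cp.dataplanes
        ),
    }


# Made-up example: two environments, regional dataplanes, pooled tenants.
TOPOLOGY = [
    ControlPlaneTopology("staging", [DataplaneTopology("us-east", ["mktg", "fin"])]),
    ControlPlaneTopology("prod", [
        DataplaneTopology("us-east", ["mktg", "fin"]),
        DataplaneTopology("eu-central", ["fin"]),
    ]),
]

print(footprint(TOPOLOGY))
```

Kept in version control next to the deployment configuration, a structure like this can double as the single-page topology record that the environments, dataplanes, and tenants are checked against.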

Best Practices

Tip — Pick the model from the question, not the other way around. If the question is "how do I keep tenants from stepping on each other?", you want Model A or B. If the question is "how do I keep environments from contaminating each other?", you want Model D. If the question is "how do I expose this CP under a partner's brand?", you want Model C. Adopting a model that answers a different question wastes operational budget.

  • Treat the canonical Control Plane domain as forever — every dataplane installed under Model C remembers the canonical Control Plane hostname it was provisioned with. Plan DNS so that the canonical name does not need to change; alternate domains can come and go around it.
  • Register all hostnames with the identity provider up front — for Model C, every public FQDN must appear as a valid redirect URI and web origin on the OIDC client for each realm. Missing entries fail silently as a blank login screen rather than an explicit error.
  • Pin per-environment image registries — for Model D, each Control Plane install should reference a distinct registry, or a distinct path inside one shared registry. This prevents a non-Production image from being pulled into Production by accident.
  • Back up Control Plane databases per environment — under Model D, each Control Plane has its own database. Restore drills must be run per environment; a Production restore plan does not validate the QE plan.

Tip — Validate identity end-to-end after every Model C domain change. Add the new hostname to DNS, issue or extend the certificate, register the redirect URI and web origin on the OIDC client, then walk a real browser through login from the new domain. The most common Model C failure is a working API alongside a broken interactive login.
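
The certificate step of that checklist can be pre-checked before the browser walkthrough. The Python sketch below tests whether a list of Subject Alternative Names covers every public hostname, treating a wildcard as matching exactly one left-most label. The function names are hypothetical and the SAN list is supplied by hand here; extracting it from a live certificate is out of scope for the sketch.

```python
def san_covers(san: str, host: str) -> bool:
    """True if one SAN entry covers the hostname (single-label wildcard rule)."""
    san = san.lower().rstrip(".")
    host = host.lower().rstrip(".")
    if san.startswith("*."):
        # The wildcard stands in for exactly one left-most label.
        _, _, host_rest = host.partition(".")
        return bool(host_rest) and host_rest == san[2:]
    return san == host


def uncovered_hosts(hosts: list[str], sans: list[str]) -> list[str]:
    """Return the public hostnames that no SAN entry covers."""
    return [h for h in hosts if not any(san_covers(s, h) for s in sans)]


hosts = ["dp-n.acceldata.tech", "dp-n.acceldata.dev", "us.dp-n.acceldata.tech"]
sans = ["*.acceldata.tech", "dp-n.acceldata.dev"]
print(uncovered_hosts(hosts, sans))
```

In this example the region-specific name is reported as uncovered, because `*.acceldata.tech` does not extend to a second label. That is an easy gap to miss when region-specific domains are added later.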

  • Use the manual install path for sovereign Model D — air-gapped customers should download the Helm values file, vendor it into version control, and apply it through their own pipeline. The Control Plane never reaches into the dataplane to install or upgrade it automatically.
  • Document the topology in one place — a one-page diagram of which environments exist, which dataplanes live under each, and which domains front each Control Plane, kept current in version control, is worth more than any individual configuration file.