Prerequisites and Deployment Requirements for ADOC Dataplane
The ADOC data plane must be deployed in an environment that meets specific requirements. This document defines the control plane permissions, Kubernetes requirements, and network connectivity standards required to deploy the ADOC dataplane successfully. Following these instructions prepares your environment for both automatic and manual deployment and lets the dataplane integrate with your existing infrastructure.
Prerequisite
Control Plane Permission
Ensure that the user has been granted the permissions required to create a data plane in the control plane.
Kubernetes Requirement
A dedicated Kubernetes cluster is the recommended approach to deploying the Acceldata ADOC dataplane, because it gives the highest performance and stability. Acceldata also supports shared clusters, in which the dataplane installation and resources are deployed in a separate namespace within your Kubernetes cluster.
If a dedicated cluster is not practical, you can use the following configuration:
Minimum node requirements:
- Quantity: 4 nodes
- CPU: 4 cores per node
- Memory: 32 GB of RAM per node
- Disk: 80 GB of storage per node
Note: This suggestion is based on typical use scenarios and may vary depending on the enterprise's demands and data characteristics. Update the configuration as needed to match the unique needs of your deployment environment.
Permission Required in Kubernetes
There are two ways to deploy the dataplane to any Kubernetes cluster:
Automatic Deployment
- First, the user must configure the Acceldata deployer in the Kubernetes cluster.
- The Acceldata deployer then requests permissions and manages the data plane's deployment and upgrades. Follow the steps provided in the Install Data Plane guide.
Permission Required for Automatic Deployment
The user who will initiate the automatic flow must have the minimum permissions specified below:
Cluster Role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-resource-manager
rules:
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterroles"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterrolebindings"]
    verbs: ["get", "list", "watch", "update", "patch"]
Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: resource-manager
  namespace: <YOUR_NAMESPACE>
rules:
  - apiGroups: [""]
    resources: ["serviceaccounts", "configmaps", "services", "secrets"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["rolebindings", "roles"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
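One way to spot-check that a user actually holds the namespaced permissions listed above is to ask the cluster with `kubectl auth can-i`. The following is a minimal sketch, not part of the Acceldata tooling: it only generates the command lines, one per verb and resource pair, from rules copied out of the Role. The helper name `can_i_commands` and the namespace `adoc-dataplane` are hypothetical.

```python
from itertools import product

# Rules copied from the namespaced Role above: (apiGroup, resources, verbs).
VERBS = ["create", "delete", "get", "list", "patch", "update", "watch"]
ROLE_RULES = [
    ("", ["serviceaccounts", "configmaps", "services", "secrets"], VERBS),
    ("rbac.authorization.k8s.io", ["rolebindings", "roles"], VERBS),
    ("apps", ["deployments"], VERBS),
    ("batch", ["jobs", "cronjobs"], VERBS),
]


def can_i_commands(rules, namespace):
    """Build one `kubectl auth can-i` command line per verb/resource pair."""
    commands = []
    for group, resources, verbs in rules:
        for resource, verb in product(resources, verbs):
            # kubectl accepts the resource.group form for non-core API groups.
            target = f"{resource}.{group}" if group else resource
            commands.append(f"kubectl auth can-i {verb} {target} -n {namespace}")
    return commands


if __name__ == "__main__":
    # "adoc-dataplane" is a placeholder; substitute your own namespace.
    for line in can_i_commands(ROLE_RULES, "adoc-dataplane"):
        print(line)
```

Each generated command answers `yes` or `no` when run as the user in question, so any `no` points at a missing rule.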
Manual Deployment
- The Acceldata control plane portal gives the user the option of downloading the appropriate Helm chart and values file.
- You can then deploy the Helm charts using the Helm command or any Helm deployer.
Follow the steps provided in the Install Data Plane guide.
Permission Required for Manual Deployment
The user who deploys the dataplane using Helm or any Helm installer must have the following Role and Cluster Role permissions.
Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: helm-installer-role
  namespace: your-namespace
rules:
  - apiGroups: [""]
    resources: ["serviceaccounts", "secrets", "configmaps", "services"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["roles", "rolebindings"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["sparkoperator.k8s.io"]
    resources: ["scheduledsparkapplications", "sparkapplications"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
Cluster Role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: helm-installer-clusterrole
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["create", "get", "list", "watch", "delete"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterroles", "clusterrolebindings"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
Network Connectivity
Make sure the following domains are allowlisted in the egress firewall and egress HTTP proxy used by the data plane.
Domain | Port | Scheme | Description |
---|---|---|---|
accounts.acceldata.app | 443 | HTTPS | This is for OAuth, account signing, and tenant registration. |
<tenant>.acceldata.app | 443 | HTTPS | This is the endpoint that consumes analytics data (metadata and metrics) from the dataplane |
dataplane.acceldata.app | 443 | HTTPS | This endpoint handles control flow. |
public.ecr.aws | 443 | HTTPS | Needed to fetch the container for authentication with Acceldata private ECR. |
registry.acceldata.app | 443 | HTTPS | A new registry endpoint to fetch images. |
191579300362.dkr.ecr.us-east-1.amazonaws.com | 443 | HTTPS | It is the legacy registry endpoint. |
Run Preflight Checks Script
Acceldata has created the preflight check tool, which is an open-source tool for verifying a Kubernetes cluster's readiness, network connectivity, and dataplane installation capacity. This tool is intended to reduce the validation time in the dataplane environment.
The preflight check tool verifies the following:
- ADOC Control Plane Connectivity: Ensures the Kubernetes cluster can communicate with the ADOC control plane endpoints.
- Data Source Connectivity: Ensures that the cluster can reach the specified data sources and network components from the dataplane environment. This includes verifying TCP, UDP, and HTTP connections to confirm that data sources are available.
Note: This tool is not meant for functional testing, such as verifying usernames, passwords, or data source authentication procedures.
How to Run a Preflight Check?
Before running a preflight check, make sure the user has the cluster admin role in Kubernetes. The preflight check tool reads a configuration file in which users define test cases such as TCP egress checks, UDP egress checks, HTTP checks, node count, node sizing, and quota in the namespace.
./dataplane-cli preflight-launcher --config-file=./config.yaml -n <adoc_dp_namespace>
Example - YAML Config File:
egressTCPConnection:
  - host: 12.128.0.6
    port: 5120
  - host: 12.128.0.6
    port: 443
  - host: 12.128.0.6
    port: 6166
egressUDPConnection:
  - host: 12.128.0.6
    port: 5120
  - host: 12.128.0.6
    port: 443
  - host: 12.128.0.6
    port: 6166
egressHttpConnection:
  proxy:
    enabled: false
    host: egress.example.com
    port: 3148
    user: proxyuser
    password: proxypass
  urls:
    - scheme: https
      host: acceldata.acceldata.app
      port: 443
      path: /
    - scheme: https
      host: accounts.acceldata.app
      port: 443
      path: /
    - scheme: https
      host: dataplane.acceldata.app
      port: 443
      path: /
quota:
  memory: 20Gi
  cpu: 10
  ephemerial: 100Gi
nodes:
  quantity: 4
  spec:
    cpu: 4
    memory: 32Gi
    disk: 80Gi
Detailed Schema Documentation with Examples
Egress TCP Connection
This section describes the egress TCP connection tests. It lists the hosts and ports that the system should be able to connect to via TCP.
Example:
egressTCPConnection:
  - host: 12.128.0.6
    port: 5120
  - host: 12.128.0.6
    port: 443
  - host: 12.128.0.6
    port: 6166
Field | Description | Example |
---|---|---|
host | The IP address or hostname that the system needs to connect to. | 12.128.0.6 - This is the address of the server you are trying to connect to. |
port | The specific port on the host that the system needs to connect to using TCP. | 5120 - This is the port number on the host that is open for TCP connections. |
Each entry in the list represents a specific TCP connection that needs to be validated.
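To make concrete what validating one of these entries involves, the sketch below opens a TCP connection to each host and port and reports whether it succeeded. It is an illustration only and does not reproduce how the preflight check tool is implemented; the function names and the 3-second timeout are assumptions.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_entries(entries, timeout=3.0):
    """Map each (host, port) pair from the config list to its reachability."""
    return {
        (entry["host"], entry["port"]): tcp_reachable(entry["host"], entry["port"], timeout)
        for entry in entries
    }


# Usage, with entries shaped like the egressTCPConnection list:
#   check_entries([{"host": "12.128.0.6", "port": 5120}])
```

A successful connection only shows that the port is open from where the script runs. As noted earlier, it says nothing about credentials or data source authentication.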
Egress UDP Connection
This section describes the egress UDP connection tests. It lists the hosts and ports that the system should be able to connect to via UDP.
Example:
egressUDPConnection:
  - host: 12.128.0.6
    port: 5120
  - host: 12.128.0.6
    port: 443
  - host: 12.128.0.6
    port: 6166
Field | Description | Example |
---|---|---|
host | The hostname or IP address to which the system must connect. | 12.128.0.6 - This is the server address you are attempting to connect to. |
port | The specific port on the host that the system needs to connect to using UDP. | 5120 - This is the port number on the host that is open for UDP connections. |
Each entry in the list represents a specific UDP connection that needs to be validated.
Egress HTTP Connection
This section contains the configuration for HTTP connections, such as proxy settings and URLs to be probed.
Proxy Configuration
Determines whether or not a proxy is enabled for HTTP connections.
Example:
egressHttpConnection:
  proxy:
    enabled: false
    host: egress.example.com
    port: 3148
    user: proxyuser
    password: proxypass
Field | Description | Example |
---|---|---|
enabled | A boolean value indicating whether the proxy is enabled. | false : The proxy is not enabled. |
host | The hostname or IP address of the proxy server. | egress.example.com : The proxy server's address is provided here. |
port | The port number on the proxy server. | 3148 : The port number on the proxy server. |
user | The username used to authenticate with the proxy server. | proxyuser : The proxy server's username. |
password | The password used to authenticate with the proxy server. | proxypass: This is the password for the proxy server. |
URLs to Probe
Specifies the URLs that need to be checked for connectivity. Each URL includes a scheme, host, port, and path.
Example:
urls:
  - scheme: https
    host: acceldata.acceldata.app
    port: 443
    path: /
  - scheme: https
    host: accounts.acceldata.app
    port: 443
    path: /
  - scheme: https
    host: dataplane.acceldata.app
    port: 443
    path: /
Field | Description | Example |
---|---|---|
scheme | The scheme of the URL, typically http or https. | https - Indicates that the connection should be made using HTTPS. |
host | The hostname of the URL. | acceldata.acceldata.app - This is the domain of the server. |
port | The port number for the URL. | 443 - Standard port number for HTTPS connections. |
path | The path component of the URL. | / - The root path of the URL. |
Each entry in the list represents a specific URL that needs to be probed for connectivity, ensuring that the necessary endpoints are reachable from the cluster.
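The four fields of an entry combine into a single URL in the usual way. The sketch below shows that assembly with Python's standard library; `build_url` is a hypothetical helper used for illustration, not a function of the preflight check tool.

```python
from urllib.parse import urlunsplit


def build_url(entry):
    """Assemble scheme, host, port, and path into one URL string."""
    netloc = f"{entry['host']}:{entry['port']}"
    # Query and fragment are left empty; the config entries do not define them.
    return urlunsplit((entry["scheme"], netloc, entry["path"], "", ""))


# Entry shaped like one item of the urls list above.
example = {"scheme": "https", "host": "accounts.acceldata.app", "port": 443, "path": "/"}
```

Calling `build_url(example)` yields `https://accounts.acceldata.app:443/`.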
Quota
This section specifies the minimum resource requirements for the namespace where the dataplane will be installed.
quota:
  memory: 20Gi
  cpu: 10
  ephemerial: 100Gi
Resource | Description | Example |
---|---|---|
Memory | Specifies the minimum memory required. | 20Gi |
CPU | Specifies the minimum number of CPU cores required. | 10 |
Ephemeral Storage | Specifies the minimum ephemeral storage required. | 100Gi |
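Comparing these minimums against what a namespace offers comes down to converting quantities such as `20Gi` into plain numbers. The sketch below handles only whole numbers with the binary suffixes Ki, Mi, Gi, and Ti, which is a deliberate simplification of the full Kubernetes quantity format. The function names are hypothetical, and the `ephemerial` key is spelled as it appears in the example config.

```python
# Binary suffixes only; decimal suffixes and fractional values are not handled.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(value):
    """Convert a quantity such as '20Gi' or 10 into an integer."""
    text = str(value).strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


def quota_shortfalls(required, available):
    """Return the resource names whose available amount is below the minimum."""
    return [
        name
        for name, need in required.items()
        if parse_quantity(available.get(name, 0)) < parse_quantity(need)
    ]


required = {"memory": "20Gi", "cpu": 10, "ephemerial": "100Gi"}
# Illustrative values for a namespace, not measured data.
available = {"memory": "64Gi", "cpu": 8, "ephemerial": "100Gi"}
```

With these illustrative values, `quota_shortfalls(required, available)` returns `["cpu"]`, because 8 cores is below the 10-core minimum while memory and ephemeral storage meet theirs.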
Nodes
This section specifies the minimum number of nodes required in the Kubernetes cluster and the specifications for each node.
Example:
nodes:
  quantity: 4
  spec:
    cpu: 4
    memory: 32Gi
    disk: 80Gi
Field | Description | Example |
---|---|---|
Quantity | Specifies the minimum number of nodes required. | 4 |
CPU | Specifies the number of CPU cores for each node. | 4 |
Memory | Specifies the amount of memory for each node. | 32Gi |
Disk | Specifies the amount of disk space for each node. | 80Gi |
Ensuring that all prerequisites and deployment requirements are met is crucial for the successful implementation of the ADOC dataplane. By adhering to these guidelines, users can achieve a reliable and efficient deployment, enabling the robust performance and scalability of the ADOC dataplane within their infrastructure.