Global Storage

What Is Global Storage?

Acceldata Data Plane processes a huge amount of data, from profiling and quality checks to advanced analytics and monitoring. All of that data needs to go somewhere, whether it's logs, results, temporary files, or models.

That “somewhere” is what we call Global Storage, a central storage location used by Data Plane services to read and write data.

Depending on your environment, this storage could be:

  • Google Cloud Storage (GCS) – If you're running in GCP
  • Amazon S3 – If you're on AWS
  • Azure Data Lake (ADLS) – If you're on Microsoft Azure
  • HDFS or MAPRFS – For on-prem or Hadoop-based setups
  • Local disk – For quick tests or minimal deployments (not recommended for production)

Where Is This Configured?

All global storage settings live in a JSON configuration file at:

/opt/acceldata/globalstorage.json

This file tells Data Plane:

  • What type of storage you’re using (gcs, s3, adls, etc.)
  • Where to find the storage (bucket name, project ID, etc.)
  • How to securely connect to it (credentials, roles, or keys)

Sample Configuration (GCS Example)

Here’s what this JSON file might look like if you're using Google Cloud Storage:

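A minimal sketch, assuming illustrative field names for the storage type, bucket, project, and the mounted credential file; the exact keys may differ in your Data Plane version:

JSON

{
  "type": "gcs",
  "bucket": "my-dataplane-bucket",
  "projectId": "my-gcp-project",
  "credentialsFile": "/opt/acceldata/gcp_cred.json"
}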

Note If you're using GCS, you'll also need to securely provide the service account credentials (gcp_cred.json) to the system. This is explained in the Secret Management documentation.

The same file format covers AWS S3, Azure ADLS, HDFS/MAPRFS, and local storage on the pod; only the storage type and its connection settings change. Here's what this JSON file might look like if you're using AWS S3:

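As above, the field names are illustrative assumptions; substitute the keys your Data Plane version expects, along with your own bucket, region, and authentication settings (shown here assuming an IAM role rather than static keys):

JSON

{
  "type": "s3",
  "bucket": "my-dataplane-bucket",
  "region": "us-east-1",
  "authMode": "iam-role"
}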

How to Provide Credentials (If Using Cloud Storage)

For cloud-based storage (like GCS or S3), the system needs credentials to authenticate and access the storage bucket.

Rather than putting raw credentials in the file or environment, Data Plane reads them from Kubernetes Secrets for security.

Here’s how that works:

Step 1: Create or update /opt/acceldata/globalstorage.json

Define your storage type (e.g., gcs, s3) and connection settings.

Step 2: Base64-encode the file

Kubernetes stores Secret values in base64-encoded form, so encode the file before adding it to a Secret.
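
For example, on Linux (macOS's base64 uses slightly different flags):

Bash

base64 -w0 /opt/acceldata/globalstorage.json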

Step 3: Inject the config into the cluster

Run:

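For example (the Secret name and namespace are placeholders; use the ones from your Data Plane installation):

Bash

kubectl edit secret <global-storage-secret> -n <dataplane-namespace>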

In the data: section, add:

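A sketch of the entry; the key name globalstorage.json is an assumption, and the value is the base64 string produced in Step 2:

YAML

data:
  globalstorage.json: <base64-encoded contents of globalstorage.json>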

Step 4 (GCP Only): Provide GCP credentials

If using GCS, you'll also need to base64-encode your gcp_cred.json file (your service account credentials) and add it to the gcp-cred Kubernetes Secret:

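For example (gcp-cred is the Secret named above; the namespace is a placeholder):

Bash

# Base64-encode the service account key (Linux)
base64 -w0 gcp_cred.json

# Open the gcp-cred Secret for editing
kubectl edit secret gcp-cred -n <dataplane-namespace>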

Inside the data: section, add or update:

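A sketch; the key name gcp_cred.json is an assumption based on the file name used above:

YAML

data:
  gcp_cred.json: <base64-encoded contents of gcp_cred.json>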

Note For full instructions on setting up secrets securely, including how to create and inject gcp_cred.json, refer to the Secret Management documentation.

Deploying the Configuration

After updating the Secret with the encoded configuration, restart the Data Plane services by running the following command:

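The exact command depends on how your Data Plane is deployed; as a sketch, a standard Kubernetes rollout restart of the Data Plane deployments would look like this (the namespace is a placeholder):

Bash

kubectl rollout restart deployment -n <dataplane-namespace>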

Once completed, navigate to the Data Plane's Application Configuration page in the UI to verify that the Global Storage is properly set up.
