Set Up Databricks for Compute Monitoring

Acceldata Data Observability Cloud (ADOC) integrates with Databricks to provide compute observability, monitoring cluster health, job performance, and costs. This guide provides step-by-step instructions to connect your Databricks workspace (on AWS or Azure) to ADOC for Compute Observability, enabling you to optimize resources and reduce costs.

Prerequisites

Complete the following prerequisites to enable Compute Observability in ADOC.

Common Prerequisites

  • Access to ADOC Platform: Ensure you have login credentials for ADOC.
  • Databricks Workspace: A running Databricks workspace on AWS or Azure.
  • Databricks Workspace ID: Find in the workspace properties page (e.g., 1234567890).
  • Databricks Warehouse ID: Obtain from the SQL Warehouse URL (e.g., your-warehouse-id from https://<instance>/sql/1.0/warehouses/your-warehouse-id).
  • Personal Access Token: A secure key for Databricks API access.
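  • Tip: The Workspace ID is also embedded in workspace URLs; on Azure it is the number after adb- (e.g., https://adb-1234567890.12.azuredatabricks.net), and on AWS it appears as the o= query parameter in the workspace URL.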

AWS-Specific Prerequisites

  1. Create an IAM User for Cost Explorer:
    1. Log in to the AWS Console and go to IAM > Users > Add user.
    2. Enable Programmatic access.
    3. Attach this custom policy for cost retrieval:
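A minimal example policy granting read-only Cost Explorer access (the exact action list ADOC requires may vary; confirm with Acceldata support):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ce:GetCostAndUsage",
        "ce:GetCostForecast",
        "ce:GetTags"
      ],
      "Resource": "*"
    }
  ]
}
```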
    4. Save the Access Key ID and Secret Access Key securely.
  2. Create a Databricks Personal Access Token:

    1. In your Databricks workspace, go to User avatar > User Settings > Access Tokens.
    2. Click Generate New Token, assign a nickname (e.g., “ADOC-Token”), and set an expiration date (e.g., 90 days).
    3. Copy the token (displayed only once) and store it securely.
  3. Set Up Global Init Script for Automated Agent Deployment:

    • Use Acceldata’s script to deploy the Pulse agent:
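Acceldata provides the actual script with your ADOC account; purely as an illustration of the general shape of such a global init script (the download URL and the --api-key flag below are placeholders, not Acceldata endpoints):

```bash
#!/bin/bash
# Illustrative sketch only; use the script supplied by Acceldata.
set -euo pipefail

# Replace with your ADOC-provided API key (see the NOTE below).
ADOC_API_KEY="<your-adoc-api-key>"

# Placeholder download location for the Pulse agent installer.
AGENT_URL="https://example.com/pulse-agent-install.sh"

# Download the installer and run it on every cluster node at startup.
curl -fsSL "$AGENT_URL" -o /tmp/pulse-agent-install.sh
chmod +x /tmp/pulse-agent-install.sh
/tmp/pulse-agent-install.sh --api-key "$ADOC_API_KEY"  # hypothetical flag
```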
    • In Databricks, go to Admin Console > Clusters > Init Scripts, paste or upload the script, and apply it to the relevant clusters.
    • Restart the clusters to activate the agent (monitoring is automated thereafter).
    • NOTE: Replace <your-adoc-api-key> with your ADOC-provided API key (contact Acceldata support).
  4. Provide DBU Pricing:

    • Check your Databricks billing console for DBU rates (e.g., $0.20/DBU for Jobs Compute).
    • Note these for accurate cost reporting.
  5. (Optional) Use AWS Secrets Manager:

    1. Store credentials (e.g., Access Key, Token) in AWS Secrets Manager.
    2. Reference secrets by ARN in ADOC.
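
For the optional Secrets Manager step, a minimal AWS CLI sketch (the secret name adoc/databricks-token is an arbitrary example):

```bash
# Store the Databricks token; note the ARN in the output for use in ADOC.
aws secretsmanager create-secret \
  --name adoc/databricks-token \
  --secret-string '{"token":"<personal-access-token>"}'
```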

Azure-Specific Prerequisites

  1. Create an Azure Service Principal:

    1. In Azure Portal, go to Azure Active Directory > App Registrations > New Registration.
    2. Name the app (e.g., “ADOC-Databricks-SP”) and click Register.
    3. Copy the Application (Client) ID, Directory (Tenant) ID, and create a Client Secret.
    4. Save these values securely.
  2. Create and Assign a Custom Role:

    1. In Azure Portal, go to Subscription or Databricks Resource Group > Access Control (IAM).
    2. Click Add > Add custom role and define permissions:
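A minimal example role definition covering Databricks workspace reads and Cost Management queries (the exact actions ADOC requires may vary; confirm with Acceldata support):

```json
{
  "Name": "ADOC-Databricks-Observability",
  "Description": "Read Databricks workspace metadata and query cost data for ADOC.",
  "Actions": [
    "Microsoft.Databricks/workspaces/read",
    "Microsoft.CostManagement/query/action"
  ],
  "NotActions": [],
  "AssignableScopes": ["/subscriptions/<subscription-id>"]
}
```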
    3. Assign the role to the Service Principal.
  3. Add Service Principal to Databricks Workspace:

    • In Databricks, go to Admin Settings > Users or Groups, add the Service Principal using its Client ID, and assign roles (e.g., Contributor).
  4. Grant Workspace Admin Access:

    • In the Databricks user list, locate the Service Principal and enable the Workspace Admin toggle.
  5. Gather Workspace Details:

    • Collect the Databricks URL, Workspace ID, and Warehouse ID listed under Common Prerequisites.
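If you prefer the Azure CLI to the Portal for the steps above, an equivalent sketch (the app and role names match the examples above):

```bash
# Create the service principal; the output includes appId (Client ID),
# password (Client Secret), and tenant (Tenant ID).
az ad sp create-for-rbac --name "ADOC-Databricks-SP"

# Assign the custom role at subscription scope.
az role assignment create \
  --assignee "<client-id>" \
  --role "ADOC-Databricks-Observability" \
  --scope "/subscriptions/<subscription-id>"
```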

Prerequisite Comparison Table

| Requirement | AWS | Azure |
| --- | --- | --- |
| Identity/Auth | IAM User + Personal Access Token | Service Principal + Personal Access Token |
| Key Permissions | Cost Explorer, Workspace/API Read | Custom RBAC Role, Admin Workspace Access, Cost Management |
| Secret Management | AWS Secrets Manager (optional) | Azure Key Vault (optional) |
| Advanced Observability | Global Init Script, DBU Configuration | Admin Role, Custom Role |
| Cost Retrieval | API or System Table method | API or System Table method |

Add Databricks as a Data Source

Follow these steps to connect your Databricks workspace to ADOC for Compute Observability. Steps are identical for AWS and Azure unless specified.

Step 1: Start Setup

  1. In ADOC, select Register from the left menu.
  2. Click Add Data Source and choose Databricks.
  3. On the Data Source Details page:
    • Enter a name (e.g., “Prod-Databricks-Compute”).
    • (Optional) Add a description (e.g., “Compute monitoring for analytics clusters”).
    • Enable the Compute Observability toggle.
  4. Click Next.

Step 2: Add Connection Details

Provide the following details. Refer to the Databricks Documentation for help.

  • Cloud Provider: Select AWS or Azure.

  • Cloud Region: Enter your region (e.g., us-west-2 for AWS, eastus for Azure).

  • Workspace Name: Enter a descriptive name (e.g., “Analytics-Workspace”).

  • Databricks URL: Provide the full URL (e.g., https://adb-1234567890.cloud.databricks.com).

  • Warehouse ID: Find in the SQL Warehouse URL (e.g., your-warehouse-id).

  • Workspace ID: Obtain from workspace properties (e.g., 1234567890).

  • Token: Enter the Personal Access Token.

  • Auto-Renew Token: Enable to avoid manual updates (if supported).

  • Advanced Options (AWS):

    • Enable AWS Actual Cost and select API method.
    • Enter AWS Access Key ID and Secret Access Key.
  • Advanced Options (Azure):

    • (Optional) Use Service Principal with Client ID, Client Secret, and Tenant ID.
    • Enable Azure Actual Cost for cost retrieval.

Step 3: Validate and Save Connection

  1. Click Test Connection. If successful, you’ll see “Connected.” If it fails, check:

    • Invalid or expired Personal Access Token.
    • Incorrect Workspace URL or ID.
    • Insufficient permissions for token or Service Principal.
  2. Click Next.
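
If the test keeps failing, you can verify the token and URL outside ADOC by calling the Databricks REST API directly; the read-only clusters/list endpoint is a convenient probe:

```bash
# A 200 response with a JSON body confirms the URL, token, and permissions.
curl -sS -H "Authorization: Bearer <personal-access-token>" \
  "https://adb-1234567890.cloud.databricks.com/api/2.0/clusters/list"
```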

Step 4: Configure Compute Observability

Configure settings for Pulse to monitor compute resources:

  • Enable Global Init Script:

    • Toggle ON to apply the Pulse agent script to all clusters for monitoring CPU, memory, and Spark metrics.
    • Paste the script from the prerequisites (replace <your-adoc-api-key>).
  • Compute Cost Parameters:

    • Enter per-DBU costs (e.g.):

      • Jobs Compute: $0.20/DBU
      • Jobs Photon Compute: $0.25/DBU
      • Delta Live Tables: $0.30/DBU
      • All-Purpose Photon Compute: $0.22/DBU
      • All-Purpose Cluster: $0.18/DBU
    • Enter Cloud Provider Cost Discount (e.g., 10%) if applicable.

    • Enable Tag-Based Chargebacks to allocate costs by project or team.

  • Enable Private S3 Bucket (AWS only):

    • Toggle ON and configure for log/data storage.
  • Click Submit to save. A Databricks card will appear on the ADOC Data Sources page, showing connection status.
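
As a worked example of how the cost parameters combine (assuming a simple multiplicative model, which may differ from ADOC's exact formula): a job consuming 100 DBUs on Jobs Compute at $0.20/DBU with a 10% discount is reported as 100 × $0.20 × 0.90 = $18.00.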

What’s Next

With Compute Observability enabled, you can:

  • Monitor Clusters with ADOC: View cluster health, resource utilization, and job statuses (e.g., running, failed).
  • Analyze Performance: Use JVM flame graphs to identify bottlenecks.
  • Optimize Costs: Set budget alerts and use tag-based chargebacks for cost allocation.
  • Use Query Studio: View real-time/historical queries, abort long-running queries, and explore heatmaps.
  • Check Guardrails: Visualize terminated clusters and usage limits.

Known Limitations

| Limitation | Details | Recommendation |
| --- | --- | --- |
| System Time Adjustment | Cost data requires UTC system time for Azure Portal alignment. | Set system time to UTC. |
| Job Studio Page Mismatch | Filter facet counts may differ due to update frequencies. | Expect minor discrepancies in job counts. |
| Cloud Vendor Cost Delay | Azure Portal cost calculations may take 24-48 hours. | Allow up to 48 hours for accurate costs. |
| Initial API Data Retrieval | First-time API cost data takes 24 hours for 30-day history. | Plan for delayed historical data. |
| All-Purpose Cluster Cost Display | Costs shown daily; ≤24-hour ranges may not display data. | Select date ranges >24 hours. |

Troubleshooting and FAQs

Common Issues:

  • “Connection Failed”:

    • Verify token validity and permissions.
    • Check Workspace URL/ID accuracy.
    • Ensure network access to Databricks APIs.
  • Global Init Script Not Running:

    • Confirm script is applied and clusters are restarted.
  • Cost Data Missing:

    • Verify DBU pricing and Cost Explorer (AWS) or API access (Azure).

FAQs

  1. Do I need admin access?

Yes, for init script or Service Principal setup. Non-admins should coordinate with their Databricks admin.

  2. How do I find my DBU pricing?

Check your Databricks billing console or consult your account manager.

Glossary

  • DBU (Databricks Unit): A unit measuring compute usage for Databricks billing.
  • Personal Access Token: A secure key for Databricks API access.

Additional References

  1. Databricks Documentation
  2. Azure Custom Roles
  3. Create or update Azure Custom Roles using the Azure Portal
  4. AWS IAM Setup Documentation
  5. Azure Service Principal Guide