Databricks

This guide walks you through connecting a Databricks workspace to Roundtable so the AI can query your data using the query_databricks tool.

Prerequisites

Before you begin, make sure you have:

  • An active Databricks workspace (on AWS, Azure, or GCP).
  • A Databricks SQL warehouse or all-purpose cluster with SQL access.
  • Permissions to generate a personal access token in Databricks.
  • Admin access to your Roundtable organization (to create connections).

Step 1: Gather Your Databricks Credentials

You'll need four values from your Databricks workspace:

Host

Your Databricks workspace URL without the https:// prefix.

  • Find it in your browser's address bar when logged into Databricks.
  • Example: dbc-a1b2c3d4-e5f6.cloud.databricks.com
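
Because the value is usually copied from the address bar, it often arrives with the scheme, a path, or query parameters attached. A minimal sketch of cleaning it up before pasting, assuming only the Python standard library (the helper name normalize_host is hypothetical and not part of Roundtable or Databricks):

```python
from urllib.parse import urlparse


def normalize_host(raw: str) -> str:
    """Return the bare hostname from a pasted Databricks workspace URL."""
    text = raw.strip()
    # urlparse only fills in .hostname when a scheme is present, so add one if missing.
    if "://" not in text:
        text = "https://" + text
    host = urlparse(text).hostname
    if not host:
        raise ValueError(f"Could not find a hostname in {raw!r}")
    return host


print(normalize_host("https://dbc-a1b2c3d4-e5f6.cloud.databricks.com/?o=123#workspace"))
```

The same function accepts a value that already lacks the https:// prefix and returns it unchanged.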

HTTP Path

The connection path for your SQL warehouse or cluster.

  1. In Databricks, go to SQL Warehouses (or Compute for all-purpose clusters).
  2. Click your warehouse/cluster.
  3. Open the Connection Details tab.
  4. Copy the HTTP Path value.
    • Example: /sql/1.0/warehouses/abcdef1234567890
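
A quick sanity check on the copied value can catch paste mistakes before you save the connection. The sketch below assumes two path shapes: the warehouse form shown in the example above, and a cluster form of sql/protocolv1/o/<workspace-id>/<cluster-id>, which is how all-purpose cluster paths commonly look. Both patterns are assumptions for illustration; Roundtable's own validation may differ.

```python
import re

# Assumed shapes: the warehouse example from this guide, and the usual cluster layout.
WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/warehouses/[0-9a-f]+$")
CLUSTER_PATH = re.compile(r"^/?sql/protocolv1/o/\d+/[\w-]+$")


def classify_http_path(path: str) -> str:
    """Return 'warehouse', 'cluster', or 'unknown' for a pasted HTTP Path."""
    path = path.strip()
    if WAREHOUSE_PATH.match(path):
        return "warehouse"
    if CLUSTER_PATH.match(path):
        return "cluster"
    return "unknown"


print(classify_http_path("/sql/1.0/warehouses/abcdef1234567890"))
```

An "unknown" result does not prove the path is wrong; it only means the value does not match either assumed pattern and is worth a second look.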

Catalog

The name of the Unity Catalog catalog that contains your data.

  • Example: main or analytics

Note: If your Databricks workspace doesn't use Unity Catalog, you can leave this field as hive_metastore (the legacy default) or set it to match your metastore configuration.

Personal Access Token

  1. In Databricks, click your user avatar in the top-right corner.
  2. Select Settings.
  3. Go to Developer → Access Tokens.
  4. Click Generate New Token.
  5. Enter a comment (e.g., Roundtable read access) and set an expiration.
  6. Click Generate and copy the token.

Warning: The token is only shown once. Copy it immediately and store it securely. If you lose it, you'll need to generate a new one.
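
If you also use the token in your own scripts, one way to store it securely is to keep it out of source files and read it from the environment. A minimal sketch, assuming the token has been exported as DATABRICKS_TOKEN (a common convention; any variable name works) and using hypothetical helper names:

```python
import os


def load_token(env_var: str = "DATABRICKS_TOKEN") -> str:
    """Read the personal access token from an environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before running this script")
    return token


def mask_token(token: str, visible: int = 4) -> str:
    """Keep the first few characters and hide the rest, for logs and screenshots."""
    return token[:visible] + "•" * max(len(token) - visible, 0)


print(mask_token("dapi0123456789abcdef"))
```

Masking before logging means the full token value never ends up in terminal history or shared screenshots.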


Step 2: Add the Connection in Roundtable

  1. Go to your Roundtable organization's Settings → Connections.
  2. Click Add Connection.
  3. Select Databricks as the connection type.
  4. Fill in the required fields:

| Field | Description | Example |
| --- | --- | --- |
| Connection Name | A friendly label for this connection | Production Databricks |
| Host | Databricks workspace URL (without https://) | dbc-a1b2c3d4-e5f6.cloud.databricks.com |
| HTTP Path | SQL warehouse or cluster connection path | /sql/1.0/warehouses/abcdef1234567890 |
| Catalog | Unity Catalog name | analytics |
| Token | Personal access token | dapi•••••••••••••••• |

  5. Click Save.

Step 3: Test the Connection

After saving, click the Test Connection button. Roundtable will:

  1. Connect to your Databricks workspace using the provided host and token.
  2. Verify that the SQL warehouse is reachable via the HTTP path.
  3. Confirm that the catalog exists and the token has query permissions.
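
To reproduce these checks outside Roundtable, the three steps can be sketched as below. This is an illustration under stated assumptions, not Roundtable's actual implementation: check_connection is a hypothetical helper, and the connect argument is expected to behave like the connect function of the databricks-sql-connector package (keyword arguments server_hostname, http_path, access_token, returning a connection with cursor() and close()). With the real connector, opening the connection may already exercise both the token and the HTTP path, so the first two steps are not cleanly separable.

```python
from typing import Callable, List, Tuple


def check_connection(connect: Callable, host: str, http_path: str,
                     token: str, catalog: str) -> List[Tuple[str, bool, str]]:
    """Run the three checks in order and stop at the first failure."""
    try:
        conn = connect(server_hostname=host, http_path=http_path, access_token=token)
    except Exception as exc:
        return [("connect", False, str(exc))]
    results = [("connect", True, "")]
    # Backticks quote the catalog name; a literal backtick is escaped by doubling it.
    quoted = catalog.replace("`", "``")
    checks = [
        ("warehouse", "SELECT 1"),
        ("catalog", f"SHOW SCHEMAS IN `{quoted}`"),
    ]
    try:
        cursor = conn.cursor()
        for name, statement in checks:
            try:
                cursor.execute(statement)
                cursor.fetchall()
                results.append((name, True, ""))
            except Exception as exc:
                results.append((name, False, str(exc)))
                break
    finally:
        conn.close()
    return results
```

With databricks-sql-connector installed, a call might look like check_connection(sql.connect, host, http_path, token, "analytics") after from databricks import sql. Passing the connect function in as an argument also makes it possible to exercise the logic with a stand-in object when no warehouse is available.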

Common test failures and their fixes:

| Error | Cause | Fix |
| --- | --- | --- |
| Authentication failed | Invalid or expired token | Generate a new personal access token |
| Could not connect to host | Incorrect host URL | Verify the workspace URL in your browser |
| HTTP Path not found | Wrong path or warehouse is stopped | Check the path in Connection Details; start the warehouse |
| Catalog not found | Incorrect catalog name | Verify the catalog name in Databricks Catalog Explorer (formerly Data Explorer) |

Tip: Make sure your Databricks SQL warehouse is running before testing the connection. Serverless warehouses may auto-start, but classic warehouses need to be started manually.


Step 4: Attach to a Workspace

  1. Open the workspace where you want Databricks access.
  2. Go to Settings → Connections.
  3. Select your Databricks connection from the dropdown.
  4. Click Save.

The query_databricks tool is now active in this workspace.


Example: Querying Your Data

Once the connection is live, workspace members can ask the AI to query Databricks in natural language:

User:

List all tables in the analytics catalog and show me the top 10 rows from the events table.

AI response (using query_databricks):

The AI will:

  1. Run a discovery query, here scoped to the catalog's default schema:
    SHOW TABLES IN analytics.default;
  2. Then fetch sample data:
    SELECT *
    FROM analytics.default.events
    LIMIT 10;
  3. Return the table list and sample rows in a formatted summary.

Another example:

User:

What's the conversion rate from signup to first purchase by channel this quarter?

AI response:

    SELECT
        signup_channel,
        COUNT(DISTINCT s.user_id) AS signups,
        COUNT(DISTINCT p.user_id) AS purchasers,
        ROUND(COUNT(DISTINCT p.user_id) * 100.0 / COUNT(DISTINCT s.user_id), 2) AS conversion_rate
    FROM analytics.default.signups s
    LEFT JOIN analytics.default.purchases p
        ON s.user_id = p.user_id
        AND p.purchase_date >= s.signup_date
        AND p.purchase_date <= DATE_ADD(s.signup_date, 90)
    WHERE s.signup_date >= DATE_TRUNC('QUARTER', CURRENT_DATE())
    GROUP BY signup_channel
    ORDER BY conversion_rate DESC;
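
To make the join conditions concrete, here is the same calculation on a few in-memory rows. This is an illustrative sketch only: the sample data is invented, the function name is hypothetical, and the "this quarter" filter is left out to keep the example short. It shows that a purchase counts only if it falls on or after the signup date and within 90 days of it.

```python
from datetime import date, timedelta


def conversion_by_channel(signups, purchases, window_days=90):
    """signups: (user_id, channel, signup_date); purchases: (user_id, purchase_date)."""
    by_user = {}
    for user_id, purchase_date in purchases:
        by_user.setdefault(user_id, []).append(purchase_date)

    totals = {}      # channel -> set of user_ids who signed up
    converted = {}   # channel -> set of user_ids with a purchase inside the window
    for user_id, channel, signup_date in signups:
        totals.setdefault(channel, set()).add(user_id)
        converted.setdefault(channel, set())
        deadline = signup_date + timedelta(days=window_days)
        if any(signup_date <= d <= deadline for d in by_user.get(user_id, [])):
            converted[channel].add(user_id)

    return {
        channel: round(len(converted[channel]) * 100.0 / len(users), 2)
        for channel, users in totals.items()
    }


signups = [
    ("u1", "ads", date(2024, 4, 2)),
    ("u2", "ads", date(2024, 4, 10)),
    ("u3", "email", date(2024, 5, 1)),
]
purchases = [
    ("u1", date(2024, 4, 20)),  # 18 days after signup: counts
    ("u2", date(2024, 8, 1)),   # 113 days after signup: outside the window
    ("u3", date(2024, 4, 1)),   # before signup: ignored
]
print(conversion_by_channel(signups, purchases))  # {'ads': 50.0, 'email': 0.0}
```

Users without a qualifying purchase still appear in the denominator, which mirrors the LEFT JOIN in the query.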

Tip: Include your catalog structure and key table descriptions in the workspace's system prompt so the AI can write accurate queries without needing to explore the schema first.
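
One way to prepare that text is to keep the schema notes as structured data and render them into a short block you can paste into the system prompt. The sketch below is a suggestion rather than a required format: the function name is hypothetical, the table and column names are taken from the example query above, and the column descriptions are invented placeholders to replace with your own.

```python
from typing import Dict


def render_schema_notes(catalog: str, tables: Dict[str, Dict[str, str]]) -> str:
    """Render table and column descriptions as plain text for a system prompt."""
    lines = [f"Databricks catalog: {catalog}"]
    for table, columns in tables.items():
        lines.append(f"Table {catalog}.{table}")
        for column, description in columns.items():
            lines.append(f"  - {column}: {description}")
    return "\n".join(lines)


notes = render_schema_notes("analytics", {
    "default.signups": {
        "user_id": "unique user identifier",
        "signup_date": "date the account was created",
        "signup_channel": "acquisition channel",
    },
    "default.purchases": {
        "user_id": "unique user identifier",
        "purchase_date": "date of the purchase",
    },
})
print(notes)
```

Keeping the notes in one place makes it easier to update the prompt when tables or columns change.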