Databricks

Overview

Birdie connects to your Databricks workspace and runs SQL queries through a Databricks SQL Warehouse.

Typical queries executed by Birdie look like:

SELECT *
FROM <catalog>.<schema>.<table>
WHERE <partition_column> BETWEEN :start AND :end;

Your team controls:

  • Which tables or views Birdie can read

  • Which datasets are exposed

  • How data is partitioned for incremental ingestion


Integration models

Birdie supports two authentication mechanisms provided by Databricks.

OAuth with a Service Principal (recommended)

Birdie authenticates using a Databricks service principal and short-lived OAuth tokens issued by the workspace OIDC endpoint.

This model is recommended when:

  • You run production or automated ingestion pipelines

  • You want OAuth-based, non-interactive access

  • You want to avoid long-lived credentials

Personal Access Token (PAT)

Birdie authenticates using a technical user and a Personal Access Token.

This model is typically used when:

  • OAuth is not enabled in the workspace

  • You are running a proof-of-concept or non-production setup


Schema requirements

All Birdie database connectors follow the same schema model. Each dataset type must be exposed as one table or one view.

Examples of dataset types:

  • nps

  • csat

  • review

  • survey

  • support_ticket

  • conversation_message

  • Operational or reference tables (accounts, users, metadata)
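
As an illustration, a dataset type can be exposed as a dedicated view over internal tables. The catalog, schema, and column names below are hypothetical; use your own:

```sql
-- Hypothetical view exposing the `nps` dataset type to Birdie.
CREATE OR REPLACE VIEW analytics.birdie.nps AS
SELECT
  response_id,
  score,
  comment,
  responded_at   -- partition column used for incremental ingestion
FROM internal.surveys.nps_responses;
```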

Birdie uses a separate detailed schema definition, similar to the S3-based ingestion model.


Requirements

Before starting, make sure you have:

  • A Databricks workspace with Databricks SQL enabled

  • A SQL Warehouse available for querying

  • Admin or equivalent privileges to:

    • Create users or service principals

    • Grant SQL permissions

    • Grant Warehouse access

  • The tables or views Birdie will ingest already created


Setup

1. Create the identity used by Birdie

Birdie can authenticate as either a service principal (recommended) or a technical user.

Option A — Service Principal (recommended)

  1. Go to Settings > Workspace admin > Identity and access > Service principals

  2. Click Add service principal

  3. Name it: birdie

  4. Enable:

    • Workspace Access

    • Databricks SQL Access

  5. Generate a client secret

  6. Save:

    • Client ID

    • Client Secret

Birdie will use the workspace OIDC token endpoint, which follows the standard Databricks pattern:

https://<workspace-host>/oidc/v1/token

Option B — Technical User + PAT

  1. Create a user named: birdie

  2. Enable:

    • Workspace Access

    • Databricks SQL Access

  3. Generate a Personal Access Token (PAT)

  4. Store the PAT securely

2. Grant read-only SQL permissions

Birdie requires SELECT-only access to the tables or views it will ingest.

Unity Catalog environments
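
The statements below are a minimal sketch, assuming a catalog named analytics, a schema named feedback, and a principal named birdie; substitute your own names (for a service principal, grant to its application ID):

```sql
-- Allow the principal to see the catalog and schema, then read each object.
GRANT USE CATALOG ON CATALOG analytics TO `birdie`;
GRANT USE SCHEMA  ON SCHEMA  analytics.feedback TO `birdie`;
GRANT SELECT      ON TABLE   analytics.feedback.nps TO `birdie`;
```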

Repeat the SELECT grant for each table or view Birdie should ingest.


Hive Metastore (legacy workspaces)
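
A minimal sketch using legacy table access control, assuming a database named feedback and a principal named birdie:

```sql
-- Legacy table ACLs: USAGE on the database plus SELECT per object.
GRANT USAGE ON DATABASE feedback TO `birdie`;
GRANT SELECT ON TABLE feedback.nps TO `birdie`;
```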

Repeat for each required table or view.

3. Grant SQL Warehouse access

Birdie must be able to execute queries in a Databricks SQL Warehouse.

Grant CAN USE permission on the warehouse to:

  • The service principal, or

  • The technical user

This step is mandatory for SQL execution.


Connection details to share with Birdie

Provide the following information securely to the Birdie team:

  • Databricks workspace URL / host

  • Authentication method (OAuth or PAT)

  • Identity used (service principal or user)

  • OAuth Client ID and Client Secret or PAT

  • SQL Warehouse ID

  • Catalog, schema, and table or view names

  • Partition column used for incremental ingestion

Birdie validates connectivity using queries such as:
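For example (a sketch reusing the placeholder names from the overview):

```sql
-- Connectivity check: confirms SQL execution works at all.
SELECT 1;

-- Data check: confirms the configured table and partition column are readable.
SELECT count(*)
FROM <catalog>.<schema>.<table>
WHERE <partition_column> BETWEEN :start AND :end;
```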


Validating the integration

1. Generate an OAuth token (OAuth example)
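
A sketch of the token request, assuming the Client ID and Client Secret from step 1 are exported as environment variables:

```shell
# Exchange the service principal credentials for a short-lived access token.
curl --request POST \
  --url "https://<workspace-host>/oidc/v1/token" \
  --user "$CLIENT_ID:$CLIENT_SECRET" \
  --data "grant_type=client_credentials&scope=all-apis"
```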

You should receive an access_token.

2. Validate Databricks REST API access
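
One way to check, assuming the token from the previous step is in $ACCESS_TOKEN:

```shell
# Fetch the target SQL Warehouse; an HTTP 200 confirms API access.
curl --request GET \
  --url "https://<workspace-host>/api/2.0/sql/warehouses/<warehouse-id>" \
  --header "Authorization: Bearer $ACCESS_TOKEN"
```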

Expected response: HTTP 200 with the warehouse details (including its id, name, and state) as JSON.

3. Validate SQL execution
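
A sketch using the SQL Statement Execution API, with the same placeholders as above:

```shell
# Run SELECT 1 on the warehouse; the result appears in the JSON response.
curl --request POST \
  --url "https://<workspace-host>/api/2.0/sql/statements" \
  --header "Authorization: Bearer $ACCESS_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{"statement": "SELECT 1", "warehouse_id": "<warehouse-id>"}'
```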

If this returns 1, Birdie can successfully execute SQL queries.

