# Databricks

This article describes how to configure your Databricks workspace to allow Birdie to export data to a dedicated catalog and schema.

> **Prerequisite:** Before configuring the export integration, you should already have completed the setup described in Data Ingestion with Databricks. The steps below assume a Birdie service principal (or user) already exists and has access to a SQL Warehouse.

### Overview

Birdie can load processed data (e.g., enriched feedback, areas, opportunities) into your Databricks workspace. We recommend creating:

* A dedicated write role or grant set for Birdie
* A separate catalog and schema for Birdie exports

This isolates write access from your existing data.

### Step 1 – Create a dedicated catalog and schema for exports

Create a catalog and schema for Birdie to write to:

```sql
CREATE CATALOG IF NOT EXISTS birdie_export;
CREATE SCHEMA IF NOT EXISTS birdie_export.birdie_export_schema;
```

### Step 2 – Grant required privileges to the Birdie principal

Grant the necessary privileges for Birdie to create and manage tables within the export schema.

#### For OAuth (Service Principal)

```sql
-- Catalog and schema usage
GRANT USE CATALOG ON CATALOG birdie_export TO `birdie`;
GRANT USE SCHEMA ON SCHEMA birdie_export.birdie_export_schema TO `birdie`;

-- Table privileges (create, read, write)
GRANT CREATE TABLE ON SCHEMA birdie_export.birdie_export_schema TO `birdie`;
GRANT MODIFY ON SCHEMA birdie_export.birdie_export_schema TO `birdie`;
GRANT SELECT ON SCHEMA birdie_export.birdie_export_schema TO `birdie`;
```

#### For Personal Access Token (PAT)

Apply the same `GRANT` statements above, replacing `` `birdie` `` with the user account that owns the PAT.

### Step 3 – Ensure SQL Warehouse access

The Birdie principal must have **CAN USE** permission on the SQL Warehouse that will execute the export queries. If this was already granted during the data ingestion setup, no additional action is needed.

### Step 4 – Share export configuration with Birdie

Share the following details securely with Birdie:

**For OAuth (Service Principal):**

```json
{
  "connection_details": {
    "host": "https://dbc-xxxx.cloud.databricks.com",
    "warehouse_id": "a1b2c3d4e5f6g7h8",
    "auth": {
      "method": "oauth",
      "client_id": "your-service-principal-client-id",
      "client_secret": "your-service-principal-client-secret"
    }
  },
  "export_destination": {
    "catalog": "birdie_export",
    "schema": "birdie_export_schema"
  }
}
```

**For PAT:**

```json
{
  "connection_details": {
    "host": "https://dbc-xxxx.cloud.databricks.com",
    "warehouse_id": "a1b2c3d4e5f6g7h8",
    "auth": {
      "method": "pat",
      "personal_access_token": "your-personal-access-token"
    }
  },
  "export_destination": {
    "catalog": "birdie_export",
    "schema": "birdie_export_schema"
  }
}
```

| Section                  | Field                   | Required | Description                                                 |
| ------------------------ | ----------------------- | -------- | ----------------------------------------------------------- |
| connection\_details      | host                    | Yes      | Workspace URL, e.g. `https://dbc-xxxx.cloud.databricks.com` |
| connection\_details      | warehouse\_id           | Yes      | Databricks SQL Warehouse ID.                                |
| connection\_details.auth | method                  | Yes      | Authentication mode: `oauth` (recommended) or `pat`.        |
| connection\_details.auth | client\_id              | OAuth    | Service principal client ID.                                |
| connection\_details.auth | client\_secret          | OAuth    | Service principal client secret.                            |
| connection\_details.auth | personal\_access\_token | PAT      | Personal access token.                                      |
| export\_destination      | catalog                 | Yes      | Unity Catalog name for the export target.                   |
| export\_destination      | schema                  | Yes      | Schema name within the catalog.                             |

If not already shared during the data ingestion setup, also include the warehouse ID and account host.

Share credentials securely with the Birdie team. Never send credentials via email or unencrypted channels.

### How the Export Works

Birdie uses Databricks SQL Warehouse to load data through the following pipeline:

1. **Staging:** A temporary table or volume is created to hold the incoming data.
2. **Schema alignment:** Table schema is derived from the CSV export headers and matched to the target table. If the table does not yet exist, it is created automatically.
3. **Data loading:** Data is loaded into the target table using `COPY INTO` or direct `INSERT` operations.
4. **Merge:** For incremental loads, rows with matching primary keys are deleted and re-inserted within a single transaction.
5. **Commit:** Birdie uses MERGE INTO so the upsert is atomic on the target Delta table.

### Exported Data Format

Birdie creates one table per CSV file. Table and column naming follows our standard CSV export conventions.

Data is de-duplicated on each load, so it will be ready to use. For details on joining tables, refer to our data exports documentation.

Some data (e.g., custom fields) may require additional handling due to their object-like nature.

### References

* [Databricks Unity Catalog Privileges](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/index.html)
* [Databricks SQL Warehouses](https://docs.databricks.com/en/compute/sql-warehouse/index.html)
* [Databricks Service Principals](https://docs.databricks.com/en/admin/users-groups/service-principals.html)
* [Databricks Personal Access Tokens](https://docs.databricks.com/en/dev-tools/auth/pat.html)
* [Data exports](/admin-and-settings/export-import-csv.md#export)
* [Data Ingestion with Databricks](https://ask.birdie.ai/integrations-and-data-ingestion/how-to-integrate-with/databricks)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ask.birdie.ai/integrations-and-data-ingestion/how-to-export-with.../databricks.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
