> For the complete documentation index, see [llms.txt](https://ask.birdie.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://ask.birdie.ai/integrations-and-data-ingestion/how-to-integrate-with/bigquery.md).

# BigQuery

### Overview

Birdie can ingest structured datasets directly from [BigQuery](https://cloud.google.com/bigquery/docs/introduction) using read-only SQL queries. Most customers run Birdie ingestion daily, using a partition column (DATE / TIMESTAMP / DATETIME) to import only new rows and avoid full table scans.

#### Create the Service Account used by Birdie

Birdie connects to BigQuery using a [Google Cloud Service Account](https://docs.cloud.google.com/iam/docs/service-account-overview). This service account is a technical identity used only for read-only access to your data.\
Steps in Google Cloud Console:

1. [Open Google Cloud Console](https://console.cloud.google.com/)
2. Go to IAM & Admin - [Service Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
3. Click “Create service account”
4. Service account name (suggested): birdie
5. Description (optional)
6. Click “Create and continue” (roles will be added in the next section)
7. Finish creating the service account

Note: After creation, wait up to 1 minute before using the service account. IAM propagation can be slightly delayed.

***

#### Grant Read-Only BigQuery Permissions

**Permission to run query jobs (Project level)**\
Role: BigQuery Job User (roles/bigquery.jobUser)

This permission allows Birdie to execute SQL queries. It does NOT grant data access by itself.

How to grant:

* IAM & Admin - IAM
* Select your Project
* Add the service account
* Assign role: [BigQuery Job User](https://docs.cloud.google.com/bigquery/docs/access-control#bigquery.jobUser)

***

**Permission to read data (Dataset level – recommended)**\
Role: BigQuery Data Viewer (roles/bigquery.dataViewer)

This permission allows Birdie to read tables or views.

How to grant:

* Go to BigQuery
* Select the Dataset that Birdie should ingest
* Click “Share dataset”
* Add the service account
* Assign role: [BigQuery Data Viewer](https://cloud.google.com/bigquery/docs/access-control#bigquery.dataViewer)

Note: Grant Data Viewer at the dataset (or table) level instead of the entire project.

***

#### Incremental Partition Column

Birdie uses an incremental partition column to determine which rows to ingest on each run. Choosing the right column is critical for complete data ingestion.

#### Requirements

The partition column **must reflect when a row was added or last modified in the table**, not when the original event occurred. For example:

* `loaded_at` or `updated_at` (ETL/pipeline timestamp) - **recommended**
* `inserted_at` (row creation timestamp) - acceptable if the table is append-only

#### Why `posted_at` is not sufficient

Using a business event timestamp like `posted_at` can cause **data gaps**:

1. **Historical updates missed**: If a past row is updated (e.g., a ticket status changes), its `posted_at` remains in the past and Birdie will not re-ingest it.
2. **Pipeline timing gaps**: If your ETL runs at specific times (e.g., every 6 hours), rows loaded between Birdie's query window and the ETL run may not appear until the next cycle. An `updated_at` or `loaded_at` column tied to the ETL timestamp ensures these rows are captured.

#### Recommendation

Add a `loaded_at` or `updated_at` column to your table that is set to `CURRENT_TIMESTAMP()` whenever a row is inserted or modified. Use this column as the partition column when configuring the Birdie integration.

If your table is append-only and never updated, `posted_at` is acceptable as the partition column.

***

#### Prepare Your BigQuery Table

Birdie works best when the source table is [partitioned](https://docs.cloud.google.com/bigquery/docs/creating-partitioned-tables) by a column used for incremental ingestion (for example: posted\_at).

Supported partition column types:

* DATE
* TIMESTAMP
* DATETIME

Steps in BigQuery Console:

1. Open [BigQuery](https://console.cloud.google.com/bigquery)
2. Select your Dataset
3. Click “Create table”
4. Source: Empty table
5. Define the schema:
   1. Make sure the schema includes your partition column as DATE, TIMESTAMP, or DATETIME.
6. Partition and cluster settings:
   1. Partition by field
   2. Select the partition column
   3. Recommended: enable “Require partition filter". This reduces cost and prevents accidental full table scans.
7. Create the table

***

#### Create a Service Account Key (JSON)

Birdie authenticates using a JSON [service account key](https://cloud.google.com/iam/docs/keys-create-delete).\
Steps:

1. Go to IAM & Admin - [Service Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
2. Click the service account (birdie)
3. Open the “Keys” tab
4. Click “Add key” - “Create new key”
5. Select JSON
6. Click “Create”\
   The JSON file will be downloaded automatically.

Important:

* This JSON file is what Birdie uses to authenticate

***

Validate Access

You can validate access using Cloud Shell or your local machine.

* Activate the service account using the JSON key

```
gcloud auth activate-service-account --key-file /path/to/birdie.json
```

* Run a simple query

```
bq query --use_legacy_sql=false "SELECT 1 AS ok"
```

* Validate reading your table using a partition filter

```
bq query --use_legacy_sql=false
```

```
"SELECT * FROM <PROJECT_ID>.<DATASET>.<TABLE>
WHERE <PARTITION_COLUMN> >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) LIMIT 10"
```

If this works, Birdie can query your data successfully.

***

#### Share Connection Details with Birdie

To configure the integration, securely provide Birdie with:

```
{
  "connection_details": {
    "project_id": "your-project-id-123",
    "service_account_key": {
      "type": "service_account",
      "project_id": "your-project-id-123",
      "private_key_id": "...",
      "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
      "client_email": "birdie-integration@project.iam.gserviceaccount.com",
      "client_id": "...",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "..."
    }
  },
  "data_source": {
    "dataset": "marketing_data",
    "table": "customer_feedback_v1",
    "partition_column": "created_at",
    "partition_type": "TIMESTAMP"
  },
  "metadata": {
    "data_kind": "Support Ticket, NPS, Account"
  }
}
```

| Section             | Field                 | Required | Description                                                                                                       |
| ------------------- | --------------------- | -------- | ----------------------------------------------------------------------------------------------------------------- |
| connection\_details | project\_id           | No       | GCP project ID. Defaults to the project the service account belongs to.                                           |
| connection\_details | service\_account\_key | Yes      | The full JSON service account key downloaded in the previous step.                                                |
| data\_source        | dataset               | Yes      | The BigQuery dataset name (without the project ID), e.g. `gold_layer`.                                            |
| data\_source        | table                 | Yes      | The table or view to query, e.g. `customer_feedback_v1`.                                                          |
| data\_source        | partition\_column     | No       | The date/datetime column used for incremental ingestion. Defaults to `posted_at`.                                 |
| data\_source        | partition\_type       | No       | Type of the partition column: `DATE`, `TIMESTAMP`, or `DATETIME`.                                                 |
| metadata            | data\_kind            | Yes      | The type(s) of data in the table, e.g. `Support Ticket`, `NPS`, `CSAT`, `Review`, `Social Media Post`, `Account`. |

[Share it securely](/integrations-and-data-ingestion/securely-sharing-credentials-for-data-integration.md) with the Birdie team

***

Data Types Supported

Birdie can ingest [structured datasets](https://ask.birdie.ai/integrations-and-data-ingestion/how-to-integrate-with/s3-azure-gcs#data-in-scope) exposed as BigQuery tables or views, including:

* Conversations and messages (Support Tickets, Issues, Social Media Posts)
* Feedback datasets (review, nps, csat)
* Operational or reference tables (accounts, users, metadata)

Expose one table or view per dataset type.

***

### References

* [BigQuery Introduction](https://cloud.google.com/bigquery/docs/introduction)
* [Creating Partitioned Tables](https://cloud.google.com/bigquery/docs/creating-partitioned-tables)
* [BigQuery IAM Roles](https://cloud.google.com/bigquery/docs/access-control#bigquery)
* [Service Account Keys](https://cloud.google.com/iam/docs/keys-create-delete)
* [Securely Sharing Credentials](https://ask.birdie.ai/integrations-and-data-ingestion/securely-sharing-credentials-for-data-integration)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ask.birdie.ai/integrations-and-data-ingestion/how-to-integrate-with/bigquery.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
