BigQuery
Overview
Birdie can ingest structured datasets directly from BigQuery using read-only SQL queries. Most customers run Birdie ingestion daily, using a partition column (DATE / TIMESTAMP / DATETIME) to import only new rows and avoid full table scans.
Create the Service Account used by Birdie
Birdie connects to BigQuery using a Google Cloud Service Account. This service account is a technical identity used only for read-only access to your data. Steps in Google Cloud Console:
Go to IAM & Admin - Service Accounts
Click “Create service account”
Service account name (suggested): birdie
Description (optional)
Click “Create and continue” (roles will be added in the next section)
Finish creating the service account
Note: After creation, wait up to 1 minute before using the service account. IAM propagation can be slightly delayed.
Grant Read-Only BigQuery Permissions
Permission to run query jobs (Project level) Role: BigQuery Job User (roles/bigquery.jobUser)
This permission allows Birdie to execute SQL queries. It does NOT grant data access by itself.
How to grant:
IAM & Admin - IAM
Select your Project
Add the service account
Assign role: BigQuery Job User
Permission to read data (Dataset level – recommended) Role: BigQuery Data Viewer (roles/bigquery.dataViewer)
This permission allows Birdie to read tables or views.
How to grant:
Go to BigQuery
Select the Dataset that Birdie should ingest
Click “Share dataset”
Add the service account
Assign role: BigQuery Data Viewer
Note: Grant Data Viewer at the dataset (or table) level instead of the entire project.
Incremental Partition Column
Birdie uses an incremental partition column to determine which rows to ingest on each run. Choosing the right column is critical for complete data ingestion.
Requirements
The partition column must reflect when a row was added or last modified in the table, not when the original event occurred. For example:
loaded_atorupdated_at(ETL/pipeline timestamp) - recommendedinserted_at(row creation timestamp) - acceptable if the table is append-only
Why posted_at is not sufficient
posted_at is not sufficientUsing a business event timestamp like posted_at can cause data gaps:
Historical updates missed: If a past row is updated (e.g., a ticket status changes), its
posted_atremains in the past and Birdie will not re-ingest it.Pipeline timing gaps: If your ETL runs at specific times (e.g., every 6 hours), rows loaded between Birdie's query window and the ETL run may not appear until the next cycle. An
updated_atorloaded_atcolumn tied to the ETL timestamp ensures these rows are captured.
Recommendation
Add a loaded_at or updated_at column to your table that is set to CURRENT_TIMESTAMP() whenever a row is inserted or modified. Use this column as the partition column when configuring the Birdie integration.
If your table is append-only and never updated, posted_at is acceptable as the partition column.
Prepare Your BigQuery Table
Birdie works best when the source table is partitioned by a column used for incremental ingestion (for example: posted_at).
Supported partition column types:
DATE
TIMESTAMP
DATETIME
Steps in BigQuery Console:
Open BigQuery
Select your Dataset
Click “Create table”
Source: Empty table
Define the schema:
Make sure the schema includes your partition column as DATE, TIMESTAMP, or DATETIME.
Partition and cluster settings:
Partition by field
Select the partition column
Recommended: enable “Require partition filter". This reduces cost and prevents accidental full table scans.
Create the table
Create a Service Account Key (JSON)
Birdie authenticates using a JSON service account key. Steps:
Go to IAM & Admin - Service Accounts
Click the service account (birdie)
Open the “Keys” tab
Click “Add key” - “Create new key”
Select JSON
Click “Create” The JSON file will be downloaded automatically.
Important:
This JSON file is what Birdie uses to authenticate
Validate Access
You can validate access using Cloud Shell or your local machine.
Activate the service account using the JSON key
Run a simple query
Validate reading your table using a partition filter
If this works, Birdie can query your data successfully.
Share Connection Details with Birdie
To configure the integration, securely provide Birdie with:
connection_details
project_id
No
GCP project ID. Defaults to the project the service account belongs to.
connection_details
service_account_key
Yes
The full JSON service account key downloaded in the previous step.
data_source
dataset
Yes
The BigQuery dataset name (without the project ID), e.g. gold_layer.
data_source
table
Yes
The table or view to query, e.g. customer_feedback_v1.
data_source
partition_column
No
The date/datetime column used for incremental ingestion. Defaults to posted_at.
data_source
partition_type
No
Type of the partition column: DATE, TIMESTAMP, or DATETIME.
metadata
data_kind
Yes
The type(s) of data in the table, e.g. Support Ticket, NPS, CSAT, Review, Social Media Post, Account.
Share it securely with the Birdie team
Data Types Supported
Birdie can ingest structured datasets exposed as BigQuery tables or views, including:
Conversations and messages (Support Tickets, Issues, Social Media Posts)
Feedback datasets (review, nps, csat)
Operational or reference tables (accounts, users, metadata)
Expose one table or view per dataset type.
References
Last updated