S3/Azure/GCS

Overview

With the S3 connector, Birdie can import data in multiple file formats from AWS S3 or from any storage service that implements the S3 API, such as Google Cloud Storage. Once a day, the connector checks for new objects (files) and, if it finds any, imports the records they contain.
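The daily check described above can be pictured as a filter over a bucket listing. The following is only a sketch under stated assumptions, not Birdie's implementation: `StoredObject` and `select_new_objects` are hypothetical names, and the listing is an in-memory list rather than a real S3 call.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical stand-in for one entry of a bucket listing."""
    key: str
    last_modified: datetime


def select_new_objects(listing, prefix, since):
    """Return objects under `prefix` modified strictly after `since`, oldest first."""
    fresh = [
        obj for obj in listing
        if obj.key.startswith(prefix) and obj.last_modified > since
    ]
    return sorted(fresh, key=lambda obj: obj.last_modified)


# Illustrative listing; keys and timestamps are made up.
listing = [
    StoredObject("birdie/nps/2024-01-01.csv", datetime(2024, 1, 1, 6, tzinfo=timezone.utc)),
    StoredObject("birdie/nps/2024-01-02.csv", datetime(2024, 1, 2, 6, tzinfo=timezone.utc)),
    StoredObject("birdie/tickets/2024-01-02.csv", datetime(2024, 1, 2, 7, tzinfo=timezone.utc)),
]
last_check = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
new_objects = select_new_objects(listing, "birdie/nps/", last_check)
print([obj.key for obj in new_objects])
```

Only the object written after the previous check and stored under the given prefix is selected, which is why a stable prefix per kind of data keeps imports predictable.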

Requirements

  • A dedicated bucket for the Birdie integration.

  • A service account with read-only access. Write access can optionally be granted so that support teams can perform initial or manual dataset uploads.

Setup in S3-compatible storage

AWS

Azure

To enable secure access to an Azure Blob Storage container, you can use either Shared Access Signature (SAS) or Shared Key Authentication.

SAS tokens are the recommended approach because they provide limited, temporary access without exposing the account key.

  1. Shared Access Signature

    • To generate a SAS token for an Azure Blob Storage container using the Azure Portal, go to your storage account, select the container, and click on “Shared Access Signature.” Configure the token by selecting the permissions and setting the start and expiry time. Once configured, click “Generate SAS” and copy the SAS URL or token.

    • For more information about Shared Access Signature, see the Azure documentation.

  2. Shared Key Authentication

    • To access an Azure Blob Storage container using Shared Key Authentication, you use the storage account name and account key, which provide permanent, full access to the storage account. By supplying the account name, account key, and container name to your application or client, you can perform any operation on the blobs and containers.

    • For more information about Shared Key Authentication, see the Azure documentation.
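Before sharing a SAS URL, it can help to confirm that it grants only read and list permissions and has not expired. The sketch below relies on the standard SAS query parameters `sp` (signed permissions) and `se` (signed expiry); the function name `inspect_sas_url`, the account name, and the URL are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def inspect_sas_url(sas_url, now=None):
    """Summarize the permissions and expiry carried in a SAS URL's query string."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlparse(sas_url).query)
    permissions = params.get("sp", [""])[0]  # signed permissions, e.g. "rl"
    expiry_raw = params.get("se", [""])[0]   # signed expiry, ISO 8601 in UTC
    expiry = None
    if expiry_raw:
        # fromisoformat() only accepts a trailing "Z" from Python 3.11, so normalize it.
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
    return {
        "permissions": permissions,
        "read_only": bool(permissions) and set(permissions) <= {"r", "l"},
        "expired": expiry is not None and expiry <= now,
    }


# Made-up URL for illustration; the signature is a placeholder.
example_url = (
    "https://exampleaccount.blob.core.windows.net/birdie"
    "?sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=REDACTED"
)
info = inspect_sas_url(example_url, now=datetime(2025, 1, 1, tzinfo=timezone.utc))
print(info)
```

A token whose `sp` value contains anything beyond `r` and `l` (for example `w` or `d`) grants more than the read-only access listed in the requirements.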


GCP

To enable HMAC access for a Google Cloud Storage (GCS) bucket, follow the instructions provided in the Google Cloud documentation.

Recommended Steps:

  1. Create a New Service Account

    • Create a new Service Account with the permissions needed to list and read objects. Assign the "Storage Object Viewer" role to this Service Account.

  2. Create a New HMAC Key

    • Follow the instructions in the Google Cloud documentation to create a new HMAC Key for the newly created Service Account. This key will allow secure access to your GCS bucket.

  3. Set Up IAM Policy

    • Bind an IAM policy that grants permissions to list objects only within the target bucket and, if necessary, a specific folder within that bucket. This ensures that access is restricted to the appropriate resources, enhancing security.

Connect to Birdie

To configure the connector, provide Birdie with the following parameters and credentials.

Parameters

  • Region: The region, e.g., "us-west-2" (AWS) or "us-central1" (GCP).

  • Bucket: The bucket name.

  • Prefix: A prefix for the object keys. We suggest organizing it by the kind of data (e.g., birdie/tickets, birdie/nps).

  • Format: The data format to use. Currently, only parquet and csv are supported.

  • Kind: The kind of data you're trying to import. This defines what schema Birdie expects when reading rows from your file. Supported values are:

    • review

    • nps

    • csat

    • support_ticket

    • social_media_post

    • issue

    • accounts

  • Credentials for S3

    • Access Key ID / HMAC Access ID

    • Secret Access Key / HMAC Secret

    • External ID (optional, AWS specific)

    • Role ARN (optional, AWS specific)

    • Endpoint: The S3 endpoint to use. Only needed when not using AWS S3.

  • Start Date: Date to filter objects by (object modified at).
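Collecting the parameters in one place and checking them before handing them over can catch typos early. The sketch below is illustrative only: `ConnectorConfig` and its field names are hypothetical and do not represent a Birdie API, while the allowed formats and kinds are taken from the lists above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

SUPPORTED_FORMATS = {"parquet", "csv"}
SUPPORTED_KINDS = {
    "review", "nps", "csat", "support_ticket",
    "social_media_post", "issue", "accounts",
}


@dataclass
class ConnectorConfig:
    """Hypothetical container for the connector parameters."""
    region: str
    bucket: str
    prefix: str
    format: str
    kind: str
    access_key_id: str
    secret_key: str
    start_date: date
    endpoint: Optional[str] = None     # only needed when not using AWS S3
    external_id: Optional[str] = None  # optional, AWS specific
    role_arn: Optional[str] = None     # optional, AWS specific

    def problems(self):
        """Return human-readable problems; an empty list means nothing was flagged."""
        found = []
        if self.format not in SUPPORTED_FORMATS:
            found.append(f"unsupported format: {self.format}")
        if self.kind not in SUPPORTED_KINDS:
            found.append(f"unsupported kind: {self.kind}")
        if not self.bucket:
            found.append("bucket is empty")
        return found


# Example values only; the credentials are placeholders.
config = ConnectorConfig(
    region="us-central1",
    bucket="example-birdie-bucket",
    prefix="birdie/nps",
    format="csv",
    kind="nps",
    access_key_id="EXAMPLE_ACCESS_ID",
    secret_key="EXAMPLE_SECRET",
    start_date=date(2024, 1, 1),
    endpoint="https://storage.googleapis.com",
)
print(config.problems())
```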

Data in scope

S3 Schemas

Each row of the file must fit within one of the following schemas. The schema must match the kind selected when configuring the parameters.

See the official Parquet specification for more information on the supported types and logical types.

Feedbacks // Review

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| feedback_id | STRING | Required | Unique identifier for each review. |
| text | STRING | Required | Text posted by the user. |
| posted_at | STRING | Required | When the feedback was posted (RFC 3339 timestamp). |
| author_id | STRING | Optional | Identifier for the author of the record. |
| account_id | STRING | Optional | Identifier for the account the record belongs to. |
| language | STRING | Optional | Language of the record as a BCP 47 code. |
| title | STRING | Optional | The title of the feedback given by the author. |
| rating | FLOAT | Required | A rating or score of the feedback. |
| category | STRING | Optional | The category the review belongs to. |
| owner | STRING | Optional | One of: Owner, Competitor. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |
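As an illustration of the review schema above, the sketch below checks rows before writing them to a CSV file. It assumes that CSV header names match the schema column names, and it approximates the RFC 3339 check with ISO 8601 parsing from the standard library; `review_row_errors` is a hypothetical helper, not part of Birdie.

```python
import csv
import io
from datetime import datetime

REVIEW_REQUIRED = ("feedback_id", "text", "posted_at", "rating")
REVIEW_OPTIONAL = ("author_id", "account_id", "language", "title", "category", "owner", "source")


def review_row_errors(row):
    """Return the problems found in one review row given as a dict of strings."""
    errors = [f"missing required column: {name}" for name in REVIEW_REQUIRED if not row.get(name)]
    if row.get("posted_at"):
        try:
            # fromisoformat() only accepts a trailing "Z" from Python 3.11, so normalize it.
            parsed = datetime.fromisoformat(row["posted_at"].replace("Z", "+00:00"))
            if parsed.tzinfo is None:
                errors.append("posted_at has no UTC offset")
        except ValueError:
            errors.append("posted_at is not a valid timestamp")
    if row.get("rating"):
        try:
            float(row["rating"])
        except ValueError:
            errors.append("rating is not a number")
    return errors


# Made-up sample data.
rows = [
    {"feedback_id": "r-1", "text": "Great app", "posted_at": "2024-03-01T12:30:00Z", "rating": "5"},
]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=REVIEW_REQUIRED + REVIEW_OPTIONAL, restval="")
writer.writeheader()
for row in rows:
    if not review_row_errors(row):  # write only rows that pass the checks
        writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Optional columns that are absent from a row are written as empty cells, so every line keeps the same column count.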

Feedbacks // NPS and CSAT

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| feedback_id | STRING | Required | Unique identifier for each answer. |
| text | STRING | Optional | Text posted by the user. |
| posted_at | STRING | Required | When the feedback was posted (RFC 3339 timestamp). |
| author_id | STRING | Optional | Identifier for the author of the record. |
| account_id | STRING | Optional | Identifier for the account the record belongs to. |
| language | STRING | Optional | Language of the record as a BCP 47 code. |
| title | STRING | Optional | The title of the survey. |
| rating | FLOAT | Required | A rating or score of the feedback. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |

Conversations // Support tickets

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message (unique at the account level). |
| author_id | STRING | Optional | Identifier for the author of the message. For author_type = agent, use a user-friendly string (email, login). |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| subject | STRING | Optional | Subject of the ticket. |
| status | STRING | Optional | Status of the ticket, e.g., open. |
| priority | STRING | Optional | Priority assigned to the ticket. |
| channel | STRING | Optional | Source channel of the ticket, e.g., web. |
| tags | REPEATED STRING | Optional | Array of tags applied to the ticket. |
| author_type | STRING | Optional | One of: Bot, Agent, User. |
| survey_title | STRING | Optional | Title of the survey that closes the ticket. |
| survey_type | STRING | Optional | Type of the survey that closes the ticket. One of: csat or nps. |
| rating | FLOAT | Optional | Rating that the client gave to the support ticket experience. |
| solved | STRING | Optional | Flag that indicates whether the ticket was solved. One of: true or false. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |
| agent_team | STRING | Optional | Support agent's team name. |
| agent_company | STRING | Optional | Support agent's company name. |
| agent_supervisor_id | STRING | Optional | Support agent's supervisor identifier. |
| agent_experience | STRING | Optional | Support agent's maturity level (Enum). |

To ensure consistency, upload only one row per conversation containing the survey response fields (such as survey_type, survey_title, and rating). This row should be the final message of the ticket, reflecting the client's closing thoughts on the service provided.

To upload multiple messages per ticket, make sure the ticket-level fields (such as subject, status, priority, channel, and tags) are consistent across all messages.
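The two rules above can be checked before upload. The following sketch is a hypothetical pre-upload check, not something Birdie provides: `conversation_problems` is an invented name, messages are plain dicts keyed by the schema column names, and the final message is found by comparing `posted_at` strings, which assumes all timestamps use the same UTC offset and format.

```python
from collections import defaultdict

TICKET_FIELDS = ("subject", "status", "priority", "channel", "tags")
SURVEY_FIELDS = ("survey_type", "survey_title", "rating")


def conversation_problems(messages):
    """Report conversations that break the ticket-field or survey-row rules."""
    by_conversation = defaultdict(list)
    for message in messages:
        by_conversation[message["conversation_id"]].append(message)

    problems = []
    for conversation_id, rows in by_conversation.items():
        # Ticket-level fields must carry the same value on every message.
        for field in TICKET_FIELDS:
            values = {repr(row.get(field)) for row in rows}  # repr() keeps lists hashable
            if len(values) > 1:
                problems.append(f"{conversation_id}: inconsistent {field}")
        # At most one row may carry survey fields, and it must be the last message.
        with_survey = [r for r in rows if any(r.get(f) is not None for f in SURVEY_FIELDS)]
        if len(with_survey) > 1:
            problems.append(f"{conversation_id}: survey fields on {len(with_survey)} rows")
        elif with_survey:
            last = max(rows, key=lambda r: r["posted_at"])
            if with_survey[0] is not last:
                problems.append(f"{conversation_id}: survey row is not the final message")
    return problems


# Made-up sample: the status differs between the two messages of ticket c1.
messages = [
    {"conversation_id": "c1", "message_id": "m1", "text": "Hi",
     "posted_at": "2024-03-01T10:00:00Z", "status": "open"},
    {"conversation_id": "c1", "message_id": "m2", "text": "Done",
     "posted_at": "2024-03-01T11:00:00Z", "status": "solved",
     "survey_type": "csat", "rating": 5.0},
]
print(conversation_problems(messages))
```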

Conversations // Complaints

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message (unique at the account level). |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| category | STRING | Optional | A classification for segmenting complaints, e.g., Support, Shipping. |
| status | STRING | Optional | Status of the complaint negotiations, e.g., pending, initiated, ongoing, or solved. |
| url | STRING | Optional | URL for the complaint if from a public source. |
| rating | FLOAT | Optional | Rating that the client gave to the complaint negotiation experience. |
| author_type | STRING | Optional | One of: Internal Person, User, Bot. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |

Conversations // Social Media Post

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message (unique at the account level). |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| title | STRING | Optional | Title of the post. |
| owner | STRING | Optional | One of: Owner, Competitor. |
| category | STRING | Optional | The category the post was under, e.g., a subreddit name. |
| url | STRING | Optional | URL of the post. |
| channel | STRING | Optional | Source channel of the post, e.g., facebook. |
| tags | REPEATED STRING | Optional | Array of tags applied to the post. |
| author_type | STRING | Optional | One of: Internal Person, User, Bot. |
| upvotes | INTEGER | Optional | The number of upvotes the message has. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |

Conversations // Issue

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message (unique at the account level). |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| project_id | STRING | Optional | Project identifier. |
| project_name | STRING | Optional | Project name. |
| title | STRING | Optional | Issue title. |
| status | STRING | Optional | Issue status. |
| source | STRING | Optional | A user-customizable label for grouping feedback. |

Accounts

| Column Name | Type | Required/Optional | Description |
| --- | --- | --- | --- |
| account_id | STRING | Required | Unique identifier for the account. |

Custom Fields

Any columns that don't fit under the previously listed schemas may become custom fields.

The name of the column in the Parquet Schema must be configured as the key/source of the custom field inside the Birdie App.
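To see which columns of a file would need a custom field configured, one can compare the file's columns against the selected schema. This is a minimal sketch with hypothetical names (`custom_field_candidates`, `plan_tier`, `store_region`); the schema column set shown is the NPS and CSAT one from this page.

```python
NPS_COLUMNS = {
    "feedback_id", "text", "posted_at", "author_id", "account_id",
    "language", "title", "rating", "source",
}


def custom_field_candidates(file_columns, schema_columns):
    """Return the columns, in file order, that are not part of the selected schema."""
    return [name for name in file_columns if name not in schema_columns]


# Made-up header of an NPS export with two extra columns.
file_columns = ["feedback_id", "posted_at", "rating", "plan_tier", "store_region"]
candidates = custom_field_candidates(file_columns, NPS_COLUMNS)
print(candidates)
```

Each name returned here is what would be entered as the key/source of the corresponding custom field in the Birdie App.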
