Google Cloud Bucket Message Data Export
For more complex analyses, and for ingesting data into third-party SaaS tools, we provide daily message data exports hosted in Aampe's cloud infrastructure.
Customer Setup Requirements
To export data directly to a SaaS tool, contact us so that we can set up a service account and share its JSON key with you over a secure communication channel. Detailed instructions on how to use the service account JSON can be found in the vendor export guide later in this document.
For direct bucket access, we either set up a service account or use Google Workload Identity Federation. Workload Identity Federation lets you use your existing identity system to access the data.
To use Workload Identity Federation to access Google Cloud resources from platforms outside Google Cloud, you need to configure a workload identity pool and provider, set up service account impersonation, and generate the JSON configuration file used by the auth libraries. Follow the guide for your platform:
Configure Workload Identity Federation from AWS
Configure Workload Identity Federation from Microsoft Azure
Configure Workload Identity Federation from an OIDC identity provider
Configure Workload Identity Federation from a SAML identity provider
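Both access methods end with a JSON file on your side. In Google's credential formats, a service account key carries `"type": "service_account"`, while a Workload Identity Federation configuration carries `"type": "external_account"`. The following is a minimal sketch for checking which kind of file you hold before wiring it into a tool. The helper name `credential_kind` and the placeholder values in the sample are hypothetical.

```python
import json


def credential_kind(config_text: str) -> str:
    """Classify a Google credential JSON document by its `type` field."""
    config = json.loads(config_text)
    kind = config.get("type")
    if kind == "service_account":
        return "service account key"
    if kind == "external_account":
        return "workload identity federation config"
    raise ValueError(f"unrecognized credential type: {kind!r}")


# Placeholder values only; a real configuration file contains more fields.
sample = json.dumps({
    "type": "external_account",
    "audience": "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global"
                "/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID",
})
print(credential_kind(sample))
```

Google's auth libraries typically locate either kind of file through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.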
Export Schema
Data is exported daily to a GCS bucket with the following structure:
All data is located under the path: messages_send/{year}/{month}/{day}/
For example: messages_send/2024/09/25/000000000000.csv.gz
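The path layout above can be reproduced in code when listing or downloading the files for a given day. This is a minimal sketch, assuming the month and day are both zero-padded to two digits (the example confirms this for the month only). The helper name `export_prefix` is hypothetical.

```python
from datetime import date


def export_prefix(day: date) -> str:
    """Return the object prefix that holds the export files for one day."""
    # Assumption: month and day are zero-padded to two digits.
    return f"messages_send/{day:%Y}/{day:%m}/{day:%d}/"


print(export_prefix(date(2024, 9, 25)))  # messages_send/2024/09/25/
```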
The data is exported as gzipped CSV files with the following example columns:
send_date | contact_id | event_name | time | event_properties |
---|---|---|---|---|
"2024-09-29" | "11171" | "Aampe Message Sent" | "1727614021217" | {event_properties} |
Detailed event_properties object
event_properties |
---|
{"aampe_message_id":"f4fc8697-1c47-440f-b0cf-33faf5884e9c","message_id":3107,"message_name":"Referral | DE","channel":"Push","message_tags":["DE","Referral"],"messaging_type":"Delegate to Agents","timestamp":"2024-09-29T12:47:01.217423Z"} |
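The row layout above can be read with the Python standard library alone. This is a sketch under two assumptions: that `event_properties` is stored as a JSON string inside a normally quoted CSV field, and that `time` is epoch milliseconds in UTC (the example value agrees with the `timestamp` inside `event_properties`). The helpers `read_export` and `sample_file` are hypothetical names, and the sample file is built in memory from the example row rather than downloaded.

```python
import csv
import gzip
import io
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_export(fileobj):
    """Yield one dict per row of a gzipped CSV export."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8", newline="") as text:
        for row in csv.DictReader(text):
            # event_properties is a JSON object serialized into one CSV field.
            row["event_properties"] = json.loads(row["event_properties"])
            # Assumption: `time` is epoch milliseconds (UTC).
            row["time"] = EPOCH + timedelta(milliseconds=int(row["time"]))
            yield row


def sample_file() -> io.BytesIO:
    """Build an in-memory gzipped CSV holding the example row from this page."""
    props = {
        "aampe_message_id": "f4fc8697-1c47-440f-b0cf-33faf5884e9c",
        "message_id": 3107,
        "message_name": "Referral | DE",
        "channel": "Push",
        "message_tags": ["DE", "Referral"],
        "messaging_type": "Delegate to Agents",
        "timestamp": "2024-09-29T12:47:01.217423Z",
    }
    buf = io.BytesIO()
    with gzip.open(buf, mode="wt", encoding="utf-8", newline="") as text:
        writer = csv.writer(text)
        writer.writerow(["send_date", "contact_id", "event_name", "time",
                         "event_properties"])
        writer.writerow(["2024-09-29", "11171", "Aampe Message Sent",
                         "1727614021217", json.dumps(props)])
    buf.seek(0)
    return buf


rows = list(read_export(sample_file()))
print(rows[0]["event_properties"]["channel"], rows[0]["time"].isoformat())
```

For a real export, pass an opened file (for example `open("000000000000.csv.gz", "rb")`) to `read_export` instead of the in-memory sample.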
Data Considerations
Data is matched to messages sent by your delivery platforms, such as Braze or Leanplum. To ensure data completeness, we export the data with a 3-day delay.
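When scheduling a downstream job, it helps to compute the most recent send date you can expect to find in the bucket. This is a minimal sketch, assuming the delay means that data for send date D becomes available on D + 3 days. The exact time of day the export runs is not stated here, and the function name is hypothetical.

```python
from datetime import date, timedelta

EXPORT_DELAY_DAYS = 3  # the delay described above


def latest_exported_send_date(today: date) -> date:
    """Return the newest send date expected to be exported as of `today`."""
    # Assumption: data for send date D is available on D + 3 days.
    return today - timedelta(days=EXPORT_DELAY_DAYS)


print(latest_exported_send_date(date(2024, 10, 2)))  # 2024-09-29
```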
Loading Message Data to Amplitude
To add a new GCS data source for Amplitude to draw data from, follow these steps:
In Amplitude Data, click Catalog and select the Sources tab.
In the Warehouse Sources section, click GCS.
Upload your Service Account Key file. This gives Amplitude the permissions to pull data from your GCS bucket.
After you've uploaded the Service Account Key file, enter the bucket name and folder where the data resides.
Click Next to test the credentials. If all your information checks out, Amplitude displays a success message. Click Next > to continue the process.
In the Enable Data Source panel, name your data source and give it a description. (You can edit this information later, via Settings.) Then click Save Source. Amplitude confirms that you've created and enabled your source.
Click Finish to go back to the list of data sources. If you've already configured the converter, the data import starts in a few moments. Otherwise, it's time to create your data converter.
The final step in setting up Amplitude's GCS ingestion source is creating the converter file. Your converter configuration gives the integration this information:
A pattern that tells Amplitude what a valid data file looks like. For example: "\w+_\d{4}-\d{2}-\d{2}.json.gz"
Whether the file is compressed, and if so, how.
The file’s format. For example: CSV (with a particular delimiter), or lines of JSON objects.
The converter file tells Amplitude how to process the ingested files. Create it in two steps: first, configure the compression type, file name, and escape characters for your files. Then use JSON to describe the rules your converter follows.
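The example pattern in the list above targets date-stamped JSON files, so it would not match the gzipped CSV files in this export (for example `000000000000.csv.gz`). Before entering a pattern in the converter, you can check a candidate locally. In this sketch the pattern `\d+\.csv\.gz` is a hypothetical choice, and matching against the file name rather than the full object path is also an assumption. Confirm both against Amplitude's converter documentation.

```python
import re

# Hypothetical pattern for this export's file names (digits + ".csv.gz").
FILE_PATTERN = re.compile(r"\d+\.csv\.gz")


def is_export_file(object_name: str) -> bool:
    """Check whether the file-name part of an object path matches the pattern."""
    file_name = object_name.rsplit("/", 1)[-1]
    return FILE_PATTERN.fullmatch(file_name) is not None


print(is_export_file("messages_send/2024/09/25/000000000000.csv.gz"))
print(is_export_file("messages_send/2024/09/25/notes.txt"))
```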