Google Spanner

This guide helps you configure your Google Cloud Spanner instance to allow Aampe to access your data.

Option 1: Direct Read Access

Grant Aampe direct read access to your Spanner instance for real-time data synchronization.

Prerequisites

  • Google Cloud project with Spanner instance
  • Admin access to your GCP project
  • Spanner instance ID and database name

Setup Steps

1. Create a Service Account for Aampe

# Create service account
gcloud iam service-accounts create aampe-spanner-reader \
  --display-name="Aampe Spanner Reader" \
  --description="Service account for Aampe to read Spanner data"

# Get the service account email
gcloud iam service-accounts list --filter="displayName:Aampe Spanner Reader"

2. Grant Spanner Access

# Grant database reader role
gcloud spanner databases add-iam-policy-binding YOUR_DATABASE_NAME \
  --instance=YOUR_INSTANCE_ID \
  --member="serviceAccount:aampe-spanner-reader@YOUR_PROJECT.iam.gserviceaccount.com" \
  --role="roles/spanner.databaseReader"

3. Create and Download Service Account Key

# Create key
gcloud iam service-accounts keys create aampe-spanner-key.json \
  --iam-account=aampe-spanner-reader@YOUR_PROJECT.iam.gserviceaccount.com

# The key file will be downloaded to your current directory

4. Provide Information to Aampe

Share the following with your Aampe representative:

  • Service account key file (aampe-spanner-key.json)
  • Project ID
  • Spanner instance ID
  • Database name
  • List of tables to sync

Option 2: BigQuery Federation (Recommended)

If you prefer not to grant direct Spanner access, you can set up BigQuery federation and grant access to BigQuery instead.

Prerequisites

  • BigQuery API enabled in your project
  • Existing Spanner instance with data

Setup Steps

1. Create External Connection in BigQuery

-- In BigQuery Console
-- Navigate to "Add Data" > "External data source"
-- Select "Cloud Spanner" as source type
-- Configure:
--   Connection ID: aampe-spanner-connection
--   Location: [Match your Spanner region]
--   Spanner instance: YOUR_INSTANCE_ID
--   Spanner database: YOUR_DATABASE_NAME

2. Create Views for Aampe Access

-- Create a dataset for Aampe views
CREATE SCHEMA IF NOT EXISTS `your_project.aampe_views`
OPTIONS(
  description="Views for Aampe data access",
  location="us-central1"
);

-- Create view for customer data
CREATE OR REPLACE VIEW `your_project.aampe_views.customers` AS
SELECT * FROM EXTERNAL_QUERY(
  'projects/YOUR_PROJECT/locations/YOUR_LOCATION/connections/aampe-spanner-connection',
  '''
  SELECT 
    customer_id,
    email,
    created_at,
    -- Include only fields needed by Aampe
  FROM Customers
  '''
);

3. Grant BigQuery Access to Aampe

# Grant access to the views dataset
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:AAMPE_SERVICE_ACCOUNT" \
  --role="roles/bigquery.dataViewer" \
  --condition="expression=resource.name.startsWith('projects/YOUR_PROJECT/datasets/aampe_views'),title=Aampe Views Access"

4. Share Connection Details

Provide Aampe with:

  • Project ID
  • Dataset name (aampe_views)
  • List of available views

Option 3: Cross-Project Access for Dataflow

If Aampe will be running Dataflow jobs to extract your data, grant the necessary permissions for cross-project access.

Prerequisites

  • Aampe will provide you with a service account email that will run the Dataflow jobs

Setup Steps

1. Enable Required APIs

# Enable APIs that Aampe's Dataflow jobs will need
gcloud services enable spanner.googleapis.com --project=YOUR_PROJECT

2. Grant Spanner Access

Grant Aampe's service account permission to read your Spanner data:

# Grant database reader role to Aampe's service account
gcloud spanner databases add-iam-policy-binding YOUR_DATABASE_NAME \
  --instance=YOUR_INSTANCE_ID \
  --member="serviceAccount:AAMPE_PROVIDED_SERVICE_ACCOUNT" \
  --role="roles/spanner.databaseReader"

# If you want to limit access to specific tables, create a custom role
gcloud iam roles create aampeSpannerReader \
  --project=YOUR_PROJECT \
  --title="Aampe Spanner Reader" \
  --description="Read access to specific Spanner tables for Aampe" \
  --permissions="spanner.databases.select,spanner.sessions.create,spanner.sessions.get,spanner.sessions.delete"

3. Network Configuration (if using VPC)

If your Spanner instance is in a VPC-enabled configuration:

# Option A: Add Aampe's project to your VPC Service Controls perimeter
gcloud access-context-manager perimeters update YOUR_PERIMETER \
  --add-resources="projects/AAMPE_PROJECT_NUMBER"

# Option B: Create ingress rule for Aampe
gcloud access-context-manager perimeters update YOUR_PERIMETER \
  --add-ingress-policies=ingress.yaml

Example ingress.yaml:

- ingressFrom:
    sources:
      - resource: projects/AAMPE_PROJECT_NUMBER
    identityType: ANY_IDENTITY
  ingressTo:
    operations:
      - serviceName: spanner.googleapis.com
        methodSelectors:
          - method: "*"
    resources:
      - projects/YOUR_PROJECT_NUMBER

Share the following details:

  • Project ID
  • Spanner instance ID
  • Database name
  • List of tables to sync
  • Any network restrictions or VPC configurations

4. Real-time streams

For real-time event streaming, set up Spanner change streams to publish to a Pub/Sub topic that Aampe can subscribe to/ you can push to Aampe endpoints

Prerequisites

  • Pub/Sub API enabled
  • Spanner database with change streams support

Setup Steps

1. Create Change Streams

-- In Spanner
CREATE CHANGE STREAM aampe_customer_changes
  FOR Customers(customer_id, email, status, updated_at)
  OPTIONS (
    retention_period = '7d',
    value_capture_type = 'NEW_AND_OLD_VALUES'
  );

2. Create Pub/Sub Topic

# Create topic
gcloud pubsub topics create aampe-spanner-changes

# Grant Aampe subscription permissions
gcloud pubsub topics add-iam-policy-binding aampe-spanner-changes \
  --member="serviceAccount:AAMPE_SERVICE_ACCOUNT" \
  --role="roles/pubsub.subscriber"

3. Set Up Change Stream Export

Use the Dataflow template to export changes:

gcloud dataflow jobs run aampe-change-stream-export \
  --gcs-location gs://dataflow-templates/latest/Spanner_Change_Streams_to_PubSub \
  --region YOUR_REGION \
  --parameters \
spannerInstanceId=YOUR_INSTANCE,\
spannerDatabase=YOUR_DATABASE,\
spannerChangeStreamName=aampe_customer_changes,\
pubsubTopic=projects/YOUR_PROJECT/topics/aampe-spanner-changes

4. Configure Dead Letter Queue (Optional)

# Create dead letter topic
gcloud pubsub topics create aampe-spanner-changes-dlq

# Create subscription with dead letter policy
gcloud pubsub subscriptions create aampe-changes-sub \
  --topic=aampe-spanner-changes \
  --dead-letter-topic=aampe-spanner-changes-dlq \
  --max-delivery-attempts=5

Performance Considerations

  • Ensure indexes exist on frequently queried columns (customer_id, created_at)
  • Consider partitioning large tables by date
  • Monitor Spanner CPU utilization during sync operations


Next Steps

Once configuration is complete:

  1. Notify your Aampe representative
  2. We'll verify the connection and begin initial data sync
  3. Monitor the first sync for any performance impacts
  4. Set up alerts for failed syncs (if using change streams)