At a glance: Set up one or more data warehouses (BigQuery, Snowflake) and/or cloud storage buckets (Amazon S3, GCS) to share data with the Data Clean Room and receive reports.
Overview
Preparing to use the Data Clean Room (DCR) involves setting up:
- The cloud services/locations from which the DCR reads first-party data from your systems (custom sources). These locations are used to create inbound connections.
- The cloud services/locations to which the DCR delivers reports after processing. These locations are used to create outbound connections.
Creating an inbound or outbound connection is a 2-step process:
- Step #1 – Use the interfaces of your selected cloud services to prepare them for use with the DCR (this article).
- Step #2 – Use the AppsFlyer platform to connect them to the DCR. (See Data Clean Room—Working with connections).
Note
See Data Clean Room—Working with sources for complete information about source data requirements:
- Data format (for all sources)
- Table columns (for sources in data warehouses)
- File name and format (for sources in cloud storage buckets)
Supported cloud services
Two types of cloud services are supported for inbound and outbound connections to the DCR:
- Data warehouses: BigQuery and Snowflake
- Cloud storage buckets: Amazon S3 (AWS) and GCS
You can use one or any combination of these services for inbound and outbound connections.
Important!
- If you will be using multiple custom sources for a single report, they must be located in cloud storage buckets.
- It's very common to use the same cloud storage bucket on Amazon S3 or GCS for both inbound and outbound connections. Be sure to follow the special instructions for that setup.
Setting up cloud services for inbound connections
Prepare your selected cloud services for use with DCR inbound connections according to the instructions in the following tabs.
Data warehouses – BigQuery and Snowflake
BigQuery
Note: The following procedure must be performed by your Google Cloud admin.
To create a dataset and grant the DCR permissions:
- Log in to your Google Cloud console.
- Go to the BigQuery page.
- In a new or existing Google Cloud project, create a dataset for the exclusive use of the DCR:
  - In the left-side panel, click the View actions button to the right of the project ID.
  - Select Create dataset.
  - In the right-side panel that opens, enter the name of the dataset and select other options as you require.
    - You can use any name that suits you, using letters, numbers, and underscores (_) only.
    - Recommended: Use a name that indicates the dataset is being used for an inbound connection.
    - It is strongly recommended NOT to use the Enable table expiration option, since the DCR would be unable to read the sources after the tables expire.
  - Click the Create dataset button.
- Grant the DCR permissions to the dataset:
  - In the left-side panel, click the View actions button to the right of the dataset you created.
  - Select Share.
  - In the right-side panel that opens, click the Add principal button.
  - In the Add principals section, enter the following account in the New principals field:
    appsflyer-dcr@dcr-report.iam.gserviceaccount.com
  - In the Assign roles section, select BigQuery > BigQuery Data Viewer.
  - Click Save.
  - Click CLOSE to close the right-side panel.
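If you prefer to script this setup rather than click through the console, the sketch below shows the equivalent steps using the google-cloud-bigquery Python client. The project and dataset names are placeholders, and dataset-level READER access is the scripted counterpart of the BigQuery Data Viewer role selected in the console.

```python
# Sketch: create a DCR-only dataset and grant the AppsFlyer DCR service
# account read access. Assumes google-cloud-bigquery is installed and you
# are authenticated as a project admin. "my-project" and "dcr_inbound"
# are placeholder names -- substitute your own.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Create the dataset (no table expiration, so the DCR can always read it).
dataset = client.create_dataset(bigquery.Dataset("my-project.dcr_inbound"))

# Append an access entry for the DCR service account. Dataset-level
# READER access corresponds to the BigQuery Data Viewer role.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="appsflyer-dcr@dcr-report.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```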
Snowflake
Note: The following procedure must be performed by a Snowflake Accountadmin.
To create a private share for use by the DCR:
- Log in to the Snowflake account that contains the data you want to share with the DCR.
- Switch your role to Accountadmin.
- From the left-side panel, select Private Sharing.
- In the page that opens, select the Shared By Your Account tab.
- Click the Share button. From the list that opens, select Create a Direct Share.
- Select the tables and/or views that you want to share with the DCR, then click Done.
- According to your needs, change the Secure Share Identifier and add an optional description.
- In the field Add accounts in your region by name, enter one of the following AppsFlyer Snowflake accounts, according to your Snowflake account region:

  | Region | AppsFlyer account |
  |---|---|
  | EU West (eu-west-1) | QL63117 |
  | US East - N. Virginia (us-east-1) | MWB70410 |
  | US East - Ohio (us-east-2) | BM15378 |

- Click the Create Share button.
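For teams that script their Snowflake setup, the sketch below runs the SQL equivalents of the Snowsight steps above via snowflake-connector-python. The connection details, database, schema, and table names are placeholders; QL63117 is the eu-west-1 account from the table, so substitute the account for your region.

```python
# Sketch: create a direct share for the DCR via SQL. my_db, my_schema,
# and my_table are placeholders; replace the account identifier with the
# AppsFlyer account for your Snowflake region (see the table above).
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",  # placeholder
    user="your_user",        # placeholder
    password="...",          # placeholder
)
cur = conn.cursor()
cur.execute("USE ROLE ACCOUNTADMIN")
cur.execute("CREATE SHARE dcr_share")
cur.execute("GRANT USAGE ON DATABASE my_db TO SHARE dcr_share")
cur.execute("GRANT USAGE ON SCHEMA my_db.my_schema TO SHARE dcr_share")
cur.execute("GRANT SELECT ON TABLE my_db.my_schema.my_table TO SHARE dcr_share")
cur.execute("ALTER SHARE dcr_share ADD ACCOUNTS = QL63117")
```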
Cloud storage buckets – Amazon S3 and GCS
You can use one or more buckets for uploading data to the DCR (on Amazon S3, GCS, or both). However, in most cases, the easiest-to-manage structure includes a single bucket on a single cloud service.
- You can set up the same bucket for use with both inbound and outbound connections by following these instructions.
The following requirements are relevant to buckets on both cloud services:
- Use: The bucket must be for the exclusive use of AppsFlyer Data Clean Room. In other words, no other service can write data to the bucket.
- Permissions: AppsFlyer DCR service must be given bucket permissions. See instructions for granting these permissions in the tabs for each cloud service below.
- Name: The bucket name must begin with af-dcr- or af-datalocker-.
  - Example: af-dcr-example-bucket
- DCR naming requirements: The following naming requirements apply to all DCR data entities (buckets, folders, and files):
  - Maximum length: 200 characters
  - Valid characters:
    - letters (A-Z, a-z)
    - numbers (0-9); cannot be the first character of a name
    - hyphens (-); cannot be the first character of a name
  - Invalid characters:
    - spaces
    - all other symbols or special characters
  - Characters used for special purposes only:
    - equal signs (=): only where required in date and version folder names
    - underscores (_): only where used to identify the parts of multi-part GZIP files or for naming _SUCCESS files
    - dots (.): only directly before filename extensions (.csv, .gzip)
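Because these rules apply to every bucket, folder, and file name, it can help to validate names before creating anything. The helper below is an illustrative sketch (not an AppsFlyer tool) covering the general rules only; the special-purpose characters (=, _, .) are left out because their validity depends on where they appear.

```python
import re

# Illustrative check of the general DCR naming rules above: at most 200
# characters; letters, numbers, and hyphens only; and the first character
# must be a letter (numbers and hyphens may not lead). Special-purpose
# characters (=, _, .) are out of scope here.
def is_valid_dcr_name(name: str) -> bool:
    return len(name) <= 200 and re.fullmatch(r"[A-Za-z][A-Za-z0-9-]*", name) is not None

print(is_valid_dcr_name("af-dcr-example-bucket"))  # True
print(is_valid_dcr_name("1st-bucket"))             # False: starts with a number
print(is_valid_dcr_name("my bucket"))              # False: contains a space
```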
Amazon S3
Note: The following procedure must be performed by your AWS admin.
To create a bucket and grant AppsFlyer permissions:
- Log in to the AWS console.
- Go to the S3 service.
- Create the bucket:
  - Click Create bucket.
  - Complete the Bucket name, starting with af-dcr- or af-datalocker- and followed by your text (according to the DCR naming requirements above).
  - Click Create bucket.
- Grant AppsFlyer bucket permissions:
  - Select the bucket you created.
  - Go to the Permissions tab.
  - In the Bucket policy section, click Edit. The Edit bucket policy window opens.
  - Paste the following code snippet into the window.

    ```json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AF-DCR-DL",
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::195229424603:user/product=dcr-reporter__envtype=prod__ns=default",
              "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
            ]
          },
          "Action": [
            "s3:GetObject",
            "s3:ListBucket",
            "s3:DeleteObject",
            "s3:PutObject"
          ],
          "Resource": [
            "arn:aws:s3:::af-dcr-mybucket",
            "arn:aws:s3:::af-dcr-mybucket/*"
          ]
        }
      ]
    }
    ```

  - In the snippet, replace af-dcr-mybucket (in the 2 lines in which it appears) with the name of the bucket you created.
    Caution! When replacing the bucket name in the snippet, be sure not to overwrite /* in the second line in which the bucket name appears.
  - Click Save changes.
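If you manage AWS resources programmatically, the sketch below performs the same bucket creation and policy attachment with boto3, assuming admin credentials are already configured. The bucket name is a placeholder.

```python
# Sketch: create the DCR bucket and attach the policy shown above.
import json
import boto3

bucket = "af-dcr-example-bucket"  # placeholder: use your own af-dcr-/af-datalocker- name
s3 = boto3.client("s3")

# Create the bucket. Outside us-east-1, also pass
# CreateBucketConfiguration={"LocationConstraint": "<region>"}.
s3.create_bucket(Bucket=bucket)

# Same policy as the console snippet above, with the bucket name filled in.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AF-DCR-DL",
        "Effect": "Allow",
        "Principal": {"AWS": [
            "arn:aws:iam::195229424603:user/product=dcr-reporter__envtype=prod__ns=default",
            "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default",
        ]},
        "Action": ["s3:GetObject", "s3:ListBucket", "s3:DeleteObject", "s3:PutObject"],
        # Both the bucket ARN and its /* object ARN are required.
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```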
GCS
Note: The following procedure must be performed by your Google Cloud admin.
To create a bucket and grant AppsFlyer permissions:
- Log in to your GCS console.
- Go to the Cloud Storage Browser page.
- Create the bucket:
  - Click Create bucket.
  - Enter your bucket information on the Create a bucket page. Include the bucket name, starting with af-dcr- or af-datalocker- and followed by your text (according to the DCR naming requirements above).
  - Click Continue.
  - Click Create.
- Grant AppsFlyer bucket permissions:
  - Select the bucket you created.
  - Go to the Permissions tab.
  - In the Permissions section, click + Add. The Add members window opens.
  - In the New members box, enter the following account:
    appsflyer-dcr@dcr-report.iam.gserviceaccount.com
  - From the Role list, select Cloud Storage > Storage Admin.
  - Click Save.
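The same setup can be scripted with the google-cloud-storage Python client, as sketched below. The project and bucket names are placeholders; the IAM binding is the scripted counterpart of selecting Cloud Storage > Storage Admin in the console.

```python
# Sketch: create the DCR bucket and grant the AppsFlyer DCR service
# account Storage Admin. Assumes Google Cloud admin credentials.
from google.cloud import storage

client = storage.Client(project="my-project")  # placeholder project
bucket = client.create_bucket("af-dcr-example-bucket")  # placeholder name

# Add an IAM binding equivalent to the console's Storage Admin role grant.
# For a combined inbound/outbound bucket, also append a binding for
# af-data-delivery@af-raw-data.iam.gserviceaccount.com.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.admin",
    "members": {"serviceAccount:appsflyer-dcr@dcr-report.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)
```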
Setting up cloud services for outbound connections
The DCR delivers reports to your selected cloud services using AppsFlyer Data Locker.
- Note: Receiving DCR reports does not require a premium subscription to Data Locker. If you are interested in receiving other AppsFlyer reports via Data Locker, contact your CSM or send an email to hello@appsflyer.com.
Your DCR reports can be delivered to one or more locations on your cloud services (whether or not you use the same services for inbound connections). Prepare them for use with outbound connections according to the instructions in the following tabs.
Data warehouses – BigQuery and Snowflake
BigQuery
Note: The following procedure must be performed by your Google Cloud admin.
To create a dataset and grant Data Locker permissions:
- Log in to your Google Cloud console.
- Go to the BigQuery page.
- In a new or existing Google Cloud project, create a dataset for the exclusive use of Data Locker:
  - In the left-side panel, click the View actions button to the right of the project ID.
  - Select Create dataset.
  - In the right-side panel that opens, enter the name of the dataset and select other options as you require.
    - You can use any name that suits you, using letters, numbers, and underscores (_) only.
    - Recommended: Use a name that indicates the dataset is being used for an outbound connection.
    - It is strongly recommended NOT to use the Enable table expiration option, since Data Locker would be unable to write reports to the dataset after the tables expire.
  - Click the Create dataset button.
- Grant Data Locker permissions to the dataset:
  - In the left-side panel, click the View actions button to the right of the dataset you created.
  - Select Share.
  - In the right-side panel that opens, click the Add principal button.
  - In the Add principals section, enter the following account in the New principals field:
    datalocker-bq-admin-prod@datalocker-bq-prod.iam.gserviceaccount.com
  - In the Assign roles section, select BigQuery > BigQuery Data Editor.
  - Click Save.
  - Click CLOSE to close the right-side panel.
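Scripted setup works the same way as for inbound datasets; the sketch below (again with placeholder names) differs only in the service account and in granting dataset-level WRITER access, the counterpart of BigQuery Data Editor.

```python
# Variation on the inbound sketch: create an outbound dataset and grant
# dataset-level WRITER access (equivalent to BigQuery Data Editor) to the
# Data Locker service account.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.create_dataset(bigquery.Dataset("my-project.dcr_outbound"))

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="WRITER",
        entity_type="userByEmail",
        entity_id="datalocker-bq-admin-prod@datalocker-bq-prod.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```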
Snowflake
Cloud storage buckets – Amazon S3 and GCS
The procedure for preparing cloud storage buckets for outbound connections is very similar to the procedure for preparing them for inbound connections (including the requirements relevant to both cloud storage services).
The instructions in the tabs below apply when you are using a bucket for outbound connections only.
- If you will be using the same bucket for both inbound and outbound connections, follow the special instructions for that setup.
Amazon S3
Follow the instructions for creating an Amazon S3 bucket for inbound connections (with no changes to that procedure).
GCS
Follow the instructions for creating a GCS bucket for inbound connections. In step #4 of that procedure, enter the following account in the New members box:
af-data-delivery@af-raw-data.iam.gserviceaccount.com
Setting up the same cloud storage bucket for both inbound and outbound connections
As previously mentioned, it's common to use the same bucket on Amazon S3 or GCS for both inbound and outbound connections.
The instructions for this setup vary only slightly from the instructions for inbound connections. They do differ, however, depending on whether you are:
- creating a new bucket for use with DCR inbound and outbound connections; or
- modifying a bucket previously used only for Data Locker to one now used for both inbound and outbound DCR connections
Instructions for both of these scenarios are included in the tabs below:
Amazon S3
Creating a new bucket for inbound/outbound connections
Follow the instructions for creating an Amazon S3 bucket for inbound connections (with no changes to that procedure).
Modifying an existing bucket previously used only for Data Locker
Modifying an existing bucket that you used previously only for Data Locker requires changing bucket permissions (to allow access by both DCR and Data Locker).
To modify bucket permissions:
- Log in to the AWS console.
- Go to the S3 service.
- Select the bucket used previously only for Data Locker.
- Go to the Permissions tab.
- In the Bucket policy section, click Edit. The Edit bucket policy window opens.
- Replace the contents of the window with the following code snippet:

  ```json
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "AF-DCR-DL",
        "Effect": "Allow",
        "Principal": {
          "AWS": [
            "arn:aws:iam::195229424603:user/product=dcr-reporter__envtype=prod__ns=default",
            "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
          ]
        },
        "Action": [
          "s3:GetObject",
          "s3:ListBucket",
          "s3:DeleteObject",
          "s3:PutObject"
        ],
        "Resource": [
          "arn:aws:s3:::af-dcr-mybucket",
          "arn:aws:s3:::af-dcr-mybucket/*"
        ]
      }
    ]
  }
  ```

- In the snippet, replace af-dcr-mybucket (in the 2 lines in which it appears) with the name of the bucket.
  Caution! When replacing the bucket name in the snippet, be sure not to overwrite /* in the second line in which the bucket name appears.
- Click Save changes.
GCS
Creating a new bucket for inbound/outbound connections
Follow the instructions for creating a GCS bucket for inbound connections. Modify step #4 of that procedure to enter the following 2 accounts in the New members box:
appsflyer-dcr@dcr-report.iam.gserviceaccount.com
af-data-delivery@af-raw-data.iam.gserviceaccount.com
Modifying an existing bucket previously used only for Data Locker
Modifying an existing bucket that you used previously only for Data Locker requires changing bucket permissions (to allow access by both DCR and Data Locker).
To modify bucket permissions:
- Log in to your GCS console.
- Go to the Cloud Storage Browser page.
- Select the bucket used previously only for Data Locker.
- Go to the Permissions tab.
- In the Permissions section, click + Add. The Add members window opens.
- In the New members box, enter the following account:
appsflyer-dcr@dcr-report.iam.gserviceaccount.com
- From the Role list, select Cloud storage > Storage Admin.
- Click Save.