Data Collaboration Platform (DCP)—Create and manage sources

At a glance: Set up and manage sources to securely share your first-party data with other collaborators.

 About DCP

The Data Collaboration Platform (DCP) functions as the central point for data collaboration, including audience creation and activation. DCP relies on the advanced technology of the Data Clean Room (DCR) to ensure data privacy and security for the collaboration and audience management processes.

Overview

This article contains everything you need to know about creating and managing your sources, including how to:

  • Create a source
  • Edit a source
  • Delete a source
  • Share a source

 Before you begin

Before you create your sources, you should first:

  • [Required] Set up the cloud services from which the DCR will retrieve the data. Two types of cloud services are supported:
    • Data warehouses: BigQuery and Snowflake
    • Cloud storage buckets: Amazon S3 (AWS) and GCS
  • [Optional] Create inbound connections in the AppsFlyer platform to connect these cloud services to the DCR. If these connections were not set up previously, you will be prompted to set them up during source creation. 

Source data requirements

Sources must meet the following requirements to prevent errors during source creation.

Data format (relevant to all sources)

Data within sources must meet these requirements:

  • Date (only): yyyy-mm-dd (for example, 2023-04-18)
  • Date and time:
    • Format: yyyy-MMM-dd hh:mm:ss (for example, 2023-APR-18 15:30:35)
    • Time zone: UTC
  • Numbers: maximum 2 digits following the decimal point
  • String length: maximum of 256 characters
  • Character limitations:
    • For field names (column headers): no spaces or special characters
    • All other data: no limitations (all characters are valid)
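
For illustration, here's a minimal Python sketch that formats values to meet these requirements. The field names and values are hypothetical; the snippet assumes your pipeline holds datetimes in UTC.

```python
from datetime import datetime, timezone

# A sketch only: field names are hypothetical, values illustrative.
event_time = datetime(2023, 4, 18, 15, 30, 35, tzinfo=timezone.utc)

row = {
    # Date only: yyyy-mm-dd
    "install_date": event_time.strftime("%Y-%m-%d"),
    # Date and time: yyyy-MMM-dd hh:mm:ss in UTC (month abbreviation uppercased)
    "event_time": event_time.strftime("%Y-%b-%d %H:%M:%S").upper(),
    # Numbers: at most 2 digits after the decimal point
    "revenue": f"{12.3456:.2f}",
    # Strings: capped at 256 characters
    "note": "free-text value"[:256],
}
print(row)  # {'install_date': '2023-04-18', 'event_time': '2023-APR-18 15:30:35', ...}
```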

Table columns (relevant only to sources in data warehouses)

In addition to data shared for processing, source tables in BigQuery or Snowflake must include 2 additional columns – one for date and one for version:

  • Date:
    • Column header: dt
    • Column type: date
    • Data format: yyyy-mm-dd (for example, 2023-04-18)
    • Additional: BigQuery tables must be partitioned by this column
  • Version:
    • Column header: v
    • Column type: string
    • Data format: number (for example, 1, 2, 3, 10)
    • Important! A new version of a report is triggered each time the DCR detects a new value in this column. To ensure the completeness of your report, be sure to populate the source table with a complete set of data whenever the column value is changed.
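
As one way to satisfy these requirements, here's a hedged sketch using the google-cloud-bigquery Python client. The project, dataset, table name, and data columns (cuid, revenue) are hypothetical.

```python
from google.cloud import bigquery

# A sketch only: "my-project", "my_dataset", and the data columns are
# hypothetical; only the dt and v columns are required by the DCR.
client = bigquery.Client()

table = bigquery.Table(
    "my-project.my_dataset.my_dcr_source",
    schema=[
        bigquery.SchemaField("cuid", "STRING"),      # example shared data column
        bigquery.SchemaField("revenue", "NUMERIC"),  # example shared data column
        bigquery.SchemaField("dt", "DATE"),          # required date column
        bigquery.SchemaField("v", "STRING"),         # required version column
    ],
)
# BigQuery source tables must be partitioned by the dt column.
table.time_partitioning = bigquery.TimePartitioning(field="dt")
client.create_table(table)
```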

File name and format (relevant only to sources in cloud storage buckets)

Source files stored in Amazon S3 or GCS must meet these file name and format requirements:

  • File name must comply with DCR naming requirements
  • CSV or GZIP format
    • A GZIP file must contain a compressed CSV file.
  • Number of data source files per data folder:
    • CSV: Maximum of 1
    • GZIP: Maximum of 1 single-part file. Multi-part GZIP files are supported when named as follows: filename_part01.gzip, filename_part02.gzip, etc.
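
For example, here's a minimal Python sketch that writes a GZIP file whose underlying content is CSV, following the multi-part naming shown above. The file name and columns are illustrative.

```python
import csv
import gzip

# A sketch only: file name and columns are illustrative. Each part of a
# multi-part GZIP source must itself be a compressed CSV file.
rows = [["cuid", "revenue"], ["abc123", "12.34"]]

with gzip.open("CRM-data_part01.gzip", "wt", newline="") as f:
    csv.writer(f).writerows(rows)
# Additional parts would follow the same pattern: CRM-data_part02.gzip, etc.
```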

Create a source

Creating a source involves a guided walkthrough with a few simple steps. To start the process:

  1. In AppsFlyer, from the side menu, select Collaborate > Data Clean Room.
  2. From the top-right menu, click + New source.
  3. Proceed with the New source walkthrough steps:

Step 1: Set source name

Enter the source name. This can be any unique name that will help you identify the source.

Requirements and guidelines

  • Make sure the source name is unique among all other sources in your account. Otherwise, you won't be able to save the source.
  • For cloud integrations, the name doesn't need to match the file name.
  • Source name requirements:
    • Length: 2-80 characters
    • Valid characters:
      • letters (A-Z, a-z)
      • numbers (0-9); a number can't be the first character of the name
    • Invalid characters:
      • spaces
      • all other symbols or special characters
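
If you generate source names programmatically, here's a minimal Python sketch that checks a candidate name against these rules. The regular expression simply encodes the requirements above.

```python
import re

# A sketch of checking a candidate source name against the rules above:
# 2-80 characters, letters and numbers only, not starting with a number.
NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]{1,79}$")

def is_valid_source_name(name: str) -> bool:
    return NAME_RE.fullmatch(name) is not None

print(is_valid_source_name("MyCrmSource1"))  # True
print(is_valid_source_name("1stSource"))     # False: starts with a number
print(is_valid_source_name("my source"))     # False: contains a space
```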

Step 2: Set source location

To specify the source location:

  1. Select the connection in which the source will be (or has been) created.
    • If there are no connections defined in your account, the New connection dialog will open, prompting you to create one. Follow these instructions to create it.
    • If you have existing connections but want to use a new one, click + New connection and follow these instructions to create it.
  2. Continue with the relevant instructions below, based on where the data for your source is located.

Source locations in BigQuery

To complete specifying the source location for BigQuery sources:

  1. Select the dataset in which the source table is located.
  2. Select the table in which the source data is located.

The lists from which you make these selections contain the available datasets and tables, respectively, in the BigQuery project you specified when creating the connection.

Source locations in Snowflake

To complete specifying the source location for Snowflake sources:

  1. Select the share containing the source data.
  2. Select the schema in which the source table is located.
  3. Select the table in which the source data is located.

The lists from which you make these selections contain the shares, schemas, and tables, respectively, in the Snowflake account you specified when creating the connection.

Source locations in cloud storage buckets

Source locations in Amazon S3 or GCS consist of the cloud storage bucket specified by the connection and the underlying folder path from which the DCR reads the source file each time it is updated. 

Once you have specified the connection, AppsFlyer can automatically generate the required underlying folder path as part of the source creation process.

  • Allowing AppsFlyer to generate the folders makes the process easy. However, you can choose to manually create them instead, according to the instructions detailed here.

If AppsFlyer generates the folders, the only additional information required is the name you want to give the source folder. (This is the top-level folder in which you update the source each time you want to use it for running a new report version.) You can also indicate whether you want the source folder to be created underneath a parent folder, often named input.

To complete specifying a source location in a cloud storage bucket, enter the source folder name.

  • By default, the displayed source folder name:
    • Is based on the name you gave the source. You can change the folder name to meet your needs, so long as it complies with the DCR naming requirements.
    • Indicates that it will be generated within a parent folder named input. This folder serves as the parent folder for all sources you upload to the DCR.
      • The input folder is not required, and you can remove it or name it something different, so long as it complies with the DCR naming requirements.
      • Although this folder is not required, having an input folder (or an equivalent folder with a different name) is considered best practice. It is especially recommended when you use the same cloud storage bucket both for uploading data files (input) and receiving reports (output).

 Important!

If you manually created the folder path, make sure the connection and path you enter in the Source location section match the path you manually created.

Local file system

You can also upload source data from a local file. However, this isn't the recommended way to create your source data; it's intended primarily for testing, so you can familiarize yourself with the platform's functionality.

 Note

Data uploaded from a local file is automatically removed after being stored for a maximum of 7 days.

Step 3: Configure source structure

Prepare and organize the source data, test it, and save the configured data:

  1. Load the source fields
  2. Configure the source fields
  3. Verify EU user inclusion
  4. Test the source data
  5. Save the source

1. Load the source fields

Use the instructions below according to the source location:

Data warehouse sources

To load fields from a source located in a data warehouse (BigQuery or Snowflake), click Load fields from source.

 Important!

If the selected source table does not include the required date and version columns, you will receive an error.

Cloud storage bucket sources

To load fields from a source located in a cloud storage bucket (Amazon S3 or GCS), you must upload a prototype source file.

For purposes of defining the source structure: 

  • You can upload a prototype version of the source from a local file.
    • If you select this option, AppsFlyer always creates the source folder path automatically.

- or -

  • You can upload a prototype version of the source file directly from its connection.
    • If you select this option, there's one additional choice to make:
      • Allow AppsFlyer to automatically create the source folder structure; or
      • Create the source folder structure manually

To upload your prototype source file, follow the instructions for the option you chose (local file, connection with automatic folder creation, or connection with manual folder creation). For example, to upload from a local file:
  1. In the Source structure section, click Load fields from source.
  2. In the window that opens, select Upload a local file.
  3. Specify the CSV or GZIP file you want to upload, then click OK.

2. Configure the source fields

After loading the source fields, each field (column) is displayed with a field type. Review each field and match it with the appropriate data type from the drop-down list beside it. Consider the following:

 Considerations

  • When both parties share their source data, at least one field must be set as an identifier to enable user-level data to match across the corresponding sources. An identifier is a field that uniquely identifies an app user (for example, CUID, AppsFlyer ID, or hashed email); a hashing sketch follows these considerations.
  • Although configuring each of the uploaded source fields (columns) isn't mandatory, it is important for categorization, effective data interpretation, aiding audience creation, suggesting insights, and effectively facilitating validations.
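
If you use hashed emails as an identifier, both parties must hash them identically for records to match. Below is a minimal sketch; SHA-256 with trim-and-lowercase normalization is an assumption rather than a DCR requirement, so confirm the exact scheme with your collaborator.

```python
import hashlib

# A sketch only: SHA-256 and trim/lowercase normalization are assumptions;
# agree on the exact hashing scheme with your collaborator so values match.
def hash_email(email: str) -> str:
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

print(hash_email(" User@Example.com "))  # both parties get the same value
```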

To remove a field:

  • Hover over the right side of the field you want to remove, and click the dustbin icon that appears.

To add fields manually:

This option allows you to include a field in the audience that isn't currently present in the data source.

  1. Click + New field. An empty field is added.
  2. Enter the name of the field and select its type.

Reload source fields

If the configuration of one of your data files has changed, you can update the source file to reflect the changes.

 Note

Reloading the source resets the field names and types in the list to match the updated file, overwriting your existing configuration.

To reload updated fields from a file:

  1. Click Reload fields.
  2. Select the file location.
    • For a local file: Upload the file.
    • For a file from your cloud service: Click Load from cloud bucket and follow the instructions. 
  3. Click OK. The updated fields are now displayed.

3. Verify EU user inclusion

  • Select Yes or No to the question: Does your source include European users, to which EU DMA regulations apply?

Learn more about privacy regulations for the EU Digital Markets Act.

4. Test the source data

  • [Optional] Click Test to check for errors in the format or validity of the source fields.

5. Save the source

  • Click Save to save the source.

After confirming, the new source is added under the Sources tab.

 Note

If you uploaded the source from a local file, saving the source triggers the automatic creation of the folder structure, and the displayed confirmation message includes a link to the source folder.

Manage your sources

The sources you've created are displayed in the Sources tab. From here, you can edit the source name and structure, share it with a collaborator, and delete it—provided it's not already being used in an audience.

Edit a source

  1. Go to the Sources tab of the Data Clean Room.
  2. In the list of sources, hover over the source you want to edit, and click the edit icon at the end of the row.
  3. On the Edit source page, edit the relevant fields, as detailed below. 
  4. Click Save.

Edit the source name

When editing the source name, make sure to follow these naming requirements.

Edit the source location

  1. From the Edit source page > Source location, select a different data connection.
  2. Select the relevant location details.
  3. [Optional] Test the source.
  4. Click Save.

Edit the source structure

  1. Go to the field name and type and make the necessary changes: Change the field name or update its type.
  2. Click Save.

 Important!

Don't forget to make corresponding changes reflecting the new source structure in any reports for which this source is used:

  • Fields that were removed, uncategorized, or changed from their previous categories are automatically removed from any reports in which they are used.
  • Newly added or categorized fields are not automatically included in existing reports until you edit report definitions to include them.

Delete a source

You can delete any source that isn't being used in an audience. If a source is in use, a notification specifies the audiences using it; you must delete those audiences before you can delete the source. Sources shared with you can be deleted only by the source owner.

  1. Go to the Sources tab of the Data Clean Room.
  2. In the list of sources, hover over the row of the source you want to delete.
  3. Click the delete icon at the right side of the row.
  4. In the dialog, click Delete to confirm.

Share a source

To share your source with a collaborator and grant them permissions:

  1. Go to the Sources tab of the Data Clean Room.
  2. From the list of sources, hover over the source you want to share, and click the sharing icon at the end of the row.
  3. Enter the collaborator's email address, and click Next.
  4. Select the relevant sharing permissions, as detailed below.
  5. Click Save & send.

Reference

Manually creating a storage bucket folder structure (relevant only if you choose to do so)

In general, it's easiest to allow AppsFlyer to automatically generate the required folder structure as part of the source creation process. However, if you wish to create these folders manually, you can do so as follows.

Create a DCR key folder

To ensure maximum security, the folder directly beneath the bucket (the "DCR key folder") must be named with the 8-character, alphanumeric DCR key assigned to your account (for example, 01bcc5fb). Note that this is different from any other password or key associated with your AppsFlyer account.

The DCR key folder is generally created manually using the interface of your selected cloud service.

To get your account's DCR key:

  • Click DCR key at the top of the main DCR page.

dcr_key_button.png

After creating the DCR key folder, your bucket/folder structure would look something like this:

dcr_file_structure_dcr_key_folder.png

Top-level input folder

Though it is not required, best practice is to create a top-level input folder directly beneath the DCR key folder. This folder will be dedicated to files you upload to the DCR.

The top-level input folder is generally created manually using the interface of your selected cloud service.

  • This practice is especially recommended when you are using the same bucket both for uploading data files (input) and receiving reports (output).
  • You can name this folder anything you want, so long as it complies with the DCR naming requirements. For ease of identification, it is usually named input/.

After creating the top-level input folder, your bucket/folder structure might look something like this:

dcr_file_structure_input_folder.png

Second-level folder for each data source

You can regularly upload different data source files to the DCR for processing. Each of these data sources must be assigned a separate folder ("data source folders").

So, for example, if you plan to upload 2 files to the DCR for processing every day: BI-data.csv and CRM-data.gzip, you would assign each of these data sources a folder. You could choose to call these folders BI-data/ and CRM-data/.

The data source folders are generally created manually using the interface of your selected cloud service.

After creating 2 data source folders, your bucket/folder structure might look something like this:

dcr_file_structure_source_folders.png

Under each data source folder, nested subfolders by date and version must be created each time the source is updated.
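
For illustration, here's a minimal sketch that creates this structure in Amazon S3 using boto3. The bucket name and source folder names are illustrative, and 01bcc5fb stands in for your account's DCR key; the date and version subfolders follow the DCR naming requirements and aren't shown.

```python
import boto3

# A sketch only: bucket and folder names are illustrative, and "01bcc5fb"
# stands in for your account's DCR key. S3 "folders" are zero-byte objects
# whose keys end with "/".
s3 = boto3.client("s3")
bucket = "my-dcr-bucket"

for key in [
    "01bcc5fb/",                  # DCR key folder
    "01bcc5fb/input/",            # top-level input folder (best practice)
    "01bcc5fb/input/BI-data/",    # one folder per data source
    "01bcc5fb/input/CRM-data/",
]:
    s3.put_object(Bucket=bucket, Key=key)
# Date and version subfolders under each data source folder are created per
# the DCR naming requirements each time the source is updated (not shown).
```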

Sharing permissions

Grant the collaborator permission to use your source data. View permissions granted to you by a collaborator on the source they shared with you.

Grant permissions for your source

Turn on any of the following permissions to grant the collaborator access to your source data:

  • Query data: Query the source data using the DCR Dynamic Query.
  • Build audience: Segment and combine datasets to create an audience you would like to target, using the audience builder tool.
    • Can use only overlapping data: Build audiences using only intersecting data found in both sources (yours and the collaborator’s).
    • Can use all data: Access is granted to all data in the source.
  • Partner connections: Activate the created audience on any of your media partner platforms.
    • Note: If the collaborator doesn’t have permission to publish or export the audience, it will be sent to you for activation. By sharing user identifiers, the media partners gain data autonomy beyond AppsFlyer's control.
  • File export: Download the created audience as a CSV file or send it to your cloud services.

Add expiration

Set an expiration date for the data sharing:

  1. Click Add expiration.
  2. Select the date of expiration.
  3. Click Save & send.

View permissions on collaborator’s source

To view the permissions granted to you by a collaborator on the source they shared with you:

  1. From the Sources tab of the Data Clean Room, go to the sources that were shared with you. You can use the Shared with me filter at the top of the page.
  2. Under the Permissions column, hover over the row of the source to see its detailed permissions.
view-permissions.png

Privacy regulations: EU Digital Markets Act

Understanding Google's EU user consent policy and its implications

As part of its enforcement of the Digital Markets Act (DMA), Google updated its EU user consent policy as of March 6, 2024. As a Google App Attribution Partner, AppsFlyer made the necessary changes to support the policy requirements, while ensuring that advertisers maximize the value of their Google Ads marketing channels.

 Note

Adding consent fields

When you set up a source intended for audience activation on Google and answer Yes to the question Does your source involve European users subject to EU DMA regulations?, make sure the additional consent fields listed below are included in the source file. This enables AppsFlyer to transfer the necessary information to Google during the activation process.

Additional consent fields and their response values:

  • eea (true/false): Is the user located in the EEA (European Economic Area), to which the DMA applies?
  • ad_personalization (true/false*): Did the user give Google consent to use their data for personalized advertising?
  • ad_user_data (true/false*): Did the user give consent to send their user data to Google?

* When “true”, AppsFlyer includes the user identifiers you've sent for users who gave consent. When “false”, AppsFlyer doesn't include them, as they weren't sent to AppsFlyer.
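
For illustration, here's a minimal sketch of generating a source file that includes the consent fields. The cuid identifier column and all values are hypothetical.

```python
import csv

# A sketch only: the cuid identifier column and all values are illustrative.
rows = [
    ["cuid", "eea", "ad_personalization", "ad_user_data"],
    ["user-001", "true", "true", "true"],    # EEA user who gave full consent
    ["user-002", "true", "false", "false"],  # EEA user who denied consent
]

with open("google-audience-source.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```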

Potential impact on the audience size

The actual and estimated audience size sent to Google may vary based on the number of users granting or denying consent.