Data Locker for partners

At a glance: Data Locker for partners (ad networks and agencies) delivers app data to the partner's storage in AWS, GCS, or BigQuery.


Data Locker for partners

Data Locker for partners delivers app data to cloud storage. Advertisers set permissions that allow AppsFlyer to share selected data with a given partner. 

Data Locker features

Feature Description
Storage options (cloud)

Storage (bucket) owned by you on:

  • AWS 
  • GCS
  • Yandex
  • BigQuery

About storage options

Multi app support

Supports data of apps that are integrated with you. The advertiser must give permission per report for you to get the data. 

Data format options
  • For bucket cloud storage:
    • CSV
    • Parquet
    • Adobe
  • Data warehouse
Data freshness

Freshness depends on the report type:

  • Hourly: Data generated continuously
  • Daily: Some reports are prepared on a daily basis and are ready on the following day

Reports available for partners

The reports available and the permissions required to get the reports differ per partner type. However, the Data Locker mechanism, storage options, and settings required are the same irrespective of the partner type. See the articles per partner type as follows: 

Data storage architecture

Overview

Data is written to your selected storage option. You can switch from one option to another at any time. The change occurs within hours. 

Data in the cloud bucket storage is organized in a hierarchical folder structure, according to report type, date, and time. The following figure contains an example of this structure:

DLFolderOVerview.png

Data of a given report is contained in the hour (h) folders associated with that report.

  • The number of hour folders depends on whether the report is streamed hourly or daily.
  • Data is provided in Snappy or GZIP compressed files, or uncompressed files, having Parquet or CSV format.
  • Data files consist of columns (fields).
  • The column structure is defined per report type. 

Folder structure

Folder Description 
data-locker-hourly

DLHourly.png

  • The top-level folder in the bucket depends on the storage provider.
  • The data-locker-hourly folder contains the report topics. 

Examples of folder structure based on bucket owner and cloud provider:

  • Your AWS bucket: <af-datalocker-your bucket prefix>/<generated-home-folder>/<subscription-id>
  • Your GCS bucket: <your bucket name>/<generated-home-folder>/<subscription-id>
t (topic)

The report type. It relates to the subject matter of the report. 
dt (date)

This is the related data date. In most cases, this means the date the event occurred. 

h (hour)

The h folders relate to the time AppsFlyer received the data. For example, install events received between 14:00 and 15:00 UTC are streamed to the h=14 folder. Note! There is a lag of about 6 hours between the time the data arrives in AppsFlyer and the time the h folder is streamed to Data Locker. For example, the h=14 folder is streamed about six hours later, at approximately 20:00 UTC. 

Folder characteristics:

  • There are 24 h folders numbered 0-23. For example, h=0, h=1, and so on. 
  • In addition, a late folder contains events from the preceding day arriving after midnight (in other words, events that arrive between 00:00–02:00 UTC of the following day). For example, if a user installs an app on Monday at 08:00 and the event arrives on Tuesday at 01:00, the event is recorded in Monday's late folder. 
  • Data arriving after 02:00 is recorded in the folder of the actual arrival date and time. 
  • You must use the data in the late folder. It isn't contained in any other folder. 
  • _temporary folder: In some cases, we generate a temporary folder within an h folder. Disregard temporary folders and subfolders. Example: /t=impressions/dt=2021-04-11/h=18/_temporary.
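
The folder rules above can be sketched in code. This is a minimal illustration of one reading of those rules, not AppsFlyer's implementation. The function names are hypothetical, the folder label `h=late` is an assumption (the article only says "a late folder"), and the path pattern `t=<topic>/dt=<date>/h=<hour>` is taken from the `_temporary` example above.

```python
from datetime import datetime, timedelta, timezone

LATE_CUTOFF_HOUR = 2  # arrivals before 02:00 UTC on the following day go to "late"


def folder_for(event_time: datetime, arrival_time: datetime) -> tuple[str, str]:
    """Return the (dt, h) labels for one event. Both datetimes must be timezone-aware."""
    event_day = event_time.astimezone(timezone.utc).date()
    arrival = arrival_time.astimezone(timezone.utc)
    if arrival.date() == event_day:
        # Same-day arrival: dt is the event date, h is the arrival hour.
        return event_day.isoformat(), str(arrival.hour)
    next_day = event_day + timedelta(days=1)
    if arrival.date() == next_day and arrival.hour < LATE_CUTOFF_HOUR:
        # Arrived shortly after midnight: recorded in the event date's late folder.
        return event_day.isoformat(), "late"
    # Arrived after 02:00: recorded under the actual arrival date and hour.
    return arrival.date().isoformat(), str(arrival.hour)


def topic_path(topic: str, dt: str, h: str) -> str:
    """Build the relative folder path below the data-locker-hourly folder."""
    return f"t={topic}/dt={dt}/h={h}"


# Installed Monday 2021-04-12 at 08:00 UTC, received Tuesday at 01:00 UTC.
monday_install = datetime(2021, 4, 12, 8, 0, tzinfo=timezone.utc)
tuesday_arrival = datetime(2021, 4, 13, 1, 0, tzinfo=timezone.utc)
dt, h = folder_for(monday_install, tuesday_arrival)
print(topic_path("installs", dt, h))
```

The topic name `installs` is a placeholder; use the topic names that appear in your bucket.
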
Unified data

Data for all apps is provided in unified data files. When you load the data, use the row-level app_id field to distinguish between apps.

Example of data files in the h=2 folder

UnifiedByApp.png

  • In your data loading process ensure that:
    • You begin to consume data only after the _SUCCESS flag is set.
    • You load all files having a .gz extension.
Completion flag

The last file (completion) flag is set when all of the data for a given h folder has been written. 

  • Don't read data in a folder before verifying that the _SUCCESS flag exists.

  • The _SUCCESS flag is set even in cases where no data is written to the folder, meaning the folder is empty.

Zipping

Files are zipped using gz. After unzipping:

  • The files have no extension.
  • Each file has a header row containing the column (field) names. 
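
The loading rules above (wait for the _SUCCESS flag, load all .gz files, use the header row, and separate apps by app_id) can be sketched as follows. This is a minimal example under stated assumptions: the h folder has already been copied to a local directory, the files are gzip-compressed CSV, and the encoding is UTF-8. The function name `load_hour_folder` is hypothetical.

```python
import csv
import gzip
from pathlib import Path
from typing import Optional


def load_hour_folder(hour_dir: Path) -> Optional[dict[str, list[dict[str, str]]]]:
    """Load one h folder and group its rows by app_id.

    Returns None if the _SUCCESS flag is missing (data is not ready yet).
    Returns an empty dict if the flag exists but the folder holds no data.
    """
    if not (hour_dir / "_SUCCESS").exists():
        return None
    rows_by_app: dict[str, list[dict[str, str]]] = {}
    # glob("*.gz") is not recursive, so files under _temporary are ignored.
    for path in sorted(hour_dir.glob("*.gz")):
        with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
            # DictReader takes the field names from the header row.
            for row in csv.DictReader(handle):
                rows_by_app.setdefault(row["app_id"], []).append(row)
    return rows_by_app
```

Calling `load_hour_folder(Path("dt=2021-04-11/h=2"))` before the flag is set returns None, which a scheduler can treat as "retry later".
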
Column sequence

The sequence of fields in reports is always the same. New fields are added to the right of existing fields. 

Column (field) definitions are defined per report. Check the relevant report article for the description. 

Field population considerations

Blank or empty fields: Some fields are populated with null or are empty. This means that in the context of a given report there is no data to report. Typically null means this field is not populated in the context of a given report and app type. Blank "" means the field is relevant in its context but no data was found to populate it with. 

Time zone and currency:

App-specific time zone and currency settings are disregarded for data provided by Data Locker. As such: 

  • Time zone: Date and hour data are in UTC
  • Currency: The field event_revenue_usd is in USD

Values with commas: Values that contain commas are enclosed in double quotes `"`, for example, `"iPhone6,1"`.

Data files

Data files depend on segregation type.

Content Details
Completion flag

The last file (completion) flag is set when all the data for a given h folder has been written. 

  • Don't read data in a folder before verifying that the _SUCCESS flag exists.

  • The _SUCCESS flag is set even in cases where there is no data to write to a given folder and the folder is empty. 

  • Note! In the segregation by app option, the flag is set in the h folder and not the individual app folders. See the figures in the previous section. 
File types
  • Data is provided in Snappy or GZIP compressed files, or uncompressed files, having Parquet or CSV format.
  • After unzipping, the data files are in Parquet or CSV format according to your settings.
Column sequence (CSV files) 

In the case of CSV files, the sequence of fields in reports is always the same. When we add new fields these are added to the right of the existing fields. 

In this regard: 

  • The column structure of user journey reports is identical. This means you can have similar data loading procedures for different report types. You select the fields contained in the reports. The field meaning is detailed in the raw data dictionary.
  • Reports having an FF notation in the report availability section don't adhere to the common column structure. 
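
Because new fields are only appended to the right, a loader that selects columns by header name keeps working when fields are added. The following is a minimal sketch of that approach. The helper `read_fields` and the column names `event_name` and `new_field` are hypothetical; `app_id` is the only field name taken from this article.

```python
import csv
import io


def read_fields(csv_text: str, wanted: list[str]) -> list[dict[str, str]]:
    """Return only the wanted columns, looked up by header name, for each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # A column that is absent from the header yields an empty string.
    return [{name: row.get(name, "") for name in wanted} for row in reader]


older_file = "app_id,event_name\ncom.example.app,install\n"
newer_file = "app_id,event_name,new_field\ncom.example.app,install,x\n"

wanted = ["app_id", "event_name"]
# The appended column does not change the result for the selected fields.
print(read_fields(older_file, wanted) == read_fields(newer_file, wanted))
```
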
Field population considerations

Blank or empty fields: Some fields are populated with null or are empty. This means that in the context of a given report there is no data to report. Typically null means this field is not populated in the context of a given report and app type. Blank "" means the field is relevant in its context but no data was found to populate it with. 

In the case of the restricted media source, the content of restricted fields is set to null. 

Overall, treat null and blank as equivalent: in both cases, no data is available. 
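
One way to apply this guidance when loading CSV data is to map both representations to a single missing-value marker. The sketch below assumes that null appears in CSV files as the literal text `null`; that representation is an assumption, so check it against your own files. The helper name `normalize` is hypothetical.

```python
from typing import Optional


def normalize(value: Optional[str]) -> Optional[str]:
    """Map null and blank values to None; return any other value unchanged."""
    if value is None:
        return None
    stripped = value.strip()
    # Assumption: null is written as the literal text "null" in CSV files.
    if stripped == "" or stripped.lower() == "null":
        return None
    return value


print([normalize(v) for v in ["null", "", "organic"]])
```
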

Time zone and currency

App-specific time zone and currency settings have no effect on data written to Data Locker. The following apply: 

  • Time zone: Date and hour data are in UTC.
  • Currency: The field event_revenue_usd is in USD.

Values with commas: Values that contain commas are enclosed in double quotes `"`, for example, `"iPhone6,1"`.
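
Because quoted values can contain commas, splitting a line on `,` gives the wrong number of fields. A CSV parser handles the quoting correctly, as this short sketch shows. The column name `device_model` and the sample row are made up for illustration.

```python
import csv
import io

sample = (
    "app_id,device_model,event_revenue_usd\n"
    'com.example.app,"iPhone6,1",1.99\n'
)

# Naive splitting breaks the quoted value into two pieces.
naive_parts = sample.splitlines()[1].split(",")

# The csv module keeps the quoted value intact.
rows = list(csv.DictReader(io.StringIO(sample)))
device_model = rows[0]["device_model"]
revenue_usd = float(rows[0]["event_revenue_usd"])  # always in USD

print(len(naive_parts), device_model, revenue_usd)
```
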

Storage options

  • Data is written to the storage provider of your choice: AWS, GCS, or BigQuery.
  • You can change the storage selection at any time.
  • If you change the storage, the following happens:
    • We start writing to the newly selected storage within one hour.
    • We continue writing to the existing storage during a transition period of 7 days, so data is sent to both the existing and the new storage. The transition period expiry time displays in the user interface. Use the transition period to update your data loading processes. 
    • Changing buckets: The same 7-day transition period applies. Data is sent to both buckets, allowing you to align your data consumption process. 
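
The transition window can be estimated with simple date arithmetic. This sketch is only a planning aid and the function names are hypothetical; the expiry time shown in the user interface is the authoritative value.

```python
from datetime import datetime, timedelta, timezone

TRANSITION_PERIOD = timedelta(days=7)


def transition_end(changed_at: datetime) -> datetime:
    """Estimate when writing to the previous storage stops."""
    return changed_at + TRANSITION_PERIOD


def old_storage_active(changed_at: datetime, now: datetime) -> bool:
    """True while data is still expected in the previous storage."""
    return changed_at <= now < transition_end(changed_at)


changed = datetime(2021, 4, 11, 9, 0, tzinfo=timezone.utc)
print(transition_end(changed).isoformat())
```
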
  Partner-owned storage (GCS, AWS, BigQuery)
Bucket name
  • GCS: No restriction
  • AWS: Set by you. Must have the prefix af-.

Example: af-datalocker-your-bucket-name

Storage owner: Partner 
Storage platform: AWS, GCS, Yandex, BigQuery
Credentials to access data by you: Not known to AppsFlyer. Use credentials provided by the storage provider.
Data retention: Controlled by you
Security

You control the storage. 

  • AWS: AppsFlyer requires GetObject, ListBucket, DeleteObject, and PutObject permissions on the bucket. The bucket should be dedicated to AppsFlyer use. Don't use it for other purposes.
  • GCS
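
For AWS, the four permissions listed above map to the S3 actions `s3:GetObject`, `s3:ListBucket`, `s3:DeleteObject`, and `s3:PutObject`. The sketch below builds a bucket policy document with those actions. It is an illustration only, not the policy from the AppsFlyer setup instructions: the function name is hypothetical, and the principal is a placeholder that must be replaced with the value given in the storage setup procedure. The `af-` prefix check follows the bucket name rule stated earlier in this section.

```python
def bucket_policy(bucket: str, principal_arn: str) -> dict:
    """Build an S3 bucket policy granting the four permissions named above."""
    if not bucket.startswith("af-"):
        raise ValueError("AWS bucket name must have the prefix 'af-'")
    principal = {"AWS": principal_arn}  # placeholder; take the real value from the setup steps
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket applies to the bucket itself.
                "Sid": "AllowList",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions apply to the objects in the bucket.
                "Sid": "AllowObjectAccess",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }
```

Serialize the result with `json.dumps(policy, indent=2)` to review it before applying anything to a real bucket.
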

Notice to security officers

Consider: 

  • The bucket or destination is for the sole use of AppsFlyer. There should be no other entity writing to the bucket.
  • You can delete data in the bucket 25 hours after we write the data.
  • Data we write to the destination is a copy of data already in our servers. The data continues to be in our servers in accordance with our retention policy. 
  • For technical reasons, we sometimes need to delete and rewrite the data. For this reason, we require delete and list permissions. Neither list nor delete is a security risk for you. In the case of list, we are the sole entity writing to the bucket. In the case of delete, we can regenerate the data. 
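
If you clean up the bucket yourself, the 25-hour rule above can be applied as a simple age filter. This is a minimal sketch that assumes you already have a mapping from object key to the time the object was written (for example, from a bucket listing). The function name and the sample keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MIN_AGE_BEFORE_DELETE = timedelta(hours=25)


def deletable_keys(written_at: dict[str, datetime], now: datetime) -> list[str]:
    """Return the keys of objects written at least 25 hours before `now`."""
    return sorted(
        key for key, ts in written_at.items() if now - ts >= MIN_AGE_BEFORE_DELETE
    )


now = datetime(2021, 4, 13, 12, 0, tzinfo=timezone.utc)
objects = {
    "t=installs/dt=2021-04-11/h=18/part-0.gz": datetime(2021, 4, 12, 1, 0, tzinfo=timezone.utc),
    "t=installs/dt=2021-04-13/h=4/part-0.gz": datetime(2021, 4, 13, 10, 0, tzinfo=timezone.utc),
}
print(deletable_keys(objects, now))
```
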

Multiple-connection principles (more than one destination)

In Data Locker you can send some or all of your data to more than one destination (defined in the connection settings). For example, you can send App A data to AWS, and App B data to GCS.

Each connection consists of a complete set of Data Locker settings, including a destination. Connection settings are independent of one another.

In managing your connections, consider:

  • In Data Locker settings, connections are shown in tabs. Each connection has its own settings tab from which you can manage the connection. The default tab is “Data Locker.”
  • To create a new connection:
    1. Click Add connection.
    2. Provide a name for the connection and choose the storage type.
    3. Click Save.
      Once saved, the connection displays next to the default “Data Locker” tab. The icon of each tab represents the storage type.
  • To see connection details, duplicate a connection, or delete a connection, click ⋮ (options).

Procedures

Set up Data Locker

Use this procedure to set up Data Locker. Changes to settings take effect within 3 hours. 

Prerequisite:

Complete one or more of the following storage procedures:

To set up Data Locker:

  1. Log in to your AppsFlyer partner dashboard.
  2. Go to:
    • Advertisers: Report > Data Locker.
    • Marketing partners: Click the account menu > Data Locker.
  3. Follow the Data Locker setup instructions steps 3-16.

Additional information

Traits and Limitations

Trait Remarks 
App-specific time zone Not applicable
App-specific currency Not supported
Size limitations Not applicable
Data freshness Data is updated according to the specific report data freshness detailed in this article.
Historical data Not supported
Team member access Team members cannot set up Data Locker.