Data Locker—data streaming for advertisers

At a glance: Data Locker delivers advertiser data to cloud storage. Advertisers can select between an AppsFlyer-owned bucket (on AWS) or an advertiser-owned bucket (on AWS or GCS).

DataLockerBuckets.png

Related reading: Selecting the right raw data delivery tool

Data Locker

Data Locker is a solution that streams your AppsFlyer data to cloud storage. You then load the data programmatically into your BI systems. 

Data Locker main features 
Feature Description
Cloud storage options

Storage options available: 

  • AWS bucket owned by AppsFlyer (retention 30 days)
  • AWS bucket owned by you (retention controlled by you)
  • GCS bucket owned by you (retention controlled by you)

You can switch from one storage option to a different option at any time. 

Multi app support

Send data of some or all of your apps as needed. If you add apps to the account, they can be added to Data Locker automatically. 

Data is available in data files that are either: 

  • Unified: data of all apps combined in a single folder. Identify the app associated with a given data row using the app_id field in the data file.
  • Segregated: data per app is in separate folders. Identify the app associated with the data using the folder name.
Data freshness
  • Freshness depends on the report type.
  • Continuous data like installs and in-apps is streamed within several hours of the event occurrence.
  • Daily reports are streamed once a day.
Reports unique to Data Locker
  • Unconverted data: Click and impression data of UA and retargeting campaigns are available only via Data Locker. About clicks and impressions
  • SKAdNetwork raw data is available in advertiser-owned storage without the need for a Data Locker subscription. 

Reports available in Data Locker

Core attribution reports—UA and retargeting
Category | Report type (topic) | Data freshness* | Organic/Non-organic | Unique to Data Locker
User acquisition | Clicks | 6-hour lag | N/A | Yes
Retargeting | Clicks | 6-hour lag | N/A | Yes
User acquisition | Impressions | 6-hour lag | N/A | Yes
Retargeting | Impressions | 6-hour lag | N/A | Yes
User acquisition | Installs | 6-hour lag | Both |
User acquisition | In-app events | 6-hour lag | Both |
User acquisition | Attributed ad revenue | Daily+2 | Non-organic |
User acquisition | Organic ad revenue | Daily+2 | Organic |
Retargeting | Retargeting ad revenue | Daily+2 | Non-organic |
Retargeting | Conversions | 6-hour lag | Non-organic |
Retargeting | In-app events | 6-hour lag | Non-organic |
Retargeting | Sessions | 6-hour lag | Both |
User acquisition | Sessions | 6-hour lag | Both |
User acquisition | Uninstalls | Daily-uninstall | Non-organic |
User acquisition | Organic uninstalls | Daily-uninstall | Organic |
Reinstalls | Reinstalls | 6-hour lag | Non-organic |
Reinstalls | Organic reinstalls | 6-hour lag | Organic |
Protect360
Report type (topic) | Data freshness*
Blocked installs | 6-hour lag
Blocked in-app events | 6-hour lag
Blocked clicks | 6-hour lag
[AG*] Post-attribution installs | Daily
SKAdNetwork [A Data Locker subscription isn't required if you send these reports to your own bucket]
Data freshness: Daily 
Report type (topic)
[FF*] Postbacks
[FF*] Installs
[FF*] Redownloads
[FF*] In-app events
People-Based Attribution
Data freshness: Daily
Report type (topic)
[FF*] Website visits
[FF*] Website events
[FF*] Website-assisted installs
[FF*] Conversion Paths

* Key to abbreviations

[FF] Report fields are fixed by AppsFlyer. They are not related to the fields selected for inclusion in reports.

[AG] Agency transparency not supported.

6-hour lag: Data is separated into arrival-hour folders, that is, the hour the event was made available to Data Locker for streaming. For events reported continuously, Data Locker folders are streamed about six hours after the actual event time. The lag time is the same irrespective of the app-specific time zone.

Daily: Reports are streamed to the h=23 folder of the day the data relates to and are typically available by 10:00-12:00 UTC on the following day. For example, the report for data generated during Monday is in the Monday h=23 folder. The data is available after 10:00 UTC on Tuesday. 

Daily-uninstall: These reports are prepared daily, are usually available by 10:00-12:00 UTC, and are most often streamed to the h=2 folder, meaning the h=2 folder contains uninstalls reported on the previous day. However, the data can be streamed to a later folder. As such, your import process should read the data of all folders, h=0 through h=23 and h=late, in the uninstall topic. For example, the report for data generated during Monday is in the Tuesday h=2 folder. The data is available after 10:00 UTC on Tuesday. 

Daily+2: Ad revenue data is available after 2 days, meaning that data generated during Monday becomes available in the Monday h=23 folder after 06:00 UTC on Wednesday.
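
The freshness classes above determine which folder holds a given day's data and roughly when it becomes readable. The following is a minimal sketch of that mapping in Python; the folder names follow the structure described in the next section, the availability hours are the typical values quoted above rather than guarantees, and the function name is illustrative.

from datetime import date, datetime, time, timedelta, timezone

def daily_report_location(event_date: date, freshness: str):
    """Return (folder date, h folder, approximate availability in UTC) for the daily freshness classes."""
    if freshness == "daily":            # for example, Protect360 post-attribution installs
        return event_date, "h=23", datetime.combine(event_date + timedelta(days=1), time(10), timezone.utc)
    if freshness == "daily+2":          # ad revenue reports
        return event_date, "h=23", datetime.combine(event_date + timedelta(days=2), time(6), timezone.utc)
    if freshness == "daily-uninstall":  # uninstalls: the next day's folder, usually h=2 (read the whole day to be safe)
        return event_date + timedelta(days=1), "h=2", datetime.combine(event_date + timedelta(days=1), time(10), timezone.utc)
    raise ValueError(f"unknown freshness class: {freshness}")

# Monday's ad revenue lands in Monday's h=23 folder and is readable from Wednesday, about 06:00 UTC.
print(daily_report_location(date(2024, 1, 1), "daily+2"))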

Data storage architecture

Overview

Data is streamed to the storage option selected by you. The storage is either owned by AppsFlyer on AWS or owned by you on AWS or GCS. You can switch from one storage option to another at any time. The change occurs within hours. 

Within the storage, data is organized in a hierarchical folder structure, illustrated in the figure that follows, by report type, date, and time.

DLFolderOVerview.png

Data of a given report is contained in the hour (h) folders associated with that report.

  • The number of hour folders depends on whether the report streams hourly or daily.
  • Data files are GZ-compressed files containing CSV content.
  • The CSV files have a fixed column structure.
  • The column structure of UA and retargeting reports is identical. This means you can use similar data loading procedures for different report types. The actual fields in these reports are selected by you. A sketch comparing the headers of two report types follows this list.
  • Reports marked FF in the previous section don't adhere to the common column structure. 
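
Because the column structure is shared, one loader can serve several report types. The snippet below is a minimal sketch of reading the header row from two downloaded part files and confirming they match; the local file paths are hypothetical.

import csv
import gzip

def read_header(path: str) -> list[str]:
    """Read the header row (field names) from a GZ-compressed Data Locker part file."""
    with gzip.open(path, mode="rt", newline="") as f:
        return next(csv.reader(f))

# Hypothetical local copies of part files from two different report types.
installs_header = read_header("downloads/installs/part-00000.gz")
retargeting_header = read_header("downloads/retargeting-in-app-events/part-00000.gz")

# UA and retargeting reports share one column structure, so the headers should match.
assert installs_header == retargeting_header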

Folder structure

Folder Description 
data-locker-hourly

DLHourly.png

  • The data-locker-hourly folder contains the report topic folders. Folders above this level depend on bucket ownership and cloud service provider.

 Examples of folder structure based on bucket owner and cloud provider

    • AppsFlyer bucket: <af-ext-reports>/<unique_identifier>/<data-locker-hourly>
    • Your AWS bucket: <af-datalocker-your folder name>/<data-locker-hourly>
    • Your GCS bucket: <data-locker-hourly>
t (topic) Report type relates to the subject matter of the report. 
dt (date)

This is the related data date. In most cases, this means the date the event occurred. 

h (hour)

The h folders relate to the time the data was received by AppsFlyer. For example, install events received between 14:00-15:00 UTC are streamed to the h=14 folder. Note! There is a lag of about 6 hours between the time the data arrives in AppsFlyer and the time the h folder is streamed to Data Locker. For example, the h=14 folder is streamed about six hours after that hour ends. 

Folder characteristics:

  • There are 24 h folders numbered 0-23. For example, h=0, h=1, and so on. 
  • In addition, a late folder contains events of the preceding day arriving after midnight (in other words, events that arrive between 00:00–02:00 UTC of the following day). For example, if a user installs an app on Monday at 08:00 and the event arrives on Tuesday at 01:00, the event is recorded in Monday's late folder. 
  • Data arriving after 02:00 is recorded in the folder of the actual arrival date and time. 
  • You must consume data in the late folder; it isn't contained in any other folder. A sketch of the folders to read for one full day follows this list. 
  • _temporary folder: In some cases, we generate a temporary folder within an h folder. Disregard temporary folders and subfolders. Example: /t=impressions/dt=2021-04-11/h=18/_temporary.
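
Putting these rules together, a loader that wants one full day of a continuously streamed report reads h=0 through h=23 plus h=late, and skips any _temporary subfolders. The following is a minimal sketch of building that folder list; the helper name is illustrative and the root prefix is a placeholder that depends on your bucket, as described above.

def day_prefixes(topic: str, dt: str, root: str = "{home-folder}/data-locker-hourly") -> list[str]:
    """Build the folder prefixes covering one day of a continuously streamed report."""
    hours = [f"h={h}" for h in range(24)] + ["h=late"]
    return [f"{root}/t={topic}/dt={dt}/{h}/" for h in hours]

for prefix in day_prefixes("installs", "2021-04-11"):
    # When listing each prefix, ignore any _temporary subfolder and its contents.
    print(prefix)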

App segregation

Data is provided either in unified data files containing the data of all selected apps, or segregated into folders by app. The segregation is within the h folder, as described in the table that follows.
Segregation type Description 
[Default] Unified

Data for all apps is provided in unified data files. When you consume the data, use the row-level app_id field to distinguish between apps.

Example of data files in the h=2 folder:

UnifiedByApp.png

The data file naming convention is part-nnnnn.gz, where: 

  • nnnnn is a part number in the range 00000-99999. For example, part-00000, part-00001, part-00002, and so on.
  • Part numbers aren't necessarily consecutive.
  • In your data consumption process ensure that:
    • You begin to consume data only after the _SUCCESS flag is set.
    • You consume all files in the folder having a .gz extension.
Segregated by app

The folder contains sub-folders per app. Data files for a given app are contained within the app folder. In the figure that follows, the h=19 folder contains app folders. Each app folder contains the associated data files.

DLSegregateByApp.png

In each app folder, the data file naming convention is part-nnnnn-string.csv.gz, where: 

  • nnnnn is a part number in the range 00000-99999. For example, part-00000, part-00001, part-00002, and so on.
  • Part numbers aren't necessarily consecutive.
  • In your data consumption process ensure that:
    • You begin to consume data only after the _SUCCESS flag is set. Note! The flag is set at the h level and not at the app_id level. 
    • You consume all files in the folder having a .gz extension.

Limitation: This option is not available for People-Based Attribution reports.
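
Whichever segregation type you use, the consumption logic is similar: wait for the _SUCCESS flag at the h level, read every .gz part file, and identify the app either from the row-level app_id column (unified) or from the app folder name (segregated by app). The helper below is a minimal sketch over a listing of object keys; the function name and grouping labels are illustrative, not an AppsFlyer API.

def partition_h_folder(keys: list[str], segregated: bool) -> dict[str, list[str]]:
    """Group the part files in one h folder by app.

    For the unified layout the app is taken later from the row-level app_id
    column; for the segregated layout it comes from the sub-folder name.
    """
    if not any(key.endswith("/_SUCCESS") for key in keys):
        raise RuntimeError("h folder is not complete yet: _SUCCESS flag missing")

    parts: dict[str, list[str]] = {}
    for key in keys:
        if not key.endswith(".gz"):
            continue  # skip the flag file and anything that isn't a part file
        if segregated:
            app = key.rsplit("/", 2)[-2]  # .../h=19/<app folder>/part-nnnnn-string.csv.gz
        else:
            app = "all apps (use the app_id column per row)"
        parts.setdefault(app, []).append(key)
    return parts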

Data files

Content | Description
Completion flag

The last file (completion) flag is set when all the data for a given h folder has been streamed. 

  • Don't read data in a folder before verifying that the _SUCCESS flag exists.

  • The _SUCCESS flag is set even in cases where no data is written to the folder, meaning the folder is empty.

  • Note! In the case of segregation by app, the flag is set in the h folder and not in the individual app folders. See the figures in the previous section. 

File format

Part files are zipped using gz. After unzipping:

    • Unified: the files have no extension. Each file has a header row containing the column (field) names.
    • Segregated by app: the files have a .csv extension. Each file has a header row containing the column (field) names.
Column sequence

The sequence of fields in reports is always the same. When we add new fields, they are added to the right of the existing fields. 

In this regard: 

  • The column structure of UA and retargeting reports is identical. This means you can use similar data loading procedures for different report types. The actual fields streamed are selected by you. 
  • Reports marked FF in the report availability section don't adhere to the common column structure. 
  • The field meanings are detailed in the raw data dictionary.
Field population considerations

Blank or empty fields: Some fields are populated with null or are empty, meaning that in the context of a given report there is no data to report. Typically, null means the field is not populated in the context of a given report and app type. Blank ("") means the field is relevant in its context, but no data was found to populate it. 

Time zone and currency

App-specific time zone and currency settings don't affect the data in Data Locker. Data is always as follows: 

  • Time zone: Date and hour data are in UTC.
  • Currency: The field event_revenue_usd is in USD.

Values with commas: Values containing commas are enclosed between double quotes `"`, for example, `"iPhone6,1"`.
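
These conventions matter when you parse the rows: a standard CSV reader already handles quoted values that contain commas, and your loader should treat the literal string null differently from an empty string. The following is a minimal sketch under those assumptions; the file path is hypothetical, and the column names follow the raw data dictionary (app_id and event_revenue_usd appear above, event_time is used here as an example).

import csv
import gzip

def load_rows(path: str):
    """Yield rows as dicts, mapping the literal string 'null' to None and keeping '' as an empty value."""
    with gzip.open(path, mode="rt", newline="") as f:
        for row in csv.DictReader(f):  # quoted values such as "iPhone6,1" are handled by the csv module
            yield {k: (None if v == "null" else v) for k, v in row.items()}

for row in load_rows("downloads/installs/part-00000.gz"):
    # Times are in UTC; revenue is reported in USD in event_revenue_usd.
    print(row.get("event_time"), row.get("app_id"), row.get("event_revenue_usd"))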


Storage options—AppsFlyer vs. customer (AWS or GCS)

 Caution!

If you are using the Customer bucket solution: 

  • Ensure that you comply with data privacy regulations like GDPR and ad network/SRN data retention policies.
  • Don't use the Customer bucket solution to send data to third parties. 
  • Data is written to a bucket of the owner you choose:
    • AppsFlyer bucket
    • Customer bucket—AWS or GCS
  • You can change the bucket selection at any time. If you change buckets:
    • We start writing to the newly selected bucket within one hour.
    • We continue writing to the existing bucket during a transition period of 7 days, allowing you to align your data consumption processes. The transition period expiry time displays in the user interface.
 | AppsFlyer bucket (AWS) | Customer bucket (AWS) | Customer bucket (GCS)
Bucket name | Set by AppsFlyer | Set by you. Must have the prefix af-datalocker-. Example: af-datalocker-your-bucket-name | No restriction
Bucket ownership | AppsFlyer | Customer | Customer
Storage platform supported | AWS | AWS | GCS
Credentials used by you to access data | Available in the Data Locker user interface to the admin | Not known to AppsFlyer. Use your AWS credentials. | Not known to AppsFlyer. Use your GCS credentials.
Data retention | Data is deleted after 30 days | Your responsibility | Your responsibility
Data deletion requests | AppsFlyer responsibility | Your responsibility | Your responsibility
Security | AppsFlyer controls the bucket. The customer has read access. | The customer controls the bucket. AppsFlyer requires GetObject, ListBucket, DeleteObject, and PutObject permissions on the bucket. The bucket should be dedicated to AppsFlyer use; don't use it for other purposes. | The customer controls the bucket. AppsFlyer requires the permissions specified in the GCS configuration article. The bucket should be dedicated to AppsFlyer use; don't use it for other purposes.
Storage space | Managed by AppsFlyer | Managed by you | Managed by you
Access control using VPC endpoints with bucket policies | Not applicable | [Optional] If you implement VPC endpoint security at the bucket level in AWS, you must allowlist AppsFlyer servers. | Not applicable
SKAdNetwork reports | Not available without a Data Locker subscription | Available if you have a raw data subscription, irrespective of the Data Locker subscription | Available if you have a raw data subscription, irrespective of the Data Locker subscription

Notice to security officers in the case of customer-controlled buckets

At first glance, it seems that we are asking for a lot of permissions, but consider the following:

  • The bucket is for the sole use of AppsFlyer. There should be no other entity writing to the bucket.
  • You can delete data in the bucket 25 hours after we write the data.
  • Data we write to the bucket is a copy of data already in our servers. The data continues to be in our servers in accordance with our retention policy. 
  • For technical reasons, we sometimes need to delete and rewrite the data. For this reason, we require delete and list permissions. Neither list nor delete is a security risk for you. In the case of list, we are the sole entity writing to the bucket. In the case of delete, we are able to regenerate the data. 

Procedures

Set up Data Locker

Use this procedure to set up Data Locker. The initial setup and any subsequent changes to Data Locker settings take up to 3 hours to take effect. 

You can use any of the following buckets:

  • AWS bucket owned and provided by AppsFlyer (complete the procedure that follows). Note: This option is not available if you get SKAdNetwork raw data without a Data Locker subscription. 
  • Your AWS bucket.
  • Your GCS bucket. 

Prerequisite for setting up your bucket:

If you are setting up Data Locker using your bucket, meaning a bucket owned by you, complete the relevant AWS or GCS procedure before continuing.

When you migrate from one bucket to another, we'll keep sending data to the existing bucket for a transition period of 7 days. Note! You must update your data consumption processes before the expiry of the transition period. The transition period can be reset to 7 days (extended) by reverting the bucket type, saving, and then selecting your bucket again. 

 

AppsFlyerAdmin_us-en.png To set up Data Locker:

  1. The setup must be performed by the account admin. 
  2. In AppsFlyer, go to Integration > Data Locker. 
  3. Choose the integration method. Select one of the following:
    • AppsFlyer AWS bucket. Continue to step 4. 
    • Customer AWS bucket.
      1. Enter your AWS bucket name; don't include the af-datalocker- prefix.
      2. Click Test.
      3. Verify that no error message displays indicating that the bucket path is invalid.
  4. Select folder structure (data segregation):
    • [Default] Unified 
    • Segregated by app
  5. Select one, several, or all apps. Select all to automatically include apps you add in the future.
  6. Click Apply
  7. [optional] Media Sources: Select one or more Media Sources to include in reports.
    • Default=All. This means that media sources added in the future are automatically added.
  8. Select one or more report types.
  9. [optional] In-app events: Select the in-app events to include. If you have more than 100 in-app event types, you can't search for them. Enter their names exactly to select them. 
    • Default=All. This means that in-app events added in the future are automatically added.
  10. Click Apply
  11. [optional] Fields (default=All): Select the fields to include in the reports. Note: We add fields from time to time; take this into account in your data import process.
  12. Click Save Configuration. One of the following occurs:
    • If you selected AppsFlyer AWS bucket:
      • A dedicated AWS bucket is created. The bucket credentials display.
      • The bucket is accessible using the credentials. The credentials provide you with read-only access to the bucket.
    • If you selected Customer AWS bucket:
      • Data will start being written to your AWS bucket within 3 hours.

Set up Data Locker—your AWS S3 bucket

The procedure in this section must be performed by your AWS admin.

You can delete files from Data Locker 25 or more hours after they were written. Don't delete them any earlier. 

Background information for the AWS admin: 

  • AppsFlyer writes your data to an S3 bucket owned by you. To enable this:
    • Create a bucket with a name like af-datalocker-mybucket. The prefix af-datalocker- is mandatory. The suffix is free text.
    • We suggest af-datalocker-yyyy-mm-dd-hh-mm-free-text, where yyyy-mm-dd-hh-mm is the current date and time and the free text is anything you want, as depicted in the figure that follows.
    User interface in the AWS console

    MyBucket.jpg

  • Having created the bucket, grant AppsFlyer permissions using the procedure that follows. 

To create a bucket and grant AppsFlyer permissions: 

  1. Sign in to the AWS console.
  2. Go to the S3 service.
  3. To create the bucket:
    1. Click Create bucket.
    2. Complete the Bucket name as follows: Start with af-datalocker- and then add any other text as described previously.
    3. Click Create bucket.
  4. To grant AppsFlyer permissions:
    1. Select the bucket. 
    2. Go to the Permissions tab. 
    3. In the Bucket policy section, click Edit. 
      The Bucket policy window opens.
    4. Paste the following snippet into the window.
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "AF_DataLocker_Direct",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
            },
            "Action": [
              "s3:GetObject",
              "s3:ListBucket",
              "s3:DeleteObject",
              "s3:PutObject"
            ],
            "Resource": [
              "arn:aws:s3:::af-datalocker-my-bucket",
              "arn:aws:s3:::af-datalocker-my-bucket/*"
            ]
          }
        ]
      }
      
  5. In the snippet, replace af-datalocker-my-bucket with the name of the bucket you created. A scripted alternative to applying the policy is sketched after this procedure.

  6. Click Save changes.

  7. Complete the Setup Data Locker procedure.
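
If your AWS admin prefers to script the grant instead of pasting the policy in the console, the same policy can be applied with the AWS SDK. The following is a minimal sketch using boto3; the bucket name is a placeholder, the policy is the snippet above, and the call assumes your local AWS credentials are allowed to set bucket policies.

import json

import boto3

BUCKET = "af-datalocker-my-bucket"  # replace with the bucket you created

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AF_DataLocker_Direct",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
            },
            "Action": ["s3:GetObject", "s3:ListBucket", "s3:DeleteObject", "s3:PutObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
        }
    ],
}

# Applies the bucket policy; credentials come from your local AWS configuration.
boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))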

Reset credentials

The admin can reset the AppsFlyer bucket credentials at any time. Note! If you reset the credentials you must update your data import scripts with the updated credentials.

AppsFlyerAdmin_us-en.png To reset the credentials:

  1. In AppsFlyer, go to Integration > Data Locker. 
  2. In the Credentials section, click Reset credentials.
    A confirmation window displays.
  3. Click Reset.
  4. Wait (about 20 seconds) until the Credentials successfully reset message displays.
    The updated credentials are available.

Additional information

Traits and Limitations

Traits
Trait Remarks 
Ad networks Not for use by ad networks. 
Agencies Not for use by agencies.
App-specific time zone Not applicable. Data Locker folders are divided into hours using UTC, and event times are in UTC. Convert the times to any other time zone as needed. Irrespective of your app-specific time zone, the lag from event occurrence until it is recorded in Data Locker remains the same, about 6 hours. 
App-specific currency  Not supported
Size limitations Not applicable
Data freshness Files are updated hourly with a lag of six hours from the event time.
Historical data Not supported. Event data is sent after configuring Data Locker. If you need historical data use Pull API. 
Team member access Team members cannot configure Data Locker. 
Single app/multiple apps Multi-app support. Data Locker is configured at the account level.

Troubleshooting

  • Symptom: Unable to retrieve data using AWS CLI
  • Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
  • Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can be caused by having multiple or invalid credentials on your machine. 
  • Solution:
    1. Use a different method (not the CLI), like Cyberduck, to access the bucket. Do this to verify that the credentials you are using work. If you are able to connect using Cyberduck, this indicates an issue with the credentials cache. 
    2. Refresh the AWS credentials cache.
      mceclip0.png 

Data retrieval

Use your preferred AWS data retrieval tool, the AWS CLI, or one of the tools described in the sections that follow. Note! The exact instructions are suitable for AppsFlyer-owned buckets. Adjust the instructions as needed if you are connecting to your own bucket. 

AWS CLI

Before you begin:

  • Install the AWS CLI on your computer.
  • In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel; it is needed to perform this procedure. 

To use AWS CLI:

  1. Open the terminal. To do so in Windows, press <Windows>+<R>, enter cmd, and click OK.
    The command line window opens.
  2. Enter aws configure
  3. Enter the AWS Access Key as it appears in the credentials panel.
  4. Enter your AWS Secret Key as it appears in the credentials panel.
  5. For Default region name, enter eu-west-1.
  6. For Default output format, press Enter to accept the default (None).

Use the CLI commands that follow as needed.

In the following commands, the value of {home-folder} can be found in the Data Locker credentials panel.

To list the folders in your bucket:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/

Listing files and folders

There are three types of folders in your Data Locker bucket:

  • Report Type t=
  • Date dt=
  • Hour h=

To list all the reports of a specific report type:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/

To list all the reports of a specific report type for a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17

To list all the reports of a specific report type, in a specific hour of a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23

To download files for a specific date:

aws s3 cp s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9/part-00000.gz ~/Downloads/
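
The same retrieval can be scripted with boto3 instead of the CLI. The following is a minimal sketch that lists and downloads the part files for one hour; the access key, secret key, and home folder are placeholders for the values shown in the credentials panel.

import boto3

# Values from the Data Locker credentials panel (placeholders shown here).
s3 = boto3.client(
    "s3",
    aws_access_key_id="YOUR-AWS-ACCESS-KEY",
    aws_secret_access_key="YOUR-AWS-SECRET-KEY",
    region_name="eu-west-1",
)

BUCKET = "af-ext-reports"
PREFIX = "{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9/"

# List the part files for one hour and download them to the working directory.
response = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
for obj in response.get("Contents", []):
    if obj["Key"].endswith(".gz"):
        s3.download_file(BUCKET, obj["Key"], obj["Key"].rsplit("/", 1)[-1])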

Cyberduck

Before you begin:

  • Install the Cyberduck client.
  • In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel. You will need this information when you configure Cyberduck. 

To configure Cyberduck:

  1. In Cyberduck, click Action.
  2. Select New Bookmark. The window opens.
  3. In the first field (marked [1] in the screenshot that follows), select Amazon S3.

    DataDuckSmall2.png

  4. Complete the fields as follows:
    • Nickname: free text
    • Server: s3.amazonaws.com
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel in AppsFlyer
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
    • Path: {Bucket Name}/{Home Folder} For example: af-ext-reports/1234-abc-ffffffff
  5. Close the window. To do so, use the X in the upper-right corner of the window.
  6. Select the connection.
    The data directories are displayed.

Amazon S3 browser

Before you begin:

  • Install the Amazon S3 Browser.
  • In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel; it is needed to perform this procedure. 

To configure the Amazon S3 Browser:

  1. In the S3 browser, click Accounts > Add New Account.
    The Add New Account window opens.

    mceclip0.png

  2. Complete the fields as follows:
    • Account Name: free text. 
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel. 
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
    • Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
    • Select Use secure transfer. 
  3.  Click Save changes.
  4. Click Buckets > Add External Bucket.
    The Add External Bucket window opens.

    mceclip2.png

  5. Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for bucket name and home folder appear in the credentials window. 
  6. Click Add External bucket.
    The bucket is created and displays in the left panel of the window.
    You can now access the Data Locker files. 