Data Locker V2.0

At a glance: Data Locker is a solution for storing and retrieving raw data suited to app owners with large amounts of data. Data Locker is an alternative to exporting data using Pull API. You can choose to view and extract raw data by report type, days, and hours. You can use automated scripts to pull and process the data, import it into your BI systems or make it available on demand.


Selecting the right data delivery tool

Data Locker

Main features

  • Apps: supports multiple apps
  • Simplicity: data is deposited into an Amazon S3 bucket, and AppsFlyer manages the storage requirements
  • Reliability: data is stored in AWS which ensures data persistence
  • Flexibility: choose what data you want to include in the reports
  • Granularity: data is segmented into report types, days and hours
  • Unique data: get more data such as Organic Installs, In-App Events, Sessions, Clicks, and Impressions
  • Accessibility: pull data when required

Data segmentation

Data in Data Locker is segmented into folders as follows:

  • Report types
  • Days
  • Hours

This means that for each report type, on a given day, the data is separated into folders by arrival hour, not by event time. For example, ../t=installs/dt=2019-01-17/ contains 25 folders: 24 folders, one for each hour of the day from 0 to 23, and an additional folder for data that arrives late.

Data freshness: Data is separated into arrival hour folders. The arrival hour is the hour in which AppsFlyer processed the event. The Data Locker folder is written within six hours of processing.

Implementing Data Locker

Configuring Data Locker

Prerequisite: The account admin must configure Data Locker. 

To configure Data Locker:

  1. In AppsFlyer, go to Integration > Data Locker.
  2. Select one or more apps. 
  3. Click Apply
  4. (optional) Media Sources (default=All): Select one or more Media Sources to include in reports. 
  5. Click Apply
  6. Select the report types. Choose from:
    • Acquisition: Clicks, Impressions, Installs, In-App Events, Sessions, Uninstalls
    • Retargeting: Retargeting Clicks, Retargeting Impressions, Retargeting Conversions, Retargeting In-App Events
    • Protect360: Blocked Installs, Blocked In-App Events, Blocked Clicks. Protect360 is an AppsFlyer premium solution.
    • People-Based Attribution: Web Conversions (available if People-Based Attribution is enabled). People-Based Attribution data is aggregated. This report is located in the h=23 folder, for example, t=web_touch_points/dt=2019-07-19/h=23.
  7. (optional) In-app events (default=All): Select the in-app events to include.
  8. Click Apply
  9. (optional) Fields (default=All): Select the fields to include in the reports.

     Note:

    AppsFlyer may add new fields without prior notice. If your parsing process does not support field additions, manually select the required fields.
  10. (optional) Recipients: Enter the email addresses of the people to notify when reports are ready. To add more than one recipient, separate the addresses with commas, for example, user1@example.com, user2@example.com.
  11. Click Create Bucket.
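
The note in step 9 warns that new fields may be added without notice. One way to make a parsing process tolerant of such additions is to locate columns by header name rather than by position. The following is a minimal sketch of that idea, not an AppsFlyer-provided tool. The field names in REQUIRED_FIELDS and the function name project_rows are hypothetical and used only for illustration.

```python
import csv
import io

# Hypothetical field names, for illustration only; use the fields you selected.
REQUIRED_FIELDS = ["event_time", "app_id", "media_source"]


def project_rows(csv_text, required_fields=REQUIRED_FIELDS):
    """Yield dicts holding only the required fields, located by header name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in required_fields if name not in header]
    if missing:
        # Fail loudly if an expected column is absent instead of guessing.
        raise ValueError(f"missing expected fields: {missing}")
    for row in reader:
        # Columns that are not required (including newly added ones) are ignored.
        yield {name: row[name] for name in required_fields}
```

Because columns are looked up by name, an extra column appearing anywhere in the header row does not shift the values that the script reads.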

Bucket credentials

Once the configuration is saved, a dedicated AWS bucket is created. The bucket details appear at the top right-hand corner of the screen. They include the Bucket Name, Home Folder, and credentials for accessing data.

data-credentials.png

For security reasons, the bucket is accessible only using customer credentials. In addition, all access to the bucket is audited. Note: The above configuration is for Data Locker 2.0 (see also: Data Locker 1.0).

Technical details

Data availability

  • Data is updated hourly after a six-hour delay
  • Each file includes the apps selected 
  • Retention: Files and folders are available for 30 days. After 30 days the data is deleted

Folder structure and format

  • Folder structure is: af-ext-reports/<Home Folder>/data-locker-hourly/t=<event-type>/dt=<date YYYY-MM-dd>/h=<Hour h>
  • The Home Folder value appears in the Credentials window (see the setup instructions in the prior section)
  • For example, for the date 2016-08-12 the relevant report appears under: s3://af-ext-reports/12345678911-acc-1abc234/data-locker-hourly/t=installs/dt=2016-08-12/
  • The folder dt=yyyy-mm-dd is split into 25 folders. These folders represent the arrival hour of the event, not the hour in which the event occurred. The folders are named h=0, h=1, h=2, and so on, up to h=23, plus h=late. For example, the folder h=0 contains the events that arrive between 00:00 and 01:00. Similarly, the folder h=20 contains the events that arrive between 20:00 and 21:00.
  • In each folder, data may be split into multiple files to avoid large files. Depending on the type of data exported, folders can contain up to 1000 files. This number may change without notice. Files are named part-00000, part-00001, part-00002, and so on.

  • In each folder, the last file to be written is always an empty file named _SUCCESS. This file is a flag that indicates that no further data will be written to the folder. As such, do not read data in a folder before verifying that the _SUCCESS file exists. Note: The _SUCCESS flag is also written when there is no data to write to the folder.
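
The folder layout and the _SUCCESS rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the list of object keys is assumed to come from whatever S3 listing tool you use.

```python
def hour_folder_prefix(home_folder, report_type, date_str, hour):
    """Return the key prefix of one hourly folder; hour is 0-23 or "late"."""
    return f"{home_folder}/data-locker-hourly/t={report_type}/dt={date_str}/h={hour}/"


def day_folder_prefixes(home_folder, report_type, date_str):
    """Return the 25 folder prefixes of one day: h=0 to h=23, then h=late."""
    hours = list(range(24)) + ["late"]
    return [hour_folder_prefix(home_folder, report_type, date_str, h) for h in hours]


def readable_part_files(keys, prefix):
    """Return the part files under prefix, or nothing if _SUCCESS is absent.

    keys is a list of object keys obtained from an S3 listing (assumed input).
    """
    names = [key[len(prefix):] for key in keys if key.startswith(prefix)]
    if "_SUCCESS" not in names:
        return []  # Folder is still being written; do not read it yet.
    return sorted(prefix + name for name in names if name.startswith("part-"))
```

Returning an empty list for a folder without _SUCCESS keeps a scheduled job from processing partial data; the job can simply retry the folder on its next run.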

Late folder

The late folder contains events of the preceding day that arrived after 00:00 UTC+0 (midnight) and up to 02:00 UTC+0. It also contains the _SUCCESS flag as described in the previous section. Automated processes should look for data in the late folder in the same way as in all the other folders of the day.

 Example

An event is received by AppsFlyer on January 21st at 01:15. The event has a timestamp of January 20th at 18:45. Because this event arrived late, it is placed in the /dt=2019-01-20/h=late folder.
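
The example above can be expressed as a small function. This is only a sketch of the rule as described in this section. It assumes that any event not matching the late-folder rule is placed by its arrival date and hour, and that an arrival at exactly 02:00 is no longer considered late. Neither assumption is confirmed by this article, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def arrival_folder(event_time, arrival_time):
    """Return the dt=/h= folder for an event; both times are in UTC."""
    previous_day = (arrival_time - timedelta(days=1)).date()
    # Late rule: the event belongs to the preceding day and arrived before 02:00.
    if event_time.date() == previous_day and arrival_time.hour < 2:
        return f"dt={previous_day.isoformat()}/h=late"
    # Assumption: everything else is filed under its arrival date and hour.
    return f"dt={arrival_time.date().isoformat()}/h={arrival_time.hour}"


# The example from this section: event on January 20th at 18:45,
# received on January 21st at 01:15.
folder = arrival_folder(datetime(2019, 1, 20, 18, 45), datetime(2019, 1, 21, 1, 15))
print(folder)
```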

File structure and format

  • Data Locker files are based on Raw Data Reports V5 (see: Raw Data Reports V5).
  • The actual data file is in CSV format but it has no file extension.
  • The report files are zipped in .gz format (to make the download process efficient).
  • Each file has a header row.
  • Values that contain a comma are enclosed in double quotes `"`, for example `"iPhone6,1"`.
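
As an illustration of the file format described above, the following sketch decompresses one part file and parses it as CSV with a header row. It assumes the file content is UTF-8 text, which this article does not state. The function name and the sample column names are hypothetical.

```python
import csv
import gzip
import io


def read_part_file(raw_bytes):
    """Decompress one gzipped part file and return its rows as dicts keyed by header."""
    text = gzip.decompress(raw_bytes).decode("utf-8")  # UTF-8 is an assumption.
    # The csv module handles values wrapped in double quotes, such as "iPhone6,1".
    return list(csv.DictReader(io.StringIO(text)))


# Build a tiny in-memory sample instead of downloading a real file.
sample = 'app_id,device_type\ncom.example,"iPhone6,1"\n'
rows = read_part_file(gzip.compress(sample.encode("utf-8")))
print(rows[0]["device_type"])
```

Splitting lines on commas by hand would break the quoted value into two columns, which is why a CSV parser is used here.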

Retrieving data from a locker

AppsFlyer creates an AWS principal (identified by an ARN in Amazon terms) and generates credentials for that principal. A policy is then set that allows the principal to browse and retrieve files from the bucket.

You can access the bucket using AWS command-line tools and file transfer clients that support Amazon S3. To use these tools, retrieve the credentials, AWS Access Key and AWS Secret, from the Credentials section.

Data can be accessed using the following tools, amongst others:

AWS CLI

Before you begin:

  • Install the AWS CLI on your computer.
  • In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel as it is needed to perform this procedure. 

To use AWS CLI:

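  1. Open the terminal. To do so in Windows, press <Windows>+<R>, type cmd, and click OK.
    The command line window opens.
  2. Enter aws configure
  3. When prompted for the AWS Access Key ID, enter the AWS Access Key as it appears in the credentials panel.
  4. When prompted for the AWS Secret Access Key, enter your AWS Secret Key as it appears in the credentials panel.
  5. When prompted for the default region name, enter eu-west-1
  6. When prompted for the default output format, press Enter to keep the default (None).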

Use the CLI commands that follow as needed.

In the following commands, replace {home-folder} with the Home Folder value that appears in the credentials panel.

To list folders in your bucket

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/

Listing files and folders

There are three types of folders in your Data Locker bucket:

  • Report Type t=
  • Date dt=
  • Hour h=

To list all the reports of a specific report type:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/

To list all the reports of a specific report type for a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/

To list all the reports of a specific report type, in a specific hour of a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23/
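
If you drive the listing commands above from a script, the output of aws s3 ls needs to be separated into folders and files. The following is a minimal sketch that assumes the usual output layout of the command: folder lines start with PRE, and file lines contain the date, time, size, and name. The function names are hypothetical, and list_folder requires a configured AWS CLI.

```python
import subprocess


def parse_ls_output(output):
    """Split `aws s3 ls` output into folder names and (file name, size) pairs."""
    folders, files = [], []
    for line in output.splitlines():
        parts = line.split(None, 3)
        if not parts:
            continue
        if parts[0] == "PRE":
            folders.append(parts[1])  # For example "h=0/" or "dt=2019-01-17/".
        elif len(parts) == 4:
            # File lines look like: <date> <time> <size> <name>
            files.append((parts[3], int(parts[2])))
    return folders, files


def list_folder(uri):
    """Run `aws s3 ls` on an s3:// URI and parse the result."""
    result = subprocess.run(
        ["aws", "s3", "ls", uri], capture_output=True, text=True, check=True
    )
    return parse_ls_output(result.stdout)
```

For example, calling list_folder with the URI of an h=23 folder returns the part files and the _SUCCESS flag in the second list.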

Cyber Duck

Before you begin:

  • Install the Cyber Duck client.
  • In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel. You will need this information when you configure Cyber Duck. 

To configure Cyber Duck:

  1. In Cyber Duck, click Action.
  2. Select New Bookmark. The window opens.
  3. In the first field (marked [1] in the screenshot that follows), select Amazon S3.

    DataDuckSmall2.png
  4. Complete the fields as follows:
    • Nickname: free text
    • Server: s3.amazonaws.com
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel in AppsFlyer.
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
    • Path: {Bucket Name}/{Home Folder}, for example: af-ext-reports/1234-abc-ffffffff
  5. Close the window by clicking the X in the upper-right corner.
  6. Select the connection.
    The data directories are displayed.

Amazon S3 browser

Before you begin:

  • Install the Amazon S3 Browser.
  • In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel as it is needed to perform this procedure. 

To configure the Amazon S3 Browser:

  1. In the S3 Browser, click Accounts > Add New Account.
    The Add New Account window opens.
    mceclip0.png
  2. Complete the fields as follows:
    • Account Name: free text. 
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel. 
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
    • Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
    • Select Use secure transfer. 
  3. Click Save changes.
  4. Click Buckets > Add External Bucket.
    The Add External Bucket window opens.

    mceclip2.png

  5. Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for bucket name and home folder appear in the credentials window. 
  6. Click Add External bucket.
    The bucket is added and is displayed in the left panel of the window.
    You can now access the Data Locker files. 

Data available in Data Locker

For each data locker file, the following information is available.

Folder Report description Unique Data Locker report Organic Non-Organic
clicks Clicks  x
clicks_retargeting Clicks coming from retargeting campaigns  x
impressions Impressions x
impressions_retarget Impressions from retargeting campaigns x
installs Installs  x
inapps In-App events  x
conversions_retargeting Retargeting includes re-engagements and re-attributions x x
inapp_retargeting In-App Events from re-attributions and re-engagements x x
sessions App sessions
uninstalls Non-organic uninstalls  x x
Organic uninstalls Organic uninstalls x
blocked_installs Protect360 blocked installs x x
blocked_inapps Protect360 blocked in-app events x x
blocked_clicks Protect360 blocked clicks x x
web_events People based attribution web events
web_touch_points People based attribution web touch points

Using reports as data sources

You can use the data from the reports and add it to your own databases. To extract the data and add it to your databases you need to know the report format. Data Locker reports are based on Raw Data Reports. However, the final report format depends on the fields that you choose to include.

Report format

The fields available in Data Locker are listed in the data field dictionary V5.0.

 Tip

The reports contain data that you can use for campaign optimization and retargeting.

Examples

  • Clicks report - the clicks report contains the IDFA or Google Advertising ID. You can use these IDs to retarget users that engage with your ads but fail to install the app.
  • Impression report - Like the clicks report, the impression report also contains the IDFA or Google Advertising ID. You can use the impression report to optimize campaigns according to impressions that don't lead to clicks. You can also retarget these users with different ads and in different campaigns.
  • Retargeting and Re-attribution report - these reports also contain the IDFA or Google Advertising ID. You can use the IDFA or Google Advertising ID to highlight those users that you manage to retarget. Knowing what users you manage to retarget can help you optimize retargeting campaigns.

Note: To benefit from IDFA or Google Advertising ID as explained above, make sure they are included in all your attribution links.

Hourly reports

Data Locker separates data into hourly folders. The hourly folder represents the processing hour and not the hour that the event occurred. The data is written to Data Locker within six hours of processing.  

 Example

AppsFlyer receives data for activity between 14:00 and 15:00 on January 17, 2019. At some time after 15:00, AppsFlyer begins processing the data. Due to processing, the data is not written to Data Locker immediately. Thus the data in folder /t=installs/dt=2019-01-17/h=14 is not available on January 17th, 2019 at 15:00 but rather six hours later.
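
The arithmetic in this example can be sketched as follows. The function name is hypothetical, and the result should be treated as an estimate for scheduling only: the _SUCCESS file, not the clock, indicates that a folder is complete.

```python
from datetime import datetime, timedelta

WRITE_DELAY = timedelta(hours=6)  # The six-hour delay described in this article.


def latest_expected_write(date_str, hour):
    """Return the UTC time by which the h=<hour> folder of that day should be written."""
    hour_end = datetime.strptime(date_str, "%Y-%m-%d") + timedelta(hours=hour + 1)
    return hour_end + WRITE_DELAY


# The example above: folder h=14 of January 17, 2019 closes at 15:00,
# so it is expected six hours later.
print(latest_expected_write("2019-01-17", 14))
```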

Creating hourly folders

When there is no data for a specific hour, Data Locker still creates a folder for that hour. This indicates that there was no data in that hour. The folder contains a `_SUCCESS` file, which indicates that AppsFlyer has completed writing to this folder. Take this into account when designing automated processes: your data retrieval processes must be able to handle empty hourly folders.
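
One way to handle empty hourly folders in an automated process is to classify each folder before reading it. The sketch below is illustrative only. The state names and the function name are hypothetical, and the input is assumed to be the list of file names found in one hourly folder.

```python
def folder_state(file_names):
    """Classify an hourly folder from the names of the files it contains."""
    if "_SUCCESS" not in file_names:
        return "pending"  # AppsFlyer has not finished writing; retry later.
    if any(name.startswith("part-") for name in file_names):
        return "ready"  # Complete and contains data files.
    return "empty"  # Complete, but there was no data for this hour.


for names in (["part-00000"], ["part-00000", "_SUCCESS"], ["_SUCCESS"]):
    print(names, folder_state(names))
```

Distinguishing "empty" from "pending" lets a job mark an hour as done without waiting indefinitely for data that will never arrive.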

Clicks and impressions of SRNs vs. non-SRNs

  • Non-SRNs use AppsFlyer attribution links for clicks and impressions. This provides AppsFlyer with the complete data set of the engagement which is then written to Data Locker. 
  • SRNs (Self-reporting networks) don't use AppsFlyer attribution links. As a result, only after an app open does the SRN share click and impression information, which is then written to Data Locker. To clarify, AppsFlyer is not aware of clicks and impressions which do not result in an app open.  Note: Aggregate data reports include all clicks and impressions even if no app open took place. 

Amazon clicks and impressions

Amazon clicks and impressions are not supported. They do not appear in the reports that are stored in Data Locker.

Timezone and currency

The timezone and currency settings of the app have no effect on data in Data Locker. Data Locker reports always use the UTC+0 timezone and the USD currency.
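
Because the reports always use UTC+0, a script that needs local times has to convert them itself. The following sketch assumes timestamps in the form YYYY-MM-DD HH:MM:SS, which is an assumption about the report content, and uses a fixed UTC offset, which ignores daylight saving time. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_local(utc_text, offset_hours):
    """Convert a UTC timestamp string to a datetime at a fixed UTC offset."""
    # Assumed timestamp layout; adjust the format string to your data.
    utc_time = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc_time.astimezone(timezone(timedelta(hours=offset_hours)))


local = to_local("2019-01-20 23:30:00", 2)
print(local.strftime("%Y-%m-%d %H:%M:%S"))
```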

Traits and Limitations

Traits
Trait | Remarks
Ad networks | Not for use by ad networks
Agencies | Not for use by agencies
App-specific time zone | Supported
App-specific currency | Supported
Size limitations | Not applicable
Organic app users | Supported
Non-organic app users | Supported
Data freshness | Files are updated hourly with a lag of six hours from the event time.
Historical data | Not supported. Event data is sent after configuring Data Locker. If you need historical data use Pull API.
Team member access | Team members cannot configure Data Locker.
Single app/multiple app | Multi-app support. Data Locker is at the account level.

Troubleshooting

  • Symptom: Unable to retrieve data using AWS CLI
  • Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
  • Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can be caused by having multiple or invalid credentials on your machine.
  • Solution:
    1. Use a different method to access the bucket, such as Cyber Duck instead of the CLI, to verify that the credentials you are using work. If you are able to connect using Cyber Duck, this indicates an issue with the credentials cache.
    2. Refresh the AWS credentials cache.