At a glance: Data Locker writes raw data to an AWS S3 bucket in near-real-time (with a lag of about 6 hours). Data can be written to a bucket provided by AppsFlyer or directly to your own bucket.
Data Locker
- Selecting the right data delivery tool
- Data Locker example reports in an Excel file: contains install and clicks reports in different sheets
Data Locker main features
- Apps: supports multiple apps, which are added automatically
- Simplicity: data is written to an Amazon S3 bucket
- Reliability: data is stored in AWS which ensures data persistence
- Flexibility: choose what data you want to include
- Granularity: data is segmented into report types, days and hours
- Accessibility: pull data when required
- Data freshness: 6-hour lag or daily, depending on the report type. The lag time is the same (6 hours) irrespective of the app-specific time zone.
- Bucket ownership:
- Get the data via an AppsFlyer owned bucket. Data retention: 30 days.
- AppsFlyer writes the data directly to your bucket. Data retention: Controlled by you.
Reports available in Data Locker
Category | Report type (topic) | Data freshness* | Organic/Non-organic | Unique to Data Locker |
---|---|---|---|---|
User acquisition | Clicks | 6-hour lag | N/A | ✓ |
Retargeting | Clicks | 6-hour lag | N/A | ✓ |
User acquisition | Impressions | 6-hour lag | N/A | ✓ |
Retargeting | Impressions | 6-hour lag | N/A | ✓ |
User acquisition | Installs | 6-hour lag | Both | |
User acquisition | In-app events | 6-hour lag | Both | |
User acquisition | Attributed ad revenue | Daily+2 | Non-organic | |
User acquisition | Organic ad revenue | Daily+2 | Organic | |
Retargeting | Retargeting ad revenue | Daily+2 | Non-organic | |
Retargeting | Conversions | 6-hour lag | Non-organic | |
Retargeting | In-app events | 6-hour lag | Non-organic | |
Retargeting | Sessions | 6-hour lag | Both | ✓ |
User acquisition | Sessions | 6-hour lag | Both | ✓ |
User acquisition | Uninstalls | Daily | Non-organic | |
User acquisition | Organic uninstalls | Daily | Organic | |
Reinstalls | Reinstalls | 6-hour lag | Non-organic | |
Reinstalls | Organic reinstalls | 6-hour lag | Organic | |
Report type (topic) | Data freshness* |
---|---|
Blocked installs | 6-hour lag |
Blocked in-app events | 6-hour lag |
Blocked clicks | 6-hour lag |
[FF*] [AG*] Post-attribution installs | Daily |
Report type (topic) |
---|
[FF*] Postbacks |
[FF*] Installs |
[FF*] Redownloads |
[FF*] In-app events |
Report type (topic) |
---|
[FF*] Website visits |
[FF*] Website events |
[FF*] Website-assisted installs |
[FF*] Conversion Paths |

* Key to abbreviations and data freshness:
- [FF]: Report fields are fixed by AppsFlyer. They are not related to the fields selected for inclusion in reports.
- [AG]: Agency transparency is not supported.
- 6-hour lag: Data is separated into arrival-hour folders, meaning the hour in which the event was deposited in Data Locker. For real-time events, some Data Locker folders are written about six hours after the actual event time. There are 24 folders, one for each hour of the day (0 to 23), and an additional folder for data that arrives late. The lag time is the same irrespective of the app-specific time zone.
- Daily: Reports having a data freshness rate of daily are written to the h=23 folder. These reports are typically available by 10:00-12:00 UTC in the h=23 folder of the preceding day. For example, the report for data generated during Monday is in the Monday h=23 folder. The data is available after 10:00 UTC on Tuesday.
- Daily+2: Ad revenue data is available after 2 days, meaning that data generated during Monday becomes available in the Monday h=23 folder after 06:00 UTC on Wednesday.
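
The Daily and Daily+2 rules above can be turned into a simple scheduling check. The following is a minimal sketch, assuming the earliest availability times quoted in the key (10:00 UTC on the next day, 06:00 UTC two days later). The function name and the lookup table are illustrative and are not part of Data Locker.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumed earliest availability per freshness type, taken from the key above:
# (days after the data day, UTC time of day).
AVAILABILITY = {
    "daily": (1, time(10, 0)),
    "daily+2": (2, time(6, 0)),
}


def earliest_availability(data_day: date, freshness: str) -> datetime:
    """Return the earliest UTC time to look in the h=23 folder of data_day."""
    days_after, at = AVAILABILITY[freshness]
    return datetime.combine(
        data_day + timedelta(days=days_after), at, tzinfo=timezone.utc
    )


monday = date(2024, 1, 1)  # 2024-01-01 is a Monday
print(earliest_availability(monday, "daily"))    # Tuesday 10:00 UTC
print(earliest_availability(monday, "daily+2"))  # Wednesday 06:00 UTC
```

A loader could wait until this time before it starts polling the folder, and still rely on the _SUCCESS flag as the final signal.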
Data Locker architecture
Data partition
AppsFlyer creates an AWS principal (ARN in Amazon terms) and generates credentials for that principal. A policy is then set allowing the principal to browse and retrieve files from their bucket.
In the bucket, data is organized by report type. The data for a given report is stored in its folder.
Folder and file structure
- Folder structure is: af-ext-reports/<Home Folder>/data-locker-hourly/t=<event-type>/dt=<date YYYY-MM-dd>/h=<Hour h>
- The Home Folder is the Home Folder that appears in the Credentials window (see the setup instruction in the prior section)
- For example, for the date 2016-08-12 the report appears in:
s3://af-ext-reports/12345678911-acc-1abc234/data-locker-hourly/t=installs/dt=2016-08-12/
- The day folder dt=yyyy-mm-dd is split into 25 folders: 24 hourly folders named h=0, h=1, h=2, and so on up to h=23, plus h=late. These folders represent the hour the event arrived, not the hour the event occurred. For example, the folder h=0 contains the events that arrive between 00:00 UTC and 01:00 UTC. Similarly, the folder h=20 contains the events that arrive between 20:00 UTC and 21:00 UTC.
- In each folder:
- Data is split into multiple files to avoid large files. File names are: part-00000, part-00001, part-00002, and so on. There can be up to 1000 files. We may increase this maximum number in the future without advance notice.
- The last file to be written is an empty file named _SUCCESS. This file is a flag indicating that no further data will be written to the folder. As such, do not read data in a folder before verifying that the _SUCCESS file exists. Note: The _SUCCESS flag is also written in cases where there is no data to be written to the folder.
- Late folder:
- The late folder contains events of the preceding day that arrived after midnight, meaning between 00:00–02:00 UTC of the following day. For example, a user installs an app on Monday at 08:00 and the event arrives on Tuesday at 01:00. The event is recorded in Monday's late folder.
- Data in the late folder is not recorded in any other folder.
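
As an illustration of the folder layout described above, the following sketch builds the 25 folder prefixes for one report type and day, and checks a folder listing for the _SUCCESS flag. It is a minimal example under stated assumptions: the helper names are hypothetical, and the key listing is expected to come from whichever S3 tool you use.

```python
from datetime import date

# 24 arrival-hour folders (h=0 .. h=23) plus the late folder.
HOUR_FOLDERS = [str(h) for h in range(24)] + ["late"]


def hour_prefix(home_folder: str, report_type: str, day: date, hour: str) -> str:
    """Build the key prefix of one arrival-hour folder (without the bucket name)."""
    return (
        f"{home_folder}/data-locker-hourly/"
        f"t={report_type}/dt={day:%Y-%m-%d}/h={hour}/"
    )


def day_prefixes(home_folder: str, report_type: str, day: date) -> list[str]:
    """Return the prefixes of all 25 folders of one report type and day."""
    return [hour_prefix(home_folder, report_type, day, h) for h in HOUR_FOLDERS]


def is_complete(keys_in_folder: list[str]) -> bool:
    """True once the _SUCCESS flag file is among the folder's object keys."""
    return any(key.rsplit("/", 1)[-1] == "_SUCCESS" for key in keys_in_folder)


prefixes = day_prefixes("12345678911-acc-1abc234", "installs", date(2016, 8, 12))
print(len(prefixes))  # 25
print(prefixes[0])
```

A folder with only a _SUCCESS file and no part files is still complete; it means there was no data for that hour.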
File structure and format
- Data Locker files are based on Raw Data Reports V5 (see: Raw Data Reports V5).
- The actual data file is in CSV format but it has no file extension.
- The report files are compressed in gzip (.gz) format.
- Each file has a header row.
- Values that contain a comma are enclosed in double quotes `"`, for example `"iPhone6,1"`.
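
The format described above (gzip-compressed CSV with a header row and quoted commas) can be read with the Python standard library. This is a minimal sketch: the UTF-8 encoding and the two column names in the sample are assumptions for illustration, since real files carry the fields you selected.

```python
import csv
import gzip
import io


def read_part(raw: bytes) -> list[dict[str, str]]:
    """Decompress one part file and parse it as CSV with a header row."""
    with gzip.open(io.BytesIO(raw), mode="rt", encoding="utf-8", newline="") as fh:
        return list(csv.DictReader(fh))


# Hypothetical two-column sample standing in for a downloaded part file.
sample = gzip.compress(b'device_model,platform\n"iPhone6,1",ios\n')
rows = read_part(sample)
print(rows[0]["device_model"])  # iPhone6,1
```

The csv module handles the double-quoted value, so the comma inside "iPhone6,1" does not split the field.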
Using reports as data sources
You can use the data from the reports and add it to your own databases. To extract the data and add it to your databases you need to know the report format. Data Locker reports are based on Raw Data Reports. However, the final report format depends on the fields that you choose to include.
Some fields are populated with null or are empty. This means that in the context of a given report there is no data to report. In general, null means this field is not populated in the context of a given report and app type. Blank "" means the field is relevant in its context but no data was found to populate it with.
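
When loading the data, you may want to keep the two cases apart. The sketch below assumes that a null field appears in the file as the literal text null and that a blank field appears as an empty string. Verify this against your own files before relying on it. The function name and labels are illustrative.

```python
NOT_APPLICABLE = "not applicable"  # field not populated for this report and app type
NO_DATA = "no data"                # field is relevant but nothing was found
HAS_VALUE = "value"


def classify(cell: str) -> str:
    """Classify one raw CSV cell. Assumes null is written as the text 'null'."""
    if cell == "null":
        return NOT_APPLICABLE
    if cell == "":
        return NO_DATA
    return HAS_VALUE


for cell in ("null", "", "iPhone6,1"):
    print(repr(cell), "->", classify(cell))
```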
Timezone and currency
App-specific timezone and currency settings don't have an effect on data in Data Locker.
- Timezone: Data Locker reports use the UTC timezone
- Currency: The field event_revenue_usd is in USD.
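
Because all times are in UTC, any conversion to a local time zone is done on your side. The following sketch converts a UTC timestamp to a fixed UTC offset. The timestamp format (YYYY-MM-DD HH:MM:SS) is an assumption about how time fields appear in your files, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_local(utc_text: str, utc_offset_hours: float) -> datetime:
    """Parse a UTC timestamp and convert it to a fixed-offset time zone."""
    utc_dt = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return utc_dt.astimezone(timezone(timedelta(hours=utc_offset_hours)))


print(to_local("2020-08-01 09:30:00", 9))  # 2020-08-01 18:30:00+09:00
```

A fixed offset ignores daylight saving time. For zones that observe it, a named time zone (for example via the zoneinfo module) is the safer choice.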
AppsFlyer AWS bucket vs. Customer AWS Bucket
Caution!
If you are using the Customer AWS bucket solution:
- Ensure that you comply with data privacy regulations like GDPR and ad network/SRN data retention policies.
- Don't use the Customer AWS Bucket solution to send data to third parties.
- Data is written to a bucket owner of your choice as follows:
- AppsFlyer AWS bucket
- Customer AWS bucket
- You can change the bucket owner selection at any time:
- Move from an AppsFlyer AWS bucket to a Customer AWS bucket in the user interface. The change takes effect within 1 hour. This means we stop writing data to one bucket and start to write data to the newly selected bucket.
- If you want to stop using your Customer Bucket, select the AppsFlyer bucket.
Property | AppsFlyer AWS bucket | Customer AWS bucket |
---|---|---|
Bucket name | Set by AppsFlyer | Set by you. Must have the prefix af-datalocker-. Example: |
Bucket ownership | AppsFlyer | Customer |
Storage platform supported | AWS | AWS |
Credentials to access data by you | Available in the Data Locker user interface to the Admin | Not known to AppsFlyer. Use your AWS credentials. |
Data retention | Data is deleted after 30 days | Your responsibility |
Data deletion requests | AppsFlyer responsibility | Your responsibility |
Security | AppsFlyer controls the bucket. The customer has read access. | The customer controls the bucket. AppsFlyer requires GetObject, ListBucket, DeleteObject, PutObject permission to the bucket. The bucket should be dedicated to AppsFlyer use. Don't use it for other purposes. |
Storage space | Managed by AppsFlyer | Managed by you |
Procedures
Set up Data Locker
Use this procedure to set up Data Locker.
Prerequisite for setting up a Customer AWS bucket:
If you are setting up Data Locker using your Customer AWS bucket, meaning a bucket owned by you, you must first complete setting up your AWS S3 bucket.
To set up Data Locker:
- The admin needs to perform the setup.
- In AppsFlyer, go to Integration > Data Locker.
- Choose the Amazon S3 integration method. Select one of the following:
- AppsFlyer AWS bucket. Continue to step 4.
- Customer AWS bucket.
- Enter your AWS bucket name. Don't enter the prefix af-datalocker-
- Click Test.
- Verify that no error message displays indicating that the bucket path is invalid.
- Select one or more or all apps. Select all to automatically include apps you add in the future.
- Click Apply.
- [optional] Media Sources: Select one or more Media Sources to include in reports.
- Default=All. This means that media sources added in the future are automatically added.
- Select one or more report types.
- [optional] In-app events: Select the in-app events to include. If you have more than 100 in-app event types, you can't search for them. Enter their names exactly to select them.
- Default=All. This means that in-app events added in the future are automatically added.
- Click Apply.
- [optional] Fields (default=All): Select the fields to include in the reports. Note: We add fields from time to time. Take this into account in your data import process.
- Click Save Configuration. One of the following occurs:
- If you selected AppsFlyer AWS bucket:
- A dedicated AWS bucket is created. The bucket credentials display.
- The bucket is accessible using the credentials. The credentials provide you with read-only access to the bucket.
- If you selected Customer AWS bucket:
- Data will start being written to your AWS bucket within 1-2 hours.
Setup Data Locker—Your AWS S3 bucket
The procedure in this section must be performed by your AWS admin.
You can delete files from Data Locker 25 or more hours after they were written. Please don't delete them before.
Background information for the AWS admin:
- AppsFlyer writes your data to an S3 bucket owned by you. To do so, the following are required:
- Create a bucket having the name af-datalocker-mybucket. The prefix af-datalocker- is mandatory. The suffix is free text.
- We suggest af-datalocker-yyyy-mm-dd-hh-mm-free-text, where yyyy-mm-dd-hh-mm is the current date and time, followed by any other text you want, as depicted in the figure that follows.
[Figure: User interface in the AWS console]
- Having created the bucket, grant AppsFlyer permissions using the procedure that follows.
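
The naming convention above can be generated and checked in a script. This is a minimal sketch: the function names are hypothetical, and the pattern covers only the basic general S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit), not every S3 restriction.

```python
import re
from datetime import datetime

PREFIX = "af-datalocker-"
# Basic general S3 bucket naming rules (a subset, not the full rule set).
S3_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def suggested_bucket_name(free_text: str, now: datetime) -> str:
    """Build af-datalocker-yyyy-mm-dd-hh-mm-free-text for the given time."""
    return f"{PREFIX}{now:%Y-%m-%d-%H-%M}-{free_text}"


def is_valid_bucket_name(name: str) -> bool:
    """Check the mandatory prefix and the basic S3 naming rules."""
    return (
        name.startswith(PREFIX)
        and len(name) > len(PREFIX)
        and bool(S3_NAME.match(name))
    )


name = suggested_bucket_name("reports", datetime(2021, 3, 4, 5, 6))
print(name, is_valid_bucket_name(name))
```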
To create a bucket and grant AppsFlyer permissions:
- Sign in to the AWS console.
- Go to the S3 service.
- To create the bucket:
- Click Create bucket.
- Complete the Bucket name as follows: Start with af-datalocker- and then add any other text as described previously.
- Click Create bucket.
- To grant AppsFlyer permissions:
- Select the bucket.
- Go to the Permissions tab.
- In the Bucket policy section, click Edit.
The Bucket policy window opens.
- Paste the following snippet into the window.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AF_DataLocker_Direct",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::af-datalocker-my-bucket",
        "arn:aws:s3:::af-datalocker-my-bucket/*"
      ]
    }
  ]
}
- In the snippet, replace af-datalocker-my-bucket with the name of the bucket you created.
- Click Save changes.
- Complete the Setup Data Locker procedure.
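
If you prefer to generate the bucket policy rather than edit it by hand, the following sketch fills the bucket name into the snippet shown in the procedure. The principal, actions, and statement ID are copied from that snippet. The function name is hypothetical, and the result still needs to be pasted into the Bucket policy window or applied with your own tooling.

```python
import json

# Principal copied from the policy snippet in the procedure above.
APPSFLYER_PRINCIPAL = (
    "arn:aws:iam::195229424603:user/"
    "product=datalocker__envtype=prod__ns=default"
)


def bucket_policy(bucket_name: str) -> str:
    """Return the Data Locker bucket policy as JSON for the given bucket."""
    if not bucket_name.startswith("af-datalocker-"):
        raise ValueError("bucket name must start with af-datalocker-")
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AF_DataLocker_Direct",
                "Effect": "Allow",
                "Principal": {"AWS": APPSFLYER_PRINCIPAL},
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:PutObject",
                ],
                # The bucket itself (for listing) and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(bucket_policy("af-datalocker-my-bucket"))
```

Generating the policy avoids the most common mistake, which is updating only one of the two Resource entries.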
Reset credentials
The admin can reset the AppsFlyer AWS bucket credentials at any time. Note! If you reset the credentials you must update your data import scripts with the updated credentials.
To reset the credentials:
- In AppsFlyer, go to Integration > Data Locker.
- In the Credentials section, click Reset credentials.
A confirmation window displays.
- Click Reset.
- Wait (about 20 seconds) until the Credentials successfully reset message displays.
The updated credentials are available.
Data retrieval
Use your preferred S3 data retrieval tool, AWS CLI, or one of the tools described in the sections that follow.
AWS CLI
Before you begin:
- Install the AWS CLI on your computer.
- In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel as it is needed to perform this procedure.
To use AWS CLI:
- Open the terminal. To do so in Windows, press <Windows>+<R>, enter cmd, and click OK.
The command line window opens.
- Enter aws configure
- Enter the AWS Access Key as it appears in the credentials panel.
- Enter your AWS Secret Key as it appears in the credentials panel.
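- When prompted for the default region name, enter eu-west-1
- When prompted for the default output format, press Enter to keep the default (None)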
Use the CLI commands that follow as needed.
In the following commands, the value of {home-folder} can be found in the credentials panel.
To list folders in your bucket
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/
Listing files and folders
There are three types of folders in your Data Locker bucket:
- Report Type
t=
- Date
dt=
- Hour
h=
To list all the reports of a specific report type:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/
To list all the reports of a specific report type for a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17
To list all the reports of a specific report, in a specific hour of a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23
To download a file for a specific date and hour:
aws s3 cp s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9/part-00000.gz ~/Downloads/
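
If you drive the AWS CLI from a script, the commands above can be assembled programmatically. The following is a sketch only: the helper names are hypothetical, and it builds the argument list without running it. The --recursive option of aws s3 cp copies every file in the folder.

```python
import shlex

BUCKET = "af-ext-reports"


def s3_uri(home_folder, report_type=None, day=None, hour=None) -> str:
    """Build the S3 URI of a report, day, or hour folder, as deep as given."""
    uri = f"s3://{BUCKET}/{home_folder}/data-locker-hourly/"
    for key, value in (("t", report_type), ("dt", day), ("h", hour)):
        if value is None:
            break  # stop at the first level that was not specified
        uri += f"{key}={value}/"
    return uri


def download_command(home_folder, report_type, day, hour, dest) -> list[str]:
    """Build an 'aws s3 cp --recursive' command for one hour folder."""
    src = s3_uri(home_folder, report_type, day, hour)
    return ["aws", "s3", "cp", src, dest, "--recursive"]


cmd = download_command("{home-folder}", "installs", "2020-08-01", 9, "./downloads/")
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run once {home-folder} is replaced with your actual home folder.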
Cyber Duck
Before you begin:
- Install the Cyber Duck client.
- In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel. You will need this information when you configure Cyber Duck.
To configure Cyber Duck:
- In Cyber Duck, click Action.
- Select New Bookmark. The window opens.
- In the first field (marked [1] in the screenshot that follows), select Amazon S3.
- Complete the fields as follows:
- Nickname: free text
- Server: s3.amazonaws.com
- Access Key ID: copy the AWS Access Key as it appears in the credentials panel in AppsFlyer.
- Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
- Path: {Bucket Name}/{Home Folder} For example: af-ext-reports/1234-abc-ffffffff
- Close the window using the X in the upper-right corner of the window.
- Select the connection.
The data directories are displayed.
Amazon S3 browser
Before you begin:
- Install the Amazon S3 Browser.
- In AppsFlyer, go to Data Locker, retrieve the information contained in the credentials panel as it is needed to perform this procedure.
To configure the Amazon S3 Browser:
- In the S3 browser, click Accounts > Add New Account.
The Add New Account window opens.
- Complete the fields as follows:
- Account Name: free text.
- Access Key ID: copy the AWS Access Key as it appears in the credentials panel.
- Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
- Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
- Select Use secure transfer.
- Click Save changes.
- Click Buckets > Add External Bucket.
The Add External Bucket window opens.
- Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for bucket name and home folder appear in the credentials window.
- Click Add External bucket.
The bucket is created and displays in the left panel of the window.
You can now access the Data Locker files.
Additional information
Traits and Limitations
Trait | Remarks |
---|---|
Ad networks | Not for use by ad networks. |
Agencies | Not for use by agencies. |
App-specific time zone | Not applicable. Data Locker folders are divided into hours using UTC. The actual events contain times in UTC. Convert the times to any other time zone as needed. Irrespective of your app time zone, the lag from event occurrence until it is recorded in Data Locker remains the same, that is, 6 hours. |
App-specific currency | Not supported |
Size limitations | Not applicable |
Data freshness | Files are updated hourly with a lag of six hours from the event time. |
Historical data | Not supported. Event data is sent after configuring Data Locker. If you need historical data use Pull API. |
Team member access | Team members cannot configure Data Locker. |
Single app/multiple app | Multi-app support. Data Locker is configured at the account level. |
Developer considerations
In preparing scripts for data loading into your systems consider the following:
- Temporary folder:
- In some cases a temporary folder remains. You should disregard this folder. Example: /data-locker-hourly/t=inapps/dt=2020-11-13/h=2/_temporary/0/_temporary/.
- Consume only folders having the _SUCCESS flag in them.
- Sequence of columns in reports:
- The sequence of fields in reports is always the same. When we add new fields these are added to the right of the existing fields. The field list in the user interface is sequenced accordingly.
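
Both considerations can be handled in a loading script. The sketch below, with hypothetical helper names, selects the part files of a folder only when the _SUCCESS flag is present, skips anything under a temporary folder, and keeps only the columns your database expects so that fields added later do not break the import.

```python
def data_keys(keys: list[str]) -> list[str]:
    """Return the part-file keys of a folder, or [] if it is not complete."""
    # Ignore anything left behind under a temporary folder.
    usable = [k for k in keys if "/_temporary/" not in k]
    names = {k.rsplit("/", 1)[-1] for k in usable}
    if "_SUCCESS" not in names:
        return []  # folder is still being written
    return sorted(k for k in usable if k.rsplit("/", 1)[-1].startswith("part-"))


def project(row: dict[str, str], wanted: list[str]) -> dict[str, str]:
    """Keep only the expected columns, so newly added fields are ignored."""
    return {name: row.get(name, "") for name in wanted}


listing = [
    "h=2/part-00001",
    "h=2/part-00000",
    "h=2/_SUCCESS",
    "h=2/_temporary/0/_temporary/part-00000",
]
print(data_keys(listing))
```

Selecting columns by header name rather than by position keeps the import stable even if you later change the selected fields.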
Troubleshooting
- Symptom: Unable to retrieve data using AWS CLI
- Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
- Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can be caused by having multiple or invalid credentials on your machine.
- Solution:
- Use a method other than the CLI, such as Cyber Duck, to access the bucket. Do this to verify that the credentials you are using work. If you are able to connect using Cyber Duck, this indicates an issue with the credentials cache.
- Refresh the AWS credentials cache.
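
To see whether multiple credentials exist on your machine, you can list the places the AWS CLI commonly reads them from. This is a minimal diagnostic sketch with a hypothetical function name. It checks the standard environment variables and the default ~/.aws/credentials file, and prints only where credentials were found, never the secrets themselves.

```python
import configparser
import os
from pathlib import Path


def credential_sources(env=os.environ, credentials_file=None) -> list[str]:
    """List where AWS credentials are defined, without revealing them."""
    sources = []
    # Environment variables take precedence over the credentials file.
    for var in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_PROFILE"):
        if env.get(var):
            sources.append(f"environment variable {var}")
    if credentials_file:
        path = Path(credentials_file)
    else:
        path = Path.home() / ".aws" / "credentials"
    if path.is_file():
        parser = configparser.ConfigParser()
        parser.read(path)
        sources += [f"profile [{name}] in {path}" for name in parser.sections()]
    return sources


if __name__ == "__main__":
    for source in credential_sources():
        print(source)
```

If an environment variable is listed, the CLI may be using it instead of the keys you entered with aws configure.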