At a glance: Data Locker sends your report data to cloud storage for loading into your BI systems. You can choose among several storage destinations: an AppsFlyer-owned bucket on AWS, or storage owned by you on AWS, GCS, Azure, Yandex, BigQuery, or Snowflake. Data Locker supports multiple destinations, meaning you can send all data to multiple destinations, segregate data by destination, or combine both approaches.
Data Locker
In Data Locker, select the apps, media sources, events, and reports to include in the data AppsFlyer delivers to your selected cloud storage. Then load the data programmatically from the storage into your systems.
Data Locker—features
Feature | Description |
---|---|
Storage options (cloud) | Data Locker can send your data to an AppsFlyer-owned AWS bucket, or to storage owned by you on AWS, GCS, Azure, Yandex, BigQuery, or Snowflake. You can set more than 1 destination, meaning you can send all or some of your data to multiple destinations. |
Multi-app | Send the data of 1, several, or all apps in your account. Apps added to the account later can be included automatically. |
Availability window | 14 days |
Data segregation | Data can be unified (all apps in shared data files) or segregated by app. Relevant for bucket cloud storage. |
Data format options | Parquet (default) or CSV files; compressed with Snappy or GZIP, or uncompressed. |
Data freshness | Freshness depends on the report type: hourly, daily, or versioned. |
Reports available via Data Locker
Data storage architecture
Overview
The structure of your data in storage depends on whether the data is sent to cloud storage or to a data warehouse. The folder structure described here applies to bucket storage; in the case of data warehouse storage, references to folders apply to views.
Data is written to your selected storage option. In the case of cloud storage, the storage is owned by AppsFlyer (on AWS) or by you (on AWS, GCS, Azure, or Yandex). You can switch storage options at any time, and you can send some or all of your data to multiple storage options.
Data in cloud bucket storage is organized in a hierarchical folder structure according to report type, date, and time, for example t=installs/dt=2019-01-17/h=23.
Data of a given report is contained in the hour (h) folders associated with that report:
- The number of hour folders depends on the report data freshness (hourly, daily, or versioned).
- Data is provided as Parquet or CSV files, compressed with Snappy or GZIP, or uncompressed.
- Data files consist of columns (fields).
- The user journey reports share an identical schema (field) structure, which depends on the fields you select. Other reports each have their own explicit fields (AKA schemaless reports). See Data Locker marketer reports for the available reports and links to their specifications.
Folder structure
Folder | Description |
---|---|
Subscription ID | Examples of the folder structure depend on the bucket owner and cloud provider. |
Topic (t) | The report type; relates to the subject matter of the report. |
Date (dt) | The related data date. In the case of raw data, the date the event occurred; in the case of aggregated data, the reporting date itself. |
Time (h or version) | Date folders are divided into hourly (h) or version folders, depending on the report type. Hourly folders: the h folders relate to the time the data was received by AppsFlyer. For example, install events received between 14:00-15:00 UTC are written to the h=14 folder. Note! There is a delay of about 1-3 hours between the time the data arrives in AppsFlyer and the time the h folder is written to Data Locker; for example, the h=14 folder is written 1 hour later, at 15:00 UTC. Version folders: some reports have a versioned option, meaning the most updated data for a given day is provided multiple times. Because data can continue to update, due to late-arriving or more accurate data, the same report has multiple versions, and the most recent version is the most accurate. The reports for a given day are contained in the version folders of that day. Each version is contained in a separate folder whose name is an Epoch timestamp that uniquely identifies the report (a sketch for selecting the latest version follows this table). Your data import processes must consider that data can be written retroactively; for example, on January 14, data can be written to the January 1 folder. If the bucket is owned by you, consider using cloud service notifications to trigger your import process (AWS | GCS). |
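To make the versioned layout concrete, here is a minimal sketch (Python with boto3) of how an import process might select the latest version folder for a given day. The bucket name matches the AppsFlyer-owned examples later in this article; the topic name and the {home-folder} placeholder are stand-ins for your own values.

```python
# Minimal sketch: select the most recent version folder for one day.
# The topic name and {home-folder} placeholder are stand-ins; replace them
# with a versioned report topic and your own home folder.
import boto3

s3 = boto3.client("s3")
BUCKET = "af-ext-reports"
PREFIX = "{home-folder}/data-locker-hourly/t=some_versioned_report/dt=2024-01-01/"

# List the version "subfolders" directly under the date folder.
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX, Delimiter="/")
versions = [p["Prefix"] for p in resp.get("CommonPrefixes", [])]
if not versions:
    raise SystemExit("No version folders written yet for this date.")

# Folder names are Epoch timestamps, so the numerically largest one is the
# most recent, most accurate version.
latest = max(versions, key=lambda p: int(p.rstrip("/").rsplit("=", 1)[-1]))
print("Most recent version folder:", latest)
```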
App segregation
Segregation type | Description |
---|---|
[Default] Unified | Data for all apps is provided in unified data files. When consuming the data, use the row-level app_id field to distinguish between apps. In the example, the data files are in the h=2 folder. The data file naming convention is unique_id.gz. |
Segregated by app | The folder contains subfolders per app; data files for a given app are contained within that app's folder. In the figure that follows, the h=19 folder contains app folders, and each app folder contains the associated data files. Note! The data files don't contain the app_id; you must determine the app_id from the folder (see the sketch after this table). In each app folder, the naming convention is unique_id.gz. Limitation: This option is not available for People-Based Attribution reports. |
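For segregated data, the app must be recovered from the object path. A minimal sketch, assuming the app folder sits directly above the data file (the exact key layout in your bucket may differ):

```python
# Minimal sketch: derive app_id from the folder path when data is
# segregated by app, because the files themselves omit the app_id column.
# Assumes the app folder sits directly above the data file.

def app_id_from_key(key: str) -> str:
    parts = key.rstrip("/").split("/")
    return parts[-2]  # folder directly above the data file

example = "home/data-locker-hourly/t=installs/dt=2024-01-01/h=19/com.example.app/unique_id.gz"
print(app_id_from_key(example))  # -> com.example.app
```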
Data files
Data files depend on segregation type.
Content | Details |
---|---|
Completion flag | The last file (completion) flag is set when all the data for a given h folder has been written. Wait for the flag before ingesting a folder (see the sketch after this table). |
File types | Parquet or CSV files; compressed with Snappy or GZIP, or uncompressed. |
Column sequence (CSV files) | In CSV files, the sequence of fields in reports is always the same. When we add new fields, they are added to the right of the existing fields. Take this into account in your data import processes: consume columns by name, not by position. |
Field population considerations | Blank or empty fields: some fields are populated with null or are empty, meaning that in the context of a given report there is no data to report. Typically, null means the field is not populated in the context of the given report and app type, while blank ("") means the field is relevant in its context but no data was found to populate it. In the case of restricted media sources, the content of restricted fields is set to null. Overall, regard null and blank as the same thing: there is no data available. Time zone and currency: app-specific time zone and currency settings have no effect on data written to Data Locker. Values with commas: values containing commas are enclosed in double quotes ("), for example, "value1, value2". |
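These details shape how a loader should behave. Below is a minimal sketch (Python with pandas) that waits for a completion flag, reads CSV columns by name, and treats null and blank alike. The completion-flag file name and the field names are illustrative assumptions, not documented values.

```python
# Minimal sketch of a defensive loader for one downloaded h= folder.
# The completion-flag file name and the field names are illustrative
# assumptions; substitute the actual values from your reports.
import glob
import os
import pandas as pd

folder = "downloads/t=installs/dt=2024-01-01/h=14"

# 1) Only ingest once the completion flag for the folder exists.
if not os.path.exists(os.path.join(folder, "_completion_flag")):
    raise SystemExit("Folder not complete yet; try again later.")

wanted = {"app_id", "event_time", "event_name", "media_source"}
frames = []
for path in glob.glob(os.path.join(folder, "*.gz")):
    # 2) Select columns by name, never by position: new fields are
    #    appended to the right, so positional parsing eventually breaks.
    df = pd.read_csv(path, compression="gzip", usecols=lambda c: c in wanted)
    # 3) Treat blank ("") and null the same: no data available.
    frames.append(df.replace("", pd.NA))

data = pd.concat(frames, ignore_index=True)
print(f"Loaded {len(data)} rows")
```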
Storage options
Caution!
If you are using the marketer-owned storage option:
- Verify that you comply with data privacy regulations like GDPR and ad network/SRN data retention policies.
- Don't use the marketer-owned storage solution to send data to third parties.
- Data is written to the storage owner of your choice as follows:
- AppsFlyer storage
- Customer storage—AWS, GCS, Azure, Yandex, BigQuery, and Snowflake
- You can change the storage selection at any time.
- If you change the storage, the following happens:
- We start writing to the newly selected storage within one hour.
- We continue writing to the existing storage during a transition period of 7 days. The transition period expiry time displays in the user interface. Use the transition period to update your data loading processes. You can restart the transition period or revert to the AppsFlyer bucket if needed.
- Changing storage: You can migrate from one storage option to another by using the multi-storage option and sending data to multiple destinations simultaneously. Once you have completed the migration and testing, delete the storage option you no longer need.
 | AppsFlyer-owned storage (AWS) | Marketer-owned storage (GCS, AWS, Azure, Yandex, BigQuery, Snowflake) |
---|---|---|
Bucket name | Set by AppsFlyer | Set by the marketer. The name must begin with the af- prefix. |
Storage ownership | AppsFlyer | Marketer |
Storage platform | AWS | AWS, GCS, Azure, Yandex, BigQuery, Snowflake |
Credentials to access data by you | Available in the Data Locker user interface to your AppsFlyer account admins | Not known to AppsFlyer. Use credentials provided by the cloud provider. |
Data retention | Data is deleted after 14 days | Marketer responsibility |
Data deletion requests | AppsFlyer responsibility | Marketer responsibility |
Security | AppsFlyer controls the storage; the customer has read access. | The marketer controls the storage. |
Storage capacity | Managed by AppsFlyer | Managed by the marketer |
Access control using VPC endpoints with bucket policies | Not applicable | [Optional] In AWS, if you implement VPC endpoint security at the bucket level, you must allowlist AppsFlyer servers. |
Notice to security officers in the case of customer-controlled storage
Consider:
- The bucket or destination is for the sole use of AppsFlyer. There should be no other entity writing to a given destination.
- You can delete data in the destination 25 hours after we write the data.
- Data written to the destination is a copy of data already in our servers. The data continues to be in our servers in accordance with our retention policy.
- For technical reasons, we sometimes delete and rewrite the data. For this reason, we need delete and list permissions. Neither permission is a security risk for you: in the case of list, we are the sole entity writing to the bucket; in the case of delete, we are able to regenerate the data. An illustrative bucket policy sketch follows this list.
- For additional information, you can contact our security team via hello@appsflyer.com or your CSM.
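On AWS, these permissions are typically granted with a bucket policy. The following is an illustration only (Python with boto3): the principal ARN and bucket name are placeholders, not real AppsFlyer values; obtain the correct principal from AppsFlyer before applying anything like this.

```python
# Illustration only: grant an external principal write, list, and delete
# on a bucket. The principal ARN and bucket name are placeholders; get the
# real AppsFlyer principal from AppsFlyer before applying a policy.
import json
import boto3

BUCKET = "af-datalocker-mybucket"  # placeholder; the af- prefix is required
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # placeholder
        "Action": ["s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
        "Resource": [
            f"arn:aws:s3:::{BUCKET}",    # ListBucket applies at bucket level
            f"arn:aws:s3:::{BUCKET}/*",  # object-level actions
        ],
    }],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```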
Multiple-connections principles (more than one destination)
In Data Locker, you can send some or all of your data to 2 or more destinations (each defined in its own connection settings). For example, you can send App A data to AWS and App B data to GCS.
Each connection consists of a complete set of Data Locker settings, including a destination. Connection settings are independent of one another.
In managing your connections, consider:
- In Data Locker settings, connections are shown in tabs. Each connection has its own settings tab from which you can manage the connection. The icon of each tab represents the storage type.
- To see connection details, duplicate a connection, or delete a connection, click ⋮ (options).
Procedures
User permissions
Both admins and team members with the correct permissions can access Data Locker.
Admins
Admins can access the Data Locker page, create and manage all connections, add editors, and assign owners to existing connections.
Team members
Team members can access the Data Locker page, edit existing connections that they own, and create new connections.
Providing permissions
- To provide a team member permission to access Data Locker, assign them a role with Data Locker set to 'Manage'.
- To transfer ownership or add a team member as an editor on an existing connection, click on the three-dot options menu within an existing connection, and then Manage ownership to either change the connection owner or add editors.
Set up Data Locker
Use this procedure to set up Data Locker. Any changes to Data Locker settings take up to 3 hours to take effect.
Prerequisites
To set up marketer-owned storage:
If you are setting up Data Locker using a marketer-owned cloud storage service, complete one or more of the following procedures now.
- Set up:
Note! If you don't have a Data Locker subscription and you access Cohorts analytics or SKAN data, you must still complete a marketer-owned cloud storage service procedure.
To set up Data Locker:
- An admin needs to perform the setup.
- In AppsFlyer, from the sidebar, go to Reports > Data Locker.
- [Optional] If you already have an active Data Locker destination and want to add a destination, click Add connection. Name your connection.
- Select a cloud service data destination. Do one of the following:
- Select AppsFlyer AWS bucket (option available to Data Locker subscribers only). Click Save and continue to step 5.
- Select S3.
  - Enter your AWS S3 bucket name. The af- prefix is mandatory and must be entered manually.
  - Click Test connection.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to step 6.
  - Click Save.
- Select GCS.
  - Enter your GCS bucket name.
  - Click Test connection.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to step 6.
  - Click Save.
- [Beta] Select Azure.
  - Enter your Connection name, Storage account name, and Key.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to step 6.
  - Click Save.
- [Beta] Select Yandex.
  - Enter your Bucket name, Access key, and Secret key.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to step 6.
  - Click Save.
- Select BigQuery.
  - Enter your BigQuery project ID and dataset name.
  - Click Test connection.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Click Save and continue to step 6.
- Select Snowflake.
  - Enter your Snowflake region and account ID.
  - Click Test connection.
  - Verify that no error message displays indicating that the bucket path is invalid.
  - Click Save and continue to step 6.
- Complete the Data settings section:
- Select the file format you want:
- [Default] Parquet
- CSV
- Select the file compression type you want:
- Snappy (only available for Parquet files)
- GZIP
- Uncompressed
- Select the max rows you want per file: 10k, 25k, 50k, 100k, 200k, or 500k. More rows per file means fewer files, but larger file sizes.
- Select folder structure (data segregation):
- [Default] Unified
- Segregated by app
- Complete the Data Locker content section:
- Select one or more Apps to include in the reports. Select all to automatically include apps added in the future.
- Click Apply.
- [Optional] Select one or more Media Sources to include in reports.
- Default=All. This means that media sources added in the future are automatically added.
- Click Apply.
- [Optional] Select the Fields to include in reports. Note: Sometimes we make additional fields available. Take this into account in your data import process.
- Click Apply.
- Select the report types. You must select at least 1.
- [Optional] For the In-app events report, select the in-app events to include. If you have more than 100 in-app event types, you can't search for them. Enter their names exactly to select them.
- Default=All. This means that in-app events added in the future are automatically added.
- Click Apply.
- Click Save connection. One of the following occurs:
- If you selected AppsFlyer AWS bucket:
- A dedicated AWS bucket is created. The bucket credentials display.
- The bucket is accessible using the credentials. The credentials provide you with read-only access to the bucket.
- If you selected one of your cloud storage services: Data will be written to your service within 3 hours.
- If you selected AppsFlyer AWS bucket:
Reset credentials
An admin can reset the AppsFlyer bucket credentials at any time. Note! If you reset the credentials, you must update your data import scripts with the updated credentials.
To reset the credentials of AppsFlyer owned storage:
- In AppsFlyer, go to Reports > Data Locker.
- Select the AppsFlyer-owned destination.
- In the Credentials section, click Reset credentials.
A confirmation window displays. - Click Reset.
- Wait (about 20 seconds) until the Credentials successfully reset message displays.
The updated credentials are available.
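Because a reset invalidates the previous credentials, it helps to keep credentials out of the scripts themselves. A minimal sketch (Python with boto3), using environment variable names of your own choosing:

```python
# Minimal sketch: read Data Locker credentials from the environment so a
# credentials reset only requires updating one place, not every script.
# The variable names are your own choice (shown here as examples).
import os
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id=os.environ["DATA_LOCKER_ACCESS_KEY"],
    aws_secret_access_key=os.environ["DATA_LOCKER_SECRET_KEY"],
    region_name="eu-west-1",
)
```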
Additional information
Traits and Limitations
Trait | Remarks |
---|---|
Ad networks | Not for use by ad networks |
Agencies | Not for use by agencies |
App-specific time zone | Not applicable. Data Locker folders are divided into hours using UTC, and the events themselves contain times in UTC; convert the times to any other time zone as needed. Irrespective of your app time zone, the delay from event occurrence until it is recorded in Data Locker remains the same. |
App-specific currency | Not supported |
Size limitations | Not applicable |
Data freshness | Data is updated according to the specific report data freshness detailed in this article. |
Historical data | Not supported. If you need historical data, some reports, but not all, are available via Pull API. |
Restricted data | Fields in some reports are restricted due to privacy limitations. Learn more |
User access | Only account users with required permissions can configure Data Locker. |
Single app/multiple apps | Multi-app support; Data Locker operates at the account level. |
Troubleshooting
- Symptom: Unable to retrieve data using AWS CLI
- Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
- Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can be caused by having multiple or invalid credentials on your machine.
- Solution:
  - Use a different method to access the bucket (not the CLI), such as Cyberduck, to verify that the credentials you are using work. If you can connect using Cyberduck, the issue is with the AWS CLI credentials cache.
  - Refresh the AWS credentials cache.
AWS data retrieval
Use your preferred AWS data retrieval tool: the AWS CLI or one of the tools described in the sections that follow. Note! The instructions as written are suitable for AppsFlyer-owned buckets; adjust them as needed if you are connecting to your own bucket.
AWS CLI
Before you begin:
- Install the AWS CLI on your computer.
- In AppsFlyer, go to Data Locker, and retrieve the information contained in the credentials panel.
To use AWS CLI:
- Open the terminal. To do so in Windows, press <Windows>+<R>, type cmd, and click OK. The command-line window opens.
- Enter aws configure.
- Enter the AWS Access Key as it appears in the credentials panel.
- Enter your AWS Secret Key as it appears in the credentials panel.
- For the default region name, enter eu-west-1.
- For the default output format, press Enter (None).
Use the CLI commands that follow as needed.
In the following commands, the value of {home-folder} can be found in the Data Locker credentials panel.
To list folders in your bucket:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/
Listing files and folders
There are three types of folders in your Data Locker bucket:
- Report type: t=
- Date: dt=
- Hour: h=
To list all the reports of a specific report type:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/
To list all the reports of a specific report type for a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17
To list all the reports of a specific report, in a specific hour of a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23
To download files for a specific date:
aws s3 cp s3://af-ext-reports/<home-folder>/data-locker-hourly/t=installs/dt=2020-08-01/h=9/part-00000.gz ~/Downloads/
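To script retrieval instead of using the CLI, the same listing and download can be done programmatically. A minimal sketch (Python with boto3); the {home-folder} placeholder works as in the CLI examples above:

```python
# Minimal sketch: download every file in one h= folder with boto3.
import os
import boto3

s3 = boto3.client("s3")
BUCKET = "af-ext-reports"
PREFIX = "{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9/"

os.makedirs("downloads", exist_ok=True)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        filename = os.path.basename(obj["Key"])
        if filename:  # skip the folder marker itself
            s3.download_file(BUCKET, obj["Key"], os.path.join("downloads", filename))
```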
Cyberduck
Before you begin:
- Install the Cyberduck client.
- In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel.
To configure Cyberduck:
- In Cyberduck, click Action.
- Select New Bookmark. The window opens.
- In the first field, select Amazon S3.
- Complete the fields as follows:
- Nickname: Free text
- Server: s3.amazonaws.com
- Access Key ID: Copy the AWS Access Key as it appears in the credentials panel in AppsFlyer
- Secret Access Key: Copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
- Path: {Bucket Name}/{Home Folder} For example: af-ext-reports/1234-abc-ffffffff
- Close the window. To do so, click the X in the upper-right corner of the window.
- Select the connection.
The data directories are displayed.
Amazon S3 browser
Before you begin:
- Install the Amazon S3 Browser.
- In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel.
To configure the Amazon S3 Browser:
- In the S3 browser, Click Accounts > Add New Account.
The Add New Account window opens. - Complete the fields as follows:
- Account Name: free text.
- Access Key ID: copy the AWS Access Key as it appears in the credentials panel.
- Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
- Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
- Select Use secure transfer.
- Click Save changes.
- Click Buckets > Add External Bucket.
The Add External Bucket window opens.
- Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for bucket name and home folder appear in the credentials window.
- Click Add External bucket.
The bucket is created and displays in the left panel of the window.
You can now access the Data Locker files.