At a glance: Data Locker sends your report data to cloud storage for loading into your BI systems. You can choose between storage destinations: an AppsFlyer-owned bucket in AWS, or storage you own in AWS, GCS, Azure, Yandex, BigQuery, or Snowflake. Data Locker supports multiple destinations, meaning you can send all data to multiple destinations, segregate data by destination, or a combination of both.
Overview
In Data Locker select your apps, media sources, events, and reports to include in the data AppsFlyer delivers to your selected cloud storage options. Then, load data programmatically from the storage into your systems.
Data Locker—features
Feature | Description |
---|---|
Storage options (cloud) | Data Locker can send your data to any of the supported cloud service providers: AWS, GCS, Azure, Yandex, BigQuery, and Snowflake. You can set more than 1 destination, meaning you can send all or some of your data to multiple destinations. |
Multi-app | Send data of 1, more, or all apps in your account. When you add apps to the account, they can be automatically included. |
Availability window | 14 days |
Data segregation | Data can be unified or segregated by app (relevant for bucket cloud storage). |
Data format options | Parquet (default) or CSV, with Snappy or GZIP compression. |
Data freshness | Freshness depends on the report type (hourly, daily, or versioned). |
Reports available via Data Locker
Set Data Locker report settings
To configure Data Locker, follow these steps to connect your cloud service, define export settings, and customize report content:
1. Set up your cloud service
You can connect your Data Locker to one or more cloud service providers. See the following for instructions on how to configure them to work with Data Locker:
- AWS bucket
- GCS bucket
- [Beta] Azure Blob
- [Beta] Yandex bucket
- BigQuery data warehouse
- Snowflake data warehouse
Note! If you don't have a Data Locker subscription and you access Cohorts analytics or SKAN data, you must still complete the marketer-owned cloud storage setup procedure.
2. Add a connection to your cloud service
After configuring your cloud service account to work with Data Locker (see "Set up your cloud service" above), create a connection in Data Locker using the credentials from your account. You can create up to two connections.
To create a connection for your cloud provider perform the following steps:
- In AppsFlyer, from the sidebar, go to Exports > Data Locker.
- On the right-hand side, click New connection.
- In Connection name enter the name for your connection. Use only lowercase letters, digits, and hyphens.
- Click the icon of the cloud service to which you want to connect.
- Depending on the service you selected, enter the following connection information.
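The connection-name rule in the steps above (lowercase letters, digits, and hyphens only) can be checked before pasting a name into the UI. A minimal sketch; `is_valid_connection_name` is a hypothetical helper, not part of any AppsFlyer tooling:

```shell
# Hypothetical check for the Data Locker connection-name rule:
# only lowercase letters, digits, and hyphens are allowed.
is_valid_connection_name() {
  case "$1" in
    "" | *[!a-z0-9-]*) return 1 ;;  # reject empty names and disallowed characters
    *) return 0 ;;
  esac
}
```

For example, `is_valid_connection_name prod-aws-01` succeeds, while `is_valid_connection_name Prod_AWS` fails.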
AWS cloud bucket connection
Before setting the AWS connection, create an AWS bucket. To learn how, see here.
To set the connection:
- Enter your AWS S3 bucket name. The af- prefix is mandatory and must be entered manually.
- Click Test connection.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to select global-level filters.
- Click Save.
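Since the af- prefix is mandatory, a quick local sanity check can save a failed Test connection round trip. A sketch; `has_af_prefix` is a hypothetical helper:

```shell
# Hypothetical check that an S3 bucket name carries the mandatory af- prefix.
has_af_prefix() {
  case "$1" in
    af-?*) return 0 ;;  # af- followed by at least one more character
    *) return 1 ;;
  esac
}
```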
GCS cloud bucket connection
Before setting the GCS connection, create a bucket on GCS. To learn how, see here.
To set the connection:
- Enter your GCS bucket name.
- Click Test connection.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to select global-level filters.
- Click Save.
Azure cloud bucket connection
Before setting the Azure connection, open a storage account in Azure. To learn how, see here.
To set the connection:
- Enter your Connection name, Storage account name, and Key.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to select global-level filters.
- Click Save.
Yandex Cloud bucket connection
Before setting the Yandex connection, create a service account in Yandex. To learn how, see here.
To set the connection:
- Enter your Bucket name, Access key, and Secret key.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Select whether to Make this connection compatible with Adobe Experience Platform. If selected, click Save and continue to select global-level filters.
- Click Save.
BigQuery data warehouse connection
Before setting the BigQuery connection, create a dataset in BigQuery. To learn how, see here.
To set the connection:
- Enter your BigQuery project ID and dataset name.
- Click Test connection.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Click Save and continue to select global-level filters.
Snowflake data warehouse connection
Before setting the Snowflake connection, open an account in Snowflake. To learn how, see here.
To set the connection:
- Enter your Snowflake region and account ID.
- Click Test connection.
- Verify that an error message indicating that the bucket path is invalid isn't displayed.
- Click Save and continue to select global-level filters.
3. Set the report output settings
After setting the connection with the cloud service, you can continue to set the general settings of your Data Locker reporting outputs. If your cloud service is BigQuery or Snowflake, you can skip this step.
- Under the Report output settings section, select the folder structure (data segregation):
- Unified (default): The report files include records from all the apps.
- Segregated by app: Each report file is dedicated to one app.
- Select the reports file format: Parquet (default) or CSV.
- Select the report's file compression type:
- Snappy (only available for Parquet files)
- GZIP
- Select the maximum row number you want in your file: either 10k, 25k, 50k, 100k, 200k, or 500k. More rows per file mean fewer files but a larger file size.
Note
Under Expected path, view the path patterns for your reports. Note: The real path may be different than what is displayed.
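As a rough sketch of the hourly path pattern (based on the bucket layout described later in this article; as noted, the real path may differ, and {home-folder} comes from your credentials panel):

```shell
# Sketch of the hourly report path pattern. The af-ext-reports bucket is the
# AppsFlyer-owned case; the arguments are placeholders you substitute.
build_report_path() {
  # $1=home folder, $2=topic, $3=date (YYYY-MM-DD), $4=hour
  printf 's3://af-ext-reports/%s/data-locker-hourly/t=%s/dt=%s/h=%s/' "$1" "$2" "$3" "$4"
}
build_report_path 1234-abc installs 2019-01-17 23
```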
4. Select global-level filters
The global-level filters allow you to filter your reports by apps or media sources. These filters apply to most of the reports in your Data Locker account, but you can also set them at the report level (see Select the report-level filters below). If the same filter is applied on both levels, the report-level filter takes precedence.
To apply a filter, perform the following:
- In the Reports section, click the filter and select the items to include in the report. For example, click the Apps filter and select the apps to include in the reports.
- Then click the Enter (⏎) button.
5. Select the report
Select the reports that you want to get in your cloud service. The reports are listed in groups. Clicking a report group name expands or collapses the group.
- To select a report, click the report group name to expand the report group. For each report in the group, the following information is presented:
  - Report Name: The title of the report.
  - Dataset Name: The name of the dataset that contains the report's records.
  - Data Freshness: How often the report is updated with new records (e.g., hourly, daily, or versioned).
  - Fields: The number of fields (columns) that you selected for the report compared to the total number of fields available for selection.
6. Select the report fields
Each report offers a complete set of fields from which you can select only those you want to include. By default, all the report fields are selected.
To select the fields to include in the report:
- Under Select reports, hover over the specific report that you want to customize.
- Click ⋮ to open the actions menu, and select Edit report.
- In the selected report dialog, under the Fields tab, hover over any field to view its description.
- Check the fields you want to include in the report, or uncheck the fields you wish to exclude from the report.
- Click Apply to save your settings.
Copy field selection from another report
You can copy the field selection from another report as a starting point and then continue to select or deselect fields to fine-tune the report.
- In the Fields tab, deselect at least one field.
- Click Pull schema from report.
- Select the report you want to copy the field selection from.
- Continue to select or deselect fields.
- To restore the report's original field selection, click Refresh.
7. Select the report-level filters
The report-level filters enable you to filter a single report by apps, media sources, or other dimensions. You can also set filters that apply to all the reports in your account; see select global-level filters. By default, the report-level filters are set to the global-level filter settings, but you can update them to custom settings that apply only to the selected report.
To select the filters to apply to a specified report:
- Hover over the specific report that you want to customize.
- Click ⋮ to open the actions menu, and select Edit report.
- Open the Filters tab. The filters are set to the global-level filter settings.
- Click the filter and select the items to include in the report. For example, click the Apps filter and select the apps to include in the reports.
- Click the Enter (⏎) button. Your selection overrides the global-level settings.
- (Optional) For the Inapps report, you can set the In-app event filter. Enter the event names exactly as they appear to select them.
- Click Apply to save your settings.
8. Remove unused fields
Unused fields are those that were previously included in the report schema but are now excluded. We recommend removing these fields to ensure your report contains only relevant information. Before making any changes, make sure that your workflows and integrations do not depend on them.
To remove specific unused fields:
- Open the Unused fields tab.
- Turn on: Include unused fields in the report.
- Deselect the fields you want to exclude.
- Click Apply.
- Save the connection settings.
To remove all unused fields:
- Open the unused fields tab.
- Turn off: Include unused fields in the report.
Note
If you want to include unused fields in the report but can't because the unused fields list is grayed out and locked, contact your Customer Success Manager.
Non-empty unused fields
Most unused fields are empty or null. However, a few of them contain values but are still considered unused because either:
- They appear in the report under a different name (renamed).
- They were excluded from the report schema (deprecated).
9. Save the connection
Click Save, and the first data dump will be written to your cloud service within 3 hours. Subsequent data update schedules are specific to each report.
Important!
Any changes to Data Locker settings take up to 3 hours to take effect.
Data storage architecture
Overview
The structure of your data in storage depends on whether the data is sent to cloud storage or a data warehouse. The folder structure described here applies to bucket storage. In the case of data warehouse storage, references to folders apply to views.
Data is written to your selected storage option. In the case of cloud storage, the storage is owned by AppsFlyer on AWS or owned by you on AWS, GCS, or Yandex. You can switch storage options at any time or send some or all of your data to multiple storage options.
Data in the cloud bucket storage is organized in a hierarchical folder structure, according to report type, date, and time. The following figure contains an example of this structure:
Data of a given report is contained in the hour (h) folders associated with that report:
- The number of hour folders depends on the report data freshness (hourly, daily, or versioned).
- Data is provided in Snappy or GZIP compressed files, or uncompressed files, having Parquet or CSV format.
- Data files consist of columns (fields).
- The schema (field) structure of the user journey reports is identical across those reports and depends on the fields you select. Other reports each have their own explicit fields. See Data Locker marketer reports for the reports available and links to the report specifications.
Folder structure
Folder | Description |
---|---|
Subscription ID | The top-level folder for your Data Locker subscription. The exact folder structure depends on the bucket owner and cloud provider. |
Topic (t) | Report type relates to the subject matter of the report. |
Date (dt) | The date the data relates to. In the case of raw data, the date the event occurred; in the case of aggregated data, the reporting date itself. |
Time (h or version) |
Date folders are divided into hourly (h) or version folders, depending on the report type. Hourly folders: The h folders relate to the time the data was received by AppsFlyer. For example, install events received between 14:00-15:00 UTC are written to the h=14 folder. Note! There is a delay of about 1-3 hours between the time the data arrives in AppsFlyer and the time the h folder is written to Data Locker. For example, the h=14 folder is written at 15:00 UTC at the earliest. Hourly folder characteristics:
Version folders: Some reports have a versioned option. This means that the most updated data for a given day is provided multiple times. Because data can continue to update due to late-arriving or more accurate data, the same report has multiple versions, where the most recent version is the most accurate. The reports for a given day are contained in the version folders of that day. Each version is contained in a separate folder whose name is an Epoch timestamp that uniquely identifies the report. Your data import processes must consider that data can be written retroactively. For example, on January 14, data can be written to the January 1 folder. If the bucket is owned by you, consider using cloud service notifications to trigger your import process (AWS | GCS). |
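Because version folder names are Epoch timestamps, the most recent (most accurate) version for a day can be picked with a numeric sort. A minimal sketch; the timestamps below are illustrative:

```shell
# Pick the most recent version for a day: Epoch timestamps sort numerically,
# and the largest timestamp is the newest, most accurate version.
latest_version() {
  tr ' ' '\n' | sort -n | tail -n 1
}
echo "1673740800 1673762400 1673751600" | latest_version
```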
App segregation
For bucket cloud storage, data is provided in unified data files containing the data of all apps selected or segregated into folders by app. The segregation is within the h folder as described in the table that follows.
Segregation type | Description |
---|---|
[Default] Unified |
Data for all apps is provided in unified data files. When consuming the data, use the row-level app_id field to distinguish between apps. Example data files are in the h=2 folder. The data file naming convention is unique_id.gz.
|
Segregated by app |
The folder contains subfolders per app. Data files for a given app are contained within the app folder. In the figure that follows, the h=19 folder contains app folders. Each app folder contains the associated data files. Note! The data files don't contain the app_id; you must determine the app_id using the folder name. In each app folder, the naming convention is unique_id.gz.
Limitation: This option is not available for People-Based Attribution reports. |
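Because segregated data files don't contain app_id, your loader must derive it from the folder path. A sketch, assuming app folders are named app_id=&lt;bundle id&gt; (an assumption; verify the actual folder naming in your bucket):

```shell
# Derive app_id from an object key in the app-segregated layout.
# The app_id=<bundle id> folder naming is an assumption; check your bucket.
app_id_from_key() {
  printf '%s\n' "$1" | sed -n 's#.*/app_id=\([^/]*\)/.*#\1#p'
}
app_id_from_key "t=installs/dt=2020-08-01/h=19/app_id=com.example.game/part-0001.gz"
```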
Data files
Data files depend on segregation type.
Content | Details |
---|---|
Completion flag | The last file (completion) flag is set when all the data for a given h folder has been written. |
File types | Parquet or CSV files, compressed (Snappy or GZIP) or uncompressed, per your report output settings. |
Column sequence (CSV files) | In the case of CSV files, the sequence of fields in reports is always the same. When we add new fields, they are added to the right of the existing fields. |
Field population considerations | Blank or empty fields: Some fields are populated with null or are empty, meaning there is no data to report in the context of a given report. Typically, null means the field is not populated in the context of a given report and app type; blank ("") means the field is relevant in its context but no data was found to populate it. In the case of restricted media sources, the content of restricted fields is set to null. Overall, regard null and blank as the same thing: no data is available. Time zone and currency: App-specific time zone and currency settings have no effect on data written to Data Locker. Values with commas: Values containing commas are enclosed in double quotes ("). |
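The completion flag described above implies an ingestion rule: don't load an h folder until its flag file exists. A sketch; the flag file name (_SUCCESS) and the polling commands are assumptions, so check your bucket for the actual flag file name:

```shell
# Build the completion-flag key for an hourly folder prefix.
# The _SUCCESS file name is an assumption; verify it against your bucket.
flag_key() {
  printf '%s/_SUCCESS' "${1%/}"   # strip a trailing slash, then append the flag name
}
# Usage sketch (not run here): poll for the flag, then sync the folder.
#   until aws s3 ls "$(flag_key "$PREFIX")" >/dev/null 2>&1; do sleep 60; done
#   aws s3 sync "$PREFIX" ./incoming/
```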
Storage options
Caution!
If you are using the marketer-owned storage option:
- Verify that you comply with data privacy regulations like GDPR and ad network/SRN data retention policies.
- Don't use the marketer-owned storage solution to send data to third parties.
- Data is written to a storage owner of your choice as follows:
  - AppsFlyer storage
  - Customer storage—AWS, GCS, Azure, Yandex, BigQuery, and Snowflake
- You can change the storage selection at any time. If you change the storage, the following happens:
  - We start writing to the newly selected storage within one hour.
  - We continue writing to the existing storage during a transition period of 7 days. The transition period expiry time displays in the user interface. Use the transition period to update your data loading processes. You can restart the transition period or revert to the AppsFlyer bucket if needed.
- Changing storage: You can migrate from one storage option to another by using the multi-storage option and sending data to multiple destinations simultaneously. Once you have completed the migration and testing, delete the storage option you no longer need.
| AppsFlyer-owned storage (AWS) | Marketer-owned storage (GCS, AWS, Azure, Yandex, BigQuery, Snowflake) |
---|---|---|
Bucket name | Set by AppsFlyer | Set by the marketer. Example: af-datalocker-your-bucket-name |
Storage ownership | AppsFlyer | Marketer |
Storage platform | AWS | AWS, GCS, Azure, Yandex, BigQuery, Snowflake |
Credentials to access data by you | Available in the Data Locker user interface to your AppsFlyer account admins | Not known to AppsFlyer. Use credentials provided by the cloud provider. |
Data retention | Data is deleted after 14 days | Marketer responsibility |
Data deletion requests | AppsFlyer responsibility | Marketer responsibility |
Security | AppsFlyer controls the storage. The customer has read access. | The marketer controls the storage. |
Storage capacity | Managed by AppsFlyer | Managed by the marketer |
Access control using VPC endpoints with bucket policies | Not Applicable | [Optional] In AWS, if you implement VPC endpoint security at the bucket level, you must allowlist AppsFlyer servers. |
Notice to security officers in the case of customer-controlled storage
Consider:
- The bucket or destination is for the sole use of AppsFlyer. There should be no other entity writing to a given destination.
- You can delete data in the destination 25 hours after we write the data.
- Data written to the destination is a copy of data already in our servers. The data continues to be in our servers in accordance with our retention policy.
- For technical reasons, we sometimes delete and rewrite the data. For this reason, we need delete and list permissions. Neither permission is a security risk for you: in the case of list, we are the sole entity writing to the bucket; in the case of delete, we are able to regenerate the data.
- For additional information, you can contact our security team via hello@appsflyer.com or your CSM.
Multiple-connections principles (more than one destination)
In Data Locker you can send some or all of your data to 2 destinations (defined in the connection settings). For example, you can send App A data to AWS, and App B data to GCS.
Each connection consists of a complete set of Data Locker settings, including a destination. Connection settings are independent of one another.
In managing your connections, consider:
- In Data Locker settings, connections are shown in tabs. Each connection has its own settings tab from which you can manage the connection. The icon of each tab represents the storage type.
- To see connection details, duplicate a connection, or delete a connection, click ⋮ (options).
Additional information
Traits and Limitations
Trait | Remarks |
---|---|
Ad networks | Not for use by ad networks |
Agencies | Not for use by agencies |
App-specific time zone | Not applicable. Data Locker folders are divided into hours using UTC. The actual events contain times in UTC. Convert the times to any other time zone as needed. Irrespective of your app time zone, the delay from event occurrence until it is recorded in Data Locker remains the same. |
App-specific currency | Not supported |
Size limitations | Not applicable |
Data freshness | Data is updated according to the specific report data freshness detailed in this article. |
Historical data | Not supported. If you need historical data, some reports, but not all, are available via Pull API. |
Restricted data | Fields in some reports are restricted due to privacy limitations. Learn more |
User access | Only account users with required permissions can configure Data Locker. |
Single app/multiple app | Multi-app support; Data Locker is configured at the account level. |
Troubleshooting
- Symptom: Unable to retrieve data using the AWS CLI.
- Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
- Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can be caused by having multiple or invalid credentials on your machine.
- Solution:
  - Use a different method, like Cyberduck (not the CLI), to access the bucket. Do this to verify that the credentials you are using work. If you are able to connect using Cyberduck, this indicates an issue with the credentials cache.
  - Refresh the AWS credentials cache.
AWS data retrieval
Use your preferred AWS data retrieval tool, the AWS CLI, or one of the tools described in the sections that follow. Note! The exact instructions are suitable for AppsFlyer-owned buckets. Adjust the instructions as needed if you are connecting to your own bucket.
AWS CLI
Before you begin:
- Install the AWS CLI on your computer.
- In AppsFlyer, go to Data Locker, and retrieve the information contained in the credentials panel.
To use AWS CLI:
- Open the terminal. To do so in Windows, press Windows+R, then click OK. The command line window opens.
- Enter aws configure.
- Enter the AWS Access Key as it appears in the credentials panel.
- Enter your AWS Secret Key as it appears in the credentials panel.
- For the default region, enter eu-west-1.
- For the default output format, press Enter (None).
Use the CLI commands that follow as needed.
In the following commands, the value of {home-folder} can be found in the Data Locker credentials panel.
To list folders in your bucket:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/
Listing files and folders
There are three types of folders in your Data Locker bucket:
- Report type: t=
- Date: dt=
- Hour: h=
To list all the reports of a specific report type:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/
To list all the reports of a specific report type for a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17
To list all the reports of a specific report, in a specific hour of a specific day:
aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23
To download files for a specific date:
aws s3 cp s3://af-ext-reports/<home-folder>/data-locker-hourly/t=installs/dt=2020-08-01/h=9/part-00000.gz ~/Downloads/
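After downloading an hour's files (for example with aws s3 cp as above, or aws s3 sync for the whole folder), GZIP parts need decompressing before loading. A minimal sketch with an illustrative local path:

```shell
# Decompress every .gz part in a local folder, keeping the originals (-k).
decompress_parts() {
  for f in "$1"/*.gz; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    gunzip -kf "$f"           # -f overwrites an existing decompressed copy
  done
}
# Usage sketch:
#   aws s3 sync "s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9" ./h9
#   decompress_parts ./h9
```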
Cyberduck
Before you begin:
- Install the Cyberduck client.
- In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel.
To configure Cyberduck:
- In Cyberduck, click Action.
- Select New Bookmark. The bookmark window opens.
- In the first field, select Amazon S3.
- Complete the fields as follows:
- Nickname: Free text
- Server: s3.amazonaws.com
- Access Key ID: Copy the AWS Access Key as it appears in the credentials panel in AppsFlyer
- Secret Access Key: Copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
- Path: {Bucket Name}/{Home Folder} For example: af-ext-reports/1234-abc-ffffffff
- Close the window. To do so, click the X in the upper-right corner of the window.
- Select the connection.
The data directories are displayed.
Amazon S3 browser
Before you begin:
- Install the Amazon S3 Browser.
- In AppsFlyer, go to Data Locker and retrieve the information contained in the credentials panel.
To configure the Amazon S3 Browser:
- In the S3 browser, click Accounts > Add New Account. The Add New Account window opens.
- Complete the fields as follows:
- Account Name: free text.
- Access Key ID: copy the AWS Access Key as it appears in the credentials panel.
- Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
- Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
- Select Use secure transfer.
- Click Save changes.
- Click Buckets > Add External Bucket. The Add External Bucket window opens.
- Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for the bucket name and home folder appear in the credentials window.
- Click Add External bucket. The bucket is created and displays in the left panel of the window.
You can now access the Data Locker files.