Data Locker—raw data delivery

At a glance: Data Locker writes raw data to an AWS S3 bucket in near real time (with a lag of about 6 hours). Data can be written to a bucket provided by AppsFlyer or directly to your own bucket.

Data Locker

Data Locker main features

  • Apps: supports multiple apps, which are added automatically
  • Simplicity: data is written to an Amazon S3 bucket
  • Reliability: data is stored in AWS, which ensures data durability
  • Flexibility: choose what data you want to include
  • Data granularity: data is split by report type, date, and hour
  • Accessibility: pull the data when you need it
  • Data freshness: 6-hour lag or daily, depending on the report type. The lag time is the same (6 hours) irrespective of the app-specific time zone.
  • Bucket ownership:
    • Get the data via an AppsFlyer owned bucket. Data retention: 30 days.
    • AppsFlyer writes the data directly to your bucket. Data retention: Controlled by you. 

Reports available in Data Locker

UA and retargeting
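Category | Report type (topic) | Data freshness | Organic/Non-organic | Unique to Data Locker
User acquisition | Clicks | 6-hour lag | N/A |
Retargeting | Clicks | 6-hour lag | N/A |
User acquisition | Impressions | 6-hour lag | N/A |
Retargeting | Impressions | 6-hour lag | N/A |
User acquisition | Installs | 6-hour lag | Both |
User acquisition | In-app events | 6-hour lag | Both |
User acquisition | Attributed ad revenue | Daily+2 | Non-organic |
User acquisition | Organic ad revenue | Daily+2 | Organic |
Retargeting | Retargeting ad revenue | Daily+2 | Non-organic |
Retargeting | Conversions | 6-hour lag | Non-organic |
Retargeting | In-app events | 6-hour lag | Non-organic |
Retargeting | Sessions | 6-hour lag | Both |
User acquisition | Sessions | 6-hour lag | Both |
User acquisition | Uninstalls | Daily | Non-organic |
User acquisition | Organic uninstalls | Daily | Organic |
Reinstalls | Reinstalls | 6-hour lag | Non-organic |
Reinstalls | Organic reinstalls | 6-hour lag | Organic |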
Protect360
Report type (topic) | Data freshness
Blocked installs | 6-hour lag
Blocked in-app events | 6-hour lag
Blocked clicks | 6-hour lag
[FF*] [AG*] Post-attribution installs | Daily
SKAdNetwork
Data freshness: Daily 
Report type (topic)
[FF*] Postbacks
[FF*] Installs
[FF*] Redownloads
[FF*] In-app events
People-Based Attribution
Data freshness: Daily
Report type (topic)
[FF*] Website visits
[FF*] Website events
[FF*] Website-assisted installs
[FF*] Conversion Paths
* Key to abbreviations

[FF] Report fields are fixed by AppsFlyer. They are not affected by the fields you select for inclusion in reports.

[AG] Agency transparency not supported.

6-hour lag: Data is separated into arrival-hour folders. The arrival hour is the hour in which the event was deposited in Data Locker. For real-time events, some Data Locker folders are written about six hours after the actual event time. There are 24 folders, one for each hour of the day (0 to 23), and an additional folder for data that arrives late. The lag time is the same irrespective of the app-specific time zone.

Daily: Reports with daily data freshness are written to the h=23 folder. They are typically available by 10:00-12:00 UTC on the following day, in the h=23 folder of the day the data was generated. For example, the report for data generated on Monday is in the Monday h=23 folder and is available after 10:00 UTC on Tuesday.

Daily+2: Ad revenue data is available after 2 days. This means that data generated on Monday becomes available in the Monday h=23 folder after 06:00 UTC on Wednesday.
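
The availability rules for daily reports can be expressed as a small calculation. The following is a minimal sketch, not an AppsFlyer utility: the function names and the freshness labels used as dictionary keys are illustrative assumptions, and the times are the earliest ones stated above, so a scheduler may still need to retry.

```python
from datetime import date, datetime, time, timedelta, timezone

# Earliest availability, taken from the descriptions above:
# (days after the data date, UTC hour from which the report can be expected).
AVAILABILITY = {
    "daily": (1, 10),    # Monday's data: after 10:00 UTC on Tuesday
    "daily+2": (2, 6),   # Monday's data: after 06:00 UTC on Wednesday
}


def report_folder(data_date: date) -> str:
    """Daily and Daily+2 reports are written to the h=23 folder of the data date."""
    return f"dt={data_date.isoformat()}/h=23"


def earliest_availability(data_date: date, freshness: str) -> datetime:
    """Return the earliest UTC time at which the report is expected to exist."""
    days, hour = AVAILABILITY[freshness]
    return datetime.combine(
        data_date + timedelta(days=days), time(hour), tzinfo=timezone.utc
    )


monday = date(2020, 11, 9)
print(report_folder(monday))
print(earliest_availability(monday, "daily").isoformat())
print(earliest_availability(monday, "daily+2").isoformat())
```

A loader could wait until the returned time before checking the h=23 folder.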

Data Locker architecture

Data partition

AppsFlyer creates an AWS principal (ARN in Amazon terms) and generates credentials for that principal. A policy is then set allowing the principal to browse and retrieve files from their bucket.

In the bucket, data is organized by report type. The data for a given report is stored in its folder. 

Folder and file structure

  • Folder structure is: af-ext-reports/<Home Folder>/data-locker-hourly/t=<event-type>/dt=<date YYYY-MM-dd>/h=<Hour h>
    • The Home Folder is the value that appears in the Credentials window (see Set up Data Locker).
    • For example, for the date 2016-08-12 the report appears in: s3://af-ext-reports/12345678911-acc-1abc234/data-locker-hourly/t=installs/dt=2016-08-12/
  • The day folder dt=yyyy-mm-dd is split into 25 folders. These folders represent the hour in which the event arrived, not the hour in which the event occurred. The folders are named h=0, h=1, h=2, and so on, up to h=23, plus h=late. For example, the folder h=0 contains the events that arrive between 00:00 UTC and 01:00 UTC. Similarly, the folder h=20 contains the events that arrive between 20:00 UTC and 21:00 UTC.
  • In each folder:

    • Data is split into multiple files to avoid large files. File names are: part-00000, part-00001, part-00002, and so on. There can be up to 1000 files. We may increase this maximum number in the future without advance notice.

    • The last file to be written is an empty file named _SUCCESS. This file is a flag indicating that no further data will be written to the folder. As such, do not read data in a folder before verifying that the _SUCCESS file exists. Note: The _SUCCESS flag is also written in cases where there is no data to be written to the folder.

  • The late folder

    • The late folder contains events of the preceding day that arrived after midnight, meaning between 00:00 and 02:00 UTC of the following day. For example, a user installs an app on Monday at 08:00, and the event arrives on Tuesday at 01:00. The event is recorded in Monday's late folder.

    • The folder also contains the _SUCCESS flag as described previously.

    • Data in the late folder is not recorded in any other folder.

    • Automated processes should look for data in the late folder in the same way as in the other folders.
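
To illustrate the folder layout, the following minimal sketch builds the 25 folder locations for one report type and day, including h=late. The function names are illustrative assumptions. The bucket name and the example home folder are taken from the example path above.

```python
from datetime import date

BUCKET = "af-ext-reports"  # AppsFlyer-owned bucket, as in the example path above


def hour_folders():
    """Return the 25 folder names of a day: h=0 .. h=23, then h=late."""
    return [f"h={hour}" for hour in range(24)] + ["h=late"]


def day_prefixes(home_folder, report_type, day):
    """Return the S3 URI of every arrival-hour folder for one report type and day."""
    base = (
        f"s3://{BUCKET}/{home_folder}/data-locker-hourly/"
        f"t={report_type}/dt={day.isoformat()}"
    )
    return [f"{base}/{folder}/" for folder in hour_folders()]


for prefix in day_prefixes("12345678911-acc-1abc234", "installs", date(2016, 8, 12)):
    print(prefix)
```

Iterating over all 25 locations ensures that late-arriving events are picked up together with the hourly ones.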

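File structure and format

  • Data Locker files are based on Raw Data Reports V5 (see: Raw Data Reports V5).
  • The actual data files are in CSV format but have no file extension.
  • The report files are compressed in .gz format.
  • Each file has a header row.
  • Values containing commas are enclosed in double quotes, for example `"iPhone6,1"`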
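
Given this format, a part file can be read with standard gzip and CSV tooling. The following minimal sketch assumes the file content is already in memory as bytes and is UTF-8 encoded. The encoding is an assumption, and the column names in the sample are hypothetical stand-ins for whichever fields you selected.

```python
import csv
import gzip
import io


def read_report_rows(raw_bytes):
    """Decompress one gzip-compressed part file and yield each row as a dict.

    Rows are keyed by the header row, so the loader does not depend on
    column positions.
    """
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8", newline="") as fh:
        # The csv module handles values wrapped in double quotes, such as "iPhone6,1".
        yield from csv.DictReader(fh)


# Hypothetical two-column sample standing in for a downloaded part file.
SAMPLE = gzip.compress(b'app_id,device_model\ncom.example.app,"iPhone6,1"\n')

for row in read_report_rows(SAMPLE):
    print(row["app_id"], row["device_model"])
```

Reading by header name rather than by position keeps the import working when fields are added to the reports.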

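Using reports as a data source

You can take the data in the reports and add it to your own database. To extract the data and load it into a database, you need to understand the report format. Data Locker reports are based on raw data reports. However, the final report format depends on the fields you choose to include.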

Some fields are populated with null or are empty. This means that in the context of a given report there is no data to report. In general, null means this field is not populated in the context of a given report and app type. Blank "" means the field is relevant in its context but no data was found to populate it with. 

Time zone and currency

App-specific time zone and currency settings have no effect on the data in Data Locker.

  • Time zone: Data Locker reports use the UTC time zone.
  • Currency: The event_revenue_usd field is in USD.
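
Because all times are in UTC, any conversion to a local time zone happens on your side. The following minimal sketch converts a UTC timestamp string to a fixed UTC offset. The timestamp format (YYYY-MM-DD HH:MM:SS) and the function name are assumptions, so adjust them to the time fields actually present in your reports.

```python
from datetime import datetime, timedelta, timezone


def utc_to_offset(event_time_utc: str, offset_hours: float) -> datetime:
    """Parse a UTC timestamp string and express it at a fixed UTC offset."""
    # Assumed timestamp format; check it against your own report files.
    parsed = datetime.strptime(event_time_utc, "%Y-%m-%d %H:%M:%S")
    as_utc = parsed.replace(tzinfo=timezone.utc)
    return as_utc.astimezone(timezone(timedelta(hours=offset_hours)))


print(utc_to_offset("2020-08-01 09:15:00", 9).isoformat())
```

A fixed offset ignores daylight saving time. For zones that observe it, a time zone database lookup is the safer choice.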

AppsFlyer AWS bucket vs. Customer AWS Bucket

 Caution!

If you are using the Customer AWS bucket solution:

  • Ensure that you comply with data privacy regulations like GDPR and ad network/SRN data retention policies.
  • Don't use the Customer AWS Bucket solution to send data to third parties. 
  • Data is written to a bucket whose owner you choose. The options are:
    • AppsFlyer AWS bucket
    • Customer AWS bucket
  • You can change the bucket owner selection at any time:
    • Move from an AppsFlyer AWS bucket to a Customer AWS bucket in the user interface. The change takes effect within 1 hour. This means we stop writing data to one bucket and start to write data to the newly selected bucket. 
    • If you want to stop using your Customer Bucket, select the AppsFlyer bucket. 
Property | AppsFlyer AWS bucket | Customer AWS bucket
Bucket name | Set by AppsFlyer | Set by you. Must have the prefix af-datalocker-. Example: af-datalocker-your-bucket-name
Bucket ownership | AppsFlyer | Customer
Storage platform supported | AWS | AWS
Credentials to access data by you | Available to the admin in the Data Locker user interface | Not known to AppsFlyer. Use your AWS credentials.
Data retention | Data is deleted after 30 days | Your responsibility
Data deletion requests | AppsFlyer responsibility | Your responsibility
Security | AppsFlyer controls the bucket. The customer has read access. | The customer controls the bucket. AppsFlyer requires GetObject, ListBucket, DeleteObject, and PutObject permissions on the bucket. The bucket should be dedicated to AppsFlyer use. Don't use it for other purposes.
Storage space | Managed by AppsFlyer | Managed by you

Procedures

Set up Data Locker

Use this procedure to set up Data Locker.

Prerequisite for setting up a Customer AWS bucket:

If you are setting up Data Locker using your Customer AWS bucket, meaning a bucket owned by you, you must first complete setting up your AWS S3 bucket.

To set up Data Locker:

  1. The admin needs to perform the setup. 
  2. In AppsFlyer, go to Integration > Data Locker.
  3. Choose the Amazon S3 integration method. Select one of the following:
    • AppsFlyer AWS bucket. Continue to step 4.
    • Customer AWS bucket.
      1. Enter your AWS bucket name. Don't enter the prefix af-datalocker-.
      2. Click Test.
      3. Verify that no error message displays indicating that the bucket path is invalid.
  4. Select one, several, or all apps. Select all to automatically include apps you add in the future.
  5. Click Apply.
  6. [optional] Media Sources: Select one or more Media Sources to include in reports.
    • Default=All. This means that media sources added in the future are automatically added.
  7. Select one or more report types.
  8. [optional] In-app events: Select the in-app events to include. If you have more than 100 in-app event types, you can't search for them. Enter their names exactly to select them. 
    • Default=All. This means that in-app events added in the future are automatically added.
  9. Click Apply
  10. [optional] Fields (default=All): Select the fields to include in the reports. Note: We add fields from time to time. Take this into account in your data import process.
  11. Click Save Configuration. One of the following occurs:
    • If you selected AppsFlyer AWS bucket:
      • A dedicated AWS bucket is created. The bucket credentials display.
      • The bucket is accessible using the credentials. The credentials provide you with read-only access to the bucket.
    • If you selected Customer AWS bucket:
      • Data will start being written to your AWS bucket within 1-2 hours.

Set up Data Locker: your AWS S3 bucket

The procedure in this section must be performed by your AWS admin.

You can delete files from Data Locker 25 or more hours after they were written. Please don't delete them before. 

Background information for the AWS admin: 

  • AppsFlyer writes your data to an S3 bucket owned by you. To enable this, do the following:
    • Create a bucket with a name such as af-datalocker-mybucket. The prefix af-datalocker- is mandatory. The suffix is free text.
  • We suggest the name format af-datalocker-yyyy-mm-dd-hh-mm-free-text, where yyyy-mm-dd-hh-mm is the current date and time, followed by any other text you want, as depicted in the figure that follows.
    User interface in the AWS console

    MyBucket.jpg

  • Having created the bucket, grant AppsFlyer permissions using the procedure that follows. 

To create a bucket and grant AppsFlyer permissions: 

  1. Sign in to the AWS console.
  2. Go to the S3 service.
  3. To create the bucket:
    1. Click Create bucket.
    2. Complete the Bucket name as follows: Start with af-datalocker- and then add any other text as described previously.
    3. Click Create bucket.
  4. To grant AppsFlyer permissions:
    1. Select the bucket. 
    2. Go to the Permissions tab. 
    3. In the Bucket policy section, click Edit. 
      The Bucket policy window opens.
    4. Paste the following snippet into the window.
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "AF_DataLocker_Direct",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
            },
            "Action": [
              "s3:GetObject",
              "s3:ListBucket",
              "s3:DeleteObject",
              "s3:PutObject"
            ],
            "Resource": [
              "arn:aws:s3:::af-datalocker-my-bucket",
              "arn:aws:s3:::af-datalocker-my-bucket/*"
            ]
          }
        ]
      }
      
  5. In the snippet, replace af-datalocker-my-bucket with the name of the bucket you created.

  6. Click Save changes.

  7. Complete the Setup Data Locker procedure.
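
To avoid hand-editing the bucket name in two places, the policy can be generated. The following minimal sketch reproduces the policy snippet above for a given bucket name and rejects names that lack the mandatory prefix. The function name and the example bucket name are illustrative assumptions. The principal and the actions are copied from the snippet.

```python
import json

REQUIRED_PREFIX = "af-datalocker-"
# Principal copied from the policy snippet in the procedure above.
APPSFLYER_PRINCIPAL = (
    "arn:aws:iam::195229424603:user/product=datalocker__envtype=prod__ns=default"
)


def build_bucket_policy(bucket_name):
    """Return the Data Locker bucket policy as a dict for the given bucket name."""
    if not bucket_name.startswith(REQUIRED_PREFIX) or bucket_name == REQUIRED_PREFIX:
        raise ValueError(f"bucket name must be {REQUIRED_PREFIX} followed by free text")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AF_DataLocker_Direct",
                "Effect": "Allow",
                "Principal": {"AWS": APPSFLYER_PRINCIPAL},
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:PutObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# Hypothetical bucket name following the suggested naming format.
print(json.dumps(build_bucket_policy("af-datalocker-2021-01-31-09-00-example"), indent=2))
```

The printed JSON can be pasted into the Bucket policy window in place of the manually edited snippet.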

Reset credentials

The admin can reset the AppsFlyer AWS bucket credentials at any time. Note! If you reset the credentials you must update your data import scripts with the updated credentials.

To reset the credentials:

  1. In AppsFlyer, go to Integration > Data Locker.
  2. In the Credentials section, click Reset credentials.
    A confirmation window displays.
  3. Click Reset.
  4. Wait (about 20 seconds) until the Credentials successfully reset message displays.
    The updated credentials are available.

Data retrieval

Use your preferred S3 data retrieval tool, AWS CLI, or one of the tools described in the sections that follow.

AWS CLI

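Before you begin

  • Install the AWS CLI on your computer.
  • In the AppsFlyer dashboard, go to the Data Locker configuration page and get the information needed for this procedure from the credentials panel.

To use the AWS CLI

  1. Open the terminal. To do so in Windows, press <Windows>+<R>, enter cmd, and click OK.
    The command line window opens.
  2. Enter aws configure
  3. Enter the AWS Access Key as it appears in the credentials panel.
  4. Enter your AWS Secret Key as it appears in the credentials panel.
  5. For the default region name, enter eu-west-1
  6. For the default output format, press Enter (None).

Use the following CLI commands as needed.

In the commands that follow, the value of {home-folder} is the Home Folder shown in the credentials panel.

To list the folders in your bucket:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/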

List files and folders

There are three types of folders in the Data Locker bucket:

  • Report Type t=
  • Date dt=
  • Hour h=

To list all reports of a specific report type:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/

To list all reports of a specific report type for a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/

To list all reports of a specific report type for a specific hour of a specific day:

aws s3 ls s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2019-01-17/h=23/

To download a file for a specific date:

aws s3 cp s3://af-ext-reports/{home-folder}/data-locker-hourly/t=installs/dt=2020-08-01/h=9/part-00000.gz ~/Downloads/

Cyber Duck

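Before you begin

  • Install the Cyber Duck client.
  • In the AppsFlyer dashboard, go to the Data Locker configuration page and get the required information from the credentials panel. You need this information when you configure Cyber Duck.

To configure Cyber Duck

  1. In Cyber Duck, click Action.
  2. Select New Bookmark. The window opens.
  3. In the first field (marked [1] in the screenshot that follows), select Amazon S3.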

    DataDuckSmall2.png

  4. Complete the following fields:
    • Nickname: free text
    • Server: s3.amazonaws.com
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel in AppsFlyer.
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel in AppsFlyer.
    • Path: {Bucket Name}/{Home Folder}. For example: af-ext-reports/1234-abc-ffffffff
  5. Close the window using the X in the upper-right corner of the window.
  6. Select the connection.
    The data directories are displayed.

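Amazon S3 browser

Before you begin

  • Install the Amazon S3 browser.
  • In the AppsFlyer dashboard, go to the Data Locker configuration page and get the information needed for this procedure from the credentials panel.

To configure the Amazon S3 browser

  1. In the S3 browser, click Accounts > Add New Account.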
    The Add New Account window opens.

  2. Complete the following fields:
    • Account name: free text.
    • Access Key ID: copy the AWS Access Key as it appears in the credentials panel. 
    • Secret Access Key: copy the Bucket Secret key as it appears in the credentials panel.
    • Select Encrypt Access Keys with a password and enter a password. Make a note of this password.
    • Select Use secure transfer. 
  3. Click Save changes.
  4. Click Buckets > Add External Bucket.
    The Add External Bucket window opens.

  5. Enter the Bucket name. The Bucket name has the following format: {Bucket Name}/{Home Folder}. The values needed for bucket name and home folder appear in the credentials window. 
  6. Click Add External bucket.
    The bucket is created and displays in the left panel of the window.
    You can now access the Data Locker files. 

Additional information

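Traits and limitations

Trait | Remarks
Ad networks | Not for use by ad networks.
Agencies | Not for use by agencies.
App-specific time zone | Not applicable. Data Locker folders use the UTC time zone and are split by hour. The actual event time is in UTC. Convert the times to any other time zone as needed. Irrespective of your app time zone, the delay between the time an event occurs and the time it is recorded in Data Locker is the same: 6 hours.
App-specific currency | Not supported.
Size limitations | Not applicable.
Data freshness | Files are updated hourly, with a six-hour lag from the time the event occurred.
Historical data | Not supported. Event data is sent once Data Locker is configured. If you need historical data, use Pull API.
Team member access | Team members cannot configure Data Locker.
Single app/multi-app | Multi-app support. Data Locker is at the account level.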

Developer considerations

In preparing scripts for data loading into your systems consider the following:

  • Temporary folder:
    • In some cases a temporary folder remains. Disregard this folder. Example: /data-locker-hourly/t=inapps/dt=2020-11-13/h=2/_temporary/0/_temporary/.
    • Consume only folders having the _SUCCESS flag in them. 
  • Sequence of columns in reports: 
    • The sequence of fields in reports is always the same. When we add new fields these are added to the right of the existing fields. The field list in the user interface is sequenced accordingly.
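
These considerations can be folded into the step that decides which files to load. The following minimal sketch takes the object keys listed under one h= folder and returns the data files only if the _SUCCESS flag is present, skipping anything under a temporary folder. The function name is an illustrative assumption, and how you obtain the key listing (for example, from your S3 tool of choice) is left open.

```python
def ready_data_files(keys):
    """Return the part files to load from one h= folder, or [] if it is not complete.

    keys: every object key found under a single hourly (or late) folder.
    """
    # Drop anything that sits under a leftover temporary folder.
    visible = [
        key
        for key in keys
        if not any(segment.startswith("_temp") for segment in key.split("/")[:-1])
    ]
    names = {key: key.rsplit("/", 1)[-1] for key in visible}
    # The _SUCCESS flag is written last, so its absence means more data may follow.
    if "_SUCCESS" not in names.values():
        return []
    return sorted(key for key, name in names.items() if name.startswith("part-"))


listing = [
    "home/data-locker-hourly/t=inapps/dt=2020-11-13/h=2/part-00001.gz",
    "home/data-locker-hourly/t=inapps/dt=2020-11-13/h=2/part-00000.gz",
    "home/data-locker-hourly/t=inapps/dt=2020-11-13/h=2/_SUCCESS",
    "home/data-locker-hourly/t=inapps/dt=2020-11-13/h=2/_temporary/0/_temporary/part-00000.gz",
]
for key in ready_data_files(listing):
    print(key)
```

An empty result for a folder that has _SUCCESS but no part files is expected, because the flag is also written when there is no data.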

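Troubleshooting

  • Symptom: Unable to retrieve data using the AWS CLI
  • Error message: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
  • Cause: The AWS credentials being used are not the correct credentials for the AppsFlyer bucket. This can happen when there are multiple or invalid credentials on your machine.
  • Solution:
    1. Use a different method, such as Cyber Duck, to access the bucket instead of the CLI. Do this to verify that the credentials you are using work. If you can connect using Cyber Duck, the problem lies in the credentials cache.
    2. Refresh the AWS credentials cache.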