Cost ETL—granular campaign cost in a bucket

At a glance: Cost ETL, part of AppsFlyer Xpend, provides advertisers with campaign cost data having the best granularity available per media source. Data, updated 4 times daily, is delivered in an AWS S3 bucket ready for loading into advertiser BI systems. 

CostETL__1_.png

Cost ETL is currently in beta. There will be changes made to product setup and behavior during the coming weeks.

Cost ETL principles

  • Campaign cost data is written :
    • to an S3 bucket provided by AppsFlyer: for viewing, transferring, and loading cost data into your systems. Only you have access to the folder. 
    • to the bucket four times per day (data freshness: intraday) and contains 7 days data. 
  • Data is provided with guaranteed primary dimensions:
    • Summary
    • Geo: breakdown by country
    • Channel: media source channel, for example, in the case of Google—YouTube
    • Keywords as used by app users when searching
    • Site ID: publisher ID 
  • Reach out to your CSM or hello@AppsFlyer.com to :
    • Enable Cost ETL
    • Customize the report structure to your needs

Implementation

Report dimensions

  • Reports are at the app level, per day, per batch. 
  • Each time data is written to the bucket, meaning 4 times a day, all available data is written, including the history of the previous 7 days. Take this into consideration in your data loading process.
  • Reports structures are detailed in the file fields table. The structures are: 
    • Summary report is less granular (detailed) to enabling for easier and faster consumption.
    • Dimension reports have a primary dimension which is guaranteed. This means that the dimension is available for all media sources contained in that report. In contrast, if a media source does not provide the primary dimension data, that data is not included in the specific dimension report. Secondary dimensions are included, when available. They are not guaranteed.
    • The primary (guaranteed) dimensions available are:
      • Geo: Data filtered by country
      • Keywords: Word(s) used by the user for online search
      • Site ID: Publisher ID
      • Channel: Media source channel, for example, YouTube in the case of Google and Instagram in the case of Facebook.
    • You should use the dimension that best fits your business needs. This may differ depending on the media source.

Directory and filename structure

  • Data written to the S3 bucket has the directory and file structure described.
  • When Cost ETL completes writing to a directory, a flag is set, by creating a file, success.txt. This is always the most recent timestamp in the directory. 
  • Each time data is written, it includes data for the current day and the previous 6 days (referred to as 7 days in this article)
  • The number of files per folder differs as follows:
    • Summary: One file per folder containing 7 days of data.
    • Dimensions: 7 files per folder, one for each day

Example directory structure

ETLFileStructure.jpg

Directory structure

/acc-<advertiser account>/cost_etl/version/dt=<yyyy-mm-dd>/b=<n>/
<dimension>/<app id>/<file name>

Variable

Content

advertiser_account

You AppsFlyer account ID, for example, abc123

cost_etl Always cost_etl 
version Cost ETL version
date

Cost date

Format: yyyy-mm-dd

batch Number 1-4
dimension

Data dimension:

  • summary
  • geo
  • channel
  • keywords
  • site_id
app id App ID
Directory structure

File name structure 

cost_etl_<dimension>_<app id>_<from date>_<to date>_<file_date>.csv 

Variable

Content

cost_etl Always cost_etl
dimension
  • summary 
  • geo
  • channel
  • keywords
  • site_id
app id App ID
from date yyyy-mm-dd
to date yyyy-mm-dd 
file date yyyy-mm-dd
.csv Always .csv
File name structure 

Example

For the first data pull of June 23, 2020, the directory and file name structure is as follows:

s3://af-spend-export/acc-abc123/cost_etl/v1/dt=2020-06-23/b=1/geo/com.app.name/cost_etl_geo_com.app.name_2020-06-17_2020_06_17_2020-06-23.csv

File fields

Field

Remarks

Always populated

Format

Included in summary file

date

Date cost incurred reported by the media source

Yes

String yyyy-mm-dd

app_id

App id in the AppsFlyer platform

Yes

String

media_source

Media source responsible for display the ad

Yes

String

agency

Agency responsible for placing the ad

FALSE

String

-

campaign

Component of the advertising hierarchy

FALSE

String

-

campaign_id

Component of the advertising hierarchy

Yes

String

-

adset

Component of the advertising hierarchy

No

String

-

adset_id

Component of the advertising hierarchy

No

String

-

ad

Component of the advertising hierarchy

No

String

-

ad_id

Component of the advertising hierarchy

No

String

ad_account

  • Ad account from which cost data was pulled
  • Relevant for networks where connection is done using oAuth (logging in with your credentials), for example in Google or Facebook

No

String

-

impressions

Counted by AppsFlyer

No

Integer

clicks

Counted by AppsFlyer

No

Integer

reported_impressions

Counted by the Media source

No

Integer

reported_clicks

Counted by the Media source

No

Integer

installs

Counted by AppsFlyer

No

Integer

re_engagements

Counted by AppsFlyer

No

Integer

re_attributions

Counted by AppsFlyer

No

Integer

currency

Currency of advertiser spend as defined for the app in AppsFlyer

Yes

3 character string compliant with ISO-4217

cost

Amount of spend

Yes

Value

original_cost

Cost as reported by the network, in the currency reported by the network (before any currency conversion)

No Value 

original_currency

Currency of cost as reported by the network before any conversions

Yes

3 character string compliant with ISO-4217

-

timezone

  • Timezone of the data as it is stored in AppsFlyer
  • Sometimes networks report data in a different time zone compared to what is defined in AppsFlyer. This data point helps explain cost discrepancies between AppsFlyer and networks

Yes

String 

geo

Dimension in the advertising hierarchy

No

2 character string compliant with ISO 3166

channel

Dimension in the advertising hierarchy

No

String

 

keywords

Dimension in the advertising hierarchy

No

String 

-

site_id

Dimension in the advertising hierarchy

No

String 

-

Cost ETL fields

Accessing the bucket

Data in the S3 bucket can be accessed using the AWS command-line interface (CLI) or other utility for accessing data in a bucket, for example. CyberDuck.

Prerequisites

To access the bucket, you require the credentials listed available from your CSM or AppsFlyer support. 

Credentials

Value

Access key ID  
Secret access key  
Path

Has the format: <Bucket name>/<Home folder>

Example: af-spend-export/acc-ffffffff

 To use AWS CLI:

  1. Install AWS CLI on your computer. 
  2. Open the terminal. To do so in Windows, <Windows>+<R>, click OK.
    The command line window opens.
  3. Enter aws configure
  4. Enter the AWS Access Key as it appears in the credentials panel.
  5. Enter your AWS Secret Key as it appears in the credentials panel.
  6. Enter eu-west-1
  7. Press Enter (None)

Use the CLI commands that follow as needed.

In the following commands, the value of {home-folder} can be found

To list folders in your bucket

aws s3 ls s3://af-spend-export/{home-folder}/cost_etl/v1

To use CyberDuck:

  1. In CyberDuck, click Action.
  2. Select New Bookmark. The window opens.
  3. Select [A] Amazon S3.

    CostETLDataDuck.jpg

  4. Complete the fields as follows:
    • [B] Nickname: free text
    • [C] Server: s3.amazonaws.com
    • [D] Access Key ID: copy the AWS access key
    • [E] Secret Access Key: copy the Bucket Secret key
    • [F] Path: {Bucket Name}/{Home Folder} For example: af-spend-export/acc-ffffffff
  5. Close the window. To do so, use the X in the upper-right corner of the window.
  6. Select the connection.
    The data directories are displayed.
Was this article helpful?