At a glance: Cost ETL, part of AppsFlyer ROI360, provides advertisers with campaign cost data at the finest granularity available per media source. Data is updated up to 4 times daily and delivered to your AWS S3 or GCS bucket, ready for loading into your BI systems.
Cost ETL principles
Campaign cost data is written:
- To your bucket for viewing, transferring, and loading cost data into your systems.
- To the bucket up to four times per day (data freshness: intraday).
- For the current day and the previous 6 days (referred to as 7 days in this article), as well as days 14, 29, and 88.
- Example: For the date October 14, 2024, the file contains data for October 14 (the current day), October 13 (1 day back), October 12 (2 days back), October 11 (3 days back), October 10 (4 days back), October 9 (5 days back), October 8 (6 days back), September 30 (14 days back), September 15 (29 days back), and July 18 (88 days back).
- The retroactive data allows for updates and corrections in the cost data reporting.
- For cost matched with attribution, click, or impression.
Note:
- Data for the last 7 days is pulled anew from the media sources. Data for days 14, 29, and 88 is re-processed.
- Only cost data is updated retroactively, not attribution data.
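The date window described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of Cost ETL itself; the names `DAY_OFFSETS` and `covered_dates` are made up for this example.

```python
from datetime import date, timedelta

# Offsets in days back from the batch date, as described in this article:
# the current day and the previous 6 days, plus days 14, 29, and 88.
DAY_OFFSETS = list(range(0, 7)) + [14, 29, 88]


def covered_dates(batch_date: date) -> list[date]:
    """Return the cost dates that a batch written on batch_date is expected to cover."""
    return [batch_date - timedelta(days=offset) for offset in DAY_OFFSETS]


for d in covered_dates(date(2024, 10, 14)):
    print(d.isoformat())
# Prints 2024-10-14 down to 2024-10-08, then 2024-09-30, 2024-09-15, 2024-07-18
```

A list like this can be useful when deciding which dates to overwrite in your own tables after each batch.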
Data is provided with guaranteed primary dimensions:
- Geo: breakdown by country
- Channel: media source channel, for example, YouTube in the case of Google
You can customize other dimensions and the metrics included in the reports to your needs.
View sample file. Note: The sample provided is an Excel file with the data breakdown by channel; Cost ETL files are sent to your bucket as parquet files.
Implementation
Report dimensions
- Reports are for all apps included in Cost ETL, per day, per batch.
- Each time data is written to the bucket, meaning up to 4 times a day, all available data is written, including the history, updates, and corrections of the previous 6 days, and days 14, 29, and 88. Take this into consideration in your data loading process.
- Report structures are detailed in the file fields table. The structures are:
- Summary report: Is less granular (detailed) to enable easier and faster consumption.
- Dimension reports: Have a primary dimension that is guaranteed. This means that the dimension is available for all media sources contained in that report. In contrast, if a media source does not provide the primary dimension, that media source's data is not included in that dimension report. Secondary dimensions are included when available. They are not guaranteed.
- The primary (guaranteed) dimensions available are:
- Geo: Data grouped by country
- Channel: Media source channel, for example, YouTube in the case of Google and Instagram in the case of Meta ads.
- You should use the dimensions and metrics that best fit your business needs. This may differ depending on the media source.
- [Closed beta] All cost report: Based on the geo dimension. Includes cost data for all marketing activity on all platforms, including for apps/platforms not added in AppsFlyer (in these cases, the app ID is marked as unknown).
Directory and filename structure
- Data written to the bucket has the directory and file structure described in this section. View sample file.
- When Cost ETL completes writing to a directory, it sets a flag by creating a success file. The success file always has the most recent timestamp in the directory.
- Each time data is written, it includes data for the current day and the previous 6 days (referred to as 7 days in this article), as well as days 14, 29, and 88.
- The number of folders/files is as follows:
- Summary: 4 batch folders per day.
- Each batch folder contains parquet files with 7 days of data.
- Dimensions: Each guaranteed dimension contains 4 batch folders per day.
- Each batch folder contains parquet files containing data with numbering starting from 1.
- [Closed beta] All cost: Based on the geo dimension, contains 4 batch folders per day.
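Because a success file marks a fully written directory, a loader can skip batches that are still being written. The following is a minimal sketch that works on a plain list of object keys, using the `dt=<yyyy-mm-dd>/b=<n>/<dimension>/` layout shown under Example directory structure. The success file name is not stated in this article, so `_SUCCESS` is an assumption passed as a parameter, and `latest_complete_batch` is a hypothetical helper name.

```python
import re

# Matches the tail of a key: dt=<yyyy-mm-dd>/b=<n>/<dimension>/<object name>
_BATCH_RE = re.compile(r"dt=(\d{4}-\d{2}-\d{2})/b=([1-4])/([^/]+)/([^/]+)$")


def latest_complete_batch(keys, dt, dimension, success_name="_SUCCESS"):
    """Return the highest batch number for dt/dimension whose folder holds
    the success marker, or None if no batch is complete yet.

    success_name is an assumed marker name; check your bucket for the real one.
    """
    complete = set()
    for key in keys:
        match = _BATCH_RE.search(key)
        if match is None:
            continue
        key_dt, batch, key_dimension, object_name = match.groups()
        if key_dt == dt and key_dimension == dimension and object_name == success_name:
            complete.add(int(batch))
    return max(complete) if complete else None
```

The key list can come from whatever listing call your storage client provides; the function itself does not touch the network.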
Example directory structure
Directory structure
/<advertiser bucket name>/cost_etl/<version>/dt=<yyyy-mm-dd>/b=<n>/<dimension>/<file name>
Variable | Content |
---|---|
advertiser_bucket_name | As defined in the Cost ETL configuration: af-xpend-cost-etl-<af-account-id>-[your bucket name suffix] |
cost_etl | Always cost_etl |
version | Cost ETL version |
date | Cost date. Format: yyyy-mm-dd |
batch | Number 1-4 |
dimension | Data dimension, for example, geo |
file_name | Parquet file number |
File name structure
part-<number>
Example
For the first data pull of June 23, 2020, the directory and file name structure is as follows:
/bucket-name/cost_etl/v1/dt=2020-06-23/b=1/geo/part-00001
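A loader typically needs to build folder prefixes and to read the version, date, batch, and dimension back out of an object path. The sketch below does both for the structure shown above. `CostEtlKey`, `parse_key`, and `batch_prefix` are illustrative names, not part of any AppsFlyer SDK.

```python
import re
from typing import NamedTuple


class CostEtlKey(NamedTuple):
    version: str
    dt: str          # value of dt=<yyyy-mm-dd>
    batch: int       # value of b=<n>, 1-4
    dimension: str
    file_name: str   # part-<number>


_KEY_RE = re.compile(
    r"cost_etl/(?P<version>[^/]+)/dt=(?P<dt>\d{4}-\d{2}-\d{2})/b=(?P<batch>[1-4])/"
    r"(?P<dimension>[^/]+)/(?P<file_name>part-[^/]+)$"
)


def parse_key(path: str) -> CostEtlKey:
    """Split a Cost ETL data file path into its components."""
    match = _KEY_RE.search(path)
    if match is None:
        raise ValueError(f"not a Cost ETL data file path: {path!r}")
    return CostEtlKey(
        match["version"],
        match["dt"],
        int(match["batch"]),
        match["dimension"],
        match["file_name"],
    )


def batch_prefix(version: str, dt: str, batch: int, dimension: str) -> str:
    """Build the folder prefix (relative to the bucket) for one batch."""
    return f"cost_etl/{version}/dt={dt}/b={batch}/{dimension}/"


print(parse_key("/bucket-name/cost_etl/v1/dt=2020-06-23/b=1/geo/part-00001"))
```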
File fields
Fields/Dimensions
Field | Remarks | Always populated | Format | Included in summary file |
---|---|---|---|---|
date | Date cost incurred reported by the media source | Yes | String yyyy-mm-dd | ✓ |
app_id | App id in the AppsFlyer platform | Yes | String | ✓ |
media_source | Media source responsible for displaying the ad | Yes | String | ✓ |
os | Operating system of the device | Yes | String | - |
agency | Agency responsible for placing the ad | No | String | - |
campaign | Component of the advertising hierarchy | No | String | - |
campaign_id | Component of the advertising hierarchy | No | String | - |
adset | Component of the advertising hierarchy | No | String | - |
adset_id | Component of the advertising hierarchy | No | String | - |
ad | Component of the advertising hierarchy | No | String | - |
ad_id | Component of the advertising hierarchy | No | String | - |
ad_account | | No | String | - |
currency | Currency of advertiser spend as defined for the app in AppsFlyer | Yes | 3 character string compliant with ISO-4217 | ✓ |
original_currency | Currency of cost as reported by the network before any conversions | Yes | 3 character string compliant with ISO-4217 | - |
timezone | | Yes | String | - |
geo | Dimension in the advertising hierarchy | No | 2 character string compliant with ISO 3166 | ✓ |
channel | Dimension in the advertising hierarchy | No | String | - |
keyword_term | Word(s) used by the user for online search | Yes | String | - |
keyword_id | ID of the ASA keyword terms | Yes | String | - |
site_id | Publisher ID | No | String | - |
campaign_objective | Component of campaign properties. Learn more | No | String | - |
cost_model | Component of campaign properties. Learn more | No | String | - |
af_cost_model | Cost model mapped and normalized by AppsFlyer. Component of campaign properties. Learn more | No | String | - |
bid_strategy | Component of campaign properties. Learn more | No | String | - |
af_bid_strategy | Bid strategy mapped and normalized by AppsFlyer. Component of campaign properties. Learn more | No | String | - |
bid_amount | Component of campaign properties. Learn more | No | Integer | - |
original_bid_amount | Component of campaign properties. Learn more | No | Integer | - |
Metrics
Field | Remarks | Always populated | Format | Included in summary file |
---|---|---|---|---|
impressions | | Yes. If no value is available for a particular metric, it is populated with 0. | Integer | ✓ |
clicks | | Yes (0 if no value) | Integer | ✓ |
reported_impressions | Counted by the Media source | Yes (0 if no value) | Integer | ✓ |
reported_clicks | Counted by the Media source | Yes (0 if no value) | Integer | ✓ |
installs | Counted by AppsFlyer | Yes (0 if no value) | Integer | ✓ |
reported_conversions | Counted by the Media source | Yes (0 if no value) | Integer | |
re_engagements | Counted by AppsFlyer | Yes (0 if no value) | Integer | ✓ |
re_attributions | Counted by AppsFlyer | Yes (0 if no value) | Integer | ✓ |
cost | Amount of spend (including agency fees where relevant) | Yes (0 if no value) | Value | ✓ |
original_cost | Cost as reported by the network, in the currency reported by the network before any currency conversion (with agency fees calculated by AppsFlyer added where relevant) | Yes (0 if no value) | Value | - |
impressions_discrepancy | | Yes (0 if no value) | Integer | - |
clicks_discrepancy | | Yes (0 if no value) | Integer | - |
installs_discrepancy | | Yes (0 if no value) | Integer | - |
fees | Fees an agency charges in addition to the usual ad cost. Counted by AppsFlyer | Yes | Integer | - |
cost_without_fees | Cost minus the agency fee. Counted by AppsFlyer | Yes | Integer | - |
original_cost_without_fees | Original cost as reported by the ad network, without agency fees | Yes | Integer | - |
ctr | | No | Integer | - |
cvr | | No | Integer | - |
ecpm | | No | Integer | - |
cpi | | No | Integer | - |
ccvr | | No | Integer | - |
cvvr | | No | Integer | - |
reported_cvr | | No | Integer | - |
ecpc | | No | Integer | - |
video_25p_views | Video played 25%. Reported by ad network | No | Integer | - |
video_50p_views | Video played 50%. Reported by ad network | No | Integer | - |
video_75p_views | Video played 75%. Reported by ad network | No | Integer | - |
video_completions | Reported by ad network | No | Integer | - |
Set up Cost ETL for AWS S3
This configuration procedure must be performed by an admin user.
Before you start:
- Setting up Cost ETL consists of setting up your AWS bucket (and giving AppsFlyer permission to write data in it), and setting up Cost ETL in AppsFlyer.
- You will need both AWS admin privileges and access to the AppsFlyer UI to complete Cost ETL setup.
- Keep tabs for both AWS and AppsFlyer open during setup.
- Note: KMS bucket encryption support is currently in Beta.
To set up your AWS bucket and Cost ETL:
- Sign in to the AWS console.
- Go to the S3 service.
- Create the bucket:
- Click Create bucket.
- Complete the Bucket name as follows: Start with the mandatory prefix af-xpend-cost-etl-acc-<af-account-id>- and then add a suffix as free text.
- Your af-account-id can be found in the AppsFlyer UI as indicated in the following steps 7-9.
- See Amazon S3 bucket naming requirements.
- Click Create bucket.
- In AppsFlyer, from the side menu, select Export > Cost ETL.
- Turn on Cost ETL.
- In Report schedule, select how many reports you want to receive per day and at what times. You can get up to 4 reports per day. Learn more
- Go to Amazon S3 settings.
- Select your S3 bucket region from the dropdown.
If your region isn't displayed, contact your CSM.
- Enter your Amazon S3 bucket name.
- Click Next.
The bucket policy code snippet displays.
- Copy the bucket policy code snippet and paste it into your AWS settings.
- In AWS, select the bucket you created for Cost ETL.
- Go to the Permissions tab.
- In the Bucket policy section, click Edit.
- The Bucket policy window opens.
- Paste the bucket policy snippet into the window.
- In your AppsFlyer Cost ETL settings, click Next.
The Validate bucket step displays.
- Click Validate.
Verify that Validation successful displays.
- Click Next.
- Select one, several, or all apps. Select all to automatically include apps you add in the future.
- Click Apply.
- Select at least one Guaranteed dimension: Channel and/or Geo.
- Select at least one additional dimension.
- Select at least one metric to be included in the reports.
- Click Apply.
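The bucket name from the Create bucket step can be assembled and sanity-checked before you create the bucket. The sketch below is an illustration only: `build_bucket_name` is a hypothetical helper, the account ID `1234` in the usage line is a placeholder, and the check covers just the basic Amazon S3 naming rules (3-63 characters; lowercase letters, numbers, dots, and hyphens; starting and ending with a letter or number), not every rule in the AWS documentation.

```python
import re

# Mandatory prefix from this article; <af-account-id> is filled in by the caller.
PREFIX_TEMPLATE = "af-xpend-cost-etl-acc-{account_id}-"

# Basic S3 naming rules only: 3-63 chars, lowercase letters, digits, dots,
# hyphens, and a letter or digit at both ends.
_S3_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def build_bucket_name(account_id: str, suffix: str) -> str:
    """Return prefix + suffix, or raise ValueError if the result looks invalid."""
    if not suffix:
        raise ValueError("suffix must not be empty")
    name = PREFIX_TEMPLATE.format(account_id=account_id) + suffix
    if not _S3_NAME_RE.match(name):
        raise ValueError(f"bucket name does not meet basic S3 naming rules: {name!r}")
    return name


# Placeholder values for illustration only.
print(build_bucket_name("1234", "prod"))
```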
Set up Cost ETL for GCS
This configuration procedure must be performed by an admin user.
Before you start:
- Setting up Cost ETL consists of setting up your GCS bucket (and giving AppsFlyer permission to write data in it), and setting up Cost ETL in AppsFlyer.
- You will need both GCS admin privileges and access to the AppsFlyer UI to complete Cost ETL setup.
- Keep tabs for both GCS and AppsFlyer open during setup.
To set up your GCS bucket and Cost ETL:
- Sign in to the GCS console.
- Create a bucket.
Name the bucket as follows:
- Start with the mandatory prefix af-xpend-cost-etl-acc-<af-account-id>- and then add a suffix as free text.
- Your af-account-id can be found in the AppsFlyer UI as indicated in the following steps 6-8.
- See GCS bucket naming requirements.
- In AppsFlyer, from the side menu, select Export > Cost ETL.
- Turn on Cost ETL.
- In Report schedule, select how many reports you want to receive per day and at what times. You can get up to 4 reports per day. Learn more
- Go to Data destination and select Google Cloud Storage (GCS) bucket.
- Enter your GCS bucket name.
- Click Next.
The AppsFlyer service account displays, to be used to set the GCS permissions.
- In your GCS console, set the IAM permissions for the bucket:
- Add the AppsFlyer service account as a principal to the Cost ETL bucket.
- Assign the role Storage Object Admin.
- In your AppsFlyer Cost ETL settings, click Next.
The Validate bucket step displays.
- Click Validate.
Verify that Validation successful displays.
- Click Next.
- Select one, several, or all apps. Select all to automatically include apps you add in the future.
- Click Apply.
- Select at least one Guaranteed dimension: Channel and/or Geo.
- Select at least one additional dimension.
- Select at least one metric to be included in the reports.
- Click Apply.
AWS object ownership
In AWS, by default, when AppsFlyer writes objects to your bucket, the object owner is AppsFlyer. Depending on your data loading process you might have to change the default ownership to you—the bucket owner.
To change the ownership of objects in your bucket:
- Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the Buckets list, choose the name of the bucket that you want to enable S3 Object Ownership for.
- Go to the Permissions tab.
- Under Object Ownership, click Edit.
- Select Bucket owner preferred.
- Click Save.
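If you prefer to script this change rather than use the console, the same setting can be applied with the AWS SDK for Python (boto3) through `put_bucket_ownership_controls`. This is a minimal sketch that assumes boto3 is installed and your credentials are allowed to change the bucket's ownership controls; the helper names are made up for this example.

```python
def ownership_controls(object_ownership: str = "BucketOwnerPreferred") -> dict:
    """Build the OwnershipControls payload for the chosen setting."""
    return {"Rules": [{"ObjectOwnership": object_ownership}]}


def set_bucket_owner_preferred(bucket_name: str) -> None:
    """Apply the Bucket owner preferred setting to bucket_name."""
    # Imported here so the payload helper can be used without the SDK installed.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_ownership_controls(
        Bucket=bucket_name,
        OwnershipControls=ownership_controls(),
    )
```

Call `set_bucket_owner_preferred` with the name of the bucket you created for Cost ETL.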
Best practices
Report schedule
The report schedule settings allow you to get the freshest data right when you need it.
When choosing your report schedule:
- Select the time to get the report as close as possible to the time you start processing data in your BI system.
- If you have any ad networks that provide yesterday's data later than others, set an additional report at the time when that ad network data is ready.
Override data
When pulling and analyzing your data, it is recommended to pull data for a specific date and batch, or override all previous data for the days that the current batch provides. Otherwise, you may see the same data repeated.
For example, batch 1 on February 20 contains data for Feb 14-20. But, batches written on February 19 also contained data for Feb 14 to Feb 19. Override the data of the previous days received on February 19 with the data received in the most recent February 20 batch.
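One way to apply this override rule is to keep, for each cost date, only the rows that came from the most recent batch. The sketch below assumes each row is a dict that carries the `date` field from the file plus two values your loader attaches from the directory path: `batch_date` (the `dt=` value) and `batch` (the `b=` value). Those two keys and the function name are assumptions for this example, not fields of the Cost ETL files.

```python
def keep_latest_per_cost_date(rows):
    """Keep only rows from the most recent (batch_date, batch) for each cost date.

    Dates are ISO strings (yyyy-mm-dd), so plain comparison orders them correctly.
    """
    rows = list(rows)

    # First pass: find the newest batch that reported each cost date.
    latest = {}
    for row in rows:
        version = (row["batch_date"], row["batch"])
        if row["date"] not in latest or version > latest[row["date"]]:
            latest[row["date"]] = version

    # Second pass: drop rows superseded by a newer batch.
    return [
        row for row in rows
        if (row["batch_date"], row["batch"]) == latest[row["date"]]
    ]
```

With the example above, rows for February 19 that arrived in a February 19 batch are dropped once batch 1 of February 20 has been loaded.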
Geo versus channel
Not all networks provide data for all dimensions together. Geo and channel data in Meta ads are the most common examples. This is why two separate data sets are provided. The geo data set is guaranteed to have geo data and the channel data set is guaranteed to have channel data.
In many cases and for many media sources, the data in the geo and channel sets will be identical. As such, consume one of the data sets (geo or channel), according to what best suits your needs.
If the integration agreement with a given media source doesn't include channel, so that channel is blank, we treat that data as if it contains the channel.
Aggregate data
Cost ETL provides flexible and granular data as deep as can be extracted from the ad network. To extract actionable insight from such potentially huge amounts of data, it is recommended to aggregate the data in a way that best suits your business needs. For example, if you need to understand cost data at the campaign and country-level, use those dimensions.
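As an illustration of the campaign and country example, the following sketch sums cost over any chosen set of dimensions. It assumes the rows have already been loaded into dicts keyed by the field names in the file fields table; `aggregate_cost` is a hypothetical helper name.

```python
from collections import defaultdict


def aggregate_cost(rows, dimensions=("campaign", "geo")):
    """Sum the cost field per combination of the given dimensions.

    Rows that lack a dimension are grouped under None for that dimension.
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row.get(dimension) for dimension in dimensions)
        totals[key] += float(row["cost"])
    return dict(totals)
```

Passing a different `dimensions` tuple, for example `("media_source",)`, gives a coarser view of the same data.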
Standardization across networks
Not all networks provide data with the same granularity. For example, Meta ads doesn't provide site ID cost data, while X Ads doesn't provide geo cost data. Be aware of such cases as you aggregate Cost ETL data, and make sure you look at similar data as you compare networks.
Compare data
Cost ETL provides information regarding all your cost data. Some campaigns provided in Cost ETL do not appear in some AppsFlyer dashboards, for example, data of inactive campaigns, meaning campaigns without any recorded installs. To compare the data, find a specific campaign ID in the overview dashboard and compare it to its cost data in Cost ETL. Learn more about cost data availability
Combine Cost ETL and cohort reports
Consider combining Cost ETL reports with Aggregated advanced cohort reports (or regular Cohort reports via Data Locker) in your BI system. Together, they give the fullest picture of marketing performance with fresh and accurate data, including clicks, impressions, cost, revenue, and in-app events. You can use this combined data to calculate metrics such as ROAS and CPA. Learn more
Additional information
Traits and limitations
Trait | Remarks |
---|---|
Timezone | If the timezone is changed, cost data is duplicated on the day and the day following the change. Learn more |
Data freshness | Intraday. Data is written to the bucket up to 4 times per day. |
All cost report | All cost reports (closed beta) currently don't include cost data for Google Performance Max campaigns. |