At a glance: Learn why some aggregate data dimensions display as exceeded, and how to avoid reaching cardinality limits.
What are exceeded dimensions?
AppsFlyer collects your install and event data and aggregates it. Data is aggregated using unique values populating a given dimension. The number of unique values in a dimension is called cardinality. Most analytics and aggregated reporting tools have cardinality limits per dimension.
As a result, when the number of unique values in a dimension exceeds the cardinality limit, the remaining data is grouped in an exceeded group. The examples that follow illustrate cardinality and how it affects your reports.
Example A: What is cardinality?
The Campaign ID cardinality limit is 3000. If the number of campaign IDs reported on a given day exceeds 3000, the remaining campaign IDs are grouped together in the Exceeded_CampaignID_Limit group.
Example B: Why are some events grouped in the Exceeded_Events_Limit group?
- Assume that Events' cardinality limit is 3.
- On a given day, 7 unique events are reported: A, B, C, D, E, F, and G. In other words, Events cardinality is 7.
- In aggregated reports, events A, B, and C are listed separately. Events D, E, F, and G are grouped in the Exceeded_Events_Limit group.
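The grouping in Example B can be sketched in a few lines of Python. This is an illustration only, not AppsFlyer's implementation: it assumes that values are kept in order of first arrival during the day, and the function name `cap_by_arrival` is hypothetical.

```python
def cap_by_arrival(values, limit, exceeded_label):
    """Keep the first `limit` unique values; relabel every later new value."""
    kept = set()      # unique values that arrived before the limit was reached
    labelled = []
    for value in values:
        if value in kept:
            labelled.append(value)
        elif len(kept) < limit:
            kept.add(value)
            labelled.append(value)
        else:
            # The limit is already used up, so this value joins the exceeded group.
            labelled.append(exceeded_label)
    return labelled


# Seven unique events with a cardinality limit of 3, as in Example B.
events = ["A", "B", "C", "D", "A", "E", "F", "G", "B"]
print(cap_by_arrival(events, 3, "Exceeded_Events_Limit"))
```

Events A, B, and C keep their own names wherever they appear, while every occurrence of D, E, F, and G is reported under Exceeded_Events_Limit.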
Exceeded dimensions and cardinality limits affect aggregate data in the following:
- Overview dashboard
- Aggregate data export and Pull API reports
- Master API reports
- Click and impression data in Activity and Custom dashboards
Raw data isn't affected by cardinality limits.
If a large part of your data is grouped in exceeded groupings, that data isn't broken down. This can lead to imprecise results in analytics reports. If needed, use raw data to build groupings without cardinality limits.
The table that follows lists the cardinality limits.
| Dimension | Exceeded group name | Limit applies per | Cardinality limit per day | Cardinality limit for Protect360 per day |
|---|---|---|---|---|
| Ad ID | Exceeded_AdID_Limit | Media source | 1000 | - |
| Ad set | Exceeded_AdSet_Limit | Media source | 1000 | - |
| Adset ID | Exceeded_AdSetID_Limit | Media source | 1000 | - |
| Campaign ID | Exceeded_CampaignID_Limit | Media source | 3000 | 3000 |
| Site ID | Exceeded_SiteID_Limit | Media source | 1000 | 1000 |
| Media source names | Exceeded_MediaSource_Limit | App | 1000 | 1000 |
Cardinality limits per dimension
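Because raw data isn't affected by cardinality limits, you can use it to check how close each media source is to a limit. The following is a minimal sketch, assuming raw-data rows have already been loaded as dictionaries. The keys `date`, `media_source`, `campaign_id`, `site_id`, `adset_id`, and `ad_id` are hypothetical names chosen for this example, not actual report column names.

```python
from collections import defaultdict

# Daily limits per media source, copied from the table above.
LIMITS = {"campaign_id": 3000, "site_id": 1000, "adset_id": 1000, "ad_id": 1000}


def find_exceeded(rows, limits=LIMITS):
    """Return {(date, media_source, dimension): cardinality} for groups over the limit."""
    unique = defaultdict(set)
    for row in rows:
        for dim in limits:
            value = row.get(dim)
            if value:  # skip rows where the dimension is empty or missing
                unique[(row["date"], row["media_source"], dim)].add(value)
    return {
        key: len(values)
        for key, values in unique.items()
        if len(values) > limits[key[2]]
    }
```

Any key returned by `find_exceeded` points to a day, media source, and dimension where the aggregate reports would show an exceeded group.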
Exceeded_AdSet_Limit and Exceeded_Ad_Limit
- Up to 1000 unique ad set names and 1000 unique ad names per media source are available.
- On a given day, ad set information from the 1001st ad set and above is grouped in the Exceeded_AdSet_Limit source. The same applies to the Exceeded_Ad_Limit source.
Divide and conquer: define a small number of general ad set names (ideally up to 50), and assign all previous ad set names as single ads. You can also use the af_sub parameters on AppsFlyer attribution links. This enables you to:
1. Prevent seeing Exceeded_AdSet_Limit or Exceeded_Ad_Limit.
2. Efficiently optimize according to significant ad sets and ad traffic.
3. Perform deep analysis based on the ad set and ad names in the raw data.
Exceeded_AdSetID_Limit and Exceeded_AdID_Limit
- Up to 1000 unique ad set IDs and 1000 unique ad IDs per media source are available.
- On a given day, ad set ID information from the 1001st ad set ID and above is grouped in the Exceeded_AdSetID_Limit source. The same applies to the Exceeded_AdID_Limit source.
Exceeded_Campaign_Limit
- Up to 3000 unique campaign names per day are available.
- On a given day, campaign information from the 3001st campaign and above is grouped in the Exceeded_Campaign_Limit source.
Define a small number of general campaign names (ideally up to 300), and assign all previous campaign names as ad sets. On AppsFlyer attribution links, the ad set parameter is af_adset. This enables you to:
1. Prevent seeing Exceeded_Campaign_Limit.
2. Efficiently optimize according to significant campaign and ad set traffic.
3. Perform deep analysis based on the campaign and ad set names in the raw data.
Exceeded_CampaignID_Limit
- Up to 3000 unique campaign IDs per day are available.
- On a given day, campaign information from the 3001st campaign ID and above is attributed to the Exceeded_CampaignID_Limit source.
Exceeded_Channel_Limit
- Up to 20 unique channel names per day per media source are available. For Protect360, the limit is 1000.
- On a given day, channel information from the 21st channel and above is attributed to the Exceeded_Channel_Limit source.
Exceeded_Events_Limit
- Up to 300 unique event names per day are available.
- On a given day, event information from the 301st event and above is attributed to an event name called Exceeded_Events_Limit.
To stop seeing Exceeded_Events_Limit, consider using:
- Rich in-app events. Instead of reporting hundreds of different events, define a small number of general event names (ideally up to 20), and use dynamic event values to differentiate between them. This lets you optimize according to value parameters and perform analysis based on the event values, which are available in the in-app events raw data report.
- Validation Rules to remove unneeded in-app events from the AppsFlyer platform.
Example: Your app, com.greatapp, sends a separate purchase in-app event for every color of socks it sells: buy_red_socks, buy_blue_socks, buy_white_socks, and so on. To avoid this inflation of event names, consolidate them into a single event, buy_socks, and send the color as an event parameter.
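The consolidation in the socks example could look like the following sketch. It shows only the shape of the data: the function `to_rich_event`, the dictionary keys `event_name` and `event_value`, and the `color` parameter are hypothetical names for illustration, not part of an AppsFlyer SDK.

```python
def to_rich_event(legacy_name):
    """Turn 'buy_<color>_socks' into one general event plus a value parameter."""
    parts = legacy_name.split("_")
    if len(parts) == 3 and parts[0] == "buy" and parts[2] == "socks":
        # One static event name; the color moves into the event value.
        return {"event_name": "buy_socks", "event_value": {"color": parts[1]}}
    # Anything that doesn't match the pattern is passed through unchanged.
    return {"event_name": legacy_name, "event_value": {}}


for name in ["buy_red_socks", "buy_blue_socks", "buy_white_socks"]:
    print(to_rich_event(name))
```

All three legacy events now share the event name buy_socks, so they count as one unique value toward the event cardinality limit, while the color remains available for analysis in the event value.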
Exceeded_Keywords_Limit
- Up to 1000 unique keywords per day per media source are available.
- On a given day, keyword information from the 1001st keyword and above is grouped in the Exceeded_Keywords_Limit source.
Exceeded_MediaSource_Limit
- Up to 1000 unique media source names per day are available.
- On a given day, information from the 1001st media source name and above is grouped in the Exceeded_MediaSource_Limit source.
Exceeded_SiteID_Limit
- Up to 1000 unique site IDs per day per media source are available.
- On a given day, site ID information from the 1001st site ID and above per media source is grouped into a single site ID named Exceeded_SiteID_Limit. If you see the site ID Exceeded_SiteID_Limit, the media source in question uses too many site IDs, and optimizing its traffic by site ID becomes less accurate and less effective.
Divide and conquer: instead of using thousands of site IDs per media source, which distorts your aggregate data, use a second parameter on your attribution links called af_sub_siteid. Define a small number of general site IDs (ideally up to 50), and assign all previous site IDs as sub-site IDs under these general site IDs. This enables you to:
1. Prevent seeing Exceeded_SiteID_Limit
2. Efficiently optimize according to significant site ID traffic
3. Perform deep analysis based on the site IDs and sub-site IDs in the raw data
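As a rough sketch of the split between general site IDs and sub-site IDs, the snippet below builds the query-string portion of an attribution link. Only af_sub_siteid is named in this article. The parameter name `af_siteid` for the general site ID is an assumption here, and the helper `site_id_params` and the sample values are hypothetical.

```python
from urllib.parse import urlencode


def site_id_params(general_site_id, original_site_id):
    """Build query parameters carrying a general site ID and the original ID as sub-site ID."""
    return urlencode({
        "af_siteid": general_site_id,       # assumed parameter name for the general site ID
        "af_sub_siteid": original_site_id,  # parameter named in the text above
    })


# A previously standalone site ID is now reported under a general one.
print(site_id_params("network_a", "pub_48213"))
# af_siteid=network_a&af_sub_siteid=pub_48213
```

With around 50 general values on the site ID level, the aggregate reports stay under the limit, and the original IDs are still present for analysis in the raw data.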
The Retention report doesn't display an Exceeded_SiteID_Limit group, but when the limit is exceeded, not all site IDs are shown. Because of a UI limitation, only an arbitrary subset of site IDs is displayed. To work around this, retrieve retention data using Master API.
How to avoid getting exceeded sources?
The long term solution
Most advertisers won't encounter Exceeded sources as they don't normally define 3000 campaigns manually.
If you do encounter Exceeded sources, it's probably because one or more media sources are using dynamic values for the names of campaigns, site IDs, ad sets, or ads. Dynamic in-app event names generated in the app code may cause the Exceeded_Events_Limit source to appear.
To avoid getting Exceeded sources, use only static values for the names of in-app events, campaigns, site IDs, ad sets, and ads.
Also see the tips for each specific Exceeded source above.
The short term solution
The long-term solution may take you a few days to a few weeks to fully implement.
But what if you want to look at your data right now?
An Exceeded source appears when, during a single day, AppsFlyer receives clicks (or events) from source number N+1, where N is the cardinality limit. Your more prominent media sources may arrive later in the day and therefore be combined into an Exceeded source's data. Here's a simple way to minimize the effect of any Exceeded source you see:
Ignore today's data and look only at data from yesterday and earlier. Every day, an aggregation process recalculates the previous day's data and retroactively assigns only the smallest sources (rather than the latest to arrive) to any Exceeded source. This keeps the distortion caused by the overflow to a minimum.
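The effect of the daily recalculation can be illustrated with a short sketch. It is a simplified model rather than AppsFlyer's actual aggregation process: it assumes that "smallest" means fewest clicks, and the function name `cap_by_volume` is hypothetical.

```python
from collections import Counter


def cap_by_volume(values, limit, exceeded_label):
    """Keep the `limit` largest sources; fold all smaller ones into the exceeded group."""
    counts = Counter(values)
    keep = {value for value, _ in counts.most_common(limit)}
    totals = Counter()
    for value, n in counts.items():
        totals[value if value in keep else exceeded_label] += n
    return dict(totals)


# "big" and "mid" arrive late in the day but carry most of the traffic.
clicks = ["small1", "small2", "big", "big", "big", "mid", "mid"]
print(cap_by_volume(clicks, 2, "Exceeded"))
```

With a limit of 2, the two small sources that arrived first end up in the exceeded group after recalculation, and the larger sources that arrived later are reported under their own names.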