Data Clean Room—Dynamic Query

Premium

At a glance: Run custom DCR queries in real time on your attribution data in the AppsFlyer system, your custom sources, or a combination. Preview the results of your queries or download the full results.

Overview

Why use Dynamic Query?

Dynamic Query gives you the ability to run ad hoc, custom queries on your DCR data in real time to answer your specific marketing questions quickly and efficiently.

  • Incorporating the power and flexibility of SQL, it allows you to perform advanced research and complex calculations that can't be accomplished in any other way. With Dynamic Query, you can bring your business logic to data that would otherwise be inaccessible.
  • In addition, Dynamic Query makes it possible to obtain efficient answers to one-time or occasional questions without the need to set up reports. (Of course, traditional DCR reports are still the right solution for ongoing data retrieval and ingestion.)

Dynamic Query is available to all DCR customers, giving you the ability to run queries and preview the results without using credits from your DCR package. You'll only be charged credits if you decide to download the full results. In this way, you can be sure you've set up your queries correctly and that you'll get the answers you need before you incur any charges.

Query language and dialect

Dynamic Query utilizes SQL (Spark SQL dialect) and supports:

Available sources

Dynamic Query allows you to perform custom queries on the following data sources:

  • Your attribution data in the AppsFlyer system
    • Learn more about attribution data tables and how to query them
  • Custom sources you've set up in the DCR

You can query these sources individually, or combine 2 or more using JOIN or UNION statements. This flexibility makes Dynamic Query useful in a variety of scenarios, for example:

Based on the types of sources used in a query, rules are applied to ensure the strict preservation of user privacy.

Note

Currently, Dynamic Query does not query attribution data or return results from Meta (Facebook, Instagram, etc.) To query this data, set up DCR reports

Preserving privacy

To ensure that Dynamic Query maintains strict compliance with DCR privacy standards, the following requirements apply to all queries:

  1. Any query utilizing attribution data must include a GROUP BY statement so that results are aggregated.
  2. UNION statements cannot be used to combine custom sources with attribution data.
  3. Grouping by the following fields is not permitted:
    • Attribution data fields:
      • appsflyer_id
      • idfa
      • idfv
      • advertising_id
      • android_id
      • customer_user_id
      • ip
    • Custom source fields that are joined to any of the above attribution data fields.
  4. The following functions are not permitted when run on non-aggregated attribution data. They are permitted when run on aggregated attribution data or custom source data:
    • collect_list
    • collect_set
    • last
    • last_value
    • first
    • first_value
    • max
    • min

To learn more about the dos and don'ts of constructing queries, see samples and explanations of valid and invalid queries.

Reference

Query reference

The following tabs provide additional information about constructing queries in the Dynamic Query environment.

Combining sources: JOIN and UNION

JOIN statements combine sources by matching a common field:

  • Can be used to combine any sources: attribution data tables and/or custom sources
  • Can be used to combine an unlimited number of sources
  • Typically used to enrich custom source data with attribution data

UNION statements append tables with matching schema (containing the same columns and datatypes):

  • Can be used to combine attribution data tables or custom sources; cannot be used to combine attribution data tables with custom sources
  • Can be used to combine an unlimited number of sources
  • Typically used to perform queries on multiple attribution data tables

Querying attribution data tables

Attribution data in the AppsFlyer system exists in 8 different tables:

  • Each table includes different types of attribution events.
  • The timeframe for which query data is available varies by table.
    • When constructing your query in the Dynamic Query console, hover over the name of a table in the selection panel to see the precise data availability dates for that table.
  • The tables within each of the groups below have matching schema, meaning you can use UNION statements to easily query multiple tables at once.

Group 1: App installs and retargeting events

Table name Description Data availability dates
installs

A user downloads, installs, and launches a mobile app

Prior 180 days
(not including current day)
re_installs A user installs an app, then uninstalls and re-installs it during the re-attribution window Prior 180 days
(not including current day)
re_engagements An inactive user of a specific app engages with a retargeting campaign for the app and subsequently launches it Prior 180 days
(not including current day)
re_attribution A former user of a specific app (who had uninstalled it) engages with a retargeting campaign for the app and subsequently reinstalls it Prior 180 days
(not including current day)
organic_installs

Install not attributed to a specific media source

Prior 180 days
(not including current day)

Group 2: In-app events

Table name Description Data availability dates
in_app_events Post-install events set up for measurement in the app SDK Prior day + current day
in_app_organic_installs In-app events following an organic install Prior day + current day

Group 3: Impressions

Table name Description Data availability dates
impressions

An ad is viewed by an app user or website visitor

  • Data available for non-SRN ad networks only

Prior 5 days + current day

Sample queries

These queries demonstrate just a few possibilities of what you can do with Dynamic Query. You can use them simply to get your ideas flowing, or you can use them as a starting point for your own queries.

To use the sample queries as a basis for your queries:

  1. Copy and paste: Copy the SQL samples of your choice and paste them into a worksheet in the Dynamic Query tab.
  2. Replace parameters: Modify the SQL code by replacing the parameters with the specific sources and fields you want to query.
  3. Execute: Click the Run button to execute a query and preview the results.
    • Running a query and previewing the results does not consume credits.
    • Credits from your DCR package are consumed only if you download the full results.

Obtain cross-app insights

This query retrieves information about the 3 apps with the most re-installs from Snapchat during the past 3 days.

SELECT app_id, COUNT(*) AS reinstall_count
FROM re_installs
WHERE media_source = 'snapchat_int' AND dt BETWEEN DATE_SUB(current_date(), 3) AND current_date()
GROUP BY app_id
ORDER BY reinstall_count DESC
LIMIT 3;

Explore data integrity

To help you assess data integrity, this query calculates the rate of null IDFA values for the 10 media sources with the highest number of installs during the past 7 days.

SELECT media_source, COUNT(*) AS total_installs, ROUND((SUM(CASE WHEN idfa IS NULL THEN 1 ELSE 0 END) / COUNT(*)) * 100, 2) AS null_idfa_rate
FROM installs
WHERE dt BETWEEN DATE_SUB(current_date(), 3) AND current_date()
GROUP BY media_source
ORDER BY total_installs DESC
LIMIT 10;

Calculate custom attribution metrics

By calculating the geometric mean of the time difference (in seconds) between install time and attributed touch time for each media source and campaign, the resulting data provides insight into the efficiency of various media sources as regards user conversions and attribution timelines.

SELECT media_source,campaign,geo_mean(GREATEST(1, unix_timestamp(install_time, "yyyy-MM-dd HH:mm:ss") - unix_timestamp(attributed_touch_time, "yyyy-MM-dd HH:mm:ss.SSS")))
FROM installs
GROUP BY 1,2

Enrich your custom data with attribution insights

This query utilizes attribution data to group a score from Custom Source C (c.score) by media source, allowing you to use your custom metric to evaluate the relative performance of various media sources.

SELECT media_source, sum(c.score)
FROM client_input c JOIN install i on i.appsflyer_id = c.appsflyer_id
GROUP BY media_source;

Complex attribution and first-party data join

This query utilizes a more intricate data-join process, allowing for advanced insight generation. First, it retrieves and unifies data from 2 distinct attribution tables ("install" and "installRetarget"). Next, it combines this data with 2 first-party custom sources to provide insights about various conversion types, including an aggregated custom performance metric (score) and maximum age for each.

SELECT conversion_type, SUM(sum_score), MAX(max_age)
FROM (SELECT media_source, conversion_type, SUM(score) as sum_score, MAX(age) as max_age
FROM (SELECT * from install union all select * FROM installRetarget) af
JOIN client_input c on c.appsflyer_id = af.appsflyer_id
JOIN client_age ca on ca.appsflyer_id = c.appsflyer_id
GROUP BY media_source, conversion_type)
GROUP BY conversion_type

Invalid queries

These queries illustrate some of the rules that Dynamic Query uses to ensure compliance with user privacy requirements. Use them to better understand the dos and don'ts of constructing queries for the DCR.

No GROUP BY statement

Each of the following queries is invalid because it utilizes attribution data but doesn't include a GROUP BY statement:

SELECT *
FROM install;

-----------------------------------------------------

SELECT media_source, appId
FROM client_input c
JOIN install af on af.appsflyer_id = c.appsflyer_id;

Grouped by non-permitted attribution data field

The following query is invalid because it groups by a non-permitted attribution data field (appsflyer_id):

SELECT appsflyer_id, count(*)
FROM install
GROUP BY appsflyer_id

-----------------------------------------------------

The following query is invalid because it groups by a non-permitted attribution data field (appsflyer_id). This is not permitted even when the field is within a function:

SELECT substr(appsflyer_id, 0, 5), count(*)
FROM install
GROUP BY substr(appsflyer_id, 0, 5)

-----------------------------------------------------

The following query is invalid because it groups by a non-permitted attribution data field (appsflyer_id). This is not permitted even when the field is within an index:

SELECT appsflyer_id, count(*)
FROM install
GROUP BY 1

Grouped by custom source field joined to non-permitted attribution data field

The following query is invalid because it groups by a custom source field (my_id) that is joined to a non-permitted attribution data field (appsflyer_id); therefore, the same grouping restriction applies to my_id.

SELECT my_id, count(*)
FROM installs
JOIN my_source on my_source.my_id = installs.appsflyer_id
GROUP BY my_id

Restricted function

The following query is invalid because it attempts to run the collect_list function on non-aggregated attribution data:

SELECT App_id, collect_list(appsflyer_id)
FROM install
GROUP BY app_id

-----------------------------------------------------

Compare with the following valid query, which runs the collect_list function on aggregated attribution data:

SELECT app_id, collect_list(campaign)
FROM (select app_id, campaign
  FROM Install
  GROUP BY 1, 2)
GROUP BY app_id

-----------------------------------------------------

Compare with the following valid query, which runs the collect_list function on custom source data:

SELECT app_id, collect_list(c.score)
FROM client_input c
GROUP BY app_id