cancel
Showing results for 
Search instead for 
Did you mean: 
intapiuser
Community Team Member
Community Team Member

Screenshot_5.pngAnalytical Need

A common use case is to know the number of new users in the system. 
It could be that same user appear in various months but we want to count each unique user only once.
If we will use a simplistic approach of count distinct calculation on each month and then SUM the results (for a running sum), we will be including the same user more than once (for example, if the user appeared on January and also in March I would like to count him only once).

Screenshot_1.pngSolution

As an example, let's look at the following sample set: this is the fact table called data.
data.png
1. Generate a custom table with the earliest date for each User and call it EarliestDate
2. Add a Key column from date and user.
3. Add a column flagEarliest with the value 1.
SELECT 
    d.[User]
    ,min(d.[Date])
    ,concat(tostring(min(d.[Date])),d.[User]) Key
    ,1 flagEarliest 
FROM [Data] d
GROUP BY d.[User]
datawithkeyandflag.png
4. Add a similar Key to the data table from date and user.
Key = concat(tostring(date),user)
5. Add a custom table that will left join the tables: data and EarliestDate. Call it DataWithEarliestDate
SELECT 
    d.[Date]
    ,d.[User]
    ,e.[flagEarliest]
FROM [data] d
LEFT JOIN EarliestDate e
ON d.key = e.[Key]
Order By d.[Date]
 
6. Hide the data and EarliestDate tables. We will not be needing them on the dashboard.
7. The DataWithEarliestDate table should look similar to this:
EarliestDate.png
8. We can now perform a simple SUM on the flagEarliest column and the result would be the Running Sum of the distinct users.
dashbnoard.png
To see the above in action, download the dash file, the smodel and the CSV.
Rate this article:
Version history
Last update:
‎03-02-2023 08:51 AM
Updated by:
Contributors