cancel
Showing results for
Did you mean:

# "Filling" missing records in Fact table from Dim tables

Community Team Member
Use Case:
Sometimes, we would like to create calculations from the fact table and we need to have a continuous range of dates or in other cases, we need to count the number of employees (just as an example) that we have in the company.
We strongly recommend using fields from the fact tables for the calculations and fields from dimensions fro filters/categories/columns etc.
But what do we do if not all dates appear in the fact tables (such as weekends, holidays etc.) and not all employees have transactions in the facts?
We then tempted to create a calculation on the dimension tables. However, in return might have undesired implications (M2M, "RandomPath") as it might force a connection between two dimension tables that are connected indirectly through 1 or more fact tables.
Solution:
We want to be able to calculate from the fact table. For this, we need all the values from the relevant dimension tables to be also in the fact table. Then, we could count for example the number of employees from the fact table, even if they did not have transactions in reality.
We have 3 ways to create the desired fact table:
1) Union the fact table with a Cross Join between the relevant dimensions (from which we want to fill the missing values) and then filter the duplicated records using the RankDesc approach.
In this use case, we will have all the combinations of the dimensions involved (date and employees in our example) in the fact table.
``````SELECT *

FROM
(
SELECT
a.Date
,a.Employee
,a.Department
,a.Amount
,a.KeyDateEmployee
,rankdesc(a.KeyDateEmployee, a.Amount) RankKey

FROM
(
SELECT
s.date Date
,s.Employee Employee
,s.Department Department
,s.Amount Amount
, (left(tostring(s.date),10) + '|' + s.Employee) KeyDateEmployee
FROM Sales s

UNION

SELECT
d.Date Date
,e.Employee Employee
,'N/A' Department
,0 Amount
,(left(tostring(d.date),10) + '|' + e.Employee) KeyDateEmployee

FROM dimDate d, dimEmployees e
) a
)a1

WHERE a1.RankKey = 1``````
2) Using the "WHERE NOT IN" -  In this scenario, we will only add to the fact the missing values from each dimension (date and employees) but without the combinations between the dimensions.
``````--Original fact table
SELECT
s.[Date] Date
,s.[Employee] Employee
,s.[Department] Department
,s.[Amount] Amount
FROM [Sales] s

UNION

--Employees who don't have any transactions in the fact
SELECT
ToDatetime(NULL) Date
,e.[Employee] Employee
,'N/A' Department
,0 Amount
FROM [dimEmployees] e
WHERE e.[Employee] NOT IN (SELECT DISTINCT s.[Employee] FROM [Sales] s)

UNION

--Dates which are not appearing in the fact
SELECT
d.[Date] Date
,'N/A' Employee
,'N/A' Department
,0 Amount
FROM [dimDate] d
WHERE d.[Date] NOT IN (SELECT DISTINCT s.[Date] FROM [Sales] s)``````
3) Using Right Join with the dimensions table - use case similar to the "Wher Not In".
``````SELECT
d.Date
,a.Employee
,a.Department
,coalesce(a.Amount,0)
FROM
(
SELECT
s.[Date] Date
,coalesce(s.[Date],createdate(getyear(now()), 1,1)) DateJoin
,e.[Employee]
,s.[Department]
,s.[Amount]
FROM [Sales] s RIGHT JOIN [dimEmployees] e
ON s.[Employee] = e.[Employee]
) a
RIGHT JOIN [dimDate] d
ON a.DateJoin = d.Date``````
Here are the supporting files:
Version history
Last update:
‎03-02-2023 08:51 AM
Updated by:
Contributors
Community Toolbox

Developers Group:

Product Feedback Forum: