Get measures from 2 facts without duplications - Fact Attribution

intapiuser · ‎03-02-2023

Analytical Need

Working with 2 fact tables which have a direct 1:M relationship, and fetching measures from both. A common example from the retail industry is Order Header and Order Line Items tables

Modeling Challenge

A standard connection could create data duplication, if :

Any measures are taken from the Header table while a dim (Filter \ category) is taken from the Line Items table.
Or similarly, a widget containing measures from the Order Header & Line Items.

Solution

Logic:

Creating a Consolidated Fact table:

Create a custom SQL table in which the two tables will be joined, so the new table is on the line item’s level, but containing all of the respective header record’s attributes.

The measures from the line items table will be brought untouched into the new tables, while measures from the Header table will be broken down and distributed to all the line items that it consists of (each line item will hold a part; The original value will be retrieved from the sum of all line items’ values).

The common distribution methods are:

Distribute the values equally between the rows
Distribute the values by each row’s proportional contribution to the header’s total

Example: We have the following orders tables:

Step 1:

Create a custom table based on the original Header and Line Items tables according to the desired distribution method above.

In this example, the SQL expression would look like this:

SELECT H.[Order ID], H.[Customer], H.[Order Date], LI.[Product], LI.[Quantity],
 LI.[Line Total], 

/*Equal Freight Distribution*/ 
H.Freight / OT.[Items In Order] AS [Freight (Distributed Equally)],

/*Proportional Distribution (By Line Total) */ 
H.Freight*(LI.[Line Total] / OT.[Total Amount]) AS [Freight (Proportional)] 

FROM [Order Header] H JOIN [Order Line Items] LI ON
H.[ORDER ID] = LI.[ORDER ID] JOIN 
 /*Calculate each order's total items and amount*/ 

(SELECT [Order ID], count(*) [Items In Order], sum ([Line Total]) [Total Amount]  
FROM [Order Line Items] 
GROUP BY [Order ID] ) OT 

ON H.[Order ID] = OT.[Order ID]

Step 2:

Connect the new master table to all relevant dimensions and use it like any other fact.

Make sure to make both original tables invisible.

Image 1. Updated schema

Important to Note

This use cases’ data duplication can also be solved on the front-end by use of multi-pass aggregations. However, we highly recommend the modeling approach, as it provides a simpler solution for the dashboard designers and reduces the risk of mis-use.

Attachments

- 4.6 MB - fact attribution.ecdata

- 25 KB - fact attribution.dash

Sisense Community