There are many cases where our data comes in with different granularity levels. For instance, Actual vs. Target where the actuals are at the product level and the targets are per sub category (higher hierarchy of the product). How can we connect them?
Logically the connection between two tables are on the same level, but here we need to find a way to make sense of the data while the model needs to support it.
Example: Sales vs. Targets
This is how the schema looks before our changes:
Image 1. Schema with different Granularity levels. The product only appears in one fact & the subcategory appears in another fact.
We will create a new dim product and it will connect to both fact tables based on the product ID (the current connection of ProductSubCategoryID will be removed).
This will be achieved by creating a master record for each of the sub categories and calling its ID as Product ID.
So in fact the data will hold product sub categories but will be called product ID. So now the products table will contain 2 types of data - Product & sub category in one field (productID).
SELECT ProductId, Name, ProductSubCategoryId FROM [Dim_Product] P
SELECT -S.ProductSubcategoryID as ProductId, 'N/A'as Name, ProductSubcategoryID FROM [ProductSubcategory] S
The query takes the current dim product (first query) and adds new lines to it as master records per each of the sub categories.
Ordinarily the subcategories will have a different set of identifiers than the products, but in this case they are both small numbers so we added a minus sign to the subcategories.
The minus sign (-S.ProductSubcategoryID) is used in order to define a new set of identifiers for the subcategories, so it won't create any duplicates with the actual Product IDs - they're in the same column). This needs to be done on the other side of the connection (the fact table).
Image 2. Results from this query. The 1st & 3rd column are the same for the sub categories (except for the minus sign).