ElastiCube Data Folders Get Created Only From the Web Version of ElastiCube Manager
Question: When an ElastiCube is created in the desktop version of ElastiCube Manager, only a .ecube file gets created, whereas when an ElastiCube is created in the web version of ElastiCube Manager, a folder with the ElastiCube name is created that contains a few subfolders. The folder is created in the path C:\ProgramData\Sisense\PrismServer\ElastiCubeData. Do the desktop and web versions of ElastiCube Manager work differently? What do these folders hold? Is it good practice to first create the ElastiCube in the web version of ElastiCube Manager, so that the folder structure gets created, and then use the desktop version for any modifications that cannot be done in the web version?

Solution: Both the desktop and web ElastiCube Managers create folders in C:\ProgramData\Sisense\PrismServer\ElastiCubeData after the first cube build (and on subsequent builds). The desktop version does this after the cube is built for the first time and updates these folders each time a build is done. When you open a cube from the Server Console on the desktop, you are really opening the .ecube file stored within C:\ProgramData\Sisense\PrismServer\ElastiCubeData\<cube name>.

When you create a new cube and save it, you are really saving the schema (the .ecube file), which contains the relationships between the data but does not actually store the data from the data sources. This is convenient when you want to version the schema (for example, when you need to change the data model or are testing relationships). The web version also creates a folder in C:\ProgramData\Sisense\PrismServer\ElastiCubeData on first build. The folders in C:\ProgramData\Sisense\PrismServer\ElastiCubeData store the schema of the cube and all of the data (tables, views, etc.) that has been imported into the cube.

As for a recommendation on building cubes, keep in mind that the desktop version of ElastiCube Manager is more stable and has more features, while the web version is easier to access (no need to get onto the server) and has greater sharing capabilities with select users.
Calculating YTD With Variable Start And End Dates Of The Year (Academic Calendar)

Question: For some industries, such as Education, even though the number of days is roughly the same every year, each academic year can start and end on different dates. The usual way of calculating YTD does not work for this requirement because a school year can start and end in the middle of a calendar year.

Solution: The idea is to compare years by the number of days that have passed since the current year started, e.g. compare the first 135 days of this school year with the first 135 days of last school year, the first 135 days of the school year before that, and so on, regardless of the actual dates. To accomplish this, we need to create a date dimension table with a day rank and a YTD flag. The day rank is an enumeration of school days within each school year. By comparing each day's rank with the rank of today's date, we can calculate the YTD flag.

We need these two tables for this solution to work:
- A list of school years with start and end dates
- A list of dates. You can use the date dimension Excel file from this article.

School Year table:
Date table (you only need the Date column for this purpose):

Step 1
Create a new dim date table using the custom table option:
- Get the list of school dates (only dates between the start and end dates of each school year) and remove dates that fall on Saturday and Sunday (optional).
- Rank the dates within each school year.
- Based on the result from above, get today's date rank.
- Compare each date's rank with today's rank to calculate the YTD flag.

You can use this SQL as an example.

--Join the list of school dates with today's date rank
--Compare each date's rank with today's date rank to create the YTD flag
SELECT *,
       CASE WHEN t1.[Day Rank] <= t2.[Today Day Rank] THEN 1 ELSE 0 END AS isYTD
FROM (
    --Get the list of only school dates (dates between the start and end dates)
    --Remove dates that fall on Saturday and Sunday (not school dates)
    --Rank dates within each school year
    SELECT d.Date, s.[School Year], s.[Start Date], s.[End Date],
           rankasc(s.[School Year], d.Date) AS [Day Rank]
    FROM [Dim Dates] d
    JOIN [School Year] s ON d.Date BETWEEN s.[Start Date] AND s.[End Date]
    WHERE DayOfWeek(d.Date) <= 5
) t1
JOIN (
    --Get today's date rank
    SELECT t.[Day Rank] AS [Today Day Rank]
    FROM (
        --Get the list of only school dates (dates between the start and end dates of each year)
        --Remove dates that fall on Saturday and Sunday (not school dates)
        --Rank dates within each school year
        SELECT d.Date, s.[School Year], s.[Start Date], s.[End Date],
               rankasc(s.[School Year], d.Date) AS [Day Rank]
        FROM [Dim Dates] d
        JOIN [School Year] s ON d.Date BETWEEN s.[Start Date] AND s.[End Date]
        WHERE DayOfWeek(d.Date) <= 5
    ) t
    WHERE t.Date = createdate(getyear(now()), getmonth(now()), getday(now()))
) t2 ON 1=1

The table should look like this:

Step 2
Link the fact table(s) to the new date dimension table on the Date field.

Step 3
Create your widget(s) and add the YTD flag as a filter, set to 1.
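If your source database supports window functions, the same day-rank and YTD-flag logic can be pre-computed there instead of inside the ElastiCube. Below is a minimal T-SQL sketch of that approach: the table and column names ([Dim Dates], [School Year], [Start Date], [End Date]) follow the example above, and ROW_NUMBER() stands in for Sisense's rankasc. Treat it as an illustration under those assumptions, not a drop-in replacement.

-- Hedged T-SQL sketch: compute a per-school-year day rank with ROW_NUMBER()
-- and derive the YTD flag by comparing each rank against today's rank.
WITH SchoolDates AS (
    SELECT d.[Date],
           s.[School Year],
           ROW_NUMBER() OVER (PARTITION BY s.[School Year] ORDER BY d.[Date]) AS [Day Rank]
    FROM [Dim Dates] d
    JOIN [School Year] s
      ON d.[Date] BETWEEN s.[Start Date] AND s.[End Date]
    -- Drop Sunday/Saturday; weekday numbering depends on the server's DATEFIRST setting
    WHERE DATEPART(WEEKDAY, d.[Date]) NOT IN (1, 7)
),
TodayRank AS (
    SELECT [Day Rank] AS [Today Day Rank]
    FROM SchoolDates
    WHERE [Date] = CAST(GETDATE() AS date)
)
SELECT sd.*,
       CASE WHEN sd.[Day Rank] <= tr.[Today Day Rank] THEN 1 ELSE 0 END AS isYTD
FROM SchoolDates sd
CROSS JOIN TodayRank tr;

The result has the same shape as the custom table built above, so it can be imported directly and linked to the fact table on the Date field in Step 2.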
Flattening JSON Objects In SQL Server

Question: In instances where tables within SQL Server contain JSON objects within a column, you can use the import SQL query editor feature in Sisense to flatten these objects into individual fields. For example, a table with a raw JSON column can be turned into a table with one column per JSON property.

Solution: To accomplish this, SQL Server has native functions that can parse JSON objects and flatten them into individual fields. The code below uses JSON_VALUE, which extracts a scalar value from a JSON string. The first parameter takes an expression, typically the name of a variable or column that contains JSON text. The second parameter takes a JSON path that specifies the property to extract. The default path mode is lax mode, which uses the syntax '$.<keyname>'. Below is the SQL used to flatten the JSON values in the sample data.

select SalesOrderNumber
     , JSON_VALUE(JSONValue, '$.CustomerID') as CustomerID
     , JSON_VALUE(JSONValue, '$.OrderDate') as OrderDate
     , JSON_VALUE(JSONValue, '$.TotalDue') as TotalDue
     , JSON_VALUE(JSONValue, '$.ShipMethodID') as ShipMethodID
     , JSON_VALUE(JSONValue, '$.TerritoryID') as TerritoryID
     , JSON_VALUE(JSONValue, '$.SalesPersonID') as SalesPersonID
from OrderHeaderJSON

Other useful functions for working with JSON in SQL Server:
- ISJSON (Transact-SQL) tests whether a string contains valid JSON.
- JSON_VALUE (Transact-SQL) extracts a scalar value from a JSON string.
- JSON_QUERY (Transact-SQL) extracts an object or an array from a JSON string.
- JSON_MODIFY (Transact-SQL) changes a value in a JSON string.
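When one JSON_VALUE call per column becomes unwieldy, SQL Server 2016 and later also offers OPENJSON, which shreds a JSON document into typed columns in a single pass. Below is a minimal sketch against the same OrderHeaderJSON example; the column data types are assumptions and should be adjusted to match your actual data.

-- Hedged sketch (SQL Server 2016+): flatten the same keys with OPENJSON
-- instead of repeated JSON_VALUE calls. Column types are assumptions.
SELECT o.SalesOrderNumber,
       j.CustomerID,
       j.OrderDate,
       j.TotalDue,
       j.ShipMethodID,
       j.TerritoryID,
       j.SalesPersonID
FROM OrderHeaderJSON o
CROSS APPLY OPENJSON(o.JSONValue)
WITH (
    CustomerID     int            '$.CustomerID',
    OrderDate      datetime2      '$.OrderDate',
    TotalDue       decimal(19,4)  '$.TotalDue',
    ShipMethodID   int            '$.ShipMethodID',
    TerritoryID    int            '$.TerritoryID',
    SalesPersonID  int            '$.SalesPersonID'
) AS j;

Because the WITH clause casts each property to a declared type, the resulting columns arrive in the ElastiCube as numbers and dates rather than as text, which avoids extra conversions on the Sisense side.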
Filtering On Multiple Tags In A Single Row

Question: A relatively common database design strategy is to have a tag column holding an arbitrary number of tags. Here is an example in which we look at the numbers between 1 and 10 and tag them as even, odd, prime, perfect, or part of the Fibonacci sequence:

Now we want to be able to select rows based on the tags.

Solution: Option 1 is to create filters using "contains". This quickly becomes unworkable for large numbers of tags, as a new filter would have to be created for every tag. Option 2 is a more flexible solution that does the work on the ElastiCube side.

Step 1 (The Combo Tags table): Create a table of the unique tag combinations in the fact table. For example:

SELECT DISTINCT tags FROM [Tags.csv]

Step 2 (The Tag Mapper): Here we need to create a table where every individual tag links to every tag combination it appears in. In SQL this is more straightforward, so if you can do this in your database, look up STRING_SPLIT and similar unnest/explode functions; a SQL Server sketch is shown at the end of this article. In Sisense we have to be more roundabout: we check the first tag in every tag combination, then the second, then the third, and so on, up to the maximum number of tags in a record. The provided example supports at most three tags per row:

SELECT DISTINCT combo_tag, tag
FROM (
    SELECT tags AS combo_tag, tags AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 1) IS NULL OR STRPARTS(tags, ',', 1) = ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 1) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 1) IS NOT NULL AND STRPARTS(tags, ',', 1) != ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 2) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 2) IS NOT NULL AND STRPARTS(tags, ',', 2) != ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 3) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 3) IS NOT NULL AND STRPARTS(tags, ',', 3) != ''
) AS tmp

When done, the table will look like this, with each combo tag linking individually to each of its component tags. So the combo_tag Even,Prime,Fib appears three times: once with Even, once with Prime, and once with Fib.

Note: These results were ordered for display purposes. Do not use an ORDER BY in production environments, as it unnecessarily slows down the cube; you can order your table while testing to verify correctness, but remove the ordering when done.

Step 3 (Dim Tag): Now we just need a unique collection of the individual tags to create our dimension.

SELECT DISTINCT tag FROM [Tag_Mapper]

Step 4 (Make the joins):
- Join Dim_Tag.tag to Tag_Mapper.tag
- Join Tag_Mapper.combo_tag to Combo_Tags.tags
- Join Combo_Tags.tags to FactTable.tags

Let's review what happens in this example when the tags "Odd" and "Perfect" are selected. Dim_Tag joins to Tag_Mapper and grabs the following combo_tags:
- Odd
- Odd,Fib
- Odd,Prime,Fib
- Even,Perfect
- Odd,Prime

When that result is joined to Combo_Tags and then to our fact table, the following IDs are revealed: 1, 3, 5, 6, 7, 9.

The Combo_Tags table is needed between the Tag_Mapper and the fact table to prevent many-to-many joins. This approach will run into space issues as the tag lists become larger. The best recommendation here is to ensure that the tags are always stored in the same order (alphabetical, for example) so that Tag1,Tag2 and Tag2,Tag1 never both appear; this helps constrain the data size.

My ElastiCube:

And a dashboard:
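For reference, here is the SQL Server approach mentioned in Step 2. If the tag splitting can be pushed down to the source database, STRING_SPLIT (SQL Server 2016+) collapses the whole Tag_Mapper step into one query, with no limit on the number of tags per row. A minimal sketch, assuming a Combo_Tags table with a comma-separated tags column as above:

-- Hedged sketch (SQL Server 2016+): build the tag mapper with STRING_SPLIT
-- instead of one STRPARTS branch per tag position.
SELECT DISTINCT
       c.tags                 AS combo_tag,
       LTRIM(RTRIM(s.value))  AS tag
FROM Combo_Tags c
CROSS APPLY STRING_SPLIT(c.tags, ',') AS s
WHERE LTRIM(RTRIM(s.value)) <> '';

The trimming handles optional spaces after the commas, and the DISTINCT guards against a tag appearing twice in the same combination.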
Supporting Updates And Deletions In DB When Building Accumulate

Question: When using "Accumulate By" builds (see below), Sisense will only import "new" records based on a selected INT or DATETIME index field: if values in the source DB exceed the maximum value of that index as of the previous ElastiCube build, they will be inserted into the cube. The logic is similar to adding a WHERE clause to the SQL query: "WHERE Source.[Index] > ElastiCube.MAX[Index]". In some use cases we might need to support updates and deletions; typically, for updates, a new record is added to the source with the same ID as the initial record but with a later timestamp.

Solution: Tested on: Windows v8.2.4, Linux v8.2.1.

In order to track deleted records, you need to create logic on the data source side that allows you to identify those deleted records and differentiate them from their initial entry. Add a new column in the database, and whenever a record should be removed, instead of performing a DELETE operation, use an UPDATE which sets (see the sketch at the end of this article):
- The data columns to NULL
- The ID of the record stays the same
- The Deleted column = TRUE (or "1")

An initial "Replace All" build is required to import our first set of data. This initial data will be used as our baseline to determine whether additional records are new, deleted, or updated. The first build will use the "Replace All" option; subsequent builds will use the Accumulation option.

Example initial table ("FACT_Raw_Data"):

** IMPORTANT ASSUMPTION ** The [OrderDate] of each record is correctly set at the first insertion – no future or faulty dates can be set (the original record should not have a bad date).

Steps To Implement:

To obtain the record of interest, we first need to create a custom table in the ElastiCube ("Latest_Record") that flags the MAX date per transaction. The SQL should look something like this:

SELECT a.OrderID, MAX(a.[OrderDate]) AS max_record
FROM [FACT_Raw_Data] a
GROUP BY a.[OrderID]

Perform a build ("Changes Only" is okay) so we can use our new "Latest_Record" table in future custom table expressions.

Create another custom table ("FACT_Sales") that INNER JOINs the data in our initial "FACT_Raw_Data" table with the max record of each transaction in the "Latest_Record" table.

SELECT DISTINCT a.*
FROM [FACT_Raw_Data] a
INNER JOIN [Latest_Record] b
    ON a.[OrderID] = b.[OrderID]
   AND a.[OrderDate] = b.[max_record]
WHERE deleted = 0

The results of executing this query will be identical to the "FACT_Raw_Data" table at this point, as each transaction has only one record in that table (no updates or deletions have been performed yet):

Perform another build ("Changes Only" is okay again) so we can use our new "FACT_Sales" table. Every subsequent build should use the Accumulation option. Use the "Accumulate By" field option and select [FACT_Raw_Data].[OrderDate] as your field, or accumulate the entire table.

Second Load – Build Accumulate:

Dashboards will filter out deleted rows per the INNER JOIN applied in our "FACT_Sales" table (rows highlighted in red will not be displayed):

Note: Be sure to hide the initial fact table (in our case, "FACT_Raw_Data") and the "Latest_Record" table in the cube to prevent dashboard designers from inadvertently sourcing widgets from these tables. "FACT_Sales" should be the ONLY table incorporated into the larger data model and used to source widgets on the front end, to ensure accurate reporting.
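For completeness, the source-side soft-delete described above might look something like the sketch below. The table and column names (Orders_Source, Amount, CustomerName, @OrderID) are hypothetical; only the pattern matters: null out the data columns, keep the ID, and flip the deleted flag instead of deleting the row.

-- Hedged sketch: soft-delete a source row instead of issuing a hard DELETE.
-- Table and column names are hypothetical placeholders.
UPDATE Orders_Source
SET Amount       = NULL,   -- set the data columns to NULL
    CustomerName = NULL,
    deleted      = 1       -- flag the row as deleted; the OrderID itself stays the same
WHERE OrderID = @OrderID;
-- Depending on your setup, the "Accumulate By" field may also need to advance
-- (for example, a new timestamp) so the changed row is picked up by the next accumulate build.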