Flattening JSON Objects In SQL Server
Question: When a table in SQL Server contains JSON objects in a column, you can use the custom import SQL query editor in Sisense to flatten those objects into individual fields, turning a single JSON column into a set of ordinary columns. (The before/after example screenshots from the original post are not available; see the sketch at the end of this article for an illustration.)

Solution: SQL Server has native functions that can parse JSON objects and flatten them into individual fields. The code below uses JSON_VALUE, which extracts a scalar value from a JSON string. The first parameter takes an expression, typically the name of a variable or column that contains JSON text. The second parameter takes a JSON path that specifies the property to extract. The default path mode is lax mode, which uses the syntax '$.<keyname>'. Below is the SQL used to flatten the JSON values in the sample data.

SELECT SalesOrderNumber
, JSON_VALUE(JSONValue, '$.CustomerID') AS CustomerID
, JSON_VALUE(JSONValue, '$.OrderDate') AS OrderDate
, JSON_VALUE(JSONValue, '$.TotalDue') AS TotalDue
, JSON_VALUE(JSONValue, '$.ShipMethodID') AS ShipMethodID
, JSON_VALUE(JSONValue, '$.TerritoryID') AS TerritoryID
, JSON_VALUE(JSONValue, '$.SalesPersonID') AS SalesPersonID
FROM OrderHeaderJSON

Other useful functions for working with JSON in SQL Server:
- ISJSON (Transact-SQL) tests whether a string contains valid JSON.
- JSON_VALUE (Transact-SQL) extracts a scalar value from a JSON string.
- JSON_QUERY (Transact-SQL) extracts an object or an array from a JSON string.
- JSON_MODIFY (Transact-SQL) changes a value in a JSON string.
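Since the original screenshots are missing, here is a minimal, self-contained sketch of JSON_VALUE in action. The JSON document and its values are made up for illustration; only the key names come from the query above:

DECLARE @json NVARCHAR(MAX) =
    N'{"CustomerID": 29825, "OrderDate": "2011-05-31", "TotalDue": 23153.23}';

-- Each key becomes its own column in the result set
SELECT
    JSON_VALUE(@json, '$.CustomerID') AS CustomerID,
    JSON_VALUE(@json, '$.OrderDate')  AS OrderDate,
    JSON_VALUE(@json, '$.TotalDue')   AS TotalDue;

Note that JSON_VALUE always returns text (nvarchar); cast the result (e.g., CAST(... AS DATE)) if you need typed columns in the cube.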
Using SQL Coalesce To Handle Null Values

Question: SQL NULL is not technically a value, but rather a state. Fields are not equal or unequal to NULL; they either are or are not NULL. As such, NULL fields can get a little tricky when attempting different kinds of transformations. For example, a mathematical expression across multiple fields where any of the fields is NULL will result in a NULL value:

5 * null = null
null - null = null
5 + null = null

Consider a salary calculation where an employee (Regina, in the original example table, which is not shown here) has a Base of 49000 but no Commission.

Solution: In this case the Total for Regina should equal 49000, not NULL. To transform the NULL into a non-NULL value such as 0, we could use a SQL CASE expression. More efficiently, we can use COALESCE, which checks for NULL values in the specified expression (i.e., Base or Commission) and translates them to the specified value (i.e., 0). COALESCE is also useful for finding the first populated (i.e., non-null) value across multiple fields in the same record, e.g., selecting the first non-null phone number from the home, mobile, and office fields. The same logic written with COALESCE results in a more efficient and decipherable query than the CASE version, and both queries bear the same correct result. The original query screenshots are not available; see the sketch below.
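Because the screenshots are gone, here is a hedged reconstruction of both approaches. The table name (Employees) and the phone field names are assumptions; Base, Commission, and the expected total come from the text above:

-- CASE version: replace a NULL Commission with 0
SELECT Name,
       Base + CASE WHEN Commission IS NULL THEN 0 ELSE Commission END AS Total
FROM Employees;

-- COALESCE version: same result, shorter and easier to read
SELECT Name,
       Base + COALESCE(Commission, 0) AS Total
FROM Employees;

-- COALESCE picking the first non-null value across several fields
SELECT Name,
       COALESCE(HomePhone, MobilePhone, OfficePhone) AS Phone
FROM Employees;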
Calculating YTD With Variable Start And End Dates Of The Year (Academic Calendar)

Question: In a specific industry like Education, even though the number of days is roughly the same every year, each academic year can start and end on different dates. The usual way of calculating YTD doesn't work for this requirement because a school year can start and end in the middle of a calendar year.

Solution: The idea is to compare years by the number of days that have passed since the current year started, e.g., compare the first 135 days of this school year with the first 135 days of last school year, the first 135 days of the school year before that, and so on, regardless of the actual dates. To accomplish this, we need to create a date dimension table with a day rank and a YTD flag. Day rank is an enumeration of school days within each school year. By comparing each day's rank with the rank of today's date, we can calculate the YTD flag.

We need these two tables for this solution to work (both were shown as screenshots in the original post):
- A School Year table: the list of school years with start and end dates
- A Date table: the list of dates (you only need the Date column for this purpose). You can use the date dimension Excel file from this article.

Step 1: Create a new dim date table using the custom table option. The logic, as shown in the SQL below, is:
- Get the list of school dates (only dates between the start and end dates of each school year) and remove dates that fall on Saturday and Sunday (optional).
- Rank the dates within each school year.
- Based on the result from above, get today's date rank.
- Compare each date's rank with today's rank to calculate the YTD flag.

You can use this SQL as an example:

--Join the list of school dates with today's date rank
--Compare each date's rank with today's date rank to create the YTD flag
SELECT *,
       CASE WHEN t1.[Day Rank] <= t2.[Today Day Rank] THEN 1 ELSE 0 END AS isYTD
FROM (
    --Get the list of only school dates (dates between start and end date)
    --Remove dates that fall on Saturday and Sunday (not school dates)
    --Rank dates within each school year
    SELECT d.Date, s.[School Year], s.[Start Date], s.[End Date],
           rankasc(s.[School Year], d.Date) AS [Day Rank]
    FROM [Dim Dates] d
    JOIN [School Year] s ON d.Date BETWEEN s.[Start Date] AND s.[End Date]
    WHERE DayOfWeek(d.Date) <= 5
) t1
JOIN (
    --Get today's date rank
    SELECT t.[Day Rank] AS [Today Day Rank]
    FROM (
        --Get the list of only school dates (dates between start and end dates of each year)
        --Remove dates that fall on Saturday and Sunday (not school dates)
        --Rank dates within each school year
        SELECT d.Date, s.[School Year], s.[Start Date], s.[End Date],
               rankasc(s.[School Year], d.Date) AS [Day Rank]
        FROM [Dim Dates] d
        JOIN [School Year] s ON d.Date BETWEEN s.[Start Date] AND s.[End Date]
        WHERE DayOfWeek(d.Date) <= 5
    ) t
    WHERE t.Date = createdate(getyear(now()), getmonth(now()), getday(now()))
) t2 ON 1=1

The resulting table (shown as a screenshot in the original post) contains the Date, School Year, Start Date, End Date, Day Rank, Today Day Rank, and isYTD columns; a worked illustration follows at the end of this article.

Step 2: Link the fact table(s) to the new date dimension table on the Date field.

Step 3: Create your widget(s) and add the YTD flag as a filter, set to 1.
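Since the original School Year screenshot and result table are not shown, here is a small worked illustration written as SQL comments; all dates and ranks are hypothetical:

-- School Year (hypothetical):
--   School Year | Start Date | End Date
--   2022-2023   | 2022-08-15 | 2023-05-26
--   2023-2024   | 2023-08-14 | 2024-05-24
--
-- If today is the 135th school day of 2023-2024, then [Today Day Rank] = 135,
-- and every school date in ANY school year with [Day Rank] <= 135 receives
-- isYTD = 1. Filtering widgets on isYTD = 1 therefore compares the first 135
-- school days of each year, regardless of the calendar dates involved.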
How To Concatenate Three Strings In Elasticube

Question: I have a requirement to combine two dimensions and show them as one. There are two dimensions, Product and Product_Family. I want to display them as a single dimension in charts, like "Product 1 | Product Family 1", "Product 2 | Product Family 1", and so on.

Solution: If all the dimensions are of string data type, it is enough to use the '+' sign between them and the '|' separator. If they are not strings yet, you can use the TOSTRING() function on them before using '+'. You can also use the Concat function (the separator argument here was garbled in the original post and is reconstructed from the desired output):

Concat(Concat([Product1], ' | '), [Product2])

Or, without a separator:

Concat([Product1], [Product2])

A fuller sketch with three fields follows below.
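The title mentions three strings, but the examples above chain only two. A minimal sketch chaining three fields; the field names are assumptions, and TOSTRING is shown for a hypothetical numeric field:

-- Using '+' (all operands must already be strings):
[Product] + ' | ' + [Product_Family] + ' | ' + TOSTRING([Product_ID])

-- The equivalent with nested two-argument Concat calls:
Concat(Concat(Concat(Concat([Product], ' | '), [Product_Family]), ' | '), TOSTRING([Product_ID]))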
Filtering On Multiple Tags In A Single Row

Question: A relatively common database design strategy is to have a tag column holding an arbitrary number of tags. Here is an example in which we look at the numbers between 1 and 10 and tag them as even, odd, prime, perfect, or part of the Fibonacci sequence. (The original sample-data screenshot is not available.) Now we want to be able to select rows based on the tags.

Solution: Option 1 is to create filters using "contains", but this quickly becomes unworkable for large numbers of tags, as a new filter would have to be created for every tag. Option 2 is a more flexible solution that does the work on the Elasticube side.

Step 1 (the Combo_Tags table): Create a table of the unique tag combinations in the fact table. For example:

SELECT DISTINCT tags FROM [Tags.csv]

Step 2 (the Tag_Mapper table): Here we need to create a table where every individual tag links to every tag combination it appears in. In a SQL database this is more straightforward, so if you can do this at the source, look up STRING SPLIT and UNROLL functions (see also the hedged sketch at the end of this article). In Sisense we have to be more roundabout: we check the first tag in every tag combo, then the second, then the third, and so on, up to the maximum number of tags in a record. The example provided supports at most three tags per row:

SELECT DISTINCT combo_tag, tag
FROM (
    SELECT tags AS combo_tag, tags AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 1) IS NULL OR STRPARTS(tags, ',', 1) = ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 1) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 1) IS NOT NULL AND STRPARTS(tags, ',', 1) != ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 2) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 2) IS NOT NULL AND STRPARTS(tags, ',', 2) != ''
    UNION ALL
    SELECT tags AS combo_tag, STRPARTS(tags, ',', 3) AS tag
    FROM [Combo_Tags]
    WHERE STRPARTS(tags, ',', 3) IS NOT NULL AND STRPARTS(tags, ',', 3) != ''
) AS tmp

When done, each combo tag in the table links individually to each of its components, so the combo_tag Even,Prime,Fib appears three times: once with Even, once with Prime, and once with Fib. Note: these results were ordered for display reasons; do not use an ORDER BY in production environments, as it unnecessarily slows down the cube. You can order your table while testing to ensure correctness, but remove the ordering when done.

Step 3 (the Dim_Tag table): Now we just need a unique collection of the individual tags to create our dim:

SELECT DISTINCT tag FROM [Tag_Mapper]

Step 4 (make the joins):
- Join Dim_Tag.tag to Tag_Mapper.tag
- Join Tag_Mapper.combo_tag to Combo_Tags.tags
- Join Combo_Tags.tags to FactTable.tags

Let's review what happens in this example when the tags "Odd" and "Perfect" are selected. Dim_Tag joins to Tag_Mapper and grabs the following combo_tags:
- Odd
- Odd,Fib
- Odd,Prime,Fib
- Even,Perfect
- Odd,Prime

When that table is joined to Combo_Tags and then to our fact table, the IDs 1, 3, 5, 6, 7, and 9 are revealed. The Combo_Tags table is needed between the Tag_Mapper and the fact table to prevent many-to-many joins.

This approach will run into space issues as the tag lists become larger. The best recommendation here is to ensure that the tags are always stored in the same order (alphabetical, etc.) so that we don't end up with both Tag1,Tag2 and Tag2,Tag1. This will help constrain the data size. (The original post closed with screenshots of the example Elasticube and a dashboard, which are not available.)
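If the splitting can be pushed down to the source database, SQL Server (2016 and later) can build the whole Tag_Mapper in one statement with STRING_SPLIT instead of per-position UNION ALL branches. A minimal sketch, assuming a source table named Tags with a tags column:

SELECT DISTINCT t.tags AS combo_tag,
       LTRIM(s.value) AS tag   -- trim in case tags are stored as "a, b"
FROM Tags t
CROSS APPLY STRING_SPLIT(t.tags, ',') AS s;

STRING_SPLIT returns one row per delimited element in a column named value, so no assumption about the maximum number of tags per row is needed.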
Split A Delimited Field Into Rows

Question: A data source in your ElastiCube contains a field with multiple values in it, delimited by a particular character (e.g., comma, semicolon, space). You need to parse that field and create a table with a record for every value in the delimited field. For our example we will be working with a .csv file ("emails") containing only two columns, RowID and email. The emails.email field contains multiple email addresses delimited by a comma. (The original sample screenshot is not available; see the illustration at the end of this article.)

Solution:
1. Import the table containing the delimited field into your ElastiCube as a standalone data source (meaning do not connect it to the larger data model via a relationship). This table can be hidden to prevent dashboard designers unfamiliar with the schema from using it in dashboards; it will act only as the source for our custom table. The solution is data source agnostic, so it doesn't matter whether this source is a SQL database, Excel workbook, Salesforce connector, etc.
2. Create a custom table ("+ Custom" at the top of the web ElastiCube manager). For the purposes of this example the custom table will be named "emails_delimited". Paste the following code snippet:

SELECT *
FROM (
    SELECT *, 1 AS n, StrParts([email], ',', 1) AS email_split FROM [emails]
    UNION ALL
    SELECT *, 2 AS n, StrParts([email], ',', 2) AS email_split FROM [emails]
    UNION ALL
    SELECT *, 3 AS n, StrParts([email], ',', 3) AS email_split FROM [emails]
    UNION ALL
    SELECT *, 4 AS n, StrParts([email], ',', 4) AS email_split FROM [emails]
    UNION ALL
    SELECT *, 5 AS n, StrParts([email], ',', 5) AS email_split FROM [emails]
    UNION ALL
    SELECT *, 6 AS n, StrParts([email], ',', 6) AS email_split FROM [emails]
) emails_delimited
WHERE email_split IS NOT NULL AND email_split != ''
UNION ALL
SELECT *, 1 AS n, [email] AS email_split FROM [emails] WHERE [email] NOT LIKE '%,%'

3. Replace the following with the fields applicable to your use case:
- [email] = the delimited field
- ',' = your delimiter (don't forget to replace it in the last line of the snippet as well!)
- email_split = the split field name
- [emails] = the initial source table
4. Preview to confirm the custom table returns as expected.

Note that in Step 2 you will need a SELECT in the UNION for the maximum possible number of values parsing the delimited field can yield. For example, the code snippet above has 6 SELECTs, as parsing RowID 2 yields 6 emails (RowID 1 yields 3, RowID 3 yields 1, etc.).
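Since the original screenshots are gone, here is an illustration of the input and expected output, written as SQL comments; the addresses are made up:

-- emails (input):
--   RowID | email
--   1     | ann@x.com,bob@x.com,cam@x.com
--   3     | dee@x.com
--
-- emails_delimited (output):
--   RowID | n | email_split
--   1     | 1 | ann@x.com
--   1     | 2 | bob@x.com
--   1     | 3 | cam@x.com
--   3     | 1 | dee@x.com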
Supporting Updates And Deletions In DB When Building Accumulate

Question: When using "Accumulate By" builds, Sisense will only import "new" records, based on a selected INT or DATETIME index field: if values in the source DB exceed the maximum value of that index as of the previous ElastiCube build, they will be inserted into the cube. The logic is similar to adding a WHERE clause to the SQL query: WHERE Source.[Index] > ElastiCube.MAX([Index]). In some use cases we might need to support updates and deletions; typically, for updates, a new record is added to the source with the same ID as the initial record but with a later timestamp.

Solution: Tested on Windows v8.2.4 and Linux v8.2.1.

In order to track deleted records, you need to create logic on the data source side that allows identifying those deleted records and differentiating them from their initial entry. Add a new column in the database, and whenever a record should be removed, instead of performing a DELETE operation, use an UPDATE which sets:
- The data columns to NULL
- The ID of the record to remain the same
- The Deleted column to TRUE (or "1")

A hedged sketch of this source-side soft delete appears at the end of this article.

An initial "Replace All" build is required to import our first set of data. This initial data will be used as our baseline to determine whether additional records are new, deleted, or updated. The first build will use the "Replace All" option; subsequent builds will use the Accumulation option. The example initial table ("FACT_Raw_Data") was shown as a screenshot in the original post.

** IMPORTANT ASSUMPTION ** The [OrderDate] of each record is correctly set on first insertion; no future/faulty dates can be set (the original record should not have a bad date).

Steps to implement:

1. To obtain the record of interest, we first need to create a custom table in the ElastiCube ("Latest_Record") to flag the MAX date per transaction. The SQL should look something like this:

SELECT a.OrderID, MAX(a.[OrderDate]) AS max_record
FROM [FACT_Raw_Data] a
GROUP BY a.[OrderID]

2. Perform a build ("Changes Only" is okay) so we can use our new "Latest_Record" table in future custom table expressions.

3. Create another custom table ("FACT_Sales") INNER JOINing the data in our initial "FACT_Raw_Data" table with the max record of each transaction in the "Latest_Record" table:

SELECT DISTINCT a.*
FROM [FACT_Raw_Data] a
INNER JOIN [Latest_Record] b
    ON a.[OrderID] = b.[OrderID]
    AND a.[OrderDate] = b.[max_record]
WHERE deleted = 0

The results of executing this query will be identical to the "FACT_Raw_Data" table at this point, as each transaction only has one record in that table (no updates or deletions have been performed yet).

4. Perform another build ("Changes Only" is okay again) so we can use our new "FACT_Sales" table. Every subsequent build should use the Accumulation option.

5. Use the "Accumulate By" field option and select [FACT_Raw_Data].[OrderDate] as your field, or accumulate the entire table.

On the second load (build accumulate), dashboards will filter out deleted rows per the INNER JOIN applied in our "FACT_Sales" table (in the original screenshot, the rows highlighted in red are not displayed).

Note: Be sure to hide your initial fact table (in our case, "FACT_Raw_Data") and the "Latest_Record" table in the cube to prevent dashboard designers from inadvertently sourcing widgets from these tables. "FACT_Sales" should be the ONLY table incorporated into the larger data model and sourcing widgets on the front end, to ensure accurate reporting.
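The soft-delete UPDATE itself runs on the source database. A minimal sketch, assuming SQL Server; the data column names (Quantity, Amount) and the OrderID value are hypothetical, and bumping OrderDate forward is an assumption based on how the accumulate WHERE clause works (only rows past the previous maximum are imported, so the tombstone must carry a later timestamp for the build to pick it up and for Latest_Record to treat it as the latest version):

-- Instead of: DELETE FROM FACT_Raw_Data WHERE OrderID = 1001;
UPDATE FACT_Raw_Data
SET Quantity  = NULL,       -- data columns nulled out
    Amount    = NULL,
    deleted   = 1,          -- flag used by the FACT_Sales WHERE clause
    OrderDate = GETDATE()   -- later timestamp so the next accumulate imports it
WHERE OrderID = 1001;       -- the ID stays the same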
Connecting To The Elasticube From Microsoft Excel

Question: The purpose of this post is to demonstrate how to connect to a Sisense Elasticube from Microsoft Excel.

Solution: Begin by installing the Sisense ODBC driver: https://www.sisense.com/data-connectors/odbc/

Once it is installed, open ODBC Data Sources (64-bit) from the Windows search (just search "odbc" and it will pop up), go to the System DSN tab, and add a new connection. Then choose the Sisense ODBC Driver. Add a name and a description, then click on the empty box for Elasticube names. This will prompt you for a Sisense login. If working on the same server as Sisense, the server will be localhost:8081. The username and password correspond to your login in Sisense. Note: If you receive a licensing error, please reach out to your Customer Success Representative. Finally, select an Elasticube and save the connection. Now you are ready to connect to the Elasticube from Excel.

Open a new workbook in Excel and select the Data tab > Get Data > From Other Sources > From ODBC. You will be prompted to either select an existing connection or enter a connection string. Select the connection and continue. You may be prompted for credentials; however, because the connection already contains them, they are irrelevant at this stage. You can enter a random value and click continue if it will not let you continue with an empty box.

Once the connection is established, you should see all tables available in the Elasticube. Please note that you cannot see hidden fields through the ODBC connection. In the case above, I have selected the Admissions table and loaded it into Excel. From this point you should have a populated table with the ability to refresh. You can also merge or append queries to combine two tables, or otherwise customize the query; see the sketch below, and refer to the documentation for Excel Power Query.
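When first connecting, the From ODBC dialog also lets you supply a SQL statement under Advanced options instead of picking tables. A hedged sketch, assuming the Sisense ODBC driver accepts standard SELECT/JOIN syntax; Admissions comes from the example above, while the Departments table and the join fields are hypothetical:

SELECT a.*, d.DepartmentName
FROM Admissions a
JOIN Departments d ON a.DepartmentID = d.DepartmentID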
Accumulation Build For Millisec (Ms) By Using Unix Timestamp Transformation

Question: This post covers accumulating by milliseconds using a Unix timestamp transformation. In this use case, we are working with a CData Elasticsearch ODBC driver, and the client needs a way to accumulate data based on milliseconds. Natively, we are unable to accumulate down to the millisecond, but with this workaround we are able to do so.

Solution: This is done when pulling the data in from your data source, where we edit the query and transform the datetime into a Unix timestamp. Please keep in mind this uses functions from your native data source; in this case we are using the CData ODBC driver, which supports the function needed for this transformation. Using the following function, we bring the data in transformed into a Unix timestamp in milliseconds:

SELECT datediff('millisecond', DATEFROMPARTS(1970, 1, 1), datetime)
FROM demosisense.testDate

After this is complete, you can set the accumulation based on the new field we created. Following regular Sisense functionality, you can then hide this field while doing an accumulative build on it. And this will accumulate down to the millisecond!
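The same epoch-difference pattern works for other sources. For example, a sketch for a SQL Server source; the table and column names are assumptions. DATEDIFF_BIG (SQL Server 2016+) is used because plain DATEDIFF returns a 32-bit int and overflows for millisecond spans longer than about 25 days:

SELECT DATEDIFF_BIG(millisecond, '1970-01-01', [EventDateTime]) AS unix_ms
FROM dbo.Events;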
Editing An Existing SQL Custom Import Query

Question: I want to edit an existing custom import query. How can I do so?

Solution: Open the context menu on the table with the custom import query and choose "Change connection settings for table". From there you will be able to see your custom import query and edit it. It is also described in the Sisense documentation.