Schedule sequential ElastiCube builds using Windows Task Scheduler
In many cases we want to schedule ElastiCube builds to run one after another with a single command, or to run at a specific hour so that users are not creating or viewing dashboards while a cube is building, and so that they always have recently updated data. Although you can schedule a cube to build at a specific hour through the scheduled build settings in the ElastiCube Manager, this post shows how to define more complex and specific build conditions, such as building all of your ElastiCubes in one process at a specific hour.

Step 1: Copy the script and save it as a .bat file

This script runs a cube build in a few steps. The following lines are the commands that build the cubes; add a new line for every ElastiCube you want to build. It is crucial to include the "/wait" flag so that each command starts only after the previous one has finished.

@echo off
start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build filename="C:\<My_Cube_A>.ecube" serverAddress="LocalHost" mode="restart"
start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build filename="C:\<My_Cube_B>.ecube" serverAddress="LocalHost" mode="restart"

For more information about available command parameters, please see our PSM documentation at https://sisense.dev/guides/legacy/psm/#sisense-shell-psm-exe.

Using the PSM commands, you can build from either a .ecube file or an existing ElastiCube.

Example of building from a .ecube file:

start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build filename="C:\Users\Administrator\Documents\My ElastiCubes\Northwind.ecube" serverAddress="LocalHost" mode="restart"

Example of building from an existing ElastiCube in the ElastiCube Server Console:

start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build name="Northwind" serverAddress="LocalHost" mode="restart"

Here are the available build types:

mode="restart": Rebuilds the ElastiCube entirely.
mode="full": Rebuilds the ElastiCube and accumulates data for tables marked as accumulative. This mode should only be used for accumulative builds.
mode="changes": Rebuilds from scratch only the tables that have changed in the ElastiCube schema.
mode="metadataonly": Updates the ElastiCube Server with the ElastiCube schema, without building.

Step 2: Creating a new task

Open the Windows Task Scheduler and create a new task. Name the task and set the security options as follows. *Note that the user account running the task MUST be an administrator user.

Step 3: Setting the triggers

In the Triggers tab, create a new trigger and define when the command should run; this can be an interval of every few hours, or a specific hour. If you want to run the task several times a day at predetermined hours, create a new trigger for each build occurrence (e.g., one occurrence in the morning and one at night).

Step 4: Specifying the build action and finishing

In the Actions tab, create a new action, choose "Start a Program", and select the .bat file you created. Edit the Conditions and Settings tabs as required (optional) and hit OK. Test the task by right-clicking it and choosing "Run".

If you want one cube to run every hour and another cube to run only once a day at a specific time, without both builds running at the same time, you can create a task that runs every hour, with a script containing a time-window condition for the second cube.
The following script builds CUBE_A every time the task starts, but builds CUBE_B only if the time is between 4pm and 5pm:

@echo off
start /wait /d "C:\Program Files\Sisense\Prism" psm ecs stop
start /wait /d "C:\Program Files\Sisense\Prism" psm ecs start
start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build name="CUBE_A" serverAddress="LocalHost" mode="restart"
set "now=%time: =0%"
if "%now%" geq "16:00:00,00" if "%now%" lss "17:00:00,00" (
    start /wait /d "C:\Program Files\Sisense\Prism" psm ecube build name="CUBE_B" serverAddress="LocalHost" mode="restart"
)

In the "if" line, change the time window to match when the other cubes should run. Add a line for each cube, and add a new conditional block for every time window (notice the parentheses). Note that the format of %time% depends on the system locale; the example above assumes a comma as the decimal separator.

**NOTE: If a build fails for some reason, the subsequent builds in the script will not run.
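If you prefer Python over batch for the same job, here is a minimal sketch of the equivalent logic, assuming psm.exe lives at the default install path used above; the cube names and the 4pm-5pm window are placeholders taken from the batch example.

import subprocess
from datetime import datetime

# Assumed default install path from the batch examples above; adjust as needed.
PSM = r"C:\Program Files\Sisense\Prism\psm.exe"

def build(cube_name):
    # subprocess.run blocks until psm returns, so cubes build strictly one
    # after another, just like "start /wait" in the batch script. check=True
    # raises an error if psm exits with a non-zero code, stopping later builds.
    subprocess.run([PSM, "ecube", "build", "name=" + cube_name,
                    "serverAddress=LocalHost", "mode=restart"], check=True)

build("CUBE_A")  # runs on every task start

# Build CUBE_B only between 4pm and 5pm, mirroring the batch time window,
# without the locale-dependent string comparison of %time%.
if 16 <= datetime.now().hour < 17:
    build("CUBE_B")

A side benefit of this version is that the time check does not depend on the system's time format, and it can be scheduled the same way with "python build_cubes.py" as the task action.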
Export a table from the ElastiCube to a CSV

* Disclaimer: This works for Windows only. It will work with Linux versions in the future as well; once it does, I will update the post.

Sometimes we need to export a table from the ElastiCube to a CSV file. We can then import it into the next build, send it to a colleague, back it up, or use it in any other way. (In the Sisense Desktop version, we could have done this using the export-to-CSV post plugin; however, that is not available in the web ECM.)

The method described below utilizes the following API call:

http://localhost:8081/api/datasources/LocalHost/ECName/sql?query=select%20*%20from%20TableName&format=csv

If there are spaces in the name of the ElastiCube or of the table, encode them as %20.

Happy exporting 🙂
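A quick way to script this call, for example from a scheduled job: the sketch below uses Python's requests library. The server address, cube, and table names are placeholders from the URL above, and the Bearer-token header is an assumption about how your server authenticates REST calls.

import requests

BASE = "http://localhost:8081"   # server address from the URL above
CUBE = "ECName"                  # placeholder ElastiCube name
TABLE = "TableName"              # placeholder table name
TOKEN = "<your REST API token>"  # assumption: the endpoint accepts a Bearer token

resp = requests.get(
    f"{BASE}/api/datasources/LocalHost/{CUBE}/sql",
    params={"query": f"SELECT * FROM {TABLE}", "format": "csv"},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

with open(f"{TABLE}.csv", "wb") as f:
    f.write(resp.content)  # save the CSV exactly as returned

Note that requests URL-encodes the query parameters for you, which takes care of the %20 encoding mentioned above.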
How to export .ecdata and .dash files

Introduction

This forum post will demonstrate how to save your ElastiCube .ecdata file, which contains your ElastiCube schema and data, and the .dash file of your desired dashboard.

Purpose/Benefits

The .ecdata file contains both the data and the logical structure of the ElastiCube model; the .dash file contains the widget structure logic and the relation to the relevant ElastiCube. Exporting the .ecdata and .dash files will allow you to easily import the files and run your ElastiCube/dashboard environment. Please note that security settings and users are stored in MongoDB and will not be stored within these files.

To avoid confusion between .ecdata and .ecube files, please see the following post: Difference between .ecdata and .ecube files

Steps

Exporting .ecdata (.sdata for Linux)

Windows:

Step 1 - Open the ElastiCube Manager, open 'Servers' and choose 'Manage Servers' to open the Sisense Server Console.

Step 2 - Stop the relevant ElastiCube and choose Export. This will open a window allowing you to choose the location of the .ecdata file.

Note: if your ElastiCube is very large, the export can take a while to complete. We recommend stopping your web server (IIS / IIS Express) to keep the ElastiCube from starting automatically during this time as a result of incoming queries.

Linux:

Use the Sisense CLI to export the .sdata file: https://docs.sisense.com/main/SisenseLinux/using-sisense-cli-commands.htm

The export command should be used. Example:

si elasticubes export -name MyCube -path \"/opt/path/to/NameOfCube\" -file cubeName.sdata

Another option is to export using v2 of the REST API; to get the ID of the model, use the corresponding data models API call (see the REST API reference for the exact requests). Or do the same from the GUI: Open the Data page. For the relevant ElastiCube, select > Export Model. Select to export the model schema and the data. Click Export.

Exporting .dash

This is the same for Windows and Linux. Open the Sisense web interface, click the more-options menu of the relevant dashboard and choose Export Dashboard. The dashboard will be downloaded to your local drive.

Creating a sample set of your ElastiCube

This is the same for Windows and Linux. If the .ecdata file is too large to transfer, you might want to create a smaller ElastiCube with sample data. To do so, duplicate the model (so as not to impact your dashboards) and build the duplicated model with a limited amount of rows.
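On the .dash export above: if you need to script it rather than use the web interface, something like the following sketch may work. The v1 dashboards export path is an assumption (verify it against your server's REST API reference), and the token and dashboard ID are placeholders.

import requests

BASE = "http://localhost:8081"   # placeholder server address
TOKEN = "<your REST API token>"  # placeholder credentials
DASH_ID = "<dashboard id>"       # placeholder dashboard OID

# Assumption: the v1 REST API exposes a per-dashboard .dash export.
resp = requests.get(f"{BASE}/api/v1/dashboards/{DASH_ID}/export/dash",
                    headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()

with open("dashboard.dash", "wb") as f:
    f.write(resp.content)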
Alert on a M2M Risk

Introduction

This article describes how to enable an alert* when there is a risk of a many-to-many (M2M) relationship between two or more tables.

* This implementation involves the use of Pulse and is supported in Sisense v6.5 and above.

Motivation

Almost every data model designer who has dealt with somewhat advanced modeling has at some point run into a data or performance issue and, after a long troubleshooting process, found that the root cause was M2M. M2M can occur for the following reasons**:

Connecting two tables with a pure M2M relationship, such as two fact tables with a non-unique key.
A unique key becoming non-unique due to changes or a data issue in the source system.
A business need for M2M between two entities.

From these scenarios you can see that M2M can be introduced not only during ElastiCube design but also during its ongoing use. A proactive alert for the ElastiCube designer can therefore spare users poor dashboard performance and wrong result sets, and save many hours of troubleshooting, panicking users, and general frustration.

** There are other, more complex cases of M2M that we won't describe here, as the M2M alert mechanism is less relevant for them. Please refer to this article for a more detailed description of M2M relationships.

Implementation

As a data model designer, during the design phase and after the final version of my schema is completed, I want to check my connections and make sure that my assumption of uniqueness on one side of each relationship (giving a 1:M relationship) actually holds.

Step 1 - Build a M2M detection table

We'll create a table that reflects the uniqueness of each table in the final schema that should be unique according to its primary or surrogate key. We count the total number of rows in the table, count the number of unique values of the table key, and compare the two. If they are equal, the key is unique within the table; if not, there is a risk of M2M, so we raise a flag. In our case, the output shows that only one table (Cost) is not unique.
Here's the query I used in order to create this table:

SELECT 'DimDates' AS Table, 'Date' AS Key,
       count(t.[Date]) [Total Count],
       DISTINCT_count(t.[Date]) [Distinct Count],
       count(t.[Date]) - DISTINCT_count(t.[Date]) AS [Duplicated Records],
       ifint(count(t.[Date]) - DISTINCT_count(t.[Date]) > 0, 1, 0) AS M2M_Flag
FROM DimDates t
UNION ALL
SELECT 'Customer' AS Table, 'CustomerID' AS Key,
       count(t.CustomerID) [Total Count],
       DISTINCT_count(t.CustomerID) [Distinct Count],
       count(t.CustomerID) - DISTINCT_count(t.CustomerID) AS [Duplicated Records],
       ifint(count(t.CustomerID) - DISTINCT_count(t.CustomerID) > 0, 1, 0) AS M2M_Flag
FROM Customer t
UNION ALL
SELECT 'SalesPerson' AS Table, 'BusinessEntityID' AS Key,
       count(t.BusinessEntityID) [Total Count],
       DISTINCT_count(t.BusinessEntityID) [Distinct Count],
       count(t.BusinessEntityID) - DISTINCT_count(t.BusinessEntityID) AS [Duplicated Records],
       ifint(count(t.BusinessEntityID) - DISTINCT_count(t.BusinessEntityID) > 0, 1, 0) AS M2M_Flag
FROM SalesPerson t
UNION ALL
SELECT 'Product' AS Table, 'ProductID' AS Key,
       count(t.ProductID) [Total Count],
       DISTINCT_count(t.ProductID) [Distinct Count],
       count(t.ProductID) - DISTINCT_count(t.ProductID) AS [Duplicated Records],
       ifint(count(t.ProductID) - DISTINCT_count(t.ProductID) > 0, 1, 0) AS M2M_Flag
FROM Product t
UNION ALL
SELECT 'SalesOrderHeader' AS Table, 'SalesOrderID' AS Key,
       count(t.SalesOrderID) [Total Count],
       DISTINCT_count(t.SalesOrderID) [Distinct Count],
       count(t.SalesOrderID) - DISTINCT_count(t.SalesOrderID) AS [Duplicated Records],
       ifint(count(t.SalesOrderID) - DISTINCT_count(t.SalesOrderID) > 0, 1, 0) AS M2M_Flag
FROM SalesOrderHeader t
--Example for uniqueness by surrogate key
UNION ALL
SELECT 'Cost' AS Table, 'Size + Weight' AS Key,
       count(t2.Key) [Total Count],
       DISTINCT_count(t2.Key) [Distinct Count],
       count(t2.Key) - DISTINCT_count(t2.Key) AS [Duplicated Records],
       ifint(count(t2.Key) - DISTINCT_count(t2.Key) > 0, 1, 0) AS M2M_Flag
FROM (SELECT t.Size + '|' + tostring(t.Weight) AS Key FROM Cost t) t2
--Example for checking uniqueness of a table that is not in the final schema but affects another table (as the source of a lookup custom field)
UNION ALL
SELECT 'Store' AS Table, 'BusinessEntityID' AS Key,
       count(t.BusinessEntityID) [Total Count],
       DISTINCT_count(t.BusinessEntityID) [Distinct Count],
       count(t.BusinessEntityID) - DISTINCT_count(t.BusinessEntityID) AS [Duplicated Records],
       ifint(count(t.BusinessEntityID) - DISTINCT_count(t.BusinessEntityID) > 0, 1, 0) AS M2M_Flag
FROM Store t

Step 2 - Create the relevant KPI within a widget

We'll create an indicator with the following formula:

if([Total M2M_Flag]>0,1,0)

This widget can be temporary, as it only serves to alert us when the total of M2M flags is greater than 0.

Step 3 - Add the KPI to Pulse

After adding the widget to Pulse, set the alert condition to the threshold 'Greater than' 0. Feel free to adjust the notification mode and other preferences according to your requirements.
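Outside of Pulse, the detection table can also be polled programmatically. Below is a rough Python sketch that reuses the SQL API endpoint shown in the CSV export post earlier in this collection; the cube name, the detection table name (M2M_Detection), and the token are placeholders, and the column names follow the query above.

import csv
import io
import requests

BASE = "http://localhost:8081"   # placeholder server address
CUBE = "MyCube"                  # placeholder ElastiCube name
TOKEN = "<your REST API token>"  # placeholder credentials

# Query the detection table built in Step 1 for rows with a raised flag.
resp = requests.get(
    f"{BASE}/api/datasources/LocalHost/{CUBE}/sql",
    params={"query": "SELECT * FROM M2M_Detection WHERE M2M_Flag = 1",
            "format": "csv"},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

# Report every table whose key is no longer unique.
for row in csv.DictReader(io.StringIO(resp.text)):
    print("M2M risk: " + row["Table"] + " (key: " + row["Key"] + ")")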
What is the difference between .ecdata and .ecube data files?

Both .ecdata and .ecube files are used to back up data and to transfer data between ElastiCube Servers.

Your data model, or schema, is saved in a .ecube file. .ecube files contain the schema and the data connection settings required for building your ElastiCube; they do not contain the raw data. The ElastiCube Manager stores the design of the data schema in a .ecube file. It is not the data itself, but only a "blueprint" of your data sources and structure. Saving a file in the ElastiCube Manager will output a .ecube file.

After building your ElastiCube from a .ecube file, your ElastiCube with the data is stored in a .ecdata file. The .ecdata file contains both the data and the logical structure of the data. Exporting a file from the Sisense Server Console will output a .ecdata file.

Typically, .ecube files are very small (a few KB), whereas .ecdata files are significantly larger (possibly reaching GBs, depending on your data).

Learn how to export a .ecdata file in the post "How to export .ecdata and .dash files" above.
Some Causes Of Build Failures

One common reason your dashboards may not work as intended is that your ElastiCube builds are not designed properly. This isn't a catch-all problem, and though there are multiple reasons why it can occur, the outcome is the same: insights aren't accurate, dashboards can't access data, or information is displayed improperly. The issue with build failures is that they aren't always readily visible, and discovering them requires understanding the underlying causes of the errors. Fortunately, these errors are usually not catastrophic, and each has a straightforward fix.

The Top 4 Reasons ElastiCubes Fail, and How to Solve Them

These are some of the most common reasons your ElastiCubes can malfunction, and the best ways to quickly resolve problems as they arise:

Data Modeling Issues - Many-to-Many Relationships

Smart data modeling is the backbone of a healthy deployment, but some common mistakes, such as many-to-many relationships, can undermine your analysis and put unnecessary load on your ElastiCube. A many-to-many relationship exists when the value in each field used to create a relationship between tables appears multiple times in each table. Many-to-many relationships can cause duplications in the returned datasets, which can result in incorrect results and might consume excessive computing resources.

The Solution:
Test your schema to see if it includes many-to-many relationships.
Understand which scenario best fits your current schema.
According to your schema logic, apply the respective solution.

Check out these assets to dig deeper:
Design A Schema With One Fact - Star Schema
Many-to-Many Relationships
The Invisible Many-to-Many
Many-to-Many by Design

Connectivity Issues - Problems Communicating

This occurs during the initial setup and can lead to accessibility issues as well as problems gathering data. Connectivity issues can be triggered by anything from incorrect credentials to improperly set up connections and servers, and other similar errors.

The Solution: Check for different connectivity issues and make sure your ElastiCubes are properly configured. Another area to investigate is your database availability, for example scheduled maintenance or database restrictions (too many database connections). Check out DbVisualizer, a database tool that lets you explore your datasets and database as they relate to the ElastiCube.

Not Enough Disk Space - Memory Issues

Disk space problems come in two varieties: a lack of storage space, or the more common issue of insufficient RAM to process the existing ElastiCube. The latter is a common error when your ElastiCube schema includes too many columns, tables, and connections, and it can cause your system to crash. You can notice this during the build process if there are disk space errors.

The Solution: Start by trimming your ElastiCube schema, removing tables, data, and fields that are not essential. Removing unwanted data can produce a quick turnaround. Otherwise, you can add more RAM to your machine to handle the larger load.

Testing ElastiCube Models on Live Servers - Performance Problems

It's difficult to know exactly how an ElastiCube will perform before deploying it, as errors may not be apparent until it interacts with other programs. Deploying the ElastiCube on a live server is therefore an unknown element that can have adverse effects on the larger system. These can include improper functions, broken systems, and in some cases fatal loss of data.
The Solution: Before placing anything on your live server, you should always test it on a development server so that nothing breaks before you know how to resolve it. Testing on a dev server also gives you the chance to catch unexpected errors and make tweaks before uploading an ElastiCube to the live servers.

What To Do If Your ElastiCube Fails

When you notice a failure in your builds, it's not always easy to identify the culprit. If you do notice an issue, try the following:

Explore Sisense's Build Errors list to review some of the most common errors for ElastiCube builds. The chart features known errors and their standard solutions.
Read through the Sisense documentation for strategies and tips on how to troubleshoot build errors.
If you can't find a solution on your own, submit a support ticket to Sisense's staff.
Speak directly with your Customer Success Manager, who will be able to give you guidance on how to resolve your issues.
Schedule Download Of SQL API Queries To CSV

Introduction

With the Python script below and the Windows Scheduler, you can automate the download of CSV files based on SQL API queries against the ElastiCube. This type of process comes in handy when a user needs to download information regularly to be consumed by other processes. The script takes care of the authentication process and keeps the definitions of the cube, query, and response file.

Prerequisites

1. Install Python 3.4 (download here)
2. Install the Python C++ compiler (download here)
3. Install the Python dependencies:
jwt (pip module PyJWT)
parse (pip module parse)
requests (pip module requests)
django (pip module django)

All Python dependencies can be downloaded with pip. For example, type the following on the command line to download PyJWT:

pip install PyJWT

Script

The sql_to_csv.py script is attached. Note the Settings section in the script, where you set the server location, the name of the cube, the query, the user email and password, the REST API token, and the response file.

Windows Scheduler

See here how to schedule a task in Windows. You can read this Stack Overflow post about scheduling the task every hour. The command that you need to set in the Scheduler is the following:

python sql_to_csv.py
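The attachment is not reproduced here, but the core of such a script looks roughly like the sketch below, assuming the v1 password-login endpoint and the SQL-to-CSV endpoint shown in the export post earlier in this collection; every value in the Settings block is a placeholder.

import requests

# --- Settings (placeholders, mirroring the attached script's Settings section) ---
SERVER = "http://localhost:8081"
CUBE = "MyCube"
QUERY = "SELECT * FROM TableName"
EMAIL = "user@example.com"
PASSWORD = "<password>"
OUT_FILE = "response.csv"

# Authenticate; assumes the v1 login endpoint returns an access token.
login = requests.post(SERVER + "/api/v1/authentication/login",
                      data={"username": EMAIL, "password": PASSWORD})
login.raise_for_status()
token = login.json()["access_token"]

# Run the SQL query against the cube and save the CSV response.
resp = requests.get(SERVER + "/api/datasources/LocalHost/" + CUBE + "/sql",
                    params={"query": QUERY, "format": "csv"},
                    headers={"Authorization": "Bearer " + token})
resp.raise_for_status()

with open(OUT_FILE, "wb") as f:
    f.write(resp.content)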
Monitoring Your Builds

Part of having a successful BI tool is having your data refreshed at the time you need it. To ensure your builds are working as expected, you can follow the steps below.

Step 1 - Setting up a build alert

Follow the Sisense Pulse documentation on creating build alerts. Build alerts are created from Sisense and are sent following an ElastiCube build.

Step 2 - Build alert

In case of a build failure, an alert will pop up letting you know which cube failed and when.

Step 3 - But now what? How do I know why it failed?

Go to the Data tab, where you will see the cube that failed, and open the schema of the cube. Click on the option > View Latest Logs. This will show you the stage/table it failed on and the exception message. In our case, the build failed due to a missing file used in the table1 table.

Step 4 - Can't understand the exception message?

Look at the Build Errors Troubleshooting guide to better understand the build mechanism and errors. Still can't figure out what went wrong? Submit a ticket, and don't forget to add the following details:

Name of the cube
Exception message
Did the cube ever build successfully?
Time of failure and your time zone
How to connect to MongoDB with a password-protected user

Introduction

You are attempting to connect to MongoDB with a password-protected user, but you cannot connect to the database through the Simba DSN or the connection string, and you get the following error:

Unable to establish a connection with data source. Missing settings: {[PWD]}

Resolution

Please follow these steps:

1. Create a new entry in the DSN as usual.
2. Open the Registry Editor.
3. Add a string value named "PWD" to the registry in the following path: LOCAL_MACHINE --> Software --> ODBC --> ODBC.INI --> <DSN connection name>
4. In the new PWD value, enter the password of the user.
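If you prefer to script steps 2-4 instead of using the Registry Editor, a minimal sketch with Python's built-in winreg module could look like this; the DSN name and password are placeholders, and the script must run elevated because it writes under HKEY_LOCAL_MACHINE.

import winreg

DSN_NAME = "<DSN connection name>"  # placeholder: your DSN entry name
PASSWORD = "<password>"             # placeholder: the MongoDB user's password

# Open (or create) HKLM\SOFTWARE\ODBC\ODBC.INI\<DSN> and set the PWD string value.
# Note: 32-bit ODBC drivers on 64-bit Windows live under SOFTWARE\WOW6432Node instead.
path = r"SOFTWARE\ODBC\ODBC.INI" + "\\" + DSN_NAME
with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, path, 0,
                        winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "PWD", 0, winreg.REG_SZ, PASSWORD)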
Get number of rows in ElastiCubes

Though there is no direct API endpoint to get the number of rows in an ElastiCube, there is an endpoint to run SQL directly against a cube. Using PySense, it is straightforward to iterate through all your ElastiCubes, run SELECT COUNT(*) against each table in those cubes, and export the results.

How it works

Define the CSV file to output the results to.
Include your credentials (authenticate_by_password is used below for testing purposes; you will probably want to use authenticate_by_file).
The code retrieves all ElastiCubes. For each cube, it iterates through all tables in the data model and runs SELECT COUNT(*) FROM table_name, storing the cube name, table name, and row count.
Finally, it prints the results to the CSV file.

Disclaimer

Use this script as guidance; you are advised to test, validate, and modify this script as needed. Sisense admin credentials are generally required to access the SQL endpoint. Row-level data security rules are not applied. This script was built for a Linux environment.

from PySense import PySense
import json
import csv

# Parameters
report_name = 'C:\\Users\\blahblah\\Downloads\\rowcount.csv'
py_client = PySense.authenticate_by_password('url', 'username', 'password', 'Linux')

report_map = {}
elasticubes = py_client.get_elasticubes()
for cube in elasticubes:
    cube_name = cube.get_title()
    print('Starting ' + cube_name)
    try:
        data_model = cube.get_model()
        for data_set in data_model.get_data_sets():
            for table in data_set.get_tables():
                table_name = table.get_name()
                cube_table_key = cube_name + '_' + table_name
                # Count the rows in this table via the SQL endpoint
                query = 'SELECT COUNT(*) FROM "' + table_name + '"'
                sql_result_json = cube.run_sql(query, "")
                sql_result_dict = json.loads(sql_result_json)
                for key, value in sql_result_dict.items():
                    if key == 'values':
                        # Trim the surrounding brackets from the single count value
                        str_value = str(value)
                        str_value_trim = str_value[2:len(str_value) - 2]
                        report_map[cube_table_key] = [cube_name, table_name, str_value_trim]
        print('Finished ' + cube_name)
    except Exception:
        report_map[cube_name + '_error'] = [cube_name, 'error', '0']
        print('Error ' + cube_name)

print('Start export')
with open(report_name, 'w', newline='') as csv_file:
    dashboard_writer = csv.writer(csv_file, delimiter=',', quotechar='"',
                                  quoting=csv.QUOTE_MINIMAL)
    dashboard_writer.writerow(['Data Model', 'Table', 'Row Count'])
    for report in report_map.values():
        dashboard_writer.writerow(report)