The New Notebooks from Sisense

Sisense Employee

02-03-2022

This blog was created by Vidisha Vedvyas, Senior Product Marketing Manager at Sisense, and Pat Bhatt, Director of Product Management at Sisense.

Introduction

One of the most exciting advances in the analytics space is the introduction of Notebooks by Sisense. Notebooks allow analysts to write SQL and generate charts from the query output for rapid, agile exploration. The SQL output can then be passed to Python or R code blocks and processed for any number of procedural applications, from simple business logic to advanced feature engineering, Machine Learning (ML), and Artificial Intelligence (AI)!

This article will explain the basic features of Notebooks, and contextualize how Notebooks operate in the larger Sisense platform.

Notebooks

Notebooks allow you to connect directly to several supported data sources. Popular repositories include Redshift, Snowflake, MySQL, and BigQuery, among several others. Once a connection is established, the repository may be queried with its native SQL dialect. Analysts can open multiple SQL blocks and run multiple queries for any form of data orchestration. Each SQL block can reference previous SQL blocks for sub-querying, joining, or other forms of processing. This flexibility extends to Python and R: multiple Python (and very soon R) blocks may be opened to process the outputs from SQL, and each Python block can reference previous Python blocks and any previous SQL blocks. This allows the analyst to perform complex, advanced ad-hoc analysis and build powerful decision-making applications for business users.
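The chaining described above can be sketched outside of Sisense. The example below is a minimal illustration, not the Notebooks API: an in-memory SQLite database stands in for the warehouse, and the table, column, and function names are made up. The second query builds on the first by using it as a sub-query, and a Python function then processes the rows returned by SQL.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse connection (an assumption for
# illustration; a Notebook would query Redshift, Snowflake, etc. directly).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('east', 120.0), ('east', 80.0), ('west', 50.0), ('west', 25.0);
""")

# "SQL block 1": total revenue per region.
REVENUE_BY_REGION = """
    SELECT region, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
"""

# "SQL block 2": builds on block 1 by using it as a sub-query.
TOP_REGIONS = f"""
    SELECT region, revenue
    FROM ({REVENUE_BY_REGION}) AS revenue_by_region
    WHERE revenue > 100
    ORDER BY revenue DESC
"""


def share_of_total(rows):
    """'Python block': turn (region, revenue) rows into revenue shares."""
    total = sum(revenue for _, revenue in rows)
    return {region: revenue / total for region, revenue in rows}


revenue_rows = conn.execute(REVENUE_BY_REGION).fetchall()
top_rows = conn.execute(TOP_REGIONS).fetchall()
shares = share_of_total(revenue_rows)
```

In a Notebook the platform handles the wiring between blocks; the sketch only shows the dependency pattern, where later SQL reuses earlier SQL and Python consumes the query output.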

Notebooks also level up data visualizations. Analysts can create multiple charts, such as bar or pie charts, from any part of a query and place them side by side within a single Notebook for easy comparison. With this flexibility to visualize data, analysts have the tools to create effective and compelling data stories.

Sisense Notebooks also include the following capabilities, designed to help analysts solve important business problems:

Folders. Notebooks may be organized into separate folders and sub-folders. This allows analysts to keep track of various business problems and implementations in a manner that best suits their preferences and workflows.

Sharing. Notebooks may be shared with or without the underlying SQL and Python code visible to the recipient. Including the code lets analysts share their work with fellow analysts for collaboration or a second opinion. Sharing a Notebook without the code gives analysts a way to share data with business users.

Cloud Integration. Notebooks allow cloud resources to be accessed through Python. For instance, AWS Lambda functions may be called, or documents on S3 directly opened and processed. The ability to use libraries such as boto3 opens the door to significant versatility with Python.

Machine Learning and AI. Python opens the door to implementing, training, and deploying machine learning models. Notebooks support libraries such as scikit-learn, Keras, and Theano, among others. Additional libraries may be installed on a per-session basis using pip install commands, providing the flexibility to go beyond the standard packages for more advanced analytics. Models may be trained and stored in cloud storage such as S3, then loaded and applied to new data when needed.
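As a rough sketch of the cloud integration and model storage described above, the helpers below call a Lambda function and save and reload a model through S3. The client methods used (invoke, put_object, get_object) are the boto3 calls for those services; everything else is an assumption for illustration: the bucket, key, and function names are made up, the "model" is a plain dictionary standing in for a fitted estimator, and the two Fake classes replace the real clients so the example runs without AWS credentials.

```python
import io
import json
import pickle


def invoke_lambda(lambda_client, function_name, payload):
    """Call a Lambda function synchronously and decode its JSON reply."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


def save_model(s3_client, bucket, key, model):
    """Serialize a trained model and store it as an S3 object."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=pickle.dumps(model))


def load_model(s3_client, bucket, key):
    """Fetch a stored model; only unpickle objects from a trusted bucket."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    return pickle.loads(body.read())


class FakeS3:
    """In-memory stand-in for boto3.client("s3"), for a dry run only."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self.objects[(Bucket, Key)])}


class FakeLambda:
    """In-memory stand-in for boto3.client("lambda"); doubles the input."""

    def invoke(self, FunctionName, InvocationType, Payload):
        value = json.loads(Payload)["value"]
        reply = json.dumps({"result": value * 2}).encode("utf-8")
        return {"Payload": io.BytesIO(reply)}


# Dry run with the stand-ins. With real AWS access, pass boto3.client("s3")
# and boto3.client("lambda") instead of the fakes.
s3 = FakeS3()
model = {"slope": 2.0, "intercept": 1.0}  # placeholder for a fitted estimator
save_model(s3, "example-bucket", "models/demo.pkl", model)
restored = load_model(s3, "example-bucket", "models/demo.pkl")
reply = invoke_lambda(FakeLambda(), "example-scoring-fn", {"value": 21})
```

Passing the client in as an argument keeps the helpers independent of how credentials are configured in a given environment, and makes them easy to exercise without touching real cloud resources.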

Understanding Sisense Models

While Notebooks is a code-first and code-only solution, it does not require the analyst to prepare dimensional models first. In fact, Notebooks allow the analyst to connect directly to data sources and use the query results for further processing. Notebooks operate independently of the dimensional modeling workflows. In a future release, charts generated in Notebooks will have the ability to be published to dashboards. Stay tuned!

However, there are significant advantages to creating dimensional models and using those models for dashboarding. Sisense currently supports two types of models: the live model and the ElastiCube model.

Live models are “hot” materialized views. Essentially, a live model is a view onto a desired data source and a set of data tables, either in a transactional repository such as SQL Server or in a cloud data warehouse such as Redshift. Dashboards and charts may be designed using the live model. However, whenever data is retrieved, it is retrieved straight from the underlying repository, and the data is not stored anywhere within Sisense. The biggest advantage of live models is that any SQL code likely to be used repeatedly may be materialized, with charts developed on those views. For small datasets or cloud data warehouses, performance can be high. On the flip side, live models suffer performance degradation if the data source is slow, as with a traditional data warehouse, or if the volume of data is very large.

The second model, the ElastiCube, overcomes the performance issues of live models. Unlike the live model, the ElastiCube allows arbitrary data sources to be materialized within a proprietary Sisense cache for in-memory query processing. All subsequent dashboards and charts query this internal data cache instead of the origin database, which increases performance.

Another advantage of the ElastiCube is that it features periodic refreshes of the materialized data. If, for example, an analyst starts work every morning at 9:00 am, the data may be refreshed by 7:00 am so that it is as recent as possible. The analyst also has the option to schedule the ElastiCube to refresh as often as the business requires. Please note that in cases where the data must be real-time, Sisense’s live model would be the preferable option.
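The trade-off between the two model types can be illustrated with a small conceptual sketch. This is an analogy only and does not reflect Sisense internals: the LiveModel and CachedModel classes are hypothetical, and in-memory SQLite databases stand in for both the origin database and the cache. The live variant queries the source on every request, while the cached variant answers from a local snapshot that changes only when it is refreshed.

```python
import sqlite3


def make_source():
    """Create a toy origin database with a single sales table."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (amount REAL)")
    source.executemany("INSERT INTO sales VALUES (?)", [(10.0,), (20.0,)])
    return source


class LiveModel:
    """Every query goes to the source; nothing is stored locally."""

    def __init__(self, source):
        self.source = source

    def total_sales(self):
        return self.source.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


class CachedModel:
    """Queries hit a local snapshot that is replaced on each refresh."""

    def __init__(self, source):
        self.source = source
        self.cache = sqlite3.connect(":memory:")
        self.cache.execute("CREATE TABLE sales (amount REAL)")
        self.refresh()

    def refresh(self):
        # Copy the current source rows into the local cache.
        rows = self.source.execute("SELECT amount FROM sales").fetchall()
        with self.cache:
            self.cache.execute("DELETE FROM sales")
            self.cache.executemany("INSERT INTO sales VALUES (?)", rows)

    def total_sales(self):
        return self.cache.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


source = make_source()
live, cached = LiveModel(source), CachedModel(source)

source.execute("INSERT INTO sales VALUES (5.0)")  # new data lands in the source
before_refresh = (live.total_sales(), cached.total_sales())
cached.refresh()  # analogous to a scheduled refresh
after_refresh = (live.total_sales(), cached.total_sales())
```

Before the refresh, the live total already includes the new row while the cached total does not; after the refresh the two agree. That is the freshness-versus-performance trade-off in miniature.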

Build to Destination

A game-changing capability being launched by Sisense is Build-to-Destination (B2D). With B2D, users can import data from multiple sources, mesh it together in a single model, and materialize these complex data relationships in their cloud data warehouse of choice, such as Redshift, Snowflake, or BigQuery, in addition to the ElastiCube. Why is this important? In cases where data volumes are large, high performance is required, and the analyst does not want to spend resources on the ElastiCube, B2D can operate on the desired cloud data warehouse.

Conclusion

Notebooks are an exciting new feature that will transform the way you approach analytics in Sisense! Notebooks introduce code-first analytics that work alongside model-based business intelligence, enhancing your ability to scale the delivery of analytics and to collaborate with other analysts and business users. The code-first workflow can serve as a standalone development environment, without requiring the analyst to create dimensional models. By extending your SQL analysis with Python and R, Notebooks also open the door to advanced analytical applications, including Machine Learning and Artificial Intelligence.

Updated 03-17-2022

Version 6.0
