Add SQL Charts from Notebooks to a Dashboard
Sisense is happy to announce added functionality that allows users to add SQL charts to a Dashboard directly from Notebooks. This update significantly reduces the steps required to share insights and eliminates task switching between Notebooks, Dashboards, and Models while maintaining data security and data continuity.
Exploring RAG: A Sisense-Based Approach with BigQuery and Gemini

Continuing from our previous article, Automated Machine Learning with Sisense Fusion: A Practical Guide, where we discussed how Sisense and AutoML simplify complex machine learning workflows, I am now excited to introduce an advanced feature: a chatbot interface powered by Retrieval-Augmented Generation (RAG) and a large language model (LLM) such as Gemini. This enhancement allows users to interact with their data in natural language through Sisense's native Blox widget. Imagine asking questions like "How many products did I sell per category?" and receiving insightful answers instantly. This not only improves data accessibility but also bridges the gap between technical data repositories and business decision-making.

In this article, I'll cover the first step of enabling this functionality: setting up a seamless integration pipeline between Sisense and Google BigQuery, embedding table and column metadata, and preparing the data for efficient querying.

Important Notes:
- Experimental Project: This implementation is an experimental project, not an official Sisense feature. It demonstrates how Sisense's functionality can be used to build a custom Retrieval-Augmented Generation (RAG) pipeline; you can reproduce the example by importing the provided notebooks.
- Google Ecosystem: The example provided is specific to the Google ecosystem, utilizing services like BigQuery (BQ) and Gemini for query generation and processing.
- Associated Costs: Each prompt sent to the chatbot incurs a cost, which is billed to your Google Cloud Platform (GCP) account. This includes charges for LLM processing and database queries.

Building the Foundation: Data Preparation and Embeddings

To enable natural language interactions with your data, we first need to establish a robust data infrastructure.
This includes creating a vector store for embeddings in Google BigQuery (BQ) and preparing metadata about tables and columns. The process is fully automated using a pre-built Sisense custom code notebook, which simplifies embedding generation and management.

Dataset in BigQuery

The journey begins with your dataset in Google BigQuery. The notebook retrieves metadata such as table and column names, descriptions, and schema information from BigQuery's INFORMATION_SCHEMA. If any descriptions are missing, the system automatically generates them, ensuring comprehensive metadata coverage.

Setting Up the Vector Store in BigQuery

The vector store acts as the backbone for similarity searches within the RAG system. Using the custom notebook, the system:
- Creates the vector store: a structured repository to store embeddings for tables and columns.
- Organizes metadata: ensures all table and column descriptions are structured and accessible.

Generating Table and Column Descriptions

Missing descriptions can hinder data understanding. The notebook includes a Description Agent that automatically generates meaningful text for:
- Tables: contextual descriptions based on schema and usage.
- Columns: descriptions highlighting their role within the dataset.

These enhancements ensure the metadata is both informative and ready for embedding generation.

Creating Embeddings with Vertex AI

To enable semantic search, metadata descriptions are converted into numerical embeddings using Vertex AI's TextEmbeddingModel. This is facilitated by the EmbedderAgent, which:
- Accepts strings or lists of strings (e.g., column descriptions).
- Generates embeddings through Vertex AI.
- Handles both single and batch processing for efficiency.

Efficient Embedding with Chunked Processing

For large datasets, embeddings are generated in chunks using the get_embedding_chunked function. This ensures:
- Scalability: handles datasets of all sizes without performance issues.
- Parallel Processing: processes text chunks simultaneously to speed up the workflow.
- Structured Outputs: embeddings are returned in a structured DataFrame for storage or analysis.

Storing Embeddings in the Vector Store

The final step is storing these embeddings in the BigQuery vector store. This ensures that:
- Similarity searches are fast and efficient.
- Metadata is always accessible for chatbot interactions.

[Image: A data table with columns such as "table_schema," "column_name," "data_type," and "source_type," showing sample values and metadata describing the database structure.]

How the Chatbot Interface Works

Now that the foundation for embeddings and metadata storage is set, let's explore the chatbot interface in action. Imagine opening the chatbot in the Blox widget and asking a question about your dataset. Within moments, the chatbot responds in natural language, providing actionable insights. But what exactly happens under the hood to generate this seamless interaction?

RAG Notebook and the Chatbot Workflow

The chatbot operates using a pre-built RAG custom code transformation notebook, which orchestrates the end-to-end process. With this notebook, the entire pipeline, from understanding the query to generating the response, is automated. The notebook uses multiple specialized agents, each responsible for a specific task, ensuring precision and efficiency at every step.

SQL Query Builder Agent (BuildSQLAgent)

This agent specializes in constructing SQL queries for BigQuery. It uses the LLM to analyze the user's natural language query and matches it with table schemas and column details from the vector store. It outputs a fully formed SQL query tailored to the user's dataset and question.

SQL Validation Agent (ValidateSQLAgent)

The ValidateSQLAgent validates the SQL query before execution using a large language model (LLM).
Validation ensures the query adheres to essential rules, including:
- The presence of all referenced columns and tables.
- Proper table relationships and join conditions based on the schema.
- Formatting and compliance with BigQuery-specific SQL standards.

Validation occurs during the debugging process, specifically within the DebugSQLAgent, to identify potential errors before attempting a dry run or execution. It provides a detailed JSON response:
- If valid, the process moves to the next step (dry run or execution).
- If invalid, the DebugSQLAgent uses the error details to refine the query iteratively.

SQL Debugging Loop Agent (DebugSQLAgent)

This agent runs the debugging loop to refine queries that fail validation or execution. The process includes:
1. Validation: the query is passed to ValidateSQLAgent to check syntax, schema compliance, and structure. If valid, the query is ready for execution.
2. Dry Run: if validation passes, the query is tested using a dry run via the test_sql_plan_execution function to confirm execution readiness.
3. Execution: once validation and dry runs succeed, the final query is executed using the retrieve_df function, which returns the results as a DataFrame.
4. Iterative Refinement: if the query fails either validation or the dry run, the DebugSQLAgent uses the LLM to troubleshoot and generate an alternative query. The loop repeats until a valid query is generated or the maximum number of debugging rounds is reached.

This agent ensures the final query is:
- Correctly structured and semantically valid.
- Optimized for performance and aligned with the original user intent.

Response Agent (ResponseAgent)

This agent translates the SQL query results into natural language. It bridges the gap between technical SQL outputs and user-friendly communication. By combining the query results with the user's original question, it crafts a clear and relevant response.
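The interplay between these agents can be sketched as a simple control loop. The sketch below is illustrative only: debug_sql_loop and the callables it receives are our own stand-ins for the LLM-backed BuildSQLAgent, ValidateSQLAgent, and the BigQuery dry-run call, which are not shown here.

```python
# Illustrative sketch of the build -> validate -> dry run -> execute loop.
# The callables passed in stand in for the LLM-backed agents described
# above; their names and signatures are assumptions, not Sisense APIs.
from typing import Callable, Optional

def debug_sql_loop(
    question: str,
    build_sql: Callable[[str, Optional[str]], str],   # candidate SQL (may use the last error)
    validate_sql: Callable[[str], dict],              # returns {"valid": bool, "errors": str}
    dry_run: Callable[[str], bool],                   # BigQuery dry run: True if executable
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a validated, dry-run-tested SQL string, or None if all rounds fail."""
    last_error = None
    for _ in range(max_rounds):
        candidate = build_sql(question, last_error)
        report = validate_sql(candidate)
        if not report["valid"]:
            last_error = report["errors"]     # feed errors back into the next build
            continue
        if dry_run(candidate):
            return candidate                  # ready for execution (retrieve_df)
        last_error = "dry run failed"
    return None
```

In the real notebook each callable wraps an LLM or BigQuery call; the loop structure (validate, dry run, refine with the error message, bail out after a maximum number of rounds) is the part this sketch captures.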
How the Workflow Executes

Here's the step-by-step process for generating a response:
1. User Query Embedding: the EmbedderAgent converts the user's natural language question into a numerical embedding. Using BigQuery native vector search, the system retrieves similar embeddings from the vector store created in the first phase.
2. Schema and Content Retrieval: based on the retrieved embeddings, the system fetches the relevant table and column schema details from the vector store.
3. SQL Query Generation: the BuildSQLAgent uses the retrieved schema details to construct an SQL query that aligns with the user's question.
4. SQL Validation and Execution: the ValidateSQLAgent checks the generated SQL for accuracy and potential errors. If the SQL passes validation, it is executed against the BigQuery database.
5. Debugging (if needed): if the query fails or generates an error, the DebugSQLAgent refines it iteratively until a valid query is produced.
6. Response Generation: the ResponseAgent uses the query results and the user's original prompt to generate a natural language response. If the system fails to generate a valid response, it communicates the issue to the user.

Conclusion

By combining the foundational embedding process with this RAG-powered workflow, the chatbot transforms how users interact with their data. From seamless SQL query generation to natural language responses, the system exemplifies the power of Sisense Fusion and advanced AI tools to simplify data-driven decision-making. As always, please reach out to your Customer Success Manager (CSM) if you would like to implement this in your own environment.

Builds fail because libraries used in Custom Code tables are not installed
Use case

You have specific libraries in Notebooks that must be installed for Custom Code tables to run. However, these libraries can be uninstalled automatically (during an upgrade or other system changes).

Solution

Add the following Python code to the first cell of each Custom Table notebook to check whether the necessary libraries exist and install any that are missing before executing the rest of the code. Adjust the 'libraries' list to the libraries you need to check:

```python
import importlib.util

# List the libraries you want to check
libraries = ['numpy', 'pandas', 'matplotlib', 'openpyxl', 'scipy', 'pyarrow']

# Record each library's status, installing any that are missing
results = []
for library in libraries:
    if importlib.util.find_spec(library) is not None:
        results.append({"library": library, "status": "Previously Installed"})
    else:
        results.append({"library": library, "status": "Newly Installed"})
        print(f"{library} NOT FOUND. Installing now...")
        !pip install {library}

print(results)
```

Note that the `!pip install` line uses notebook shell syntax, so this cell must run in a notebook environment.

Related Content:
- Sisense Docs: https://docs.sisense.com/main/SisenseLinux/transforming-data-with-custom-code.htm
- Sisense Academy: https://academy.sisense.com/notebooks-course

Allow modifying columns width in notebook results
There is no option to modify table column widths in notebook results. This means that if I query a dataset of accessed URLs, where the end of the line matters most, I cannot see anything beyond the domain name. The available workarounds are hovering row by row or exporting all results to CSV; both are cumbersome. Modifying column widths seems to be common notebook functionality in similar solutions, so it would be nice if Sisense could support this as well.

Dashboard Filters Applied to Notebooks and Non-Notebooks
Currently, Notebooks provide a powerful opportunity to access and visualize data using SQL and to add the results to Fusion dashboards. Fusion's primary analytics environment allows for powerful dashboard-wide filters. However, dashboard filters do not apply to charts or tables created through Notebooks, even if the notebook points to the exact same data table and column as other dashboard widgets. Please develop the ability for dashboard filters to filter both Fusion dashboard widgets and notebook-based widgets.

Leveraging Sisense Notebooks with Compose SDK
Discover how to enhance your analytics integration with Sisense Notebooks and the Compose SDK. This article explores a code-first approach to creating custom datasets and visualizations using SQL, empowering data teams to leverage their existing skills. Learn how to seamlessly embed tailored insights into customer-facing applications, streamlining workflows for both analysts and developers. Unlock the potential of your data with Sisense, delivering impactful, personalized analytics that meets your organization's unique needs.
Resolving SystemError and Memory Allocation Issues in Notebooks

Summary

This article addresses common issues encountered when running notebooks, specifically SystemError and memory allocation errors. It provides step-by-step instructions to resolve these issues and ensure smooth notebook functionality.

Step-by-Step Instructions to Resolve SystemError

1. Identify the Failing Line: review the notebook to locate the line in the configuration cell that is causing the SystemError. If the line has been commented out, uncomment it to identify the exact issue.
2. Check for Syntax or Logic Errors: ensure that there are no syntax errors in the notebook code, and verify that the logic aligns with the intended operations.
3. Run the Notebook Again: execute the notebook to see if the SystemError persists. If a different error appears, proceed to the next section.

Troubleshooting Memory Allocation Issues

1. Identify the Memory Allocation Error: if the error message indicates that the output exceeds the resources allocated to the Notebooks Compute Instance, note the memory allocation details.
2. Increase the Memory Allocation: navigate to Admin > System Configuration / System Management > Configuration, open the 5-click config menu > Warehouse-Service > notebooks.compute.defaultMemoryInMibs, and increase the memory allocation from the default value (e.g., 256 MiB) to a higher value (e.g., 1024 MiB).
3. Verify the Changes: after increasing the memory allocation, run the notebook again to ensure the error is resolved.

Additional Tips

- Testing with Real Data: when testing notebooks with real data, ensure that the sample size is manageable within the allocated resources. For larger datasets, consider increasing the memory allocation proportionally.
- Feedback and Support: if the issue persists, contact support with detailed information about the error and the context of its occurrence.
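One practical way to keep a real-data sample within the instance's memory allocation is to measure a DataFrame's footprint before processing it and downsample when it is too large. The helper below is an illustrative sketch; the budget value is something you choose based on your notebooks.compute.defaultMemoryInMibs setting, not a value Sisense exposes.

```python
import pandas as pd

def sample_within_budget(df: pd.DataFrame, budget_mib: float) -> pd.DataFrame:
    """Downsample df so its in-memory footprint stays under budget_mib MiB."""
    size_mib = df.memory_usage(deep=True).sum() / (1024 ** 2)
    if size_mib <= budget_mib:
        return df
    # Keep a proportional fraction of rows to fit within the budget
    frac = budget_mib / size_mib
    return df.sample(frac=frac, random_state=0)
```

Running this at the top of a notebook (with a budget comfortably below the configured allocation, to leave room for intermediate results) makes memory failures predictable instead of intermittent.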
Providing feedback on your experience can help improve the support process.

Check out this related content: Academy Documentation

Utilizing REST API with Notebooks to export tables
I am trying to grab the tables I created in Notebooks, or export the graphs I created in Notebooks, to a Dashboard I created. From the documentation, there is supposed to be a button that allows me to pick the dashboard where I want the charts to be exported, but it does not show up for me. Environment: multi-tenant, tenant admin, Linux, 2024 version. Since this option isn't available, I am trying to use the REST APIs to extract the tables and then upload them to my data model. The API requests the Tenant ID, but there isn't much information on how to obtain this ID. I am looking for a solution for how to either extract the tables and upload them to my data model, or take the charts created and upload them to my dashboard, so I can have more versatility with them, as Notebooks charts are limited.
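While the export button is unavailable, one generic pattern is to pull a result set over HTTP and stage it as CSV for import into a data model. The sketch below is not a documented Sisense workflow: the endpoint path, the bearer-token header, and any tenant ID parameter are placeholders you would need to replace with the routes shown in your own instance's interactive API reference.

```python
import csv
import io
import json
import urllib.request

def fetch_json(base_url: str, path: str, api_token: str) -> object:
    """GET a JSON payload from a REST endpoint using a bearer token.
    The path (and whether a tenant ID header or query parameter is needed)
    depends on your Sisense version; treat this function as a template."""
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def rows_to_csv(rows: list) -> str:
    """Flatten a list of uniform JSON row objects into CSV text that can be
    staged as a file-based table for a data model."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Usage would look like `rows_to_csv(fetch_json("https://your-instance", "/api/<your-endpoint>", token))`, where `<your-endpoint>` is whatever route your version exposes for the data you need.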