Knowledge Base Article

Understanding custom code tables: why “infer from notebook” fails and when to avoid it [Linux]

Custom Code Tables allow you to generate ElastiCube data using Python and Jupyter notebooks. However, confusion often arises around how data is actually ingested and why the Infer from Notebook option can fail in real-world scenarios. This article explains how the feature works internally, why inference can be fragile, and when a file-based approach is recommended.

Applies to: Sisense for Linux (Cloud and On-Prem)

Step-by-step guide

How custom code tables work (high-level)

When an ElastiCube build runs a Custom Code Table:

  1. Sisense executes the notebook.
  2. The notebook must return a pandas DataFrame at the end of execution.
  3. Sisense temporarily serializes that DataFrame to an internal CSV file.
  4. The CSV is ingested into the ElastiCube.
  5. The temporary file is deleted after the build completes.

Important:
Regardless of configuration, only the returned DataFrame is ingested. Displaying a DataFrame in the notebook UI does not load data into the cube.
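As a minimal illustration, the notebook's final cell might look like the sketch below. The column names here are hypothetical and the exact return convention can vary by version; in this sketch the DataFrame is simply the last expression of the notebook.

# Python
import pandas as pd

# Build the result set; in practice this would come from your query or API logic
df = pd.DataFrame({
    "order_id": [1001, 1002],
    "amount": [250.0, 99.5],
})

# The last expression must evaluate to the DataFrame; this returned value
# is what Sisense serializes to CSV and ingests during the build
df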

Option 1: “Infer from Notebook” (schema inference)

What it does

  • Runs the notebook.
  • Reads the returned DataFrame.
  • Automatically infers column names and data types.
  • Saves the inferred schema to the table definition.

This is intended to simplify setup by avoiding manual schema definition.

Why it can fail

Schema inference is fragile and commonly fails with:

  • Mixed data types in a column
  • Null-heavy columns
  • Date / datetime fields
  • Large datasets
  • Nested or irregular JSON structures

Failures typically surface as build-time errors, such as:

  • A NullPointerException (NPE) during column type guessing
  • CSV connector errors during schema guessing

Although these appear as ingestion errors, they actually occur before ingestion, during the schema inference step.
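If you need to keep using inference, explicitly normalizing column types before returning the DataFrame removes most of the ambiguity that trips up the type guesser. A minimal sketch, with hypothetical column names:

# Python
import pandas as pd

# Hypothetical raw frame with inference-hostile columns
df = pd.DataFrame({
    "id": ["1", 2, "3", 4],                               # mixed str/int values
    "created": ["2024-01-01", None, "2024-03-15", None],  # null-heavy dates
})

# Cast types explicitly so the type guesser has nothing to guess
df["id"] = pd.to_numeric(df["id"], errors="coerce").astype("Int64")
df["created"] = pd.to_datetime(df["created"], errors="coerce")
df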

When to use

  • Prototyping
  • Small, clean, well-typed datasets
  • Early development only

Option 2: File-based output (recommended for production)

What it does

  • The notebook explicitly writes a file (CSV, Parquet, etc.) to:
    /opt/sisense/storage/notebooks/output/
  • The Custom Code Table is created to read from that file.
  • The schema is fixed and deterministic.

Why this works reliably

  • Removes automatic schema guessing
  • Ensures consistent data types
  • Matches how most production-grade pipelines operate

Example (Python code)

# Python
# Write the final DataFrame to the shared notebooks output directory
output_path = "/opt/sisense/storage/notebooks/output/my_data.csv"
df.to_csv(output_path, index=False)

Note:
Even with this approach, Sisense still ingests the data via an internal CSV; the difference is that the schema is no longer inferred from a live DataFrame.
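If the notebook environment has a Parquet engine installed (pyarrow or fastparquet), writing Parquet instead of CSV also preserves column dtypes in the file itself. A sketch mirroring the CSV example above:

# Python
output_path = "/opt/sisense/storage/notebooks/output/my_data.parquet"
df.to_parquet(output_path, index=False)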

When to use

  • Production deployments
  • APIs and external data sources
  • Large or complex datasets
  • Any scenario where inference errors occur

Common misunderstanding: notebook “test” or preview cells

Developers often add cells that display a DataFrame for validation during development. While useful for debugging, these cells:

  • Affect only the Jupyter UI
  • Do not impact ElastiCube ingestion
  • Are ignored during the build process

During builds:

  • Sisense may ignore designated dev/test cells
  • Sisense injects its own cells for parameters and final serialization

Only the final returned DataFrame (or its serialized output) is ingested.
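A short contrast, using a hypothetical df, of a preview cell versus the final cell:

# Python
# Preview cell: renders a table in the Jupyter UI only; has no effect on the build
df.head()

# Final cell: the returned DataFrame is what gets serialized and ingested
df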

Conclusion

Custom Code Tables rely on a single ingestion pipeline: a returned pandas DataFrame that is serialized and loaded during the build. The Infer from Notebook option affects schema discovery, not ingestion, and can fail with real-world data. Writing a file and using it as the table source removes schema inference and provides the most stable, production-ready configuration.

Published 12-30-2025