cancel
Showing results for 
Search instead for 
Did you mean: 

What influences cube size in Sisense?

dougnewton
10 - ETL
10 - ETL

Assuming that you have a properly-designed dimensional cube structure, with no many-to-many problems...  in your experience, what are the biggest factors that influence total cube size?  (And therefore Sisense system performance impact, memory utilization, etc.)

Typically, you'd have:

  • A fact table which is narrow but with many rows.  That is, most of the columns should be of some numeric data type.  Foreign keys, plus the measures.  But depending on the complexity of the metrics needed, it could be just a handful of measures, or it could be dozens.  Typically less than a dozen foreign key columns.  So from a total bytes perspective, still pretty small.  You could have 1M, 10M, or hundreds of millions of rows or more, in your fact table.
  • Multiple dimension tables, which are orders-of-magnitude smaller than the fact table, but which are quite "wide", with many attributes, often varchar.  So 200K customer rows, but each row might have 20+ character attributes.

I'm assuming that due to the nature of the underlying Monet db, that it is handling this sort of thing efficiently.  Going from 1M rows in a table to 2M rows shouldn't result in a 2x size increase in that table, assuming that there is repeating/compressible data.

Maybe to ask it a different way, which of these things would cause a big increase in space utilization?:

  • Adding many fact table rows?
  • Adding many fact table columns for measures?
  • Adding columns to a fact table for foreign keys, that do not result in a change to the grain of the table (more rows), but rather just a new dimension being present in the model.
  • Adding columns to a dimension table
  • Adding rows to a dimension table.

I'd imagine the answer to all this is "it depends", but I'm wondering what peoples' real-world experiences are with Sisense on these things.  Thank you for your thoughts!

 

2 REPLIES 2

HamzaJ
12 - Data Integration
12 - Data Integration

Hello,

From my own experiences,besides having staging/duplicate tables, our primary factor are text-fields. All fields are indexed by default and having "open" text-fields (e.g. no standard options) will consume quiet some space. Especially if the texts are long

Just adding tables with a lot of structured data does not really increase cubesize (unless its ofcourse millions of rows)

DRay
Community Team Member
Community Team Member

Hello @dougnewton 

Thank you for your question. We do have some documentation on Data Model best practices here if would like additional information. 

Please let us know if you have any more questions.

Thank you.

David Raynor (DRay)