Hi Shanah,
Below you will find the steps and recommended actions to take to improve Elastic Cube build performance:
To optimize Elasticube performance there are two focus areas:
- Decreasing Build Times
- Optimize Queries
In order to optimize the build time, reduce:
Many fields, long strings:
- Don't import long string fields if they will not be used in the model
- Always question the need for columns with long strings before adding them to the model (URLs, very long comments)
Many dates:
- Removing time-based data that is not needed will reduce build time (ie. don't import old data if you don't need to)
- Consider the date range in the requirements of the dashboard and data model
- Import a Dates File instead of a Custom table to create a date dimension
Use the source database when possible:
- Create views to replace custom tables and import the View
- Filter out irrelevant data (history, in-active, etc.)
- Customize the query when adding the data to the Elasticube
Optimize custom tables
Avoid Processing Power and Time Expensive Operations:
- Replace UNION with UNION ALL when possible
- Left and Right joins
- Consider lookups
- Filter data within the table
Avoid Redundant Operations:
- Consider sub queries
- Avoid SELECT *
- Order by
In order to optimize the queries in the ElastiCube, do the following:
Consolidate:
- Look up “translation” tables
- Avoid unnecessary joins
- Consolidate Facts
Calculate custom columns:
- In large data sets, it may be significant
Sum and DUPCOUNT are faster than count
Joins on indexed fields:
- Check for casting in custom tables
- Cast fields in source tables instead of using casting functions
Join on numeric fields:
- Numeric dates
- Join on Date with no Time component
Surrogate Keys: