If you've browsed our site, spoken to our team, or read about us in analyst reviews, you might have noticed that we dig technology here at Sisense. That's why when clients and prospects want to push the limits, both in data complexity and data quantity, we happily oblige.
After asking what we would recommend as the most data to host on a single Sisense server, one newly signed client (a prospect at the time) handed us one billion transactional records and three million dimensional records to host on a single Sisense node – that's 500GB of data to test with 100 concurrent users logging in and banging around on the server. We used a cloud machine with 32 CPU cores and 244GB of RAM for the job, in line with our straightforward specs. We'll cut to the chase and share the details from Load Impact below.
AWS instance: r4.8xlarge (32 CPU cores, 244GB RAM)
100 Concurrent Users
38 Max Concurrent Queries
Sisense defines concurrency as two or more queries initiated within the same millisecond
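To make that definition concrete, here's an illustrative sketch (not Sisense's internal implementation) of how max concurrency could be measured from query-start timestamps bucketed by millisecond:

```python
from collections import Counter

def max_concurrency(timestamps_ms):
    """Max number of queries initiated within the same millisecond.

    timestamps_ms: query-start times in whole milliseconds (hypothetical data).
    """
    counts = Counter(timestamps_ms)
    return max(counts.values()) if counts else 0

# Made-up sample: three queries land in millisecond 1002.
starts = [1000, 1001, 1002, 1002, 1002, 1005]
print(max_concurrency(starts))  # -> 3
```

Under this definition, "38 max concurrent queries" means that at the busiest millisecond of the test, 38 queries arrived at once.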
Two types of usage scenarios:
50% of users returned results from the entire billion record dataset
50% of the users viewed a subset of data, simulating use by clients who see only their own data
Query response time averaged 0.1 seconds and maxed at 3.1 seconds. This represents the time for Sisense to receive a query from the web application and return a result set to the client application.
The Sisense ElastiCube's RAM consumption remained stable at approximately 100GB despite the 500GB+ of data loaded onto the ElastiCube Server's disk.
Average CPU usage during the load test was approximately 10-20%, spread across all 32 CPU cores.
We used a tool called logz.io to aggregate the server logs from the load test into KPIs, which let us measure the impact on the server and project the impact in production.
Here’s what those query performance results looked like across the test. To summarize, no query took longer than 3.1 seconds to return results to the web front end.
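The kind of KPI roll-up we pulled out of the logs can be sketched like this (the log values and function below are hypothetical, not logz.io's actual API):

```python
def summarize(response_times):
    """Aggregate per-query response times (seconds) into simple KPIs.

    Illustrative sketch only; the input list here is made-up sample data.
    """
    avg = sum(response_times) / len(response_times)
    return {"avg_s": round(avg, 2), "max_s": max(response_times)}

# Hypothetical response times for a handful of queries.
times = [0.05, 0.08, 0.1, 0.12, 3.1, 0.07]
print(summarize(times))
```

In the real test, this style of aggregation over the full hour is what produced the 0.1-second average and 3.1-second maximum reported above.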
When it comes to server usage, we passed the test with flying colors as well. Our amazing In-Chip technology was on full display - we hosted 500GB of data without ever using more than 128GB of RAM. CPU utilization during query bursts never rose above 75% throughout the load test, and it averaged less than 20%.
We used a tool called Load Impact to create artificial users that log in and interact with dashboards to mimic production usage. That includes the following types of actions in Sisense:
Loading a dashboard with nine widgets
Changing filters from one account to another and from one year to two years
Filtering by clicking on context from one chart to control the others
Drilling down from country-level to region-level data
Downloading a .csv of the information in a Sisense widget
Switching dashboards and repeating all the steps above
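The steps above can be sketched as a virtual-user loop (illustrative Python, not an actual Load Impact script; the action names and timings are assumptions):

```python
import time

# Action names mirror the steps listed above; they are labels for this
# sketch, not real Sisense API endpoints.
ACTIONS = [
    "load_dashboard_9_widgets",
    "change_account_filter",
    "change_year_filter",
    "click_chart_to_filter_others",
    "drill_country_to_region",
    "download_widget_csv",
]

def run_session(dashboards=2, think_time_s=0.0):
    """One virtual user's session: perform every action on each dashboard."""
    log = []
    for d in range(dashboards):
        for action in ACTIONS:
            log.append((d, action))
            time.sleep(think_time_s)  # simulated user think time
    return log

print(len(run_session()))  # 2 dashboards x 6 actions -> 12
```

A real load test would run many of these sessions in parallel, with randomized think times, to approximate 100 concurrent users.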
The two user types (the two scenarios described above) performed the same steps. One group, however, had a WHERE clause appended to all of its queries, limiting its view to one of the seven customer accounts. This simulates the external OEM use case, where you deploy dashboards to clients who should see only their own data.
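Appending such a row-level filter can be sketched like this (a simplified illustration; the query shape and column name are assumptions, and the naive string append assumes the base query has no WHERE clause already):

```python
def scope_to_account(query, account_id):
    """Append a WHERE clause restricting results to one customer account.

    Naive sketch for illustration: assumes `query` has no WHERE clause
    and that account_id is an integer.
    """
    return f"{query} WHERE account_id = {int(account_id)}"

base = "SELECT SUM(revenue) FROM transactions"
print(scope_to_account(base, 3))
# -> SELECT SUM(revenue) FROM transactions WHERE account_id = 3
```

The effect in the test was that half the virtual users scanned the full billion-row dataset while the other half touched only their own account's slice.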
Here is a visualization describing the usage pattern over the timeframe. Across the two hours on the x-axis, the number of virtual users (VUs) is displayed on the y-axis. As you can see, the number of users ramped for 50 minutes, remained steady for 10 minutes,and then did the same thing during the second hour.
The concurrent number of queries over the two-hour test increased throughout the period of testing, as shown below. In Sisense, concurrency represents two or more users initiating a query within the same millisecond.
The data represented one billion purchases on a website, each with its own unique transaction ID. The purchases were split into three categories - planes, trains, and automobiles. The analysts also wanted to kick the tires on Sisense’s ability to join large tables on demand: on user request, a three-million-record dimension table would join with the one-billion-record fact table to return revenues from the fact table grouped by the origin/destination combinations stored in the dimension table.
The ElastiCube looked like this:
At the end of the day, the client wanted dashboards that tracked revenues, bookings, and average revenues per booking across time, across client types and fee types.
Here's one of the dashboards used during the testing: