Feedback Form

Si Blog

May 2008 - Posts

  • Customers who bought this book, also bought... Hmm... Basket Analysis.

    One day I was browsing around Amazon looking for a book about some boring technical stuff (diving into the company pocket is always a good way to start a lazy morning). 


    When I finally found the book I was looking for, I immediately got a list of products who were bought by other customers who also bought my book (try saying that three times fast).  Amazon have had this feature for as long as I can remember (it's called Collaborative Filtering), but being a bit of a data freak - that feature was always extra cool to me.

    Now coming back from fantasy land - from a Business Intelligence perspective, this reminded me of Basket Analysis.  In simple terms, Basket Analysis is the idea of finding products that will yield more profit if they are sold together - like shampoo and conditioner.  Should they be sold together for maximum profit?  Or should they be sold separately?

    I was wondering if it's possible to perform similar analysis with Prism and how long it would take me.  Soon enough, I had it up and running.  It looks something like the following image.  You can download the sample PSM file here: Basket Analysis.psm


    This PSM document lets you select a product, and in turn see what products users (who bought this product) also bought.

    -- The data --

    This is the example table we're going to analyze.  It's pretty straightforward and describes daily product sales per customer.  


    For my example, I used an Oracle table but you can download the data in Excel format and it will look and feel just the same: Basket Analysis.xls.  If you need help connecting to it, you can watch the Connecting To Data video that explains it.


    -- The Prism Data Source --

    After connecting to  the table, it looks like this:

     

    -- Step 1: Add Widgets and Data --

    This one's pretty easy. Just select the Product ID dimension and drag-n-drop it onto the canvas.  Repeat this action for the Customer ID dimension. Dropping a dimension (or filter) onto the canvas automatically creates a Drop-down list Widget. 

    To create a pivot, you select the Pivot Widget Widgets Pane and drop it onto the canvas, or by right-clicking the canvas and navigating to the New menu item and selecting "Pivot".

    To populate the pivot, select the Product ID dimension and drop it onto the Pivot's Rows axis in the Data Layout Editor.  Finally, select the Measures node and drop it onto the Columns axis - and click update.  If you wish, you can watch the Working With Pivots video to see how this is done.

    -- Step 2: Add Background to Widgets --

    Now here's the whole trick.  Prism's Background (or Scope) feature basically lets you connect one Widget to another to indicate that the results of the target Widget should be limited to the selection in the input Widget.  To activate the Background feature, just hover with the mouse of the Customer ID Drop-down Widget and click on the Background button (see image below).

     

    Selecting Product ID and clicking Apply will cause the Customer ID Drop-down to contain only those Customer IDs that intersect with the selected members in the Product ID Drop-down.

    Similarly, click on the pivot's Background button and select the Customer ID Drop-down as its background.

    The rest is just text and some formatting!

    Now you can select a product in the Product ID Drop-down and see a table with the products bought by others that also bought the selected product! 

    Did I mention that as you add rows to this Excel file, the Prism document will be automatically updated to reflect this new data?

    Have a great day,

    Elad
    SiSense

  • Handling currency conversions

    Ok, this one isn't from me, it's from one of our users who wanted to show us how he handled currency conversions with Prism.  Thank you!

    I've simplified his scenario, just to get the main point across.  Note that this user was working with Excel files, but this example is just as valid for any data source with similar data.  Remember, as far as Prism is concerned, all data sources were created equal!

    Imagine an Excel file with the following data in it:

     

    What the user was trying to do is create a simple report that shows total sales for each product.  But the report had to be in Euros, while the data was received in US Dollars.  He was getting these figures on a daily basis, so he was looking for a way to create it in such a way that he only had to change the rate of conversion between Euros and US Dollars to update his report.  He sure wasn't about to create the same report again and again every day.

    Calculating Sales in US Dollars

    You probably noticed that the Excel sheet does not contain an explicit field for sales.  So the first thing he needed to do was create a Sales In US Dollars measure that is defined by the price of each individual product multiplied by its quantity.  This is easily done by creating a Calculated Measure (more about creating these can be found here):

    Defining a Conversion Rate Measure

    In order to convert the US Dollars value to Euros, he defined a conversion rate measure.  To do this, he created a Calculated Measure whose formula is the conversion rate (at the time of writing, this value was 0.64):


    Calculating Sales in Euros

    Finally, to create a Sales in Euros measure, he simply had to create a new Calculated Measure whose formula is as follows:

     

    ... and viola!  A Sales in Euros measure to use in his reports!  Changing the value of the Conversion Rate formula to the required conversation rate will make sure his report always shows accurate results, without having to repeat any of his work.

    I'd like to thank this user again for this nice and useful example.

    Elad
    SiSense 

  • Showing the most recent data in a Dashboard

    Prism dashboards often contain a Date Picker or a Drop-down List that affects the entire dashboard.  This is very useful when you wish to go back and forward in time, and see the figures for the selected time period(s).

    Usually, when I refresh the entire dashboard, I wish to see the most recent data.  Meaning, having the dashboard present the information in terms of the most recent date available in the database. Instead of going back to the Drop-down List, scrolling down and selecting the last member - you can have Prism do this for you automatically.

    Right clicking on the Drop-down list and navigating to the Preferences menu gives you quite a few interesting options to choose from.  One of which is the "Automatically Select" option (see image below).  Choosing "Last Member" will have Prism select the last member in the Drop-down List every time it is refreshed.  So when a new date comes in, it will be automatically selected and your dashboard will present the most recent data. 

    You can use this option even when you don't wish the dashboard users (if it doesn't happen to be you) to change dates, but rather just see the most recent data.  In this case, what I do is just hide the Drop-down behind one of the other widgets.  That way, the dashboard presents the most recent data, and the dashboard consumer cannot change that selection. 

    Elad
    SiSense 

     


  • Using "Actions" to reduce database workload

    A typical dashboard or report has several widgets that require query execution in order to present up-to-date results.  Some widgets may need to be updated frequently, some on a daily basis, or on a weekly basis - and so on.

    After you are done creating a dashboard, you can easily refresh it (by clicking F5 or clicking the Refresh button).  If you're information hungry like me, you find yourself doing that quite often.

    If your dashboard is created over an Excel or CSV file, this is not really a problem.  But if it's over a database server, these refreshes can take a while - depending on the size of the tables and network traffic.

    At some point I found myself staring at a dashboard with 56 widgets on it, each of them executing a different query against 3 different database servers down in our basement.  Taking into consideration that I am not the only one craving non-stop information in the organization and the fact that I didn't want to wait 10 minutes for my dashboard to refresh - I turned to re-organize my dashboard a little bit.

    I realized that most of the widgets in my dashboards could be refreshed just once a day. There were about two groups of 5 widgets each I need to refresh at more frequent intervals.  So, I turned to the Widgets tab on the left of the screen, scrolled down until I reached the Actions section and dropped 3 "Refresh Widgets" action buttons onto the sheet.

     

     
    A "Refresh Widgets" action is exactly what it sounds like.  It refreshes widgets.  Right clicking on it lets you configure which widgets will be refreshed once the button is pressed.

    By adding three separate action buttons and configuring the widgets they refresh, I was able to control the amount of queries my dashboard generated every time I needed to see updated results.

    Elad
    SiSense

  • Analyzing website user activity, disregarding your own company and/or fake users

    Hi everyone, 

    Since the Web 2.0 boom, many of our users are companies that have websites which are the foundation of their business.  These users often use Prism to analyze user activity on their site.

    The table(s) they use to perform this analysis usually hold a unique identifier for a user (the email, for example), the company they belong to, etc.  With Prism, It takes exactly 10 minutes to create a live dashboard that tells you, in real-time, exactly what you need to know about user activity.

    However, a common problem these types of users usually encounter is that their table(s) contain a bunch of users that throw off their activity statistics - users created for QA purposes, fake users (users providing a fake email, for example), etc.  This scenario has a very simple and effective solution.  But before we get to it, I'll mention a few key features of Prism you should be aware of when tackling such a problem.

    The Exclude Filter 

    The "Exclude" filter is very simple.  It accepts a static list of members as well as results of other filters and returns all members that do not match these values.  For example, if I create an Exclude filter over the Company dimension and specify "SiSense", this filter will return all company values that are not SiSense.  We'll call this filter "Non-SiSense Companies".

    Filtering By Label

    Much like it sounds, the "By Label" filter returns members that match a text-based criteria.  Creating a By Label filter over the Email dimension with the criteria "Doesn't End With sisense.com" will return all emails that do not end with "sisense.com".  We'll call this filter "Non-SiSense emails".

    Filtering By Attribute

    The "By Attribute" filter is in a sense the exact opposite of the Exclude filter.  In simple terms, it returns all members that intersect with the filter's defined criteria.  For example, creating a Filter By Attribute over the Email dimension and specifying the "Non-SiSense Comanies" filter as its criteria will return all emails not associated with the company "SiSense".  We'll call this filter "Non-SiSense Company emails".

    Now back to the problem at hand...

    Ok, so let's take a look at what we have so far.  We have a filter that has only emails not associated with the company SiSense and another filter that has only emails that do not end with sisense.com.

    You may ask yourself why I created these two filters that apparently return the same members.  Well, they will not necessarily return the same members.  It all depends on whether every person with a sisense.com email also entered SiSense as his or her company.

    Regardless, whether we just pick one of these filters, or use an "Intersection" to combine them both into a single filter - let's call that final filter "Non-SiSense Users". 

    Prism Filters are just as valid anywhere a dimension is valid.  This include on a pivot axes, in a formula, in a Drop-down List, etc.  So instead of using the Email dimension in my dashboard, I will use the "Non-SiSense Users" filter. This will instantly cause my dashboard to reflect data relevant to non-SiSense users only!

    Using filters in this way has two significant advantages, apart from focusing on real data and filtering out the garbage:

    1. This filters will always return up-to-date results.  If tomorrow a new SiSense employee is hired, he will automatically be added to this filter (either by his company or his email).
    2. These filters can be modified at any time.  I can modify the "Non-SiSense Companies" filter to include the company "SiSense Ltd" as well, without redoing any of my work and have it automatically update all filters, formulas and widgets that use it.
    And to top it all off - any modification I make to these filters will automatically be reflected in the dashboard I created!  I don't have to think of everything in advance, which at least for me is a very difficult task :-)

    Elad
    SiSense 

     
    For more information about filters, click here

    For more information about using filters in formulas, click here.


  • Data Field Types and the Dimensional Model

    How does Prism decide what the data type of each field is when I connect to a data source?

    This is a question that keeps popping up, usually because it has direct implications on the type of analysis you can later perform on these fields.  More specifically:

    • Numeric Fields can act as regular dimensions as well as measures, with any type of aggregation (Sum, Avg, Min, Max, Count, DistinctCount)
    • Textual Fields can act as regular dimensions as well as measures, limited to the Count and DistinctCount aggregations.
    • Date/Time Fields can act as Time dimensions, Regular Dimensions and measures, limited to the Count and DistinctCount aggregations.

    Identifying Database Table Fields

    Most serious databases such as SQL Server, Oracle, MySQL etc provide detailed schema information about the tables they hold.  When connecting to these types of sources, Prism simply asks the database what data type each field is configured to.

    Identifying OLAP Cube Fields

    OLAP cubes are handled a little differently than other data sources as they are already structured within a dimensional model.  When connecting to an OLAP cube, Prism correlates a dimension for each cube dimension, and a measure for each cube measure.

    Identifying Excel Range Fields

    In the case of Excel, identifying field types is a little more tricky.  This is because the creator of the Excel file can mix up different types of values in different types of formats within the same field.

    In order to determine the type of field, Prism will search for the first non-empty cell in the relevant fields (after skipping the title row) and evaluate two things:

    1. The formatting applied to the cell.  If the cell is formatted in a way Prism recognizes (Date/Time formatting, currency, numeric value with two decimal places, etc), the field type will be determined according to that format.
    2. If Prism does not recognize the applied formatting, it will try to figure out the data type according to the value itself.

    This methodology is not 100% accurate, but it usually is.

    However, since Prism will get confused sometimes (for example, if the first non-empty cell holds an irrelevant value that does not match in type with the rest of the field values, or if you have more than one title row), Prism allows you to define all types of dimensions/measures on any field in Excel.  If during the synchronization process (where Prism brings the Excel data into its own belly) a value that does not match the dimension/measure configuration is encountered - You can either fix this value, or skip the row it belongs to.

    Elad
    SiSense

  • Welcome to SiSense Blog!

    Welcome everybody!

    The SiSense Blog (SiBlog) is finally here.

    We get a lot of requests about topics to address here.  We will do our best to cover as many as we possibly can.

    Feel free to comment on any post (you must sign in before you can do that).

    Be sure to check back periodically, and keep those ideas coming!

     

    Elad
    SiSense R&D