cancel
Showing results for 
Search instead for 
Did you mean: 
Community_Admin
Community Team Member
Community Team Member

Question

LAG & LEAD Functions are functions that give access to multiple rows within a table, without the need for a self-join. The LAG function is used to access data from a previous row, while the LEAD function is used to return data from rows further down the result set.

perfect example of these functions in use is with the “Sales Order” scenario. A department store is trying to monitor the purchase trends of its customers; more specifically, per customer, what’s the average time it took to repeat an order? For us to do so, we must take a record in the table (Order Date) and compare it to another record in the same table. What are some ways we can do that?

In our example, we will analyze & compare the efficiency between the multiple solutions.

We will start by addressing the problem in the most common/traditional of solving a problem like this.

Answer

 

Steps:

1. Create A New ElastiCube

Add Data > Microsoft SQL Server > Connect To Server

Pull in Sales Order Header Table

Community_Admin_0-1634389201706.png

"How are we going to manipulate that table to get it into a format that we would want to work with?"

Uncheck fields that you don’t need; looking for high-level/important data

Community_Admin_1-1634389201499.png

 

Community_Admin_2-1634389201175.png

BUILD!

2. Let's Create A Custom SQL Expression

Community_Admin_3-1634389201530.png

 

Community_Admin_4-1634389201546.png

What we are getting here a single customer’s activity.  It also orders the Order Date in ascending order (from first date to last)

"For this specific customer, I want to be able to see __ amount of days between each order.

How can we compute / compare those dates?

In order to do so, we want to take the order table and join it to itself, where the customer is the same and the order date is offset to the previous or future date."

So, let’s start by leveraging the rank function to assign an order number to every date…

Community_Admin_5-1634389201521.png

The Rank function is important here because within the Customer ID, we want to rank based off the order date, group it per customer, and order by the order date.

This basically creates create a system that makes it so that it is comparing itself to itself minus 1.

This is what sets up our ranking per order.

In order for us to join the table to itself, let’s create a sub query and inside of it, put the entire Customer’s table:

Community_Admin_6-1634389201245.png

(Type Of Join:)

Community_Admin_7-1634389201478.png

What it’s getting joined on:

Community_Admin_8-1634389201397.png

Result:

Community_Admin_9-1634389201512.png

3. Now That We Have The Order Date & The Total Due, We Can Start Subtracting Them Off Each Other & Start Doing Differences

Now you would want to go back to the top; the first “Select” of the sub query and change the claims to:

Community_Admin_10-1634389201441.png

Preview:

Community_Admin_11-1634389201607.png

Now that we saw that this function is working, we can now delete our customer filter, order by filter.

Now we get all the differences across all of the customers.

Community_Admin_12-1634389201626.png

Issue With This Solution

The main issue with this solution is that it requires a lot of processing due to the two tables. Although we are really using one table, we still have two and the same data is repeated between them. This extra storage isn’t necessary, creates the long processing, and can also create confusion for the customer.

Community_Admin_13-1634389201496.png

Solution 2: Doing A Lookup Function On Itself (Own Data Set).

To improve upon the last solution, is there a way to pull in one table and do the manipulation in Sisense, without creating a second table?

Answer: Yes! What we can do is take the custom table and add it as a view to the source; that way we only import one table

Steps:

1. Duplicate "SalesOrderHeader 1", Rename "SalesOrderHeader 2"

BUILD!

For us to do a look up on this table, we need to have need to have one key to join.  The key is a combination of the customer ID + ranking, that’s what makes it unique.

2. So First, Create A New Field, Record_Rank (Int),  Input:

Community_Admin_14-1634389201424.png

BUILD

Add Data > Custom SQL Table

Community_Admin_15-1634389201355.png

 

Community_Admin_16-1634389201514.png

3. Head Back Into SalesOrderHeader 2, And Set Up The Different Fields To Create The Key

+ New Field > CurrCustomerRankKey (Text)

Community_Admin_17-1634389201497.png

Do Entire Build

Community_Admin_18-1634389201391.png

4. Let's Create A Test Function

+ Custom SQL Expression > Test Function

 

Community_Admin_19-1634389201324.png

5. Go Back To SalesOrderHeaders 2 Table, Do Lookup In There

+ New Field > PrevOrderDate (Date-Time)

Community_Admin_20-1634389201515.png

+ New Field > PrevTotalDue (Float)

Community_Admin_21-1634389201256.png

+ New Field > DaysBetweenOrders(Int)

Community_Admin_22-1634389201262.png

+ New Field > DiffOrderAmount

Community_Admin_23-1634389201438.png

Schema Build:

Community_Admin_24-1634389201605.png

Solution 3: LAG & LEAD Functions

This solution does not include a self-join

Steps:

1. Open Up SQL Server (TC SQL)

Go to same database (Adventure Works 2012) > create new query

Let’s start off the new query as:

Community_Admin_25-1634389201601.png

Now we want to dynamically compare one against the other. Instead of doing a self-join, it tells itself to look at the look at the record/number that is above or below and just pull that value in.

So it’s almost like a lookup , but it doesn’t require a join , it just requires you to tell it how to sort the data and just dynamically go up and go down and pull the field that you want in to the current record.

This is where the LAG & LEAD functions step in

To go back a record, we call the LAG function. Within the LAG function, we must define the field that we want to pull in (order date) and how many records back do you want to go (1). Next we want to group by (partition by) and Order By a type of variety and then order it either ASC / DES.

Community_Admin_26-1634389201580.png

To make our “look up” a bit more distinct we can

Community_Admin_27-1634389201561.png

Result:

Community_Admin_28-1634389201763.png

If we want to compare up against future dates, we would use:

Community_Admin_29-1634389201523.png

And its result would be:

Community_Admin_30-1634389201784.png

What we can also pull in the Total Due and calculate the previous Total Due:

Community_Admin_31-1634389201471.png

Which would require a sub query:

Community_Admin_32-1634389201598.png

To get of the first null, we would enter:

Community_Admin_33-1634389201482.png

Pull in everything from t2.* and now we can calculate our differences  (Date Diff, etc):

Community_Admin_34-1634389201603.png

As for getting this entire query into Sisense, we want to create a View

Community_Admin_35-1634389201767.png
Rate this article:
Version history
Last update:
‎02-23-2024 10:37 AM
Updated by: