Knowledge

Knowledge Base Articles

Sankey Diagrams With Plot.Ly In Periscope

Sankey diagrams are a very easy-to-digest way to look at the flow of objects between different points. The example we will walk through here is a basic Sankey overview, leveraging data from a fictional gaming company. The emphasis of this post is foc...


Text Mining With SQL

One of the recent questions I had to answer focused on analyzing text data. How have you solved this problem in the past? I was not sure how to create a solution optimized for both efficiency and completeness. The solution I came up with used a join ...

Generate Series Of Dates In Snowflake

As Snowflake doesn't have a native generate_series function, here is our solution to generating a table of incrementing dates, starting from the current date, in Snowflake. It is currently set to generate 1095 rows (3 years) of dates. select dateadd(...
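For readers who want to check the expected output outside the warehouse, here is a minimal Python sketch of the same idea: a list of consecutive dates, 1095 rows by default, beginning at a chosen start date. It does not reproduce the truncated Snowflake query above; the function name `date_series` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def date_series(start, rows=1095):
    """Return `rows` consecutive dates, one per day, beginning at `start`."""
    return [start + timedelta(days=offset) for offset in range(rows)]


# A fixed start date keeps the example reproducible;
# pass date.today() to start from the current date instead.
dates = date_series(date(2024, 1, 1))
```

The resulting list can be compared against the warehouse output to confirm the row count and the first and last dates.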

Loose String Matching (Levenshtein For Redshift?)

Perhaps a bit tangential to text mining, but has anyone found a way to efficiently do string analysis in Redshift? Ideally something like levenshtein() or levenshtein_less_distance() in postgres to return string similarity. There must be a better way...
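As a reference for what a Levenshtein function returns, the sketch below computes the edit distance in plain Python with the standard dynamic-programming recurrence. It is an illustration of the metric only, not a Redshift solution, and the function name `levenshtein` is our own.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")
```

Only two rows of the table are held at a time, so memory grows with the length of the shorter string rather than with the product of both lengths.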

Chart Type - Radial Bar Chart In Matplotlib (Python)

Here's a script that takes a data frame with two values, the current and benchmark, and returns radial bar charts to plot progress toward a goal. You can also choose a color using the color_theme parameter that takes values 'Grey', 'Purple', 'Blue', '...


Calculating Trimmed Means (SQL And Python Variations)

Data often contain extreme outliers, which can heavily skew metrics such as the mean. One way to get around this is to use a trimmed mean. With a trimmed mean, you remove the top and bottom x percent of the data and take the ...
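A minimal Python sketch of a trimmed mean, under stated assumptions: the same fraction is cut from each end, and the number of values cut is rounded down. The function name `trimmed_mean` and the parameter `trim_fraction` are hypothetical and are not taken from the original post.

```python
def trimmed_mean(values, trim_fraction=0.1):
    """Mean of `values` after dropping the lowest and highest
    `trim_fraction` share of observations (count rounded down)."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    cut = int(len(ordered) * trim_fraction)  # observations dropped per tail
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# One extreme outlier (100) pulls the plain mean to 22.0;
# trimming 20% from each end removes it.
result = trimmed_mean([1, 2, 3, 4, 100], trim_fraction=0.2)
```

Rounding the cut down means small samples may lose no observations at all, so the trimmed mean then equals the plain mean.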


Funnel Charts In Python

Funnel charts are a great way to represent any drop-offs in sample size throughout a series of steps. Using a little bit of Python handiwork in Sisense for Cloud Data Teams' R/Python integration, we can easily create this chart type. A common use cas...


Confidence Interval Printout - Python

Let's say we want a printout of our confidence interval for an entire sample. (Note: if you're looking for a visual of a confidence interval over time, check out the post here!) The solution here requires Periscope Data's Python/R Integration as we'll...
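As a rough illustration of a confidence interval for a sample mean, the sketch below uses only the Python standard library and a normal (z) approximation. This is an assumption on our part: the original post may use a different method, and for small samples a t-distribution gives a wider, more appropriate interval. The function name `confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Return (low, high) bounds for the sample mean using a
    normal approximation and the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(sample) / sqrt(n)
    center = mean(sample)
    return center - margin, center + margin


low, high = confidence_interval([10, 12, 11, 13, 14])
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```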


Get Yesterday's Date

Often, we want to analyze data using yesterday's date. How do we auto-populate yesterday's date? There are a few formats that we can use! The syntax below lets us get yesterday's date with timestamp: select dateadd(day,-1,getdate()) To on...
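The same calculation can be sketched in Python with the standard `datetime` module, which is handy for checking results outside SQL. The helper name `yesterday` is hypothetical. It subtracts one day from a timestamp and keeps the time of day, which mirrors what dateadd(day,-1,getdate()) does.

```python
from datetime import date, datetime, timedelta


def yesterday(now):
    """Return the moment exactly one day before `now`."""
    return now - timedelta(days=1)


# A fixed timestamp keeps the example reproducible;
# pass datetime.now() to use the current time instead.
with_timestamp = yesterday(datetime(2024, 3, 1, 9, 30))
date_only = with_timestamp.date()  # drops the time of day
```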


CASE WHEN Statements

Overview: CASE WHEN statements provide great flexibility when dealing with buckets of results or when you need to find a way to filter out certain results. You can think of these almost as IF-THEN statements similar to those in other coding languages. Below yo...

Filters In Case When Statements

I came across an interesting use case with a customer where they had a name and ID column on a dataset:
Select a filter value for name, and not id -> show results for name
Select a filter value for id, and not name -> show results for id
If values ar...

Test For Normal Distribution Of Data With Python

One of the first steps in exploratory data analysis is to identify the characteristics of the data, importantly including a test for distribution patterns. In this example, learn how to check if your data is normally distributed in Python with a visu...


Weighted Vs Unweighted Averages

When summarizing statistics across multiple categories, analysts often have to decide between using weighted and unweighted averages. An unweighted average is essentially your familiar method of taking the mean. Let's say 0% of users logged into my s...
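To make the distinction concrete, here is a small Python sketch with made-up numbers: two categories with login rates of 50% and 10%, and 100 and 900 users respectively. The figures and function names are hypothetical and are not taken from the original article.

```python
def unweighted_average(rates):
    """Plain mean: every category counts equally."""
    return sum(rates) / len(rates)


def weighted_average(rates, weights):
    """Mean in which each category counts in proportion to its weight."""
    if len(rates) != len(weights):
        raise ValueError("rates and weights must have the same length")
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


login_rates = [0.5, 0.1]   # hypothetical login rate per category
user_counts = [100, 900]   # hypothetical number of users per category

unweighted = unweighted_average(login_rates)
weighted = weighted_average(login_rates, user_counts)
```

The unweighted figure treats both categories as equally important, while the weighted figure is the login rate across all 1000 users, so it sits much closer to the rate of the larger category.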
