# Retention Analysis

Retention analyses are critical in understanding a business’ strengths and weaknesses. Primarily, they can help a business user understand when and why some customers stay and others leave – giving a data-driven way to address these. Before you can understand why customers churn, you must first identify when they churn.

For this example, we’re going to examine a dataset that includes information on retail orders going back to 2018, and the customers that made them to better understand why some customers continue to buy and others don’t.

Step 1: Identify Cohorts

Since this is a customer-centric analysis, we will want to group our customers into groups based on similar information – or cohorts. If you have customer-level data, this can be something like a “Customer Since” datapoint. If you have order-level data, this can be from the first time each customer made a purchase, aggregated to month, quarter, year, etc.

Let’s do just that.

1. Group by unique customer ID

1. Identify each customer’s first order by creating a calculation within the previously made grouping through Min([DATE])

1. Determine your cohort granularity (e.g., the quarter/year of their first order) and truncate accordingly, naming it “First Order Cohort”

Step 2: Calculate Retention Rates

Now we will need to calculate retention rate, or the percentage of customers who are continuing to use the product. Since we want consistency in order to pick up on any patterns, we will need to calculate the periods, at the user level.

1. Outside of the previously made grouping, at the row level calculate the difference in periods between each order date and the first order, for example: DateDiff(“quarter”, [First Order Cohort], [Date])

Step 3: Create Visualizations

One of the most common visualizations for this type of analysis involves a Cohort Table output. This can easily be achieved with Sigma through creating a pivot table that is able to work on nearly unlimited rows of live data all while staying performant.

1. Create child element → pivot table

1. In pivot rows: Cohort

2. In pivot column: Quarters since first purchase

1. In values: CountDistinct([Customer ID])

1. Set aggregate: Percent of Total (row)

1. Conditional Formatting: Apply to Values, Color Scale

Another quick and easy way to visualize the same data is through an Area chart.

1. Create child element → visualization
2. In x-axis: Quarters since first purchase
3. In y-axis: CountDistinct([Customer Id])

1. In colors: Cohort

Step 4: Diagnose the why

Now that these initial analyses, nearly anyone within an organization now has the agency to begin their data exploration to understand the why. From here this data can be published to a dashboard to enable business users the ability to freely do ad-hoc analyses.

For more information on just how easy this is in Sigma, check out the articles below on Drill-down Capabilities and Row-level Security.

6 Likes