
Calculating user churn using Relative Time Window metrics


As a Customer Success Manager at Interana, I'm frequently asked about the best way to calculate user churn.  My recommendation is to combine our "Relative Time Window" Per-Actor Metrics with our Ratio Measures to create a representation of churn that is easily visualized in time view.  This will allow you to track churn over time and easily compare churn across different groups of users.

To demonstrate this calculation, I'll be using a very simple data set for the sake of clarity.  It follows 5 users over 7 days (April 1-7, 2016).  The following table describes the activity for those 5 users over that 7-day period (X marks activity):

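| User    | 4/1/2016 | 4/2/2016 | 4/3/2016 | 4/4/2016 | 4/5/2016 | 4/6/2016 | 4/7/2016 |
|---------|----------|----------|----------|----------|----------|----------|----------|
| John    | X        | X        | X        |          | X        | X        | X        |
| Felix   |          | X        |          |          |          |          |          |
| Sarah   | X        | X        |          |          | X        | X        | X        |
| Michael | X        | X        | X        | X        | X        | X        | X        |
| Dana    |          | X        | X        |          | X        | X        |          |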

 

I've created some test events in Excel (these are UTC timestamps, so in my PDT cluster midnight on 2016-04-02 appears as 5:00 PM on April 1st):
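To make the UTC-to-PDT shift concrete, here is a minimal Python sketch of the conversion. It assumes a fixed UTC-7 offset for PDT (which held in April 2016) and is only an illustration, not something Interana runs:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for PDT, which is valid for
# April 2016 because daylight saving time was in effect.
PDT = timezone(timedelta(hours=-7), name="PDT")

# Midnight UTC on April 2nd, as it would appear in the test data.
utc_midnight = datetime(2016, 4, 2, 0, 0, tzinfo=timezone.utc)
local = utc_midnight.astimezone(PDT)

print(local.strftime("%Y-%m-%d %I:%M %p %Z"))  # 2016-04-01 05:00 PM PDT
```

This is why an event stamped 2016-04-02 in UTC lands in the April 1st day bucket on a PDT cluster.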

I then transformed these events into JSON for import using csv_load in our Transformer Library:

If you're interested in doing something similar, here's the Transformer Library config:

[
["decode"],
["csv_load"],
["json_dump"]
]
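For readers without access to the Transformer Library, the following Python sketch approximates what the three steps above do conceptually: decode bytes, parse CSV rows, and emit one JSON object per row. It is not the Transformer Library itself; the function name, column names (user, gender, timestamp), and sample rows are all hypothetical.

```python
import csv
import io
import json

# Hypothetical sample rows; the real spreadsheet columns may differ.
CSV_TEXT = """user,gender,timestamp
John,male,2016-04-01 00:00:00
Felix,male,2016-04-02 00:00:00
"""

def csv_to_json_lines(raw_bytes):
    """Rough stand-in for the decode -> csv_load -> json_dump steps."""
    text = raw_bytes.decode("utf-8")            # "decode"
    rows = csv.DictReader(io.StringIO(text))    # "csv_load"
    return [json.dumps(row) for row in rows]    # "json_dump"

json_lines = csv_to_json_lines(CSV_TEXT.encode("utf-8"))
for line in json_lines:
    print(line)
```

Each output line is a standalone JSON object keyed by the CSV header, which is the general shape an event importer expects.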

Then I imported the events into Interana:

Now it's time to calculate churn!

Step 1: Create a Relative Time Window metric

Now that we have our dataset, it's time to create the Relative Time Window Per-Actor Metric that will allow us to calculate churn!  To do so, I simply create a per-user metric that counts the events for each user, with a few additional options:

You'll notice above that I used a "time override" for this metric.  As you can see in the upper right-hand corner of the previous screenshot, this will create additional "relative time window" metrics for our use in explorer view.  For each user, this metric counts the number of events that happened on the current day, and also counts the number of events that happened on the next day, yesterday, and 2 days ago.  If the time window is different (one week, perhaps), it will count the events for that user in the current week, the next week, the previous week, and the week before the previous week.  These metrics are stored as {metricname}_curr, {metricname}_next, {metricname}_prev, and {metricname}_prev2T and are accessible wherever normal Per-Actor metrics are accessible in explorer view.
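As a rough illustration of what these relative time window counts represent, here is a small Python sketch over the sample data from the activity table earlier in the article. It assumes one event per user per active day and treats days as plain calendar dates; the function name and dictionary keys are hypothetical and only loosely mirror the metric suffixes.

```python
from collections import Counter
from datetime import date, timedelta

# Days in April 2016 on which each user was active (from the activity table).
ACTIVITY = {
    "John":    [1, 2, 3, 5, 6, 7],
    "Felix":   [2],
    "Sarah":   [1, 2, 5, 6, 7],
    "Michael": [1, 2, 3, 4, 5, 6, 7],
    "Dana":    [2, 3, 5, 6],
}
# Assumption: exactly one event per user per active day.
EVENTS = [(user, date(2016, 4, d)) for user, days in ACTIVITY.items() for d in days]

def relative_window_counts(events, user, day):
    """Event counts for one user in the 1-day windows around `day`."""
    per_day = Counter(d for u, d in events if u == user)
    one = timedelta(days=1)
    return {
        "curr": per_day[day],             # events in the current window
        "next": per_day[day + one],       # events in the next window
        "prev": per_day[day - one],       # events in the previous window
        "prev2": per_day[day - 2 * one],  # events two windows back
    }

sarah = relative_window_counts(EVENTS, "Sarah", date(2016, 4, 2))
print(sarah)  # {'curr': 1, 'next': 0, 'prev': 1, 'prev2': 0}
```

Sarah was active on 4/1 and 4/2 but not on 4/3, so her "next" count for the 4/2 window is zero, which is exactly the signal the churn ratio relies on.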

Step 2: Create a Ratio Measure using your new Relative Time Window metric

Our next step is to use our new metric to calculate our churn ratio!  Our churn ratio will be composed of a numerator and a denominator:

  • Denominator: How many users generated an event today?
  • Numerator: Of the users that generated an event today, how many of those users did not generate an event tomorrow?

We can define this exact numerator and denominator in a ratio metric by using the relative time window metric that we just created as follows:

Specifically, the above will divide the unique count of users that acted in the current time window but not the next time window by the total unique count of users that acted in the current time window.
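The same numerator and denominator can be sketched in plain Python against the sample data from the activity table. This is only an illustration of the arithmetic, not how Interana evaluates the ratio internally; the function name is hypothetical.

```python
from datetime import date, timedelta

# Days in April 2016 on which each user was active (from the activity table).
ACTIVITY = {
    "John":    [1, 2, 3, 5, 6, 7],
    "Felix":   [2],
    "Sarah":   [1, 2, 5, 6, 7],
    "Michael": [1, 2, 3, 4, 5, 6, 7],
    "Dana":    [2, 3, 5, 6],
}
ACTIVE_DAYS = {u: {date(2016, 4, d) for d in days} for u, days in ACTIVITY.items()}

def churn_ratio(active_days, day):
    """Users active on `day` but not the next day, over users active on `day`."""
    next_day = day + timedelta(days=1)
    current = {u for u, days in active_days.items() if day in days}
    churned = {u for u in current if next_day not in active_days[u]}
    return len(churned) / len(current) if current else None

# 4/7 is skipped: the data set has no 4/8, so every user would appear to churn.
for d in range(1, 7):
    day = date(2016, 4, d)
    print(day, churn_ratio(ACTIVE_DAYS, day))
```

For 4/2 this yields 2 / 5 = 0.4, because Felix and Sarah were active on 4/2 but not on 4/3.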

Step 3: Use the Ratio Measure to understand user churn

Now it's time to reap the rewards of our labors. Let's investigate our churn ratio measure in explorer view! Here's what it looks like:

Now, let's mouse over a point to see how it is calculated:

For this point (representing the 1-day time bucket ending at 12:00 AM on 4/3, that is, the day of 4/2), we have a churn ratio of 0.4 (2 / 5).  When we look at the original dataset, we can see that all 5 of our users acted on 4/2, but only 3 of those users acted on 4/3.  Since 2 out of 5 of our users did not return, we have a churn ratio of 0.4.

Step 4: Understand churn across user segments

Now that we have our churn measure, we can use the rest of the explorer functionality to answer our high-value questions.  For instance, we can now group by our "gender" attribute to compare churn between our male and female userbases:

Of course, this graph would look a little bit prettier if we had a larger sample size, but that's the price we pay for simplicity!  What we can see here is that, of the 3 males that generated an event on 4/2, 1 of them "churned" on the next day.  Upon checking the raw data, we can see that this calculation is correct!
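The grouped calculation can be checked by hand with a short Python sketch. The gender assignments below are an assumption inferred from the names and from the count of 3 males on 4/2; the original data set is not shown in the text. The function name is hypothetical.

```python
from datetime import date, timedelta

# Days in April 2016 on which each user was active (from the activity table).
ACTIVITY = {
    "John":    [1, 2, 3, 5, 6, 7],
    "Felix":   [2],
    "Sarah":   [1, 2, 5, 6, 7],
    "Michael": [1, 2, 3, 4, 5, 6, 7],
    "Dana":    [2, 3, 5, 6],
}
ACTIVE_DAYS = {u: {date(2016, 4, d) for d in days} for u, days in ACTIVITY.items()}

# Assumption: gender values inferred from names, not taken from the source data.
GENDER = {"John": "male", "Felix": "male", "Michael": "male",
          "Sarah": "female", "Dana": "female"}

def churn_by_group(active_days, groups, day):
    """Map each group to (churned users, active users) for the given day."""
    next_day = day + timedelta(days=1)
    result = {}
    for group in sorted(set(groups.values())):
        current = {u for u, days in active_days.items()
                   if groups[u] == group and day in days}
        churned = {u for u in current if next_day not in active_days[u]}
        if current:
            result[group] = (len(churned), len(current))
    return result

by_gender = churn_by_group(ACTIVE_DAYS, GENDER, date(2016, 4, 2))
print(by_gender)  # {'female': (1, 2), 'male': (1, 3)}
```

Under these assumptions, 1 of the 3 males active on 4/2 (Felix) did not return on 4/3, which matches the figure quoted above.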

What's next: Apply this to your own data

By now I'm sure you're thinking about all of the insightful ways that you can apply this how-to to your own data.  I find that the highest-value analyses involve investigating these measures over different user groups, as above.  We aren't limited to group by here: I find that using compare groups to compare measures like these across different groups of users can be extremely valuable.

Here are a few more example questions you can answer to get you started:

  • Do we see a higher churn rate for our users that do not use the tutorial functionality of our site within their first week?
  • Did our overall churn rate go up after our last update?
  • Is the churn rate of our users referred from {insert site here} higher than that of other sites?
  • Which enterprise customers have the highest churn rates?

Thanks for reading!  Feel free to comment or ask me additional questions.
