OK, I admit it. I’m recycling parts of a blog post from 2 years ago.
The Normal Distribution, or “Bell Curve”
Our brains are wired somehow to think of everything in terms of a Normal Distribution, aka the Bell Curve. It’s a trap that can obscure important patterns in data such as churn drivers.
The shape of the curve leads us to treat a population of data (such as customers) as a somewhat homogeneous group that a single average can describe. Think of how many minutes per day “on average” a user spends in your product, or the percentage of customers “on average” who renew their subscription.
The problem is that populations of people and customers almost never follow a normal distribution. Instead, the more prevalent pattern of behavior is a Power Law, of which the Pareto Distribution is the best-known example.
The Pareto Distribution, or “80/20 rule”
The Pareto distribution is also known as the 80/20 rule. Except that in online worlds, the ratio can be even closer to “95/5”. So, it’s very likely that the 5, 10 or 15% of your customers that churn each year actually have a lot in common, and differ materially from the “average” customer.
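To make the point concrete, here is a minimal sketch of how an average can misrepresent a skewed population. The usage numbers are made up for illustration, and the helper name top_share is my own, not something from a library.

```python
from statistics import mean, median

# Hypothetical minutes-per-day for 20 users (made-up numbers, not real data).
minutes_per_day = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 8, 30, 45, 60, 90]


def top_share(values, top_fraction):
    """Return the share of the total contributed by the top `top_fraction` of values."""
    ranked = sorted(values, reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


avg = mean(minutes_per_day)
mid = median(minutes_per_day)
share = top_share(minutes_per_day, 0.20)
# Count how many users sit below the "average" user.
below_avg = sum(1 for m in minutes_per_day if m < avg)

print(f"mean: {avg:.2f} minutes, median: {mid} minutes")
print(f"top 20% of users contribute {share:.0%} of all minutes")
print(f"{below_avg} of {len(minutes_per_day)} users are below the mean")
```

With these made-up numbers, the mean is 13.85 minutes but the median is 3.5, and 16 of the 20 users fall below the mean. The top 20% of users account for about 81% of all minutes, so the “average” user describes almost nobody.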
Let’s de-construct this. Start by asking yourself a few questions.
Is churn the same across every acquisition channel?
Probably not. You’ll see differences between channels such as outside sales, inside sales and online channels, and even between the various online campaigns that can beget customers (email marketing, search engine marketing, display advertising, etc.).
Is churn the same across every customer segment you sell into?
Probably not. If you’re B2B and you sell into various industries, then you probably have some industries with better retention than others. You will also see differences between small, medium and large customers according to their company size (their annual revenue and/or employee count). If you’re B2C and you sell into various demographics, then you probably have some demographic segments with better retention than others (think age, household members, etc.).
Is churn the same across every region you sell into?
Probably not. Different countries have different retention characteristics, and sometimes so do different states or provinces within a country.
Is churn the same across product usage patterns?
Probably not. There are probably usage patterns, such as feature use and frequency, that correspond to higher- and lower-churning customer segments. Also, think about usage across a population of users inside an account. Some accounts will have a couple of very engaged users and the rest not so much. Others may have widespread, evenly distributed usage. You wouldn’t see these differences if you were working with “averages”. Yet, these different segments will probably have different retention rates.
De-coding churn is about the search for “hot-spots”
The game of de-coding churn is about finding variances from the average, instead of focusing on the average itself. There’s probably a high-risk segment in your customer base that can be described by a combination of multiple factors including acquisition channel, region, customer demographics and usage pattern. Think of this example: “churn through the reseller channel in the U.S. is 25% higher than the regional average.” If you knew that, what would you do? You’d probably have a conversation with the person who recruits and trains resellers in the U.S. about doing a better job of it. You just found a “hot-spot”.
So why is the search for hot-spots so hard? It requires all the data about your customers to be in one place and fully dimensionalized (think “data cube”).
For most companies, getting data into this type of structure is done using Pivot Tables, which is time-consuming, error-prone and hard to maintain. This is why Big Data and analytics systems are becoming commonplace.
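As a rough illustration of the hot-spot search described above, here is a minimal sketch that slices a customer list by every combination of dimensions and ranks the segments by how far their churn rate sits above the overall rate. Everything in it is an assumption for the sake of the example: the records, the field names (region, channel, churned), the function name churn_hot_spots and the min_size cutoff are all made up, and a real analysis would need far more data and some test of statistical significance.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: (region, channel, churn flags per customer). Not real figures.
raw = [
    ("US", "reseller", [True, True, True, False]),
    ("US", "direct", [False, False, False, True]),
    ("EU", "reseller", [False, False, True, False]),
    ("EU", "direct", [False, False, False, False]),
]
customers = [
    {"region": region, "channel": channel, "churned": flag}
    for region, channel, flags in raw
    for flag in flags
]


def churn_hot_spots(records, dimensions, min_size=2):
    """Rank segments by churn rate minus the overall churn rate (highest first)."""
    overall = sum(r["churned"] for r in records) / len(records)
    spots = []
    # Try every combination of dimensions: one at a time, then pairs, and so on.
    for k in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            groups = defaultdict(list)
            for r in records:
                key = tuple((d, r[d]) for d in dims)
                groups[key].append(r["churned"])
            for segment, flags in groups.items():
                if len(flags) < min_size:
                    continue  # skip segments too small to say anything about
                rate = sum(flags) / len(flags)
                spots.append((rate - overall, dict(segment), rate, len(flags)))
    spots.sort(key=lambda s: s[0], reverse=True)
    return overall, spots


overall, spots = churn_hot_spots(customers, ["region", "channel"])
print(f"overall churn: {overall:.1%}")
for lift, segment, rate, size in spots[:3]:
    print(f"{segment}: churn {rate:.1%} ({lift:+.1%} vs overall, n={size})")
```

In this toy data set, overall churn is 31.25%, while the U.S. reseller segment churns at 75%. Neither “U.S.” nor “reseller” on its own looks that bad (50% each), which is why the combination of dimensions matters.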