Mining for gold: 3 sources of product usage data

Update: the vendor landscape described here has evolved considerably since writing this. But the basic concepts still apply.

In a prior blog, I talked about ways to drive revenue growth with product usage data, including reducing churn, spotting up-sell opportunities, and increasing trial conversions.

In a follow-up blog, I discussed the four dimensions of product usage that you need to analyze:

  • Frequency: how often a user interacts with your product
  • Features: which features are used
  • Volumetrics: how often a feature is used, or a data record is created
  • Configuration settings: ways a user configures the app to suit their needs

By now you’re probably interested in unlocking the value of product usage data for your company. Which raises the question: where do you get this data in the first place?

Generally speaking, usage data comes from three sources:

  1. Clickstream data
  2. Log events
  3. Database queries

We’ll review each in succession.

Clickstream data

Clickstream data is generated by an end user’s interaction with your product interface: for example, logging in to your web app from a browser, or performing an action in your mobile app.

This type of usage data can usually (but not always) help you understand frequency of use, as well as coarse-grained feature usage, depending on how you “tag” usage events from the browser.
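To make “tagging” a little more concrete, here is a minimal Python sketch of what a tagged event might look like and how frequency and coarse feature usage fall out of it. The event schema and the function names (make_event, active_days_per_user, feature_counts) are my own assumptions for illustration, not the format of any particular clickstream tool.

```python
from collections import Counter
from datetime import datetime, timezone


def make_event(user_id, account_id, name, properties=None, timestamp=None):
    """Build one tagged usage event (hypothetical schema, for illustration)."""
    return {
        "user_id": user_id,        # who did it
        "account_id": account_id,  # which customer they belong to
        "event": name,             # the "tag" you chose for this action
        "properties": properties or {},
        "timestamp": timestamp or datetime.now(timezone.utc),
    }


def active_days_per_user(events):
    """Frequency: number of distinct calendar days each user was active."""
    days = {(e["user_id"], e["timestamp"].date()) for e in events}
    return Counter(user_id for user_id, _ in days)


def feature_counts(events):
    """Coarse feature usage: how many times each tag was seen."""
    return Counter(e["event"] for e in events)


def at(day, hour):
    """Helper to build sample timestamps in January 2024 (made-up data)."""
    return datetime(2024, 1, day, hour, 0, tzinfo=timezone.utc)


events = [
    make_event("u1", "acme", "login", timestamp=at(1, 9)),
    make_event("u1", "acme", "create_project", timestamp=at(1, 10)),
    make_event("u1", "acme", "login", timestamp=at(2, 9)),
    make_event("u2", "acme", "login", timestamp=at(2, 10)),
]

print(active_days_per_user(events))
print(feature_counts(events))
```

Notice that the detail you get is only as good as the tags: if every action is tagged as a page view, feature usage stays coarse.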

Several products are good sources of clickstream data for your browser-based app:

  • KISSmetrics
  • Mixpanel
  • A popular JavaScript plug-in that feeds lots of other clickstream tools

An aside: why not Google Analytics for product usage? Unlike the tools above, Google Analytics comes with terms of service that prohibit you from sending it personally identifiable information. That makes it very difficult to understand which user, or even which company, is accessing your application and how. Stick to the products above.

Usage data for mobile apps can come from several of the packages above, plus some that are purpose-built for mobile:

  • Flurry. Note that Flurry monetizes by driving ad placements, so it’s not for everyone
  • Tapstream

Pros of clickstream data

  • Easy to deploy
  • Little to no engineering team involvement
  • Good for basic engagement metrics

Cons of clickstream data

  • Not detailed enough to reveal important features and user segments
  • Brittle to maintain

Log events

Depending on how well your engineers have instrumented your server-side code, your application may already be generating usage data as log events. For example, your web or app servers might be generating Apache logs that contain details about the user’s actions, especially feature usage.
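As a rough illustration, here is a Python sketch that turns Apache-style access log lines into feature usage events. It rests on several assumptions: the logs use the Common Log Format, the authenticated user name appears in the log’s user field, and the FEATURES mapping from requests to feature names is something you would define for your own app. The paths and feature names below are made up.

```python
import re
from collections import Counter

# Apache Common Log Format: host ident user [time] "METHOD path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical mapping from (method, path) to a feature name.
FEATURES = {
    ("POST", "/projects"): "create_project",
    ("POST", "/reports"): "run_report",
}


def parse_line(line):
    """Return a usage event dict, or None if the line is not user feature usage."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # not in the expected format
    if match["user"] == "-":
        return None  # anonymous request, cannot attribute to a user
    if not match["status"].startswith("2"):
        return None  # errors and redirects describe system behavior
    feature = FEATURES.get((match["method"], match["path"]))
    if feature is None:
        return None  # static assets, health checks, and so on
    return {"user": match["user"], "feature": feature, "time": match["time"]}


sample_lines = [
    '10.0.0.1 - alice [10/Oct/2023:13:55:36 +0000] "POST /projects HTTP/1.1" 201 512',
    '10.0.0.1 - alice [10/Oct/2023:13:56:02 +0000] "GET /favicon.ico HTTP/1.1" 200 1024',
    '10.0.0.2 - bob [10/Oct/2023:14:01:10 +0000] "POST /reports HTTP/1.1" 500 87',
    '10.0.0.2 - bob [10/Oct/2023:14:02:44 +0000] "POST /reports HTTP/1.1" 200 2048',
]

usage_events = [e for e in map(parse_line, sample_lines) if e is not None]
usage_by_user = Counter((e["user"], e["feature"]) for e in usage_events)
print(usage_by_user)
```

Only two of the four sample lines survive the filter, which hints at how much of a typical log describes system behavior rather than user behavior.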

Pros of log data

  • Log events can be very specific and accurate in depicting feature usage, compared to page-level clickstream data from a browser

Cons of log data

  • Log events need to be parsed, which can be challenging if you’re doing it for yourself in a database or Excel file
  • Log events can contain types of events that aren’t meaningful to you, because they describe system behaviors, not user behaviors (think error logs)
  • Your engineers need to be involved to do a good job of log instrumentation

Database queries

On the server side of your application is some sort of database (such as MySQL, Hadoop, MongoDB, etc.). Each user action may have a corresponding “transaction” or record in the database that forms a picture of usage. For example, if a user started a new “Project” in your Project Management app, then that record was created in the database at a specific time by a specific user.

These records can be queried from your database to produce events or counts of events (such as daily summaries).
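Here is a small Python sketch of such a daily summary, using an in-memory SQLite database so it runs anywhere. The projects table and its created_by and created_at columns are assumptions made for this example; your own schema and database engine will differ.

```python
import sqlite3

# In-memory database standing in for your application's real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects ("
    " id INTEGER PRIMARY KEY,"
    " created_by TEXT NOT NULL,"
    " created_at TEXT NOT NULL)"
)

# Made-up records: each row is one "Project" created by a user.
conn.executemany(
    "INSERT INTO projects (created_by, created_at) VALUES (?, ?)",
    [
        ("alice", "2024-01-01 09:05:00"),
        ("alice", "2024-01-01 15:30:00"),
        ("bob", "2024-01-01 11:00:00"),
        ("alice", "2024-01-02 08:45:00"),
    ],
)

# Daily summary: how many projects each user created per day.
DAILY_SUMMARY = """
SELECT date(created_at) AS day, created_by, COUNT(*) AS projects_created
FROM projects
GROUP BY day, created_by
ORDER BY day, created_by
"""

daily_summary = conn.execute(DAILY_SUMMARY).fetchall()
for day, user, count in daily_summary:
    print(day, user, count)
```

Dropping the GROUP BY and selecting the raw rows instead would give you individual events rather than counts.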

Other functions of your application may behave as “set it and forget it” where your app is automating processes without requiring a user action each time. In this case, usage events are generated even though the user hasn’t logged in lately. Database queries may be the only means to collect this type of usage event.

Pros of database queries

  • The most comprehensive picture of usage

Cons of database queries

  • Your Engineering or Operations team has to retrieve the data for you


There’s no “silver bullet” to getting usage data.

In many companies, you have easy access to one type of usage data and not the others. And, no single source of usage data depicts your application’s usage in a comprehensive way.

Think of it as a journey. It’s best to get started with the data on hand. As you learn to make sense of it, and drive business results, you’re armed with the justification to get other types of usage data. Sometimes this means further instrumenting your product. In other cases, it’s simply about getting another team to help you access their data.

But the journey is well worth the effort, because usage data is the foundation of understanding the health of your customer.