We sat down with Dexter Horthy, product manager for Instance Insights, to ask him about the progress vendors are making in measuring what matters for their apps running in customer-managed (non-SaaS) environments.
How have you seen vendors start to adopt Instance Insights? I know you have your vision, but each of them has different metrics they care about. Are there any vendors you've seen do some really interesting things so far, or any surprising directions they've tried to take the insights?
That's a great question. It's been really interesting to see some of the workflows that vendors are building on top of Instance Insights. Things like taking the instance detail page, which gives a deep dive into what's been happening with one customer instance recently and everything about its current state, and linking that into their support workflows, their customer care processes, and even their notification, alerting, and on-call rotations.
That's been cool to see. And then we published this adoption report with a couple of key metrics that were our guesses, my theories, on “these things could evolve to be really good key metrics for on-prem, customer-hosted software.”

Then seeing people take some of those visualizations and turn them into insights we didn't even expect: using the adoption chart to help a product marketing team understand the success of a campaign around the launch of a new version, and how quickly users are adopting that new version.
It’s definitely been impressive and interesting to see how folks are taking these bits of data and turning them into new insights, new workflows, new feedback loops for teams beyond just the core development engineering team.
Who, with the vendors you've been working with, have you seen most interested or most responsible for collecting these metrics, and trying to come up with new initiatives based on them? How are you engaging with different vendors and who do you see taking charge?
It's been really interesting to find that the people who benefit the most from these higher-level aggregate views of data are, as you might expect, the ones who have large installed customer bases, people with 50, 100, 200 or more customers.
It's been really interesting to engage not just with the support leadership and engineering leadership who were commonly in the room; we're also getting plugged in with the analytics and analytics engineering teams.
We're getting plugged in with sales and professional services as we're able to give new and different types of insights about how the on-prem product is performing.
You recently announced custom metrics. Can you tell me a little bit in your own words? What's that all about?
Great question. Custom metrics, I'm super excited about. It's something people have been asking us for for a while, and it has been a blurry space. Basically, what we found was that there's a limit to what we can do by default, which is things like: “is the app up or down? What version is it on? What does the infrastructure look like? What's changed recently?”
What our vendors really want to know, what software vendors delivering into customer environments really want to know, are things about usage and scale. Is the customer using more? Are they using more workspaces, more builds, more projects, more active users? However they measure growth and usage, they want to either say, “Hey, look, that's a growing customer. We need to trigger an upsell motion.”
Or “hey, this customer is in trouble and we need to save them because their usage has plateaued or is dropping.”
They have these early signals in their SaaS and they're really suffering for not having it for their customer-hosted software. We wanted to deliver that same level of insight and that level of knowing how to respond and knowing where your blind spots are.
We're really excited to roll that out for both KOTS and the SDK. Most of the insights people want are the result of a SQL query against some database running in the customer environment: how many users were active in the last week, or how many workspaces have been created. Custom metrics gives you a framework to generate that data however you want and then send it to the Replicated in-cluster components on a regular interval, so you can track those changes over time and understand how customer usage is changing.
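The pattern he describes, running a SQL query against the app's own database and shaping the result into a small payload that gets shipped upward on an interval, can be sketched roughly like this. The table names, metric keys, and payload shape below are illustrative assumptions, not Replicated's actual SDK API:

```python
import json
import sqlite3
import time

def collect_custom_metrics(conn):
    """Run SQL against the app's own database and shape the results as a
    flat key/value payload. Metric names here are hypothetical examples."""
    week_ago = time.time() - 7 * 24 * 3600
    active_users = conn.execute(
        "SELECT COUNT(*) FROM users WHERE last_seen >= ?", (week_ago,)
    ).fetchone()[0]
    workspaces = conn.execute("SELECT COUNT(*) FROM workspaces").fetchone()[0]
    return {"active_users_7d": active_users, "workspace_count": workspaces}

# Demo with an in-memory database standing in for the app's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, last_seen REAL)")
conn.execute("CREATE TABLE workspaces (id INTEGER)")
now = time.time()
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, now), (2, now - 30 * 24 * 3600), (3, now - 3600)],  # one stale user
)
conn.executemany("INSERT INTO workspaces VALUES (?)", [(1,), (2,)])

# In a real deployment this payload would be POSTed to the in-cluster
# component on a timer; here we just serialize it.
payload = collect_custom_metrics(conn)
print(json.dumps(payload))  # → {"active_users_7d": 2, "workspace_count": 2}
```

The point of the shape is that the vendor owns the queries (their definition of "usage"), while the delivery mechanism and the time-series tracking happen on a fixed interval outside the app's own code.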
The people who are using it are really excited about it, and I can't wait to see what else people do with it.
Traditionally for applications, particularly Kubernetes applications, there's been a large focus on collecting metrics like performance, resource utilization, cluster availability. Why do you think now is the time for people to shift their focus to more business-oriented metrics versus straight activity metrics?
At its core, uptime, performance, these kinds of metrics do roll up, and they are truly business metrics, especially if you look at something like DevOps Research and Assessment (DORA) and Accelerate. Half of DORA's four key metrics, change failure rate and time to recover, are about reliability and uptime and these kinds of things.
I don't think it's as much about shifting focus away from performance and reliability toward other metrics. Our thesis here is really that when you deliver software into customer environments, you are exploding the degenerate case of SaaS, where you have exactly one version running at any given time, into all these different instances with all these different states.

There are new signals that can be measured, things like adoption rate and aggregate uptime, that give different pictures and different views into how your software is performing beyond the ones applicable to SaaS, like uptime and time to recover.
One area a lot of vendors have struggled with is applications which are installed into air gap or otherwise secured environments. Trying to get any visibility into that has been a challenge. How are you approaching that?
The hardest nut to crack for the customer-hosted story in general is air gap deployments. But the bottom line is that for our software vendors – for our customers, their customers – the ones out of their customer base that want or require an air gap solution tend to be the biggest, most secure, and most strategic of their customers.

We could be giving them insights for 80 percent of their customer base, and they're still flying blind with regard to the most important 20 percent.
There's an option to sign up to be a design partner on this right now, and we're going to continue rolling it towards beta and GA. We're building mechanisms to get at least some insight into what's going on with air gap instances by downloading anonymized data or diagnostic reports from an air gap instance and uploading them into the Replicated platform. We can then represent that data and those instances in the same reports as all your online instance data.
When you introduced instance insights almost a year ago, you had about a dozen or so metrics that were on your roadmap to produce. How have you prioritized those and big upcoming features?
As a product team at Replicated, we are always striving to be more and more customer focused and understand the value we can give customers.
The key to prioritizing is: we build a bunch of prototypes. We go and talk with a bunch of different customers. We try to figure out which ones are the most valuable and which ones solve the broadest number of problems with a single sort of feature or report or metric. Then we go build that, and then we come back to the drawing board.
I do sometimes find myself wanting to build things that people didn't quite know they wanted. While we are very customer driven, we are also making a few bets on what else the industry might benefit from that customers aren't directly articulating to us.
Any advice for vendors who are just starting to think about business metrics for software deployed in customer environments? Like, how should they get started?
I would look at where your pain points and fires are, and then look at all of the possible things you could measure; there are about 16 different things that we think are worth looking at.

Pick one, two, or maybe three to focus on for a month or even a quarter, and see whether it helps align your team. Figure out how to measure it, watch it consistently, and do retrospectives with your team.
If it works, continue with it. If it's wrong, try another one. Deciding what to measure is generally harder than actually figuring out how to measure it. It will probably take some iteration.