Install success rate is the next key metric we’ll focus on in the Instance Insights Series for measuring excellence in distributing customer-hosted software. In addition to time to install, tracking the rate at which you’re able to successfully complete the installation of an instance gives you insight into the quality of your software’s packaging and how prepared you and your customers are to begin the install.
A customer may encounter one or more failures while installing an application, and software vendors who look only at time to install could miss the opportunity to prevent future failures. Additionally, time to install only makes sense when measuring successful installations. Successful users may get the software running in under two hours, but if 75% of install attempts end in failure and the user gives up, valuable opportunities to improve your product, people, and processes are missed.
For the purposes of this examination, we’ll discuss a hypothetical Kubernetes application that end users install in their own environment, either in a shared cluster or in a purpose-built cluster designed to host only that application. We won’t make any assumptions about how the Kubernetes cluster is created, including cases when a vendor includes a Kubernetes installer as part of their application. While some of the notes are Kubernetes specific, most of these concepts should apply to any application that’s distributed into customer environments, whether as a Linux package, standalone binary, OVA, or other package format.
When beginning the process of surveying your install success rate, think about how many attempts it takes before an install is successful. What percentage of installs require more than one attempt? How often are installs successful on the first attempt?
Based on our (Replicated’s) limited anecdotal experience, best-in-class vendors have a 90% install success rate, averaging 1.1 installation attempts to get an instance up and running. We’re actively researching performance and working to establish more substantial data on this metric.
When defining this metric, you’ll need to become familiar with instance install statuses, and also determine what installation success means to you. What is your ideal state for your customer to be in, and how long does it take to validate that state?
While the statuses below can be recorded manually for air gap installs, the definitions provided apply specifically to online installs, since they are determined by instance activity and check-ins.
While you may define a successful install differently, at Replicated, we define it as an instance that reaches a ready state and stays there for at least 2 weeks. Defining success this way ensures that successful installs don’t include instances that become ready but fall into a failed or inactive state soon after. That scenario requires additional effort to get the instance back to a ready state before it can be considered delivered, and therefore should not be deemed a success. You should experiment with different values for this readiness threshold and see how it affects the insights your data might surface.
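As a concrete illustration, the readiness-threshold rule above can be sketched as a small classifier. This is a minimal sketch, not Replicated's implementation; the timestamp fields (`became_ready_at`, `left_ready_at`) are hypothetical names for data a vendor's telemetry might record.

```python
from datetime import datetime, timedelta

# The readiness threshold discussed above; experiment with this value.
READY_THRESHOLD = timedelta(weeks=2)

def classify_install(became_ready_at, left_ready_at=None, now=None):
    """Classify one instance against the readiness threshold.

    An install counts as a success only if the instance reached a ready
    state and stayed there for at least READY_THRESHOLD. An instance that
    dropped out of ready before the threshold is a failure; one that is
    still ready but hasn't hit the threshold yet is still in progress.
    """
    now = now or datetime.utcnow()
    if became_ready_at is None:
        return "not_ready"
    ready_until = left_ready_at or now
    if ready_until - became_ready_at >= READY_THRESHOLD:
        return "success"
    return "failed" if left_ready_at else "in_progress"
```

For example, an instance that became ready on January 1 and is still ready on January 30 classifies as a success, while one that became ready on January 20 is still in progress at that point.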
Install success rate is a survey of the number of instances that succeeded versus failed; the other statuses are excluded because their end state is yet to be determined.
Example: Using the install timeline above, 3 installs were attempted, with 1 success and 2 failures. For this individual customer, the install success rate is 33%.
If the vendor had 9 additional installs that were successful on the first attempt, the total install success rate would be 10 successful installs / 12 total attempts = 83% install success rate, or 1.2 attempts per install.
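The worked example above can be reproduced with a short calculation. This is an illustrative sketch assuming a simple record per attempt with a `status` of `"success"`, `"failure"`, or `"in_progress"`; the customer names are made up.

```python
from dataclasses import dataclass

@dataclass
class InstallAttempt:
    customer: str
    status: str  # "success", "failure", or "in_progress"

def install_success_rate(attempts):
    """Return (success_rate, attempts_per_successful_install).

    In-progress attempts are excluded, since their end state is unknown.
    """
    completed = [a for a in attempts if a.status in ("success", "failure")]
    successes = sum(1 for a in completed if a.status == "success")
    if successes == 0:
        return 0.0, None
    return successes / len(completed), len(completed) / successes

# The example from the post: one customer with 2 failures and 1 success,
# plus 9 other customers who succeeded on the first attempt.
attempts = (
    [InstallAttempt("acme", "failure"), InstallAttempt("acme", "failure"),
     InstallAttempt("acme", "success")]
    + [InstallAttempt(f"cust{i}", "success") for i in range(9)]
)
rate, per_install = install_success_rate(attempts)
print(f"{rate:.0%} success rate, {per_install:.1f} attempts per install")
# → 83% success rate, 1.2 attempts per install
```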
Knowing, or being able to estimate, your current install success rate is important for setting a realistic goal. If you aren’t currently measuring install success rate, but you have a team that assists your customers with installs, they may be able to estimate, on average, how many install attempts it takes before one is successful. While estimates and general experience aren’t reliable data points, they’re a good place to start if you don’t have data yet.
There are several areas where it can be valuable to segment and evaluate install success rate to help identify specific areas for improvement. If you only look at the metric at a high level across all instances, it can be difficult to determine why your rate of success isn’t where you would like it to be. We’ll go into more depth on segmenting data in a future post, but we’ll use the example of online vs. air gap installs to help show the impact it can have on determining if there is a specific area that is negatively impacting your overall install success rate.
You can use the information gathered during the evaluation of your current install success rate to establish your goals and target specific areas for improvement.
We recommend setting both a short- and a long-term goal, as well as a specific and an overall goal, for install success rate. These should be grounded in your current (or estimated) install success rate. If you gathered detailed information in the evaluate stage, you should be able to pinpoint an area for a specific goal while keeping it attainable. The specific goal should, in turn, help you reach your overall goal.
While your primary goal is the overall install success rate, you will also want to measure this metric in subsections to ensure you track progress toward your specific goal as well as the other areas you analyzed during the evaluate stage. Segmenting data in addition to looking at it as a whole will help you determine whether you missed, met, or exceeded your goal(s) and what areas may have impacted the results.
In the example chart below, you’ll see that this vendor chose to focus on the success of online versus air gap installs in addition to their overall install success rate. The install success rate is calculated as the percentage of total install attempts (not including ones still in progress) that were successful. The in-progress install attempts should be noted, but they are not included in the calculation because it is unknown at the time of measurement whether they will end in failure or success.
This example shows a much lower success rate for air gap installs, and this vendor should focus on adjustments to improve the success of those specific install types.
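The segmented calculation described above can be sketched as follows. This is an illustrative example, not the vendor's actual data pipeline; the segment labels (`"online"`, `"airgap"`) and the sample numbers are assumptions chosen to mirror the scenario of a weaker air gap segment.

```python
def segmented_success_rates(attempts):
    """Compute a success rate per segment plus an overall rate.

    `attempts` is a list of (segment, status) pairs, where status is
    "success", "failure", or "in_progress". In-progress attempts are
    noted elsewhere but excluded here, since their end state is unknown.
    """
    totals = {}  # segment -> (completed_count, success_count)
    for segment, status in attempts:
        if status == "in_progress":
            continue
        done, ok = totals.get(segment, (0, 0))
        totals[segment] = (done + 1, ok + (status == "success"))
    rates = {seg: ok / done for seg, (done, ok) in totals.items()}
    all_done = sum(d for d, _ in totals.values())
    all_ok = sum(o for _, o in totals.values())
    rates["overall"] = all_ok / all_done
    return rates

data = (
    [("online", "success")] * 9 + [("online", "failure")]
    + [("airgap", "success")] * 2 + [("airgap", "failure")] * 3
    + [("airgap", "in_progress")]
)
rates = segmented_success_rates(data)
# online: 0.9, airgap: 0.4, overall: 11/15 ≈ 0.73
```

A breakdown like this makes the pattern in the chart concrete: a strong overall rate can hide a segment, such as air gap installs, that is dragging the average down.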
When measuring install success rate over time, you can see the impact that segments with a lower success rate have on your overall performance and how making adjustments to them can help you reach your goals.
Install success rate is highly impacted by the pre-install steps that come before you hit enter on the command line. How prepared you and your customers are to begin the installation is key to its success. The adjustments below are focused on being prepared and can help you increase the success of your installs:
Adjustments made to improve your install success rate can also have a significant impact on other metrics, most notably your time to install, and improvements made with the intention of impacting one may also affect the other. As you make adjustments, you’ll want to look at what impact any changes had on your install success rate and take them into account as you go back through the process and continue working towards reaching your goals.
At Replicated, we are making progress on improving metrics and telemetry within our product to help ISVs understand the health and performance across customer instances and help with measuring metrics, including install success rate. In addition to working on providing information on individual instance install statuses and rolling data up into customer and channel overviews, Replicated also has resources that can be used to make adjustments to improve your install success rate: