Replicated Runthrough: Troubleshoot 101

Treva Williams | Apr 29, 2021
Replicated Troubleshoot

Greetings, and welcome to part two of the Replicated Runthrough series. We’ll be covering Troubleshoot.sh, a Kubernetes plugin for (you guessed it) troubleshooting host and cluster-level issues. Part of a suite of Kubernetes cluster maintenance tools by Replicated that includes (but is not limited to) Outdated, Unfork, and KOTS, Troubleshoot is a highly customizable, powerful, yet lightweight framework for collecting and analyzing cluster information. It is also intelligent enough to automatically redact sensitive cluster information like passwords, secrets, and keys.

Troubleshoot is made up of two separate CLI components, preflight and support-bundle. The first, preflight, is incredibly useful for streamlining the application delivery process: its preflight checks identify host incompatibilities early.

The second component, support-bundle, helps to eliminate “language barriers” between clients and support teams by quickly and securely extracting as little or as much cluster info as needed and compressing it into a lightweight, portable .tar.gz archive. As an added benefit, both preflight and support-bundle checks are written in YAML, which is already familiar to Kubernetes users, meaning there’s basically no learning curve. All you need is kubectl connected to a Kubernetes cluster.

preflight

[.pre]
apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
  name: my-application-name
spec:
  analyzers: []
[.pre]

As mentioned earlier, preflight is what you run when you want to make sure that your client isn’t running K8s v0.7 when your application requires something newer than v1.15, or something else equally terrible. preflight provides access to checks (called analyzers) for host information like storage availability, cloud provider, available RAM, and CPU, and can also check overall cluster details like node count. This eliminates the headache of trying to debug an environment incompatibility by identifying it before your app is even launched.

You (or your team) will write out a YAML definition for your check, declaring which analyzers you want to run on the host. You then share that definition with your client, who runs it on their own host using the kubectl preflight plugin. After a few seconds, preflight will tell them whether or not their environment is compatible with your software.
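To make the idea of analyzers a bit more concrete, here is a minimal sketch of what a filled-in preflight check might look like. It uses the clusterVersion and nodeResources analyzers; the version threshold, node count, and messages are illustrative assumptions, not requirements that come from Troubleshoot itself.

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
  name: my-application-name
spec:
  analyzers:
    # Fail when the cluster is older than the (assumed) minimum version.
    - clusterVersion:
        outcomes:
          - fail:
              when: "< 1.16.0"
              message: This application requires Kubernetes 1.16.0 or later.
          - pass:
              message: Your cluster runs a supported version of Kubernetes.
    # Fail when the cluster has fewer nodes than the (assumed) minimum.
    - nodeResources:
        checkName: Node count
        outcomes:
          - fail:
              when: "count() < 3"
              message: This application requires at least 3 nodes.
          - pass:
              message: This cluster has enough nodes.
```

Saved as preflight.yaml (a hypothetical filename), the client could run this check with `kubectl preflight ./preflight.yaml`.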

support-bundle

[.pre]
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: my-application-name
spec:
  analyzers: []
[.pre]

The second component of Troubleshoot, support-bundle, is what you pull out when your client calls and tells you that something is on fire. As with preflight, you or your team will write out a YAML file specifying collectors, which can pull system info or even inject data, depending on which collectors are used. Using the kubectl support-bundle plugin, the client runs it on their end and is supplied with an archive, which can then be sent to you (or your team) for quick diagnosis. support-bundle collectors can pull everything from basic Kubernetes cluster info and logs to database information and tons of other crucial data, while automatically redacting sensitive customer info, all without the client needing to install additional software. The collected data is automatically compressed into an easy-to-share tar.gz archive that can quickly be passed over to support teams to make it easier to get your client back up and running.
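As a rough illustration of what such a collector list might look like, the sketch below gathers cluster info, cluster resources, and pod logs. The label selector, namespace, and line limit are hypothetical values chosen for the example; adjust them to match your own application.

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: my-application-name
spec:
  collectors:
    # Basic information about the cluster, such as its version.
    - clusterInfo: {}
    # Kubernetes resources in the cluster (pods, deployments, and so on).
    - clusterResources: {}
    # Logs from pods matching a label selector (hypothetical label and namespace).
    - logs:
        selector:
          - app=my-app
        namespace: default
        limits:
          maxLines: 10000
```

Saved as support-bundle.yaml (again, a hypothetical filename), the client could generate the archive with `kubectl support-bundle ./support-bundle.yaml`.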

Both preflight and support-bundle begin with a collect phase that scrapes default data like cluster version, node status, and pod status, along with the output of any other collectors specified in your customized check. To make the troubleshooting process even more streamlined, collectors and analyzers can be doubled up in the same check, as shown in the example below.

[.pre]
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: supportbundle-tutorial
spec:
  collectors: []
  analyzers: []
[.pre]
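With both lists filled in, a combined check might look something like the sketch below, where a collector gathers cluster resources and an analyzer inspects a deployment's status. The deployment name my-app, the namespace, and the messages are assumptions made up for this example.

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: supportbundle-tutorial
spec:
  collectors:
    # Collect cluster resources so the analyzer below has data to inspect.
    - clusterResources: {}
  analyzers:
    # Evaluate the ready replica count of a (hypothetical) deployment.
    - deploymentStatus:
        name: my-app
        namespace: default
        outcomes:
          - fail:
              when: "< 1"
              message: The my-app deployment has no ready replicas.
          - pass:
              message: The my-app deployment has at least one ready replica.
```

The analyzer results travel with the bundle, so whoever opens the archive sees both the raw data and a first-pass diagnosis.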

There are a million different options available for Kubernetes troubleshooting tools, but I doubt you’ll find one as versatile, yet lightweight, as Troubleshoot.

But you don’t have to take my word for it. If you’d like to see it in action, head on over to troubleshoot.sh for more examples, install instructions, documentation, and more. Check out our previous edition, kURL 101, in the meantime, and be sure to stay tuned to the Replicated blog for the next edition in the Replicated Runthrough series. See you then!