How Replicated Ship Works

Replicated Ship enables Kubernetes cluster operators to deploy externally-developed software, both Open Source Software (OSS) and Commercial Off The Shelf (COTS) software, in a new way. This approach is based on a few foundational concepts that change what is possible.

Patches, Not Templates

Ship is based on a different type of configuration management that eschews the common practice of using templating to generate custom application manifests (YAML) that can then be deployed to a cluster. In the Kubernetes world, application config is represented and delivered as YAML files that implement parts of the Kubernetes API.

Most Kubernetes package managers (Helm, Duffle, and others) operate using the same core concept: the package maintainer creates a generic, reusable, templated version of the configuration manifests. Any installation-specific data is removed from the YAML and replaced with a template function, to be substituted in by the cluster operator during installation.
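
For example, a Helm chart might template the replica count and image tag of a Deployment, leaving the cluster operator to supply values at install time. This is a minimal illustrative sketch; the names and values are not from any specific chart:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {{ .Release.Name }}-api
    spec:
      replicas: {{ .Values.replicaCount }}
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"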

Instead of templating, Replicated Ship is tightly integrated with Kustomize, the patch-native configuration tool that is becoming part of Kubernetes. While patching is growing in popularity as a way to incorporate differences between environments (dev, test, and prod) when deploying internally-written applications, the benefits of patching over templating are amplified when the author of the base YAML (the package maintainer) is different from the author of the patches (the cluster operator).
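
A minimal sketch of the Kustomize model, with illustrative file names and values: the maintainer's YAML stays untouched in a base directory, and the cluster operator's changes live in a small patch that Kustomize merges at build time.

    # kustomization.yaml -- points at the maintainer's unmodified YAML
    bases:
      - ../base
    patchesStrategicMerge:
      - replica-patch.yaml

    # replica-patch.yaml -- contains only the fields being changed;
    # Kustomize matches it to the base resource by kind and name
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 5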

Application Config / Runtime Config

When packaging an application that will be installed, configured and operated in unknown environments, the package maintainer has to build support for all possible application configuration parameters. Templating is often used for this, and can still be a viable option. The package maintainer has a known, finite number of application configuration parameters available, and can make a decision about the best way to expose these to the cluster operator to configure at deploy time.

Additionally, when distributing software for Kubernetes, package maintainers must specify the runtime configuration for the Kubernetes API. Unlike application configuration, the runtime configuration options are neither known nor finite. The Kubernetes API is extensive and flexible enough to support varied enterprise needs, and it can be (and often is) extended using Custom Resource Definitions (CRDs). A package maintainer cannot build support for every possible CRD into a package.

The Kubernetes API is extensive. For example, at the time of this writing (Kubernetes 1.13), a valid Pod spec contains at least 30 distinct child attributes. The Pod spec has been part of the core API since the first release of Kubernetes and has steadily grown to this size; future releases will likely add more fields, and new specs will continue to appear.

Patches are the only practical method to support an extensible API when packaging software that must run in varied, unknown environments.
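
For example, if a cluster schedules workloads using taints and node labels the package maintainer never anticipated, the operator can add that runtime configuration as a patch rather than waiting for the maintainer to expose a template parameter. This is an illustrative sketch; the field values are assumptions:

    # scheduling-patch.yaml -- adds runtime config the package
    # maintainer did not (and could not) template
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      template:
        spec:
          nodeSelector:
            disktype: ssd
          tolerations:
            - key: "dedicated"
              operator: "Equal"
              value: "app-team"
              effect: "NoSchedule"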

Differences in the SDLC of Externally Developed Software

Traditionally, packaged and open source software has not had a software development lifecycle (SDLC) that's as clearly defined and repeatable as that of internally-written software.

A GitOps workflow is commonly used to deploy internally-written software to Kubernetes clusters. This starts with a commit to the source control repository, which automatically triggers a Continuous Integration (CI) process. This is where the deployable artifacts are built, tested, and validated before any deployment process begins. When the CI process completes successfully, the artifacts are published. For Kubernetes applications, this includes pushing Docker images to registries and storing ready-to-deploy Kubernetes YAML in a git repository. A separate tool then syncs that YAML into the cluster, and Kubernetes pulls the images as part of starting the application.

Mapping the GitOps workflow to externally-developed software is possible, but requires a few additional steps. Ship fills these gaps and makes it easy to deploy this software using GitOps.

To deploy externally-developed software using GitOps, a new source control repository is required to store a copy (or fork) of the upstream application. This repository both stores your changes to the YAML and serves as the source for a CI process. A new CI job must also be written that, triggered on each commit, generates the deployable assets (Docker images and Kubernetes YAML) and commits them into the same git repository that the GitOps tool syncs from.
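
A hypothetical sketch of that per-application CI job; the tool names, paths, and branch are assumptions for illustration, not part of Ship:

    # Hypothetical CI job for one application, triggered on each commit
    helm template ./upstream-copy --values values.yaml > rendered/app.yaml
    git add rendered/app.yaml
    git commit -m "Render upstream application YAML"
    git push origin master   # the GitOps tool syncs from this branch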

Using Ship to prepare and manage externally-developed software eliminates the need to set up this additional repo and CI process for each application. Ship also automatically creates pull requests into the GitOps repo when the application is updated, removing the manual work of downloading, editing, merging, and pushing that would otherwise have to be repeated per application.

SDLC of Externally Developed Software

Ship defines and provides separate workflows for the various phases of managing externally-developed software: installing new software requires a different workflow than updating it, changing a configuration is different from installing a hotfix, and watching for and automatically triggering updates is part of the SDLC as well.

Ship introduces three distinct modes to manage the SDLC of preparing externally-developed software for delivery to Kubernetes:

init

The ship init functionality solves the first part of the lifecycle of externally-developed software: discovering how the application works and providing the initial configuration. This process is manual and does not deploy anything to the cluster; it involves the cluster operator gaining an understanding of the application YAML, the configuration options available, and the changes required to make the application compatible with the target environment.
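
For example, pointing ship init at an upstream Helm chart starts the guided workflow. The chart URL here is illustrative, and exact outputs may vary by Ship version:

    # Begin the guided configuration workflow against an upstream chart
    ship init github.com/helm/charts/stable/grafana

    # Ship renders the base YAML, records the operator's choices in a
    # state file (.ship/state.json), and scaffolds a directory for
    # Kustomize overlays and patches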

watch

The ship watch functionality solves the update-notification step of the SDLC. Ship watch maintains a constant check of the upstream application and can trigger events (generally ship update) when the upstream application is updated.

update

The ship update functionality automates the process of merging all upstream changes with the custom configuration supplied by the cluster operator (as patches and overlays). This can be done automatically because the patches are maintained separately from the base, which guarantees there will be no merge conflicts when updating an application.
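
Together, watch and update can run unattended. A minimal sketch, with the caveat that invocation details may vary by version:

    # Block until the upstream publishes a change, then pull those
    # changes and re-apply the operator's patches on top of the new base
    ship watch && ship update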

Helm Compatibility

Helm is the most popular Kubernetes package manager today. As a result, there are thousands of Helm charts available to install, and Helm is the most commonly used tool to package and distribute open source software for Kubernetes. Helm is built on templating and relies heavily on the package maintainer replacing every configuration option with template code.

Ship maintains compatibility with this large supply of Helm applications while removing the need to run Helm (and Tiller) or any other package-specific runtime in the cluster. Ship can easily consume a Helm application, then watch and update it by pushing updates into standard SDLC pipelines instead of forcing the creation of a new one.

Ship does this by running helm template, with your configuration values, to generate plain Kubernetes YAML. This YAML is patchable and deployable using standard tooling such as kubectl or any other Kubernetes-compatible deployment toolset. There's no need to run Tiller or use a separate process to deploy Helm applications; with Ship, Helm applications are treated identically to plain Kubernetes YAML.
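
Done by hand, the equivalent steps look roughly like this. This is a sketch of the flow Ship automates; the paths and overlay name are illustrative:

    # 1. Render the chart to plain Kubernetes YAML (no Tiller required)
    helm template ./chart --values values.yaml > base/app.yaml

    # 2. Apply the operator's Kustomize patches over the rendered base
    kustomize build overlays/ship > rendered.yaml

    # 3. Deploy with standard tooling
    kubectl apply -f rendered.yaml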

Migration (unfork)

Most cluster operators already have open source and other externally-developed software configured and running on their clusters. Because there was no generally available way to patch this software with the desired runtime configuration until now, much of it was configured by forking the upstream chart or manifests and making all of the necessary changes directly.

Ship has a supported migration path to unfork existing forks or edited copies, extracting the changes as targeted patches so that the fork is no longer needed. An existing fork contains all of the YAML required to run the application; the cluster operator has to maintain it and manage the process of merging in changes from the package maintainer, which can be tedious because of merge conflicts. After running the unfork command, the cluster operator goes from managing the entire application YAML to managing just the specific changes needed to make the application compatible with the target environment, resulting in significantly fewer lines of YAML to maintain.
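
A sketch of the migration; the fork path and upstream URL are illustrative, and invocation details may vary by version:

    # Compare the edited fork to the original upstream and extract the
    # differences as Kustomize patches, so the fork can be retired
    ship unfork ./my-forked-chart --upstream github.com/helm/charts/stable/grafana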