Replicated Runthrough: SchemaHero 101

Treva Williams | Jun 29, 2021

Greetings, and welcome to our final issue of Replicated Runthrough, starring the coolest project you’ve probably already seen a few things about: SchemaHero. Like almost everything else in the Replicated catalog, SchemaHero is an open source tool created to simplify a task that, at its mere mention, strikes trembling fear in the hearts of even the most experienced administrators: database schema migration. Back in The Olde Days, before containerization became the industry standard, mention of any task having to do with MySQL/Redis/Postgres, no matter how simple (though “simple” generally isn’t a word that’s used when describing DBA work), would trigger a Chorus of Nope that even the most ambitious project manager knew not to challenge.

To fully understand how impactful SchemaHero is, you’ll first need to understand what a database schema is and what a schema migration entails. Think of a database as a museum with any number of wings, each filled with dozens, hundreds, even thousands of priceless, mostly irreplaceable artifacts, with more being added every minute. Those artifacts represent your organization’s data, and the patrons visiting the museum represent the users accessing it. (Hopefully, you see where I’m going with this.)

The database schema in this analogy would be a live-updated blueprint of the entire museum, carefully recording every room and every item in it in great detail. Now, imagine that the entire museum, building and all, needs to be moved piece by piece to a new location using only the blueprint as a guide, all while still entertaining guests. That’s basically what a schema migration is.

If that blueprint were to fall out of date or – heaven forbid – were missing information on so much as a closet lightbulb, the entire migration process would fail spectacularly. Now picture a single person – say, the museum curator – attempting to move the entire museum contents piece by piece, room by room, all by themselves while still attempting to take in new items and direct patrons through the halls.

That’s a lot going on, so it isn’t a stretch to imagine that something gets misplaced along the way, which triggers a Museum Meltdown. I’m talking total chaos: now-unframed paintings are fluttering about the halls, the formerly well-behaved 3rd graders on an accompanied tour are suddenly alone and armed with Nerf guns filled with a seemingly endless supply of neon orange paint, and a horde of sentient guinea pigs has absconded with several Fabergé eggs with the intention of using them to somehow create the world’s most expensive omelette, all while the once dignified museum curator is huddled in a corner sobbing. While the analogy is intentionally ridiculous, it’s still a pretty accurate account of a support floor in the event of a failed migration, except everyone is crying.

This is where SchemaHero comes to the rescue (pun intended), as the 5-star rated, highly recommended moving company that also specializes in artifact preservation, architecture, and building de/reconstruction, using sophisticated yet extremely user-friendly automated tooling to get the migration done so quickly that hardly anyone notices the change in location.

SchemaHero is a Kubernetes Operator that converts schema definitions into migration scripts that can be run virtually anywhere. To link it to the museum migration, imagine that the SchemaHero moving company, after walking through the entire building to categorize each item, hands the formerly overwhelmed museum curator a tablet that enables them to move an entire wing of the museum to literally any location in the world with a swipe of their finger. Even better, the same tablet can be used for all future moves, and can even switch out displayed artworks if need be.

Getting back on topic, SchemaHero is installed as a Kubernetes Operator using the kubectl-schemahero plugin. The Operator is deployed in the schemahero-system namespace along with three new Custom Resource Definitions. Once the Operator is running, it needs to be connected to the cluster’s database(s) by deploying a Database object, a custom resource based on one of those new definitions. The cluster administrator writes a manifest defining that object, which can then be referenced in tables and migrations.

[.pre]apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: testdb
spec:
  connection:
    postgres:
      uri:
        valueFrom:
          secretKeyRef:
            name: postgresql-secret
            key: uri[.pre]
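
To give a feel for what comes next, here is a minimal sketch of a table definition that points at the testdb object above. The artifacts table and its columns are made up for this example (keeping with the museum theme), so treat it as an illustration of the general shape rather than something lifted from a real deployment.

[.pre]# Hypothetical example: the table name and columns are assumptions for illustration only.
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: artifacts
spec:
  # Must match metadata.name of the Database object defined above.
  database: testdb
  # Name of the table inside the database.
  name: artifacts
  schema:
    postgres:
      primaryKey: [id]
      columns:
        - name: id
          type: integer
        - name: name
          type: varchar(255)
          constraints:
            notNull: true[.pre]

Changing the table later should then be a matter of editing this file, for example by adding another entry under columns, and applying it again, leaving the Operator to work out the migration needed to get from the old layout to the new one.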

Once that’s done, terrifying tasks such as creating, editing, or dropping tables, columns, and keys, indexing, and more can be handled using YAML definitions, all thanks to one handy little kubectl plugin. In addition to unofficially being credited with lowering the collective heart rates of DevOps engineers and sysadmins across the globe, SchemaHero also has the honor of officially being accepted by the Cloud Native Computing Foundation as a sandbox project. If you’d like to know more about it and how you can get involved, head on over to https://schemahero.io or follow the project on Twitter @SchemaHero to stay up-to-date on what’s happening.