I returned late last week from the inaugural CoreOSFest in San Francisco. I had the privilege of spending two days with some of the best minds on the bleeding edge of the internet.
If you haven’t heard about it yet, CoreOS is a new breed of operating system that is leading the charge in facilitating containerized workloads. It turns a lot of long-held technology concepts upside down, with a fully automated update system, no package manager, full reliance on the new init suite, systemd, and an intentionally clustered architecture.
Now, I’m a systems administrator with a good deal of development background. Let’s just say I’ve answered the pager at 3am enough times to be more than a little nervous about bleeding-edge technologies. As a technologist, I love them. I think everyone likes playing with new toys. But I’ve long since outgrown the impulse to jump on the latest tech for my next project without some really compelling reasons.
Formerly Jaded Sysadmins, Unite!
I want you to have that context in mind when I tell you that CoreOSFest was like being at a conference of formerly jaded sysadmins who all got so fed up with the status quo, with getting paged at all hours of the day and night, and with dealing with crappy deploy system after crappy deploy system that they threw it all out and started over. And they’re all really, really excited about it.
The CoreOS community has all the enthusiasm of any new tech product, but the problems this community is engaging with and the approach they are taking are unique. Two related themes permeated the presentations and discussions I participated in while I was there.
- First, everyone there rejected the concept of "because that’s how we've always done it." Just because something like System V init has worked very well for 32 years doesn’t mean it shouldn’t be torn apart and reinvented.
- The second major takeaway was everyone’s commitment to reducing complexity. This is dramatically more difficult in technology than people give it credit for and is directly related to being comfortable rejecting the status quo. When most developers or systems administrators encounter a problem, their first reaction is to build something to solve that problem. They write a snippet of code, open a new GitHub repository, or in some other way *add another piece of functionality*.
This makes sense. We’re all builders of one sort or another so building a solution to a problem fits with what we all do.
However, there is another side to that coin: whenever you add something to a system you add another piece that has the potential to break. It’s another piece that needs to be maintained when the next security vulnerability is announced. It’s more custom code that makes your system diverge from any standards or best practices. It’s more complex.
This is why most experienced systems administrators are inherently reluctant to do anything non-standard. Developers frequently get frustrated with this. “C’mon, it’s just one little piece, what’s the big deal?” Of course, they’re not the ones who get the alerts when things go offline on Christmas day.
How do you fix that? That is the fundamental question that the developers at CoreOS are wrestling with.
How do you build the next generation of the internet without increasing complexity?
The solutions to date have all been about managing increasing complexity. For example, compiling code from source became cumbersome, so we invented package managers to manage dependencies and install more software more easily. But that never addressed the underlying problem at all. You still need to update that software and if you ever need to scale that software you need to do it all over again on multiple machines.
The traditional solution to that is some sort of configuration management tool. Use Puppet, or Chef, or Ansible to interact with your package manager to make sure your software dependencies all line up properly. This all lives in a grey area between writing code and maintaining infrastructure. It’s led to a whole new role we call DevOps, or SysOps, or Deployment.
The CoreOS Approach
CoreOS’s solution is to say that managing that complexity should not be the problem of an operating system whose job is to scale your workload and serve your traffic. Managing software complexity and running workloads are two different problems and shouldn’t be conflated. CoreOS gives you the bare minimum you need to schedule and start containerized workloads and leaves the rest to however you want to build and deploy your software. In doing so it has made the operating system dramatically simpler.
And CoreOSFest was a congregation of people with whom that methodology resonates.
Recently I’ve had several great discussions with Project Calico, “a layer 3 approach to virtual networking”. They want to do away with the ubiquitous overlay network that comes attached to all virtualized platforms. In short: why build an overlay when we can do this more simply? Christopher Liljenstolpe of their team even references what he calls the “3am” test. Don’t deploy anything to production that can’t be troubleshot by the guys available at 3am without waking you up.
Diego Ongaro gave a brilliant talk about his work on the Raft consensus algorithm. Consensus is a problem that has plagued computer science PhDs for literally decades. That project’s stated goal? “Raft is a consensus algorithm that is designed to be easy to understand.”
The CoreOS team announced a new project (Ignition) designed to make the entire boot process simpler and more streamlined. It takes a bigger project, coreos-cloudinit, and breaks it out into multiple stages, allowing machine provisioning to deal with configuration of drives and hardware in one phase and services in a different phase, instead of hanging both off of one tool.
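To make the idea of staged provisioning concrete, here is a minimal sketch of the general pattern in Python. This is not Ignition's actual API or configuration format. The stage names, the `Provisioner` class, and the two example steps are all hypothetical names I made up for illustration. The only point is that disk and hardware setup always runs in its own phase before any service setup, no matter what order the steps were declared in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage names for illustration only; not Ignition's real stages.
STAGES = ("disks", "services")


@dataclass
class Provisioner:
    """Runs provisioning steps grouped into fixed, ordered stages."""

    steps: Dict[str, List[Callable[[dict], None]]] = field(
        default_factory=lambda: {stage: [] for stage in STAGES}
    )

    def register(self, stage: str, step: Callable[[dict], None]) -> None:
        # Reject stages that are not part of the fixed pipeline.
        if stage not in self.steps:
            raise ValueError(f"unknown stage: {stage}")
        self.steps[stage].append(step)

    def run(self, machine: dict) -> dict:
        # Stages execute in the order given by STAGES, not registration order.
        for stage in STAGES:
            for step in self.steps[stage]:
                step(machine)
        return machine


def format_root_disk(machine: dict) -> None:
    # Stand-in for real disk and hardware configuration.
    machine["disks"] = ["/dev/sda formatted"]


def start_web_service(machine: dict) -> None:
    # Services can assume the disk phase has already completed.
    if not machine.get("disks"):
        raise RuntimeError("disks must be provisioned before services")
    machine["services"] = ["web"]


provisioner = Provisioner()
provisioner.register("services", start_web_service)  # declared first on purpose
provisioner.register("disks", format_root_disk)
result = provisioner.run({})
```

Even though the service step is registered first, the disk step runs before it, which is the kind of separation of concerns the staged approach is after.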
An Internet With Fewer Features
This is why I say that the internet will have fewer features. Because at the end of the day what we need is standardized, easily deployable, scalable systems. Full stop. We don’t need increasingly complex methods to deploy code, with obscure dependency resolution managers installed at the server level. We don’t need more options. We need fewer.
We’re moving past the days where we take an operating system off the shelf and plug things into it until it can serve our application, and into the days of purpose-built OSes that push software complexity into the realm of software engineers.