Why Baker?

After 10 years in industry and teaching nearly 1000 students in various software engineering courses, including a specialized course on DevOps, I saw one common problem that has not gotten better. It was striking how difficult and widespread the seemingly simple problem of installing and configuring software tools and dependencies was for everyone. The combination of versions and variations in tools, programming languages, services, and operating systems creates immense complexity.

This is when I started teaching the idea of computing environments.

Computing environments are for running code, not writing code.

To use a computing environment, you use your host operating system to write code, interact with the running program, and visualize its execution. The code itself, however, runs in a headless virtual machine (or container). To accomplish this, we use a set of tools that map files and programs between your host environment and the computing environment.

To build computing environments, we started with a mix of tools including vagrant, docker, and ansible. After a semester of learning these tools, students were able to do great things. However, the training required to learn these tools is a large investment of time that most people not specializing in software engineering or infrastructure do not have.

Baker Design Goals

We built Baker to allow anyone to create their own computing environments without needing to know specialized tooling, and to allow users to get up to speed quickly. We tried to take the best of all the tools that inspired us and package it into one simple experience.

Bakelets are modules that install components such as a language (nodejs, python, java), tool (latex, jupyter), or service (mysql, docker, neo4j). So far, we've created dozens of different computing environments for different research and software projects just by composing bakelets. Our goal is to provide a set of curated components that make it easy to build the most common types of computing environments. We also support custom bakelets. Long term, we plan to integrate our research on auto-discovering dependencies so that they can be set up as needed.
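The idea of composing bakelets can be illustrated with a small Python model. This is only a sketch and not Baker's actual implementation or configuration format: the CATALOG table, the Environment class, and the install order (languages, then tools, then services) are all hypothetical, with the bakelet names taken from the examples above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical catalog: category -> bakelet names mentioned in the text.
CATALOG: Dict[str, List[str]] = {
    "lang": ["nodejs", "python", "java"],
    "tools": ["latex", "jupyter"],
    "services": ["mysql", "docker", "neo4j"],
}


@dataclass
class Environment:
    """A computing environment described as a composition of bakelets."""
    name: str
    bakelets: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, category: str, bakelet: str) -> "Environment":
        # Reject anything that is not in the curated catalog.
        if bakelet not in CATALOG.get(category, []):
            raise ValueError(f"unknown {category} bakelet: {bakelet}")
        self.bakelets.setdefault(category, []).append(bakelet)
        return self

    def plan(self) -> List[str]:
        # Assumed install order: follow the category order of CATALOG.
        return [f"{cat}/{b}" for cat in CATALOG for b in self.bakelets.get(cat, [])]
```

For example, Environment("analysis").add("services", "mysql").add("lang", "python").plan() lists the python bakelet before the mysql one, regardless of the order in which they were added.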

Baker also tries to include sane defaults for everything. For example, if you include a python or nodejs bakelet, we will automatically run the package managers associated with your project. We create a shared folder for your code. You can easily ssh into your environment with baker ssh.
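One way to picture the package manager default is as a lookup from manifest files to install commands. The sketch below is an assumption about how such a default could work, not a description of Baker's code: the MANIFESTS table and the install_commands helper are hypothetical names, and only two common manifests are covered.

```python
from pathlib import Path
from typing import Dict, List

# Assumed mapping from a project manifest file to the command that installs it.
MANIFESTS: Dict[str, List[str]] = {
    "package.json": ["npm", "install"],
    "requirements.txt": ["pip", "install", "-r", "requirements.txt"],
}


def install_commands(project_dir: str) -> List[List[str]]:
    """Return the install commands implied by the manifests found in project_dir."""
    root = Path(project_dir)
    return [cmd for manifest, cmd in MANIFESTS.items() if (root / manifest).is_file()]
```

The function only decides what to run. Actually executing the commands inside the environment is left out on purpose.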

Finally, one technical goal we have with Baker is to minimize the number of system dependencies needed. On Mac/Linux, Baker has no dependencies: it is possible to create a computing environment without any virtualization tools (such as VirtualBox).

Baker Implementation

Baker started out like many other tools: we wrapped existing tools and scripted together a common workflow. Over time, however, as we came to understand how these tools worked and what we needed from them, we were able to replace their functionality with a simpler re-implementation. For example, we initially used vagrant to help provision VirtualBox machines. But at its core, vagrant is simply making calls to VBoxManage. We were eventually able to replace vagrant with a new npm module, node-virtualbox. As a result, we were able to greatly reduce the time needed to provision machines and avoid several issues and bugs with vagrant. The current implementation of bakelets uses a mixture of nodejs and ansible scripts. Over time, we believe we could reduce or replace ansible as well. Again, ansible at its core is simply making ssh calls with the help of python scripts copied over to the target server.
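To make the point about VBoxManage concrete, here is a minimal Python sketch of the kind of calls a provisioning wrapper ends up making. It is not the code of node-virtualbox or vagrant. The helper names, the VM name, and the Ubuntu_64 OS type are assumptions, and the sketch leaves out disk, storage, and network setup, so the resulting VM would not be usable on its own.

```python
import shlex
import subprocess
from typing import List


def provision_commands(name: str, memory_mb: int = 1024, cpus: int = 1) -> List[List[str]]:
    """Build the VBoxManage invocations that create, size, and start a headless VM."""
    return [
        ["VBoxManage", "createvm", "--name", name, "--ostype", "Ubuntu_64", "--register"],
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb), "--cpus", str(cpus)],
        ["VBoxManage", "startvm", name, "--type", "headless"],
    ]


def provision(name: str, dry_run: bool = True) -> List[str]:
    """Return the command lines; execute them only when dry_run is False."""
    lines = []
    for argv in provision_commands(name):
        lines.append(shlex.join(argv))
        if not dry_run:
            # Requires VirtualBox to be installed and VBoxManage on the PATH.
            subprocess.run(argv, check=True)
    return lines
```

Calling provision("demo-env") with the default dry_run=True only returns the command lines, which makes it easy to see that the whole workflow is a short sequence of VBoxManage calls.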

Baker uses a small microkernel to host the ansible configuration server and runc, and to provide a host kernel for Baker containers. The microkernel runs completely in RAM, with a custom initramfs and a kernel running Alpine Linux. On Mac, we're able to run this entirely within the macOS Hypervisor framework. In the future, we'd also like to be able to build custom microkernels for hosting Baker environments.