Building device software can be challenging, but managing software at scale adds an entirely different level of complexity to every project. Software at scale needs to consider platform security, over-the-air updates, per-device content customization, rollback, software supply chain management and so on. This is the devops side of building device software. While the huge evolution of cloud-based devops demonstrates what’s possible, device vendors, with their unique needs and the complexities of managing remote devices, are typically left to solve these problems on their own. While KOS isn’t tied to any particular device management solution, it is architected to solve many software-at-scale problems from the start.
Most software systems are build-based. Developers create a source repository for a device, everyone contributes, and the source tree is built into deliverables. These deliverables are then gathered during another build step and packaged with larger components such as the operating system image, itself a monolithic collection of components generated by a build. In such a system, if you want to recreate a release, you re-run the entire build to get the final installable image.
KOS replaces this approach with an assembly-based model. A release is not built; it is assembled from known, published parts. A KOS release is in fact simply a manifest that lists all the parts that make up the release. This is similar to systems like Docker, which describe releases as collections of existing components. In the KOS model, the concept of recreating a release makes no sense: the manifest is the release, so having the manifest is the same as having the release.
The assembly based system has many advantages:
Components can be reused across different devices without the need to copy code or modify builds.
Developers only need their small piece of the overall project as all other components can be assembled from build servers.
Any individual component can be replaced at any time for testing and diagnostics.
Device management can add additional components, such as custom content, allowing easy per-device customization.
Component builds are decoupled and manifests provide an audit trail of what components are in a release.
Assemblies are completely deterministic, allowing automatic garbage collection of components that are no longer needed, avoiding problems with long term disk usage.
The downside of assembly-based software is the considerable infrastructure required to make it all work seamlessly. KOS Studio exists to fill this gap and make assembly-based software a seamless experience.
Security is hard, expensive, and a requirement for modern devices. Security doesn’t just apply to developer code; it applies to the device boot process and the entire software supply chain, including device management, over-the-air updates and everything in between. Considerable energy has been put into the underlying KOS security model so that developers can focus on delivering functionality.
Security in KOS starts with KAB files. Everything from the Linux kernel to Java applications to data files is packaged as a KAB. A KAB is a file of files, much like a zip file, that is digitally signed with developer keys and uniquely identified by a UUID. Every KAB also has a mode, test or production, based on the keys that were used to sign it. Releases, defined as manifests, are also packaged as KABs. When KOS boots, it uses the manufacturer certificate to verify the manifest and then all referenced KABs in the chain. If any file has been tampered with or doesn’t have the correct authority, the release is rejected and the device rolls back to the previous release. KOS can mount KABs directly into the native Linux filesystem, so the contents of KABs are never extracted and the entire system is completely immutable. KOS provides delegation models that allow third-party KABs to be trusted, as well as support for chaining manifests, which allows external parties to add components to releases in developer-defined ways.
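The boot-time check described above can be illustrated with a small sketch. This is not the KOS implementation: the `Kab` class, the newline-separated manifest payload, and the function names are all hypothetical, and an HMAC with a shared key stands in for the certificate-based digital signatures a real device would use. It only shows the shape of the logic: verify the manifest, verify every KAB it references, and fall back to the previous release if anything fails.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Kab:
    """Hypothetical stand-in for a KAB: an ID, opaque content, and a signature."""
    kab_id: str
    payload: bytes
    signature: bytes


def sign(key: bytes, kab_id: str, payload: bytes) -> bytes:
    # ASSUMPTION: HMAC-SHA256 stands in for a real asymmetric signature.
    return hmac.new(key, kab_id.encode() + b"\0" + payload, hashlib.sha256).digest()


def is_valid(key: bytes, kab: Kab) -> bool:
    return hmac.compare_digest(sign(key, kab.kab_id, kab.payload), kab.signature)


def verify_release(key: bytes, manifest_id: str, store: dict) -> bool:
    """Verify the manifest KAB, then every KAB it references."""
    manifest = store.get(manifest_id)
    if manifest is None or not is_valid(key, manifest):
        return False
    # ASSUMPTION: the manifest payload is a whitespace-separated list of KAB IDs.
    for kab_id in manifest.payload.decode().split():
        kab = store.get(kab_id)
        if kab is None or not is_valid(key, kab):
            return False
    return True


def choose_release(key: bytes, candidate_id: str, previous_id: str, store: dict) -> str:
    """Boot the candidate only if it verifies; otherwise roll back."""
    return candidate_id if verify_release(key, candidate_id, store) else previous_id


KEY = b"demo-key"


def make(kab_id: str, payload: bytes) -> Kab:
    return Kab(kab_id, payload, sign(KEY, kab_id, payload))


store = {k.kab_id: k for k in [
    make("kernel", b"kernel bits"),
    make("app", b"app bits"),
    make("rel-1", b"kernel"),
    make("rel-2", b"kernel\napp"),
]}
print(choose_release(KEY, "rel-2", "rel-1", store))  # rel-2

# Tamper with one KAB's content while keeping its old signature.
store["app"] = Kab("app", b"evil bits", store["app"].signature)
print(choose_release(KEY, "rel-2", "rel-1", store))  # rel-1
```

Because the previous release does not reference the tampered KAB, it still verifies and remains a safe rollback target.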
As KABs are signed, typed, uniquely identifiable and can contain virtually anything, KOS uses KABs as the basic unit of assembly. KOS Studio has extensive support for working with KABs, as well as a KAB registry where organizations can publish their KABs internally, allowing releases to be defined and run without the need to build any of the KABs locally. KOS Studio also allows KABs to be published in a market, allowing organizations to collaborate and share components securely. KOS Studio handles downloading and verifying KABs as well as installing them onto devices, making the entire process transparent to developers.
Modern devices need to be tailored to customer needs. Consider a device in a hotel lobby that authenticates using a room key, or a device located at a particular venue that needs themed content. This often goes beyond custom settings and may include custom content or even unique applications per device. Pulling down custom content or apps opens another channel for security concerns, and changing state on a device in a way that impacts a rollback release can also impact overall stability. To address these needs, KOS provides a feature called chained manifests. While the developer can define a release with a single manifest, that manifest can be provided to upstream tooling, which can add more KABs in a developer-defined way. This doesn’t modify the original manifest, which is digitally signed and immutable; it creates a new manifest KAB that references the original manifest. Using this process, external systems can generate arbitrary chains of manifests that, in the aggregate, define the entire state of the device. A KOS device can evaluate these chains and construct the final device state from the chain.
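As a rough illustration of how a chain of manifests could be evaluated, the sketch below walks from the newest manifest back to the original release and aggregates the KABs along the way. The `Manifest` fields, the `parent_id` link, and the root-first ordering are assumptions made for the example; the actual KOS manifest format and evaluation rules are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a manifest is never modified after creation
class Manifest:
    """Hypothetical manifest: its own ID, the KABs it adds, and the manifest it extends."""
    manifest_id: str
    kab_ids: tuple
    parent_id: Optional[str] = None


def resolve_chain(tip_id: str, manifests: dict) -> list:
    """Return the aggregate KAB list for a chain, original release first."""
    chain = []
    seen = set()
    current = tip_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in manifest chain at {current}")
        seen.add(current)
        manifest = manifests[current]
        chain.append(manifest)
        current = manifest.parent_id
    chain.reverse()  # root (developer release) first, device-specific additions last

    kabs = []
    for manifest in chain:
        for kab_id in manifest.kab_ids:
            if kab_id not in kabs:
                kabs.append(kab_id)
    return kabs


manifests = {m.manifest_id: m for m in [
    Manifest("rel-1", ("kernel", "app")),                      # developer release
    Manifest("venue-theme", ("theme-stadium",), "rel-1"),      # added by upstream tooling
    Manifest("device-42", ("room-key-auth",), "venue-theme"),  # per-device addition
]}

print(resolve_chain("device-42", manifests))
# ['kernel', 'app', 'theme-stadium', 'room-key-auth']
print(resolve_chain("rel-1", manifests))
# ['kernel', 'app']
```

The original `rel-1` manifest is untouched by the additions, so the base release can still be resolved and run on its own.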
One of the most complex aspects of building a device is remote update. Updating an entire operating system image can be very complex, even when resorting to inefficient strategies like A/B partitions. Beyond the mechanics of picking the correct image to run and getting the image installed, there are challenges around delivery of updates. How to avoid installing incompatible software, how to verify that a USB release is not corrupt, and how to deliver releases over the air onto a device that may be turned off at any time are all complex problems. Extend this to multi-node devices and it becomes even more complex.
As KOS is assembly-based, a typical update doesn’t involve installing all KABs in the release, as many of the required KABs are commonly already available from the previous release. KOS handles identifying the KABs necessary for an installation, garbage collecting existing KABs to make space while ensuring rollback dependencies are preserved, and the actual process of installing and activating the update. KOS provides extensive support for both local updates, such as with a USB drive, and remote updates using one or more agents.
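The bookkeeping behind such an update reduces to set arithmetic, sketched below. The function and field names are hypothetical, and real KOS logic would also account for manifest chains and available disk space; the point is only that the download set and the garbage set both fall out of comparing what is installed against what the new and rollback releases reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdatePlan:
    download: frozenset  # KABs the new release needs that are not on the device
    remove: frozenset    # installed KABs that neither release references


def plan_update(installed: set, new_release: set, rollback_release: set) -> UpdatePlan:
    """Work out what to fetch and what can be garbage collected.

    ASSUMPTION: each release is represented simply as the set of KAB IDs it
    references, and exactly one rollback release must be preserved.
    """
    needed = new_release | rollback_release
    return UpdatePlan(
        download=frozenset(new_release - installed),
        remove=frozenset(installed - needed),
    )


installed = {"kernel-a", "fonts", "app-0", "app-1"}
rollback = {"kernel-a", "fonts", "app-1"}     # the release currently running
new = {"kernel-a", "fonts", "app-2"}          # the release being installed

plan = plan_update(installed, new, rollback)
print(sorted(plan.download))  # ['app-2']
print(sorted(plan.remove))    # ['app-0']
```

Here only one KAB is transferred, and `app-1` survives garbage collection because the rollback release still depends on it.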
When working with USB updates, KOS is able to examine a disk of KABs, identify all the manifest chains that are compatible with the device, and provide descriptions of each chain suitable for display to a technician. A given manifest chain can then be installed with a single API call. Once installation on the primary node is complete, all secondary nodes are updated automatically from the primary. This is part of the standard KOS boot process, so even if nodes were turned off during the installation, or if a failed node is later replaced, they will all synchronize to the correct software version automatically.
Remote updates work somewhat differently. Using the built-in KOS OTA support, an agent simply needs to specify the UUID of the manifest to be run. If KOS doesn’t have that KAB installed, it requests it from registered agents. It uses the same approach to fetch the entire manifest chain and then downloads all KABs that are missing from the device. Once complete, it automatically installs the update but does not activate it. Since an OTA update can arrive at any time of day, activating it immediately could disrupt customer operations. If an OTA completes and the customer reboots, the device still runs the previous release until the new release is activated at a configured activation time, which minimizes disruption.
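The OTA flow can be sketched in two parts: staging, where missing KABs are pulled one at a time from whichever registered agent can supply them, and activation, where the staged release only takes over once the configured time has passed. Everything named here is an assumption for illustration: agents are modeled as plain functions, the manifest payload as a whitespace-separated list of KAB IDs, and a single manifest is used rather than a full chain.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional

# ASSUMPTION: an agent is a function that returns a KAB's bytes, or None if it can't.
Agent = Callable[[str], Optional[bytes]]


def fetch_kab(kab_id: str, agents: List[Agent]) -> bytes:
    """Ask each registered agent in turn for a single KAB."""
    for agent in agents:
        data = agent(kab_id)
        if data is not None:
            return data
    raise LookupError(f"no registered agent could supply {kab_id}")


def stage_update(manifest_id: str, store: Dict[str, bytes], agents: List[Agent]) -> List[str]:
    """Download the manifest and any KABs it lists that are missing; return what was fetched."""
    downloaded = []
    if manifest_id not in store:
        store[manifest_id] = fetch_kab(manifest_id, agents)
        downloaded.append(manifest_id)
    for kab_id in store[manifest_id].decode().split():
        if kab_id not in store:
            store[kab_id] = fetch_kab(kab_id, agents)
            downloaded.append(kab_id)
    return downloaded


def release_to_boot(active_id: str, staged_id: Optional[str],
                    activate_at: datetime, now: datetime) -> str:
    """A staged release only replaces the active one once its activation time arrives."""
    if staged_id is not None and now >= activate_at:
        return staged_id
    return active_id


remote = {"rel-2": b"kernel app-2", "app-2": b"new app"}
agents = [lambda kab_id: None, remote.get]  # the first agent has nothing to offer
store = {"rel-1": b"kernel app-1", "kernel": b"k", "app-1": b"old app"}

print(stage_update("rel-2", store, agents))  # ['rel-2', 'app-2']

activate_at = datetime(2024, 1, 2, 3, 0)
print(release_to_boot("rel-1", "rel-2", activate_at, datetime(2024, 1, 1, 14, 0)))  # rel-1
print(release_to_boot("rel-1", "rel-2", activate_at, datetime(2024, 1, 2, 3, 5)))   # rel-2
```

A reboot before the activation time still selects `rel-1`, matching the behavior described above, and the already-installed `kernel` KAB is never downloaded again.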
KOS support for software updates, combined with firmware update infrastructure, ensures that device updates are deterministic and controlled. The assembly model also ensures that rollbacks restore the entire device state, firmware included, to guarantee that every state the device is in is the result of a tested combination of components.
While KOS is agent-agnostic, it provides much of the core functionality that agents typically provide. A centralized configuration system provides access to all settings, even in multi-node devices, along with events that can be used to watch for configuration changes. An analytics service provides a standard way to capture rich event data into different channels and to notify agents when data needs to be sent off the device. Supporting over-the-air updates is as simple as implementing the ability to download a single KAB at a time. This and many other core features in KOS make it a very friendly environment for agents.