During the last few years, container technologies have revolutionized the ways in which complex applications are developed, deployed and operated. For many years, application deployments took place on a given host after the application's components were packaged at the operating system (OS) level. This legacy approach had the disadvantage of mixing various elements and artifacts of an application, including executables, configuration files, libraries and host operating system functionalities. To alleviate this heterogeneity, deployers had to resort to building immutable virtual machine images. In this way, rollouts and rollbacks were predictable, yet Virtual Machines (VMs) were heavyweight and non-portable.
The advent of container technologies such as Docker alleviated the limitations of these legacy approaches by virtualizing resources at the OS level rather than at the hardware level. OS-level containers operate in isolation from each other and from the host. Each container has its own filesystem and processes, which are not mingled with the processes of other containers. Moreover, containers are easier to build than VMs, while being portable across clouds and OS distributions. Containers are also small and fast, which allows a single application to be packed in each container image. This enables the creation of immutable container images at build time rather than at deployment time, as applications are not tied to the application stack or to the production infrastructure. Likewise, containerization facilitates porting the development environment to the production environment, as well as monitoring and managing the images.
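To make the isolation property concrete, here is a minimal sketch using the Docker SDK for Python (the `docker` package), assuming a local Docker daemon is available; the image and file names are illustrative. A file written inside one container's filesystem is invisible to a second container started from the same image:

```python
import docker

client = docker.from_env()  # connect to the local Docker daemon

# Write a file inside the first container's private filesystem.
client.containers.run("alpine", "sh -c 'echo hello > /tmp/note.txt'",
                      remove=True)

# A second container starts from the pristine image layers, so the
# file created above does not exist in it.
output = client.containers.run("alpine", "ls /tmp", remove=True)
print(output.decode())  # prints nothing: each container has its own filesystem
```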
Based on these properties, containers are integral elements of microservices architectures and DevOps methodologies for non-trivial, highly automated deployments. DevOps deployments rely on infrastructures for managing containerized workflows and services in both a declarative and an automated fashion. As such, they have to be supported by infrastructures offering such management functionalities in the computing clusters that host container images. Nowadays, the most popular platform for managing workloads and services within containers is Kubernetes. The popularity of Kubernetes is growing as organizations adopt microservices architectures: a recent study on microservices maturity conducted by the O’Reilly Software Architecture Conference revealed that 40% of microservices deployers use Kubernetes as their workload management platform.
Introducing Kubernetes
Kubernetes is a portable, extensible and open-source platform for managing containerized deployments in clustered environments. It was created by Google based on the company’s multi-year experience in running and managing large-scale workloads, while integrating relevant ideas and best practices from the microservices and DevOps communities. Kubernetes is currently maintained by the Cloud Native Computing Foundation (CNCF), which resulted from a partnership between Google and the Linux Foundation. Nowadays, thousands of enterprises use Kubernetes to orchestrate the lifecycle of individual Docker containers, as a means of managing complex deployments and constructing robust and scalable systems. It supports operations like scaling deployments up and down, canary deployments (i.e., rolling out releases to a subset of users or servers), rollbacks and more.
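As a small illustration of one such operation, the sketch below scales a hypothetical Deployment named "web" to five replicas using the official Kubernetes Python client (`pip install kubernetes`); the deployment name and namespace are assumptions made for the example, and a reachable cluster with a valid kubeconfig is presumed:

```python
from kubernetes import client, config

config.load_kube_config()  # reads the local kubeconfig (e.g., ~/.kube/config)
apps = client.AppsV1Api()

# Patch only the desired replica count; Kubernetes reconciles the
# running Pods to match the new desired state.
apps.patch_namespaced_deployment_scale(
    name="web",            # hypothetical deployment name
    namespace="default",
    body={"spec": {"replicas": 5}},
)
```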
In practice, Kubernetes has multiple facets as it can be seen as:
- A container platform that enables the creation, hosting and management of container images, offering a complete management environment for them.
- A microservices platform that supports the deployment and coordination of sets of microservices. It enables developers to abstract away the functionality of a set of services and to expose them to other developers through well-defined Application Programming Interfaces (APIs); a minimal sketch of this facet follows after this list.
- A portable cloud platform that supports portability of container images across infrastructure providers. Kubernetes orchestrates computing, networking and storage for different user workloads. In this way it combines the ease of use of Platform as a Service (PaaS) with the flexibility of Infrastructure as a Service (IaaS) offerings.
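To illustrate the microservices facet mentioned above, the following sketch exposes a hypothetical set of Pods labelled app=catalog behind a single stable Service endpoint, using the Kubernetes Python client; the names, labels and ports are illustrative assumptions rather than fixed defaults:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# A Service gives a stable endpoint to every Pod matching the selector,
# hiding individual Pod churn from the service's consumers.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="catalog"),
    spec=client.V1ServiceSpec(
        selector={"app": "catalog"},  # hypothetical Pod label
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
core.create_namespaced_service(namespace="default", body=service)
```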
This multi-faceted nature is the main source of Kubernetes’ popularity, as it allows companies to solve deployment problems, to automate microservices operations and to easily connect development and operations. Moreover, a Kubernetes infrastructure is very easy to operate once it is running, though setting it up and configuring it takes significant effort. Indeed, the setup effort and steep learning curve are the main pain points of the technology, as even skilled engineers may need a few weeks to create a Kubernetes infrastructure and get it up and running. However, the time saved after the infrastructure is set up (e.g., as part of development and release efforts) can fully pay off the time invested in setting it up.
Kubernetes Architecture
From a high-level architectural viewpoint, Kubernetes comprises a master node that orchestrates several running containers. The latter are grouped in structures called “Pods”. A Pod is the basic scheduling unit in Kubernetes, which provides a higher level of abstraction by grouping containerized components. Containers within a Pod are guaranteed to be co-located on the same host machine and can share resources. They run in concert to accomplish a single task, communicating either through the filesystem or over a common network interface. Pods can also define a volume (e.g., a local disk directory or a network disk), which they can expose to their containers. Furthermore, Pods can be managed manually through the Kubernetes API, or by delegating their management to a controller.
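The following sketch creates a two-container Pod whose containers share an emptyDir volume, illustrating co-location and filesystem-based communication; it uses the Kubernetes Python client, and all names and images are illustrative assumptions:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# An emptyDir volume lives as long as the Pod and is shared by all of
# its containers, enabling filesystem-based communication between them.
shared = client.V1Volume(name="shared-data",
                         empty_dir=client.V1EmptyDirVolumeSource())
mount = client.V1VolumeMount(name="shared-data", mount_path="/data")

writer = client.V1Container(
    name="writer", image="busybox",
    command=["sh", "-c", "echo hello > /data/msg && sleep 3600"],
    volume_mounts=[mount],
)
# A second container in the same Pod can read /data/msg written above.
reader = client.V1Container(
    name="reader", image="busybox",
    command=["sh", "-c", "sleep 3600"],
    volume_mounts=[mount],
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="two-container-pod"),
    spec=client.V1PodSpec(containers=[writer, reader], volumes=[shared]),
)
core.create_namespaced_pod(namespace="default", body=pod)
```

Because both containers mount the same volume, they are scheduled onto the same host and see the same files under /data, which is exactly the co-location guarantee described above.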
The master node hosts all the supporting services of a Kubernetes deployment, such as service discovery (based on DNS) and an API server. Some Kubernetes deployments comprise more than one master node as a means of ensuring high availability. Nevertheless, being able to quickly recover the master when it goes down is even more important than a redundant configuration. To this end, whenever there is a problem with the master, Kubernetes operators should be ready to spawn a new VM on the cloud, based on the same template as the one used by the old master. The master controls various nodes, which are essentially workers. They follow the commands of the master in order to ensure that applications remain alive and are hosted according to a given configuration. To this end, each node hosts and operates an agent called the kubelet, which coordinates its communication with the master node.
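As a small illustration of the master’s API server at work, the sketch below queries it for the cluster’s nodes through the Kubernetes Python client; it assumes a reachable cluster and a valid local kubeconfig:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# The API server on the master answers queries about cluster state,
# including the worker nodes whose kubelets report their status to it.
for node in core.list_node().items:
    ready = next((c.status for c in node.status.conditions
                  if c.type == "Ready"), "Unknown")
    print(f"{node.metadata.name}: Ready={ready}")
```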
The Kubernetes architecture is technology-agnostic. While Docker is used as the main container engine, it is possible to switch to another container technology (e.g., the rkt container system, which is currently emerging as a lightweight and secure alternative to Docker). Hence, Kubernetes does not bind developers and deployers to particular technologies, and therefore avoids vendor lock-in.
Overall, Kubernetes is considered a game changer in managing production workloads and containers at scale. Even though there are still companies that use more conventional container management technologies, its popularity is growing rapidly. This is also the reason why the demand for Kubernetes engineers is exploding. As Kubernetes is a complex and relatively new technology, there is a significant “Kubernetes skills gap” that will have to be filled in the years to come. This is something you should keep in mind, especially if you have not yet started your Kubernetes journey.