Learning Docker
Objectives
- Describe Docker architecture.
- Define the Docker daemon and Docker client and explain how they work together.
- Explain how Docker interacts with the kernel components (namespaces, cgroups, SELinux).
Before the Handshake
Before we get started with Docker, we need to understand the evolution that led to the emergence and popularity of container technologies such as Docker.
Let us start with a question: what is required to develop and run an application?
- Infrastructure resources, like servers, storage, and network.
- Application infrastructure resources, like operating systems, application servers, messaging servers, cache services and databases.
- Application development frameworks and tools, like programming language binaries and libraries, and IDEs.
Traditionally, all of the above had to be procured from different vendors, deployed, integrated and managed throughout their lifecycle to produce a computing, development and hosting environment. IT administration and operations teams accounted for a formidably large part of the technology investment of businesses. Machine virtualization transformed this landscape. With the advent of virtualization software and hypervisors, pooling and on-demand creation of infrastructure resources became a reality. New business models offered these pooled infrastructure resources on request, which today we refer to as IaaS, or infrastructure as a service.
Soon, application infrastructure vendors leveraged IaaS to offer application infrastructure software and application development frameworks and tools on demand, leading to the inception of PaaS, or platform as a service. IaaS and PaaS ushered in a new era of IT, in which businesses could use shared and public resources at a fraction of the cost for development, testing and production deployments. This changing landscape created the need to move application workloads across environments with ease. Container technology was the answer. A container is an application packaged along with its dependencies, which include the libraries, binaries and application infrastructure software required to run the application. Docker was one of the foremost open source projects to contribute an efficient and popular containerization technology.
Docker architecture
Before we look at the Docker architecture, let us understand the difference between a VM (virtual machine) and a container. VMs require a hypervisor, or virtualization software, on top of the host operating system (OS). The guest OS is installed on top of the hypervisor, and applications are installed in the guest OS. Containers, on the other hand, can be installed directly on the host OS, eliminating the need for a hypervisor. Containers can hence leverage the host OS capabilities. In Figure 1, the container technology layer is Docker.
Source & image credit: Docker docs, Containers and virtual machines
Like VMs, containers provide isolated environments for application execution within a physical machine. In addition, containers provide efficient methods to create portable application packages that can be moved across deployment environments in a hardware- and OS-agnostic manner. VMs do not depend on host OS capabilities and allow any OS to be installed as the guest OS. Docker technology, in contrast, is based on Linux kernel features.
What do we expect from a container technology?
- A standardized way to define, package and ship the application stack.
- A model enabling the lifecycle management of containers from remote machines.
- A support mechanism to replicate and cluster containers to provide scale and disaster recovery.
- Support for creating template definitions for containers.
- Management of the network and storage aspects of container instances.
- Security for resources on the host OS and resources allocated to the container.
Docker provides all of the above and more. We will discuss some of these features of Docker in the following sections.
Docker client and Docker daemon
Docker follows a client-server architecture: a Docker client component and a Docker server component, also called the Docker daemon, communicate with each other. The Docker client is usually installed on a local machine, desktop or otherwise. You drive the client through a command line interface, and the client communicates with the daemon through a REST API.
The Docker daemon is a process that runs on the physical or virtual machine where you want to create and manage container instances. The daemon is responsible for creating containers and container templates, which are called images. There is a third aspect to Docker technology, the Docker registry, which is a mechanism to store, distribute and look up Docker images. Figure 2 shows the component architecture of Docker and the communication paths between the components.
Source & image credit: Docker docs, Docker overview
The Docker client command line interface (CLI) provides docker commands to interact with the Docker daemon. See here for a list of available docker commands.
The Docker daemon is responsible for understanding the docker commands sent from the Docker client and for performing the necessary actions. The daemon creates Docker images and containers and is responsible for managing and monitoring the container lifecycle.
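The client-to-daemon conversation described above can be sketched in a few lines of Python. This is only an illustration, not how the docker CLI is implemented. It assumes the daemon listens on the default local Unix socket /var/run/docker.sock and that the current user is allowed to read it. The helper names build_request and ask_daemon are made up for this example. The /version path is an endpoint of the Docker Engine REST API.

```python
import json
import os
import socket

# Assumption: default local socket path used by the daemon on Linux.
DOCKER_SOCKET = "/var/run/docker.sock"


def build_request(path):
    """Return a minimal HTTP/1.0 GET request for the given API path."""
    return f"GET {path} HTTP/1.0\r\nHost: docker\r\n\r\n".encode("ascii")


def ask_daemon(path, socket_path=DOCKER_SOCKET):
    """Send a GET request to the daemon and return the decoded JSON body.

    Returns None when the socket is missing, not accessible,
    or the reply is not JSON.
    """
    if not os.path.exists(socket_path):
        return None
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
            conn.connect(socket_path)
            conn.sendall(build_request(path))
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # HTTP/1.0: the server closes when done
                    break
                chunks.append(data)
    except OSError:
        return None
    # Split the HTTP headers from the body at the first blank line.
    _headers, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    try:
        return json.loads(body)
    except ValueError:
        return None


if __name__ == "__main__":
    # Prints the daemon's version document, or None without a daemon.
    print(ask_daemon("/version"))
```

The point of the sketch is that the client only sends requests. All the work of building images and running containers happens in the daemon that answers them.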
Let us illustrate the interaction between the docker client and docker daemon using an example scenario.
Scenario:
Build your application source from GitHub and deploy it to a docker container.
Before we get started with the scenario, there is another Docker-specific object that you need to understand: the Dockerfile. A Dockerfile is a standard way to describe your application source, its dependencies and the commands to be run in the built execution environment. To learn more about Dockerfile, see here. For this illustration, you only need to understand that a Dockerfile is a file that carries the information specific to your application context.
Assumption:
- Docker client and daemon are installed on separate machines.
- Your source repository in GitHub also has the Dockerfile in the root directory of the repository path.
Steps:
- On the machine where docker client is installed, open the CLI.
- Build the application by using the source from GitHub. Execute the following command:
docker build <path to application source on github>
You get the following response:
Sending build context to docker daemon ….
The response line indicates that the actual work happens at the Docker daemon. The client sends the build context to the daemon; in this case, the context is the GitHub repository path.
The Docker daemon now clones the GitHub repository and looks for the Dockerfile. On finding the Dockerfile, it parses the file contents to understand the application context and, based on the Dockerfile, starts to fetch the dependencies required to build the application execution environment. Think of a Docker image as a template for your container instance. The daemon creates the Docker image by following the instructions in the Dockerfile.
Let us take a simple Dockerfile example:
FROM ubuntu
RUN echo "Say Hello to Docker!"
On the docker client terminal, you will see the following output:
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM ubuntu
latest: Pulling from library/ubuntu
32802c0cfa4d: Pull complete
da1315cffa03: Pull complete
fa83472a3562: Pull complete
f85999a86bef: Pull complete
Digest: sha256:6d0e0c26489e33f5a6f0020edface2727db9489744ecc9b4f50c7fa671f23c49
Status: Downloaded newer image for ubuntu:latest
 ---> 93fd78260bd1
Step 2/2 : RUN echo "Say Hello to Docker!"
 ---> Running in bbc4b04e355d
Say Hello to Docker!
Removing intermediate container bbc4b04e355d
 ---> 7082b515741a
Successfully built 7082b515741a
The Docker daemon first gets the latest image for ubuntu from the public Docker registry, or hub, because Docker is designed to look up and fetch images from there. Next, the RUN command from the Dockerfile is executed. You will see that an intermediate container instance is created and the command specified in RUN is executed inside that container instance. The status of the execution is written to the log. The intermediate container instance is then deleted. The state of the container after the command execution is saved, or committed, and a new image is created. You will see this in the log, where a new image ID is printed.
You can now use the docker create or docker run command to create a container instance using the image ID of the Docker image created.
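The step-by-step nature of the build can be mimicked with a toy model. The Python sketch below is not Docker's implementation. It reads a Dockerfile one instruction per line and derives a short ID for each step from the previous step's ID plus the instruction text, to show how every step produces a new image on top of the one before it. The function names parse_dockerfile and simulate_build and the hashing scheme are assumptions made for this illustration. Real image IDs are computed differently.

```python
import hashlib


def parse_dockerfile(text):
    """Split Dockerfile text into (INSTRUCTION, arguments) pairs.

    Simplification: one instruction per line, no line continuations.
    """
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        instruction, _, arguments = line.partition(" ")
        steps.append((instruction.upper(), arguments.strip()))
    return steps


def simulate_build(text):
    """Return one illustrative 12-character ID per build step.

    Each ID depends on the parent ID and the instruction, so changing
    an early step changes every ID after it. This is a toy scheme,
    not the way Docker derives real image IDs.
    """
    parent = ""
    image_ids = []
    for instruction, arguments in parse_dockerfile(text):
        material = f"{parent}|{instruction} {arguments}".encode("utf-8")
        parent = hashlib.sha256(material).hexdigest()[:12]
        image_ids.append(parent)
    return image_ids


if __name__ == "__main__":
    dockerfile = 'FROM ubuntu\nRUN echo "Say Hello to Docker!"\n'
    steps = parse_dockerfile(dockerfile)
    for number, ((name, args), image_id) in enumerate(
        zip(steps, simulate_build(dockerfile)), start=1
    ):
        print(f"Step {number}/{len(steps)} : {name} {args} -> {image_id}")
```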
Docker interaction with kernel components
Docker relies on Linux kernel features to provide capabilities such as resource isolation, resource management, security and networking. In this section we will look at how Docker leverages three of them: namespaces, cgroups and SELinux.
Source & image credit: Virtualisation abstraction, Docker
namespaces
Docker uses namespaces to decide which resources or processes can be seen, and hence used, by a container instance. The namespaces used by Docker are:
- PID, which provides process isolation.
- NET, for managing network.
- IPC (inter-process communication), for managing access to IPC resources.
- MNT, for managing file system mount points.
- UTS, for isolating kernel and version identifiers, such as the hostname and domain name.
Processes within a PID namespace can only see other processes within the same PID namespace. Similarly, each docker container has its own network namespace, which provides each of the containers created on the same host OS with network isolation.
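On a Linux host, the namespaces a process belongs to are visible under /proc/<pid>/ns, where each entry is a symbolic link whose target names the namespace type and an identifier. The Python sketch below lists them for the current process. The function name list_namespaces is made up for this example, and the function returns an empty result on systems without /proc. Two processes in the same namespace show the same identifier. A process inside a container would typically show different identifiers from a process on the host.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace names (pid, net, ipc, mnt, uts, ...) to link targets.

    Returns an empty dict when /proc/<pid>/ns is not available,
    for example on a non-Linux system.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            # The link target has the form "<type>:[<identifier>]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable for this user
    return result


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:10s} {target}")
```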
cgroups
Docker makes use of cgroups and cgroup policies to allocate and manage resources like CPU, memory and disk I/O. The resources allocated to a container instance can be limited using cgroups, which prevents other container instances on the same host OS from running out of the resources they need. In other words, cgroups are used for resource accounting and resource limiting in Docker.
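The cgroup membership of a process can be inspected on Linux through /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. The Python sketch below parses that format. The function names and the sample paths in the usage section are made up for this illustration and are not taken from a real host.

```python
def parse_cgroup_lines(text):
    """Parse the contents of a /proc/<pid>/cgroup file.

    Each non-empty line is "hierarchy-ID:controller-list:cgroup-path".
    On cgroup v2 the controller list is empty ("0::/some/path").
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        hierarchy, controllers, path = line.split(":", 2)
        entries.append(
            {
                "hierarchy": hierarchy,
                "controllers": [c for c in controllers.split(",") if c],
                "path": path,
            }
        )
    return entries


def read_own_cgroups():
    """Return the parsed cgroup entries of the current process, if any."""
    try:
        with open("/proc/self/cgroup", encoding="utf-8") as handle:
            return parse_cgroup_lines(handle.read())
    except OSError:
        return []  # not on Linux, or /proc is not mounted


if __name__ == "__main__":
    # Hypothetical sample content, for illustration only.
    sample = "4:cpu,cpuacct:/docker/example\n0::/user.slice\n"
    for entry in parse_cgroup_lines(sample):
        print(entry)
    print(read_own_cgroups())
```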
SELinux
Security in Docker is concerned with protecting the host from containers and protecting containers from one another. By default, Docker containers are secure, particularly when the processes in them run as non-privileged users. Additionally, Docker can leverage kernel security modules such as AppArmor or SELinux. The Red Hat Linux distribution comes with SELinux policies for Docker. These policies can be enabled to further harden security for the Docker containers running on the host OS.
SELinux policies restrict access to resources based on SELinux labels, which consist of four parts: User:Role:Type:Level. SELinux assigns a label to each process, each file or directory object, each network port, and so on. The SELinux policy defines the access control rules that govern which objects a process may access. The kernel is responsible for enforcing these rules, and this enforcement is called MAC (Mandatory Access Control). SELinux supports two methods of enforcement:
- Type enforcement
- Multi Category Security (MCS) enforcement
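The four-part label format can be made concrete with a small parser. The Python sketch below splits a label into its User, Role, Type and Level parts. The Level part may itself contain colons, because it can carry a sensitivity and a set of categories, so only the first three colons are treated as separators. The class name, the function name and the sample label are illustrative assumptions, not output copied from a real system.

```python
from typing import NamedTuple


class SELinuxLabel(NamedTuple):
    """The four parts of an SELinux label: User:Role:Type:Level."""

    user: str
    role: str
    type: str
    level: str


def parse_label(label):
    """Split a label string into its four parts.

    Only the first three colons separate fields, since the level
    (for example "s0:c1,c2") can contain a colon of its own.
    """
    parts = label.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected User:Role:Type:Level, got {label!r}")
    return SELinuxLabel(*parts)


if __name__ == "__main__":
    # Illustrative label for a container process.
    print(parse_label("system_u:system_r:container_t:s0:c1,c2"))
```

Type enforcement looks at the Type part of the label, while MCS enforcement looks at the categories carried in the Level part.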
Let us draw a parallel with the famous cartoon characters Tom and Jerry to understand the enforcement types. Assume Tom and Jerry are two processes, cat and mouse respectively.
Cat drinks milk and mouse eats cheese. Cat should not be allowed to eat cheese and mouse should not be allowed to drink milk. Both milk and cheese belong to the same class called food. If we were to build this restriction using the type enforcement policy, then we would need to:
Allow cat (process) drink (action) milk (object) and allow mouse (process) eat (action) cheese (object).
With the type enforcement policy, cat will not be able to eat cheese and mouse will not be able to drink milk.
MCS enforcement is required when type enforcement alone would not suffice.
Let us now assume that Jerry has a cousin, Muscles, and that we would like Jerry to be able to eat only cheddar cheese and Muscles to be able to eat only mozzarella cheese. To build this restriction we need MCS. In this method a random label is attached to each of the processes, which are of the same type. A random label is also attached to each of the objects, which are likewise of the same type. In our example, Jerry is labeled mouse:random1 and Muscles is labeled mouse:random2. Cheddar cheese is labeled cheese:random1 and mozzarella is labeled cheese:random2. The type policy is enforced first and then the MCS policy is applied, which compares the random labels of the process and the object. Only if the random labels match is the process allowed to access the object. So in this case, mouse:random1 can eat only cheese:random1 and mouse:random2 can eat only cheese:random2, which meets our requirement of allowing Jerry to eat only cheddar cheese and Muscles to eat only mozzarella cheese. To learn more about SELinux, see here.
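The Tom and Jerry example can be written down as a small Python model. This is a sketch of the analogy only, not of how SELinux evaluates policy. It applies the type rule first and then requires the random labels to match, exactly as described above. In real SELinux the category check is more general than a simple equality test. All names in the code are made up for this illustration.

```python
# Type enforcement rules: (process type, action, object type).
ALLOW_RULES = {
    ("cat", "drink", "milk"),
    ("mouse", "eat", "cheese"),
}


def type_allows(process_type, action, object_type):
    """Type enforcement: is there an allow rule for this combination?"""
    return (process_type, action, object_type) in ALLOW_RULES


def access_allowed(process, action, obj):
    """Apply type enforcement first, then the MCS-style label match.

    process and obj are (type, random_label) pairs.
    Simplification: labels must be equal, as in the analogy.
    """
    process_type, process_label = process
    object_type, object_label = obj
    if not type_allows(process_type, action, object_type):
        return False
    return process_label == object_label


if __name__ == "__main__":
    jerry = ("mouse", "random1")
    muscles = ("mouse", "random2")
    cheddar = ("cheese", "random1")
    mozzarella = ("cheese", "random2")

    print(access_allowed(jerry, "eat", cheddar))       # True
    print(access_allowed(jerry, "eat", mozzarella))    # False
    print(access_allowed(muscles, "eat", mozzarella))  # True
```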
Summary
You should now be able to understand and explain:
- the ‘why’ and the ‘what’ of Docker.
- the high-level architecture of Docker and its components, such as the Docker client and Docker daemon.
- how Docker uses some of the kernel features to achieve its capabilities.
References
1. Docker Docs. Get Started, Part 1 Orientation and setup. [Online]. Available from: https://docs.docker.com/get-started/#images-and-containers
2. Docker Docs. Docker overview. [Online]. Available from: https://docs.docker.com/engine/docker-overview/
3. Docker Docs. Dockerfile reference. [Online]. Available from: https://docs.docker.com/engine/reference/builder/#usage
4. Marco Chiappetta. Understanding Docker without losing your shit. [Online]. Available from: http://engineering.hipolabs.com/understand-docker-without-losing-your-shit/
5. Nitin Agarwal. Understanding the Docker Internals. [Online]. Available from: https://medium.com/@nagarwal/understanding-the-docker-internals-7ccb052ce9fe
6. Martijn Dwars, Wiebe van Geest, Rik Nijessen and Rick Wieman. Docker: Build, Ship and Run Any App, Anywhere. [Online]. Available from: https://delftswa.github.io/chapters/docker/
7. Docker Docs. Docker security. [Online]. Available from: https://docs.docker.com/engine/security/security/#kernel-namespaces
8. Docker Kubernetes Lab. Linux Network Namespace Introduction. [Online]. Available from: https://docker-k8s-lab.readthedocs.io/en/latest/docker/netns.html
9. Daniel J Walsh. Your visual how-to guide for SELinux policy enforcement. [Online]. Available from: https://opensource.com/business/13/11/selinux-policy-guide