Docker Basics

18 Apr 2020

General Concepts

What is a Docker Container?

Isolated area of an OS with resource usage limits applied.

There are two main building blocks of containers: Namespaces and cgroups (Control Groups).

What are Namespaces and Control Groups?

Namespaces are used for creating isolation at the OS level, while control groups are about setting limits and grouping objects. Both are Linux kernel primitives (similar primitives now exist in Windows as well). Because of this isolation, containers are not aware of each other even when they run on the same host OS.

Linux provides several kinds of namespaces, and each container gets its own set. Inside its namespaces, a container has its own isolated process tree starting at PID 1, its own network stack, its own file system root (“/” in Linux, “C:\” in Windows), its own IPC (processes within a single container share memory space but are isolated from processes in other containers), its own hostname (via the UTS namespace), and its own user namespace.
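Namespace membership can be inspected directly on any Linux host, without Docker; a minimal sketch:

```shell
# Every process's namespace memberships appear as symlinks under /proc/<pid>/ns;
# two processes are in the same namespace exactly when the links show the same inode.
for ns in pid net mnt uts ipc user; do
  readlink "/proc/self/ns/$ns"
done
```

Running the same loop inside a container (e.g. via docker exec) shows different inode numbers for every namespace, which is exactly the isolation described above.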

Now, when many isolated containers run on a single host OS, there can be a resource crunch (one container using too many resources: memory, CPU, disk space, etc.). Control Groups are used to limit the resources each container consumes.
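Docker exposes cgroup limits as flags on docker run; a sketch, assuming a running Docker daemon and the alpine image (the exact limits here are arbitrary):

```shell
# Cap this container at 256 MiB of memory and 1.5 CPUs; the kernel enforces
# both limits through cgroups (OOM-killing or throttling on breach).
if command -v docker >/dev/null 2>&1; then
  docker run --rm --memory=256m --cpus=1.5 alpine echo "running under limits" \
    || echo "docker daemon or image unavailable"
fi
```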

By combining a layered file system with namespaces and cgroups, containers can be created.

Difference between docker containers and Virtual Machines: https://stackoverflow.com/questions/16047306/how-is-docker-different-from-a-virtual-machine

Details of Docker Engine

This is what the docker architecture looks like:

We interact with the docker client via the CLI (or a UI), and it makes API calls to the docker daemon, which implements those APIs and listens for requests from the client. containerd handles execution/lifecycle operations such as start, stop, pause, and unpause. The OCI (Open Container Initiative) layer (runc, the reference implementation of the OCI runtime spec) interfaces with the kernel.

The reason the docker daemon and containerd are decoupled is so that restarting the docker daemon does not affect any running containers. This is a super useful feature when using docker in production: Docker can be upgraded to a new version without killing any running containers. After the upgrade/restart, the daemon rediscovers the running containers and reconnects to their shim processes.
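In current Docker this behaviour is opt-in: the live-restore option in daemon.json (typically /etc/docker/daemon.json) tells the daemon to leave containers running across a restart and re-attach to their shims afterwards. A minimal sketch:

```json
{
  "live-restore": true
}
```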

What happens under the hood when we create a new container on Linux?

When the user fires the command from the CLI, it makes an API call to the docker daemon, which calls containerd over gRPC; containerd in turn starts a shim process and runc. runc spins up the container and exits, while the shim remains connected to the container. The same flow applies when multiple containers are spun up (one shim per container).
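On a host with containers running, this chain is visible in the process list: dockerd and containerd are long-lived, there is one shim per container, and runc is absent because it exits once the container has started. A sketch:

```shell
# List Docker-related daemon processes; no matches simply means none are running.
ps -e -o pid,comm | grep -E 'dockerd|containerd' || echo "no Docker processes found"
```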

Docker Images Explained in Depth

How does image layering work?
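Each Dockerfile instruction that modifies the filesystem produces a read-only layer, and docker history lists those layers for an image. A sketch, assuming Docker and a locally available alpine image:

```shell
# Show the layers of an image, newest first (requires the image locally).
if command -v docker >/dev/null 2>&1; then
  docker history alpine:latest || echo "image not present locally"
fi
```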

Docker Registries (Where images live)
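An image reference encodes the registry it lives in, as registry/repository:tag, with Docker Hub as the default registry. A sketch of moving an image to a private registry (registry.example.com is a placeholder host):

```shell
# Pull from the default registry, retag the image so its reference points at
# the private registry, then push it there.
if command -v docker >/dev/null 2>&1 && docker pull alpine:3 2>/dev/null; then
  docker tag alpine:3 registry.example.com/myteam/alpine:3
  docker push registry.example.com/myteam/alpine:3 || echo "push needs a real registry"
fi
```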

Containerizing an Application
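As a sketch, a minimal Dockerfile for a hypothetical Python script app.py might look like this (the file names are placeholders):

```dockerfile
# Hypothetical single-stage image for a Python script.
FROM python:3.12-slim
WORKDIR /app
COPY app.py .
CMD ["python", "app.py"]
```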

Multi-stage Builds (because size of images matters)
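A multi-stage build compiles in a heavyweight builder image and copies only the artifact into a small final image; a sketch for a hypothetical Go program:

```dockerfile
# Build stage: full Go toolchain (hundreds of MB).
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: only the static binary, so the image is a few MB.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```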

Working with Docker Containers

Docker logging

There are two types of logs:

Daemon logs:

On Linux systems with systemd, the daemon logs are sent to journald and can be read with journalctl -u docker.service; on systems without systemd, the daemon logs can typically be found in “/var/log/messages”.

Container Logs:

The application to be run in a container should be designed to run as PID 1 and write its logs to stdout and stderr. If the application writes logs to a file instead, the file can be linked to stdout/stderr. Another approach is to mount a host volume into the container so that the logs/files persist even if the container that generated them is discarded.
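The linking approach is what several official images use (nginx, for example, links its log files to the container's stdout/stderr); a sketch with a stand-in log path:

```shell
# Make writes to the app's hard-coded log file land on the container's stdout,
# where `docker logs` can collect them. /tmp/app.log is a stand-in path.
ln -sf /dev/stdout /tmp/app.log
echo "hello from the app" > /tmp/app.log
```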

Docker also supports logging drivers. These drivers integrate container logging with existing logging solutions like syslog, Splunk, Fluentd, etc.

We can set the default logging driver for containers in the “daemon.json” file. With the default json-file driver, it’s easy to access a container's logs using the command “docker logs <container>”.
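A sketch of a daemon.json (typically /etc/docker/daemon.json) that keeps the default json-file driver but adds log rotation; the driver name and option keys are the standard ones, the sizes are arbitrary:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```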

Useful Commands and their details