Note: This post will give you an idea of what Docker is. I am a beginner in front-end engineering, and this is my way of testing how well I have understood the concepts. I give a basic explanation of what Docker, virtualisation and containers are, and along the way I list a bunch of articles that can help you get started with Docker quite easily.
So let’s get started. There has been a lot of talk about Docker and how great it is. As a beginner in front-end engineering, none of it makes sense to you. So you first go to Google and type in “What is Docker”. A bunch of articles come up. The first one is “What is Docker” by the very good people at docker.com, who actually created Docker. This is what you read:
Docker is the world’s leading software container platform. Developers use Docker to eliminate “works on my machine” problems. Docker is not a traditional virtualisation technology. It is about delivering applications quickly and with the highest level of flexibility.
Woah! A lot of fancy words!
You recognise the “works on my machine” problem. You have incurred a lot of wrath because of it while setting up your machine, or another contributor’s machine, for a new application. Hell, you also incur a lot of wrath from your dev-ops team when you try to deploy that application on a server. You always forget one configuration or another that you did on your machine. You then open up your browser history and go through all those Stack Overflow links to find the fix that you applied on your machine but forgot to add to the configuration.

Then you deploy your application on a server. Your application turns out to be a hit and you want to scale because of all the traffic coming your way. That should be easy, you think. You try to duplicate instances of your application on different boxes. Thankfully, you had maintained a sheet of all the steps needed to deploy your application. But as it turns out, you face more frustration: now you’ve got to manually run all the steps on each of the different boxes. And before you know it, you start banging your head. This is not worth it, you say. There’s got to be a better way. This is where Docker comes into the picture. Now go up there and read the definition again. Yes, now you understand what Docker does: delivering applications quickly, with the highest level of flexibility.
But what about those fancy words: virtualisation technology and software container? What are these, you ask.
Virtualisation and Virtual Machines
A Virtual Machine (VM) is software that emulates an entire operating system. This creation of a virtual version of a computing resource is virtualisation. Basically, VMs separate the OS from the underlying hardware. VMs rely on a technology called a hypervisor. A hypervisor is software that abstracts, or isolates, operating systems and applications from the underlying computer hardware. Using hypervisors, we can install almost any OS on any hardware. For example, we can run macOS on a Windows machine and vice versa. This understanding of hypervisors is enough to understand Docker.
Why is Virtualisation needed?
Earlier, hardware and software used to be tightly coupled. An OS came pre-installed on the hardware when you bought it and could not be separated from it. Multiple problems arise because of this:
- Applications: Plenty of server applications work better with one OS than another. If we want to run an array of such applications, we need to deploy separate servers for separate applications.
- Scaling: With the tight coupling between software and hardware, if we want to scale our server capacity, we need to buy multiple such servers and then set them up individually. If each physical server has one OS installed, maintaining tons of such servers becomes very tedious, risky and resource intensive.
- Maintenance: If our server hardware goes down, we need to buy the exact same type of server and migrate our entire application from the old server to the new one, which is a very tedious task.
We all know how costly servers can be. Plus think about all the electricity and maintenance overhead with multiple servers. That’s a nightmare and will empty your pockets pretty soon.
Using VMs, all of the above problems are solved quite easily. VMs separate the OS from the hardware using hypervisors.
- Applications: If we want to install separate server applications, we need not worry about buying separate servers. All we need to do is start with virtualisation-enabled server hardware, install a compatible hypervisor and then install as many OSes as we need on that hypervisor. We install our applications on different OSes and then run them as required.
- Scaling: We can deploy just one server with high processing power, install a hypervisor and spawn multiple instances of different OSes on it, thus consolidating multiple servers into one.
- Maintenance: As we have multiple servers consolidated into a few, maintenance and electricity cost goes down drastically.
When VMs are moved from one piece of hardware to another, they take everything that used to sit on the older hardware, including the data. Thus, VMs are stateful and mutable. You can buy VMs and hypervisors from multiple vendors like VMware, Microsoft, etc. Many free versions are also available for commercial as well as non-commercial use.
Note: If you want to learn virtualisation and hypervisors in detail, watch this amazing video by Eli, the Computer Guy. Most of my understanding about hypervisors and virtualisation comes from here.
VMs offer great flexibility in migrating and multiplying entire operating systems. They make sure that our OS along with its state runs fine irrespective of the environment. What if we wanted the same flexibility with migrating and scaling individual applications? Bam! Containers are the answer.
Containers are for applications what VMs are for operating systems.
This explanation by the very good people at Docker is self-explanatory: A container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings. Containerised software runs the same everywhere, regardless of environment.
This makes for self-contained systems and guarantees that software will always run the same, regardless of where it is deployed. The self-contained nature of containers makes it very easy to run multiple instances of an application or to move it around. When containers are moved from one OS to another, they take only the application logic with them; unlike VMs, they do not take the data. Containers are stateless and immutable, whereas VMs, when migrated, carry the entire state of the OS along with the data layer.
Note: At this point if you are still confused between containers and VMs, check out this article.
Docker images and Docker containers
“Docker image” is another term for a container image. A Docker image is an executable package that includes everything needed to run an application. When an image is executed, a Docker container is created. In other words, a Docker container is a running instance of a Docker image, and we can instantiate multiple containers from any given image.
A Docker Image is the template (application plus required binaries and libraries) needed to build a running Docker Container (the running instance of that image).
As templates, images are what you share to distribute containerised applications. Collections of images are stored and shared in registries like Docker Hub or DigitalOcean’s container registry. When you download an image, you can then use it (as a template) to spin up multiple running containers in your own environment.
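As a quick sketch of this (assuming you have Docker installed and the daemon running), here is how you might pull the public nginx image from Docker Hub and spin up two independent containers from that one image:

```shell
# Download the image (the template) from Docker Hub
docker pull nginx

# Start two independent containers from the same image
docker run -d --name web1 -p 8080:80 nginx
docker run -d --name web2 -p 8081:80 nginx

# List running containers: both web1 and web2 appear,
# each an isolated instance of the single nginx image
docker ps
```

The container names and ports here are arbitrary choices for the example; the point is that one downloaded image can back any number of running containers.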
An image is a set of layers. Each layer corresponds to an instruction in the image’s Dockerfile. All of an image’s layers are read-only; a writable layer is added on top only when a container is started from the image (more on this below).
Docker can build images automatically by reading the instructions in a Dockerfile. A Dockerfile is a text file containing all the commands a user could run on the command line to assemble an image. You can execute all the instructions in a Dockerfile with the docker build command.
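As a sketch, a minimal Dockerfile for a hypothetical Node.js front-end application might look like this (the base image tag, file layout and start script are assumptions for illustration, not a fixed convention):

```Dockerfile
# Start from an official Node.js base image
FROM node:18-alpine

# Set the working directory inside the image
WORKDIR /app

# Each instruction below becomes a layer in the image
COPY package.json package-lock.json ./
RUN npm install
COPY . .

# Command to run when a container is started from this image
CMD ["npm", "start"]
```

Running `docker build -t my-app .` in the directory containing this Dockerfile executes these instructions in order and produces an image tagged my-app.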
Docker applies certain optimisations when several images are pulled onto the same machine. When you use the docker pull command to pull down an image from a registry, each layer is pulled down separately and stored in Docker’s local storage area. When you then pull a second image that shares layers with the previously pulled image, the shared layers are not duplicated; they are reused by both images.
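You can often see this sharing in action (again assuming Docker is available) by pulling two tags of the same official image that are built on a common base. Layers already present in local storage are typically reported as “Already exists” instead of being downloaded again:

```shell
# First pull downloads all of the image's layers
docker pull python:3.12-slim

# Second pull reuses any layers shared with the first image;
# those layers show up as "Already exists" in the output
docker pull python:3.12
```

Exactly which layers are shared depends on how the two tags were built, so treat this as an illustration rather than a guarantee.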
Images are read-only and stateless. A container is a stateful instantiation of an image. The major difference between a container and an image is the top writable layer called the container layer. When you create a new container, you add a new writable layer on top of the underlying layers. All changes made to the running container, such as writing new files, modifying existing files, and deleting files, are written to this writable container layer. When the container is deleted, the writable layer is also deleted. The underlying image remains unchanged.
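A quick way to see the writable container layer for yourself (assuming Docker is available): write a file inside a container, delete the container, and note that the change disappears while the underlying image is untouched:

```shell
# Create a file in the container's writable layer
docker run --name demo alpine sh -c "echo hi > /hello.txt"

# Deleting the container deletes its writable layer, and the file with it
docker rm demo

# A fresh container from the same image has no /hello.txt,
# because the alpine image itself was never modified
docker run --rm alpine ls /hello.txt
```

The last command fails with “No such file or directory”, which is exactly the point: changes live in the container layer, not in the image.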
Note: To understand Containers, Images and Layers in depth, please refer to this amazing article by the good folks at Docker. I understand how the term Layers can be confusing for beginners. My major understanding of Layers and how they interact comes from this article. If you want to work with Docker, Layers are a must-know.
To share an application with other users, or to deploy it in multiple places like staging or production, all you need to do is create a Dockerfile for that application, build an image out of it and share that image. Bam! Your application is ready to run.
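That share-and-deploy workflow can be sketched with the docker CLI like this (the image name my-username/my-app is a placeholder, and this assumes you are logged in to Docker Hub):

```shell
# On your machine: build and tag the image from your Dockerfile
docker build -t my-username/my-app:1.0 .

# Push the image to a registry such as Docker Hub
docker push my-username/my-app:1.0

# On any other machine (staging, production, a teammate's laptop):
docker pull my-username/my-app:1.0
docker run -d my-username/my-app:1.0
```

Every machine that pulls the image gets the same self-contained package, which is what makes the “works on my machine” problem go away.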
That’s a basic introduction to Docker, Virtualisation and Containers. I hope I succeeded in explaining why Docker is such a big deal these days. If you want a hands-on session, please refer to this video tutorial series by LearnCode.academy. It will get you going.
In case of queries, please comment below. This is my first article and I would like to know what needs to be improved and whether you liked this article or not.
If I succeeded in giving a good introduction to Docker, please press the little green heart!