(Day 240) Example for transitioning from Docker to K8s

Ivan Ivanov · August 28, 2024

Hello :) Today is Day 240!

A quick summary of today:

  • attended a career fair in Seoul
  • found a short, nice article about going from just Docker to a container orchestrator (Kubernetes)
  • looking for books

Firstly, about the career fair


It was advertised as the biggest career event for foreigners in Korea, but when I got there about 70% of it was open only to Korean people … It is fair to say I was quite disappointed.

Morgan Stanley

Actually, on the topic of career events, I have been attending some online sessions run by Morgan Stanley in the UK.

image

Above is the schedule. So far 3 sessions have passed and they were all quite good: people from HR, current interns, placement students and graduate students came along and just talked about life at MS (as they call Morgan Stanley).

Docker and Kubernetes

  • Nodes: the worker machines (either physical or virtual) in a Kubernetes cluster that run the applications. Each node contains the components needed to manage and run containers, such as the container runtime (e.g., containerd, or Docker Engine through cri-dockerd)

  • Pods: the smallest deployable units in Kubernetes, consisting of one or more containers that share resources like storage and networking. Pods run on nodes and are the building blocks for applications in a Kubernetes cluster
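To make the "containers that share resources" part concrete, here is a minimal sketch of a Pod manifest built as a plain Python dict and printed as JSON (kubectl accepts JSON as well as YAML). This is not from the article: the pod name, the image tags, the `/data` mount path and the `make_pod` helper are all my own assumptions for illustration. Putting nginx and Redis in one Pod is only there to show sharing, not a recommended layout.

```python
import json


def make_pod(name, containers, shared_volume="shared-data"):
    """Build a minimal Pod manifest (hypothetical helper).

    `containers` is a list of (container_name, image) pairs. Every
    container mounts the same emptyDir volume, which is one way the
    containers in a Pod can share storage.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # One scratch volume that lives as long as the Pod does.
            "volumes": [{"name": shared_volume, "emptyDir": {}}],
            "containers": [
                {
                    "name": container_name,
                    "image": image,
                    "volumeMounts": [
                        {"name": shared_volume, "mountPath": "/data"}
                    ],
                }
                for container_name, image in containers
            ],
        },
    }


# Names and image tags below are assumptions, not taken from the article.
pod = make_pod("demo", [("nginx", "nginx:1.27"), ("redis", "redis:7")])
print(json.dumps(pod, indent=2))
```

Both containers would also share the Pod's network namespace, so they could reach each other over localhost.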

Do I Need cri-dockerd?

A critical aspect of running Docker and Kubernetes together is understanding container runtimes. The article explains the deprecation of dockershim in Kubernetes (it was removed in Kubernetes 1.24) and introduces cri-dockerd, an adapter that keeps Docker Engine compatible with Kubernetes. This section matters most for those who want to continue using Docker Engine with Kubernetes.

The article also covers the benefits of using the local Kubernetes cluster bundled with Docker Desktop. It lets developers test their deployments locally before pushing them to production environments, such as AWS EKS.

The practical example

The code is all in the article, but in short it takes a simple app made of nginx, Redis and a Flask app, first built with docker-compose, and converts it to run on k8s (Kubernetes). The main changes are enabling k8s in Docker Desktop (I did not know that was an option, so next time I need a cluster I will use the one Docker Desktop provides) and creating a bunch of YAML files for the services and deployments.
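The article's actual YAML files are not reproduced here, so the following is only a rough sketch of what a deployment plus service pair looks like, written as Python dicts and dumped to JSON. The names `flask-app` and `flask-app:0.1`, the port 5000 and the helper functions are my assumptions, not the article's code.

```python
import json


def make_deployment(name, image, port, replicas=1):
    """Build a minimal Deployment manifest (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def make_service(name, port):
    """Build a minimal Service that routes to Pods labelled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


# Assumed values: a locally built image tagged flask-app:0.1, served on 5000.
deployment = make_deployment("flask-app", "flask-app:0.1", port=5000)
service = make_service("flask-app", port=5000)

# Wrap both resources in a v1 List so they fit in a single file.
manifest = json.dumps(
    {"apiVersion": "v1", "kind": "List", "items": [deployment, service]},
    indent=2,
)
print(manifest)
```

Saving the output to a file such as `app.json` and running `kubectl apply -f app.json` against the Docker Desktop cluster should create both resources. The nginx and Redis parts would get their own deployment and service in the same way.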

As I have used Docker many times now and am quite comfortable with it, this was a nice and smooth intro to k8s and its usage. I do feel it leans more toward the DevOps role in a company, but it is still something good to be aware of.

Lastly, books

After the career fair (in Seoul), I went to one of the biggest book stores looking for a cool new book. As it is Korea, the availability of Western books is understandably limited, and even when a book was available its price was 2x, 3x or even more than the original. That being said, I almost bought Mathematics for Machine Learning. I know it is freely available from the authors, but having a physical copy feels so good. I did not end up buying it, as the price was too high.

And on my train ride back home, I looked at available books on O’Reilly. I found some good next reads:

Big Data on K8s

  • Leverage Kubernetes in a cloud environment to integrate seamlessly with a variety of tools
  • Explore best practices for optimizing the performance of big data pipelines
  • Build end-to-end data pipelines and discover real-world use cases using popular tools like Spark, Airflow, and Kafka

Databricks Certified Associate Developer for Apache Spark Using Python

  • Understand the fundamentals of Apache Spark to help you design robust and fast Spark applications
  • Delve into various data manipulation components for each phase of your data engineering project
  • Prepare for the certification exam with sample questions and mock exams, and get closer to your goal

High Performance Spark

  • How Spark SQL’s new interfaces improve performance over Spark’s RDD data structure
  • The choice between data joins in Core Spark and Spark SQL
  • Techniques for getting the most out of standard RDD transformations
  • How to work around performance issues in Spark’s key/value pair paradigm
  • Writing high-performance Spark code without Scala or the JVM
  • How to test for functionality and performance when applying suggested improvements
  • Using Spark MLlib and Spark ML machine learning libraries

Actually, there is a 2nd edition being written, so I am saving this one just to remind myself that a newer version will come out later this year.

Streaming Databases

  • Explore stream processing and streaming databases
  • Learn how to build a real-time solution with a streaming database
  • Understand how to construct materialized views from any number of streams
  • Learn how to serve synchronous and asynchronous data
  • Get started building low-complexity streaming solutions with minimal setup

That is all for today!

See you tomorrow :)