<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[OpsInsights]]></title><description><![CDATA[OpsInsights is a blog where we share ideas and the technical glitches we face in our daily work.]]></description><link>https://opsinsights.dev</link><generator>RSS for Node</generator><lastBuildDate>Mon, 20 Apr 2026 18:59:26 GMT</lastBuildDate><atom:link href="https://opsinsights.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[AWS EKS Capabilities: Fully Managed Argo CD — Setup with Terraform]]></title><description><![CDATA[Hello All,
If you've been running Argo CD on EKS, you know the drill — install the helm chart, manage the controllers, handle upgrades, configure SSO, worry about HA, and repeat for every cluster. It']]></description><link>https://opsinsights.dev/aws-eks-capabilities-fully-managed-argo-cd-setup-with-terraform</link><guid isPermaLink="true">https://opsinsights.dev/aws-eks-capabilities-fully-managed-argo-cd-setup-with-terraform</guid><category><![CDATA[aws, kubernetes, argocd, eks, terraform, gitops]]></category><category><![CDATA[AWS]]></category><category><![CDATA[argocd]]></category><category><![CDATA[gitops]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sat, 04 Apr 2026 17:59:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/613ee4f75cf6c3048a1e368b/1bef7ff4-4979-4b97-82eb-27f713ec3568.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello All,</p>
<p>If you've been running Argo CD on EKS, you know the drill — install the helm chart, manage the controllers, handle upgrades, configure SSO, worry about HA, and repeat for every cluster. It's a lot of operational overhead for a tool that's supposed to simplify your life.</p>
<p>AWS recently introduced <strong>EKS Capabilities</strong>, and one of the first capabilities available is a fully managed Argo CD. This means AWS takes care of running, scaling, and upgrading Argo CD controllers in their own service accounts — outside of your cluster. You don't install anything on your worker nodes.</p>
<blockquote>
<p>The Argo CD software runs in the AWS control plane, not on your worker nodes.</p>
</blockquote>
<p>Sounds interesting? Let's walk through the setup using Terraform.</p>
<hr />
<h2>What Changes with EKS Capability for Argo CD?</h2>
<p>Before this, self-managing Argo CD meant:</p>
<ul>
<li><p>Installing and maintaining Argo CD controllers on your cluster</p>
</li>
<li><p>Configuring SSO separately (Dex, OIDC, etc.)</p>
</li>
<li><p>Managing HA, scaling, and upgrades yourself</p>
</li>
<li><p>Setting up IRSA or cross-account IAM for multi-cluster deployments</p>
</li>
</ul>
<p>With EKS Capabilities, all of that is handled by AWS. The key highlights:</p>
<ul>
<li><p>Argo CD runs in AWS-managed service accounts, not your worker nodes</p>
</li>
<li><p>Native integration with <strong>AWS IAM Identity Center</strong> (formerly AWS SSO) for user management — no more Dex configs</p>
</li>
<li><p>Simplified multi-cluster access using EKS Access Entries</p>
</li>
<li><p>Native integrations with ECR, Secrets Manager, and CodeConnections</p>
</li>
</ul>
<blockquote>
<p>User management is completely taken out of your hands and integrated with AWS Identity Center. If you've been struggling with Argo CD RBAC + SSO, this is a relief.</p>
</blockquote>
<hr />
<h2>The Catch with Terraform</h2>
<p>Though the <a href="https://github.com/terraform-aws-modules/terraform-aws-eks">terraform-aws-modules/eks</a> module supports this capability, it doesn't work completely out of the box for a fresh EKS setup. There are a few manual steps and dependencies you need to wire together.</p>
<p>In this blog, I'll walk through the setup and how to get it running quickly.</p>
<hr />
<h2>Step 1: Create the Argo CD Capability</h2>
<p>Using the EKS capability sub-module from <code>terraform-aws-modules/eks</code>:</p>
<pre><code class="language-hcl">module "argocd" {
  source  = "terraform-aws-modules/eks/aws//modules/capability"
  version = "~&gt; 21.16"

  name         = "${local.cluster_name}-argocd"
  cluster_name = module.eks.cluster_name
  type         = "ARGOCD"

  configuration = {
    argo_cd = {
      aws_idc = {
        idc_instance_arn = "arn:aws:sso:::instance/&lt;YOUR_IDC_INSTANCE_ID&gt;"
      }
      namespace = "argocd"
      rbac_role_mapping = [{
        role = "ADMIN"
        identity = [{
          id   = data.aws_identitystore_group.argocd_admin.group_id
          type = "SSO_GROUP"
        }]
      }]
    }
  }

  iam_policy_statements = {
    ECRRead = {
      actions = [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
      ]
      resources = ["*"]
    }
  }

  tags = local.additional_tags
}
</code></pre>
<p>A few things to note here:</p>
<ul>
<li><p><code>idc_instance_arn</code> — this is your AWS Identity Center instance ARN. Make sure Identity Center is configured in your account before proceeding.</p>
</li>
<li><p><code>rbac_role_mapping</code> — maps your Identity Center groups to Argo CD RBAC roles (ADMIN, VIEWER). This replaces the traditional Argo CD RBAC ConfigMap approach.</p>
</li>
<li><p>The <code>iam_policy_statements</code> block grants ECR read access so Argo CD can pull images and Helm charts from your private registries.</p>
</li>
</ul>
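<p>One gap in the snippet above: <code>data.aws_identitystore_group.argocd_admin</code> is referenced but never defined. A minimal sketch of that lookup (the group display name here is a placeholder — use whichever admin group exists in your Identity Center):</p>
<pre><code class="language-hcl">data "aws_ssoadmin_instances" "this" {}

data "aws_identitystore_group" "argocd_admin" {
  identity_store_id = tolist(data.aws_ssoadmin_instances.this.identity_store_ids)[0]

  alternate_identifier {
    unique_attribute {
      attribute_path  = "DisplayName"
      attribute_value = "argocd-admins" # placeholder group name
    }
  }
}
</code></pre>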
<hr />
<h2>Step 2: Configure EKS Access Entry for Argo CD</h2>
<p>This is a critical step. The capability needs cluster-level access to manage deployments. We wire this up using EKS Access Entries:</p>
<pre><code class="language-hcl">resource "aws_eks_access_entry" "argocd" {
  cluster_name  = module.eks.cluster_name
  principal_arn = module.argocd.iam_role_arn
  type          = "STANDARD"
}

resource "aws_eks_access_policy_association" "argocd" {
  cluster_name  = module.eks.cluster_name
  principal_arn = module.argocd.iam_role_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }

  depends_on = [aws_eks_access_entry.argocd]
}
</code></pre>
<blockquote>
<p>The <code>AmazonEKSClusterAdminPolicy</code> provides full cluster-admin access. This is fine for getting started, but for production consider scoping down with custom Kubernetes RBAC bindings.</p>
</blockquote>
<p>If you've read my earlier blog on <a href="https://opsinsights.dev/simplifying-access-entries-in-eks-a-guide">EKS Access Entries</a>, this pattern should feel familiar.</p>
<hr />
<h2>Step 3: Register Your Cluster (Mandatory!)</h2>
<p>Here's where most people get stuck. The Argo CD capability <strong>does not automatically register the local cluster</strong>. You must explicitly register it as a deployment target.</p>
<p>This is done by creating a Kubernetes Secret in the <code>argocd</code> namespace:</p>
<pre><code class="language-yaml">apiVersion: v1
kind: Secret
metadata:
  name: local-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: in-cluster
  server: arn:aws:eks:&lt;REGION&gt;:&lt;ACCOUNT_ID&gt;:cluster/&lt;CLUSTER_NAME&gt;
  project: default
</code></pre>
<p>Points to note:</p>
<ul>
<li><p>Use the <strong>EKS cluster ARN</strong> in the <code>server</code> field, not the Kubernetes API server URL. The managed capability requires ARNs to identify clusters.</p>
</li>
<li><p><code>kubernetes.default.svc</code> is <strong>not supported</strong> here.</p>
</li>
<li><p>This step depends on the access policy association from Step 2 being completed first.</p>
</li>
</ul>
<p>Apply it:</p>
<pre><code class="language-bash">kubectl apply -f local-cluster.yaml
</code></pre>
<p>Once registered, the cluster will show in an <code>Unknown</code> connection state until you create your first application — that's expected behavior.</p>
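<p>If you'd rather keep this registration in Terraform alongside the rest of the stack, a sketch using the <code>kubernetes</code> provider (assuming it's already configured against this cluster) could look like:</p>
<pre><code class="language-hcl">resource "kubernetes_secret_v1" "local_cluster" {
  metadata {
    name      = "local-cluster"
    namespace = "argocd"
    labels = {
      "argocd.argoproj.io/secret-type" = "cluster"
    }
  }

  data = {
    name    = "in-cluster"
    server  = module.eks.cluster_arn # EKS cluster ARN, not the API server URL
    project = "default"
  }

  # The capability can only reach the cluster once the access policy exists
  depends_on = [aws_eks_access_policy_association.argocd]
}
</code></pre>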
<hr />
<h2>Step 4: Deploy Your Applications</h2>
<p>Now you're ready to create Argo CD Applications and Projects. Here's a quick example:</p>
<pre><code class="language-yaml">apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/&lt;YOUR_ORG&gt;/&lt;YOUR_REPO&gt;.git
    targetRevision: HEAD
    path: k8s/
  destination:
    name: in-cluster
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
</code></pre>
<p>Use <code>destination.name</code> with the cluster name you registered (like <code>in-cluster</code>). The <code>destination.server</code> field also works with EKS cluster ARNs, but using names is cleaner.</p>
<hr />
<h2>Things to Keep in Mind</h2>
<ul>
<li><p><strong>AWS Identity Center is mandatory</strong> — local users are not supported. Make sure IDC is configured with the right groups before setting up the capability.</p>
</li>
<li><p><strong>Cluster registration is not automatic</strong> — don't skip Step 3, or your deployments won't have a target.</p>
</li>
<li><p><strong>Access Entry dependency</strong> — the cluster secret registration depends on the access policy being in place. Terraform <code>depends_on</code> is your friend here.</p>
</li>
<li><p><strong>Production RBAC</strong> — scope down from <code>AmazonEKSClusterAdminPolicy</code> to least-privilege custom roles for production workloads.</p>
</li>
</ul>
<hr />
<p>This is a solid step forward from AWS in reducing the operational burden of running GitOps tooling. No more managing Argo CD upgrades, HA configs, or SSO integrations manually. It just works as part of your EKS cluster lifecycle.</p>
<p>Happy GitOps-ing!</p>
<p>References:</p>
<p><a href="https://docs.aws.amazon.com/eks/latest/userguide/argocd.html">https://docs.aws.amazon.com/eks/latest/userguide/argocd.html</a></p>
<p><a href="https://docs.aws.amazon.com/eks/latest/userguide/argocd-register-clusters.html">https://docs.aws.amazon.com/eks/latest/userguide/argocd-register-clusters.html</a></p>
<p><a href="https://aws.amazon.com/blogs/containers/deep-dive-streamlining-gitops-with-amazon-eks-capability-for-argo-cd/">https://aws.amazon.com/blogs/containers/deep-dive-streamlining-gitops-with-amazon-eks-capability-for-argo-cd/</a></p>
<p><a href="https://github.com/terraform-aws-modules/terraform-aws-eks">https://github.com/terraform-aws-modules/terraform-aws-eks</a></p>
]]></content:encoded></item><item><title><![CDATA[CI/CD for Databricks Pipelines with Databricks Asset Bundles (DAB) and Azure DevOps]]></title><description><![CDATA[Introduction
If you've spent any time managing data pipelines in the real world, you know the pain of deploying changes manually — copy-pasting notebooks, praying nothing breaks in prod, and maintaini]]></description><link>https://opsinsights.dev/ci-cd-for-databricks-pipelines-with-databricks-asset-bundles-dab-and-azure-devops</link><guid isPermaLink="true">https://opsinsights.dev/ci-cd-for-databricks-pipelines-with-databricks-asset-bundles-dab-and-azure-devops</guid><category><![CDATA[Databricks]]></category><category><![CDATA[Databricks asset bundles]]></category><category><![CDATA[cicd]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Fri, 20 Feb 2026 10:25:34 GMT</pubDate><enclosure url="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/613ee4f75cf6c3048a1e368b/c8b1a827-9946-4b08-bd76-742559954b7e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>Introduction</strong></h2>
<p>If you've spent any time managing data pipelines in the real world, you know the pain of deploying changes manually — copy-pasting notebooks, praying nothing breaks in prod, and maintaining that one giant script nobody dares to touch. I've been there, and I'll be honest: it's not fun.</p>
<p>At some point, we decided enough was enough. We adopted <strong>Databricks Asset Bundles (DAB)</strong> to bring the same software engineering rigour we apply to application code — version control, code review, automated testing, and environment promotion — to our Databricks workloads.</p>
<p>This post walks through exactly how we built a fully automated CI/CD pipeline using DAB and Azure DevOps, and some of the hard-won lessons we picked up along the way.</p>
<hr />
<h2><strong>What Are Databricks Asset Bundles?</strong></h2>
<p>Databricks Asset Bundles are a YAML-based configuration framework for defining and deploying Databricks resources — <strong>Delta Live Tables (DLT) pipelines</strong>, <strong>jobs/workflows</strong>, <strong>SQL warehouses</strong>, and <strong>permissions</strong> — as code. Think of it as Terraform, but purpose-built for Databricks.</p>
<p>Resources are declared in <code>.yml</code> files and deployed via the <code>databricks</code> CLI. This makes them version-controllable, reviewable, and CI/CD-friendly right out of the box. Once you start using it, going back to manual deployments feels unthinkable.</p>
<hr />
<h2><strong>Our Bundle Structure</strong></h2>
<p>Here's how our project is laid out. The folder structure itself tells a story — environments, jobs, pipelines, and warehouses each have their own space, all orchestrated from a single root bundle file:</p>
<pre><code class="language-plaintext">&lt;your-project&gt;/
├── databricks.yml              # Root bundle definition &amp; targets (dev/test/prod)
├── azure-pipelines.yml         # Azure DevOps CI/CD pipeline
└── resources/
    ├── environments/
    │   ├── dev.yml             # Dev-specific variables (catalog, schema, storage)
    │   └── prod.yml            # Prod-specific variables
    ├── jobs/
    │   ├── wf_&lt;domain_a&gt;.yml   # Workflow: triggers a single DLT pipeline
    │   ├── wf_&lt;domain_b&gt;.yml   # Workflow: orchestrates multiple DLT pipelines
    │   └── wf_&lt;domain_c&gt;.yml   # Workflow: SQL task-based job
    ├── pipelines/
    │   ├── pl_&lt;entity_a&gt;.yml   # DLT: Bronze → Silver → Gold
    │   ├── pl_&lt;entity_b&gt;.yml   # DLT: another domain pipeline
    │   └── ...                 # (one YAML per domain/entity)
    └── sql_warehouses/
        └── cl_sql_xsm_stage.yml  # Serverless SQL warehouse (X-Small)
</code></pre>
<p>One pipeline, one YAML. Clean, predictable, and easy to find what you're looking for.</p>
<hr />
<h2><strong>Multi-Target Configuration</strong></h2>
<p>This is where DAB really earns its keep. Instead of maintaining separate repos or complex branching strategies per environment, we use <strong>targets</strong> — environment-specific configurations all sitting inside a single <code>databricks.yml</code>:</p>
<table style="min-width:100px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>Target</p></th><th><p>Mode</p></th><th><p>Workspace</p></th><th><p>Use Case</p></th></tr><tr><td><p><code>local-user</code></p></td><td><p>development</p></td><td><p>Dev workspace</p></td><td><p>Individual developer sandboxes</p></td></tr><tr><td><p><code>dev</code></p></td><td><p>development</p></td><td><p>Dev workspace</p></td><td><p>Shared dev environment (CI on feature/dev branch)</p></td></tr><tr><td><p><code>test</code></p></td><td><p>development</p></td><td><p>Test workspace</p></td><td><p>QA / pre-production</p></td></tr><tr><td><p><code>prod</code></p></td><td><p>production</p></td><td><p>Prod workspace</p></td><td><p>Live production</p></td></tr></tbody></table>

<p>Each target overrides variables like the Unity Catalog name, data lake schema, and environment label. The same pipeline YAML works across all environments — zero code changes required when promoting between them.</p>
<pre><code class="language-yaml"># databricks.yml (simplified)
targets:
  dev:
    mode: development
    workspace:
      host: https://&lt;your-dev-workspace&gt;.azuredatabricks.net/
    variables:
      pipeline_uc_catalog:
        default: dev_&lt;your_catalog&gt;

  prod:
    mode: production
    workspace:
      host: https://&lt;your-prod-workspace&gt;.azuredatabricks.net/
    variables:
      pipeline_uc_catalog:
        default: prod_&lt;your_catalog&gt;
</code></pre>
<hr />
<h2><strong>Pipeline Resource Design</strong></h2>
<h3><strong>DLT Pipelines</strong></h3>
<p>Each DLT pipeline follows the <strong>Bronze → Silver → Gold</strong> medallion architecture. Common utilities are shared across all pipelines (think a shared loader module and init file). Pipeline-level configs — catalog names, schema names, storage paths — are injected via variables at deploy time, so there's nothing hardcoded:</p>
<pre><code class="language-yaml"># resources/pipelines/pl_&lt;entity&gt;.yml (simplified)
resources:
  pipelines:
    pipeline_pl_&lt;entity&gt;:
      name: pl_&lt;entity&gt;
      configuration:
        pipeline.uc_catalog: ${variables.pipeline_uc_catalog.default}
        pipeline.uc_schema_brz: ${variables.pipeline_uc_schema_brz.default}
      libraries:
        - file: { path: ../../src/code/bronze/&lt;entity&gt;_bronze_dlt.py }
        - file: { path: ../../src/code/silver/&lt;entity&gt;_silver_dlt.py }
        - file: { path: ../../src/code/gold/fct_&lt;entity&gt;_dlt.sql }
      photon: true
      serverless: true
</code></pre>
<h3><strong>Jobs / Workflows</strong></h3>
<p>Jobs wire DLT pipelines together. We use two patterns depending on the use case:</p>
<ul>
<li><p><strong>Pipeline task jobs</strong> — trigger DLT pipelines by resource reference. Good for orchestrating multiple related pipelines in sequence.</p>
</li>
<li><p><strong>SQL task jobs</strong> — run SQL scripts against a SQL warehouse, with conditional full-refresh logic built in.</p>
</li>
</ul>
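<p>As a rough sketch, a pipeline task job that triggers DLT pipelines by bundle resource reference (all names here are placeholders) looks something like:</p>
<pre><code class="language-yaml"># resources/jobs/wf_&lt;domain&gt;.yml (simplified sketch)
resources:
  jobs:
    job_wf_&lt;domain&gt;:
      name: wf_&lt;domain&gt;
      tasks:
        - task_key: run_pl_&lt;entity_a&gt;
          pipeline_task:
            pipeline_id: ${resources.pipelines.pipeline_pl_&lt;entity_a&gt;.id}
        - task_key: run_pl_&lt;entity_b&gt;
          depends_on:
            - task_key: run_pl_&lt;entity_a&gt;
          pipeline_task:
            pipeline_id: ${resources.pipelines.pipeline_pl_&lt;entity_b&gt;.id}
</code></pre>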
<hr />
<h2><strong>The Azure DevOps CI/CD Pipeline</strong></h2>
<p>Our <code>azure-pipelines.yml</code> defines three stages that execute on every push to any of our active branches:</p>
<pre><code class="language-plaintext">Push to branch
      │
      ▼
┌─────────────┐     ┌───────────────────┐     ┌─────────────────────────┐
│  Stage 1    │────▶│     Stage 2       │────▶│       Stage 3           │
│  Test       │     │  Code Quality     │     │  Deploy to Environment  │
│  (pytest)   │     │  (SonarCloud)     │     │  (DAB deploy)           │
└─────────────┘     └───────────────────┘     └─────────────────────────┘
</code></pre>
<h3><strong>Branch → Environment Mapping</strong></h3>
<table style="min-width:75px"><colgroup><col style="min-width:25px"></col><col style="min-width:25px"></col><col style="min-width:25px"></col></colgroup><tbody><tr><th><p>Branch</p></th><th><p>Environment</p></th><th><p>Approval Required</p></th></tr><tr><td><p><code>&lt;dev-branch&gt;</code></p></td><td><p><code>dev</code></p></td><td><p>None</p></td></tr><tr><td><p><code>&lt;test-branch&gt;</code></p></td><td><p><code>test</code></p></td><td><p>Optional</p></td></tr><tr><td><p><code>&lt;main-branch&gt;</code></p></td><td><p><code>prod</code></p></td><td><p><strong>Manual approval</strong></p></td></tr></tbody></table>

<p>Simple and deliberate. Dev merges are instant; production deployments require a human to say yes.</p>
<h3><strong>Authentication</strong></h3>
<p>We use an <strong>Azure Service Principal</strong> with the <code>AzureCLI@2</code> task to obtain a short-lived Databricks access token at deploy time. No secrets baked into YAML, no long-lived credentials lying around:</p>
<pre><code class="language-bash">DATABRICKS_TOKEN=$(az account get-access-token \
  --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d \
  --query "accessToken" -o tsv)
</code></pre>
<p>Short-lived tokens via a service principal are the way to go. It's one less thing to rotate manually.</p>
<h3><strong>Deploy Step</strong></h3>
<pre><code class="language-bash">databricks bundle validate -t $(environment)
databricks bundle deploy -t $(environment) --auto-approve
</code></pre>
<blockquote>
<p><code>--auto-approve</code> is required in non-interactive CI/CD environments to bypass prompts for destructive actions like renaming or deleting resources. Skip it and your pipeline will hang indefinitely — ask me how I know.</p>
</blockquote>
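<p>Putting the pieces together, the deploy stage in <code>azure-pipelines.yml</code> looks roughly like this (the service connection name and <code>environment</code> variable are placeholders for your own setup):</p>
<pre><code class="language-yaml">- stage: Deploy
  jobs:
    - deployment: deploy_bundle
      environment: $(environment) # the prod environment carries the manual approval gate
      strategy:
        runOnce:
          deploy:
            steps:
              - task: AzureCLI@2
                inputs:
                  azureSubscription: "&lt;your-service-connection&gt;"
                  scriptType: bash
                  scriptLocation: inlineScript
                  inlineScript: |
                    export DATABRICKS_TOKEN=$(az account get-access-token \
                      --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d \
                      --query "accessToken" -o tsv)
                    databricks bundle validate -t $(environment)
                    databricks bundle deploy -t $(environment) --auto-approve
</code></pre>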
<hr />
<h2><strong>Key Lessons Learned</strong></h2>
<p>These are the things I wish someone had told me when we started. Saving them here so you don't have to learn them the hard way:</p>
<ol>
<li><p><strong>Name resources without environment suffixes.</strong> Let DAB targets handle environment context — use <code>pl_orders</code>, not <code>pl_dev_orders</code>. The moment you bake the environment into the name, renames will break your CI/CD and your day.</p>
</li>
<li><p><strong>Use variable references consistently.</strong> Never hardcode catalog names or storage accounts inside pipeline YAMLs. Always use <code>${variables.*}</code> so the environment config files are the single source of truth. It pays off every time you add a new target.</p>
</li>
<li><p><code>--auto-approve</code> <strong>is mandatory in CI/CD.</strong> The Databricks CLI will block on any destructive action when there's no interactive terminal. Always include this flag in your deploy command — you'll forget it at least once, and you'll remember it forever after.</p>
</li>
<li><p><strong>Serverless + Photon = simplicity.</strong> Serverless pipelines remove the need to manage cluster sizing in each pipeline config. Less boilerplate, fewer misconfigurations, faster pipelines. It's an easy win.</p>
</li>
<li><p><strong>Event log per pipeline.</strong> Store pipeline event logs in a dedicated schema. It enables centralised monitoring across all DLT pipelines via a single Unity Catalog query — much better than hunting through individual pipeline UIs.</p>
</li>
</ol>
<hr />
<h2><strong>In the next blog</strong></h2>
<p>In the next post, I'll do a technical deep dive into the pipeline code and how we use Unity Catalog and Delta Live Tables to build out the data platform, covering:</p>
<ul>
<li><p>The Azure DevOps pipeline in detail</p>
</li>
<li><p>Deploying to Databricks using Azure DevOps</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[K8s 1.35 Sneak Peek: What's Coming Next?]]></title><description><![CDATA[Kubernetes never sleeps! Just when we get comfortable, a new release brings features that make our lives easier (and our clusters safer). Here is a quick sneak peek at what is landing in Kubernetes 1.35.
Deprecation of ipvs mode
KEP-5495: Deprecate i...]]></description><link>https://opsinsights.dev/k8s-135-sneak-peek-whats-coming-next</link><guid isPermaLink="true">https://opsinsights.dev/k8s-135-sneak-peek-whats-coming-next</guid><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sat, 06 Dec 2025 18:47:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765046808192/36532900-b586-4f23-a7d5-72336be5347b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Kubernetes never sleeps! Just when we get comfortable, a new release brings features that make our lives easier (and our clusters safer). Here is a quick sneak peek at what is landing in <strong>Kubernetes 1.35</strong>.</p>
<h3 id="heading-deprecation-of-ipvs-mode"><strong>Deprecation of ipvs mode</strong></h3>
<p><em>KEP-5495: Deprecate ipvs mode in kube-proxy</em></p>
<p>Since the <code>iptables</code> scalability issues are being solved using <code>nftables</code>, <code>ipvs</code> mode is essentially redundant. It's time to say goodbye to the complexity of IPVS. If you want the gritty details on why <code>iptables</code> needed a replacement, <a target="_blank" href="https://isovalent.com/blog/post/why-replace-iptables-with-ebpf/">check out this deep dive</a>.</p>
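<p>In practice, dropping IPVS means moving to the <code>nftables</code> backend, which is a one-line change in the kube-proxy configuration:</p>
<pre><code class="lang-yaml">apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: nftables # was: ipvs
</code></pre>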
<h3 id="heading-node-declared-features"><strong>Node Declared Features</strong></h3>
<p><em>KEP-4033: Discover cgroup driver from CRI</em></p>
<p>This solves the "does this node actually support what I need?" problem without manual labeling. Nodes can now self-report specific capabilities (like hardware or plugins) directly to the control plane via the API. This allows the scheduler to automatically place Pods on compatible nodes without you having to manually manage feature labels!</p>
<pre><code class="lang-yaml">status:
  declaredFeatures:
    example.com/gpu: "true" # Node says: "I explicitly support this!"
</code></pre>
<h3 id="heading-in-place-update-of-pod-resources"><strong>In-Place Update of Pod Resources</strong></h3>
<p><em>KEP-1287: In-place Update of Pod Resources</em></p>
<p>This has been in the works for a while, but it's a game changer! Vertical Pod Autoscaling (VPA) can finally resize CPU and Memory limits <em>without restarting the Pod</em>. No more disruptions just to give a hungry container a bit more RAM.</p>
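<p>For example, you can bump a running Pod's limits through the <code>resize</code> subresource (a sketch — pod and container names are placeholders):</p>
<pre><code class="lang-bash">kubectl patch pod my-app --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"limits":{"memory":"1Gi"}}}]}}'
</code></pre>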
<h3 id="heading-numeric-values-for-taints"><strong>Numeric Values for Taints</strong></h3>
<p><em>KEP-5471: Enable SLA-based Scheduling</em></p>
<p>Taints used to be boolean (it exists or it doesn't). Now, we get math!</p>
<p><strong>Before:</strong> <code>taint key=HighPriority:NoSchedule</code> (You either match "HighPriority" exactly, or you don't).</p>
<p><strong>After:</strong> <code>taint reliability=999:NoSchedule</code> Pod Toleration: <code>operator: Gt, value: 950</code> <em>(The Pod says: "I can only tolerate nodes with reliability &gt; 950" — much more expressive!)</em></p>
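<p>Assuming the syntax proposed in the KEP, the matching Pod toleration would look like:</p>
<pre><code class="lang-yaml">tolerations:
  - key: reliability
    operator: Gt # numeric comparison — tolerate any value greater than 950
    value: "950"
    effect: NoSchedule
</code></pre>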
<h3 id="heading-user-namespaces"><strong>User Namespaces</strong></h3>
<p><em>KEP-127: User Namespaces</em></p>
<p>This is a massive security upgrade for running containers safely.</p>
<p><strong>Before:</strong> Container <code>root</code> (UID 0) == Host <code>root</code> (UID 0). <em>(If they break out of the container, they are root on your server. Scary!)</em></p>
<p><strong>After:</strong> Container <code>root</code> (UID 0) == Host <code>nobody</code> (UID 65534). <em>(They are root inside, but powerless outside. Much better.)</em></p>
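<p>Opting in is a single field on the Pod spec:</p>
<pre><code class="lang-yaml">spec:
  hostUsers: false # run this Pod in its own user namespace
</code></pre>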
<h3 id="heading-oci-images-as-volumes"><strong>OCI Images as Volumes</strong></h3>
<p><em>KEP-4639: OCI Volume Source</em></p>
<p>Stop building massive images just to include data!</p>
<p><strong>Before:</strong> InitContainer runs <code>wget https://...</code> -&gt; saves to <code>emptyDir</code> -&gt; MainContainer reads it. <em>(Slow, complex startup scripts, wasted bandwidth.)</em></p>
<p><strong>After:</strong></p>
<pre><code class="lang-yaml">volumes:
  - name: data
    image:
      reference: "my-data-layer:latest" # Mounts directly!
</code></pre>
<hr />
<p>That's a wrap for the 1.35 highlights!</p>
<p>If you found this useful, feel free to reach out to me on LinkedIn. Let's discuss what feature you are most excited about!</p>
<p><strong>For more detailed Reading:</strong></p>
<p>1.35 Release notes: <a target="_blank" href="https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.35.md">https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.35.md</a></p>
<p><a target="_blank" href="https://isovalent.com/blog/post/why-replace-iptables-with-ebpf/">https://isovalent.com/blog/post/why-replace-iptables-with-ebpf/</a></p>
]]></content:encoded></item><item><title><![CDATA[Tired of Your LLM Using Outdated Terraform Docs? There's a Fix for That! 😉]]></title><description><![CDATA[Hello !
You ask your favorite AI assistant to whip up some Terraform code for a new module, and it confidently spits out syntax that was deprecated six months ago. 🤦‍♂️ While LLMs are fantastic for boosting productivity, they often rely on their tra...]]></description><link>https://opsinsights.dev/tired-of-your-llm-using-outdated-terraform-docs-theres-a-fix-for-that</link><guid isPermaLink="true">https://opsinsights.dev/tired-of-your-llm-using-outdated-terraform-docs-theres-a-fix-for-that</guid><category><![CDATA[Terraform]]></category><category><![CDATA[mcp]]></category><category><![CDATA[mcp server]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Wed, 20 Aug 2025 00:48:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755650721522/6bfca64a-295a-4639-ba7e-20b0eed35fe0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<h2 id="heading-hello">Hello!</h2>
<p>You ask your favorite AI assistant to whip up some Terraform code for a new module, and it confidently spits out syntax that was deprecated six months ago. 🤦‍♂️ While LLMs are fantastic for boosting productivity, they often rely on their training data, which can be a frozen snapshot in time. This means they miss out on the latest features and best practices.</p>
<p>I've been testing a new tool from HashiCorp that directly tackles this problem: the <code>terraform-mcp-server</code>. It essentially gives your LLM a direct, live line to the official Terraform documentation, ensuring the code it generates is accurate and up-to-date.</p>
<p>For my setup, I use <strong>Podman</strong> on my local machine, but you can use Docker or any container tool you prefer. And my go-to co-programmer for everything these days is <a target="_blank" href="https://cline.bot/">Cline</a>, which integrates with tools like this seamlessly.</p>
<hr />
<h3 id="heading-why-is-the-terraform-mcp-server-a-game-changer">Why is the Terraform MCP Server a Game-Changer?</h3>
<p>The core issue is that LLMs, by default, refer to the data they were trained on. To make them more effective, we can give them "tools" they can use to fetch live information. The <code>terraform-mcp-server</code> acts as this tool, an independent agent with a specific set of capabilities.</p>
<p>Think of it as giving your AI assistant a toolkit specifically for Terraform. Instead of guessing, it can now <em>ask</em> questions and get real-time answers. The tools it gets access to include:</p>
<ul>
<li><p><code>get_latest_module_version</code></p>
</li>
<li><p><code>get_latest_provider_version</code></p>
</li>
<li><p><code>get_module_details</code></p>
</li>
<li><p><code>get_policy_details</code></p>
</li>
<li><p><code>get_provider_details</code></p>
</li>
<li><p><code>search_modules</code></p>
</li>
<li><p><code>search_policies</code></p>
</li>
</ul>
<p>With these capabilities, the LLM can perform a thorough analysis before generating a single line of code. When I tested it on a GCP VPC module, it correctly used all the newest features without any hallucinated or outdated arguments. It was a beautiful thing to see!</p>
<hr />
<h3 id="heading-the-setup-is-super-simple">The Setup is Super Simple</h3>
<p>Getting this up and running is incredibly straightforward.</p>
<ol>
<li><p>First, clone the repository from GitHub:</p>
 <pre><code class="lang-bash"> git clone https://github.com/hashicorp/terraform-mcp-server.git
</code></pre>
</li>
<li><p>Next, build the container image. I'm using <code>podman</code>, but <code>docker build</code> works just the same.</p>
 <pre><code class="lang-bash"> cd terraform-mcp-server
 podman build -t terraform-mcp-server:dev .
</code></pre>
</li>
<li><p>Finally, run the server!</p>
 <pre><code class="lang-bash"> podman run -p 8080:8080 --rm terraform-mcp-server:dev
</code></pre>
<p> <em>Note:</em> if you want the server reachable over the published port, add <code>-e TRANSPORT_MODE=streamable-http -e TRANSPORT_HOST=0.0.0.0</code>; for stdio-based integrations like the Cline config later in this post, neither the env vars nor the port mapping are needed.</p>
</li>
</ol>
<p>That's it! The server is now running and ready to accept requests from your AI tool.</p>
<hr />
<h3 id="heading-integrating-with-an-ai-assistant-like-cline">Integrating with an AI Assistant like Cline</h3>
<p>To get my assistant, Cline, to use this new tool, I just needed to add a simple configuration. This tells Cline how to run the <code>terraform-mcp-server</code> whenever it needs to access Terraform information.</p>
<p>Here is the JSON configuration I used:</p>
<pre><code class="lang-json">{
  "mcpServers": {
    "terraform": {
      "command": "podman",
      "args": [
        "run",
        "-i",
        "--rm",
        "terraform-mcp-server:dev"
      ],
      "autoApprove": [
        "list_tools"
      ]
    }
  }
}
</code></pre>
<p>Here is the generated code, for reference: <a target="_blank" href="https://github.com/jothimanikrish/terraform-ai-gcp-vpc/tree/main/gcp-vpc-module">https://github.com/jothimanikrish/terraform-ai-gcp-vpc/tree/main/gcp-vpc-module</a></p>
<p>This configuration defines a tool named <code>terraform</code>, specifies the <code>podman</code> command to execute it, and automatically approves the initial tool listing.</p>
<p>This <code>terraform-mcp-server</code> is a fantastic solution for anyone using LLMs in their Infrastructure as Code workflow. It bridges the gap between the static knowledge of the model and the dynamic, ever-evolving world of Terraform. No more copy-pasting docs or second-guessing AI-generated code.</p>
<p>Happy Terraforming!</p>
]]></content:encoded></item><item><title><![CDATA[Goodbye to iptables: A Quick Dive into GKE's Dataplane V2]]></title><description><![CDATA[Have you heard of Cilium and its eBPF-based networking magic for Kubernetes? If you're already a fan, then you're going to like what Google has been up to with GKE Dataplane V2.
GKE Dataplane V2 is built directly on top of Cilium, bringing the power ...]]></description><link>https://opsinsights.dev/goodbye-to-iptables-a-quick-dive-into-gkes-dataplane-v2</link><guid isPermaLink="true">https://opsinsights.dev/goodbye-to-iptables-a-quick-dive-into-gkes-dataplane-v2</guid><category><![CDATA[gke]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[GCP]]></category><category><![CDATA[GCP DevOps]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Thu, 17 Jul 2025 07:18:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752736621462/f3a02a42-1e0c-4baa-be15-6f80ee5dd2d9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you heard of Cilium and its eBPF-based networking magic for Kubernetes? If you're already a fan, then you're going to like what Google has been up to with GKE Dataplane V2.</p>
<p>GKE Dataplane V2 is built directly on top of Cilium, bringing the power of eBPF to Google's managed Kubernetes service.</p>
<p>Sounds cool, right? Let's dive into a few exciting things that Dataplane V2 brings to the table.</p>
<h2 id="heading-whats-the-big-deal">What's the Big Deal?</h2>
<p>If you've been using GKE for a while, you're probably familiar with the standard networking stack, which used Calico for Network Policy and kube-proxy (with iptables) for service routing. Dataplane V2 changes the game completely.</p>
<h2 id="heading-here-are-the-highlights">Here are the highlights:</h2>
<ul>
<li><p><strong>Default on Autopilot:</strong> To make things easy, Dataplane V2 is enabled by default for all new GKE Autopilot clusters. You might already be using it without knowing!</p>
</li>
<li><p><strong>No More kube-proxy! That's right.</strong> GKE Dataplane V2 completely removes kube-proxy and its complex web of iptables rules. This means service routing is handled far more efficiently by eBPF, leading to better performance and scalability.</p>
</li>
<li><p><strong>Built-in Security:</strong> Security is now a first-class citizen. You don't need a third-party tool like Calico just to enforce Network Policies. You can enable policy enforcement with a single click in the GKE console or a simple flag in your cluster config.</p>
</li>
</ul>
<ul>
<li><p><strong>Network Policy Logging:</strong> Ever wondered if a connection was allowed or denied by a Network Policy? Dataplane V2 has built-in logging for this. You can configure a simple CRD on your cluster to get detailed logs, which is a massive help for debugging and security audits.</p>
</li>
<li><p><strong>Real-time Network Visibility:</strong> Thanks to its eBPF foundation, you get much deeper, real-time visibility into the network traffic flowing between your pods.</p>
</li>
</ul>
<h2 id="heading-the-specs-by-the-numbers"><strong>The Specs:</strong> By the Numbers</h2>
<p>Dataplane V2 isn't just about features; it's built for scale.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Specification</td><td>Limit on Dataplane V2</td></tr>
</thead>
<tbody>
<tr>
<td>Number of nodes per cluster</td><td>7,500</td></tr>
<tr>
<td>Number of Pods per cluster</td><td>200,000</td></tr>
<tr>
<td>Number of Pods behind one Service</td><td>10,000</td></tr>
<tr>
<td>Number of Cluster IP Services</td><td>10,000</td></tr>
<tr>
<td>Number of LoadBalancer Services per cluster</td><td>750</td></tr>
</tbody>
</table>
</div>
<h2 id="heading-things-to-keep-in-mind-the-limitations"><strong>Things to Keep in Mind (The Limitations)</strong></h2>
<p>As with any powerful technology, there are a few things you should be aware of before jumping in:</p>
<ul>
<li><p><strong>Creation Time Only:</strong> Dataplane V2 can only be enabled when you create a new GKE cluster. You can't migrate an existing cluster to it on the fly, so plan accordingly.</p>
</li>
<li><p><strong>eBPF Map Limits:</strong> GKE Dataplane V2 relies on eBPF maps, which are limited to 260,000 endpoints across all services. An "endpoint" here is a single Pod backing a Service.</p>
</li>
<li><p><strong>Missing kube-proxy Features:</strong> Since kube-proxy is gone, you might miss some specific metrics or behaviors you were used to. It's a new paradigm, so some old debugging habits might need to change.</p>
</li>
<li><p><strong>Update Your GKE Version:</strong> To get the most out of Dataplane V2 and all its features without limitations, you'll want to be on a recent GKE version (the docs often recommend 1.31+ for the latest enhancements).</p>
</li>
</ul>
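<p>That 260,000-endpoint ceiling is worth sanity-checking against your own topology. Here's a rough back-of-the-envelope helper (my own sketch, not an official formula) that simply sums the Pods backing each Service:</p>

```python
# Dataplane V2 caps eBPF map entries at 260,000 service endpoints
# cluster-wide, where one endpoint = one Pod backing one Service.
EBPF_ENDPOINT_LIMIT = 260_000

def endpoints_fit(pods_per_service: dict) -> bool:
    """Rough capacity check: total service endpoints vs. the eBPF map limit."""
    total = sum(pods_per_service.values())
    return total <= EBPF_ENDPOINT_LIMIT

# Hypothetical cluster layout: 500 Services averaging 400 backing Pods each.
layout = {f"svc-{i}": 400 for i in range(500)}
print(endpoints_fit(layout))  # 200,000 endpoints -> True, fits under the limit
```
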
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<p>GKE Dataplane V2 is more than just an update; it's a fundamental shift in how Kubernetes networking is handled in GKE. By leveraging Cilium and eBPF, it offers:</p>
<ul>
<li><p>Increased performance and scalability by removing kube-proxy.</p>
</li>
<li><p>Simplified and integrated security with built-in Network Policies.</p>
</li>
<li><p>Better observability with features like policy logging.</p>
</li>
</ul>
<p>It's a powerful and efficient foundation for your modern applications running on GKE.</p>
]]></content:encoded></item><item><title><![CDATA[Kubernetes Vertical Pod Autoscaler: A Deep Dive into Right-Sizing Your Applications]]></title><description><![CDATA[Hello All,
Like many of you, I was incredibly excited about the VPA Beta release in Kubernetes 1.33 (which you can read about here). I've since taken the time to understand the current VPA release as of July 2025 in more detail.
Let's dive into VPA i...]]></description><link>https://opsinsights.dev/kubernetes-vertical-pod-autoscaler-a-deep-dive-into-right-sizing-your-applications</link><guid isPermaLink="true">https://opsinsights.dev/kubernetes-vertical-pod-autoscaler-a-deep-dive-into-right-sizing-your-applications</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[VPA]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sun, 29 Jun 2025 13:40:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1751204304697/3a82a6ab-de2c-4422-a259-ee3b9b1f76d9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<p>Hello All,</p>
<p>Like many of you, I was incredibly excited about the <strong>VPA Beta release in Kubernetes 1.33</strong> (which you can read about <a target="_blank" href="https://kubernetes.io/blog/2025/04/23/kubernetes-v1-33-release/#beta-in-place-resource-resize-for-vertical-scaling-of-pods">here</a>). I've since taken the time to understand the current VPA release as of July 2025 in more detail.</p>
<p>Let's dive into VPA in this blog post.</p>
<hr />
<h2 id="heading-the-critical-question-every-devops-engineer-asks">The Critical Question Every DevOps Engineer Asks</h2>
<p>I have critical workloads in production that cannot be scaled horizontally, and the replica count is always set to 1. <strong>Should I use VPA directly to scale them vertically?</strong></p>
<p>The answer is <strong>NO.</strong></p>
<p>Why? Let's first understand VPA in detail, and then I'll share my experience on why you should be cautious.</p>
<hr />
<h2 id="heading-what-is-vertical-pod-autoscaler">What is Vertical Pod Autoscaler?</h2>
<p><strong>Vertical Pod Autoscaler (VPA)</strong> automatically adjusts the CPU and memory reservations for your pods to help "right-size" your applications. Unlike <strong>Horizontal Pod Autoscaler (HPA)</strong>, which scales the number of replicas, VPA scales the resources allocated to existing pods.</p>
<p>Think of it this way: HPA says, "I need more workers," while VPA says, "I need stronger workers."</p>
<hr />
<h2 id="heading-installation">Installation</h2>
<p>Installation is straightforward. Refer to the official guide <a target="_blank" href="https://github.com/kubernetes/autoscaler/tree/9f87b78df0f1d6e142234bb32e8acbd71295585a/vertical-pod-autoscaler">here</a>.</p>
<pre><code class="lang-bash">git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler/
./hack/vpa-up.sh
</code></pre>
<hr />
<h2 id="heading-vpa-operating-modes">VPA Operating Modes</h2>
<p>VPA can operate in four different modes:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Mode</td><td>Description</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Auto</strong></td><td>Currently, this means <strong>Recreate</strong>. This might change to in-place updates in the future.</td></tr>
<tr>
<td><strong>Recreate</strong></td><td>The VPA assigns resource requests on pod creation and also updates them on existing pods by evicting them when the requested resources differ significantly from the new recommendation.</td></tr>
<tr>
<td><strong>Initial</strong></td><td>The VPA only assigns resource requests on pod creation and never changes them later.</td></tr>
<tr>
<td><strong>Off</strong></td><td>The VPA does not automatically change the resource requirements of the pods. The recommendations are calculated and can be inspected in the VPA object.</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-container-resize-policies">Container Resize Policies</h2>
<p>With Kubernetes 1.33, you can now define how containers should handle resource changes:</p>
<pre><code class="lang-yaml">spec:
  containers:
    - name: my-app
      image: my-app:latest
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired # apply directly to running container
        - resourceName: memory
          restartPolicy: RestartContainer # apply and restart to take effect
</code></pre>
<p>This is a game-changer! Finally, we can control whether a container needs to restart when resources are updated.</p>
<hr />
<h2 id="heading-real-world-example">Real-World Example</h2>
<p>Let me share a practical example. Here's how I set up VPA for a monitoring application:</p>
<pre><code class="lang-yaml">apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: monitoring-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: monitoring-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: monitoring-app
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 1
          memory: 1Gi
        controlledResources: ["cpu", "memory"]
</code></pre>
<hr />
<h2 id="heading-where-can-vpa-be-used">Where Can VPA Be Used?</h2>
<h3 id="heading-1-analyzing-current-workload-patterns-off-mode">1. Analyzing Current Workload Patterns (Off Mode)</h3>
<p>Start with <strong>Off</strong> mode to understand your application's resource consumption patterns:</p>
<pre><code class="lang-yaml">spec:
  updatePolicy:
    updateMode: "Off"
</code></pre>
<p>This mode is perfect for:</p>
<ul>
<li><p>Understanding resource usage patterns</p>
</li>
<li><p>Getting recommendations without any changes</p>
</li>
<li><p>Planning capacity for new applications</p>
</li>
<li><p>Pairing with Prometheus-compatible metrics for deeper analysis</p>
</li>
</ul>
<hr />
<h2 id="heading-limitations-and-gotchas">Limitations and Gotchas</h2>
<p><strong>Important:</strong> As of Kubernetes 1.33, VPA still has some limitations you need to be aware of:</p>
<h3 id="heading-1-pod-disruption">1. Pod Disruption</h3>
<p>In <strong>Recreate</strong> and <strong>Auto</strong> modes, VPA will terminate and recreate pods. This means downtime for single-replica applications. This is why I said <strong>NO</strong> to using it directly on critical production workloads!</p>
<h3 id="heading-2-compatibility-issues">2. Compatibility Issues</h3>
<p>VPA and HPA cannot target the same metrics (CPU/Memory). You can use them together, but HPA should target custom metrics.</p>
<h3 id="heading-3-resource-limits">3. Resource Limits</h3>
<p>VPA doesn't set resource limits, only requests. You need to set limits manually or use LimitRanges.</p>
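<p>If you manage limits yourself, one simple convention is to preserve each container's original request-to-limit ratio whenever the request changes. A small sketch of that idea — the helper is mine, not part of VPA:</p>

```python
def scaled_limit(original_request: float, original_limit: float,
                 new_request: float) -> float:
    """Derive a new limit that preserves the original request:limit ratio.

    All values must share one unit (e.g. millicores or MiB). This is a
    convention sketch, not behaviour guaranteed by the VPA API.
    """
    ratio = original_limit / original_request
    return new_request * ratio

# A Pod originally requested 200m CPU with a 400m limit (ratio 2.0).
# VPA recommends a 500m request; keep the limit at 2x the request.
print(scaled_limit(200, 400, 500))  # 1000.0
```
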
<hr />
<h2 id="heading-my-production-strategy">My Production Strategy</h2>
<p>Here's how I approach VPA in production environments:</p>
<h3 id="heading-observation-off-mode">Observation (Off Mode)</h3>
<pre><code class="lang-yaml">updateMode: "Off"
</code></pre>
<p>Run it for 2-4 weeks to gather data and understand your workload's patterns, then act on the recommendations.</p>
<hr />
<h2 id="heading-best-practices">Best Practices</h2>
<ol>
<li><p><strong>Always set resource bounds:</strong></p>
<pre><code class="lang-yaml"> minAllowed:
   cpu: 100m
   memory: 128Mi
 maxAllowed:
   cpu: 2
   memory: 4Gi
</code></pre>
</li>
<li><p><strong>Use PodDisruptionBudgets:</strong></p>
<pre><code class="lang-yaml"> apiVersion: policy/v1
 kind: PodDisruptionBudget
 metadata:
   name: my-app-pdb
 spec:
   minAvailable: 1
   selector:
     matchLabels:
       app: my-app
</code></pre>
</li>
<li><p><strong>Monitor VPA recommendations:</strong></p>
<pre><code class="lang-bash"> kubectl describe vpa my-app-vpa
</code></pre>
</li>
</ol>
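<p>If you'd rather consume recommendations programmatically than eyeball <code>kubectl describe</code> output, they live under <code>status.recommendation.containerRecommendations</code> in the VPA object. A minimal sketch, assuming you've already fetched the VPA as JSON (e.g. <code>kubectl get vpa my-app-vpa -o json</code>) and parsed it into a dict:</p>

```python
def recommended_targets(vpa_object: dict) -> dict:
    """Map container name -> recommended 'target' resources from a VPA object.

    Expects the dict form of a VerticalPodAutoscaler, e.g. the parsed
    output of `kubectl get vpa my-app-vpa -o json`.
    """
    recs = (vpa_object.get("status", {})
                      .get("recommendation", {})
                      .get("containerRecommendations", []))
    return {r["containerName"]: r.get("target", {}) for r in recs}

# Trimmed example of what the API returns after running in Off mode.
vpa = {
    "status": {
        "recommendation": {
            "containerRecommendations": [
                {"containerName": "monitoring-app",
                 "target": {"cpu": "250m", "memory": "512Mi"}},
            ]
        }
    }
}
print(recommended_targets(vpa))
```
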
<hr />
<h2 id="heading-the-future-is-bright">The Future is Bright</h2>
<p>With in-place resource updates coming to Kubernetes, VPA will become much more production-ready. The ability to update resources without pod restarts will be a game-changer for critical workloads.</p>
<p>Until then, use VPA wisely:</p>
<ul>
<li><p>Start with <strong>Off</strong> mode for analysis.</p>
</li>
<li><p>Use <strong>Initial</strong> mode for new deployments.</p>
</li>
<li><p>Be cautious with <strong>Auto</strong> mode on critical applications.</p>
</li>
</ul>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>VPA is a powerful tool, but like any powerful tool, it needs to be used with care and understanding. Don't rush into production with <strong>Auto</strong> mode on critical workloads. Take time to understand your application's behavior, set proper bounds, and gradually roll it out.</p>
<p>Remember: <strong>With great power comes great responsibility!</strong></p>
<p>Have you tried VPA in your environment? Share your experiences in the comments below.</p>
<p>Happy scaling! 🚀</p>
]]></content:encoded></item><item><title><![CDATA[Building a ChatOps AI Bot with LangChain and LLMs in Slack]]></title><description><![CDATA[Intro
When it comes to managing operations, wouldn't it be great to have a more intuitive, convenient way to execute tasks, delegate work, and empower teams for self-service? That's what ChatOps brings to the table.
It's not just a trend—it's a parad...]]></description><link>https://opsinsights.dev/building-a-chatops-ai-bot-with-langchain-and-llms-in-slack</link><guid isPermaLink="true">https://opsinsights.dev/building-a-chatops-ai-bot-with-langchain-and-llms-in-slack</guid><category><![CDATA[generative ai]]></category><category><![CDATA[slack-bot]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sat, 18 Jan 2025 14:27:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737209744170/5cb138c6-b1f0-4295-adfe-b3e6df285136.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<h3 id="heading-intro">Intro</h3>
<p>When it comes to managing operations, wouldn't it be great to have a more intuitive, convenient way to execute tasks, delegate work, and empower teams for self-service? That's what ChatOps brings to the table.</p>
<p>It's not just a trend—it's a paradigm shift in how we interact with our systems. While bots themselves aren't new, we're about to revolutionize how we interact with them, making the experience smarter, more intuitive, and deeply conversational.</p>
<p>Today, I'm going to walk you through how I built a ChatOps bot using <strong>Slack</strong>, <strong>LangChain</strong>, and an <strong>LLM</strong> of my choice, all wrapped in Python.</p>
<h3 id="heading-why-chatops">Why ChatOps?</h3>
<p>ChatOps is all about making operations more accessible and collaborative. By integrating operations directly into Slack (or any chat platform), it becomes:</p>
<ol>
<li><p><strong>Intuitive</strong>: Users interact in natural language rather than remembering complex commands.</p>
</li>
<li><p><strong>Convenient</strong>: Operations happen where your team is already communicating.</p>
</li>
<li><p><strong>Empowering</strong>: Enables <strong>delegation</strong> and <strong>self-service</strong>, minimizing reliance on admins for every minor operation.</p>
</li>
<li><p><strong>Automated</strong>: Offloads repetitive tasks, leaving your team free to focus on higher-level decisions.</p>
</li>
</ol>
<h3 id="heading-chatops-with-llm">ChatOps with LLM</h3>
<p>Traditionally, ChatOps bots like <strong>ErrBot</strong> work by triggering commands—users invoke specific commands directly. For example, typing <code>!status</code> would return the status of a service.</p>
<p>But what if we could make it smarter and more conversational? Instead of explicitly typing exact commands, we can leverage <strong>LLM</strong> to make our bot understand and execute intent, even if the phrasing isn't an exact match. This is where <strong>LangChain</strong> comes in.</p>
<h3 id="heading-langchain">LangChain</h3>
<p>LangChain is a powerful framework that simplifies the integration of <strong>LLMs (large language models)</strong> into real-world applications. It manages components like querying the LLM, parsing responses, and chaining together operations to make it seamless for developers like us. LangChain enables us to easily adapt traditional bots like ErrBot, injecting them with the intelligence of an LLM—while still keeping operations under tight control.</p>
<h3 id="heading-defining-my-chatops-framework">Defining My ChatOps Framework</h3>
<p>For any good operational framework, a few essentials are non-negotiable:</p>
<ol>
<li><p><strong>RBAC (Role-Based Access Control)</strong>: Define <strong>who</strong> can trigger which commands and limit the actions they can perform.</p>
</li>
<li><p><strong>Channel Control</strong>: Specify <strong>where the bot can operate</strong>—not all commands should work in every channel for safety reasons.</p>
</li>
<li><p><strong>Command Rules</strong>: Decide <strong>what tasks</strong> the bot can execute.</p>
</li>
</ol>
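<p>All three guardrails can be expressed as plain data that is checked before any command runs. A minimal sketch — the roles, channels, and commands below are hypothetical placeholders:</p>

```python
# Hypothetical policy table: which roles may run which commands, and where.
POLICY = {
    "status":          {"roles": {"dev", "sre"}, "channels": {"#ops", "#dev"}},
    "approve_release": {"roles": {"sre"},        "channels": {"#ops"}},
}

def is_allowed(command: str, role: str, channel: str) -> bool:
    """RBAC + channel control: both checks must pass for a known command."""
    rule = POLICY.get(command)
    if rule is None:  # command rules: unknown commands never run
        return False
    return role in rule["roles"] and channel in rule["channels"]

print(is_allowed("approve_release", "sre", "#ops"))   # True
print(is_allowed("approve_release", "dev", "#ops"))   # False
```
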
<p>In my case, I’m using <strong>ErrBot</strong> as the underlying framework for simplicity and layering <strong>LangChain</strong> on top of it to add conversational intelligence.</p>
<h3 id="heading-features-my-bot-supports">Features My Bot Supports</h3>
<p>Before introducing LangChain to the mix, my bot already supported the following commands:</p>
<ul>
<li><p><strong>help</strong>: Lists all the available commands.</p>
</li>
<li><p><strong>status</strong>: Returns the current status of a service or environment.</p>
</li>
<li><p><strong>approve_release</strong>: fetches the pending releases and prompts for approval</p>
</li>
</ul>
<p>These commands work perfectly, but they lack the flexibility of natural language interactions. For example:</p>
<ul>
<li>Instead of typing <code>!status</code>, wouldn’t it be nice to type “What’s the system status?” and have the bot figure out the intent?</li>
</ul>
<h3 id="heading-architecture-the-big-idea">Architecture: The Big Idea</h3>
<p>Here’s how I designed the bot after integrating LangChain:</p>
<ol>
<li><p><strong>Intent Classifier</strong><br /> The heart of the system! It takes the user’s prompt, runs it through an LLM (via LangChain), and extracts the intent. For example:</p>
<ul>
<li><p>Input: “Can you check on the system?”</p>
</li>
<li><p>Result: Intent is classified as <code>status</code>.</p>
</li>
</ul>
</li>
</ol>
<p>    The intent classifier strips down the natural language input into actionable commands for ErrBot. This ensures that only valid intents trigger actions, reducing the risk of LLM hallucinations or misfires. It's contextual and scoped, making it both powerful and safe.</p>
<ol start="2">
<li><p><strong>LLM Integration with JSON Responses</strong><br /> The LangChain model responds with structured JSON, making it easy to decode:</p>
<pre><code class="lang-json"> {
     <span class="hljs-attr">"intent"</span>: <span class="hljs-string">"status"</span>,
     <span class="hljs-attr">"parameters"</span>: {}
 }
</code></pre>
<p> By relying on this structured response, we can map intents to specific bot commands confidently.</p>
</li>
<li><p><strong>Mapping Intent to Commands</strong><br /> Once the intent is identified, the bot triggers the appropriate command handler. It’s a simple and modular pipeline that feels scalable and robust: - User types natural language. - LLM determines intent. - The bot executes the mapped command.</p>
</li>
</ol>
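<p>The final mapping step can be as simple as a dispatch table, which is also what keeps the bot deterministic: anything outside the table is refused. A stripped-down sketch (the handlers are illustrative stand-ins for real ErrBot commands):</p>

```python
from typing import Callable, Dict, Optional

def handle_status(params: dict) -> str:
    return "All services healthy."  # stand-in for a real status check

def handle_approve_release(params: dict) -> str:
    return f"Pending releases fetched for {params.get('env', 'prod')}."

# Only intents listed here can ever be executed.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "status": handle_status,
    "approve_release": handle_approve_release,
}

def dispatch(intent_data: Optional[dict], min_confidence: float = 0.6) -> str:
    """Route a classifier result ({intent, confidence, extracted_params})
    to its command handler, refusing anything below the confidence bar."""
    if not intent_data or intent_data.get("confidence", 0) < min_confidence:
        return "Sorry, I couldn't match that to a command."
    handler = HANDLERS.get(intent_data.get("intent"))
    if handler is None:
        return "Sorry, I couldn't match that to a command."
    return handler(intent_data.get("extracted_params") or {})

print(dispatch({"intent": "status", "confidence": 0.9}))  # All services healthy.
```
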
<h3 id="heading-why-this-approach-works">Why This Approach Works</h3>
<p>This architecture minimizes <strong>AI hallucination</strong> by restricting the context in which the LLM operates. The intent classifier ensures the bot doesn’t go rogue—it only triggers predefined, purposeful commands. This guards your ChatOps setup and makes AI-powered automation more predictable.</p>
<h3 id="heading-code-example">Code Example</h3>
<p>Here’s a simplified Python example showing how this would work using LangChain and Slack as the frontend:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional, Dict, Any
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">from</span> langchain.prompts <span class="hljs-keyword">import</span> PromptTemplate
<span class="hljs-keyword">from</span> langchain.schema.runnable <span class="hljs-keyword">import</span> RunnablePassthrough
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> logging

logger = logging.getLogger(__name__)

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">IntentClassifier</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, api_key: str, api_base: str, config: dict</span>):</span>
        <span class="hljs-string">"""Initialize the intent classifier with LangChain"""</span>
        self.llm = OpenAI(
            model=<span class="hljs-string">"claude-3-5-sonnet-latest"</span>,
            temperature=<span class="hljs-number">0.3</span>,  <span class="hljs-comment"># Lower temperature for more consistent results</span>
            openai_api_key=api_key,
            openai_api_base=api_base
        )

        <span class="hljs-comment"># Get commands from config</span>
        self.commands = config.get(<span class="hljs-string">"bot"</span>, {}).get(<span class="hljs-string">"commands"</span>, {})

        <span class="hljs-comment"># Build intent descriptions from commands</span>
        intent_descriptions = []
        <span class="hljs-keyword">for</span> cmd_name, cmd_config <span class="hljs-keyword">in</span> self.commands.items():
            <span class="hljs-keyword">if</span> cmd_config.get(<span class="hljs-string">"enabled"</span>, <span class="hljs-literal">True</span>):
                description = cmd_config.get(<span class="hljs-string">"description"</span>, <span class="hljs-string">""</span>)
                aliases = cmd_config.get(<span class="hljs-string">"aliases"</span>, [])
                alias_text = <span class="hljs-string">f" (aliases: <span class="hljs-subst">{<span class="hljs-string">', '</span>.join(aliases)}</span>)"</span> <span class="hljs-keyword">if</span> aliases <span class="hljs-keyword">else</span> <span class="hljs-string">""</span>
                intent_descriptions.append(<span class="hljs-string">f"- <span class="hljs-subst">{cmd_name}</span>: <span class="hljs-subst">{description}</span><span class="hljs-subst">{alias_text}</span>"</span>)

        logger.debug(<span class="hljs-string">f"Loaded <span class="hljs-subst">{len(intent_descriptions)}</span> commands for intent matching"</span>)

        <span class="hljs-comment"># Define the classification prompt template</span>
        template = <span class="hljs-string">"""
        You are a Slack bot designed to classify user messages into predefined intents. 
        Your task is to analyze the user message and match it to the closest predefined intent or respond if no match is found.

        Predefined intents include:
        {commands}

        User message: {message}

        Return a JSON object with:
        - intent: The matched intent name or null if no match
        - confidence: Confidence score between 0 and 1
        - extracted_params: Any parameters extracted from the message (optional)        

        Important rules:
        1. Only return intents from the predefined list above
        2. Consider command aliases when matching intents
        3. Extract any relevant parameters mentioned in the message
        4. Return null if no intent matches with high confidence (below 0.6)
        5. Ensure the response is valid JSON with no additional text
        6. Be generous with confidence scores when the intent is clear
        """</span>

        self.prompt = PromptTemplate(
            input_variables=[<span class="hljs-string">"message"</span>, <span class="hljs-string">"commands"</span>],
            template=template
        )

        self.commands_str = <span class="hljs-string">"\n"</span>.join(intent_descriptions)

        <span class="hljs-comment"># Create a runnable chain</span>
        self.chain = (
            {<span class="hljs-string">"message"</span>: RunnablePassthrough(), <span class="hljs-string">"commands"</span>: <span class="hljs-keyword">lambda</span> _: self.commands_str}
            | self.prompt
            | self.llm
        )

        logger.info(<span class="hljs-string">"Intent classifier initialized successfully"</span>)

    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">classify_message</span>(<span class="hljs-params">self, message: str</span>) -&gt; Optional[Dict[str, Any]]:</span>
        <span class="hljs-string">"""Classify a message and return the intent details"""</span>
        <span class="hljs-keyword">try</span>:
            logger.debug(<span class="hljs-string">f"Classifying message: <span class="hljs-subst">{message}</span>"</span>)

            <span class="hljs-comment"># Run the classification chain</span>
            result = <span class="hljs-keyword">await</span> self.chain.ainvoke(message)
            logger.debug(<span class="hljs-string">f"Raw classification result: <span class="hljs-subst">{result}</span>"</span>)

            <span class="hljs-comment"># Parse the JSON response</span>
            intent_data = json.loads(result)

            <span class="hljs-comment"># Validate the response format</span>
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> isinstance(intent_data, dict):
                logger.error(<span class="hljs-string">f"Invalid classification response format: <span class="hljs-subst">{result}</span>"</span>)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

            required_fields = [<span class="hljs-string">"intent"</span>, <span class="hljs-string">"confidence"</span>]
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> all(field <span class="hljs-keyword">in</span> intent_data <span class="hljs-keyword">for</span> field <span class="hljs-keyword">in</span> required_fields):
                logger.error(<span class="hljs-string">f"Missing required fields in classification: <span class="hljs-subst">{result}</span>"</span>)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

            <span class="hljs-comment"># Log the classification result</span>
            intent = intent_data.get(<span class="hljs-string">"intent"</span>)
            confidence = intent_data.get(<span class="hljs-string">"confidence"</span>, <span class="hljs-number">0</span>)
            params = intent_data.get(<span class="hljs-string">"extracted_params"</span>, {})
            logger.info(
                <span class="hljs-string">f"Classification result - Intent: <span class="hljs-subst">{intent}</span>, "</span>
                <span class="hljs-string">f"Confidence: <span class="hljs-subst">{confidence}</span>, Params: <span class="hljs-subst">{params}</span>"</span>
            )

            <span class="hljs-keyword">return</span> intent_data

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Error classifying message: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_validate_confidence</span>(<span class="hljs-params">self, confidence: float</span>) -&gt; bool:</span>
        <span class="hljs-string">"""Validate confidence score is between 0 and 1"""</span>
        <span class="hljs-keyword">return</span> isinstance(confidence, (int, float)) <span class="hljs-keyword">and</span> <span class="hljs-number">0</span> &lt;= confidence &lt;= <span class="hljs-number">1</span>
</code></pre>
<p>In this example:</p>
<ol>
<li><p>The LLM determines intent based on my input.</p>
</li>
<li><p>Based on the intent (<code>status</code>, <code>echo</code>), the bot routes the input to the relevant handler.</p>
</li>
<li><p>Only predefined commands are triggered, keeping the system secure and deterministic.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737095279497/747cef76-ee05-4157-8614-c52886245c59.png" alt class="image--center mx-auto" /></p>
<p>Thank you !!!</p>
]]></content:encoded></item><item><title><![CDATA[Simplifying Access Entries in EKS: A Guide]]></title><description><![CDATA[EKS has introduced a new set of controls (Apologies I am late to explain this :p) for authentication and authorization, effectively integrating IAM principles with Kubernetes RBAC—a seamless and robust integration.
Accessing the cluster involves thre...]]></description><link>https://opsinsights.dev/simplifying-access-entries-in-eks-a-guide</link><guid isPermaLink="true">https://opsinsights.dev/simplifying-access-entries-in-eks-a-guide</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EKS]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Thu, 02 May 2024 15:23:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714663324689/db3e371a-acb5-4d35-a8b8-b55916edf081.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>EKS has introduced a new set of controls (Apologies I am late to explain this :p) for authentication and authorization, effectively integrating IAM principles with Kubernetes RBAC—a seamless and robust integration.</p>
<p>Accessing the cluster involves three types:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<th>Authentication mode</th><th>Where access is managed</th></tr>
</thead>
<tbody>
<tr>
<td><code>ConfigMap</code> only (<code>CONFIG_MAP</code>)</td><td><code>aws-auth</code> <code>ConfigMap</code> (legacy)</td></tr>
<tr>
<td>EKS API and <code>ConfigMap</code> (<code>API_AND_CONFIG_MAP</code>)</td><td>access entries in the EKS API, AWS Command Line Interface, AWS SDKs, AWS CloudFormation, and AWS Management Console, plus the <code>aws-auth</code> <code>ConfigMap</code></td></tr>
<tr>
<td>EKS API only (<code>API</code>)</td><td>access entries in the EKS API, AWS Command Line Interface, AWS SDKs, AWS CloudFormation, and AWS Management Console</td></tr>
</tbody>
</table>
</div><p>Let us first understand how this worked before, using the <code>aws-auth</code> <code>ConfigMap</code> (the <code>CONFIG_MAP</code> mode).</p>
<h3 id="heading-the-aws-auth-configmap-deprecated">The <code>aws-auth</code> ConfigMap <em>(deprecated)</em></h3>
<p>This process involves mapping AWS IAM identities, including users, groups, and roles, to Kubernetes role-based access control (RBAC) for authorization.</p>
<p><strong>Challenges and pain points:</strong></p>
<p>Ideally, this configuration should be managed internally within the cluster. After provisioning the cluster, it is necessary to establish a configuration that facilitates the relationship between IAM and the Kubernetes system.</p>
<p>Although eksctl can be used for this setup, it's important to note that not all clusters are created with eksctl. Thus, alternative methods need to be available for those who do not use this tool.</p>
<h3 id="heading-why-i-love-eks-access-entries-simplicity">Why I love EKS Access Entries? Simplicity!</h3>
<p>You do not need to learn anything new; you simply map existing IAM principals to Kubernetes permissions.</p>
<p>IAM principals can be mapped to any of the predefined EKS access policies. Below is the mapping between EKS access policies and Kubernetes' default RBAC roles:</p>
<ul>
<li><p><a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/access-entries.html#access-policy-permissions-AmazonEKSClusterAdminPolicy">AmazonEKSClusterAdminPolicy</a>: <strong>cluster-admin in kubernetes</strong></p>
</li>
<li><p><a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/access-entries.html#access-policy-permissions-AmazonEKSAdminPolicy">AmazonEKSAdminPolicy</a>: <strong>admin</strong></p>
</li>
<li><p>AmazonEKSAdminViewPolicy: (<code>get</code>, <code>list</code>, <code>watch</code> across all APIs and resources)</p>
</li>
<li><p>AmazonEKSEditPolicy: <strong>edit</strong></p>
</li>
<li><p>AmazonEKSViewPolicy: <strong>view</strong></p>
</li>
</ul>
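<p>To see the full, current list of access policies and their ARNs, the EKS API exposes them directly (a standard AWS CLI command; it requires EKS read permissions):</p>

```shell
# Lists managed access policies such as AmazonEKSViewPolicy
# together with their ARNs (arn:aws:eks::aws:cluster-access-policy/...)
aws eks list-access-policies
```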
<p>Excited?</p>
<h3 id="heading-how-to-enable-the-access-entries-api">How to enable the access entries API?</h3>
<p>eksctl:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">accessConfig:</span> 
    <span class="hljs-attr">authenticationMode:</span> <span class="hljs-string">API_AND_CONFIG_MAP</span> <span class="hljs-comment"># or API, or CONFIG_MAP</span>
</code></pre>
<p>Terraform:</p>
<pre><code class="lang-hcl">authentication_mode = "API_AND_CONFIG_MAP"
</code></pre>
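<p>In context, that Terraform flag sits inside the <code>access_config</code> block of the <code>aws_eks_cluster</code> resource. The sketch below assumes AWS provider v5.33 or later; the cluster name, role, and subnets are placeholders:</p>

```hcl
resource "aws_eks_cluster" "demo" {
  name     = "eks-demo-cluster"
  role_arn = aws_iam_role.cluster.arn # placeholder cluster role

  access_config {
    # Valid values: API, API_AND_CONFIG_MAP, CONFIG_MAP
    authentication_mode = "API_AND_CONFIG_MAP"
  }

  vpc_config {
    subnet_ids = var.subnet_ids # placeholder subnet list
  }
}
```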
<p>Cluster administrators now have the capability to grant AWS IAM principals access to Amazon EKS clusters and Kubernetes objects across all supported versions (version 1.23 and later).</p>
<p>Will the ConfigMap method be removed soon? The <code>aws-auth</code> <code>ConfigMap</code> is already marked as deprecated, so plan to move to access entries.</p>
<p>Access Entries:</p>
<p><img src="https://d2908q01vomqb2.cloudfront.net/fe2ef495a1152561572949784c16bf23abb28057/2023/11/16/Workflow.png" alt /></p>
<p><em>Image ref:</em> <a target="_blank" href="https://aws.amazon.com/blogs/containers/a-deep-dive-into-simplified-amazon-eks-access-management-controls/"><em>https://aws.amazon.com/blogs/containers/a-deep-dive-into-simplified-amazon-eks-access-management-controls/</em></a></p>
<p>If you are already using the <code>aws-auth</code> ConfigMap, <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/migrating-access-entries.html">here</a> is the migration guide for an easier access-management life.</p>
<p>Here is an example of Terraform configuration for associating your IAM users and roles with the respective EKS access:</p>
<p>Step 1 - create access entry</p>
<pre><code class="lang-hcl"><span class="hljs-string">resource</span> <span class="hljs-string">"aws_eks_access_entry"</span> <span class="hljs-string">"readonly"</span> {
  <span class="hljs-string">cluster_name</span>      <span class="hljs-string">=</span> <span class="hljs-string">"eks-demo-cluster"</span>
  <span class="hljs-string">principal_arn</span>     <span class="hljs-string">=</span> <span class="hljs-string">aws_iam_role.example.arn</span> <span class="hljs-comment"># IAM principal (role or user) ARN</span>
  <span class="hljs-string">kubernetes_groups</span> <span class="hljs-string">=</span> []
  <span class="hljs-string">type</span>              <span class="hljs-string">=</span> <span class="hljs-string">"STANDARD"</span>
}
</code></pre>
<p>Step 2 - associate policy to it.</p>
<pre><code class="lang-hcl"><span class="hljs-string">resource</span> <span class="hljs-string">"aws_eks_access_policy_association"</span> <span class="hljs-string">"readonly"</span> {
  <span class="hljs-string">cluster_name</span>  <span class="hljs-string">=</span> <span class="hljs-string">"eks-demo-cluster"</span>
  <span class="hljs-string">policy_arn</span>    <span class="hljs-string">=</span> <span class="hljs-string">"arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"</span>
  <span class="hljs-string">principal_arn</span> <span class="hljs-string">=</span> <span class="hljs-string">aws_eks_access_entry.readonly.principal_arn</span> <span class="hljs-comment"># same principal as the access entry</span>

  <span class="hljs-string">access_scope</span> {
    <span class="hljs-string">type</span>       <span class="hljs-string">=</span> <span class="hljs-string">"namespace"</span>
    <span class="hljs-string">namespaces</span> <span class="hljs-string">=</span> [<span class="hljs-string">"example-namespace"</span>]
  }
}
</code></pre>
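<p>After <code>terraform apply</code>, you can verify the entry and its policy association with the AWS CLI (standard <code>aws eks</code> commands; the account ID below is a placeholder):</p>

```shell
aws eks list-access-entries --cluster-name eks-demo-cluster
aws eks list-associated-access-policies \
  --cluster-name eks-demo-cluster \
  --principal-arn arn:aws:iam::111122223333:user/example
```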
<hr />
<p>Thank you!</p>
<p>If you have any questions, feel free to reach out to me on LinkedIn, and let's discuss further.</p>
<p>Detailed References:</p>
<p><a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/grant-k8s-access.html">https://docs.aws.amazon.com/eks/latest/userguide/grant-k8s-access.html</a></p>
<p><a target="_blank" href="https://aws.github.io/aws-eks-best-practices/security/docs/iam/#the-aws-auth-configmap-deprecated">https://aws.github.io/aws-eks-best-practices/security/docs/iam/#the-aws-auth-configmap-deprecated</a></p>
]]></content:encoded></item><item><title><![CDATA[Exploring the Newest S3 Bucket Innovations from AWS re:Invent 2023]]></title><description><![CDATA[Amazon S3 is often the starting point for anyone embarking on their AWS cloud journey, and it was my initial experience with AWS as well.
Since its inception in 2006, S3 has undergone significant evolution. From offering a range of bucket classificat...]]></description><link>https://opsinsights.dev/exploring-the-newest-s3-bucket</link><guid isPermaLink="true">https://opsinsights.dev/exploring-the-newest-s3-bucket</guid><category><![CDATA[s3-directory]]></category><category><![CDATA[AWS]]></category><category><![CDATA[S3]]></category><category><![CDATA[reInvent]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sat, 27 Jan 2024 17:34:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1706376502082/cf86013c-a1f8-4a65-b0d7-09ca49e2ef22.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>A</strong>mazon S3 is often the starting point for anyone embarking on their AWS cloud journey, and it was my initial experience with AWS as well.</p>
<p>Since its inception in 2006, S3 has undergone significant evolution. From offering a range of storage classes like Standard and Glacier to adding zonal replication options and versatile access policies, the advancements have been noteworthy.</p>
<p>One of the more recent additions is the concept of mount points for S3, which further enhances its functionality. </p>
<blockquote>
<p>You can learn more about this development here: <a target="_blank" href="https://aws.amazon.com/about-aws/whats-new/2023/03/mountpoint-amazon-s3/">[Mountpoint for Amazon S3]</a></p>
</blockquote>
<p>Another exciting feature recently introduced is <strong><em>"Directory Buckets."</em></strong> This addition to the S3 lineup offers even more flexibility and options for managing cloud storage efficiently.</p>
<h2 id="heading-what-is-a-directory-bucket">What is a Directory Bucket?</h2>
<p>S3 has now been categorized into two main types:</p>
<p>1. General Purpose Buckets</p>
<p>2. Directory Buckets</p>
<p>Let's explore what Directory Buckets are all about.</p>
<h3 id="heading-key-advantages">Key advantages:</h3>
<p>This advanced storage class stands out with three key features:</p>
<ul>
<li><p>Single-digit millisecond first-byte latency for compute-intensive and latency-sensitive applications</p></li>
<li><p>Consistent performance that eliminates tail latencies, driving down query times</p></li>
<li><p>Data access speeds up to 10x faster, and request costs up to 50% lower, than S3 Standard</p></li>
</ul>
<p><strong>Points to consider:</strong></p>
<ul>
<li><p>These buckets follow a strict naming convention</p>
<ul>
<li><strong><mark>Base-name</mark></strong>--azid--x-s3</li>
</ul>
</li>
</ul>
<p>More details here: <a target="_blank" href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-bucket-naming-rules.html">https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-bucket-naming-rules.html</a></p>
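<p>As a quick illustration of the rule, a name is valid only if it embeds the Availability Zone ID and ends with the <code>--x-s3</code> suffix. Here is a rough check I sketched; it is not an official validator, and the regex simplifies the base-name rules (see the naming-rules page above for the full constraints):</p>

```python
import re

# Directory bucket names look like: <base-name>--<zone-id>--x-s3,
# e.g. my-data--use1-az5--x-s3 (base-name rules simplified here)
PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]{1,43}--[a-z0-9-]+--x-s3$")

def is_directory_bucket_name(name: str) -> bool:
    """Return True if name roughly matches the directory-bucket format."""
    return bool(PATTERN.match(name))

print(is_directory_bucket_name("my-data--use1-az5--x-s3"))  # True
print(is_directory_bucket_name("my-data"))                  # False
```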
<p><img src="https://lh7-eu.googleusercontent.com/JrJFEkmpDiJ3ET22-S87Qi4_Pd_aHCgSNCDOHGc_avZ8OsJWWAIfAEiljVpO7boGkoKQ7dkpzZUekxQEROCzMPuZbH-fD407jeB66AEwj5oJRigz-rAiVwyBpDvd6NqgVP_JZ-upysXC4Gqu5xHAjxE" alt /></p>
<p><strong>Points to Consider during design:</strong></p>
<ul>
<li><p>Directory buckets use single-AZ (One Zone) storage, so an Availability Zone outage can mean data unavailability or loss</p>
</li>
<li><p>Directory buckets store data across multiple devices within a single Availability Zone</p>
</li>
</ul>
<p>Thank you!!!</p>
]]></content:encoded></item><item><title><![CDATA[Elevate Your Kubernetes Skills: Key Insights for Acing CKA and CKS]]></title><description><![CDATA[Achieving certification in the Certified Kubernetes Administrator (CKA) and Certified Kubernetes Security Specialist (CKS).

We're all familiar with the syllabus and structure available on the official pages. In this post, I'll be sharing a fast-trac...]]></description><link>https://opsinsights.dev/key-insights-for-acing-cka-and-cks</link><guid isPermaLink="true">https://opsinsights.dev/key-insights-for-acing-cka-and-cks</guid><category><![CDATA[cka]]></category><category><![CDATA[CKS]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Mon, 22 Jan 2024 19:40:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705952362024/9e47f937-98f8-4328-b7cd-b5136a765e55.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Achieving certification in the Certified Kubernetes Administrator (CKA) and Certified Kubernetes Security Specialist (CKS).</p>
<ul>
<li>We're all familiar with the syllabus and structure available on the official pages. In this post, I'll be sharing a fast-track method to attain the certification, assuming you have around two years of experience in Kubernetes.</li>
</ul>
<h3 id="heading-maximizing-efficiency-in-cks-exam-preparation-emphasis-on-practice-and-speed"><strong>Maximizing Efficiency in CKS Exam Preparation: Emphasis on Practice and Speed</strong></h3>
<p>When preparing for the Certified Kubernetes Security (CKS) or Certified Kubernetes Administrator (CKA) exam, two key elements play a crucial role in your success: practice and speed.</p>
<p>It's essential to familiarize yourself with imperative commands, which can significantly streamline your workflow. Keeping a handy list of these commands can be a game-changer. Here are a few critical commands, among others, to remember:</p>
<ol>
<li><p><strong>Creating Secrets Quickly</strong>: Instead of dealing with the complexities of YAML definitions and base64 encoding, use the <code>create secret</code> command for swift creation. For example:</p>
<pre><code class="lang-yaml"> <span class="hljs-string">kubectl</span> <span class="hljs-string">create</span> <span class="hljs-string">secret</span> <span class="hljs-string">generic</span> <span class="hljs-string">app-creds</span> <span class="hljs-string">--from-literal=username=admin</span> <span class="hljs-string">--from-literal=password=admin</span>
</code></pre>
<p> This command allows you to swiftly set up secrets, a fundamental aspect of Kubernetes security.</p>
</li>
<li><p><strong>Creating a Service Account</strong>: Simplify the creation of service accounts with this command:</p>
<pre><code class="lang-yaml"> <span class="hljs-string">kubectl</span> <span class="hljs-string">create</span> <span class="hljs-string">serviceaccount</span> <span class="hljs-string">readonly-sa</span> <span class="hljs-string">--namespace=dev</span>
</code></pre>
<p> This is an efficient way to manage access within different namespaces.</p>
</li>
<li><p><strong>Defining Roles</strong>: Establish roles easily with:</p>
<pre><code class="lang-yaml"> <span class="hljs-string">kubectl</span> <span class="hljs-string">create</span> <span class="hljs-string">role</span> <span class="hljs-string">dev-role</span> <span class="hljs-string">--namespace=dev</span> <span class="hljs-string">--verb=get</span> <span class="hljs-string">--resource=pods</span>
</code></pre>
<p> This command helps in setting up roles that define what actions are permitted on specific resources.</p>
</li>
<li><p><strong>Setting Up Role Bindings</strong>: To link roles to service accounts, use:</p>
<pre><code class="lang-plaintext"> kubectl create rolebinding dev-rolebinding --namespace=dev --role=dev-role --serviceaccount=dev:readonly-sa
</code></pre>
</li>
</ol>
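<p>One more speed tip that pairs with the commands above: nearly every imperative command accepts <code>--dry-run=client -o yaml</code>, which emits a manifest skeleton you can edit instead of writing YAML from scratch:</p>

```shell
# Generate manifests without creating the objects
kubectl create deployment web --image=nginx --replicas=3 \
  --dry-run=client -o yaml > deployment.yaml
kubectl run tmp --image=busybox --dry-run=client -o yaml > pod.yaml
```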
<p>Other keys to keep handy,</p>
<ul>
<li><p>Creation of Persistent Volumes (PV)</p>
</li>
<li><p>Creation of Persistent Volume Claims (PVC)</p>
</li>
<li><p>Creation and management of Pods</p>
</li>
</ul>
<h3 id="heading-effective-documentation-utilization">Effective Documentation Utilization:</h3>
<p>For instance, consider using the official Kubernetes documentation at <a target="_blank" href="https://kubernetes.io/docs/home/">https://kubernetes.io/docs/home/</a>.</p>
<p>It's crucial to develop proficiency in locating specific information, such as steps for upgrading a cluster.</p>
<p>When searching for "upgrade," ensure that the sources you refer to are from the official Kubernetes documentation (URLs containing '<a target="_blank" href="http://kubernetes.io">kubernetes.io</a>'), rather than relying on discussions or forum pages. This approach guarantees accurate and authoritative information.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1705951543160/82128447-4a9b-452d-91c1-a19faa09fd58.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-study-materials-and-resources-utilized-for-exam-preparation">Study Materials and Resources Utilized for Exam Preparation</h3>
<p><strong><em>CKA:</em></strong></p>
<ul>
<li><p>Kodekloud</p>
</li>
<li><p>practices test from <a target="_blank" href="http://killer.sh/">http://killer.sh/</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/kodekloudhub/certified-kubernetes-administrator-course/tree/master/docs">https://github.com/kodekloudhub/certified-kubernetes-administrator-course/tree/master/docs</a></p>
</li>
</ul>
<p><strong><em>CKS:</em></strong></p>
<ul>
<li><p>practices test from <a target="_blank" href="http://killer.sh/">http://killer.sh/</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/kodekloudhub/certified-kubernetes-security-specialist-cks-course/tree/main/docs">https://github.com/kodekloudhub/certified-kubernetes-security-specialist-cks-course/tree/main/docs</a></p>
</li>
<li><p><a target="_blank" href="https://kodekloud.com/courses/certified-kubernetes-security-specialist-cks/">https://kodekloud.com/courses/certified-kubernetes-security-specialist-cks/</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/ramanagali/Interview_Guide/blob/main/CKS_Preparation_Guide.md">https://github.com/ramanagali/Interview_Guide/blob/main/CKS_Preparation_Guide.md</a></p>
</li>
</ul>
<p>Good Luck!</p>
]]></content:encoded></item><item><title><![CDATA[My Journey to the HashiCorp Certified Terraform Associate Exam]]></title><description><![CDATA[Hashicorp Certified Terraform Associate - HCTA
This isn't just another typical blog post rehashing the HCTA exam details. We won't be covering the syllabus or exam structure that you can easily find on Hashicorp's official website.
In this post, I'll...]]></description><link>https://opsinsights.dev/my-journey-to-the-hashicorp-certified-terraform-associate-exam</link><guid isPermaLink="true">https://opsinsights.dev/my-journey-to-the-hashicorp-certified-terraform-associate-exam</guid><category><![CDATA[terraform certification]]></category><category><![CDATA[Certification]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sun, 12 Nov 2023 13:30:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1699795527951/50f210e2-8c66-4656-8536-6a507a37a46d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-hashicorp-certified-terraform-associate-hcta">Hashicorp Certified Terraform Associate - HCTA</h3>
<p>This isn't just another typical blog post rehashing the HCTA exam details. We won't be covering the syllabus or exam structure that you can easily find on Hashicorp's official website.</p>
<p><code>In this post, I'll be detailing my journey to successfully passing the HCTA, focusing on the strategies and preparations I undertook.</code></p>
<p><strong>Prerequisites:</strong></p>
<blockquote>
<p>The key is to gain as much practical experience as possible.</p>
</blockquote>
<p>The more challenges you encounter and overcome in your hands-on practice, the smoother your exam experience will be.</p>
<p>Although the exam consists of multiple-choice questions, some of them are designed to test your practical knowledge to such an extent that you're likely to answer correctly only if you have real-world experience with the scenarios presented.</p>
<h3 id="heading-key-preparation-strategies-and-challenges"><strong>Key Preparation Strategies and Challenges</strong></h3>
<blockquote>
<p>Understanding the nuances of Terraform commands and their differences is crucial.</p>
<ul>
<li><p>terraform state (all sub commands)</p>
</li>
<li><p>terraform show</p>
</li>
<li><p>terraform output</p>
</li>
</ul>
</blockquote>
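<p>A quick reference for those subcommands (standard Terraform CLI; the resource address and file name are placeholders — run these against a scratch project, not production state):</p>

```shell
terraform state list                      # addresses tracked in state
terraform state show aws_s3_bucket.demo   # attributes of one resource
terraform state mv SRC DEST               # move/rename an address
terraform state pull                      # stream remote state to stdout
terraform state push local.tfstate        # upload a local state file
terraform show                            # readable view of state/plan
terraform output                          # root module output values
```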
<ul>
<li><p>Some of the tricky questions include - <code>terraform push or terraform state push</code></p>
</li>
<li><p>After moving the state file to another backend, what should you do?</p>
</li>
</ul>
<p><code>terraform init or terraform state push</code></p>
<ul>
<li>These questions cannot be answered from theory and white papers alone; you need hands-on experience.</li>
</ul>
<p><code>Terraform count vs for_each?</code></p>
<ul>
<li>Remote backends and available remote backends</li>
</ul>
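<p>For the <code>count</code> vs <code>for_each</code> question, a side-by-side sketch makes the difference memorable (illustrative resources: <code>count</code> addresses instances by index, <code>for_each</code> by key):</p>

```hcl
variable "user_names" {
  type    = list(string)
  default = ["alice", "bob"]
}

# count: removing "alice" shifts every later index, recreating resources
resource "aws_iam_user" "by_count" {
  count = length(var.user_names)
  name  = var.user_names[count.index]
}

# for_each: each instance is keyed by name, so the set can change safely
resource "aws_iam_user" "by_key" {
  for_each = toset(var.user_names)
  name     = each.value
}
```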
<h3 id="heading-how-to-prepare-effectively"><strong>How to Prepare Effectively?</strong></h3>
<ol>
<li><p><strong>Enroll in a Comprehensive HCTA Exam Preparation Course:</strong> This will provide a structured learning path.</p>
</li>
<li><p><strong>Undertake a Small Project Covering All Terraform Use Cases:</strong> It's vital not just to know but to understand the use of all Terraform commands thoroughly.</p>
</li>
<li><p><strong>Mock Exams:</strong> When you feel ready, test your knowledge with mock exams. There are numerous resources available for this.</p>
</li>
</ol>
<blockquote>
<p>Please remember: Theory + Hands-on + Mock exams = HCTA Certified</p>
</blockquote>
<p><em>Good Luck!</em></p>
<p><code>References:</code></p>
<p><em>Course I used for Preparation:</em> <a target="_blank" href="https://www.udemy.com/course/terraform-beginner-to-advanced/">https://www.udemy.com/course/terraform-beginner-to-advanced/</a></p>
<p><em>Exam practice:</em> <a target="_blank" href="https://www.udemy.com/course/terraform-associate-practice-exam">https://www.udemy.com/course/terraform-associate-practice-exam</a></p>
]]></content:encoded></item><item><title><![CDATA[Scaling Stateful Applications in Kubernetes: EKS + EFS.]]></title><description><![CDATA[Motivation:
Kubernetes is widely recognized for its ability to manage containerized applications at scale. However, when it comes to managing stateful applications, certain considerations must be addressed. This article explores the challenges faced ...]]></description><link>https://opsinsights.dev/scaling-stateful-applications-in-kubernetes-eks-efs</link><guid isPermaLink="true">https://opsinsights.dev/scaling-stateful-applications-in-kubernetes-eks-efs</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[EKS]]></category><category><![CDATA[Amazon EFS]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Tue, 30 May 2023 11:31:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1685446113454/78b233d2-0aac-404c-8bac-4b5c0acabb52.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-motivation">Motivation:</h3>
<p><strong>Kubernetes</strong> is widely recognized for its ability to manage containerized applications at scale. However, when it comes to managing stateful applications, certain considerations must be addressed. This article explores the challenges faced in scaling stateful applications and presents solutions for seamless scalability.</p>
<p><em>Let's</em> consider a scenario where we have an application consisting of two microservices that communicate and require a shared volume to support specific processing operations.</p>
<p>Additionally, the data generated by these services needs to be retained for further analysis.</p>
<h3 id="heading-requirement">Requirement:</h3>
<p>To ensure scalability among worker nodes, it is necessary to implement a solution that meets the following criteria:</p>
<ul>
<li>The pods must have access to a shared persistent volume.</li>
</ul>
<h2 id="heading-create-efs">Create EFS</h2>
<p>We are not going into the details of creating EFS. Assuming we have an existing EFS already.</p>
<p><code>My EFS file system ID: fs-582a0sdat</code></p>
<p>Make sure to create an access point. Access points are application-specific entry points into an EFS file system; they control the root directory (and optionally the POSIX identity) used for access.</p>
<p>My EFS access point path: <code>/tmp/poc-efs</code></p>
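<p>If the access point does not exist yet, it can be created with the AWS CLI (the file system ID and path reuse the example values above; the POSIX user and permissions are illustrative):</p>

```shell
aws efs create-access-point \
  --file-system-id fs-582a0sdat \
  --posix-user Uid=1000,Gid=1000 \
  --root-directory 'Path=/tmp/poc-efs,CreationInfo={OwnerUid=1000,OwnerGid=1000,Permissions=755}'
```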
<h2 id="heading-deploy-efs-driver-in-the-cluster">Deploy EFS driver in the cluster</h2>
<p>Checkout the official document from AWS for EFS driver:</p>
<p><a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html">https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html</a></p>
<h3 id="heading-create-storageclass">Create Storageclass</h3>
<p>This defines a StorageClass, which tells Kubernetes which provisioner to use when creating volumes. There are many provisioners: EBS, GCE PD, EFS, etc.</p>
<p><a target="_blank" href="https://kubernetes.io/docs/concepts/storage/storage-classes/#provisioner">https://kubernetes.io/docs/concepts/storage/storage-classes/#provisioner</a></p>
<p>We are using EFS provisioner for our usecase.</p>
<p><code>storageClass.yaml</code></p>
<pre><code class="lang-yaml"><span class="hljs-attr">kind:</span> <span class="hljs-string">StorageClass</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">storage.k8s.io/v1</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">efs-sc</span>
<span class="hljs-attr">provisioner:</span> <span class="hljs-string">efs.csi.aws.com</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-attr">provisioningMode:</span> <span class="hljs-string">efs-ap</span>
  <span class="hljs-attr">fileSystemId:</span> <span class="hljs-string">fs-582a0sdat</span>
  <span class="hljs-attr">directoryPerms:</span> <span class="hljs-string">"700"</span>
  <span class="hljs-attr">gidRangeStart:</span> <span class="hljs-string">"1000"</span> <span class="hljs-comment"># optional</span>
  <span class="hljs-attr">gidRangeEnd:</span> <span class="hljs-string">"2000"</span> <span class="hljs-comment"># optional</span>
  <span class="hljs-attr">basePath:</span> <span class="hljs-string">"/dynamic_provisioning"</span> <span class="hljs-comment"># optional</span>
</code></pre>
<p><code>kubectl apply -f storageClass.yaml</code></p>
<p>Verify the installation.</p>
<p><code>kubectl get pods -n kube-system | grep efs-csi-controller</code></p>
<h3 id="heading-create-a-persistent-volume-and-persistent-volume-claim">Create a persistent volume and persistent volume claim.</h3>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolume</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">efs-pv1</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">capacity:</span>
    <span class="hljs-attr">storage:</span> <span class="hljs-string">5Gi</span>
  <span class="hljs-attr">volumeMode:</span> <span class="hljs-string">Filesystem</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteMany</span>
  <span class="hljs-attr">persistentVolumeReclaimPolicy:</span> <span class="hljs-string">Retain</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">efs-sc</span>
  <span class="hljs-attr">mountOptions:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">tls</span>
  <span class="hljs-attr">csi:</span>
    <span class="hljs-attr">driver:</span> <span class="hljs-string">efs.csi.aws.com</span>
    <span class="hljs-attr">volumeHandle:</span> <span class="hljs-string">fs-582a0sdat:/tmp/poc-efs</span>

<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">efs-claim</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteMany</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">efs-sc</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">5Gi</span>
</code></pre>
<p>Verify the status</p>
<p><code>kubectl get pv,pvc</code></p>
<h3 id="heading-deploy-application">Deploy Application</h3>
<p>Let's use it in our deployment spec</p>
<p><code>deployment.yaml</code></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">efs-app</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">efs-app</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">3</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">efs-app</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">efs-app</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">myapp</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">centos</span>
          <span class="hljs-attr">command:</span> [<span class="hljs-string">"/bin/sh"</span>]
          <span class="hljs-attr">args:</span> [<span class="hljs-string">"-c"</span>, <span class="hljs-string">"while true; do echo $(date -u) &gt;&gt; /data/out; sleep 5; done"</span>]
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">persistent-storage</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/data</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">persistent-storage</span>
          <span class="hljs-attr">persistentVolumeClaim:</span>
            <span class="hljs-attr">claimName:</span> <span class="hljs-string">efs-claim</span>
</code></pre>
<p><code>kubectl apply -f deployment.yaml</code></p>
<p>Now bash into the pods and the mount path can be accessed across the pods.</p>
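<p>A quick way to confirm the volume is actually shared (pod names will differ in your cluster):</p>

```shell
# Every replica appends timestamps to the same file,
# so each pod should see lines written by the others
for pod in $(kubectl get pods -l app=efs-app -o name); do
  kubectl exec "$pod" -- tail -2 /data/out
done
```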
<p>Thank you, Happy Provisioning!</p>
<p>References:</p>
<p><a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html">https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html</a></p>
]]></content:encoded></item><item><title><![CDATA[Mastering Fault Tolerance: Ensuring High Availability for Kubernetes Pods]]></title><description><![CDATA[Are Kubernetes pods highly available by default? - Partially true! :p
Certainly! Let's consider a scenario where you have deployed a cluster (let's stick with EKS for simplicity and my preference) and have deployed node groups across 3 availability z...]]></description><link>https://opsinsights.dev/mastering-fault-tolerance-ensuring-high-availability-for-kubernetes-pods</link><guid isPermaLink="true">https://opsinsights.dev/mastering-fault-tolerance-ensuring-high-availability-for-kubernetes-pods</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[scheduler]]></category><category><![CDATA[EKS]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Mon, 22 May 2023 05:20:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1684732306735/3a0df48b-e8ff-4198-a2b1-fcd3f5ac0a97.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-are-kubernetes-pods-highly-available-by-default-partially-true-p">Are Kubernetes pods highly available by default? - Partially true! :p</h3>
<p>Certainly! Let's consider a scenario where you have deployed a cluster (let's stick with EKS for simplicity and my preference) and have deployed node groups across 3 availability zones, consisting of 6 worker nodes. Now, when scheduling a deployment with 3 replicas, can you guarantee that they will be evenly spread across the nodes?</p>
<blockquote>
<p>The answer is no!</p>
</blockquote>
<p>Based on my exploration, such an even distribution cannot be assumed. However, Kubernetes provides an out-of-the-box solution to this problem; it just requires some preparation and planning during the setup stage.</p>
<p>We can delve into this detail in our blog post.</p>
<h3 id="heading-strategies-for-making-pods-highly-available-in-kubernetes">Strategies for Making Pods Highly Available in Kubernetes</h3>
<p><strong>Node affinity</strong>: Node affinity is similar to nodeSelector, with additional flexibility. There are two types of node affinity.</p>
<p><strong>requiredDuringSchedulingIgnoredDuringExecution</strong>: The scheduler can't schedule the Pod unless the rule is met. This functions like nodeSelector, but with a more expressive/customizable/flexible syntax.**</p>
<p><strong>preferredDuringSchedulingIgnoredDuringExecution:</strong> The scheduler tries to find a node that meets the rule. If a matching node is not available, the scheduler still schedules the Pod.**</p>
<p>The above will help us to attract pods to respective nodes based on the labels and constraints.</p>
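<p>As a small sketch of the softer of the two rules, preferred node affinity (the <code>disktype</code> label and pod names here are illustrative placeholders):</p>

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example
spec:
  affinity:
    nodeAffinity:
      # Soft rule: prefer nodes labelled disktype=ssd,
      # but schedule anyway if no such node is available
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
  containers:
    - name: app
      image: nginx
```

Swapping <code>preferred...</code> for <code>requiredDuringSchedulingIgnoredDuringExecution</code> turns this into a hard rule that the scheduler must satisfy.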
<p><mark>To spread the pods across the worker nodes, we use</mark></p>
<h2 id="heading-pod-topology-spread-constraints">Pod Topology Spread Constraints</h2>
<p>TopologySpreadConstraints describes how a group of pods ought to spread across topology domains. The scheduler will schedule pods in a way that abides by the constraints. All topologySpreadConstraints are ANDed.**</p>
<p>Example definition file for the same</p>
<pre><code class="lang-yaml"><span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">example-pod</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-comment"># Configure a topology spread constraint</span>
  <span class="hljs-attr">topologySpreadConstraints:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">maxSkew:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">kubernetes.io/hostname</span>
      <span class="hljs-attr">whenUnsatisfiable:</span> <span class="hljs-string">ScheduleAnyway</span>
      <span class="hljs-attr">labelSelector:</span>
        <span class="hljs-attr">matchLabels:</span>
          <span class="hljs-attr">app:</span> <span class="hljs-string">app-name</span>

  <span class="hljs-comment">### other Pod fields go here</span>
</code></pre>
<p><strong>maxSkew</strong> - defines the maximum permitted difference in the number of matching pods between any two topology domains. It defaults to 1, meaning the pod counts across domains may differ by at most one.</p>
<p>For example, in a cluster with three zones, the initial deployment may have 2 pods in zone 1, 2 pods in zone 2, and 1 pod in zone 3 (2/2/1 distribution). If the deployment is scaled up to 6 pods, the distribution across zones would be adjusted to 2 pods in each zone (2/2/2 distribution).</p>
<p><strong>topologyKey</strong> - in simplified terms, it defines how nodes are grouped into domains using a node label. With <a target="_blank" href="http://kubernetes.io/hostname">kubernetes.io/hostname</a> as the key, the scheduler treats every node carrying that label as part of the topology, and each individual node becomes a domain.</p>
<p><strong>labelSelector</strong> - used to select matching pods</p>
<p>The above definition helps us spread the pods across the nodes, making the workload highly available.</p>
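<p>In practice you would attach the constraint to a Deployment's pod template rather than a bare Pod, so every replica is spread. A minimal sketch (the <code>web</code> names and labels are placeholders):</p>

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone  # spread across zones
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx
```

With 6 replicas and 3 zones, the scheduler aims for the 2/2/2 distribution described above.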
<p>Other Useful Tools:</p>
<p><a target="_blank" href="https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller">NodeLabeller</a> - Automatically labels a node if it has a GPU.</p>
<p><a target="_blank" href="https://github.com/kubernetes-sigs/descheduler">Descheduler</a> - To effectively utilize resources in the nodes.</p>
<p>Thank you!</p>
<p>**definition from the k8s documentation</p>
<p>References:</p>
<p><a target="_blank" href="https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/">https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/</a></p>
<p><a target="_blank" href="https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector">https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector</a></p>
<p><a target="_blank" href="https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller">https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller</a></p>
]]></content:encoded></item><item><title><![CDATA[Golang Cheat-sheet: Quick guide to start Golang for python developers.]]></title><description><![CDATA[As a cloud-native python developer in day-to-day infrastructure and Site reliability operations, I always wanted to learn Golang due to its great performance comparably and for its out-of-the-box simplicity in multiprocessing and multithreading.
Chec...]]></description><link>https://opsinsights.dev/golang-cheatsheet</link><guid isPermaLink="true">https://opsinsights.dev/golang-cheatsheet</guid><category><![CDATA[Go Language]]></category><category><![CDATA[cheatsheet]]></category><category><![CDATA[cloud native]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Tue, 21 Feb 2023 10:28:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1676982718882/6713da48-ef9a-4389-a80b-68ca0e37430a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a cloud-native python developer in day-to-day infrastructure and Site reliability operations, I always wanted to learn <code>Golang</code> due to its great performance comparably and for its out-of-the-box simplicity in multiprocessing and multithreading.</p>
<p><em>Check out this blog post Go concurrency:</em> <a target="_blank" href="https://opsinsights.dev/my-first-go-program">https://opsinsights.dev/my-first-go-program</a></p>
<p>So, while getting started with <code>Golang</code>, I noticed similarities with Python.</p>
<p>Where there were none, I tried to relate and compare with Python, which helped me grasp the concepts and syntax quicker. That is the motivation for this blog post: understanding the similarities with what we already know helps our mind easily adopt new learning.</p>
<p><code>Sharing my findings and understanding while learning GOlang. Part-1</code></p>
<h2 id="heading-a-comprehensive-guide-to-getting-started-with-go-for-python-developers">A comprehensive guide to getting started with GO for python developers.</h2>
<h3 id="heading-highlights-of-golang">Highlights of Golang.</h3>
<blockquote>
<p><em>Go is expressive, concise, clean, and efficient</em>**</p>
</blockquote>
<p>**as per documentation</p>
<ul>
<li><p>Go is a compiled language. Statically typed.</p>
</li>
<li><p>Compiled executables are OS-specific, like a runtime: a binary compiled on Windows runs only on Windows.</p>
</li>
<li><p>Case sensitive</p>
</li>
<li><p>Go is designed as a next-generation C.</p>
</li>
</ul>
<p><em>Sometimes it feels like python, but It's a fast, statically typed, compiled language that feels like a dynamically typed, interpreted language.</em></p>
<p>Concepts that are advanced in Python, like multiprocessing and multithreading, are day-to-day core functionality in Golang.</p>
<h3 id="heading-golang-does-not-support">Golang does not support.</h3>
<ul>
<li><p>Type inheritance</p>
</li>
<li><p>Method or Operator overloading</p>
</li>
<li><p>Structured exception handling (no try/except as in Python)</p>
</li>
<li><p>Implicit numeric conversions</p>
</li>
</ul>
<h1 id="heading-importing-packages">Importing packages.</h1>
<p><strong><mark>.py</mark></strong></p>
<p>Below are some of the ways of importing modules and packages:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> Game.Level.start <span class="hljs-comment"># import a specific module</span>
<span class="hljs-keyword">from</span> Game.Level <span class="hljs-keyword">import</span> * <span class="hljs-comment"># import everything from Game.Level</span>
<span class="hljs-keyword">from</span> Game.Level <span class="hljs-keyword">import</span> start <span class="hljs-comment"># import without the package prefix</span>
<span class="hljs-keyword">import</span> Game.Level <span class="hljs-keyword">as</span> GL <span class="hljs-comment"># alias import</span>
</code></pre>
<p><strong><mark>.go</mark></strong></p>
<pre><code class="lang-go"><span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>
<span class="hljs-keyword">import</span> <span class="hljs-string">"os"</span>

<span class="hljs-comment">// to import multiple packages, we can also enclose them in parentheses as shown below</span>

<span class="hljs-keyword">import</span> ( 
    <span class="hljs-string">"fmt"</span> 
    <span class="hljs-string">"time"</span> 
    <span class="hljs-string">"math"</span> 
)

<span class="hljs-keyword">import</span> mt <span class="hljs-string">"math"</span> 
<span class="hljs-comment">// syntax for alias import </span>

<span class="hljs-keyword">import</span> . <span class="hljs-string">"math"</span>
<span class="hljs-comment">// Dot import  allows to use modules without referencing package names.</span>
</code></pre>
<h1 id="heading-using-variables">Using variables</h1>
<p><strong><mark>.py</mark></strong></p>
<p>Variable declaration is a simple assignment using the <code>=</code> operator.</p>
<pre><code class="lang-python">
variable_1 = <span class="hljs-number">10</span> 
variable_2 = <span class="hljs-string">"hello, team!"</span>
</code></pre>
<p>*variables are case-sensitive.</p>
<p><strong><mark>.go</mark></strong></p>
<pre><code class="lang-go"><span class="hljs-comment">//var variableName dataType = initialValue</span>
<span class="hljs-keyword">var</span> x <span class="hljs-keyword">int</span> = <span class="hljs-number">10</span>
</code></pre>
<p>there will be default initial values for variables if not assigned.</p>
<ul>
<li><p>int and float types: 0</p>
</li>
<li><p>bool type: false</p>
</li>
<li><p>string type: ""</p>
</li>
<li><p>array and struct types: all of their elements are set to their respective zero values pointer, slice, map, channel, and function types: nil</p>
</li>
</ul>
<p><em>For example, if you declare an integer variable x without an initial value, Go will assign it the zero value of int, which is 0:</em></p>
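<p>A quick runnable sketch to see those zero values in action:</p>

```go
package main

import "fmt"

func main() {
	// Declared without initial values; Go assigns each type's zero value.
	var i int
	var f float64
	var b bool
	var s string
	var p *int // pointer, slice, map, channel, and function types default to nil

	fmt.Println(i, f, b, len(s), p == nil) // prints: 0 0 false 0 true
}
```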
<h1 id="heading-error-handling">Error handling</h1>
<p><strong><mark>.py</mark></strong></p>
<p>try-catch block syntax in python.</p>
<pre><code class="lang-python"><span class="hljs-keyword">try</span>: 
    print(x) 
<span class="hljs-keyword">except</span>: 
    print(<span class="hljs-string">"An exception occurred"</span>)
</code></pre>
<p><strong><mark>.go</mark></strong></p>
<p>In Go, errors are represented as values of the built-in <code>error</code> interface, which is defined as:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> error <span class="hljs-keyword">interface</span> { 
    Error() <span class="hljs-keyword">string</span> 
}
</code></pre>
<p>If a function completes successfully, the returned error value is <code>nil</code>; otherwise it returns a non-nil error describing the failure.</p>
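<p>A minimal runnable sketch of the idiom; the <code>divide</code> function is just an illustration, not from the standard library:</p>

```go
package main

import (
	"errors"
	"fmt"
)

// divide returns an error value alongside the result
// instead of raising an exception as Python would.
func divide(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

func main() {
	// The Go idiom replacing Python's try/except: check the returned error.
	q, err := divide(10, 2)
	if err != nil {
		fmt.Println("error:", err)
	} else {
		fmt.Println("result:", q) // prints: result: 5
	}

	if _, err := divide(1, 0); err != nil {
		fmt.Println("error:", err) // prints: error: division by zero
	}
}
```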
<hr />
<p><em>The above example is with the interface, we will learn more in the next blog post.</em></p>
<p>Each of the above sections deserves an individual post; however, this will help you kindle and kick-start your Golang learning.</p>
<hr />
<p><strong>End of Part 1.</strong></p>
<p>Happy learning. Thank you.</p>
<p><strong>References:</strong></p>
<p><a target="_blank" href="https://go.dev/doc/code">https://go.dev/doc/code</a></p>
<p><a target="_blank" href="https://blog.xojo.com/2017/12/06/compilers-101-overview-and-lexer/">https://blog.xojo.com/2017/12/06/compilers-101-overview-and-lexer/</a></p>
]]></content:encoded></item><item><title><![CDATA[Terraform Workspaces with a simple example use case.]]></title><description><![CDATA[What is a terraform workspace?
A TF workspace isolates state information within a workspace. At the time of workspace creation, an isolated TF backend is created. This ensures existing configurations are not disturbed.
Sounds interesting? Let us find...]]></description><link>https://opsinsights.dev/terraform-workspaces-with-a-simple-example-use-case</link><guid isPermaLink="true">https://opsinsights.dev/terraform-workspaces-with-a-simple-example-use-case</guid><category><![CDATA[Terraform]]></category><category><![CDATA[workspace]]></category><category><![CDATA[WeMakeDevs]]></category><category><![CDATA[Terraform workspace]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Tue, 24 Jan 2023 07:26:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1674545003464/3da1d64a-9e29-4da1-8e2e-d4e27bbd6c21.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-what-is-a-terraform-workspace">What is a terraform workspace?</h1>
<p>A TF workspace isolates state information within the workspace. When a workspace is created, a separate state is initialized for it in the backend. This ensures existing configurations are not disturbed.</p>
<h3 id="heading-sounds-interesting-let-us-find-some-interesting-use-in-this-blog-post">Sounds interesting? Let us find some interesting use in this blog post.</h3>
<blockquote>
<p><code>Using terraform workspaces is equivalent to using a container with different volumes.</code></p>
</blockquote>
<p><mark>NVM :p that's a metaphor.</mark></p>
<p>By default, every time you run <code>terraform init</code> you are placed in a workspace named <code>default</code>. So, by design or by choice, we are using terraform workspaces every day.</p>
<p><em>you can quickly switch between workspaces using the terraform workspace cli. (Refer to TF workspace</em> <em>cheat</em> <em>sheet below)</em></p>
<h3 id="heading-definition-as-in-documentation"><em>Definition as in documentation:</em></h3>
<p><code>You can create multiple working directories to maintain multiple instances of a configuration with completely separate state data.</code></p>
<p>As defined, state data is isolated between workspaces, which lets the same code provision multiple environments.</p>
<h3 id="heading-to-understand-better-let-us-try-an-example-use-case">To understand better let us try an example use case:</h3>
<p><em>An identical infrastructure should be provisioned in two different regions, us-east-1(N.Virginia), and eu-west-1(Ireland)</em></p>
<p>Provided the use case, there are several ways to achieve this, however, let us see an example with terraform workspaces.</p>
<p>I have used the local provider for this demo to keep it simple. I have not created any workspace yet; let us check which workspace we are in.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543302163/b611fe5a-5ff7-4855-baef-47e7635dbc67.png" alt class="image--center mx-auto" /></p>
<p>The terraform code below creates a new txt file containing the text "Workspace Blog post". Also, notice that I used the <code>terraform.workspace</code> variable to dynamically name the txt file.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543479263/f5455465-902f-463e-a77d-e1f8c3ce89fc.png" alt class="image--center mx-auto" /></p>
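<p>For reference, the configuration in the screenshot looks roughly like this; treat it as a sketch, and note that the resource and file names here are illustrative:</p>

```hcl
# Uses the hashicorp/local provider's local_file resource
resource "local_file" "workspace_demo" {
  # terraform.workspace interpolates the current workspace name
  filename = "${terraform.workspace}-demo.txt"
  content  = "Workspace Blog post"
}
```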
<p>This will help us to identify from which workspace execution this file was created. Let us plan and apply our script. And a file will be created as shown below.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543599853/04adbfd6-0dc1-47c8-b20b-e3a6e9ccf072.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543645477/98edef2e-e976-4d8c-a206-9dc68b5f9c69.png" alt class="image--center mx-auto" /></p>
<p>Let us create a new workspace named blog-demo, and make sure we are in the correct workspace.</p>
<p><code>terraform workspace new blog-demo</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543690703/01ef5ae7-b239-449d-878b-863e9169a0e2.png" alt class="image--center mx-auto" /></p>
<p>Successfully created, and <code>terraform workspace list</code> displays all the available workspaces. Running terraform plan and apply again results as shown below.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1674543743769/b4b7fae0-48f8-4adc-9cc8-8c123cbf7375.png" alt class="image--center mx-auto" /></p>
<p>With the same terraform code without any modifications we had two different executions.</p>
<p>Now co-relate this with provisioning multi-region infrastructure with the common code base.</p>
<h3 id="heading-key-takeaways-of-using-terraform-workspace">Key takeaways of using terraform workspace.</h3>
<ul>
<li><p>in a similar approach, we can differentiate between regions based on the workspace names.</p>
</li>
<li><p>terraform execution is faster since it reuses the modules and packages already downloaded instead of downloading them again.</p>
</li>
<li><p>increases code reusability and efficiency.</p>
</li>
</ul>
<p>And more use cases emerge; they grow depending on our design pattern.</p>
<h2 id="heading-workspace-commands-cheat-sheet">Workspace commands cheat sheet.</h2>
<p><em>To list all the existing workspaces.</em></p>
<pre><code class="lang-bash"> terraform workspace list
</code></pre>
<p><em>To select/switch to a new workspace:</em></p>
<pre><code class="lang-bash">terraform workspace select &lt;workspace name&gt;
</code></pre>
<p><em>To create a new workspace:</em></p>
<pre><code class="lang-bash">terraform workspace new &lt;workspace name&gt;
</code></pre>
<p><em>To delete a workspace</em></p>
<pre><code class="lang-bash">terraform workspace delete  &lt;workspace name&gt;
</code></pre>
<p><em>To force-delete a workspace, ignoring a non-empty state:</em></p>
<pre><code class="lang-bash">terraform workspace delete -force &lt;workspace name&gt;
</code></pre>
<p><em>To display the current active workspace:</em></p>
<pre><code class="lang-bash">terraform workspace show
</code></pre>
<p>Thank you, Peace!</p>
<p><strong>References:</strong></p>
<p>Feel free to go through these articles if you prefer a more detailed understanding.</p>
<p><a target="_blank" href="https://spacelift.io/blog/terraform-workspaces">https://spacelift.io/blog/terraform-workspaces</a></p>
<p><a target="_blank" href="https://developer.hashicorp.com/terraform/language/state/workspaces">https://developer.hashicorp.com/terraform/language/state/workspaces</a></p>
<p><a target="_blank" href="https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file">https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file</a></p>
]]></content:encoded></item><item><title><![CDATA[ARGO workflows Vs AWS Step functions.]]></title><description><![CDATA[This blog quickly shows the comparison between AWS Step functions and ARGO workflow.

AWS STEP Functions
AWS Step Functions is a serverless orchestration service that lets you integrate with AWS Lambda functions and other AWS services for any specifi...]]></description><link>https://opsinsights.dev/argo-workflows-vs-aws-step-functions</link><guid isPermaLink="true">https://opsinsights.dev/argo-workflows-vs-aws-step-functions</guid><category><![CDATA[ArgoCD]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[AWS]]></category><category><![CDATA[stepfunction]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Tue, 13 Dec 2022 15:12:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1670944095889/f_m80XiUu.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This blog quickly shows the comparison between AWS Step functions and ARGO workflow.</p>
</blockquote>
<h2 id="heading-aws-step-functions">AWS STEP Functions</h2>
<p>AWS Step Functions is a serverless orchestration service that lets you integrate with AWS Lambda functions and other AWS services for any specific use cases.</p>
<p>Step functions help to execute stateful and stateless jobs in a sequential way to achieve a solution.</p>
<p>Assume a 10-STEP operational workflow; each step does a specific operation that is interrelated with the other jobs, e.g. the success of STEP A triggers STEP B.</p>
<p>Each step can run in a different executor,</p>
<p>STEP-1: Python, STEP-2: bash, etc.</p>
<p>And above is a single example and there can be n different use cases for such, like sequential Jobs, parallel jobs and at scale etc.</p>
<p>AWS Step Functions are AWS-native. Yes, we can communicate with any of the other AWS services via API/Cloud SDK.</p>
<p><em>One of the practical AWS Step function use cases.</em></p>
<ol>
<li><p>Restore a database from the backup.</p>
</li>
<li><p>Sanitize the sensitive data.</p>
</li>
<li><p>update the connection strings in the secrets manager.</p>
</li>
</ol>
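<p>In Step Functions' Amazon States Language, that three-step flow could be sketched as follows; the state names and Lambda ARNs are placeholders, not from the original demo:</p>

```json
{
  "Comment": "Illustrative restore pipeline",
  "StartAt": "RestoreDatabase",
  "States": {
    "RestoreDatabase": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:restore-db",
      "Next": "SanitizeData"
    },
    "SanitizeData": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:sanitize-data",
      "Next": "UpdateSecrets"
    },
    "UpdateSecrets": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:update-secrets",
      "End": true
    }
  }
}
```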
<h2 id="heading-argo-workflows">ARGO Workflows.</h2>
<p>ARGO workflow is an open-source container-native workflow engine for orchestrating jobs on Kubernetes.</p>
<p>container native - Yes, bring out your imagination of use cases here to leverage the power of ARGO-Workflow at scale.</p>
<p>Some of the most common use cases are,</p>
<ul>
<li><p>CICD pipelines</p>
</li>
<li><p>analytics operation</p>
</li>
<li><p>high volume data processing</p>
</li>
<li><p>sequential, parallel and more</p>
</li>
<li><p>directed-acyclic graph (DAG)</p>
</li>
</ul>
<p><mark>How is it different from AWS Step Functions?</mark></p>
<p>AWS Step Functions are AWS-native, whereas ARGO Workflows are cloud-agnostic.</p>
<ul>
<li><p>Can be provisioned using cloud formation templates and AWS CLI. (submitting step functions via CLI)</p>
</li>
<li><p>job submission can be in JSON/YAML.</p>
</li>
<li><p>support GUI, API, CLI and through supported AWS services. (Pls refer to the aws doc linked at the bottom.)</p>
</li>
<li><p>support to create lambda functions at each stage.</p>
</li>
</ul>
<p>ARGO Workflows can be used across cloud providers.</p>
<ul>
<li><p>Workflows can be templated/bootstrapped.</p>
</li>
<li><p>ARGO jobs can be submitted using GUI and CLI.</p>
</li>
<li><p>Workflow jobs are written in YAML syntax</p>
</li>
</ul>
<h3 id="heading-pricing">Pricing</h3>
<p><strong>AWS STEP Functions.</strong></p>
<p>It is easy: pay as you go. However, if we decode it, below are the price points to consider while using Step Functions.</p>
<ul>
<li><p>State transition charges + based on resource usage(CPU RAM utilized for each step)</p>
</li>
<li><p>Lambda backend charges, EC2, ECS, ECR(if required)</p>
</li>
</ul>
<p><strong>ARGO</strong></p>
<p>Cloud hosting charges for the Kubernetes environment (it can be part of an existing cluster), or independent hosting on any VM.</p>
<blockquote>
<p>Who knows maybe ARGO can come up with their native cloud-hosted environment in the future, and I am counting on it.</p>
</blockquote>
<p>We will discuss ARGO workflow with an example in detail in our upcoming blog post</p>
<p>Thank you!</p>
<p>References:</p>
<p><a target="_blank" href="https://docs.aws.amazon.com/step-functions/latest/dg/development-options.html">https://docs.aws.amazon.com/step-functions/latest/dg/development-options.html</a></p>
<p><a target="_blank" href="https://argoproj.github.io/workflows/#:~:text=What%20is%20Argo%20Workflows%3F,the%20workflow%20is%20a%20container">https://argoproj.github.io/workflows/#:~:text=What%20is%20Argo%20Workflows%3F,the%20workflow%20is%20a%20container</a>.</p>
<p><a target="_blank" href="https://aws.amazon.com/step-functions/pricing/">https://aws.amazon.com/step-functions/pricing/</a></p>
]]></content:encoded></item><item><title><![CDATA[My first GO! program. Letsss G0!!!]]></title><description><![CDATA[Before getting started with GO, WHY GO? Do I hate python, it's a BIG NO.

 There are some perks, ease of adoption, and responsibilities as we consider microservice architecture mainly in INFRA design and automation.


Merits of GO alongside Python.
G...]]></description><link>https://opsinsights.dev/my-first-go-program</link><guid isPermaLink="true">https://opsinsights.dev/my-first-go-program</guid><category><![CDATA[Go Language]]></category><category><![CDATA[concurrency]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Mon, 04 Jul 2022 12:32:23 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1656937721046/vC-xylt7C.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before getting started with <code>GO</code>, WHY <code>GO</code>? Do I hate python, it's a BIG NO.

 There are some perks: ease of adoption, and the responsibilities we take on with microservice architecture, mainly in INFRA design and automation.
</p>

<h2 id="heading-merits-of-go-alongside-python">Merits of GO alongside Python.</h2>
<p>GO solves a few major problems:</p>
<ul>
<li>Ease of programming as an interpreted language like Python.</li>
<li>With efficiency and safety as a static language like C++.</li>
<li>Competitive multi-core computing as built-in support</li>
</ul>
<h3 id="heading-concurrency-this-blog-will-focus-only-concurrency-speciality-of-go">Concurrency. This blog will focus only on the concurrency speciality of GO.</h3>
<ul>
<li><p>Concurrency in other languages comes as extended functionality unlike in GO we have built-in support also known as <code>goroutines</code>. Goroutines are deeply integrated with Go's runtime, this runtime engine takes care of managing the threads (blocking &amp; unblocking)</p>
</li>
<li><p>The runtime and the logic of a goroutine work together. Goroutines can communicate with one another and synchronize their execution. </p>
</li>
<li>In Go, one of the synchronization elements is called a channel. Channel helps to share data between goroutines</li>
</ul>
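<p>A tiny runnable sketch of a channel shuttling results between goroutines; the <code>square</code> function is illustrative:</p>

```go
package main

import "fmt"

// square sends its result back on a channel instead of sharing memory.
func square(n int, out chan<- int) {
	out <- n * n
}

func main() {
	out := make(chan int)
	for i := 1; i <= 3; i++ {
		go square(i, out) // one goroutine per input
	}
	sum := 0
	for i := 0; i < 3; i++ {
		sum += <-out // receiving blocks until some goroutine sends
	}
	fmt.Println(sum) // prints: 14  (1 + 4 + 9)
}
```

Because the receive blocks, <code>main</code> naturally waits for all three results; no explicit synchronization primitive is needed here.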
<p>To learn more about concurrency in python: <a target="_blank" href="https://dev.to/kcdchennai/optimising-python-workloads-for-kubernetes-1d6c">https://dev.to/kcdchennai/optimising-python-workloads-for-kubernetes-1d6c</a></p>
<p>To run a function as a goroutine, call that function prefixed with the go statement. </p>
<pre><code><span class="hljs-selector-tag">sum</span>()     <span class="hljs-comment">// A normal function call that executes sum synchronously and waits for it to complete</span>
<span class="hljs-selector-tag">go</span> <span class="hljs-selector-tag">sum</span>()  <span class="hljs-comment">// A goroutine that executes sum asynchronously and doesn't wait for it to complete</span>
</code></pre><h3 id="heading-lets-the-learn-this-concurrency-in-go-using-an-example">Lets the learn this concurrency in GO using an example.</h3>
<p>We use the open-source API endpoint <a target="_blank" href="http://worldtimeapi.org/pages/examples">http://worldtimeapi.org/pages/examples</a>. This API returns timezone information based on various params; here we use it to find the timezone for an IP.</p>
<pre><code><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"io"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"os"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">request</span><span class="hljs-params">(url <span class="hljs-keyword">string</span>)</span></span> {
    res, err := http.Get(url)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }

    <span class="hljs-keyword">defer</span> res.Body.Close()
    b, err := io.ReadAll(res.Body)
    fmt.Println(<span class="hljs-keyword">string</span>(b))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(os.Args) &lt; <span class="hljs-number">2</span> {
        log.Fatalln(<span class="hljs-string">"Usage: go run main.go &lt;url1&gt; &lt;url2&gt; ... &lt;urln&gt;"</span>)
    }
    <span class="hljs-keyword">for</span> _, url := <span class="hljs-keyword">range</span> os.Args[<span class="hljs-number">1</span>:] {
        request(<span class="hljs-string">"http://"</span> + url)
    }
}
</code></pre><p>Below is the response to my above excerpt. I have passed 4 args to the above function. </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1656928799890/EYbLIS3EX.png" alt="Screenshot 2022-07-04 at 3.28.22 PM.png" /></p>
<p>What is happening in the above program is <em>Sequential execution</em></p>
<p>We sent a request to the first argument and when it completes, a response comes in. Then the program returns to the for loop and sends another request to the next argument, and the process continues.</p>
<p>Let's do concurrency using <code>goroutines</code>  now</p>
<h2 id="heading-goroutines">GOroutines</h2>
<p>GOroutines are managed by the GO runtime scheduler, which takes care of managing the threads accessing the CPU.</p>
<blockquote>
<p> asynchronous, powerful</p>
</blockquote>
<p>As we already know, to make a fn into a goroutine, add the <strong>go</strong> prefix before the fn invocation. Updating the code below</p>
<pre><code><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"io"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"os"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">request</span><span class="hljs-params">(url <span class="hljs-keyword">string</span>)</span></span> {
    res, err := http.Get(url)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }

    <span class="hljs-keyword">defer</span> res.Body.Close()
    b, err := io.ReadAll(res.Body)
    fmt.Println(<span class="hljs-keyword">string</span>(b))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(os.Args) &lt; <span class="hljs-number">2</span> {
        log.Fatalln(<span class="hljs-string">"Usage: go run main.go &lt;url1&gt; &lt;url2&gt; ... &lt;urln&gt;"</span>)
    }
    <span class="hljs-keyword">for</span> _, url := <span class="hljs-keyword">range</span> os.Args[<span class="hljs-number">1</span>:] {
        <span class="hljs-keyword">go</span> request(<span class="hljs-string">"http://"</span> + url)
    }
}
</code></pre><p>The above program produces the output below, completing in 0.843 seconds.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1656935655029/AWsYRmE2x.png" alt="Screenshot 2022-07-04 at 5.18.52 PM.png" /></p>
<p>:| why?
Yes, we started the goroutines and the fn was triggered 4 times, but <code>main</code> exited before any of them finished, so no output was printed. To actually wait for the results, let us add a <strong>WaitGroup</strong></p>
<h3 id="heading-waitgroup">WaitGroup</h3>
<p>WaitGroup is included in the Golang sync package. It includes features that allow it to block and wait for any number of goroutines to complete their execution.</p>
<p>Code with added wait:</p>
<pre><code><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"io"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"os"</span>
    <span class="hljs-string">"sync"</span>
)

<span class="hljs-keyword">var</span> wg sync.WaitGroup

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">request</span><span class="hljs-params">(url <span class="hljs-keyword">string</span>)</span></span> {
    <span class="hljs-keyword">defer</span> wg.Done()
    res, err := http.Get(url)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }

    <span class="hljs-keyword">defer</span> res.Body.Close()
    b, err := io.ReadAll(res.Body)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }
    fmt.Println(<span class="hljs-keyword">string</span>(b))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(os.Args) &lt; <span class="hljs-number">2</span> {
        log.Fatalln(<span class="hljs-string">"Usage: go run main.go &lt;url1&gt; &lt;url2&gt; ... &lt;urln&gt;"</span>)
    }
    <span class="hljs-keyword">for</span> _, url := <span class="hljs-keyword">range</span> os.Args[<span class="hljs-number">1</span>:] {
        <span class="hljs-keyword">go</span> request(<span class="hljs-string">"http://"</span> + url)
        wg.Add(<span class="hljs-number">1</span>)
    }
    wg.Wait()
}
</code></pre><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1656937219518/ybf2JUfzG.png" alt="Screenshot 2022-07-04 at 5.47.11 PM.png" /></p>
<p>Executing the same program with 8 URLs as arguments completed in about 0.853s, roughly 4x faster than the sequential approach.</p>
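<p>To make the comparison reproducible, here is a self-contained sketch. The local test server, its artificial 50ms delay, and the request count are assumptions made for illustration, not the exact setup from the screenshots above:</p>
<pre><code>package main

import (
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "sync"
    "time"
)

// fetch GETs a URL and discards the body.
func fetch(url string) {
    res, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer res.Body.Close()
    io.Copy(io.Discard, res.Body)
}

// timeFetches fetches url n times, sequentially or concurrently,
// and returns the elapsed wall-clock time.
func timeFetches(url string, n int, concurrent bool) time.Duration {
    start := time.Now()
    if concurrent {
        var wg sync.WaitGroup
        for i := 0; i < n; i++ {
            wg.Add(1) // Add before launching, so Wait cannot return early
            go func() {
                defer wg.Done()
                fetch(url)
            }()
        }
        wg.Wait()
    } else {
        for i := 0; i < n; i++ {
            fetch(url)
        }
    }
    return time.Since(start)
}

func main() {
    // A local server that sleeps 50ms per request stands in for real URLs.
    srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        time.Sleep(50 * time.Millisecond)
        fmt.Fprintln(w, "ok")
    }))
    defer srv.Close()

    fmt.Println("sequential:", timeFetches(srv.URL, 8, false))
    fmt.Println("concurrent:", timeFetches(srv.URL, 8, true))
}
</code></pre>
<p>Since every request sleeps 50ms on the server, the sequential loop takes roughly eight times as long as the concurrent run in this sketch.</p>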
<p>Hope you enjoyed this exploration. We will dig into more Go topics in upcoming blog posts.</p>
<p>Peace!</p>
]]></content:encoded></item><item><title><![CDATA[Container Runtime Interface (CRI), Docker deprecation & Dockershim]]></title><description><![CDATA[#kcdchennai #kubernetes #docker #devops
Author: Jothimani Radhakrishnan (Lifion by ADP). A Software Product Engineer, Cloud enthusiast | Blogger | DevOps | SRE | Python Developer. I usually automate my day-to-day stuff and Blog my experience on chall...]]></description><link>https://opsinsights.dev/container-runtime-interface-cri-docker-deprecation-and-dockershim</link><guid isPermaLink="true">https://opsinsights.dev/container-runtime-interface-cri-docker-deprecation-and-dockershim</guid><category><![CDATA[Devops]]></category><category><![CDATA[Docker]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Thu, 27 Jan 2022 15:12:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1643295484218/if-u1ndx6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>#kcdchennai #kubernetes #docker #devops</p>
<p>Author: Jothimani Radhakrishnan (Lifion by ADP). A Software Product Engineer, Cloud enthusiast | Blogger | DevOps | SRE | Python Developer. I usually automate my day-to-day stuff and Blog my experience on challenging items.</p>
<h1 id="heading-intro">Intro</h1>
<p>Hey Docker lovers &lt;3, this is not going to be a happy story for you, and it wasn't for me either. Like many of you, I loved using Docker. When I learned about this change (the dockershim removal), it was a heartbreaking moment.</p>
<p>Before we get to dockershim, let us cover some background.</p>
<p>The kubelet manages each worker node on behalf of the control plane. It ensures that the containers specified for a pod are up and running.</p>
<p>To know more about the kubelet: <a target="_blank" href="https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/">https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/</a></p>
<h2 id="heading-container-runtime">Container runtime:</h2>
<p>A container runtime is the software responsible for running containers. To create a pod, the kubelet needs a container runtime. For a long time, Kubernetes used Docker as its default container runtime.</p>
<p>This created a tight coupling: whenever Docker released updates or upgrades, it could break Kubernetes.</p>
<p>There are several container runtimes available:</p>
<ul>
<li>containerd</li>
<li>CRI-O</li>
<li>Docker</li>
<li>rkt (Rocket)</li>
<li>LXD</li>
<li>OpenVZ</li>
<li>Windows Server Containers</li>
</ul>
<p>Okay! Coming back to the context of this blog: Docker is being deprecated as the default Kubernetes runtime, and containerd is taking its place.</p>
<h2 id="heading-what-is-dockershim">What is Dockershim?</h2>
<p>Docker was the default engine in k8s before the Container Runtime Interface (CRI) existed. After introducing the CRI, Kubernetes created an adapter component called dockershim, since Docker itself does not implement the CRI.</p>
<p>The dockershim adapter allows the kubelet to interact with Docker as if Docker were a CRI-compatible runtime.</p>
<p>Img src: Kubernetes documentation</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1643296206947/kzu7JzJ2c.png" alt="cri-containerd.png" /></p>
<blockquote>
<p>Switching to containerd as the container runtime eliminates the middleman.</p>
</blockquote>
<p>Check the following points to make sure your environment is not affected by this change. You can continue to use Docker to build images; using Docker for builds is not considered a dependency.</p>
<p>All of the pointers below should be reviewed and updated for your new CRI runtime.</p>
<ol>
<li>Update any Docker command operations that run inside pods. For example, listing running containers on a worker node with <code>docker ps</code> will not work after the deprecation.</li>
<li>Check for private registry or image mirror settings in the Docker configuration file (such as <code>/etc/docker/daemon.json</code>).</li>
<li>Update any scripts that SSH into worker nodes and perform Docker CRUD operations.</li>
<li>Update any third-party tools that use Docker, including telemetry and monitoring agents.</li>
<li>Update any alerts that are configured on Docker-specific errors.</li>
<li>Update any automation or bootstrap scripts based on Docker commands.</li>
</ol>
<p>This list is not exhaustive and will vary based on your usage.</p>
<p>To know more, see the deprecation FAQ: <a target="_blank" href="https://kubernetes.io/blog/2020/12/02/dockershim-faq/">https://kubernetes.io/blog/2020/12/02/dockershim-faq/</a></p>
<p>Thank you,</p>
<p><em>Happy containerd! :p
</em></p>
<p>Reference:
https://developer.ibm.com/blogs/kube-cri-overview/
https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/check-if-dockershim-deprecation-affects-you/
https://kubernetes.io/blog/2022/01/07/kubernetes-is-moving-on-from-dockershim/</p>
]]></content:encoded></item><item><title><![CDATA[AWS - Karpenter - Kubernetes cluster auto-scaler]]></title><description><![CDATA[AWS recently announced  Karpenter – An Open-Source High-Performance Kubernetes Cluster Autoscaler - ReInvent-2021 
Before getting into the discussion of Karpenter, let's discuss k8s native cluster auto-scaler.
According to the documentation below is ...]]></description><link>https://opsinsights.dev/aws-karpenter-kubernetes-cluster-auto-scaler</link><guid isPermaLink="true">https://opsinsights.dev/aws-karpenter-kubernetes-cluster-auto-scaler</guid><category><![CDATA[AWS]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Cloud Computing]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Wed, 08 Dec 2021 05:09:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1638940317525/8BOgMz6ms.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>AWS recently announced  <a target="_blank" href="https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/">Karpenter – An Open-Source High-Performance Kubernetes Cluster Autoscaler - ReInvent-2021</a> </p>
<p>Before getting into the discussion of Karpenter, let's discuss k8s native cluster auto-scaler.</p>
<p>According to the documentation below is the definition:</p>
<blockquote>
<p>Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:</p>
</blockquote>
<ul>
<li>there are pods that failed to run in the cluster due to insufficient resources.</li>
<li>there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.</li>
</ul>
<p>And if we want to adapt the CA to various use cases, we have to create a separate node group for each of them, as explained below.</p>
<h2 id="heading-traditional-way">Traditional way:</h2>
<p>Create autoscaling groups (ASGs): a different node group for each type of need.</p>
<p>Example:</p>
<ul>
<li>an ASG node group for GPU workloads</li>
<li>an ASG node group for general use, split by instance family, etc.</li>
</ul>
<p>This adds maintenance overhead and operational cost. :( </p>
<p>Why do we need a cloud-native cluster autoscaler?</p>
<p>A cloud-native CA (cluster autoscaler) exploits the full capability of the cloud provider's own tools and technologies, which helps use resources effectively and efficiently in all aspects.</p>
<p>This sets up a debate: Kubernetes-native vs AWS-native. </p>
<h2 id="heading-karpenter">Karpenter:</h2>
<p>Karpenter solves this problem by making effective provisioning and scheduling decisions:</p>
<blockquote>
<p>What does the pod need? Where is the pod best fit-in? What can I do to best fit that pod in a node?</p>
</blockquote>
<p>It provides complete control over EC2 instances.</p>
<pre><code>cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Requirements that constrain the parameters of provisioned nodes.
  # Operators { In, NotIn } are supported to enable including or excluding values
  requirements:
    - key: node.kubernetes.io/instance-type # If not included, all instance types are considered
      operator: In
      values: ["m5.large", "m5.2xlarge"]
    - key: "topology.kubernetes.io/zone" # If not included, all zones are considered
      operator: In
      values: ["us-east-1a", "us-east-1b"]
    - key: "kubernetes.io/arch" # If not included, all architectures are considered
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["spot", "on-demand"]
  provider:
    instanceProfile: KarpenterNodeInstanceProfile-eks-karpenter-demo
  ttlSecondsAfterEmpty: 30
EOF
</code></pre><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1638940018225/_jLyiJGRJ.png" alt="karpenter-overview.png" /></p>
<p>From the provisioner config above, we can see that we can control instance type, availability zone, architecture, and capacity type (on-demand vs spot).</p>
<h2 id="heading-highlights">Highlights:</h2>
<ul>
<li><p>Faster Scheduling - Karpenter directly manages the nodes.</p>
</li>
<li><p>No node group provisioning - Karpenter provisions instances directly, without node groups, and schedules pods onto those instances based on the configuration.</p>
</li>
<li><p>More cost-effective than the Kubernetes-native cluster autoscaler.</p>
</li>
</ul>
<blockquote>
<p>Karpenter currently supports only AWS, but anyone can contribute support for other cloud providers.</p>
</blockquote>
<p><em>A wonderful, self-explanatory video from Justin Garrison about Karpenter:</em>
 <a target="_blank" href="https://www.youtube.com/watch?v=3QsVRHVdOnM&amp;ab_channel=JustinGarrison">https://www.youtube.com/watch?v=3QsVRHVdOnM&amp;ab_channel=JustinGarrison</a></p>
<p>--- End-of-Blog ---</p>
<p>Reference:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md</p>
]]></content:encoded></item><item><title><![CDATA[My take on Elasticsearch as Primary Database?]]></title><description><![CDATA[ElasticSearch:
Elasticsearch(ES) is a distributed, full-text document store, search engine which is based on Apache Lucene as a core library. ES was originally designed for rich text-based search with advanced features to support complex queries, fil...]]></description><link>https://opsinsights.dev/elasticsearch-as-primary-database</link><guid isPermaLink="true">https://opsinsights.dev/elasticsearch-as-primary-database</guid><category><![CDATA[elasticsearch]]></category><category><![CDATA[database]]></category><dc:creator><![CDATA[Jothimani Radhakrishnan]]></dc:creator><pubDate>Sun, 21 Nov 2021 11:08:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1637479887631/XWFMDZrkL.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-elasticsearch">ElasticSearch:</h1>
<p>Elasticsearch (ES) is a distributed, full-text document store and search engine built on Apache Lucene as its core library. ES was originally designed for rich text-based search, with advanced features supporting complex queries, filters, and analyzers across multiple languages. ES indexes and stores data so that it can be retrieved and searched in near real time. </p>
<p><em>Distributed: ES stores data across multiple nodes; data can be retrieved from any node at any time.</em></p>
<p><strong>Highlights of the Lucene core:</strong></p>
<ul>
<li>It can return search responses quickly.</li>
<li>Based on the inverted index data structure, which maps content (such as words or numbers) to its locations in documents.</li>
</ul>
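<p>To make the inverted index concrete, here is a toy sketch in Go (an illustration only, not how Lucene actually implements it): each term maps to the list of document IDs that contain it, so a term lookup is a single map access.</p>
<pre><code>package main

import (
    "fmt"
    "strings"
)

// buildInvertedIndex maps each lowercased term to the IDs of the
// documents that contain it.
func buildInvertedIndex(docs map[int]string) map[string][]int {
    inv := make(map[string][]int)
    for id, text := range docs {
        seen := make(map[string]bool)
        for _, term := range strings.Fields(strings.ToLower(text)) {
            if !seen[term] {
                inv[term] = append(inv[term], id)
                seen[term] = true
            }
        }
    }
    return inv
}

func main() {
    docs := map[int]string{
        1: "Elasticsearch is a distributed search engine",
        2: "Lucene is the core library",
    }
    inv := buildInvertedIndex(docs)
    fmt.Println(len(inv["is"]), "documents contain the term 'is'")
}
</code></pre>
<p>Real inverted indexes also store term positions and frequencies, which is what enables phrase queries and relevance scoring.</p>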
<h2 id="heading-possibilities-on-using-es-as-primary-database">Possibilities on using ES as primary database:</h2>
<p>It depends on a lot of factors, such as: </p>
<blockquote>
<p>(One ongoing WIP from the elastic team is that they are constantly working towards improving the resiliency)</p>
</blockquote>
<ul>
<li>size of each document</li>
<li>number of concurrent read requests per second</li>
<li>writes per second</li>
</ul>
<p><code>ES always does best on read requests; as the name suggests, it is an indexed data store.</code></p>
<p>If we need to increase write throughput (aka the indexing rate), it can be tuned in several ways; <code>refresh_interval</code> and <code>flush_threshold_size</code> are some of the key parameters to consider when improving writes.</p>
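<p>As a minimal sketch of such tuning (the host, index name, and interval below are placeholder assumptions, not values from this post), raising <code>refresh_interval</code> is a plain settings call against the index's <code>_settings</code> endpoint:</p>
<pre><code>package main

import (
    "bytes"
    "fmt"
    "net/http"
)

// buildRefreshRequest prepares a PUT request that raises an index's
// refresh_interval, trading search freshness for indexing throughput.
func buildRefreshRequest(baseURL, index, interval string) (*http.Request, error) {
    body := fmt.Sprintf(`{"index": {"refresh_interval": %q}}`, interval)
    req, err := http.NewRequest(http.MethodPut,
        fmt.Sprintf("%s/%s/_settings", baseURL, index),
        bytes.NewBufferString(body))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/json")
    return req, nil
}

func main() {
    // "app-logs" and localhost:9200 are placeholder values.
    req, err := buildRefreshRequest("http://localhost:9200", "app-logs", "30s")
    if err != nil {
        panic(err)
    }
    fmt.Println(req.Method, req.URL.String())
}
</code></pre>
<p>The request is only built and printed here; sending it with <code>http.DefaultClient.Do(req)</code> against a live cluster would apply the change.</p>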
<h2 id="heading-not-impossible-however-is-it-advisable-to-go-my-take">Not impossible, however, is it advisable to go? My take;</h2>
<p>Below are the pointers out of my experience in using ES as a database in production.</p>
<ul>
<li><p>It is not preferred to use ES as the primary database when writes per second (WPS) dominate reads per second (RPS).</p>
</li>
<li><p>ES is very helpful when a primary use case is reading and visualizing data with complex query combinations.</p>
</li>
<li><p>It works best for structured data sets with few nested items.</p>
</li>
</ul>
<p>To mitigate the write concerns in Elasticsearch, add another layer in front of ES; a Redis or Kafka buffer can queue the incoming data and avoid data loss.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637481708640/k7CBlb6j3.png" alt="blog-image.png" /></p>
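<p>The buffering idea can be sketched with a Go channel standing in for the Redis/Kafka layer (a toy model under assumed sizes, not a production queue): producers enqueue documents, and a drain step groups them into batches the way a bulk indexer would before calling the ES <code>_bulk</code> API.</p>
<pre><code>package main

import (
    "fmt"
    "sync"
)

// drainBatches empties the buffer into batches of up to batchSize,
// the way a bulk indexer would before issuing _bulk requests to ES.
func drainBatches(buffer chan string, batchSize int) [][]string {
    var batches [][]string
    batch := make([]string, 0, batchSize)
    for doc := range buffer {
        batch = append(batch, doc)
        if len(batch) == batchSize {
            batches = append(batches, batch)
            batch = make([]string, 0, batchSize)
        }
    }
    if len(batch) > 0 {
        batches = append(batches, batch)
    }
    return batches
}

func main() {
    buffer := make(chan string, 100) // stands in for the Redis/Kafka layer

    var wg sync.WaitGroup
    for p := 0; p < 3; p++ { // three producers, e.g. frontend applications
        wg.Add(1)
        go func(p int) {
            defer wg.Done()
            for i := 0; i < 4; i++ {
                buffer <- fmt.Sprintf("doc-%d-%d", p, i)
            }
        }(p)
    }
    wg.Wait()
    close(buffer)

    batches := drainBatches(buffer, 5)
    fmt.Printf("buffered 12 documents into %d batches\n", len(batches))
}
</code></pre>
<p>With a real broker, the channel becomes a topic or list, and the drain loop becomes a consumer that retries failed bulk requests instead of losing data.</p>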
<p>The main question: is it worth adding more complexity to the architecture? It brings maintenance work and other ad-hoc items related to the new services. </p>
<p>Use cases that employ this architecture include: </p>
<ul>
<li>Real-time log storage systems</li>
<li>Data-capture systems with continuous ingestion of data from frontend applications</li>
</ul>
<p>And if your primary goal is to store JSON data with the best performance in both writes and reads (considering all the points above in this section), go ahead with a mainstream NoSQL database like MongoDB.</p>
<p><em>Invention and discovery have always been part of human nature.</em></p>
<blockquote>
<p>I always love the idea of extending the capabilities of any such machine, with a pinch of salt and pepper, to adapt it to the complex use cases of our daily work. However, this should happen without defeating the nature and purpose for which the tool was built, since there is often a straightforward alternative for any such use case.</p>
</blockquote>
<p><em>EOB (End-of-Blog)
</em></p>
<p>Reference:</p>
<p>https://medium.com/@merrinkurian/elasticsearch-as-the-primary-database-5e41b2a0189d
https://cloud.netapp.com/blog/cvo-blg-elasticsearch-vs-mongodb-6-key-differences
https://aws.amazon.com/premiumsupport/knowledge-center/opensearch-indexing-performance/</p>
]]></content:encoded></item></channel></rss>