Sunday, January 30, 2022

Deploying Airflow on AWS EKS and exposing the webserver UI for learning purposes.

This description assumes that you already have an AWS account. It reveals nothing new and only extracts or copies the instructions created by some real pros (unlike the hacker of this how-to) from the following links:






Most of the cluster work was done on an Amazon Linux instance with kubectl, eksctl and helm installed.

First, create an EKS cluster as follows:

eksctl create cluster \
--name dev-apps \
--region eu-central-1 \
--version 1.21 \
--nodegroup-name linux-nodes \
--nodes 1 \
--nodes-min 1 \
--nodes-max 2 \
--with-oidc \
--ssh-access \
--ssh-public-key ergregatta-20200928 \
--managed
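
Once eksctl finishes, a quick sanity check can confirm that the cluster and its node are up. The following is only a sketch, assuming it is run on the same machine where eksctl wrote the kubeconfig. The function name verify_cluster is made up for this post; the cluster name and region are the ones from the command above.

```shell
# Values taken from the eksctl create cluster command above.
CLUSTER_NAME="dev-apps"
REGION="eu-central-1"

# Hypothetical helper: show the cluster as eksctl sees it,
# then list the worker nodes through kubectl.
verify_cluster() {
  eksctl get cluster --name "$CLUSTER_NAME" --region "$REGION"
  kubectl get nodes -o wide
}
```

Calling verify_cluster should list the dev-apps cluster and one node, which should report Ready once it has finished starting up.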

 What follows is not necessary, but for learning purposes it is nice to install the Kubernetes dashboard. To do so, follow the instructions here, or skip down to where we install Airflow with helm:

https://docs.aws.amazon.com/eks/latest/userguide/dashboard-tutorial.html

For this you will have to run kubectl proxy from your laptop, so you’ll need the AWS CLI as well as kubectl installed there.

 
$ aws eks update-kubeconfig --region eu-central-1 --name dev-apps
 

This will configure the local kubectl to work with the cluster created above.

Then, to get a token for logging in to the dashboard in the browser later, run this (here run in Git Bash):

$ kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
Name:         eks-admin-token-rswtg
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: eks-admin
              kubernetes.io/service-account.uid: 01be9965-5fd5-469e-97e6-6bb6e0c5c5f9
 
Type:  kubernetes.io/service-account-token
 
Data
====
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6Im0wUXZrTE1NeTZNdHlNd3B4U25UOGI2aTVyc2tpUl9BNDJ3M2k1ZGYtQ1UifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJla3MtYWRtaW4tdG9rZW4tcnN3dGciLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZWtzLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMDFiZTk5NjUtNWZkNS00NjllLTk3ZTYtNmJiNmUwYzVjNWY5Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmVrcy1hZG1pbiJ9.fnHn42Z5gNJ3ZzGebIo7Fo1t8gd1EsGXdDtq9TxcZalcGICRPd3B-8Kwy8CT4-qYEDrNTX27heDTbIJmgod5eZxbFDZMTPyzRKcuk_T1TXFiSfCLRi4wtlWgT-E_5EIJTteqbWk-GvrmTr1O4vmIzNA-8Y4d2sinEGYbESmT8jOK26KmwPKuizKxrzZGYSIL9so3cHuSRe-33IeS0XYR1rk7uU2NDTAGSKMA3-wYLk9heSVdReMfDC__DKlRGR6GMb18jxqi5C08mqJyR7DPVjnR4WTpAh9MO-7SqEQiW6MEsWmHgDbHFIPYg_TN7xPDp3fT5pbbBR70jX8ka2sFog
ca.crt:     1066 bytes

 

Then start the proxy:

 
$ kubectl proxy
 

and go to the localhost URL given in the instructions, then use the token above to log in to the dashboard.

Now that the optional installation of the dashboard is done, let’s return to installing Airflow into the EKS cluster using the helm chart, as described here:

https://airflow.apache.org/docs/helm-chart/stable/index.html
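
For reference, the quick start in the chart documentation boils down to roughly the following. This is a sketch, not necessarily the exact commands used for this post. The release name and namespace "airflow" are assumptions, chosen so that the webserver service ends up as airflow-webserver in the airflow namespace, which is what is used further down. The wrapper function install_airflow is hypothetical.

```shell
# Assumed release name and namespace (not dictated by the chart).
RELEASE_NAME="airflow"
NAMESPACE="airflow"

# Hypothetical helper: register the official chart repository
# and install (or upgrade) the release into its own namespace.
install_airflow() {
  helm repo add apache-airflow https://airflow.apache.org
  helm repo update
  helm upgrade --install "$RELEASE_NAME" apache-airflow/airflow \
    --namespace "$NAMESPACE" --create-namespace
}
```

After running install_airflow, something like kubectl get pods -n airflow should show the Airflow components starting up. On a single small node this can take a few minutes.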

If you have the dashboard installed, you can browse around and see all the components that have been installed for Airflow.

Next, we have to make the service available through the internet, which we shall do by exposing the airflow-webserver Kubernetes service, following these instructions:

https://www.eksworkshop.com/beginner/130_exposing-service/exposing/

Replace the namespace in the above instructions with “airflow” and the service with “airflow-webserver”. With this done, you should be able to access the airflow-webserver via http and log in with admin/admin (none of which is secure).
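
One way to do this is to switch the service type to LoadBalancer and then read back the hostname that AWS assigns. The sketch below assumes the chart was installed into the airflow namespace with a service called airflow-webserver. The function names are hypothetical, and the linked workshop may word its commands differently.

```shell
# Namespace and service name as given in the text above.
NAMESPACE="airflow"
SERVICE="airflow-webserver"

# Hypothetical helper: change the service type so that AWS
# provisions a public load balancer in front of the webserver.
expose_webserver() {
  kubectl -n "$NAMESPACE" patch svc "$SERVICE" \
    -p '{"spec": {"type": "LoadBalancer"}}'
}

# Hypothetical helper: print the DNS name of the load balancer
# (empty until AWS has finished provisioning it).
webserver_hostname() {
  kubectl -n "$NAMESPACE" get svc "$SERVICE" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}
```

Run expose_webserver once, wait a minute or two, then call webserver_hostname to get the address for the browser. As far as I know the chart's webserver service listens on port 8080 by default, so the URL would have the form http://hostname:8080. Treat the port as an assumption and check it with kubectl -n airflow get svc.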

To delete everything just run:

 
$ eksctl delete cluster --name dev-apps
 

and say good-bye.