In this post, I’ll show you how to set up a managed Kubernetes cluster on AWS EKS with Karpenter-driven autoscaling on low-cost spot instances.
Amazon EKS (Elastic Kubernetes Service) is a managed Kubernetes service that lets you run Kubernetes on AWS without having to install, operate, and maintain your own Kubernetes control plane or nodes. Karpenter is an open-source project developed by Amazon that offers a Kubernetes-native node autoscaling solution. Integrating Karpenter with EKS can optimize resource allocation and reduce costs significantly.
Using spot instances can drastically reduce the cost of the cloud infrastructure Kubernetes uses to process workloads. Personally, I recommend this approach for DevOps tooling, machine learning, or any other application that needs a lot of compute in the cloud.
This post is based on the Karpenter getting started documentation.
Requirements
- Linux Bash Terminal
- AWS account and AWS cli tool
- kubectl – Kubernetes cli tool
- eksctl – AWS EKS cli tool
- helm – Package manager for Kubernetes
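Optionally, check that these tools are available on your PATH before starting; the commands below only print version information and are safe to run.
# Verify the required cli tools are installed (versions will vary)
aws --version
kubectl version --client
eksctl version
helm version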
Installation Steps
- First, create some environment variables we are going to use for the installation.
# Export environment variables
export K8S_VERSION=1.26
export KARPENTER_VERSION="v0.27.5"
export CLUSTER_NAME="demo"
export AWS_DEFAULT_REGION="us-east-1"
export AWS_PROFILE=default
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT=$(mktemp)
- Check environment variables before we start the installation.
echo $KARPENTER_VERSION \
$CLUSTER_NAME \
$AWS_DEFAULT_REGION \
$AWS_ACCOUNT_ID \
$TEMPOUT
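- Optionally, add a small bash guard (not part of the Karpenter docs) that warns you if any of the variables above came out empty.
# Warn if any required environment variable is unset or empty
for v in K8S_VERSION KARPENTER_VERSION CLUSTER_NAME AWS_DEFAULT_REGION AWS_ACCOUNT_ID TEMPOUT; do
  [ -n "${!v}" ] || echo "WARNING: $v is empty, re-export it before continuing"
done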
- Let’s create an EKS cluster on AWS
# Karpenter AWS requirements configuration
curl -fsSL https://karpenter.sh/"${KARPENTER_VERSION}"/getting-started/getting-started-with-karpenter/cloudformation.yaml > $TEMPOUT \
&& aws cloudformation deploy \
--stack-name "Karpenter-${CLUSTER_NAME}" \
--template-file "${TEMPOUT}" \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides "ClusterName=${CLUSTER_NAME}"
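# (Optional) Confirm the Karpenter CloudFormation stack deployed successfully
# before creating the cluster; expect CREATE_COMPLETE or UPDATE_COMPLETE
aws cloudformation describe-stacks \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --query "Stacks[0].StackStatus" \
  --output text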
# Create EKS cluster
eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

vpc:
  cidr: 10.10.0.0/16
  #autoAllocateIPv6: true
  # to disable public access to the endpoint and only allow private access,
  # set publicAccess to false
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
  nat:
    gateway: Single

iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: karpenter
        namespace: karpenter
      roleName: ${CLUSTER_NAME}-karpenter
      attachPolicyARNs:
        - arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
      roleOnly: true

iamIdentityMappings:
  - arn: "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
    username: system:node:{{EC2PrivateDNSName}}
    groups:
      - system:bootstrappers
      - system:nodes

# We keep one permanent node for core services like istio, grafana, etc.
managedNodeGroups:
  - instanceType: r5a.large
    amiFamily: AmazonLinux2
    name: ${CLUSTER_NAME}-node-core-services
    labels: { role: core }
    volumeSize: 100
    volumeType: gp3
    desiredCapacity: 1
    minSize: 1
    maxSize: 1

cloudWatch:
  clusterLogging:
    enableTypes: ["api", "audit", "authenticator", "controllerManager", "scheduler"]
    logRetentionInDays: 7

addons:
  - name: coredns
    version: latest # auto discovers the latest available
  - name: kube-proxy
    version: latest
  - name: aws-ebs-csi-driver
    wellKnownPolicies: # add IAM role and service account
      ebsCSIController: true

## Optionally run Karpenter on Fargate
# fargateProfiles:
#   - name: karpenter
#     selectors:
#       - namespace: karpenter
EOF
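- Cluster creation usually takes several minutes. Once eksctl returns, you can optionally confirm the cluster is ACTIVE before continuing.
# Optional: confirm the cluster exists and is ACTIVE
eksctl get cluster --name "${CLUSTER_NAME}" --region "${AWS_DEFAULT_REGION}"
aws eks describe-cluster --name "${CLUSTER_NAME}" --query "cluster.status" --output text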
- Export the cluster endpoint and the Karpenter IAM role ARN, and update the kubeconfig.
export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
# Check cluster endpoint and Karpenter IAM role
echo $CLUSTER_ENDPOINT $KARPENTER_IAM_ROLE_ARN
# Create kubeconfig
aws eks update-kubeconfig --region $AWS_DEFAULT_REGION --name $CLUSTER_NAME
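- As a quick sanity check, make sure kubectl can reach the new cluster; you should see the single node from the core-services managed node group.
# Verify kubectl access to the new cluster
kubectl cluster-info
kubectl get nodes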
- We need to create the EC2 Spot service-linked role in our account; otherwise, Karpenter will fail to launch nodes on spot instances.
# Create the EC2 Spot service-linked role ('|| true' ignores the error if it already exists)
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true
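- Optionally, confirm the service-linked role now exists; AWSServiceRoleForEC2Spot is the role AWS creates for spot.amazonaws.com.
# Optional: confirm the Spot service-linked role exists
aws iam get-role --role-name AWSServiceRoleForEC2Spot --query "Role.Arn" --output text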
- Install the Karpenter Helm chart.
# Log out of Docker to perform an unauthenticated pull against the public ECR
docker logout public.ecr.aws
helm registry logout public.ecr.aws
# Install karpenter helm chart
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter --create-namespace \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
--set settings.aws.clusterName=${CLUSTER_NAME} \
--set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
--set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
--set controller.resources.requests.cpu=1 \
--set controller.resources.requests.memory=1Gi \
--set controller.resources.limits.cpu=1 \
--set controller.resources.limits.memory=1Gi \
--wait
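- Before configuring Karpenter, it is worth checking that the controller came up cleanly in the karpenter namespace.
# Verify the Karpenter controller is running
kubectl get pods -n karpenter
kubectl get deployment karpenter -n karpenter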
- Once the Karpenter installation is finished, we must configure it by creating a Provisioner. This is how we tell Karpenter what kind of instances it should use when launching a new node and what the autoscaling limits are.
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: "karpenter.k8s.aws/instance-cpu"
      operator: In
      values: ["2", "4", "8", "16", "32"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  limits:
    resources:
      cpu: 10
      memory: 64Gi
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
EOF
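- You can optionally confirm that both custom resources were created; the describe output also shows the requirements and limits Karpenter will enforce.
# Verify the Provisioner and AWSNodeTemplate were created
kubectl get provisioners.karpenter.sh
kubectl get awsnodetemplates.karpenter.k8s.aws
kubectl describe provisioner default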
- To verify that the Karpenter installation works, create a test deployment and scale it up.
# Create a test deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
EOF
# Scale the deployment
kubectl scale deployment inflate --replicas 5
# Check the logs to see how the autoscaling is working
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
# Check the nodes created by karpenter
kubectl get nodes
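- To confirm the new nodes are actually spot instances, you can print the capacity-type and instance-type labels that Karpenter sets on the nodes it provisions.
# Show capacity type and instance type labels on the nodes
kubectl get nodes -L karpenter.sh/capacity-type -L node.kubernetes.io/instance-type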
- Delete the test deployment to scale the nodes back down.
# Delete test deployment
kubectl delete deployment inflate
# Check the logs to watch the progress of the nodes scaling down
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
# Check if the nodes are gone
kubectl get nodes
- Once we have finished testing Karpenter, we can clean up and delete the EKS cluster.
# Uninstall Karpenter helm chart
helm uninstall karpenter --namespace karpenter
# Remove resources created by cloudformation
aws cloudformation delete-stack --stack-name "Karpenter-${CLUSTER_NAME}"
# Delete launch templates
aws ec2 describe-launch-templates --filters Name=tag:karpenter.k8s.aws/cluster,Values=${CLUSTER_NAME} |
jq -r ".LaunchTemplates[].LaunchTemplateName" |
xargs -I{} aws ec2 delete-launch-template --launch-template-name {}
# Delete EKS cluster
eksctl delete cluster --name "${CLUSTER_NAME}"
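- Cluster and stack deletion can take a while; the optional commands below wait for the CloudFormation stack to finish deleting and list any EKS clusters left in the region.
# Wait for the Karpenter CloudFormation stack deletion to complete
aws cloudformation wait stack-delete-complete --stack-name "Karpenter-${CLUSTER_NAME}"
# Optional: list any remaining EKS clusters in the region
aws eks list-clusters --output table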