




How to (and why) replace AWS CNI with Calico on AWS EKS cluster

On 09 Apr, 2020
AWS, Kubernetes

All EKS clusters come with the default AWS CNI plugin, which provides some nice features such as giving each pod an address within the VPC subnet range, with the performance of an ENI. So why on earth would you want to use some other CNI?

Apart from some SNAT issues you may encounter while deploying your first clusters, there’s one BIG limitation of the AWS CNI: it is bound by the number of ENIs, and IP addresses per ENI, that you can assign to a single instance.

The table of per-instance-type ENI limits in the AWS documentation shows those limits. As you can see, for a t3.large, for example, you can assign 3×12 = 36 IP addresses to a single EC2 instance. This seriously limits the number of pods that can be scheduled on a single node. It may or may not be a problem, but if you hit the wall with this limit, here’s the recipe for replacing the AWS CNI with Calico.
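To make the arithmetic concrete, here is a small shell sketch of the resulting pod limit. It assumes the formula EKS uses for its recommended max-pods value, ENIs × (IPs per ENI − 1) + 2: each ENI's primary address is not handed out to pods, and two host-network pods are added back. The function name max_pods is only illustrative.

```shell
# max_pods ENIS IPS_PER_ENI
# Pod limit under the AWS CNI, assuming the formula ENIs * (IPs per ENI - 1) + 2.
max_pods() {
  echo $(( $1 * ($2 - 1) + 2 ))
}

max_pods 3 12   # t3.large: 3 ENIs with 12 IPv4 addresses each, prints 35
```

So the 36 addresses of a t3.large translate into roughly 35 schedulable pods, system pods included.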

Remove existing AWS CNI components

First, we need to get rid of the AWS CNI. But please, don’t just delete the daemonset as other tutorials suggest, because you’ll leave the other parts of that component hanging in your cluster.

To do it properly, just do this:

curl https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/release-1.5/config/v1.5/aws-k8s-cni.yaml > aws-cni.yaml
kubectl delete -f aws-cni.yaml
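To see why deleting only the daemonset is not enough, it can help to list which resource kinds the downloaded manifest contains. The helper below is a rough sketch (list_kinds is a made-up name) that relies on top-level `kind:` keys starting at column 0. I'd expect it to show RBAC objects and a service account alongside the DaemonSet, but check the output against your own copy of the file.

```shell
# list_kinds FILE: print the distinct top-level resource kinds in a manifest.
list_kinds() {
  # Top-level "kind:" keys start at column 0; nested ones (e.g. under roleRef)
  # are indented and therefore skipped.
  grep -E '^kind:' "$1" | awk '{ print $2 }' | LC_ALL=C sort -u
}

# Usage, once aws-cni.yaml has been downloaded:
#   list_kinds aws-cni.yaml
```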

Add Calico components

Now it’s time to add the Calico CNI components. For a typical deployment the standard manifest is enough (for larger deployments, please read the docs and adjust the manifest accordingly):

curl https://docs.projectcalico.org/manifests/calico.yaml > calico.yaml
kubectl apply -f calico.yaml
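Before moving on, it is probably worth waiting until Calico is actually running on every node. A minimal sketch, assuming the stock calico.yaml (which, as far as I know, installs a calico-node DaemonSet into kube-system); the function name and the default timeout are my own choices:

```shell
# wait_for_calico [TIMEOUT]: block until the calico-node DaemonSet has rolled out.
# Assumes the stock manifest, i.e. a DaemonSet named calico-node in kube-system.
wait_for_calico() {
  kubectl -n kube-system rollout status daemonset/calico-node --timeout="${1:-300s}"
}

# Usage:
#   wait_for_calico        # wait up to the default 300s
#   wait_for_calico 60s    # or give up after a minute
```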

Disable max pods limit

Unfortunately, replacing the CNI plugin is not enough. We also need to change how the EKS bootstrap script is called and pass it the following flag:

--use-max-pods false

Depending on how the cluster is deployed, there may be different ways to accomplish that. I use the Terraform EKS module, and in that case it’s as simple as adding the flag to bootstrap_extra_args:

module "eks" {
  source           = "terraform-aws-modules/eks/aws"
  version          = "8.2.0"
  ...
  worker_groups_launch_template = [
    {
      ...
      bootstrap_extra_args    = "--use-max-pods false"
    }
  ]
}

Terraform will rotate the nodes for you. After that, you should be able to run (almost) as many pods as you wish :)
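To confirm that the rotated nodes really dropped the ENI-based limit, you can look at the pod capacity each kubelet reports. This is only a sketch (show_pod_capacity is an illustrative name). With the flag in place I'd expect the kubelet's own default to show up, which I believe is 110 unless you override it, and that would be the "almost" above.

```shell
# show_pod_capacity: print each node's name and how many pods its kubelet accepts.
show_pod_capacity() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,MAX_PODS:.status.allocatable.pods'
}

# Usage:
#   show_pod_capacity
```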

The last part is about fixing stuff that depends on CNI plugins.

Fix kube2iam

If you use kube2iam (IRSA not elastic enough? :)), you need to change the interface to cali+. For example, for a Helm-based deployment you need to set:

host:
  iptables: true
  interface: cali+

Fix metrics-server

Another problem I discovered after deploying Calico is that the EKS-managed Kubernetes API server cannot reach pods on the internal CIDR that Calico uses. The easiest acceptable way to fix that is to enable host networking for the metrics-server pod (again, for Helm-based deployments):

hostNetwork:
  enabled: true
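A quick way to check that the fix took effect is to ask the API server about the metrics API and then request some metrics. A rough sketch, assuming metrics-server registers the usual v1beta1.metrics.k8s.io APIService; the function name is made up:

```shell
# check_metrics_server: verify that the API server can talk to metrics-server.
check_metrics_server() {
  # The APIService should show AVAILABLE=True once the API server reaches the pod.
  kubectl get apiservice v1beta1.metrics.k8s.io &&
    # If the aggregation works, this prints CPU and memory usage per node.
    kubectl top nodes
}

# Usage:
#   check_metrics_server
```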

And that’s all: now you have a working EKS cluster without the pod-number limit :)



Tags: calico, cni, eks, k8s, kube2iam, kubernetes, metrics-server, terraform


  • Comments ( 3 )

    • Christophe, Aug 11, 2020 at 10:47

      And now you don’t have support and SLA on your cluster.

    • Dennis, Oct 27, 2020 at 18:41

      You sure there’s no support since AWS even has an article about using Calico on EKS? I mean there may be something about not supported without the VPC CNI, but I don’t find anything stating that.
      https://docs.aws.amazon.com/eks/latest/userguide/calico.html

    • HidroFranxi, Jan 10, 2022 at 17:32

      If this is aws managed node group, you still have support from AWS







    Łukasz Tomaszkiewicz


    Łukasz Tomaszkiewicz is a highly skilled and passionate cloud expert who loves to automate repeatable things and secure them.

    His broad experience in the areas of software development, database design, containerization and cloud infrastructure management gives him a holistic view of a modern technology stack.

    In his spare time he enjoys photography, blogging and speaking on local IT-related communities.

    Vim-believer :)


    Copyright © 2006-2018 by Łukasz Tomaszkiewicz. All rights reserved.