Your Starting Advantage: You already have production-grade depth — Kubernetes operators, multi-cloud (AWS/Azure/GCP), ArgoCD, Argo Workflows, and Go. The gap isn't knowledge — it's demonstrable, differentiated projects and system design storytelling at scale. This roadmap closes that gap.
For Kubernetes/Cloud Engineers transitioning into AI/ML Engineering.
Updated with: Data Engineering foundations, Vector DBs, LLMOps, Distributed Training, Security & Governance, and ML Observability.
Resources are listed in the order you should follow them.
You can't be an ML Platform Engineer without understanding the data layer. Start both tracks in parallel.
This gist collects the examples from the blog post "IAM Roles for Service Accounts (IRSA) in AWS EKS within and cross AWS Accounts". As a prerequisite, create the EKS cluster as explained in my earlier blog post "Create Amazon EKS Cluster within its VPC using Terraform", or use this GitHub repository.
The examples assume that your EKS cluster is running and that your AWS CLI is configured for the AWS account that hosts the cluster. If not, please follow our earlier blog post on how to create an EKS cluster using Terraform.