Imagine you are an API Management company and your business depends on your ability to be involved in the request/response lifecycle for HTTP-based API traffic. Also imagine that you have a Kubernetes cluster that runs both your company's applications and some client applications. This means that when it comes to doing API Management for all necessary traffic, you need to be involved in the request/response lifecycle for targets running within Kubernetes, both for requests originating outside the cluster and for some (if not all) requests originating within the cluster. To continue this conversation, let's establish some terminology:
- Inter-Cluster: An external request is made for an API that maps to a resource running within Kubernetes
- Intra-Cluster: An internal request is made for an API that maps to a resource running within Kubernetes
The question at hand is: how do you, as a Kubernetes cluster owner, get involved in the request/response lifecycle for all necessary API traffic as described above?
When it comes to Inter-Cluster traffic handling, Kubernetes does not provide a complete ingress implementation itself. While Kubernetes does provide you with the primitive for storing the routing rules, called Ingress Resources, it does not ship with an Ingress Controller that listens at the edge of the cluster and routes traffic based on the routing rules. While this might seem like an oversight on Kubernetes' part, this is actually a good thing for a number of reasons:
- You can implement ingress (rule storage, Ingress Controller implementation, ...) however you deem fit. This means if you don't agree with Kubernetes' approach, you can design your own.
- If you own the Ingress Controller, you have the necessary touch point to be involved in the request/response lifecycle
Note: There are a number of example Ingress Controller implementations located here: https://github.com/kubernetes/contrib/tree/master/ingress/controllers (These can be used as-is or as a guide for implementing your own Ingress Controller)
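To make the rule-storage piece concrete, here is a minimal Ingress Resource of the kind an Ingress Controller would consume; the hostname, path, and backend Service are made up for illustration:

```yaml
# A minimal Ingress Resource: the routing rules an Ingress Controller reads.
# The hostname, path, and backend Service are hypothetical.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /nodejs
        backend:
          serviceName: my-application-service
          servicePort: 80
```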
So if you own the Ingress Controller implementation, you can change the direct `Client -> Ingress Controller -> Pod` traffic flow to `Client -> Ingress Controller -> API Management -> Pod`. Problem solved.
Note: Replace "API Management" with whatever your business needs are.
When you are within a Kubernetes cluster, communication happens directly: you are given the IP address of a Pod/Service you depend on (or you look it up using the Kubernetes API), or you use DNS (assuming you have deployed the optional, but strongly suggested, DNS cluster add-on). Unfortunately, Intra-Cluster communication has no central equivalent of the Ingress Controller: [Pods][pods] can communicate directly with other Pods, and Pods can communicate directly with [Services][services], but there is no Ingress Controller equivalent sitting between them.
Note: When using Services, there is a middleman called kube-proxy, but it is involved purely for load balancing. And when you are not using Services, kube-proxy is not in the request/response lifecycle at all.
This means that, out of the box, there is no way to be involved in the request/response lifecycle for Intra-Cluster communication. But fret not: below is a proposal for how this can be done, one that not only solves the problem but also reuses the ingress concept, which keeps things simple. Let's get to the proposal.
While working on an Ingress Controller implementation (https://github.com/30x/k8s-pods-ingress), we realized that processing Inter-Cluster requests is almost identical to processing Intra-Cluster requests: you have routing rules based on hostname/path combinations and a Controller that uses those rules to route to Pods/Services. The main difference is that while Inter-Cluster requests will be for some hostname, like `www.github.com`, Intra-Cluster requests would be for an IP, Pod name, or Service name. These nuances do not change the fact that you could repurpose an Ingress Controller to handle both Inter-Cluster and Intra-Cluster communication routing. (This does not mean you have to have one Controller deployed that handles both types of traffic; it just means that the same source base could be written to serve both the Inter-Cluster and Intra-Cluster use cases.)
Below are the details of the proposal. For each of the important pieces, there is likely more than one way to do this. The hope is that this will be a launching point to discuss the viability of supporting something like this natively within Kubernetes, and maybe even coming up with the best solution for doing this with Kubernetes as-is.
To simplify the problem being solved, we need to take the typical `PodA -> PodB` communication and turn it into `PodA -> Intra-Cluster Router -> PodB`.
Before we go into the proposal details, we should get one thing straight: This proposal is being made with the hopes that the application author is not impacted. Here are the design considerations:
- There should be no Kubernetes-specific code in their application, at least not related to routing unless that is a part of their application
- This implementation should not dictate what Kubernetes constructs can/cannot be used
- This implementation should not require the application author to do weird things (Example: Requiring the application to make requests like `curl -v -H "Host: SERVICE_NAME" http://INGRESS/path` instead of `curl -v http://SERVICE_NAME/path`)
Much like the Ingress Controller, we need some Controller that will process traffic based on hostname/path combinations and route based on the known routing rules. The specifics of how you deploy your Controller are not really important unless your Controller has some mode-specific configuration. Other than that, you can deploy your Controller as a Replication Controller, a Replica Set, a Pod, etc.
Note: For proper isolation, you might deploy your Controller so that it is not exposed to the outside world.
Now while the actual specifics on how you deploy your Controller do not matter, the Controller will dictate how and where the routing rules are stored. Once your Controller is deployed, we will create a Service for the router so that we can reference it later.
So now that we have a Controller deployed that can handle traffic, we need to look at the options for storing the routing rules for the Intra-Cluster traffic routing. As mentioned above, the Controller dictates how and where the routing rules are stored, but since there is no native Kubernetes object for storing these rules, we have a few options:
- A custom Kubernetes object (a sketch of this option follows the list)
- Overload the Ingress Resource object (not suggested), using labels to dictate Inter-Cluster vs. Intra-Cluster routing
- Use annotations/labels for storing these rules
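The third option (annotations) is what the full example later in this document uses. To illustrate the first option, here is a rough sketch of what a custom object might look like using the ThirdPartyResource mechanism available at the time; the `example.com` group, the `RouteRule` kind, and the rule fields are all hypothetical:

```yaml
# Hypothetical custom object for routing rules via ThirdPartyResource.
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  # ThirdPartyResource names take the form <kind>.<group>
  name: route-rule.example.com
description: "Routing rules consumed by the Intra-Cluster Router"
versions:
- name: v1
---
# An instance of the custom object describing one routing rule (fields are made up)
apiVersion: example.com/v1
kind: RouteRule
metadata:
  name: my-application-route
spec:
  host: my-application.my-namespace
  path: /nodejs
  targetPort: 3000
```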
Up to this point, nothing special has been suggested: while Kubernetes does not solve this problem domain natively, nothing described so far requires more than what Kubernetes already provides. What is missing is the wiring that lets application authors request some hostname/path and have that traffic automatically routed through our Intra-Cluster Router.
To achieve our needs, we somehow need to make it so that a Pod/Service name gets pointed to our Intra-Cluster Controller. One way to do this would be to use proxy Services. The idea here is you would create a Service that points to the Intra-Cluster Router instead of your Pods. To do this, you would use the appropriate Label Selector that identifies Pods running the Intra-Cluster Router. Then when your service is resolved, the traffic would go to the Intra-Cluster Router which itself would route the traffic to the appropriate Pod(s) based on the routing rules.
This is not ideal, and it does not solve direct Pod <-> Pod communication; but since that is an anti-pattern of sorts, maybe it does not matter. For Services, this should work.
At the end of the day, what makes this work is a convention that instead of an application creating a Service that resolves to their Pods, it resolves to the Intra-Cluster Router Pods. It just so happens that something similar has come up recently called "Service Aliases": kubernetes/kubernetes#13748
Below are example Kubernetes deployment files for each of the moving pieces described above. This example will be built using the k8s-pods-ingress as the Intra-Cluster Router.
Note: As mentioned above, you can deploy the Intra-Cluster Router using whatever Kubernetes construct fits your needs. To remove any ambiguity, below is an example that deploys both an Inter-Cluster router (Ingress Controller) and an Intra-Cluster router. The reason this is important is to show how you can describe both internal and external routes with the same deployment.
```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: routers
  labels:
    name: routers
spec:
  template:
    metadata:
      labels:
        name: routers
    spec:
      containers:
      - image: whitlockjc/k8s-pods-ingress:v0
        imagePullPolicy: Always
        name: inter-cluster-router
        ports:
        - containerPort: 80
          hostPort: 80
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        # Use the configuration to use the public/private paradigm (Inter-Cluster version)
        - name: API_KEY_SECRET_LOCATION
          value: routing:public-api-key
        - name: HOSTS_ANNOTATION
          value: publicHosts
        - name: PATHS_ANNOTATION
          value: publicPaths
      - image: whitlockjc/k8s-pods-ingress:v0
        imagePullPolicy: Always
        name: intra-cluster-router
        ports:
        - containerPort: 81
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        # Use the configuration to use the public/private paradigm (Intra-Cluster version)
        - name: API_KEY_SECRET_LOCATION
          value: routing:private-api-key
        - name: HOSTS_ANNOTATION
          value: privateHosts
        - name: PATHS_ANNOTATION
          value: privatePaths
        # Since we cannot have two containers listening on the same port, use a different port for the private router
        - name: PORT
          value: "81"
---
apiVersion: v1
kind: Service
metadata:
  name: intra-cluster-router-service
  labels:
    name: intra-cluster-router-service
spec:
  ports:
  - port: 80
    # The Intra-Cluster router container listens on port 81
    targetPort: 81
  selector:
    name: routers
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-application
  labels:
    name: my-application
spec:
  replicas: 1
  selector:
    name: my-application
  template:
    metadata:
      labels:
        name: my-application
        routable: "true"
      annotations:
        # Expose this application to the Inter-Cluster router so that http://test.apigee.com/nodejs routes here
        publicHosts: "test.apigee.com"
        publicPaths: "3000:/nodejs"
        # Expose this application to the Intra-Cluster router so that http://my-application-service.my-namespace/nodejs routes here
        privateHosts: "my-application-service.my-namespace"
        privatePaths: "3000:/nodejs"
    spec:
      containers:
      - name: nodejs-k8s-env
        image: whitlockjc/nodejs-k8s-env:v0
        ports:
        - containerPort: 3000
        env:
        - name: PORT
          value: "3000"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
---
apiVersion: v1
kind: Service
metadata:
  name: my-application-service
  labels:
    name: my-application-service
spec:
  ports:
  - port: 80
    # Send traffic to the Intra-Cluster router's port
    targetPort: 81
  selector:
    # This is basically a proxy Service, so instead of using a label selector
    # that points to your Pods, point it to the Intra-Cluster Router Pod(s).
    name: routers
```
So now that you've got your application deployed in a way that allows for Intra-Cluster communication processing, let's deploy another application, called `my-other-application`, that consumes it.
```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-other-application
  labels:
    name: my-other-application
spec:
  replicas: 1
  selector:
    name: my-other-application
  template:
    metadata:
      labels:
        name: my-other-application
    spec:
      containers:
      - name: some-application
        image: whitlockjc/some-application
```
At this point, if you have deployed all of the files above, your `my-other-application` should be able to make an HTTP request to `my-application-service.my-namespace` and have it routed through the Intra-Cluster Router, ending up at the Pod corresponding to your application. Here is the flow:

`my-other-application -> my-application-service -> Intra-Cluster Router -> my-application`
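If you want to sanity-check that flow from inside the cluster, a throwaway Pod along these lines can issue the request; the Pod name and image are just placeholders:

```yaml
# Hypothetical one-off Pod that makes a request through the Intra-Cluster Router
apiVersion: v1
kind: Pod
metadata:
  name: routing-check
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: busybox
    # Request the alias Service; the Host header it sends is what the router matches on
    command: ["wget", "-qO-", "http://my-application-service.my-namespace/nodejs"]
```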
I now realize this approach is somewhat naive: it works fine within the confines of a single namespace, but once you want to cross the namespace boundary, you really should rely on the `clusterIP` of the intra-cluster router Service, and use selector-less Services for the alias/proxy Services, with a manually created Endpoints object for each alias/proxy Service pointing to the `clusterIP` of the intra-cluster router.
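To make that last point concrete, here is a minimal sketch of a cross-namespace alias/proxy Service built that way; the namespace and the `clusterIP` value (10.0.0.42) are hypothetical placeholders for the intra-cluster router Service's actual address:

```yaml
# Selector-less alias/proxy Service in the consuming application's namespace
apiVersion: v1
kind: Service
metadata:
  name: my-application-service
  namespace: my-other-namespace
spec:
  ports:
  - port: 80
---
# Manually created Endpoints object; the name must match the Service above.
# The address is the (hypothetical) clusterIP of the intra-cluster router Service.
apiVersion: v1
kind: Endpoints
metadata:
  name: my-application-service
  namespace: my-other-namespace
subsets:
- addresses:
  - ip: 10.0.0.42
  ports:
  - port: 80
```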