The intent is to create my own prow instance to serve my own GitHub org as a testbed / staging area for changes I would want to make to prow.k8s.io.
Made GitHub things
- Created @bashfire org
- Reusing @spiffxp-bot user
- Invited @spiffxp-bot to @bashfire
- Set up @spiffxp-bot as an Owner of @bashfire
Followed Getting Started Guide
- at https://github.com/kubernetes/test-infra/blob/master/prow/getting_started_deploy.md
- Used `tackle` via bazel in my existing checkout of the kubernetes/test-infra repo (roughly the command sketched below) and it pretty much worked - TODO: needed to add `admin:org_hook` to the recommended scopes of `public_repo` and `repo:status`
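For reference, a minimal sketch of how I invoked it, assuming the bazel target path the guide used at the time:

```sh
# from the root of a kubernetes/test-infra checkout; tackle prompts interactively
# for the GitHub token, cluster, and org/repo details
bazel run //prow/cmd/tackle
```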
I went to the Next Steps section, and did things a bit out of order.
I created a bare repo at https://github.com/bashfire/prow-config
Added a Makefile:

```makefile
# GKE cluster variables
PROJECT ?= spiffxp-gke-dev
ZONE ?= us-central1-f
CLUSTER ?= prow

# prow docker image variables
REPO ?= gcr.io/k8s-prow
TAG ?= v20190620-f4e6553b2

CWD = $(shell pwd)

get-cluster-credentials:
	gcloud container clusters get-credentials "$(CLUSTER)" --project="$(PROJECT)" --zone="$(ZONE)"
.PHONY: get-cluster-credentials

update-config: get-cluster-credentials check-config
	kubectl create configmap config --from-file=config.yaml=config.yaml --dry-run -o yaml | kubectl replace configmap config -f -

update-plugins: get-cluster-credentials check-config
	kubectl create configmap plugins --from-file=plugins.yaml=plugins.yaml --dry-run -o yaml | kubectl replace configmap plugins -f -
.PHONY: update-config update-plugins

check-config:
	docker run -v $(CWD):/prow $(REPO)/checkconfig:$(TAG) --config-path /prow/config.yaml --plugin-config /prow/plugins.yaml
.PHONY: check-config
```
Verified I could update the cluster's plugins and config
```yaml
# plugins.yaml
plugins:
  bashfire:
  - cat
  - size
```

```yaml
# config.yaml
prowjob_namespace: default
pod_namespace: test-pods
```

```sh
# from bashfire/prow-config
make check-config
make update-plugins
make update-config
```
Followed the steps in Configure Cloud Storage
```sh
gcloud iam service-accounts create spiffxp-gke-dev-prow # step 1
identifier="$( gcloud iam service-accounts list --filter 'name:spiffxp-gke-dev-prow' --format 'value(email)' )"
gsutil mb gs://bashfire-prow # step 2
gsutil iam ch allUsers:objectViewer gs://bashfire-prow # step 3
gsutil iam ch "serviceAccount:${identifier}:objectCreator" gs://bashfire-prow # step 4
gcloud iam service-accounts keys create --iam-account "${identifier}" service-account.json # step 5
kubectl create secret generic gcs-credentials --from-file=service-account.json # step 6
```
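A quick sanity check of the resulting bucket policy (my own verification step, not from the guide):

```sh
# should show allUsers with roles/storage.objectViewer and the new
# service account with roles/storage.objectCreator
gsutil iam get gs://bashfire-prow
```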
Next tried the Add More Jobs step
Discovered I had to add more to `config.yaml` and `plugins.yaml` than is documented. Also, because I told pods to run in a different namespace, I needed to put the `gcs-credentials` secret in that namespace as well.
```yaml
# config.yaml
plank:
  default_decoration_config:
    utility_images: # TODO: "no default decoration image pull specs provided for plank" could be a clearer error message
      clonerefs: "gcr.io/k8s-prow/clonerefs:v20190619-25afbb545"
      initupload: "gcr.io/k8s-prow/initupload:v20190619-25afbb545"
      entrypoint: "gcr.io/k8s-prow/entrypoint:v20190619-25afbb545"
      sidecar: "gcr.io/k8s-prow/sidecar:v20190619-25afbb545"
    gcs_configuration:
      bucket: bashfire-prow
      path_strategy: explicit
    gcs_credentials_secret: gcs-credentials
```
```yaml
# plugins.yaml
plugins:
  bashfire:
  - cat
  - size
  - trigger
```
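For context, the job I was exercising was the `echo-test` periodic from the getting started guide; my decorated version of it looked something like this (paraphrased, so treat the exact stanza as approximate):

```yaml
# config.yaml (alongside the plank section above)
periodics:
- name: echo-test
  interval: 10m
  decorate: true
  spec:
    containers:
    - image: alpine
      command: ["/bin/date"]
```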
I then discovered the `gcs-credentials` secret wasn't being found, because it was in the `default` namespace, and the Pods created by ProwJobs were now being created in the `test-pods` namespace.
```sh
kubectl delete secret gcs-credentials
kubectl create secret generic gcs-credentials --namespace=test-pods --from-file=service-account.json
```
When I did this, the `echo-test` periodic job that had been working stopped working.
```sh
kubectl get pods --namespace=test-pods --field-selector=status.phase!=Succeeded
# pick a pod
kubectl describe -n test-pods pod/3465dd1c-96d1-11e9-b19b-aa00477a1c7c
# look for the Init Containers section, see initupload has State: Terminated
kubectl logs -n test-pods 3465dd1c-96d1-11e9-b19b-aa00477a1c7c -c initupload
# {"component":"initupload","error":"failed to upload to GCS: failed to upload to GCS: encountered errors during upload: [[googleapi: Error 403: [email protected] does not have storage.objects.delete access to bashfire-prow/logs/echo-test/latest-build.txt., forbidden]]","file":"prow/cmd/initupload/main.go:45","func":"main.main","level":"fatal","msg":"Failed to initialize job","time":"2019-06-24T22:41:54Z"}
```
It turns out `initupload`'s Run calls `gcsupload`'s Run, which will ultimately try overwriting `latest-build.txt`. It seems weird that we'd want to do that before a build has actually finished. But let's assume the legacy behavior is correct before we fall too deep down the rabbit hole.
So, TODO: `objectCreator` isn't sufficient permission for periodics to take advantage of podutils.
gsutil iam ch "serviceAccount:${identifier}:objectAdmin" gs://bashfire-prow
At this point, I had a more or less functional prow cluster. Next, I wanted to allow prow to update itself.
```yaml
# added to plugins.yaml
config_updater:
  config_file: config.yaml
  plugin_file: plugins.yaml
plugins:
  #...
  bashfire/prow-config:
  - config-updater
```

```sh
make update-plugins
```
Next I wanted to set up the prow.bashfire.dev domain name. I own bashfire.dev through namecheap.com, so I followed the instructions at https://cloud.google.com/dns/docs/quickstart and https://cloud.google.com/dns/docs/update-name-servers to set up a bashfire.dev zone in Google Cloud DNS.
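In Cloud DNS terms that boils down to something like the following; the zone name `bashfire-dev` and the TTL are my own choices here, and the A record points at the static IP created further down:

```sh
# create the managed zone (the dns-name must end with a dot)
gcloud dns managed-zones create bashfire-dev --dns-name="bashfire.dev." --description="zone for bashfire.dev"
# list the zone's name servers, to be entered at namecheap.com
gcloud dns managed-zones describe bashfire-dev --format="value(nameServers)"
# add an A record for prow.bashfire.dev pointing at the ingress' static IP (created later)
gcloud dns record-sets transaction start --zone=bashfire-dev
gcloud dns record-sets transaction add --zone=bashfire-dev --name="prow.bashfire.dev." --ttl=300 --type=A "<static-ip>"
gcloud dns record-sets transaction execute --zone=bashfire-dev
```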
I could get to prow.bashfire.dev, but it was redirecting to https, which was not happy because I hadn't set up ingress yet.
I followed instructions at https://docs.cert-manager.io/en/latest/getting-started/install/kubernetes.html to install cert-manager and verify that it worked
```sh
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
# using kubectl 1.12 so need the --validate=false flag
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.8.1/cert-manager.yaml
```
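The "verify that it worked" part of those docs boils down to creating a throwaway self-signed Issuer and Certificate and checking that cert-manager fills in the secret; roughly (the resource names below are illustrative, nothing in my cluster depends on them):

```yaml
# test-resources.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
```

`kubectl apply -f test-resources.yaml`, then `kubectl describe certificate selfsigned-cert -n cert-manager-test` should show the certificate being issued; delete the `cert-manager-test` namespace afterwards.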
I followed some combination of https://docs.cert-manager.io/en/latest/tasks/issuers/setup-acme/index.html and looking at test-infra's setup to get https://prow.bashfire.dev to work. There was some flailing involved because of deltas from all the ideal/example setups:
- I'm using a newer version of cert-manager than test-infra and trying to avoid deprecated flags
- I'm using the gce-ingress setup that comes by default with GKE rather than trying to setup nginx-ingress
- I had this running so long that a self-signed cert was picked up by gce-ingress first; once I had a proper cert it took a while for it to be picked up (https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#setting_up_https_tls_between_client_and_load_balancer claims 10 minutes)
- I was pointed at the staging ACME server for a while, so I was getting a cert, it just wasn't signed by a valid CA
- I verified what the actual cert was in-cluster via `kubectl get secret tls-ingress-secret -o=json | jq -r '.data["tls.crt"]' | base64 --decode | openssl x509 -text -noout`
```sh
gcloud compute addresses create prow-bashfire-dev-ip --global
kubectl delete ingress ing
kubectl apply -f tls-ingress.yaml
kubectl apply -f cluster-issuer.yaml
```
```yaml
# tls-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: default
  name: tls-ingress
  annotations:
    kubernetes.io/ingress.class: gce
    kubernetes.io/ingress.global-static-ip-name: prow-bashfire-dev-ip
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - secretName: tls-ingress-secret
    hosts:
    - prow.bashfire.dev
  rules:
  - host: prow.bashfire.dev
    http:
      paths:
      - path: /*
        backend:
          serviceName: deck
          servicePort: 80
      - path: /hook
        backend:
          serviceName: hook
          servicePort: 8888
```
```yaml
# cluster-issuer.yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: [email protected]
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
    - http01:
        ingress:
          name: tls-ingress
```
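To sanity-check the end state, these are the commands I'd reach for (not from any particular guide): confirm the ingress picked up the static IP, and confirm from outside the cluster that the served cert is the Let's Encrypt one rather than the earlier self-signed one.

```sh
# the address the prow.bashfire.dev A record should point at
gcloud compute addresses describe prow-bashfire-dev-ip --global --format="value(address)"
# events show the load balancer syncing; spec shows the TLS secret in use
kubectl describe ingress tls-ingress
# issuer should be Let's Encrypt and the dates should be recent
echo | openssl s_client -connect prow.bashfire.dev:443 -servername prow.bashfire.dev 2>/dev/null \
  | openssl x509 -noout -issuer -subject -dates
```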