A demo of OpenShift GitOps and RHACM OperatorPolicies

Overview

This gist demonstrates using Red Hat OpenShift GitOps (read: ArgoCD) with the policy framework in Red Hat Advanced Cluster Management for Kubernetes (RHACM, with a community at open-cluster-management.io). In particular, it shows how the OperatorPolicy kind provides a more GitOps-friendly surface for managing OLM operators, compared to managing Subscriptions, ClusterServiceVersions, InstallPlans (and more) directly.

For more information on OperatorPolicy from a non-GitOps view, I have written a separate article: https://developers.redhat.com/articles/2024/08/08/getting-started-operatorpolicy.

Installation and Configuration

Although this document will not cover how to install RHACM, it includes zz_gitops-policy.yaml to install the OpenShift GitOps operator with the RHACM policy framework. The policy should work out-of-the-box when applied directly on the hub cluster, but it can also be copied into the Policy "wizard" in the RHACM console and then customized to apply to specific clusters in the fleet. Without modifications it will apply to all managed clusters.
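
For example, applying it directly on the hub is a single command (this assumes you are logged in to the hub with permissions to create Policies in the default open-cluster-management-global-set namespace):

# Apply the full policy, Placement, and PlacementBinding from this gist to the hub cluster
oc apply -f zz_gitops-policy.yaml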

After the operator installation is complete (gated by policy dependencies), the policy will also configure the ArgoCD object on the cluster to use healthchecks specific to the policy types, and will create a ClusterRole and ClusterRoleBinding so that ArgoCD can sync the policy types.
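
For reference, that gating is expressed with extraDependencies in zz_gitops-policy.yaml (shown in full at the end of this gist); each follow-up template waits for the operator policy to become Compliant:

- extraDependencies:
  - name: gitops-operator
    namespace: ""
    apiVersion: policy.open-cluster-management.io/v1beta1
    kind: OperatorPolicy
    compliance: Compliant
  objectDefinition:
    # ... a ConfigurationPolicy that configures the ArgoCD resource ...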

Those policies can be viewed and managed in ArgoCD by creating an Application like this:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gitops-policy
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
  project: default
  source:
    directory:
      include: zz_gitops-policy.yaml
      jsonnet: {}
    path: .
    repoURL: https://gist.github.com/5b810888609e9dace5c63c443fee6ad9.git
    targetRevision: HEAD

This Application is also a nice example of viewing RHACM resources in ArgoCD, since it includes many different types.

Traditional Installation of an Operator

Ideally, installing an operator is as simple as creating a single Subscription. We will use the Quay operator as an example throughout this gist:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: quay-operator
  namespace: openshift-operators
spec:
  channel: stable-3.11
  installPlanApproval: Automatic
  name: quay-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Warning

The actual functionality of the example operator installations has not been verified here, only that the operator appears to install correctly.

This Subscription installs the latest 3.11.z version and keeps it automatically updated. However, this resource cannot declare exactly which version of the operator we want to maintain on the cluster. There is spec.startingCSV, but it does not control the current/active version of the operator. This Subscription specifies Automatic approval of future InstallPlans, and the only alternative is Manual. The Manual setting affects all other operators in that namespace, and requires users to parse and approve InstallPlans that are not predictably named.

One option for approving these InstallPlans is to periodically create a Job which can look up all InstallPlans and approve some or all of them. One implementation of this is at https://github.com/redhat-cop/gitops-catalog/tree/main/installplan-approver
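
For illustration only (not taken from that repo), manually approving a pending InstallPlan means flipping spec.approved on a resource whose generated name differs on every cluster; the InstallPlan name below is a hypothetical placeholder:

# List pending InstallPlans, then approve one by its generated name
kubectl get installplans -n openshift-operators
kubectl patch installplan install-abc12 -n openshift-operators \
  --type merge --patch '{"spec":{"approved":true}}'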

Using OperatorPolicy for a Specific Version

Another option is to use OperatorPolicy, which became generally available in RHACM 2.11. To install specifically the 3.11.1 version of the quay operator, and allow for controlled upgrades in the future, you can create this OperatorPolicy:

apiVersion: policy.open-cluster-management.io/v1beta1
kind: OperatorPolicy
metadata:
  name: quay-operator
  namespace: local-cluster
spec:
  complianceType: musthave
  remediationAction: enforce
  severity: high
  subscription:
    name: quay-operator
    channel: stable-3.11
    startingCSV: quay-operator.v3.11.1
  upgradeApproval: Automatic
  versions:
  - quay-operator.v3.11.1

To follow along with this "demo", create an Application like this, with a pinned revision that can be updated to advance through the steps:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: quay-operator
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
  project: default
  source:
    directory:
      include: quay-operator-policy.yaml
      jsonnet: {}
    path: .
    repoURL: https://gist.github.com/5b810888609e9dace5c63c443fee6ad9.git
    targetRevision: d28ee80edcf50420bc1cdb2aca16e297f36db57c

Thanks to the healthchecks we've configured, the application will report itself as Degraded while the operator is installing. Once that process is complete, it will become Healthy, and we can see more information in the health message:

# kubectl get application.argoproj -n openshift-gitops quay-operator -o yaml
status:
  resources:
  - group: policy.open-cluster-management.io
    health:
      message: Compliant; the policy spec is valid, the policy does not specify an
        OperatorGroup but one already exists in the namespace - assuming that OperatorGroup
        is correct, the Subscription matches what is required by the policy, an InstallPlan
        to update to [quay-operator.v3.11.4] is available for approval but approval
        for [quay-operator.v3.11.4] is required, ClusterServiceVersion (quay-operator.v3.11.1)
        - install strategy completed with no errors, there are CRDs present for the
        operator, all operator Deployments have their minimum availability, CatalogSource
        was found
      status: Healthy
    kind: OperatorPolicy
    name: quay-operator

At the time of writing, quay-operator.v3.11.4 was the latest version in the stable-3.11 channel, and OLM had prepared a new InstallPlan to upgrade to that version. The spec.versions list in the OperatorPolicy does not include that version as allowed, so the policy controller will not approve that InstallPlan.

Tip

By default, the OperatorPolicy only reports the upgrade as available - the Policy is Compliant and the Application is Healthy. The spec.complianceConfig.upgradesAvailable field configures this behavior: it can be set to NonCompliant to make the Policy NonCompliant and the Application Degraded, to draw extra attention to the situation.
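
For example, adding this to the OperatorPolicy spec shown above changes the reporting behavior:

spec:
  complianceConfig:
    upgradesAvailable: NonCompliant   # default is Compliant, which only reports the available upgrade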

Upgrades to the Operator with OperatorPolicy

Assuming the role of a responsible administrator, say we've noticed that an update is available, and we've confirmed that this new version will work on the cluster. Traditionally we would need to approve the InstallPlan, but the name of that resource will vary between clusters. With OperatorPolicy, we can simply add this version to the spec:

@@ -14,3 +14,4 @@ spec:
   upgradeApproval: Automatic
   versions:
   - quay-operator.v3.11.1
+  - quay-operator.v3.11.4

The policy controller will then approve the InstallPlan (as long as it does not include any unapproved updates to other operators), and the Application should become healthy again once the new Deployment is available. Note that if an even newer version is available in the stable-3.11 channel, perhaps v3.11.5, then this change will not approve the InstallPlan for that version; unfortunately, the operator would then remain at version v3.11.1. To allow any version on that channel, set the versions field to an empty list, which behaves similarly to the traditional installPlanApproval: Automatic Subscription setting:

@@ -12,6 +12,4 @@ spec:
     channel: stable-3.11
     startingCSV: quay-operator.v3.11.1
   upgradeApproval: Automatic
-  versions:
-  - quay-operator.v3.11.1
+  versions: []

If you're following along, you can update the quay-operator Application to targetRevision fa824a3408a0aec612cfb6dfca778e4e2b4394ed to allow v3.11.4, or update it to targetRevision d502b05373ee78152e93262867b4a0a816258ba2 to allow all v3.11.z versions.

As one other option, if we no longer wanted to allow the previous version - say it has some vulnerability that we want to avoid - we could update the allowed versions list to only include the non-vulnerable version. The application would be marked as Degraded until a "good" version was running.
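
As a sketch (the exact CSV names here are illustrative), the allowed list in that scenario would simply drop the vulnerable entry:

versions:
- quay-operator.v3.11.4   # v3.11.1 removed; the policy reports NonCompliant until an allowed version is running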

Limitations

OperatorPolicy namespace

Currently, the policy controller only handles OperatorPolicies created in the "managed cluster namespace". This demo used local-cluster, which is the canonically correct namespace when the hub is self-managed by RHACM. When deploying to other clusters, this namespace would need to be customized. That might be possible with Kustomize, or the policy framework can be used to distribute the policies, which handles setting the correct namespace.
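
A minimal sketch of the policy framework approach, mirroring the structure of zz_gitops-policy.yaml at the end of this gist (the names here are illustrative, and a Placement and PlacementBinding like the ones in that file would still be needed to select clusters):

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: quay-operator
  namespace: open-cluster-management-global-set
spec:
  disabled: false
  policy-templates:
  - objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1beta1
      kind: OperatorPolicy
      metadata:
        name: quay-operator   # no namespace here; the framework sets it per managed cluster
      spec:
        complianceType: musthave
        remediationAction: enforce
        severity: high
        subscription:
          name: quay-operator
          channel: stable-3.11
        upgradeApproval: Automatic
        versions:
        - quay-operator.v3.11.1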

Upgrades to arbitrary versions

OperatorPolicy relies on the InstallPlans created by OLM, so it cannot orchestrate upgrades to intermediate versions on the upgrade graph. For example, this demo upgraded from v3.11.1 to v3.11.4 because that was the tip of the upgrade graph at the time and matched the InstallPlan in the cluster. If the policy had instead been set to allow only version v3.11.3, the operator would have remained at version v3.11.1.

quay-operator-policy.yaml

apiVersion: policy.open-cluster-management.io/v1beta1
kind: OperatorPolicy
metadata:
  name: quay-operator
  namespace: local-cluster
spec:
  complianceType: musthave
  remediationAction: enforce
  severity: high
  subscription:
    name: quay-operator
    startingCSV: quay-operator.v3.11.1
  upgradeApproval: Automatic
  versions:
  - quay-operator.v3.11.1
  - quay-operator.v3.11.4
  - quay-operator.v3.12.0

zz_gitops-policy.yaml

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: gitops-operator
  namespace: open-cluster-management-global-set
spec:
  disabled: false
  policy-templates:
  - objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1beta1
      kind: OperatorPolicy
      metadata:
        name: gitops-operator
      spec:
        complianceType: musthave
        remediationAction: enforce
        severity: high
        subscription:
          name: openshift-gitops-operator
        upgradeApproval: None
        versions: []
  - extraDependencies:
    - name: gitops-operator
      namespace: ""
      apiVersion: policy.open-cluster-management.io/v1beta1
      kind: OperatorPolicy
      compliance: Compliant
    objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: gitops-argocd-exists
      spec:
        remediationAction: informonly
        severity: medium
        object-templates:
        - complianceType: musthave
          objectDefinition:
            apiVersion: argoproj.io/v1beta1
            kind: ArgoCD
            metadata:
              name: openshift-gitops
              namespace: openshift-gitops
  - extraDependencies:
    - name: gitops-argocd-exists
      namespace: ""
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      compliance: Compliant
    objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: gitops-operator-policy-healthchecks
      spec:
        remediationAction: enforce
        severity: medium
        object-templates:
        - complianceType: musthave
          objectDefinition:
            apiVersion: argoproj.io/v1beta1
            kind: ArgoCD
            metadata:
              name: openshift-gitops
              namespace: openshift-gitops
            spec:
              resourceHealthChecks:
              - group: policy.open-cluster-management.io
                kind: CertificatePolicy
                check: |
                  hs = {}
                  if obj.status == nil or obj.status.compliant == nil then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be reported"
                    return hs
                  end
                  if obj.status.compliant == "Compliant" then
                    hs.status = "Healthy"
                    hs.message = "All certificates found comply with the policy"
                    return hs
                  else
                    hs.status = "Degraded"
                    hs.message = "At least one certificate does not comply with the policy"
                    return hs
                  end
              - group: policy.open-cluster-management.io
                kind: ConfigurationPolicy
                check: |
                  hs = {}
                  if obj.status == nil or obj.status.compliant == nil then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be reported"
                    return hs
                  end
                  if obj.status.lastEvaluatedGeneration ~= obj.metadata.generation then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be updated"
                    return hs
                  end
                  if obj.status.compliant == "Compliant" then
                    hs.status = "Healthy"
                  else
                    hs.status = "Degraded"
                  end
                  if obj.status.compliancyDetails ~= nil then
                    messages = {}
                    for i, compliancy in ipairs(obj.status.compliancyDetails) do
                      if compliancy.conditions ~= nil then
                        for i, condition in ipairs(compliancy.conditions) do
                          if condition.message ~= nil and condition.type ~= nil then
                            table.insert(messages, condition.type .. " - " .. condition.message)
                          end
                        end
                      end
                    end
                    hs.message = table.concat(messages, "; ")
                    return hs
                  end
                  hs.status = "Progressing"
                  hs.message = "Waiting for compliance"
                  return hs
              - group: policy.open-cluster-management.io
                kind: OperatorPolicy
                check: |
                  hs = {}
                  if obj.status == nil or obj.status.conditions == nil then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be reported"
                    return hs
                  end
                  if obj.status.observedGeneration ~= nil and obj.status.observedGeneration ~= obj.metadata.generation then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be updated"
                    return hs
                  end
                  for i, condition in ipairs(obj.status.conditions) do
                    if condition.type == "Compliant" then
                      hs.message = condition.message
                      if condition.status == "True" then
                        hs.status = "Healthy"
                        return hs
                      else
                        hs.status = "Degraded"
                        return hs
                      end
                    end
                  end
                  hs.status = "Progressing"
                  hs.message = "Waiting for the compliance condition"
                  return hs
              - group: policy.open-cluster-management.io
                kind: Policy
                check: |
                  hs = {}
                  if obj.status == nil or obj.status.compliant == nil then
                    hs.status = "Progressing"
                    hs.message = "Waiting for the status to be reported"
                    return hs
                  end
                  if obj.status.compliant == "Compliant" then
                    hs.status = "Healthy"
                  else
                    hs.status = "Degraded"
                  end
                  noncompliants = {}
                  if obj.status.status ~= nil then
                    -- "root" policy
                    for i, entry in ipairs(obj.status.status) do
                      if entry.compliant ~= "Compliant" then
                        noncompliants[i] = entry.clustername
                      end
                    end
                    if table.getn(noncompliants) == 0 then
                      hs.message = "All clusters are compliant"
                    else
                      hs.message = "NonCompliant clusters: " .. table.concat(noncompliants, ", ")
                    end
                  elseif obj.status.details ~= nil then
                    -- "replicated" policy
                    for i, entry in ipairs(obj.status.details) do
                      if entry.compliant ~= "Compliant" then
                        noncompliants[i] = entry.templateMeta.name
                      end
                    end
                    if table.getn(noncompliants) == 0 then
                      hs.message = "All templates are compliant"
                    else
                      hs.message = "NonCompliant templates: " .. table.concat(noncompliants, ", ")
                    end
                  end
                  return hs
  - extraDependencies:
    - name: gitops-operator
      namespace: ""
      apiVersion: policy.open-cluster-management.io/v1beta1
      kind: OperatorPolicy
      compliance: Compliant
    objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: gitops-operator-policy-permissions
      spec:
        remediationAction: enforce
        severity: medium
        object-templates:
        - complianceType: musthave
          objectDefinition:
            kind: ClusterRole
            apiVersion: rbac.authorization.k8s.io/v1
            metadata:
              name: openshift-gitops-policy-admin
            rules:
            - verbs:
              - get
              - list
              - watch
              - create
              - update
              - patch
              - delete
              apiGroups:
              - policy.open-cluster-management.io
              resources:
              - certificatepolicies
              - configurationpolicies
              - operatorpolicies
              - policies
              - policysets
              - placementbindings
            - verbs:
              - get
              - list
              - watch
              - create
              - update
              - patch
              - delete
              apiGroups:
              - apps.open-cluster-management.io
              resources:
              - placementrules
            - verbs:
              - get
              - list
              - watch
              - create
              - update
              - patch
              - delete
              apiGroups:
              - cluster.open-cluster-management.io
              resources:
              - placements
              - placements/status
              - placementdecisions
              - placementdecisions/status
        - complianceType: musthave
          objectDefinition:
            kind: ClusterRoleBinding
            apiVersion: rbac.authorization.k8s.io/v1
            metadata:
              name: openshift-gitops-policy-admin
            subjects:
            - kind: ServiceAccount
              name: openshift-gitops-argocd-application-controller
              namespace: openshift-gitops
            roleRef:
              apiGroup: rbac.authorization.k8s.io
              kind: ClusterRole
              name: openshift-gitops-policy-admin
---
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: gitops-operator-placement
  namespace: open-cluster-management-global-set
spec:
  clusterSets:
  - global
  tolerations:
  - key: cluster.open-cluster-management.io/unreachable
    operator: Exists
  - key: cluster.open-cluster-management.io/unavailable
    operator: Exists
---
apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: gitops-operator-placement
  namespace: open-cluster-management-global-set
placementRef:
  name: gitops-operator-placement
  apiGroup: cluster.open-cluster-management.io
  kind: Placement
subjects:
- name: gitops-operator
  apiGroup: policy.open-cluster-management.io
  kind: Policy