Results of Reconciliation

In a controller, each execution of the reconciliation loop performs domain-specific operations to eliminate any drift between the declared configuration and the state of the world. Usually, the result of the core operations of the reconciliation is directly a change in the world based on the declared configuration. Other results of the reconciliation are the reported status on the object, the ctrl.Result value and any error that occurred during the reconciliation. The ctrl.Result and error are runtime results, and the reported status is an API change result on the target object.

Based on the controller-patterns(flux) document, the core operations are handled in the sub-reconcilers. The ctrl.Result, error and status API are handled in a deferred function called summarizeAndPatch().

This document describes in detail how these three types of results are formulated and how they affect one another. It describes a generic model for computing these results independent of the domain of a reconciler. It takes into consideration the kstatus standards for reporting the status of objects using status conditions. However, this model can be used to create similar models with other standards in other domains.

        ┌────────────────────────────────────────────────────────────────────┐
        │                             Runtime                                │
        │           ┌──────────────────────────────────────────────────────┐ │
        │           │                Reconciler                            │ │
        │           │                                                      │ │
        │ Reconcile │          ┌─────────────────┐                         │ │
 Event  │  Request  │  object  │ Domain specific │ Intermediate expression │ │
───────►├──────────►│─────────►│    operations   ├──────────┐of results    │ │
   1    │    2      │     3    │       4         │       5  │              │ │
        │           │          └─────────────────┘          │              │ │
        │           │                                       ▼              │ │
        │           │           Runtime results     ┌────────────────┐     │ │
        │           │        (ctrl.Result + error)  │  Final result  │     │ │
        │           │◄──────────────────────────────┤   computation  │     │ │
        │           │      Update object status API │       6        │     │ │
        │           │                 7             └────────────────┘     │ │
        │           └──────────────────────────────────────────────────────┘ │
        │                                                                    │
        └────────────────────────────────────────────────────────────────────┘

As described in the controller-patterns document, the results of reconciliation can be abstracted into three constants, ResultEmpty, ResultRequeue and ResultSuccess, together with a contextual error, to express the result of reconciliation more clearly. Results expressed in these forms are the input to the generic model that computes the final result, which is then used to patch the object status and is returned to the runtime.

We'll go through the details of how the final result is computed and then discuss the user-facing APIs for tuning this result computation based on the needs of a reconciler.

Runtime Result (`ctrl.Result`)

In the result abstraction, ResultEmpty and ResultRequeue have clear meanings, but ResultSuccess may mean different things in different domains. For a reconciler that always reconciles at some particular interval of time, ResultSuccess is a ctrl.Result with the requeue interval as its RequeueAfter value. However, for a reconciler that reconciles only when there's an event related to the objects it watches, ResultSuccess is an empty ctrl.Result value, which is equivalent to ResultEmpty. In this case, although ResultSuccess and ResultEmpty have the same underlying value, they have different meanings. ResultEmpty doesn't mean that the reconciliation was successful. It can be returned along with a failure error or a stalling error. ResultSuccess, on the other hand, is used explicitly to indicate that the reconciler has succeeded in its operations.
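To make the distinction concrete, here is a minimal sketch, not the actual implementation. The Result type and its constant values are assumptions based on the names used in this document, RuntimeResult is a local stand-in for ctrl.Result so that the example compiles without controller-runtime, and toRuntime is a hypothetical helper in which a zero interval models an event-driven reconciler.

```go
package main

import (
	"fmt"
	"time"
)

// Result is an assumed definition of the abstracted reconciliation result.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// RuntimeResult is a local stand-in that mirrors the Requeue and RequeueAfter
// fields of ctrl.Result.
type RuntimeResult struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// toRuntime is a hypothetical helper that converts an abstracted Result into a
// runtime result. A zero interval models an event-driven reconciler.
func toRuntime(res Result, interval time.Duration) RuntimeResult {
	switch res {
	case ResultRequeue:
		return RuntimeResult{Requeue: true}
	case ResultSuccess:
		return RuntimeResult{RequeueAfter: interval}
	default:
		return RuntimeResult{}
	}
}

func main() {
	periodic := toRuntime(ResultSuccess, 5*time.Minute)
	eventDriven := toRuntime(ResultSuccess, 0)
	empty := toRuntime(ResultEmpty, 0)

	fmt.Println(periodic.RequeueAfter) // 5m0s
	// Same underlying value, but only ResultSuccess signals success.
	fmt.Println(eventDriven == empty) // true
}
```

Because the two runtime values are indistinguishable in the event-driven case, the success signal has to be read from the abstracted Result before it is converted.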

The BuildRuntimeResult() introduced in the controller-patterns document is an example of converting these intermediate results into runtime results. To make it customizable, we can define a RuntimeResultBuilder interface that can be used to implement custom result conversion.

// RuntimeResultBuilder defines an interface for runtime result builders. This
// can be implemented to build custom results based on the context of the
// reconciler.
type RuntimeResultBuilder interface {
	BuildRuntimeResult(rr Result, err error) ctrl.Result
}

In the above, Result is the abstracted result and error is the reconciliation error. If, in a given domain, an error affects the returned runtime result, this can be specified in a custom RuntimeResultBuilder implementation. For example, for reconcilers that always requeue at a specific period, a waiting error indicates that the reconciler should wait for some period of time before retrying. A RuntimeResultBuilder implementation that handles this would look like:

// AlwaysRequeueResultBuilder implements a RuntimeResultBuilder for always
// requeuing reconcilers. A successful reconciliation result for such
// reconcilers contains a fixed RequeueAfter value.
type AlwaysRequeueResultBuilder struct {
	// RequeueAfter is the fixed period at which the reconciler requeues on
	// successful execution.
	RequeueAfter time.Duration
}

// BuildRuntimeResult converts a given Result and error into the
// return result of a controller's Reconcile function.
func (r AlwaysRequeueResultBuilder) BuildRuntimeResult(rr Result, err error) ctrl.Result {
	// Handle special errors that contribute to expressing the result.
	if e, ok := err.(*serror.Waiting); ok {
		return ctrl.Result{RequeueAfter: e.RequeueAfter}
	}

	switch rr {
	case ResultRequeue:
		return ctrl.Result{Requeue: true}
	case ResultSuccess:
		return ctrl.Result{RequeueAfter: r.RequeueAfter}
	default:
		return ctrl.Result{}
	}
}

This shows how the runtime result, ctrl.Result, can be computed, taking into consideration the domain specific factors.
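One detail worth noting: the type assertion in BuildRuntimeResult above matches only when the waiting error is returned unwrapped. If sub-reconcilers may wrap it with additional context, errors.As from the Go standard library can be used instead. The sketch below assumes a simplified stand-in for serror.Waiting with only Reason and RequeueAfter fields; requeueAfterFromError and wrap are hypothetical helpers, not part of any Flux package.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Waiting is a simplified stand-in for serror.Waiting.
type Waiting struct {
	Reason       string
	RequeueAfter time.Duration
}

func (w *Waiting) Error() string { return w.Reason }

// wrap adds context to an error while keeping it discoverable by errors.As.
func wrap(msg string, err error) error {
	return fmt.Errorf("%s: %w", msg, err)
}

// requeueAfterFromError reports the requeue period carried by a Waiting error
// anywhere in the error chain.
func requeueAfterFromError(err error) (time.Duration, bool) {
	var w *Waiting
	if errors.As(err, &w) {
		return w.RequeueAfter, true
	}
	return 0, false
}

func main() {
	waiting := &Waiting{Reason: "artifact not ready", RequeueAfter: 30 * time.Second}
	wrapped := wrap("reconcile source", waiting)

	// A plain type assertion misses the wrapped error.
	_, direct := wrapped.(*Waiting)
	after, found := requeueAfterFromError(wrapped)

	fmt.Println(direct, found, after) // false true 30s
}
```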

ComputeReconcileResult

ComputeReconcileResult() was introduced in the controller-patterns document to consolidate all the computation of results. It can use the RuntimeResultBuilder described above to compute the ctrl.Result.

For computing the runtime error and status conditions, we need to consider the kstatus conditions, particularly the Reconciling and Stalled conditions. They are dependent on the Result and error of reconciliation and also affect the runtime error that's computed.

Reconciling status

When the reconciler has detected a drift between the declared configuration and the state of the world, a Reconciling status can be added on the object while the reconciler is working on eliminating the drift. An example of this would be a new configuration. When a new object generation is observed, the reconciler can add a Reconciling status condition on the object (not persisted in the API server yet, only in memory). By the end of the loop, if the reconciliation was successful, the Reconciling status can be removed. But if the reconciliation wasn't successful in eliminating the drift, for example because the reconciler needs to wait and retry or because it encountered an error, the Reconciling status remains on the object status across reconciliation loop runs. In this case, the status condition value is persisted in the API server at the end of a reconciliation.

In this scenario, the reconciliation result and the error affect the object status API. The following information is needed to compute the results:

Was the reconciliation successful?
Did the reconciliation fail due to some error?
Was the reconciliation unsuccessful due to some unmet preconditions that may be resolved by retrying?

Using the result abstraction, we can determine the reconciling status condition and update the object as:

// Remove reconciling condition on successful reconciliation.
if recErr == nil && res == ResultSuccess {
	conditions.Delete(obj, meta.ReconcilingCondition)
}

where recErr is the reconciliation error and res is the abstracted result. The abstracted result makes a clear distinction between a successful result, empty result and immediate requeue result. When the reconciliation error is nil and the reconciliation was successful, the Reconciling condition can be removed.

Note that the Reconciling condition is added by the core operations code. In this section, we only evaluate whether it should stay or be removed.

If the res value was ResultRequeue, which means we need to retry, the Reconciling condition should not be removed.

In other words, the Reconciling condition can be removed when the reconciliation was successful and there's no error.
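The rule can be checked against each combination of result and error. The following is a small sketch under assumptions: the Result constants are an assumed definition based on the names in this document, and removeReconciling and errFetch are hypothetical names that restate the condition from the snippet above as a pure function.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is an assumed definition of the abstracted reconciliation result.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// errFetch is an illustrative reconciliation error.
var errFetch = errors.New("fetch failed")

// removeReconciling reports whether the Reconciling condition can be removed.
// It restates the condition from the snippet above.
func removeReconciling(res Result, recErr error) bool {
	return recErr == nil && res == ResultSuccess
}

func main() {
	cases := []struct {
		name string
		res  Result
		err  error
	}{
		{"success, no error", ResultSuccess, nil},
		{"requeue, no error", ResultRequeue, nil},
		{"empty, no error", ResultEmpty, nil},
		{"empty, with error", ResultEmpty, errFetch},
	}
	for _, c := range cases {
		fmt.Printf("%-18s remove Reconciling: %v\n", c.name, removeReconciling(c.res, c.err))
	}
}
```

Only the first case removes the condition. In every other case Reconciling stays on the object and is persisted at the end of the reconciliation.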

Stalled status

When the reconciler detects a configuration that can't be used to reconcile successfully, even if retried, it can enter a Stalled state. This state requires human intervention to fix the provided configuration. The error result in this situation would be a stalling error.

The Reconciling and Stalled status conditions are mutually exclusive. Reconciling exists when there's no error and a requeue is requested, whereas Stalled exists only when a Stalling error is encountered along with an empty result, ResultEmpty.

Considering these, we can analyze the error and Result as:

// Analyze the reconcile error.
switch t := recErr.(type) {
case *serror.Stalling:
	if res == ResultEmpty {
		// The current generation has been reconciled successfully and it
		// has resulted in a stalled state. Return no error to stop further
		// requeuing.
		pOpts = append(pOpts, patch.WithStatusObservedGeneration{})
		conditions.MarkStalled(obj, t.Reason, t.Error())
		return pOpts, result, nil
	}
case *serror.Waiting:
	// The reconcile resulted in waiting error, but we are not in stalled
	// state.
	conditions.Delete(obj, meta.StalledCondition)
	// The reconciler needs to wait and retry. Return no error. The result
	// contains the requeue after value.
	return pOpts, result, nil
case nil:
	// The reconcile didn't result in any error, we are not in stalled
	// state. If a requeue is requested, the current generation has not been
	// reconciled successfully.
	if res != ResultRequeue {
		pOpts = append(pOpts, patch.WithStatusObservedGeneration{})
	}
	conditions.Delete(obj, meta.StalledCondition)
default:
	// The reconcile resulted in some error, but we are not in stalled
	// state.
	conditions.Delete(obj, meta.StalledCondition)
}

The Stalled status condition is added to the object when a Stalling error is encountered with an empty result.

The Stalled status condition is removed when there is no error or the error is not a Stalling error.

The patch options set in the variable pOpts are used to configure when to update the status.observedGeneration of an object to indicate the status of the object. A stalled state means that the current object generation has been reconciled, so patch.WithStatusObservedGeneration{} is added to the patch options.
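The outcomes of this analysis can be summarized as a pure function, which makes each case easy to test in isolation. The sketch below is an illustration under assumptions, not the actual ComputeReconcileResult(): Stalling and Waiting are simplified stand-ins for the serror types, outcome and analyze are hypothetical names, and the handling of a Stalling error with a non-empty result (passing the error through unchanged) is an assumption, since the snippet above does not show what follows the switch.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is an assumed definition of the abstracted reconciliation result.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// Stalling and Waiting are simplified stand-ins for the serror types.
type Stalling struct{ Reason string }

func (e *Stalling) Error() string { return e.Reason }

type Waiting struct{ Reason string }

func (e *Waiting) Error() string { return e.Reason }

// outcome describes what the analysis decides for one reconciliation.
type outcome struct {
	markStalled        bool  // add the Stalled condition
	clearStalled       bool  // remove the Stalled condition
	observedGeneration bool  // update status.observedGeneration
	err                error // error returned to the runtime
}

// analyze mirrors the switch above without touching an object.
func analyze(res Result, recErr error) outcome {
	switch recErr.(type) {
	case *Stalling:
		if res == ResultEmpty {
			return outcome{markStalled: true, observedGeneration: true}
		}
		// Assumption: pass the error through for a non-empty result.
		return outcome{err: recErr}
	case *Waiting:
		return outcome{clearStalled: true}
	case nil:
		return outcome{clearStalled: true, observedGeneration: res != ResultRequeue}
	default:
		return outcome{clearStalled: true, err: recErr}
	}
}

func main() {
	o := analyze(ResultEmpty, &Stalling{Reason: "InvalidConfiguration"})
	fmt.Println(o.markStalled, o.observedGeneration, o.err)

	o = analyze(ResultEmpty, errors.New("fetch failed"))
	fmt.Println(o.clearStalled, o.observedGeneration, o.err)
}
```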

With the above, we have computed all three results of reconciliation using BuildRuntimeResult and the analysis of the error and result. The next section discusses processing, summarizing and patching the object status.

Summarize and Patch

The controller-patterns document introduced a version of summarize and patch with some details left out for simplicity. In this section, we will discuss it in more detail, along with some advanced usage of SummarizeAndPatch() that configures how it functions. This version of SummarizeAndPatch() has a different API which provides more control over the process.

The required arguments for SummarizeAndPatch() are an event recorder, a patch helper and the target object. The event recorder could be any event recorder that adheres to the K8s event recorder interface. The patch helper is based on the Go package github.com/fluxcd/pkg/runtime/patch, which helps patch the final object.

The summarize and patch helper is created from the event recorder and the patch helper using a constructor:

func NewHelper(recorder kuberecorder.EventRecorder, patchHelper *patch.Helper) *Helper

The SummarizeAndPatch() method takes a context, a target object and some helper options and returns the runtime result and error:

func (h *Helper) SummarizeAndPatch(ctx context.Context, obj conditions.Setter, options ...Option) (ctrl.Result, error)

The default behavior of SummarizeAndPatch() with only the required arguments is only to patch the provided object.

Summarizing the Conditions

The summarization in SummarizeAndPatch() refers to the condition status summary of the conditions of an object. Usually, when using kstatus, the Ready condition is expected to be present. The Ready condition summary depends on the values of other conditions. In addition to Ready, other conditions or even any custom conditions can be summarized. For this, a new Conditions type can be defined in the context of summarization.

// Conditions contains all the conditions information needed to summarize the
// target condition.
type Conditions struct {
	// Target is the target condition, e.g.: Ready.
	Target string
	// Owned conditions are the conditions owned by the reconciler for this
	// target condition.
	Owned []string
	// Summarize conditions are the conditions that the target condition depends
	// on.
	Summarize []string
	// NegativePolarity conditions are the conditions in Summarize with negative
	// polarity.
	NegativePolarity []string
}

An example instance of this for Ready condition looks like:

var gitRepoReadyConditions = Conditions{
	Target: meta.ReadyCondition,
	Owned: []string{
		sourcev1.SourceVerifiedCondition,
		sourcev1.FetchFailedCondition,
		sourcev1.IncludeUnavailableCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.ReadyCondition,
		meta.ReconcilingCondition,
		meta.StalledCondition,
	},
	Summarize: []string{
		sourcev1.IncludeUnavailableCondition,
		sourcev1.SourceVerifiedCondition,
		sourcev1.FetchFailedCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.StalledCondition,
		meta.ReconcilingCondition,
	},
	NegativePolarity: []string{
		sourcev1.FetchFailedCondition,
		sourcev1.IncludeUnavailableCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.StalledCondition,
		meta.ReconcilingCondition,
	},
}

In this case, the Target condition to be summarized is the Ready condition. The Owned conditions are the conditions related to this target condition, which will be patched along with the target condition. Owned is used to configure the patch helper to resolve any conflict by making the patcher the owner of those conditions. The Summarize conditions are the conditions the target condition depends on. The NegativePolarity conditions are the conditions from the Summarize conditions that have a negative polarity.

Similarly, other Conditions can be configured for computing their summary in SummarizeAndPatch(), passed as an option. SummarizeAndPatch() iterates through all the conditions and adds all the summaries on the object, which is patched at the end.
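To illustrate how polarity influences a summary, here is a deliberately simplified sketch. It is not the algorithm used by conditions.SetSummary, which merges condition details with its own priority rules. The summarize function and the status map (condition type to whether its status is True, with absent conditions omitted) are assumptions made for illustration, and the condition names are illustrative strings.

```go
package main

import "fmt"

// Conditions mirrors the struct defined above.
type Conditions struct {
	Target           string
	Owned            []string
	Summarize        []string
	NegativePolarity []string
}

// summarize is a hypothetical, simplified summary: the target is true unless a
// present negative-polarity condition is True or a present positive-polarity
// condition is False.
func summarize(c Conditions, status map[string]bool) bool {
	negative := make(map[string]bool, len(c.NegativePolarity))
	for _, t := range c.NegativePolarity {
		negative[t] = true
	}
	for _, t := range c.Summarize {
		value, present := status[t]
		if !present {
			continue
		}
		// Unhealthy when a negative condition is True or a positive one is False.
		if value == negative[t] {
			return false
		}
	}
	return true
}

func main() {
	ready := Conditions{
		Target:           "Ready",
		Summarize:        []string{"SourceVerified", "FetchFailed", "Reconciling"},
		NegativePolarity: []string{"FetchFailed", "Reconciling"},
	}

	fmt.Println(summarize(ready, map[string]bool{"SourceVerified": true}))
	fmt.Println(summarize(ready, map[string]bool{"SourceVerified": true, "FetchFailed": true}))
}
```

With only SourceVerified set to True the summary is true. Adding a True FetchFailed, which has negative polarity, turns it false.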

Result Processing

In SummarizeAndPatch(), the final runtime result of reconciliation can be calculated by passing the Result, error and a RuntimeResultBuilder. The RuntimeResultBuilder is the same result builder discussed in the Runtime Result section above. The Result and error are passed to the result builder and ComputeReconcileResult to compute the final result as described above.

If no RuntimeResultBuilder is passed, SummarizeAndPatch() skips computing the result. The returned results should be ignored by the caller.

To perform any pre-processing on the results (target object, Result and error) before passing them to ComputeReconcileResult(), custom result processors can be injected. These act as middleware in SummarizeAndPatch(). The result processors are defined as:

// ResultProcessor processes the results of reconciliation (the object, result
// and error). Any errors during processing need not result in the
// reconciliation failure. The errors can be recorded as logs and events.
type ResultProcessor func(context.Context, kuberecorder.EventRecorder, client.Object, reconcile.Result, error)

These result processors are useful for logging and for emitting events based on the results. They can also be used to perform any final modifications to the object before it is used to compute the result.
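As an illustration of the processor idea, the sketch below emits a warning event when the reconciliation error is non-nil. It uses simplified stand-ins so that it compiles without Kubernetes dependencies: EventRecorder takes an object name instead of a runtime object, the ResultProcessor signature is adapted accordingly, and memoryRecorder, recordError, runProcessors and errFetch are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Result is an assumed definition of the abstracted reconciliation result.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// EventRecorder is a simplified stand-in for the K8s event recorder interface.
type EventRecorder interface {
	Event(objName, eventType, reason, message string)
}

// memoryRecorder keeps events in memory for inspection.
type memoryRecorder struct{ events []string }

func (m *memoryRecorder) Event(objName, eventType, reason, message string) {
	m.events = append(m.events, fmt.Sprintf("%s %s %s: %s", eventType, objName, reason, message))
}

// ResultProcessor is a simplified variant of the type defined above.
type ResultProcessor func(ctx context.Context, rec EventRecorder, objName string, res Result, err error)

// errFetch is an illustrative reconciliation error.
var errFetch = errors.New("fetch failed")

// recordError emits a warning event for a non-nil reconciliation error.
func recordError(_ context.Context, rec EventRecorder, objName string, _ Result, err error) {
	if err != nil {
		rec.Event(objName, "Warning", "ReconcileError", err.Error())
	}
}

// runProcessors calls each processor in order, as SummarizeAndPatch() does
// before computing the result.
func runProcessors(rec EventRecorder, objName string, res Result, err error, processors ...ResultProcessor) {
	for _, p := range processors {
		p(context.Background(), rec, objName, res, err)
	}
}

func main() {
	rec := &memoryRecorder{}
	runProcessors(rec, "gitrepository/podinfo", ResultEmpty, errFetch, recordError)
	runProcessors(rec, "gitrepository/podinfo", ResultSuccess, nil, recordError)
	fmt.Println(len(rec.events), rec.events)
}
```

Since processors return nothing, a failure inside a processor cannot fail the reconciliation. It can only be logged or recorded as an event.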

Patching

Patching is the final step of SummarizeAndPatch(). It uses the patch helper to patch the final form of the object. In cases where patching may be applied to an object being deleted, to ignore the resource not found error, an option IgnoreNotFound can be passed to SummarizeAndPatch().

The following is an implementation of the SummarizeAndPatch() method, incorporating the details described above:

func (h *Helper) SummarizeAndPatch(ctx context.Context, obj conditions.Setter, options ...Option) (ctrl.Result, error) {
	// Calculate the options.
	opts := &HelperOptions{}
	for _, o := range options {
		o(opts)
	}
	// Combine the owned conditions of all the conditions for the patcher.
	ownedConditions := []string{}
	for _, c := range opts.Conditions {
		ownedConditions = append(ownedConditions, c.Owned...)
	}
	// Patch the object, prioritizing the conditions owned by the controller in
	// case of any conflicts.
	patchOpts := []patch.Option{
		patch.WithOwnedConditions{
			Conditions: ownedConditions,
		},
	}

	// Process the results of reconciliation.
	for _, processor := range opts.Processors {
		processor(ctx, h.recorder, obj, opts.ReconcileResult, opts.ReconcileError)
	}

	var result ctrl.Result
	var recErr error
	if opts.ResultBuilder != nil {
		// Compute the reconcile results, obtain patch options and reconcile error.
		var pOpts []patch.Option
		pOpts, result, recErr = ComputeReconcileResult(obj, opts.ReconcileResult, opts.ReconcileError, opts.ResultBuilder)
		patchOpts = append(patchOpts, pOpts...)
	}

	// Summarize conditions. This must be performed only after computing the
	// reconcile result, since the object status is adjusted based on the
	// reconcile result and error.
	for _, c := range opts.Conditions {
		conditions.SetSummary(obj,
			c.Target,
			conditions.WithConditions(
				c.Summarize...,
			),
			conditions.WithNegativePolarityConditions(
				c.NegativePolarity...,
			),
		)
	}

	// Finally, patch the resource.
	if err := h.patchHelper.Patch(ctx, obj, patchOpts...); err != nil {
		// Ignore patch error "not found" when the object is being deleted.
		if opts.IgnoreNotFound && !obj.GetDeletionTimestamp().IsZero() {
			err = kerrors.FilterOut(err, func(e error) bool { return apierrors.IsNotFound(e) })
		}
		recErr = kerrors.NewAggregate([]error{recErr, err})
	}

	return result, recErr
}

An example usage of this in the GitRepository reconciler looks like:

summarizeHelper := summarize.NewHelper(r.EventRecorder, patchHelper)
summarizeOpts := []summarize.Option{
	summarize.WithConditions(gitRepoReadyConditions),
	summarize.WithReconcileResult(recResult),
	summarize.WithReconcileError(retErr),
	summarize.WithIgnoreNotFound(),
	summarize.WithProcessors(
		summarize.RecordContextualError,
		summarize.RecordReconcileReq,
	),
	summarize.WithResultBuilder(sreconcile.AlwaysRequeueResultBuilder{RequeueAfter: obj.GetInterval().Duration}),
}
result, retErr = summarizeHelper.SummarizeAndPatch(ctx, obj, summarizeOpts...)
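The option constructors used here are not defined in this document. A plausible shape, given that SummarizeAndPatch() applies each option with o(opts), is the functional options pattern. The sketch below is an assumption about how they could be written. It covers only a subset of the HelperOptions fields that appear in the implementation above, and apply is a hypothetical helper that repeats the option loop.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is an assumed definition of the abstracted reconciliation result.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// HelperOptions holds a subset of the fields read by SummarizeAndPatch().
type HelperOptions struct {
	ReconcileResult Result
	ReconcileError  error
	IgnoreNotFound  bool
}

// Option configures HelperOptions.
type Option func(*HelperOptions)

// WithReconcileResult sets the abstracted result of the reconciliation.
func WithReconcileResult(res Result) Option {
	return func(o *HelperOptions) { o.ReconcileResult = res }
}

// WithReconcileError sets the reconciliation error.
func WithReconcileError(err error) Option {
	return func(o *HelperOptions) { o.ReconcileError = err }
}

// WithIgnoreNotFound ignores "not found" patch errors for deleted objects.
func WithIgnoreNotFound() Option {
	return func(o *HelperOptions) { o.IgnoreNotFound = true }
}

// apply repeats the option loop from SummarizeAndPatch().
func apply(options ...Option) *HelperOptions {
	opts := &HelperOptions{}
	for _, o := range options {
		o(opts)
	}
	return opts
}

func main() {
	opts := apply(
		WithReconcileResult(ResultRequeue),
		WithReconcileError(errors.New("fetch failed")),
		WithIgnoreNotFound(),
	)
	fmt.Println(opts.ReconcileResult == ResultRequeue, opts.ReconcileError, opts.IgnoreNotFound)
}
```

With this pattern, a new setting can be added as another constructor without changing the SummarizeAndPatch() signature, and omitted options keep their zero values.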

This covers all the details involved in the computation of the result of a reconciler. The result computation and patching model described above can be applied to any reconciler, independent of its core business logic. The model provides options to define any domain-specific modifications that may be needed. It can be developed further to cover cases that are not addressed in this document.

darkowlzz/results-of-reconciliation.md