Duplication and repetition are pretty common in infrastructure configuration languages like GitHub Actions. Although you can never seem to get rid of all of it, there is usually something you can do to get rid of at least some of it.
If you use GitHub Actions in your GitHub repositories, like for your Continuous Integration/Continuous Deployment (CI/CD), you may find that you have a lot of repetition and duplication in your Actions workflows.
This is especially true if you have several repositories with the same workflows.
To reduce some of this duplication and the costs and hassles that come with it, you can use Reusable Workflows.
Because of the limitations listed below and demonstrated later, reusable workflows will not eliminate all of your duplication, but they definitely help and provide solid value in easier maintenance.
In this post I will show you the basics of Reusable Workflows in GitHub Actions, enough to get you going with your own reusable workflows: their limitations, how they work, how they are structured, how to call and use them, and how to get information in and out of them. Finally (and you are totes free to stop reading by this point), I give my perspective on testing, design, and development with them.
It is always good to start with what you can't do. Get it right out in the open right away. Besides, you will revisit most of this again in specifics later in this post.
The limitations of reusable workflows (at the time of writing)...
- Reusable workflows can not call reusable workflows, you can not nest reusable workflows
- Reusable workflows in private repositories can only be used by workflows in the same repository
- Caller workflow environment variables are not available to the called workflow
  - You can not set reusable workflow inputs from caller environment variables
  - Any environment variables set in an env context defined at the workflow level in the caller workflow are not propagated to the called workflow.
- Any job that calls a reusable workflow can not use the strategy property i.e. build matrix
GitHub Documentation on Limitations of Reusable Workflows
This covers some of the background and an overview of reusable workflows...
- What a workflow must have to be reusable
- The two sides of a reusable workflow, the caller and called
- What happens when a reusable workflow runs
What makes a workflow able to be reused?
- Reusable workflow files are basically the same as other GitHub Actions YAML workflow files and must be located in the relative directory .github/workflows/ (no subdirectories)
- You can call a reusable workflow in your workflow IF either...
  - The caller and called workflows are in the same repository
  - The called workflow is in a public repository
GitHub Documentation on Access to Reusable Workflows and GitHub Documentation on Calling a Reusable Workflow
There are two sides or perspectives in reusing workflows...
- The caller workflow is the workflow that is using the reusable workflow
- The called workflow is the reusable workflow being, well, called
Understanding these two perspectives is critical to knowing what will be available when running your reusable workflows.
When running a reusable workflow in a calling workflow...
- The entire called workflow (all jobs) is used
- The called workflow runs like it was part of the caller workflow e.g. a called actions/checkout checks out the contents of the caller repository hosting the caller workflow
- The called workflow GitHub Context is the caller's GitHub Context
- The called workflow is automatically granted access to github.token and secrets.GITHUB_TOKEN of the caller workflow
This post contains examples from my reusable CI/CD workflows which are called and used in my personal projects.
For examples of reusable workflows, reference the workflows under https://github.com/brianjbayer/actions-image-cicd/tree/main/.github/workflows
Note that the workflows that begin with on_ are calling workflows that call and exercise the reusable workflows in that project.
For examples of calling workflows that use my reusable CI/CD workflows, you can reference the workflows that begin with on_ in https://github.com/brianjbayer/actions-image-cicd/tree/main/.github/workflows or see the workflows at https://github.com/brianjbayer/sample-login-capybara-rspec/tree/main/.github/workflows.
Here is an example of a reusable (or called) workflow named...
.github/workflows/git_github_info.yml
This example happens to show that you can use other actions and have multiple steps in your job.
This example only has a single job, but reusable workflows can have multiple jobs which will all run.
name: Display Basic Git and GitHub (Actions) Info
on:
  workflow_call:
jobs:
  git-github-info:
    name: Git and GitHub Information
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Git log
        run: git log
      - name: Show GitHub context
        env:
          GITHUB_CONTEXT: ${{ toJson(github) }}
        run: echo "$GITHUB_CONTEXT"
This example shows that the structure of a reusable workflow file is mostly that of a basic GitHub Actions workflow file with a few nuances.
You can refresh yourself on the basics of GitHub Actions workflows with my post The Basics of GitHub Actions Workflows
One of the main differences of a reusable workflow is that it is not triggered by an event, but by being called. This is why the on (the trigger configuration) is different.
To trigger the reusable workflow by calling it, set the on: to workflow_call:.
on:
  workflow_call:
With reusable workflows, it is the calling workflow that is triggered by events such as a Pull Request (PR) which will then call your reusable workflow.
Although simple and self-contained with no inputs or outputs, this example reusable workflow does illustrate the two perspectives of a called reusable workflow.
Consider the single job in the reusable workflow...
steps:
  - uses: actions/checkout@v1
  - name: Git log
    run: git log
  - name: Show GitHub context
    env:
      GITHUB_CONTEXT: ${{ toJson(github) }}
    run: echo "$GITHUB_CONTEXT"
In the first step, it uses: actions/checkout@v1, showing that you can use other "Actions".
steps:
  - uses: actions/checkout@v1
But another important point is that when this action runs, it will
"run like it was part of the caller" so it will do a git checkout
of the caller's git repository. This then allows the next step
to output the git log for the caller's repository.
📢 Captain Obvious says that this demonstrates how when the reusable workflow runs, it has the caller's perspective.
A final important point that this example reusable workflow shows...
- name: Show GitHub context
  env:
    GITHUB_CONTEXT: ${{ toJson(github) }}
  run: echo "$GITHUB_CONTEXT"
When this step in the reusable workflow runs, it will display the GitHub Context, which will be the caller's Context. This means that your reusable workflows have access to the caller's GitHub Context (e.g. GITHUB_CONTEXT: ${{ toJson(github) }}). This also is how you can display the values and fields in your GitHub Context in printable form.
📢 Captain Obvious says that it is having the caller's perspective that makes your reusable workflows more reusable.
BUT one thing that is not available to the reusable workflow is any of the caller's environment variables as mentioned in Limitations.
Now here is an example of a workflow file (maybe in another repository) that calls the reusable workflow...
name: On PR Workflow Checks
on:
  pull_request:
    branches:
      - main
jobs:
  check-git_github_info:
    uses: brianjbayer/actions-image-cicd/.github/workflows/git_github_info.yml@main
This shows that the calling workflow has the GitHub event trigger (in this case, on PRs against the main branch).
It then defines its own job check-git_github_info that calls or uses the reusable workflow, specifying the reference.
jobs:
  check-git_github_info:
    uses: brianjbayer/actions-image-cicd/.github/workflows/git_github_info.yml@main
There are a couple of things that you should know about calling a reusable workflow in another (calling) workflow...
- You must specify an @{ref} (e.g. @main) when using a reusable workflow from another repository
- The only thing that the calling job can do is call (i.e. uses) a reusable workflow, it can not have steps and is basically a "proxy" job for the reusable workflow in the calling workflow
About that @{ref}: You must specify an @{ref} when using a reusable workflow from another repository. It can be a SHA, a release tag, or a branch name. The SHA is the safest, most stable, and most clearly defined.
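As a rough sketch of the three kinds of ref, the same reusable workflow could be called in any of the following ways. The tag v1 and the 40-character SHA are made-up placeholders for illustration, not real refs in that repository, and the job names are arbitrary.

```yaml
jobs:
  # Branch name: always picks up the latest commit on main
  info-from-branch:
    uses: brianjbayer/actions-image-cicd/.github/workflows/git_github_info.yml@main
  # Release tag (hypothetical tag name): stable unless the tag is moved
  info-from-tag:
    uses: brianjbayer/actions-image-cicd/.github/workflows/git_github_info.yml@v1
  # Full commit SHA (placeholder value): pinned to exactly one commit
  info-from-sha:
    uses: brianjbayer/actions-image-cicd/.github/workflows/git_github_info.yml@0123456789abcdef0123456789abcdef01234567
```

The trade-off is the usual one: a branch ref gets fixes automatically but can change under you, while a SHA never changes but has to be bumped by hand.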
You can refer to the reusable workflow file in either of the following two ways...
- For reusable workflows in a public repository...
  {owner}/{repo}/{path}/{filename}@{ref}
- For reusable workflows in the same repository...
  ./{path}/{filename}
  Note that here the reusable workflow is from the same commit as the caller workflow.
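For the same-repository form, a minimal sketch might look like the following, assuming the calling workflow lives in the same repository as git_github_info.yml. The job name is arbitrary.

```yaml
jobs:
  check-git_github_info:
    # No @{ref}: the reusable workflow comes from the same commit as this caller
    uses: ./.github/workflows/git_github_info.yml
```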
Calling a reusable workflow can be dependent on another job in that caller workflow just like in basic workflows with needs.
check-pull_push_image:
  needs: [check-build_push_image]
  uses: brianjbayer/actions-image-cicd/.github/workflows/pull_push_image.yml@main
In this example, the calling workflow job check-pull_push_image (which calls the reusable workflow .github/workflows/pull_push_image.yml) is dependent on (needs) the successful completion of calling workflow job check-build_push_image.
✨ For all of the capabilities available when calling reusable workflows, see the GitHub Documentation on Supported keywords for jobs that call a reusable workflow
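As one illustration of those keywords, a calling job can also carry an if condition and a permissions block next to needs. This is only a sketch: the draft check and the read-only permission are assumptions chosen for the example, not taken from my workflows, and any inputs or secrets the reusable workflow needs are left out.

```yaml
jobs:
  check-pull_push_image:
    needs: [check-build_push_image]
    # Assumed condition: skip the call for draft pull requests
    if: github.event.pull_request.draft == false
    # Assumed permission set: limit what the token handed to the called workflow can do
    permissions:
      contents: read
    uses: brianjbayer/actions-image-cicd/.github/workflows/pull_push_image.yml@main
```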
To be truly flexible and reusable, you will need a way of getting information into your reusable workflows: information like input values, and secrets to access the caller's secure systems like image repositories.
For this reason, reusable workflows have inputs and secrets.
GitHub Documentation on inputs and secrets in a reusable workflow
inputs are how you pass input values to your reusable workflows from the calling workflow.
There are three parts to passing inputs to a reusable workflow...
- In the reusable workflow, define the inputs
- In the reusable workflow, reference and use the inputs
- In the calling workflow, pass the inputs to the reusable workflow
In the reusable workflow file, the inputs are first defined and then used.
Here is an example from part of a reusable workflow .github/workflows/build_push_image.yml that uses inputs.
First, define the inputs in the reusable workflow using inputs.
name: Idempotent Build Push Image
on:
  workflow_call:
    inputs:
      image:
        required: true
        type: string
      buildopts:
        type: string
Here we have defined two inputs (variables), image and buildopts, both strings. The required: true specifies that the image input is not optional and must be supplied when calling the reusable workflow.
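Inputs are not limited to required strings. As a sketch, assuming hypothetical inputs named push and retries that are not part of my actual workflow, a definition could also use the boolean and number types along with description and default:

```yaml
on:
  workflow_call:
    inputs:
      image:
        description: Name of the image to build
        required: true
        type: string
      push:            # hypothetical input, for illustration only
        description: Whether to push the image after building it
        type: boolean
        default: true
      retries:         # hypothetical input, for illustration only
        description: How many times to retry the push
        type: number
        default: 3
```

Giving optional inputs a default keeps the calling workflows short, since callers only set the values they want to change.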
Second, reference and use the defined inputs in the reusable workflow using the format ${{ inputs.<inputs-name> }} e.g. ${{ inputs.image }}.
jobs:
  pull-or-build-and-push-image:
    name: Pull or Build and Push Image
    runs-on: ubuntu-latest
    env:
      IMAGE: ${{ inputs.image }}
    steps:
      - uses: actions/checkout@v1
      - name: Image name
        run: echo "Image name [${IMAGE}]"
Here we happen to define a job-level environment variable IMAGE with the value of the input ${{ inputs.image }} and then later output it.
✨ This does show that you can set environment variables from inputs in reusable workflows
The final part of inputs with reusable workflows is setting them when they are called in another workflow. To include inputs when you call a reusable workflow, use with.
Here is our reusable workflow with inputs .github/workflows/build_push_image.yml being called in another workflow...
name: On PR Workflow Checks
on:
  pull_request:
    branches:
      - main
env:
  BRANCH: ${{ github.head_ref }}
  COMMIT: ${{ github.event.pull_request.head.sha }}
jobs:
  check-build_push_image:
    uses: brianjbayer/actions-image-cicd/.github/workflows/build_push_image.yml@main
    with:
      image: ${{ github.repository }}_${{ github.head_ref }}_test:${{ github.event.pull_request.head.sha }}
Here the image input is set with values from the GitHub Context. Note that the value is explicitly set with the GitHub Context values and not with the calling workflow's environment variables.
Here is an example of not being able to use environment variables in the calling workflow with reusable workflows. Although the BRANCH and COMMIT variables are set, they can not be used to set the reusable workflow inputs (they appear as unset/empty in the reusable workflow).
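One possible workaround, sketched here and not taken from my workflows, is to compute the value once in a small setup job and hand it to the reusable workflow through the needs context, which can be used in with. The job name image-name, the step id name, and the output name image are all made up for this example.

```yaml
name: On PR Workflow Checks
on:
  pull_request:
    branches:
      - main
env:
  BRANCH: ${{ github.head_ref }}
  COMMIT: ${{ github.event.pull_request.head.sha }}
jobs:
  # Hypothetical helper job: turns the environment variables into a job output
  image-name:
    runs-on: ubuntu-latest
    outputs:
      image: ${{ steps.name.outputs.image }}
    steps:
      - id: name
        run: echo "image=${GITHUB_REPOSITORY}_${BRANCH}_test:${COMMIT}" >> "$GITHUB_OUTPUT"
  check-build_push_image:
    needs: [image-name]
    uses: brianjbayer/actions-image-cicd/.github/workflows/build_push_image.yml@main
    with:
      # The composed name is defined once and reused by every calling job
      image: ${{ needs.image-name.outputs.image }}
    # secrets omitted for brevity
```

The cost is one extra job (and its runner start-up time) per workflow run, so this probably only pays off when several calling jobs share the same composed value.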
Often your reusable workflows need secure access to the calling workflow's secrets (for example a reusable workflow that builds an image from the calling repository and pushes it to an image registry as part of CI).
Using secrets with a reusable workflow is basically the same as inputs with the same three parts...
- In the reusable workflow, define the secrets
- In the reusable workflow, reference and use the secrets
- In the calling workflow, pass the secrets to the reusable workflow
In the reusable workflow file, the secrets are first defined and then used.
Here is an example from part of the reusable workflow .github/workflows/build_push_image.yml that uses secrets.
name: Idempotent Build Push Image
on:
  workflow_call:
    inputs:
      image:
        required: true
        type: string
      buildopts:
        type: string
    secrets:
      registry_u:
        required: true
      registry_p:
        required: true
jobs:
  pull-or-build-and-push-image:
    name: Pull or Build and Push Image
    runs-on: ubuntu-latest
    env:
      IMAGE: ${{ inputs.image }}
    steps:
      - uses: actions/checkout@v1
      - name: Image name
        run: echo "Image name [${IMAGE}]"
      - name: Login to DockerHub Registry
        run: echo ${{ secrets.registry_p }} | docker login -u ${{ secrets.registry_u }} --password-stdin
Along with the inputs from before, this reusable workflow defines the required secrets registry_u and registry_p.
on:
  workflow_call:
    inputs:
      image:
        required: true
        type: string
      buildopts:
        type: string
    secrets:
      registry_u:
        required: true
      registry_p:
        required: true
It then uses them in the step to Login to the DockerHub image registry.
- name: Login to DockerHub Registry
  run: echo ${{ secrets.registry_p }} | docker login -u ${{ secrets.registry_u }} --password-stdin
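A variation worth considering, offered as a sketch and not as what my workflow does, is to map the secrets to step-level environment variables first and quote them in the shell command. This keeps the expression out of the command line itself, so special characters in a secret are not interpreted by the shell. The variable names REGISTRY_U and REGISTRY_P are made up for the example.

```yaml
steps:
  - name: Login to DockerHub Registry
    env:
      # Hypothetical variable names; the values still come from the workflow secrets
      REGISTRY_U: ${{ secrets.registry_u }}
      REGISTRY_P: ${{ secrets.registry_p }}
    run: echo "${REGISTRY_P}" | docker login -u "${REGISTRY_U}" --password-stdin
```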
The final step in using secrets with your reusable workflows is passing the calling workflow's secrets to the reusable workflow. Where inputs are set with with, secrets are set with secrets when calling a reusable workflow.
Here is the example workflow being called with its secrets being set by the calling workflow.
check-build_push_image:
  uses: brianjbayer/actions-image-cicd/.github/workflows/build_push_image.yml@main
  with:
    image: ${{ github.repository }}_${{ github.head_ref }}_test:${{ github.event.pull_request.head.sha }}
  secrets:
    registry_u: ${{ secrets.DOCKER_HUB_USERNAME }}
    registry_p: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
Here the reusable workflow secrets registry_u and registry_p are being set to the calling workflow's secrets ${{ secrets.DOCKER_HUB_USERNAME }} and ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }} respectively.
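If the calling and reusable workflows are in the same organization or enterprise, GitHub also offers the inherit keyword, which passes all of the caller's secrets implicitly. The following is a sketch with an important caveat: the reusable workflow would then have to reference the secrets by the caller's names (for example secrets.DOCKER_HUB_USERNAME) instead of registry_u and registry_p, so my build_push_image.yml as shown would need to change to work this way.

```yaml
jobs:
  check-build_push_image:
    uses: brianjbayer/actions-image-cicd/.github/workflows/build_push_image.yml@main
    with:
      image: ${{ github.repository }}_${{ github.head_ref }}_test:${{ github.event.pull_request.head.sha }}
    # Passes every secret available to the caller, under the caller's names
    secrets: inherit
```

Explicit mapping is more typing, but it lets the reusable workflow keep its own neutral secret names, which is why the examples in this post use it.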
When you need information from a reusable workflow, probably to use in other workflows, you use outputs.
Although overall similar to inputs and secrets, outputs are a bit different (and a little more work).
At a high level, outputs still have the same three parts...
- In the reusable workflow, define the outputs
- In the reusable workflow, set the values of the outputs
- In the calling workflow, get the values of the outputs
But the key difference with outputs is that the output is set at the step level in a job but passed at the reusable workflow level.
Since the reusable workflow output is set at the step level in a job but passed at the workflow level, there are a couple of places in the reusable workflow where you define and set outputs (some intermediate).
You will first define the output at the workflow level and set it from the output of a job in the reusable workflow.
Then you will define the job-level output and set it from an identified step output.
Finally you will set the step-level output in the identified job.
Consider this example of reusable workflow file
.github/workflows/get_merged_branch_last_commit.yml
name: Merged Branch and Last Commit SHA of Branch
on:
  workflow_call:
    outputs:
      branch:
        value: ${{ jobs.merged-branch-commit.outputs.branch }}
      commit:
        value: ${{ jobs.merged-branch-commit.outputs.commit }}
jobs:
  merged-branch-commit:
    name: Generate merged branch and branch last commit
    runs-on: ubuntu-latest
    outputs:
      branch: ${{ steps.getbranch.outputs.branch }}
      commit: ${{ steps.getcommit.outputs.commit }}
    steps:
      - uses: actions/checkout@v1
      - id: getbranch
        run: echo "branch=$(git log -1 --pretty=%B | grep 'Merge pull request' | sed 's/^[^/]*\///')" >> $GITHUB_OUTPUT
      - id: getcommit
        run: echo "commit=$(git log -n 1 --skip 1 --pretty=format:"%H")" >> $GITHUB_OUTPUT
Here there are two workflow outputs defined, branch and commit, and their values are set from the outputs branch and commit of a job named merged-branch-commit (i.e. value: ${{ jobs.merged-branch-commit.outputs.branch }}).
name: Merged Branch and Last Commit SHA of Branch
on:
  workflow_call:
    outputs:
      branch:
        value: ${{ jobs.merged-branch-commit.outputs.branch }}
      commit:
        value: ${{ jobs.merged-branch-commit.outputs.commit }}
Then later in the workflow, the two job outputs branch and commit for the merged-branch-commit job are defined and set from the output of steps in the job with the ids of getbranch and getcommit.
jobs:
  merged-branch-commit:
    name: Generate merged branch and branch last commit
    runs-on: ubuntu-latest
    outputs:
      branch: ${{ steps.getbranch.outputs.branch }}
      commit: ${{ steps.getcommit.outputs.commit }}
Finally in the steps of the jobs, the output values are actually set with the syntax run: echo "{name}={value}" >> $GITHUB_OUTPUT.
- id: getbranch
  run: echo "branch=$(git log -1 --pretty=%B | grep 'Merge pull request' | sed 's/^[^/]*\///')" >> $GITHUB_OUTPUT
- id: getcommit
  run: echo "commit=$(git log -n 1 --skip 1 --pretty=format:"%H")" >> $GITHUB_OUTPUT
Here we are executing a pipeline of Unix commands to set the values of the outputs.
Originally GitHub Actions used the ::set-output workflow command as the output method; however, this was deprecated in favor of appending to the file named by the GITHUB_OUTPUT environment variable.
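For comparison, here is a sketch of the same kind of step written both ways. The step ids and the value example are placeholders, not part of my workflows.

```yaml
steps:
  # Deprecated: the set-output workflow command printed to standard output
  - id: old-style
    run: echo "::set-output name=branch::example"
  # Current: append name=value to the file that GITHUB_OUTPUT points to
  - id: new-style
    run: echo "branch=example" >> "$GITHUB_OUTPUT"
```

Either way, a later step or the job-level outputs would read the value as steps.<id>.outputs.branch.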
In the calling workflow, there are at least two jobs associated with the outputs of the reusable workflow.
The first job calls the reusable workflow and serves as the holder or "proxy" for the outputs.
The second or more jobs actually use the outputs from the first job.
Here is an example from a workflow that calls our reusable workflow with outputs.
branch-and-last-commit:
  uses: brianjbayer/actions-image-cicd/.github/workflows/get_merged_branch_last_commit.yml@main
branch-and-last-commit-merged-info:
  needs: [branch-and-last-commit]
  runs-on: ubuntu-latest
  env:
    BRANCH_LAST_COMMIT: ${{ needs.branch-and-last-commit.outputs.commit }}
    BRANCH: ${{ needs.branch-and-last-commit.outputs.branch }}
Here is the first job that calls the reusable workflow with outputs...
branch-and-last-commit:
  uses: brianjbayer/actions-image-cicd/.github/workflows/get_merged_branch_last_commit.yml@main
Then the second job is dependent on (needs) the first job and uses its outputs...
branch-and-last-commit-merged-info:
  needs: [branch-and-last-commit]
  runs-on: ubuntu-latest
  env:
    BRANCH_LAST_COMMIT: ${{ needs.branch-and-last-commit.outputs.commit }}
    BRANCH: ${{ needs.branch-and-last-commit.outputs.branch }}
Here the outputs are being used to set the value of the second job's environment variables.
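The second job in the excerpt stops at its env block, so on its own it has no steps to run. A minimal sketch of a complete pair of jobs follows, where the step name and the echo command are assumptions added purely for illustration:

```yaml
jobs:
  branch-and-last-commit:
    uses: brianjbayer/actions-image-cicd/.github/workflows/get_merged_branch_last_commit.yml@main
  branch-and-last-commit-merged-info:
    needs: [branch-and-last-commit]
    runs-on: ubuntu-latest
    env:
      BRANCH_LAST_COMMIT: ${{ needs.branch-and-last-commit.outputs.commit }}
      BRANCH: ${{ needs.branch-and-last-commit.outputs.branch }}
    steps:
      # Illustrative step: print the values received from the reusable workflow
      - name: Show merged branch and last commit
        run: echo "Branch [${BRANCH}] last commit [${BRANCH_LAST_COMMIT}]"
```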
This concludes the basics of GitHub Actions reusable workflows and you should now be able to start developing and using your own reusable workflows or at least better understand them.
In the remainder of this post, I will present my initial thoughts on testing and developing based on my experience with reusable workflows.
As I mention in my post on The Basics of GitHub Actions Workflows, there is no native testing facility for GitHub Actions and this is also the case for reusable workflows. This makes testing your workflows rather painful.
Limiting your testing even more is that you can not really call scripts in your reusable workflows as these scripts would need to be in the repository of every calling workflow that uses the reusable workflow.
Thus the only two practical ways to test your reusable workflows as well as the calling workflows are...
- Putting the reusable workflow under test in an on-Pull-Request calling workflow letting you run it when pushing a new commit on a Pull Request (PR). However, this approach will not help if there is any data or state that is only present during merges or other events.
- Creating a separate (and disposable) GitHub repository as a testbed for your calling and reusable workflows. This is especially useful when developing and testing merge workflows.
Despite their limitations, there is definitely value in reusable workflows. But those limitations really do cap the amount of duplication and repetition ("copy pasta" and "cargo culting"), and the cost that comes with it, that you can eliminate from your GitHub Actions workflows, especially if you have basically the same workflows in multiple GitHub repositories.
These are the specific limitations that I encountered.
Reusable workflows can not call reusable workflows
This limitation really kills the levels of duplication that you can remove and your design options.
Consider that I have several GitHub Repositories that have basically the same CI/CD and the same workflows. These workflows have the same triggers, environment variables, jobs, job dependencies, and steps.
There is repetition across repositories in the overall CI/CD logic spread over multiple workflows. Within a workflow, there is repetition in the jobs and in their similar but slightly different steps.
But since reusable workflows can not call reusable workflows, you can only tackle one of these layers of duplication.
You can design your reusable workflows at the high-level repository business-logic level (e.g. CI/CD) but then still have duplication in your jobs, or you can design your reusable workflows at the lower utility job level and have duplication at the calling repository level (especially across repositories).
I generally choose to implement at the lower utility level with the hope of more reuse and greater cohesion.
Caller workflow environment variables are not available to the called workflow
This also hits you in the amount of code duplication and cost you can reduce. This one really hurts if you have a lot of the same inputs and values in your calling workflow, especially if they are composites of other values such as strings or values from GitHub Contexts. For me, it is my branch-based, commit-tagged image names.
There is also the balancing of cohesion and making the smallest unit of functionality and scalability versus the operational effectiveness and efficiency of doing multiple things in multiple steps.
Since each job runs on its own worker, that is the level of parallelism (and separation like environment variables). But this level of parallelism is limited to the number of Runners available (and this is where scalability comes into play by adding more runners).
Every time a job starts, there is a cost in the time that it takes to start up the machine and job. There is also a cost for that runner. So from a financial perspective, it makes sense to be efficient in "unit costs" and run as many jobs and as many steps as possible on a single runner.
This then is the struggle. I personally lean towards the side of cohesion and small utility workflows, but I always try to do what makes sense for the situation. In the outputs example, .github/workflows/get_merged_branch_last_commit.yml gets both the branch name and the commit in one job instead of 2 separate jobs. In my use case for this reusable workflow and information, I always needed both of these things so it did make the decision a little easier.
This concludes this post on reusable workflows. I hope that it may help you or at least that you found something useful in it.
This is great, thank you for writing it up. I wanted to point out that it looks like nested reusable workflows are supported now:
https://github.blog/changelog/2022-08-22-github-actions-improvements-to-reusable-workflows-2/
https://docs.github.com/en/actions/using-workflows/reusing-workflows#nesting-reusable-workflows