@chgeuer
Last active October 12, 2022 02:25

incoming comments

  • Eleanore
    • Summary and decision matrix?
  • Julie NG
    • Code reuse, i.e. modules (knowledge sharing within the org), is missing.
    • Another use case: some "best practices", above all globally unique names, I find 1000x easier with TF or Pulumi. That is why you need crazy Bash scripts with ARM/Bicep.
    • Discuss the "mix" scenario. I have damaged an AKS cluster multiple times when Terraform aggressively preferred to delete and re-create infrastructure.
    • Clean Up is simpler with Terraform statefiles. Check out https://julie.io/writing/arm-terraform-pulumi-infra-as-code/
  • Laura Nicolas
    • I really like the doc I think you covered everything, but I wanted to share a few things that might be a bit too much detail but that I have seen from working with customers and partners recently
    • The topic of multicloud I think it is a bit misleading, I have had a couple of engagements where the customer thought that they could reuse the terraform template in Azure coming from AWS or that there was a “translation tool”. I like the fact here that you’ve mentioned different providers, but I thought it might be worth sharing that it is multicloud in terms that you can use the same language but not the same code…
    • In the first table you have "Cut and paste of VM?" and honestly, I was not sure what you meant here. Maybe the multicloud thing I was referring to?
    • I think the way terraform handles states adds additional challenges that you could mention in the source of truth section:
      • Need to have an external storage to host the state, so teams can be collaborative and work in a DevOps like environment or using CI/CD tools.
      • When working with something like DeployIfNotExist policies, Terraform has the challenge of having to refresh state as additional things may be configured by the policy
    • An important topic is also the availability of the modules, often when we release a new product/feature it is not necessarily available in Terraform right away and they may need to use a “workaround” whereas in ARM/Bicep the automation solutions tend to be available sooner and keep up with Azure innovation. An example could be Container Apps.

Terraform versus ARM/Bicep

Azure fully embraces multiple options to provision infrastructure, i.e. Infrastructure as Code (IaC).

Infra as Code can be declarative or imperative

  • A declarative approach uses a domain-specific language to describe the desired state of the infrastructure. Examples of that would be ARM/Bicep templates and Terraform templates. All these are collectively listed on the templates page @AzureTerraform
  • Imperative approaches facilitate a step-by-step build-up of the infrastructure. Examples include bash scripts (where you use the Azure CLI or PowerShell cmdlets) to incrementally build things. Tools like Farmer, or Pulumi, or Microsoft's Azure SDKs, allow you to imperatively create infrastructure from your preferred programming language.
  • Sometimes the borders between declarative or imperative are blurry:
    • You can imperatively deploy a declarative template, such as having one step in your setup where a full template (containing many resources) is deployed.
    • You can have a declarative template which runs imperative code, for example in an Azure deploymentScript resource.
| Property | ARM / Bicep | HashiCorp Terraform |
| --- | --- | --- |
| Number of API calls | Single call | Large number of calls |
| Orchestration by whom? | Azure runs the template and tracks dependencies in the cloud, so little work for my laptop. | CI/CD or user laptop runs the jobs |
| What is the truth? | ARM looks at concrete resources in Azure to determine reality | TF looks at state files on what it believes reality is like |
| Interaction with AAD | Creating AAD or Microsoft Graph objects like service principals in ARM is painful, must spin up deploymentScripts | TF has its own AAD provider (azuread), so this works well |
| Multi-cloud | Created specifically for Azure | Supports many resource providers, such as Amazon Web Services, OpenStack, DNSimple, Kubernetes, ... |
| DSL | Writing ARM JSON is laborious. Bicep is heavily inspired by TF | HCL can be used across all the supported providers |
| Cut and paste of VM? | n.a. | Not possible. An EC2 VM in AWS and a VM in Azure are fundamentally different things |
| Azure Marketplace | Managed Applications are ARM templates | Terraform not supported |

ARM JSON templates, and Bicep templates, can be treated as roughly equivalent.

  • Bicep compiles down into ARM JSON (bicep build)
  • Bicep can be created from ARM JSON (bicep decompile).
    • When creating a Bicep file from an existing ARM template, the resulting Bicep file often contains a lot of cruft and usually needs to be cleaned up manually.

Orchestration by whom

  • ARM: The client usually POSTs an ARM JSON template to Azure, in a single API call. The Azure Resource Manager system then parses the template, creates the dependency graph, computes the difference between desired state from the template and the already-existing resources in Azure, and executes all necessary underlying calls in the cloud. So the user just throws their template into the cloud, and the backend orchestrates the long-running provisioning process.
  • Terraform: The whole orchestration is done by terraform.exe, running on the user's laptop, in a CI/CD environment, or on a build server. So the user has to ensure the machine orchestrating the deployment stays up and running for the duration of the deployment. An exception would be to use HashiCorp's commercial "Terraform Cloud" offering, which moves the orchestration to the server side, similar to what ARM does for Azure.
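
To make the idea of an orchestrator walking a dependency graph more concrete, here is a minimal Python sketch. It is not how ARM or Terraform are actually implemented; the resource names and the DEPENDS_ON mapping are hypothetical, and the sketch only shows how a dependency graph yields batches of resources that can be provisioned in order, whichever side (cloud or client) does the work.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical landscape: each resource maps to the resources it depends on.
DEPENDS_ON = {
    "resource_group": set(),
    "virtual_network": {"resource_group"},
    "storage_account": {"resource_group"},
    "subnet": {"virtual_network"},
    "virtual_machine": {"subnet", "storage_account"},
}


def deployment_batches(depends_on):
    """Return batches of resources; each batch depends only on earlier batches."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch as provisioned
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(deployment_batches(DEPENDS_ON), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

Resources within one batch have no dependencies on each other, so an orchestrator could provision them in parallel.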

Number of API calls

  • Why is that a topic? Azure has quotas and limits (at subscription and tenant level) on how many ARM API requests can go through before throttling kicks in (see Throttling Resource Manager requests for details). When a Terraform template has to deploy a huge landscape, these limits can be reached. The ARM API communicates the remaining quota back to the caller using x-ms-ratelimit-remaining-* response headers.
  • ARM/Bicep only require upload of the template (and associated parameters) for the deployment, which is a single API call to the ARM API. All the complex orchestration, talking to the different resource providers, happens inside the cloud.
  • Terraform decomposes the template client-side into a dependency graph, and treats the cloud as 'dumb'. Therefore, Terraform submits a large number of API calls to the cloud: resource creation requests, as well as repeated queries to track the execution status of previously submitted requests.
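
A client that issues many ARM calls can watch the rate-limit headers mentioned above. The following Python sketch assumes the response headers are available as a plain dictionary; the helper names, the threshold of 100, and the concrete header suffix used in the example are illustrative assumptions, not part of any SDK.

```python
RATE_LIMIT_PREFIX = "x-ms-ratelimit-remaining-"


def remaining_quota(headers):
    """Extract the remaining-request counters from ARM response headers."""
    quota = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered.startswith(RATE_LIMIT_PREFIX):
            try:
                quota[lowered[len(RATE_LIMIT_PREFIX):]] = int(value)
            except ValueError:
                continue  # ignore counters that are not plain integers
    return quota


def should_back_off(headers, threshold=100):
    """True if any remaining counter is below the (hypothetical) threshold."""
    return any(left < threshold for left in remaining_quota(headers).values())


if __name__ == "__main__":
    # Illustrative values only; real responses carry the actual counters.
    example = {
        "x-ms-ratelimit-remaining-subscription-reads": "42",
        "Content-Type": "application/json",
    }
    print(remaining_quota(example), should_back_off(example))
```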

Source of truth

  • Declarative systems like ARM and Terraform must avoid disruption and unnecessary work. Resources that already exist should not be 'touched' or re-created if their current state complies with the template's desired state. For example, a storage account containing data should not be deleted and re-created upon subsequent re-execution of a template deployment; users wouldn't like such a clean slate. The deployments should be 'idempotent', i.e., re-deploying the same template over and over should not modify the landscape if all resources are already as desired.
  • ARM effectively inspects the existing resources in Azure, to compare the system's current state with the template's desired state. For example, when ARM processes a template that demands a certain storage account, if that storage account exists, nothing happens. The storage account resource in the template is a 'no-op'. As a result, one could create some resources manually (via the portal), and a subsequent script deployment that also describes these resources would just skim over them.
  • Terraform uses its own state store (local by default) to keep track of system state. When a TF template creates a resource, its existence is remembered in the state store, and Terraform's view of the world is bounded by that state: resources created by other processes or users are invisible to Terraform until they are imported. For resources it does track, current Terraform versions refresh the state from the live system as part of plan and apply, unless refresh is disabled (-refresh=false). So if Terraform created a storage account and a user then (accidentally) deletes that account, a re-execution of the template would normally detect the missing account and propose to re-create it; when the refresh is skipped, Terraform just expects the resource to still be there (because the state files say so). The terraform refresh command (superseded in newer versions by terraform apply -refresh-only) updates the state files when reality and state have diverged.
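
The difference between the two sources of truth can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not the actual ARM or Terraform logic: the plan function and the resource dictionaries are hypothetical, and the only point is that the same desired state produces different actions depending on whether it is compared against the live resources or against a recorded state that has not been refreshed.

```python
def plan(desired, believed):
    """Compare desired resources with what the tool believes exists."""
    actions = {}
    for name, config in desired.items():
        if name not in believed:
            actions[name] = "create"
        elif believed[name] != config:
            actions[name] = "update"
        else:
            actions[name] = "no-op"  # idempotent: nothing to do
    return actions


# Hypothetical template content: one storage account.
desired = {"storage1": {"sku": "Standard_LRS"}}

# Someone deleted the storage account out-of-band, so it is gone in reality ...
live = {}
# ... but a state file that was not refreshed still records it.
state_file = {"storage1": {"sku": "Standard_LRS"}}

if __name__ == "__main__":
    print("against live resources:", plan(desired, live))
    print("against stale state:   ", plan(desired, state_file))
```

Comparing against the live resources yields a "create" action, while comparing against the stale record yields a "no-op", which is the divergence a state refresh is meant to repair.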

Interaction with Azure AD

  • For the sake of IaC tools, Azure and Azure Active Directory (AAD) are completely separate beasts. For example, as an Office 365 customer, you might have an Azure Active Directory tenant, but you might not have an Azure subscription. The opposite isn't possible: each Azure subscription must be hooked up to one (and exactly one) Azure AD tenant.
  • ARM templates target Azure, not Azure AD. An ARM template lists ARM resources, such as virtual machines, storage accounts, or cognitive services. Azure AD resources, or Microsoft Graph resources, such as security groups or service principals, cannot natively (or at least easily) be created using an ARM template.
    • One could create a deploymentScripts resource, which runs a PowerShell or bash script, which performs the necessary API calls against Azure AD or Microsoft Graph API. However, that requires quite a bit of authoring effort and becomes verbose. In addition, given that a deploymentScript runs in the cloud without a local user, one has to carefully consider which credential the deploymentScript should use to call into AAD or Graph.
    • An exception to this limitation are managed identities, such as user-assigned managed identities and system-assigned managed identities. Under the hood, a managed identity is represented by an Azure AD service principal. But this service principal is created through an ARM resource provider, and can be managed using ARM templates. For managed identities, it is not necessary to talk to AAD or Graph directly.
  • Terraform has a dedicated azuread provider that supports management of a rich set of AAD objects, such as applications, roles, groups, invitations, service principals, users and other resources. So creating complex AAD and Graph objects is much easier on the Terraform side.

Multi-cloud and non-ARM resource types

  • ARM is specifically focused on Azure. Via ARM, you can interact with resources in Microsoft's public cloud, in Azure China, or in Azure Stack deployments.
  • Terraform supports a broad set (2000+) of providers for various environments and services, such as Azure, Amazon Web Services, Google Cloud, Kubernetes, Helm, Consul, Azure Active Directory, auth0, DNSimple, and many more. If you have a strong requirement to deploy resources across multiple environments, in an orchestrated fashion, from a single template, then Terraform is the way to go. For example, creating an Azure landscape, setting up Akamai CDN, and configuring DNS records in DNSimple, would be such a cross-provider deployment.

Syntax and know-how re-use after learning the domain-specific language

  • ARM JSON
    • JSON is a verbose way to express infrastructure. Often, the author copies and pastes content from other templates. Getting the JSON nesting level right can be a time-consuming and error-prone exercise.
    • In the resource graph, resources often have dependencies on other resources. The resource descriptions in the JSON document require dependsOn annotations, so that the ARM provider creates everything in the correct order.
    • ARM JSON is the underlying resource representation of resources in the ARM model. Whether one fetches a resource from the REST API, or explores the system via the portal's resource explorer, JSON is the lingua franca on ARM.
  • ARM Bicep
    • Bicep simplifies the template authoring experience significantly.
    • Bicep is a custom language which looks very similar to "HashiCorp Configuration Language (HCL)", so if you have experience writing Terraform templates, Bicep should feel natural.
  • Terraform
    • Terraform uses "HashiCorp Configuration Language (HCL)" for all providers, so devops engineers and template authors only have to learn a single syntax and use it across all cloud environments and service providers. So instead of learning ARM/Bicep on Azure, and CloudFormation for AWS, a single syntax and templating philosophy can be used across all environments.
    • However, it is important to understand that the actual template fragments cannot be used as is: An S3 bucket in AWS is different from a storage account in Azure. An EC2 instance in AWS has different properties than an Azure virtual machine. A Terraform template written for AWS cannot be re-used for a deployment in Azure.

Azure Marketplace requires ARM (maybe generated using Bicep)

  • When publishing an "Azure Managed Application" into Azure Marketplace, the ISV uploads a ZIP file containing all required artifacts into the Azure Marketplace Partner Center.
  • ARM: This ZIP file MUST contain an ARM JSON template.
  • Bicep: You might create that JSON from a set of Bicep templates.
  • Terraform: You cannot put a Terraform template in here, as Azure Marketplace expects to deploy an ARM JSON file.
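
As a rough illustration of the packaging step, the following Python sketch builds such a ZIP archive in memory. The file names mainTemplate.json and createUiDefinition.json are assumptions that are not stated in the text above, and the template contents are placeholders rather than a complete ARM template; in a Bicep-based workflow, the JSON would come from bicep build.

```python
import io
import json
import zipfile


def build_package(arm_template, ui_definition):
    """Return the bytes of a ZIP archive holding the two JSON artifacts."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # File names are assumptions for illustration, see the note above.
        archive.writestr("mainTemplate.json", json.dumps(arm_template, indent=2))
        archive.writestr("createUiDefinition.json", json.dumps(ui_definition, indent=2))
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder content, not a complete or valid ARM template.
    template = {"contentVersion": "1.0.0.0", "resources": []}
    package = build_package(template, {})
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        print(archive.namelist())
```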