This is a POC for immutability in a legal hold situation. The assumption is that an Azure Storage Account will be used as a target for images, documents etc. that need to be provably unchanged for a required legal period.
Immutability policies can be scoped to a blob version or to a container. How an object behaves under an immutability policy depends on the scope of the policy.
- Time-based retention policy scope
- Legal hold scope
You can configure both at the same time. A locked time-based retention policy can be extended up to five times.
Legal hold allows new blob uploads (not updates / overwrites) whilst time-based stops everything bar read.
As we need legal hold, the storage account must be either
- General-purpose v2
- Premium block blob
Hierarchical namespace is supported, but only with container-level scope.
To configure an immutability policy that is scoped to a blob version, you must enable support for version-level immutability on either the storage account or a container.
Once version-level immutability is enabled you can configure a default policy at the account or container level, but only for time-based immutability. Legal hold must be applied on individual blobs.
Version-level immutability looks to be the most flexible option, as it works on a specific blob version. However, policies can be enabled at the container level so that new blobs (or blob versions) will automatically be placed on legal hold.
On this page we'll create two storage accounts:
- container time-based immutability lock
- version level legal hold
And see what that means for REST API actions, reporting, etc.
-
Create a directory
mkdir ~/immutable
-
Move to it
cd ~/immutable
-
Input variables
Customise if required.
resource_group="immutable"
location="uksouth"
-
Set local defaults
rgId="/subscriptions/$(az account show --query id -otsv)/resourceGroups/$resource_group"
hash=$(md5sum <<< $rgId | cut -c1-12)
az config set --local \
  defaults.group=$resource_group \
  defaults.location=$location \
  storage.legalhold=legalhold$hash \
  storage.timebased=timebased$hash
The last command creates a .azure/config file in the current working directory. Note that storage.account and storage.auth_mode etc. are valid defaults, whereas storage.legalhold and storage.timebased are not. However, the file will store those values and we can recall them easily. (Storage account names need to be globally unique as they form part of the public endpoint's FQDN.)
-
Suppress warnings (optional)
This guide uses some newer features so there are warnings. If desired, these can be suppressed, but not in the local config.
az config set core.only_show_errors=true
-
Set auth mode
You can also set defaults for the auth-mode, so that it purely uses RBAC role assignments. This is a good recommendation to avoid any issues with leaked access keys.
az config set storage.auth_mode=login
-
Resource group and defaults
az group create --name $(az config get defaults.group --local --query value -otsv)
-
Base storage account
az storage account create \
  --name $(az config get storage.timebased --local --query value -otsv) \
  --allow-blob-public-access=false \
  --allow-cross-tenant-replication=true \
  --allow-shared-key-access=false \
  --https-only=true \
  --kind=StorageV2 \
  --min-tls-version=TLS1_2 \
  --public-network-access=Enabled \
  --default-action=Allow \
  --sku=Standard_LRS
Ensure TLS 1.2 and https, plus RBAC permissions only. Versioning is not supported on Premium_LRS.
-
Configure network rules
az storage account network-rule add \
  --account-name $(az config get storage.timebased --local --query value -otsv) \
  --action=Allow \
  --ip-address=$(curl -sSL https://myexternalip.com/raw)
-
Tighten access
az storage account update \
  --name $(az config get storage.timebased --local --query value -otsv) \
  --public-network-access=Enabled \
  --default-action=Deny \
  --bypass AzureServices
⚠️ This could be tightened up further.
-
Configure data protection
az storage account blob-service-properties update \
  --account-name $(az config get storage.timebased --local --query value -otsv) \
  --enable-versioning true
az storage account blob-service-properties update \
  --account-name $(az config get storage.timebased --local --query value -otsv) \
  --enable-delete-retention true \
  --delete-retention-days 7 # 1-365
az storage account blob-service-properties update \
  --account-name $(az config get storage.timebased --local --query value -otsv) \
  --enable-container-delete-retention true \
  --container-delete-retention-days 7 # 1-365
az storage account blob-service-properties update \
  --account-name $(az config get storage.timebased --local --query value -otsv) \
  --enable-change-feed true \
  --change-feed-days 7 # 1-146000
This is auto-enabled when you add a backup policy.
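As a rough illustration of the naming used in the "Set local defaults" step, the sketch below derives a name from a prefix plus the first 12 characters of an MD5 hash, then checks it against the storage account naming rule (3 to 24 characters, lowercase letters and digits only). The helper names derive_storage_name and is_valid_storage_name, and the all-zero subscription ID, are hypothetical and only for illustration.

```shell
# Hypothetical helper: build a storage account name from a prefix plus the
# first 12 hex characters of the MD5 hash of a resource group ID.
derive_storage_name() {
  prefix="$1"
  rg_id="$2"
  hash=$(printf '%s\n' "$rg_id" | md5sum | cut -c1-12)
  printf '%s%s\n' "$prefix" "$hash"
}

# Storage account names must be 3-24 characters, lowercase letters and digits only.
is_valid_storage_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Placeholder subscription ID, for illustration only.
example_rg_id="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/immutable"
example_name=$(derive_storage_name "legalhold" "$example_rg_id")

if is_valid_storage_name "$example_name"; then
  echo "$example_name is a valid storage account name"
else
  echo "$example_name is not a valid storage account name" >&2
fi
```

With a nine character prefix such as legalhold or timebased, the result is 21 characters, which leaves a little headroom under the 24 character limit.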
For convenience, the above commands are repeated here in a single code block.
storage_account=$(az config get storage.timebased --local --query value -otsv)
az storage account create \
--name $storage_account \
--allow-blob-public-access=false \
--allow-cross-tenant-replication=true \
--allow-shared-key-access=false \
--https-only=true \
--kind=StorageV2 \
--min-tls-version=TLS1_2 \
--public-network-access=Enabled \
--default-action=Allow \
--sku=Standard_LRS
az storage account network-rule add \
--account-name $storage_account \
--action=Allow \
--ip-address=$(curl -sSL https://myexternalip.com/raw)
az storage account update \
--name $storage_account \
--public-network-access=Enabled \
--default-action=Deny \
--bypass AzureServices
az storage account blob-service-properties update \
--account-name $storage_account \
--enable-versioning true \
--enable-delete-retention false \
--enable-container-delete-retention false \
--enable-change-feed true \
--change-feed-days 7 \
--enable-restore-policy false
unset storage_account
Note that --enable-delete-retention (blob soft delete, which point in time restore depends on) is disabled, as point in time restore cannot be used with time based immutability.
Version-level immutability is defined at the storage account level with --enable-alw. Note the --immutability-period-in-days, which sets the default immutability period for uploaded blobs. You can set these at the container level instead, but I'm assuming that these storage accounts are being created specifically for legal hold use.
Blob uploads can specify a different retention period. It is also possible to add a legal hold onto specific blob versions. I assume that the default account level time period will be used to establish base immutability across all uploaded blobs, and then the legal hold will be added to items of interest. Any uploads of a blob with the same name will create a new version.
Also note the --immutability-state. This is set to unlocked to allow the default to be modified. Set it to locked to harden; at that point the period can only be extended, up to five times.
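To make the default period concrete, here is a minimal sketch that works out when retention would end for a blob version. It assumes that the retention interval is counted from the blob version's creation time, and that GNU-style date (with -d and @epoch support) is available. The function name retention_expiry is hypothetical.

```shell
# Hypothetical helper: print the UTC time at which retention would end for a
# blob created at $1 ("YYYY-MM-DD hh:mm:ss", UTC) with a period of $2 days.
# Assumption: the retention interval starts at the blob version's creation time.
retention_expiry() {
  created_epoch=$(date -u -d "$1" '+%s')
  expiry_epoch=$((created_epoch + $2 * 86400))
  date -u -d "@$expiry_epoch" '+%Y-%m-%dT%H:%MZ'
}

# A blob created at midnight on 1 January 2024 with the two day default period.
retention_expiry "2024-01-01 00:00:00" 2
```

A blob uploaded with its own, longer retention period would simply use a larger second argument.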
storage_account=$(az config get storage.legalhold --local --query value -otsv)
az storage account create \
--name $storage_account \
--allow-blob-public-access=false \
--allow-cross-tenant-replication=true \
--allow-shared-key-access=false \
--https-only=true \
--kind=StorageV2 \
--min-tls-version=TLS1_2 \
--public-network-access=Enabled \
--default-action=Allow \
--sku=Standard_LRS \
--enable-alw \
--immutability-period-in-days 2 \
--immutability-state unlocked \
--allow-protected-append-writes true
az storage account network-rule add \
--account-name $storage_account \
--action=Allow \
--ip-address=$(curl -sSL https://myexternalip.com/raw)
az storage account update \
--name $storage_account \
--public-network-access=Enabled \
--default-action=Deny \
--bypass AzureServices
az storage account blob-service-properties update \
--account-name $storage_account \
--enable-versioning true \
--enable-delete-retention false \
--enable-container-delete-retention false \
--enable-change-feed true \
--change-feed-days 7 \
--enable-restore-policy false
Note that point in time restores cannot be configured at the same time as Azure Blobs backup. We'll use that service as the backups can also be immutable.
Note that Azure Blob backup uses a Backup vault rather than a Recovery Services vault. Backup vaults cover some newer backup scenarios such as Azure PostgreSQL and Azure Blob.
Using az backup vault create creates the older Recovery Services vaults. Use az dataprotection backup-vault create instead.
-
Install the dataprotection extension
az extension add --name dataprotection
-
Create an Azure Backup Vault
az dataprotection backup-vault create \
  --vault-name "immutable" \
  --storage-setting "[{type:'LocallyRedundant',datastore-type:'VaultStore'}]" \
  --azure-monitor-alerts-for-job-failures="Enabled" \
  --immutability-state="Unlocked" \
  --soft-delete-state="On" \
  --retention-duration-in-days=14 \
  --type="SystemAssigned"
The immutability ensures that recovery points cannot be deleted from the Backup Vault before their expiry date. Valid values are Disabled, Unlocked and Locked.
⚠️ Setting to Locked is irreversible, so consider how long recovery points will be retained.
-
Grant the Backup vault permissions on the storage account
Grab the backup vault's managed identity object id.
managed_identity=$(az dataprotection backup-vault show --vault-name "immutable" --query identity.principalId -otsv)
Grab the time based storage account's resource id.
storage_account=$(az config get storage.timebased --local --query value -otsv)
storage_account_id=$(az storage account show --name $storage_account --query id -otsv)
Add the Storage Account Backup Contributor RBAC role assignment.
az role assignment create \
  --role "Storage Account Backup Contributor" \
  --assignee $managed_identity \
  --scope $storage_account_id
-
Repeat for the legal hold storage account
managed_identity=$(az dataprotection backup-vault show --vault-name "immutable" --query identity.principalId -otsv)
storage_account_id=$(az storage account show --name $(az config get storage.legalhold --local --query value -otsv) --query id -otsv)
az role assignment create --role "Storage Account Backup Contributor" --assignee $managed_identity --scope $storage_account_id
-
Create a backup policy
Create a backup_policy.json file from the template for Azure Blob.
az dataprotection backup-policy get-default-policy-template --datasource-type AzureBlob > backup_policy.json
Customise name and retention etc. if you wish. Example:
{
  "datasourceTypes": [
    "Microsoft.Storage/storageAccounts/blobServices"
  ],
  "name": "MyDefaultPolicy",
  "objectType": "BackupPolicy",
  "policyRules": [
    {
      "isDefault": true,
      "lifecycles": [
        {
          "deleteAfter": {
            "duration": "P14D",
            "objectType": "AbsoluteDeleteOption"
          },
          "sourceDataStore": {
            "dataStoreType": "OperationalStore",
            "objectType": "DataStoreInfoBase"
          }
        }
      ],
      "name": "Default",
      "objectType": "AzureRetentionRule"
    }
  ]
}
Create the backup policy.
az dataprotection backup-policy create --backup-policy-name DefaultPolicy \
  --policy backup_policy.json --vault-name immutable
⚠️ Don't use --backup-policy-name Default. (Or default.) This triggers an odd error.
-
Configure backup for the time based storage account
Grab the backup policy ID.
backup_policy_id=$(az dataprotection backup-policy show --backup-policy-name DefaultPolicy \
  --vault-name immutable --query id -otsv)
Grab the timebased storage account's resource id.
storage_account=$(az config get storage.timebased --local --query value -otsv)
storage_account_id=$(az storage account show --name $storage_account --query id -otsv)
Create the timebased_backup_instance.json file.
az dataprotection backup-instance initialize --datasource-type AzureBlob \
  --policy-id $backup_policy_id --datasource-id $storage_account_id | tee timebased_backup_instance.json
Create the backup instance.
az dataprotection backup-instance create --vault-name immutable --backup-instance timebased_backup_instance.json
-
Repeat for the legal hold storage account
backup_policy_id=$(az dataprotection backup-policy show --backup-policy-name DefaultPolicy --vault-name immutable --query id -otsv)
storage_account_id=$(az storage account show --name $(az config get storage.legalhold --local --query value -otsv) --query id -otsv)
az dataprotection backup-instance initialize --datasource-type AzureBlob --policy-id $backup_policy_id --datasource-id $storage_account_id > legalhold_backup_instance.json
az dataprotection backup-instance create --vault-name immutable --backup-instance legalhold_backup_instance.json
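For the "Customise name and retention" part of the backup policy step, one possible way to change the retention is to rewrite the ISO 8601 duration (for example P14D) in the JSON before creating the policy. The sketch below does this with sed. The function name set_policy_retention_days and the sample file are hypothetical, and it assumes the duration appears as a whole number of days in the form "duration": "P<n>D".

```shell
# Hypothetical helper: print policy file $1 with every day-based
# "duration" value replaced by P<$2>D. The input file is not modified.
set_policy_retention_days() {
  case "$2" in
    ''|*[!0-9]*) echo "retention must be a whole number of days" >&2; return 1 ;;
  esac
  sed -E "s/(\"duration\"[[:space:]]*:[[:space:]]*\")P[0-9]+D(\")/\1P${2}D\2/" "$1"
}

# Demonstration on a throwaway fragment rather than the real template.
sample=$(mktemp)
printf '%s\n' '{ "deleteAfter": { "duration": "P30D", "objectType": "AbsoluteDeleteOption" } }' > "$sample"
set_policy_retention_days "$sample" 14
rm -f "$sample"
```

Against the real file, the output could be redirected to a new file and passed to --policy, for example set_policy_retention_days backup_policy.json 14 > backup_policy_14d.json.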
Uses a JSON string rather than a templated file.
https://learn.microsoft.com/azure/storage/blobs/blob-inventory
Enable blob inventory. Supports CSV and Apache Parquet format, which is more efficient for ingestion into Azure Databricks. The daily or weekly inventory is accompanied by an Event Grid trigger.
Use rules and filters at the blob level as Content-MD5 is not supported at the container level.
You can create multiple containers.
It is also possible to use hierarchical namespace, but this then adds some limitations.
You can set time based at the storage account level. Here we will do it at the container level.
The retention period can be between 1 and 146000 days. Note that you can permit additional writes or appends to block blobs with --allow-protected-append-writes or --allow-protected-append-writes-all.
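A small guard like the following could catch an out-of-range period before it reaches the CLI. It is only a sketch of the 1 to 146000 day range stated above, and the function name is_valid_retention_days is hypothetical.

```shell
# Hypothetical helper: succeed only if $1 is a whole number of days
# between 1 and 146000 inclusive.
is_valid_retention_days() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 146000 ]
}

for days in 0 1 146000 146001; do
  if is_valid_retention_days "$days"; then
    echo "$days: ok"
  else
    echo "$days: out of range"
  fi
done
```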
-
Get the storage account name
The storage CLI commands do not seem to fully respect local config settings or environment variables.
storage_account=$(az config get --local storage.timebased --query value -otsv)
-
Create a container
az storage container create --name "time-based" \
  --account-name $storage_account --auth-mode login --public-access off
-
Add a time based retention policy
az storage container immutability-policy create \
  --account-name $storage_account --container-name "time-based" \
  --period 1
Once locked you can only extend up to five times.
-
Example modification (optional)
You can test and modify until locked.
Get the ETag.
etag=$(az storage container immutability-policy show \
  --account-name $storage_account --container-name time-based \
  --query etag -otsv)
Then modify.
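az storage container immutability-policy extend \
  --account-name $storage_account --container-name time-based \
  --period 2 --if-match $etag
⚠️ Retest tomorrow. Getting "operation not allowed on immutability policy with current state" error, yet showing unlocked. Same day? A likely cause: the extend command is documented as applying to locked policies, so an unlocked policy would instead be changed by re-running create with the new --period and --if-match $etag.
-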
Lock the time based policy
Once locked you can only extend up to five times.
Get the ETag.
etag=$(az storage container immutability-policy show \
  --account-name $storage_account --container-name time-based \
  --query etag -otsv)
Then lock.
az storage container immutability-policy lock \
  --account-name $storage_account --container-name time-based \
  --if-match $etag
-
Get the storage account name
storage_account=$(az config get --local storage.legalhold --query value -otsv)
-
Create a container
Note that this uses the container-rm subcommand.
az storage container-rm create \
  --name legal-hold \
  --storage-account $storage_account \
  --public-access off --enable-vlw
-
Check (optional)
az storage container-rm show \
  --storage-account $storage_account \
  --name legal-hold \
  --query '[immutableStorageWithVersioning.enabled]' \
  --output tsv
-
What about one with a different default period?
az storage container immutability-policy create \
--account-name <storage-account> \
--container-name <container> \
--period <retention-interval-in-days> \
--allow-protected-append-writes true
-
Add a legal hold to the container - DON'T DO THIS!!!
Legal hold can be applied at the container level or on individual blob versions. As this account is a target for storing legal hold material, the original suggestion was container level.
az storage container legal-hold set \
  --account-name $storage_account \
  --container-name legal-hold \
  --tags tag1 tag2 \
  --allow-protected-append-writes-all true
Note that the tags are required.
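Since the tags are mandatory, it may help to check them before calling the CLI. The sketch below assumes the tag format described in the legal hold overview (3 to 23 alphanumeric characters per tag). Treat that limit as an assumption to confirm against the current documentation. The function name is_valid_legal_hold_tag is hypothetical.

```shell
# Hypothetical helper: succeed only if $1 looks like a valid legal hold tag.
# Assumption: tags are 3-23 alphanumeric characters.
is_valid_legal_hold_tag() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]{3,23}$'
}

for tag in tag1 tag2 ab case-123; do
  if is_valid_legal_hold_tag "$tag"; then
    echo "$tag: ok"
  else
    echo "$tag: rejected"
  fi
done
```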
Working with blobs needs Storage Blob Data Owner, Storage Blob Data Contributor, or Storage Blob Data Reader. (Or a custom role.)
-
Get the storage account name
storage_account=$(az config get --local storage.legalhold --query value -otsv) # or storage.timebased; storage.account is not set in the local config
-
Add yourself as a Blob Contributor.
az role assignment create \
  --role "Storage Blob Data Contributor" \
  --scope $(az storage account show --name $storage_account --query id -otsv) \
  --assignee $(az ad signed-in-user show --query id -otsv)
-
Calculate the md5sum
-
Single file example
az storage blob upload \
  --file "./Partner Admin Link - Partner Ready FAQ_April28_2022.docx" \
  --account-name $storage_account --container-name time-based --auth-mode login
-
Batch file example
az storage blob upload-batch \
  --source ./my_folder --pattern "*.txt" \
  --destination legal-hold --destination-path my_folder \
  --account-name $storage_account --auth-mode login
All *.vhd files are uploaded as page blobs, everything else as block blobs. (You can control this with --type.) You can also use --if-modified-since or --if-unmodified-since with a UTC datetime. (E.g. YYYY-MM-DDThh:mmZ, or date -u -d "7 days ago" '+%Y-%m-%dT%H:%MZ'.)
All files get uploaded with an automatic md5sum.
⚠️ Note that this needs checking for larger files.
-
List blobs with checksum
az storage blob list \
  --container-name legal-hold --account-name $storage_account \
  --query "[].{name:name, md5:properties.contentSettings.contentMd5}" \
  --auth-mode login
Note that the contentMd5 value is a base64 encoded representation of the binary MD5 hash value.
-
Example download
az storage blob download --name "my_folder/my_blob_file.ext" \
  --container-name legal-hold --account-name $storage_account \
  --file "my_local_file.ext" --auth-mode login
-
Bash checksum example
contentMd5=$(md5sum --binary "my_local_file.ext" | awk '{print $1}' | xxd -p -r | base64)
-
PowerShell checksum example
$FilePath = ".\my_local_file.ext"
$rawMD5 = (Get-FileHash -Path $FilePath -Algorithm MD5).Hash
$hashBytes = [system.convert]::FromHexString($rawMD5)
$contentMd5 = [system.convert]::ToBase64String($hashBytes)
-
Filter container based on md5sum base64 value and count array length
az storage blob list --container-name legal-hold --account-name $storage_account \
  --query "[?properties.contentSettings.contentMd5 == '$contentMd5'] | length(@)" \
  --auth-mode login
Should return one if checksum matches, zero if not. If more than one then multiple matches.
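The comparison can also be made in the other direction, by decoding the base64 contentMd5 value back to hex and comparing it with plain md5sum output. This is a minimal sketch of that alternative using only base64, od and md5sum, so it does not need xxd. The function names content_md5_to_hex and verify_blob_md5 are hypothetical, and the demonstration uses an empty temporary file rather than a real download.

```shell
# Hypothetical helper: convert a base64 contentMd5 value to lowercase hex.
content_md5_to_hex() {
  printf '%s' "$1" | base64 -d | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: succeed if local file $1 matches contentMd5 value $2.
verify_blob_md5() {
  expected=$(content_md5_to_hex "$2")
  actual=$(md5sum "$1" | cut -c1-32)
  [ "$expected" = "$actual" ]
}

# Demonstration: 1B2M2Y8AsgTpgAmY7PhCfg== is the base64 MD5 of empty content.
demo_file=$(mktemp)
if verify_blob_md5 "$demo_file" "1B2M2Y8AsgTpgAmY7PhCfg=="; then
  echo "checksum matches"
else
  echo "checksum mismatch" >&2
fi
rm -f "$demo_file"
```

In practice the second argument would be the md5 value returned by the blob list query, and the first the file written by the download step.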
https://learn.microsoft.com/azure/storage/blobs/immutable-legal-hold-overview#audit-logging
Each container with a legal hold in effect provides a policy audit log. The log contains the user ID, command type, time stamps, and legal hold tags. The audit log is retained for the lifetime of the policy, in accordance with the SEC 17a-4(f) regulatory guidelines.
The Azure Activity log provides a more comprehensive log of all management service activities. Azure resource logs retain information about data operations. It's the user's responsibility to store those logs persistently, as might be required for regulatory or other purposes.
TODO: Add this to a specific container or storage account.
az storage container legal-hold clear \
--tags tag1 tag2 \
--container-name <container> \
--account-name <storage-account> \
--resource-group <resource-group> \
--auth-mode login
storageAccount="<storage-account>"
containerName="<container-name>"
az storage blob list \
--container-name $containerName \
--prefix "ab" \
--query "[[].name, [].versionId]" \
--account-name $storageAccount \
--include v \
--auth-mode login \
--output tsv
- https://learn.microsoft.com/azure/storage/blobs/immutable-storage-overview
- https://learn.microsoft.com/azure/storage/blobs/immutable-legal-hold-overview
- https://learn.microsoft.com/azure/storage/blobs/immutable-policy-configure-version-scope
- https://learn.microsoft.com/azure/storage/blobs/immutable-policy-configure-container-scope
- https://learn.microsoft.com/azure/storage/blobs/versioning-overview
- https://learn.microsoft.com/azure/storage/blobs/versioning-enable
- https://learn.microsoft.com/azure/storage/blobs/blob-inventory
- https://learn.microsoft.com/azure/storage/blobs/blob-inventory-how-to
- https://learn.microsoft.com/azure/storage/blobs/object-replication-configure
- https://learn.microsoft.com/azure/storage/blobs/storage-feature-support-in-storage-accounts
- https://learn.microsoft.com/azure/storage/blobs/soft-delete-container-enable
- https://learn.microsoft.com/azure/storage/blobs/soft-delete-blob-enable
- https://learn.microsoft.com/azure/backup/blob-backup-overview
- https://learn.microsoft.com/azure/backup/blob-backup-configure-manage
- https://learn.microsoft.com/azure/backup/backup-blobs-storage-account-cli
- https://learn.microsoft.com/azure/backup/backup-azure-immutable-vault-concept (Preview)
- https://learn.microsoft.com/azure/backup/backup-azure-immutable-vault-how-to-manage (Preview)
- https://learn.microsoft.com/cli/azure/azure-cli-configuration
- https://learn.microsoft.com/en-us/azure/storage/blobs/immutable-policy-configure-container-scope#configure-or-clear-a-legal-hold
- https://learn.microsoft.com/en-us/rest/api/storageservices/Put-Block ( https://technet2.github.io/Wiki/blogs/windowsazurestorage/windows-azure-blob-md5-overview.html)
-
Get the storage account name
storage_account=$(az config get --local storage.legalhold --query value -otsv)
-
Create a container
az storage container create --name legal-hold \
  --account-name $storage_account --auth-mode login --public-access off