Created March 19, 2026 04:50
# ClickHouse: `DROP TABLE SYNC` and Async Object Storage Deletion (v26.2+)

Applies to the `s3`, `azure_blob_storage`, `hdfs`, and `local_blob_storage` disk types.

## What changed in ~v26.2

A `BlobKillerThread` background thread was introduced. It is now responsible for **all actual blob deletion** from object storage. Metadata files are updated synchronously, but the blobs themselves are always removed asynchronously.
## `DROP TABLE SYNC` does NOT mean synchronous S3/Azure deletion

The deletion flow is always:

1. `DROP TABLE [SYNC]` → metadata marked as deleted (synchronous)
2. Blobs added to an **in-memory** removal queue
3. `BlobKillerThread` drains the queue in the background (default interval: 1 s)

The `SYNC` keyword only affects **when** the metadata operation happens:

- On an `Atomic` database: without `SYNC`, removal is delayed; with `SYNC`, the query waits for the background drop task to finish.
- On an `Ordinary` database: the `sync` parameter is **silently ignored** (`bool /*sync*/` in `DatabaseOnDisk::dropTable`), so behavior is identical with or without `SYNC`.

Neither engine makes blob deletion synchronous.
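The flow above can be sketched as a toy producer/consumer model. This is plain Python, not ClickHouse code; the class and method names are illustrative only, and a `dict` stands in for the object store:

```python
import queue
import threading
import time

class BlobRemovalQueue:
    """Toy model of async blob deletion: the caller's metadata update is
    synchronous, but blobs are merely enqueued and deleted later by a
    background thread from an in-memory queue."""

    def __init__(self, storage: dict, interval_sec: float = 0.01):
        self.storage = storage            # simulated object store: key -> bytes
        self.pending = queue.Queue()      # in-memory only: lost on crash
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._drain_loop, args=(interval_sec,), daemon=True)
        self._thread.start()

    def drop_table(self, blob_keys):
        # DROP TABLE [SYNC]: metadata removal (not modeled) is synchronous,
        # but blobs are only enqueued -- the query returns before deletion
        for key in blob_keys:
            self.pending.put(key)

    def _drain_loop(self, interval_sec):
        while not self._stop.is_set():
            self._drain_once()
            time.sleep(interval_sec)

    def _drain_once(self):
        while True:
            try:
                key = self.pending.get_nowait()
            except queue.Empty:
                return
            self.storage.pop(key, None)   # the actual blob deletion

    def shutdown(self):
        # graceful stop drains the remaining queue; a crash would skip
        # this step and leave orphan blobs behind in `storage`
        self._stop.set()
        self._thread.join()
        self._drain_once()

store = {"a": b"...", "b": b"..."}
q = BlobRemovalQueue(store)
q.drop_table(["a", "b"])   # returns immediately; blobs may still exist here
q.shutdown()               # graceful stop: queue drained, blobs gone
assert store == {}
```

Note how the crash scenario falls out of the model: anything still sitting in `pending` when the process dies is never deleted from `storage`.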
## `SETTINGS` on `DROP TABLE` is not supported

The parser only accepts a fixed set of keywords (`IF EXISTS`, `ON CLUSTER`, `SYNC`, `PERMANENTLY`), so there is no way to override the async deletion behavior via SQL.
## How to wait for actual blob deletion

```sql
DROP TABLE my_table SYNC;
SYSTEM WAIT BLOBS CLEANUP; -- or: SYSTEM WAIT BLOBS CLEANUP 'disk_name'
```
## Behavior on server restart

The removal queue is **in-memory only**; it is not persisted to disk or ZooKeeper.

| Scenario | Result |
|---|---|
| Graceful stop (`SIGTERM`) | `BlobKillerThread::shutdown()` drains the full queue before exit, so blobs are deleted |
| Crash / `kill -9` / OOM | Queue is lost: **orphan blobs remain in the bucket forever** |

There is no built-in mechanism to detect or clean up orphaned blobs after a crash. Use S3/Azure lifecycle policies or external tooling to periodically reconcile.
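A reconciliation tool boils down to diffing the bucket listing against the keys ClickHouse still references (for example, from a system table such as `system.remote_data_paths`, where available). A minimal sketch of the diff step, with a grace period so freshly written blobs from in-flight inserts are never flagged; the function name and sample keys are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def find_orphan_blobs(bucket_objects, referenced_keys,
                      grace=timedelta(days=1)):
    """Return bucket keys not referenced by any metadata file.

    bucket_objects:  {key: last_modified} from a bucket listing
    referenced_keys: keys still referenced by ClickHouse metadata
    grace:           blobs newer than this are kept, to avoid racing
                     with inserts that happened after the listing
    """
    now = datetime.now(timezone.utc)
    referenced = set(referenced_keys)
    return sorted(
        key for key, mtime in bucket_objects.items()
        if key not in referenced and now - mtime > grace
    )

# Fabricated sample data to show the shape of the inputs:
old = datetime.now(timezone.utc) - timedelta(days=7)
new = datetime.now(timezone.utc)
bucket = {"data/abc": old, "data/def": old, "data/new": new}
referenced = ["data/def"]
print(find_orphan_blobs(bucket, referenced))  # ['data/abc']
```

Only `data/abc` is reported: `data/def` is still referenced, and `data/new` is inside the grace window. The actual deletion of the reported keys is left to your storage client of choice.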
## Same behavior for S3 and Azure Blob Storage

All object storage disk types share the same stack: `DiskObjectStorage` + `BlobKillerThread` + `MetadataStorageFromDisk`. The behavior described above is therefore identical for `s3` and `azure_blob_storage`.
## `BlobKillerThread` tuning (per-disk config)

| Parameter | Default | Description |
|---|---|---|
| `interval_sec` | `1` | Wake-up interval, in seconds |
| `metadata_request_size` | `1000` | Blobs fetched per iteration |
| `threads_count` | `16` | Parallel deletion threads |
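Per-disk settings live under `storage_configuration` in the server config. A sketch of where the parameters from the table above might go; the disk name and endpoint are placeholders, and the exact placement of these keys inside the disk block is an assumption based on the "per-disk config" note:

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <s3_main> <!-- placeholder disk name -->
                <type>s3</type>
                <endpoint>https://bucket.s3.amazonaws.com/data/</endpoint>
                <!-- BlobKillerThread tuning; placement assumed per-disk -->
                <interval_sec>1</interval_sec>
                <metadata_request_size>1000</metadata_request_size>
                <threads_count>16</threads_count>
            </s3_main>
        </disks>
    </storage_configuration>
</clickhouse>
```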