SLTS operations do not immediately delete chunks that are no longer needed.
Instead, chunks to be deleted are processed as follows:
First, the chunk is marked for deletion in metadata, and that change is committed as part of the SLTS operation (e.g. during truncate, concat or write).
The name of the chunk is then put into the GC queue.
Each container has a dedicated background thread that polls this GC queue and deletes the chunks and their associated metadata.
The actual deletion task runs on a storage thread.
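The flow above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class `TwoPhaseDeleteSketch`, its in-memory maps standing in for the metadata store and LTS, and the method names are hypothetical and are not the SLTS API.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class TwoPhaseDeleteSketch {
    // Hypothetical stand-ins: chunk name -> "active" flag, and the chunks present on LTS.
    final Map<String, Boolean> metadata = new HashMap<>();
    final Set<String> storage = new HashSet<>();
    final Queue<String> gcQueue = new ArrayDeque<>();

    void addChunk(String name) {
        storage.add(name);
        metadata.put(name, true);
    }

    // Step 1: done as part of the foreground operation (truncate, concat, write).
    void markForDeletion(String name) {
        metadata.put(name, false); // committed together with the operation's metadata change
        gcQueue.add(name);         // only the chunk name is queued
    }

    // Step 2: done later by the background collector. Returns the number of chunks deleted.
    int collectOnce() {
        int deleted = 0;
        String name;
        while ((name = gcQueue.poll()) != null) {
            // Delete only if the metadata still says the chunk is marked for deletion.
            if (Boolean.FALSE.equals(metadata.get(name))) {
                storage.remove(name);
                metadata.remove(name);
                deleted++;
            }
        }
        return deleted;
    }
}
```

In this sketch the chunk stays on storage after `markForDeletion` and disappears only when `collectOnce` runs, which mirrors the deferred deletion described above.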
Garbage Collection Background Thread
In-memory GC queue
Each container has a dedicated background garbage collector instance.
Each garbage collector instance has a delay queue that holds the names of garbage chunks.
When a chunk becomes eligible (its deletion delay has elapsed), it is deleted and its metadata is removed from the metadata store.
Each garbage collector instance also has an overflow list that is periodically flushed to the persisted GC queue.
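One way to picture the delay queue is with `java.util.concurrent.DelayQueue`, which hands out an element only after its delay has expired. The sketch below is illustrative and rests on assumptions: `GarbageChunkEntry` and `InMemoryGcQueueSketch` are hypothetical names, and the source does not say which queue implementation the collector actually uses.

```java
import java.time.Duration;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical queue entry: a chunk name plus the time at which it becomes eligible.
class GarbageChunkEntry implements Delayed {
    final String chunkName;
    private final long eligibleAtNanos;

    GarbageChunkEntry(String chunkName, Duration delay) {
        this.chunkName = chunkName;
        this.eligibleAtNanos = System.nanoTime() + delay.toNanos();
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // Zero or negative means the entry is eligible and DelayQueue may return it.
        return unit.convert(eligibleAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class InMemoryGcQueueSketch {
    private final DelayQueue<GarbageChunkEntry> queue = new DelayQueue<>();

    void add(String chunkName, Duration delay) {
        queue.add(new GarbageChunkEntry(chunkName, delay));
    }

    // Returns the next eligible chunk name, or null if nothing is eligible yet.
    String pollEligible() {
        GarbageChunkEntry entry = queue.poll();
        return entry == null ? null : entry.chunkName;
    }

    int size() {
        return queue.size();
    }
}
```

A chunk added with a one-hour delay stays in the queue when `pollEligible` is called immediately, while one added with `Duration.ZERO` is returned right away.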
Persisted GC queue
The persisted GC queue is formed by a linked list of metadata records (just like normal segments).
New entries are appended by serializing the list from the overflow buffer to a new chunk on LTS.
Note that metadata about the persisted GC queue chunks is stored in a table segment, just like metadata for normal chunks.
When the in-memory queue is empty, the garbage collector repopulates it from the persisted GC queue.
As the persisted GC queue is processed, the already processed chunks (containing GC queue data) become eligible for deletion and are added to the tail of the GC queue just like any normal chunk.
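The flush and refill cycle might look roughly like the following sketch. It is built on loud assumptions: the class `PersistedGcQueueSketch`, the `gc-queue-N` naming scheme, and the newline-separated serialization format are all invented for illustration, and a plain `Deque` stands in for the linked list of metadata records.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PersistedGcQueueSketch {
    // Hypothetical stand-in for LTS: chunk name -> chunk contents.
    final Map<String, byte[]> lts = new HashMap<>();
    // Order of persisted GC chunks; the design links them through metadata records instead.
    final Deque<String> persistedChunks = new ArrayDeque<>();
    private int nextId = 0;

    // Serializes the overflow list into a new chunk and appends it to the persisted queue.
    void flush(List<String> overflow) {
        if (overflow.isEmpty()) {
            return;
        }
        String chunkName = "gc-queue-" + nextId++; // illustrative naming scheme
        // Assumed format: one chunk name per line (names must not contain newlines).
        lts.put(chunkName, String.join("\n", overflow).getBytes(StandardCharsets.UTF_8));
        persistedChunks.addLast(chunkName);
        overflow.clear();
    }

    // Refills an empty in-memory queue from the oldest persisted chunk.
    // Returns false if the in-memory queue is not empty or nothing is persisted.
    boolean refill(Deque<String> inMemory) {
        if (!inMemory.isEmpty() || persistedChunks.isEmpty()) {
            return false;
        }
        String chunkName = persistedChunks.removeFirst();
        String data = new String(lts.get(chunkName), StandardCharsets.UTF_8);
        inMemory.addAll(Arrays.asList(data.split("\n")));
        // The consumed queue chunk is now garbage itself, so it goes to the tail.
        inMemory.addLast(chunkName);
        return true;
    }
}
```

Note how the chunk that carried the queue data ends up at the tail of the in-memory queue, so it is cleaned up by the same path as any other garbage chunk.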
Throttling
The garbage collector uses no more than a fixed percentage of storage threads at any time.
When the in-memory queue reaches its maximum size, no more items are added to it.
New items are instead added to an overflow buffer, which is periodically drained into a persisted chunk.
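A rough sketch of both throttling mechanisms follows. It is an assumption-laden illustration: `ThrottledGcSketch` is a hypothetical class, and using a `Semaphore` with a fixed permit count is just one simple way to cap concurrent deletes, not necessarily how the collector limits its share of storage threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Semaphore;

class ThrottledGcSketch {
    private final int maxQueueSize;
    private final Semaphore deletePermits; // caps the number of concurrent deletes
    final Queue<String> inMemory = new ArrayDeque<>();
    final List<String> overflow = new ArrayList<>();

    ThrottledGcSketch(int maxQueueSize, int maxConcurrency) {
        this.maxQueueSize = maxQueueSize;
        this.deletePermits = new Semaphore(maxConcurrency);
    }

    // Adds to the in-memory queue unless it is full; otherwise the name goes to overflow.
    synchronized void add(String chunkName) {
        if (inMemory.size() < maxQueueSize) {
            inMemory.add(chunkName);
        } else {
            overflow.add(chunkName);
        }
    }

    // Returns the overflow contents (to be written to a persisted chunk) and empties the buffer.
    synchronized List<String> drainOverflow() {
        List<String> drained = new ArrayList<>(overflow);
        overflow.clear();
        return drained;
    }

    // A delete may start only if a permit is free; call finishDelete() when it completes.
    boolean tryStartDelete() {
        return deletePermits.tryAcquire();
    }

    void finishDelete() {
        deletePermits.release();
    }
}
```

With a queue size of 2 and a concurrency of 1, a third `add` lands in the overflow buffer and a second `tryStartDelete` is refused until `finishDelete` is called.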
Garbage Discovery Background Thread
This is a background thread that periodically (once a day or once every few hours) scans the entire storage metadata table segment by enumerating all entries, and discovers all chunks that are marked for deletion but have not yet been deleted.
This thread also scans and enqueues system journal chunks.
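The metadata scan amounts to a filter over every entry in the table. The sketch below assumes a hypothetical `ChunkRecord` with a single `active` flag and a plain `Map` in place of the table segment; the real record layout and enumeration API are not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class GarbageDiscoverySketch {
    // Hypothetical metadata record; only the flag needed for this illustration.
    static final class ChunkRecord {
        final boolean active;

        ChunkRecord(boolean active) {
            this.active = active;
        }
    }

    // Enumerates every entry and returns the names of chunks that are marked
    // for deletion (inactive) but still present in the metadata table.
    static List<String> discover(Map<String, ChunkRecord> metadataTable) {
        List<String> garbage = new ArrayList<>();
        for (Map.Entry<String, ChunkRecord> entry : metadataTable.entrySet()) {
            if (!entry.getValue().active) {
                garbage.add(entry.getKey());
            }
        }
        return garbage;
    }
}
```

The names returned by `discover` would then be enqueued for garbage collection like any other garbage chunk.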
Garbage Admin Tool
This is an admin tool that uses the ChunkStorage::listChunks API to scan all chunks on LTS and discover any orphan chunks that are not in metadata.
The orphan chunks thus found are added to the persisted GC chunks.
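Orphan detection is essentially a set difference between what storage lists and what metadata knows. The sketch below does not call `ChunkStorage::listChunks`, whose exact signature is not given here; it assumes the listing has already been obtained as an `Iterable<String>`, and `OrphanChunkFinderSketch` is a hypothetical name.

```java
import java.util.Set;
import java.util.TreeSet;

class OrphanChunkFinderSketch {
    // Returns the chunks that exist on storage but have no metadata entry.
    // listedChunks: names reported by the storage listing (assumed already fetched).
    // metadataKeys: names of all chunks known to the metadata store.
    static Set<String> findOrphans(Iterable<String> listedChunks, Set<String> metadataKeys) {
        Set<String> orphans = new TreeSet<>();
        for (String name : listedChunks) {
            if (!metadataKeys.contains(name)) {
                orphans.add(name);
            }
        }
        return orphans;
    }
}
```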
Deletion of System Journal Chunks
Once the SLTS instance boots up successfully, all system journal chunks created by previous epochs are added to the persisted GC list.
When new snapshots are created, chunks containing older snapshots and truncated chunks from the journal are added to the persisted GC list.
Garbage Collection Config Values
/**
* Minimum delay in seconds between when garbage chunks are marked for deletion and actually deleted.
*/
@Getter
final private Duration garbageCollectionDelay;
/**
* Number of chunks deleted concurrently.
* This number should be small enough so that it does not interfere with foreground requests.
*/
@Getter
final private int garbageCollectionMaxConcurrency;
/**
* Max size of garbage collection queue.
*/
@Getter
final private int garbageCollectionMaxQueueSize;
/**
* Duration for which garbage collector sleeps if there is no work.
*/
@Getter
final private Duration garbageCollectionSleep;
/**
* Max number of attempts per chunk for garbage collection.
*/
@Getter
final private int garbageCollectionMaxAttempts;
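To show how these settings might be consumed, here is a small sketch. The class `GcConfigSketch`, the values assigned to each field, and the `deleteWithRetries` helper are all hypothetical; the values are illustrative only and are not the project's defaults, and what the collector does after exhausting its attempts is not specified in this document.

```java
import java.time.Duration;
import java.util.function.Predicate;

class GcConfigSketch {
    // Illustrative values only; not the actual defaults.
    final Duration garbageCollectionDelay = Duration.ofSeconds(60);
    final int garbageCollectionMaxConcurrency = 10;
    final int garbageCollectionMaxQueueSize = 16 * 1024;
    final Duration garbageCollectionSleep = Duration.ofSeconds(10);
    final int garbageCollectionMaxAttempts = 3;

    // Tries the delete at most garbageCollectionMaxAttempts times.
    // deleteAttempt returns true when the chunk was deleted.
    // Returns the number of attempts used, or -1 if every attempt failed.
    int deleteWithRetries(String chunkName, Predicate<String> deleteAttempt) {
        for (int attempt = 1; attempt <= garbageCollectionMaxAttempts; attempt++) {
            if (deleteAttempt.test(chunkName)) {
                return attempt;
            }
        }
        return -1;
    }
}
```

A delete that succeeds on its second try returns 2, while one that never succeeds is attempted exactly `garbageCollectionMaxAttempts` times before the helper gives up.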