Currently, once data in Pulsar has been offloaded to the second storage layer, it can still exist in BookKeeper for a period of time, but reads are served directly from the second layer.
This may lead to several problems:
- Reads from the second layer have different performance characteristics, which may lead users to wrong estimates if they do not know which layer they are reading from.
- The second layer may be managed by a team other than the Pulsar management team (for example, an independent HDFS management team), which may apply its own quota or authorization policies to users.
- The second layer's storage can be infinite in theory, so if a user accidentally sets a cursor to a wrong time, a lot of resources will be wasted.
So it is better to make the data source configurable when data exists in both layers.
Maybe the options below are enough:
- BOOKKEEPER_ONLY
- BOOKKEEPER_FIRST
- OFFLOADED_ONLY
- OFFLOADED_FIRST
Currently, the layer the broker reads from is decided by org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl#getLedgerHandle(long ledgerId), which takes only one parameter, ledgerId, and chooses the offloaded ledger handle as soon as the ledger has been offloaded. If the chosen handle fails, the whole getLedgerHandle call fails.
The tiered read priority should be set per namespace or per topic. The command-line usage should look like:
pulsar-admin namespaces --set-tiered-read-priority tenant/namespace priority-policy
pulsar-admin topics --set-tiered-read-priority tenant/namespace/topic priority-policy
If not configured, OFFLOADED_FIRST should be used by default, which results in the same behavior as the current version.
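The lookup with a default could be sketched roughly as follows. This is only an illustration: the TieredReadPriorityResolver class and its resolve method are hypothetical names, and the rule that a topic-level setting overrides a namespace-level one is an assumption, not something the proposal states.

```java
import java.util.Optional;

// The four options listed in the proposal.
enum TieredReadPriority {
    BOOKKEEPER_ONLY,
    BOOKKEEPER_FIRST,
    OFFLOADED_ONLY,
    OFFLOADED_FIRST
}

// Hypothetical helper, not an existing Pulsar class.
final class TieredReadPriorityResolver {
    // Default named in the proposal: keeps the current read behavior.
    static final TieredReadPriority DEFAULT = TieredReadPriority.OFFLOADED_FIRST;

    private TieredReadPriorityResolver() {}

    // Assumption: a topic-level policy wins over a namespace-level policy,
    // and the default applies only when neither is configured.
    static TieredReadPriority resolve(Optional<TieredReadPriority> topicPolicy,
                                      Optional<TieredReadPriority> namespacePolicy) {
        return topicPolicy.orElse(namespacePolicy.orElse(DEFAULT));
    }
}
```

With neither level configured, resolve returns OFFLOADED_FIRST, so existing deployments would see no change.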
The corresponding ManagedLedger should then be aware of which priority option the client is using, and the signature of the getLedgerHandle method should be changed to:
CompletableFuture<ReadHandle> getLedgerHandle(long ledgerId, TieredReadPriority priority)
For BOOKKEEPER_ONLY and OFFLOADED_ONLY, the ManagedLedger will use the corresponding ReadHandle directly. For BOOKKEEPER_FIRST and OFFLOADED_FIRST, the ManagedLedger will fall back to the other storage layer whenever the preferred layer cannot serve the read, whether because the ledger does not exist there or because of a network, disk, or authorization problem with the preferred layer.
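The selection and fallback rules above might look something like the sketch below. It is not the ManagedLedgerImpl code: TieredHandleSelector is a hypothetical class, the type parameter H stands in for ReadHandle, and the two suppliers are placeholders for whatever opens a handle against BookKeeper or the offloaded storage.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

enum TieredReadPriority { BOOKKEEPER_ONLY, BOOKKEEPER_FIRST, OFFLOADED_ONLY, OFFLOADED_FIRST }

// Hypothetical helper, not an existing Pulsar class.
final class TieredHandleSelector {
    private TieredHandleSelector() {}

    // H stands in for ReadHandle; each supplier opens a handle on one layer.
    static <H> CompletableFuture<H> open(TieredReadPriority priority,
                                         Supplier<CompletableFuture<H>> bookkeeper,
                                         Supplier<CompletableFuture<H>> offloaded) {
        switch (priority) {
            case BOOKKEEPER_ONLY:
                return bookkeeper.get();              // no fallback
            case OFFLOADED_ONLY:
                return offloaded.get();               // no fallback
            case BOOKKEEPER_FIRST:
                return withFallback(bookkeeper, offloaded);
            case OFFLOADED_FIRST:
            default:
                return withFallback(offloaded, bookkeeper);
        }
    }

    // Try the preferred layer; on any asynchronous failure (missing ledger,
    // network, disk, authorization) open the handle on the other layer instead.
    private static <H> CompletableFuture<H> withFallback(Supplier<CompletableFuture<H>> first,
                                                         Supplier<CompletableFuture<H>> second) {
        return first.get()
                .<CompletableFuture<H>>handle((handle, error) ->
                        error == null ? CompletableFuture.completedFuture(handle) : second.get())
                .thenCompose(future -> future);
    }
}
```

In this sketch the second layer is only contacted after the first one has failed, and if both fail the caller sees the error from the second attempt. Whether the first error should also be surfaced is a design choice the proposal leaves open.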
- Add tieredReadPriority to the config file broker.conf or standalone.conf for broker-level configuration.
- Add tieredReadPriority to the LedgerInfo, in which case the signature of the getLedgerHandle method could stay as it is.