Currently, if data in Pulsar has been offloaded to the second storage layer, it can still exist in BookKeeper for a period of time, but the client will read it directly from the second layer.
This may lead to several problems:
- Reads from the second layer have different performance characteristics, which may lead users to wrong estimates if they do not know which layer they are reading from.
- The second layer may be managed by a team other than the Pulsar management team (for example, an independent HDFS management team), which may apply its own quota or authorization policies to users.
- The second-layer storage can be unbounded in theory, so if a user accidentally sets a cursor to the wrong time, it can waste a lot of resources.