Abstract flow control model

Currently, the LES protocol uses a token bucket for throttling as its flow control mechanism. Basically, the mechanism works as follows:

  • Each LES client is assigned a capacity, which represents the client's allowance to use the LES server's resources

    • A free client is assigned a minimal capacity
    • A paid client can be assigned a higher capacity
  • Each request is metered; the cost mostly depends on the serving time

  • The token bucket is counted in units, not in requests, e.g. 100,000 units

  • The token bucket is recharged in units, not in requests, e.g. 1 unit per nanosecond

    For each request, the corresponding number of units is deducted, consistent with the serving time, e.g. 100,000 units for 0.1 ms of request serving.

    In fact, this is not entirely accurate: in addition to the request serving time, LES also takes the impact of network bandwidth into account.

  • The token bucket is recharged at a speed proportional to the capacity, so the higher the assigned capacity, the more (or heavier) requests the client can send

  • Mirror token bucket

    Not only does the server maintain token buckets for all connected clients, the client itself also maintains a mirror token bucket for the connection.

    In the handshake, the server exchanges some flow control parameters with the LES client, listed below:

    • Maximal request cost table: MRC
    • Minimal recharge speed: MRR
    • Token bucket burst (BufferLimit)

    With this information, the client can use its locally maintained token buckets for request distribution. Therefore, in theory, requests can be distributed across the different connected servers, and each request is processed with the highest priority and the best possible latency.

    Notably, these parameters are a promise from the server to the client: based on the MRC and MRR, the client can always send requests, no matter whether the server is overloaded or not.
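
    The following is a minimal sketch of this bucket accounting, written to illustrate the model rather than the actual les/flowcontrol code; the type and method names (ServerParams, ClientNode, Accept, RequestProcessed) and the refund of the unused portion of the maximal cost are illustrative assumptions.

        package flowsketch

        import "time"

        // ServerParams are the flow control parameters announced in the handshake.
        type ServerParams struct {
        	BufferLimit uint64 // token bucket burst (BL)
        	MinRecharge uint64 // minimal recharge speed in units per nanosecond (MRR)
        }

        // ClientNode tracks the token bucket for one connected client; the client
        // keeps an identical "mirror" of this structure locally.
        type ClientNode struct {
        	params     ServerParams
        	bufValue   uint64    // remaining units in the bucket
        	lastUpdate time.Time // time of the last recharge
        }

        // recharge refills the bucket at MinRecharge units per nanosecond,
        // capped at BufferLimit.
        func (c *ClientNode) recharge(now time.Time) {
        	dt := uint64(now.Sub(c.lastUpdate).Nanoseconds())
        	c.bufValue += c.params.MinRecharge * dt
        	if c.bufValue > c.params.BufferLimit {
        		c.bufValue = c.params.BufferLimit
        	}
        	c.lastUpdate = now
        }

        // Accept checks whether a request with the given maximal cost (taken from
        // the MRC table) fits into the bucket and, if so, charges it tentatively.
        func (c *ClientNode) Accept(maxCost uint64, now time.Time) bool {
        	c.recharge(now)
        	if c.bufValue < maxCost {
        		return false // allowance exceeded, the request should be throttled
        	}
        	c.bufValue -= maxCost
        	return true
        }

        // RequestProcessed refunds the difference between the charged maximal cost
        // and the real cost once the actual serving time is known.
        func (c *ClientNode) RequestProcessed(maxCost, realCost uint64, now time.Time) {
        	c.recharge(now)
        	if realCost < maxCost {
        		c.bufValue += maxCost - realCost
        		if c.bufValue > c.params.BufferLimit {
        			c.bufValue = c.params.BufferLimit
        		}
        	}
        }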

    Additional tricks

    • Extra bonus for token bucket recharge

      If the server is idle (no heavy request load, no block processing), we can use a higher recharge speed (in other words, we give the client a higher capacity).
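
      As a tiny, purely illustrative sketch (the helper name and the bonus factor are made up for this note), the bonus could be applied when computing the effective recharge speed:

          // effectiveRecharge is a hypothetical helper, not part of les/flowcontrol:
          // when the server is idle, recharge the client buckets faster, i.e. grant
          // a temporarily higher capacity.
          func effectiveRecharge(minRecharge uint64, serverIdle bool) uint64 {
          	if serverIdle {
          		// no heavy request load, no block processing: apply an extra bonus;
          		// the factor 4 is an arbitrary value chosen for this sketch
          		return minRecharge * 4
          	}
          	return minRecharge
          }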

    Drawbacks

    However, this model has the following drawbacks:

    • Currently the request cost metering is bound to the serving time; however, we could abstract it out with different metering policies:

      • Metering by serving time
      • Metering by request count
    • It's very hard to make the request cost table, which is exchanged between server and client, correct

      • Different machines will have totally different average request serving times
        • Slow machine
        • Fast machine
    • Different statuses of the Geth node will lead to totally different request serving times

      • Geth node under high pressure (an eth node is syncing from us, LES clients are sending requests to us)
      • Geth node is idle

    So, in order to match the promised maximum cost table which the server sends to the client, we need an additional adjustment factor, the Global Factor, to adjust the real cost via the proportion between the Real Serving Time and the Estimated Serving Time (a rough sketch of such a factor is given at the end of this section).

  • It's unnecessary to throttle requests at the unit level

    From the benchmarking results we can see that the average serving time of the different request types is basically 0.45 ms to 1 ms; there is no big difference.

    ONE EXCEPTION is txpool-related requests, because the txpool has a lot of lock contention.

    // average request cost estimates based on serving time
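    // each entry is roughly {base cost, cost per returned item} of estimated serving time in nanoseconds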
    	reqAvgTimeCost = protocol.RequestCostTable{
    		protocol.GetBlockHeadersMsg:     {150000, 30000},
    		protocol.GetBlockBodiesMsg:      {0, 700000},
    		protocol.GetReceiptsMsg:         {0, 1000000},
    		protocol.GetCodeMsg:             {0, 450000},
    		protocol.GetProofsV2Msg:         {0, 600000},
    		protocol.GetHelperTrieProofsMsg: {0, 1000000},
    		protocol.SendTxV2Msg:            {0, 450000},
    		protocol.GetTxStatusMsg:         {0, 250000},
    	}
    • It's incorrect to derive the initial BufferLimit and MinRecharge based on the initial status of the server. Everything changes fast.

    What if the request processing is slower than what was promised to the LES client?

    • The real cost will be adjusted towards the estimation level, so it stays close to the estimated value
    • The capacity management module will undertake the task of overload control

    What if the request processing is faster than what was promised to the LES client?

    • The real cost will be adjusted towards the estimation level, so it stays close to the estimated value
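
    A rough sketch of how such a Global Factor adjustment could look is given below; the type name, the smoothing constant and the naive exponential update are assumptions made for this note, the real cost tracker keeps proper running statistics instead.

        package costsketch

        // globalFactor is an illustrative stand-in for the adjustment factor
        // discussed above.
        type globalFactor struct {
        	value float64 // multiplier applied to the measured serving time
        }

        // adjust converts a measured serving time into a cost that stays close to
        // the estimate promised to the client, then nudges the factor towards the
        // observed Estimated/Real proportion (realServingTime is assumed > 0).
        func (g *globalFactor) adjust(realServingTime, estimatedServingTime float64) float64 {
        	cost := realServingTime * g.value
        	const alpha = 0.05 // smoothing weight, chosen arbitrarily for this sketch
        	g.value = g.value*(1-alpha) + (estimatedServingTime/realServingTime)*alpha
        	return cost
        }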

    Proposals

    Here are a few proposals for the flow control model:

    • In the current flow control implementation, les/flowcontrol, some capacity management code is mixed in,

      like raising or reducing the total capacity whenever a client freeze happens. We need to move this logic into the capacity management module.

    • Abstract the request metering policies and support request count metering

      • I'm not sure which one is the best policy, but I can say that raw count metering will simplify things a lot

        • We can get rid of the Global Factor adjustment
        • We can get rid of the request serving time estimation, which may be inaccurate or outdated
        • We can calculate the minimal capacity much more easily
      • I guess it won't be so bad if we use the request count policy

        • The average serving time for the different kinds of requests is similar (at least the same order of magnitude)

        • Heavier requests can be assigned a higher weight (a small sketch of such a weight table is given at the end of this section):

          • GetBlockBodiesMsg, GetReceiptsMsg, GetBlockHeadersMsg

            These requests are trivial to serve; the weight can be assigned as 1.

          • GetCodeMsg, GetProofsV2Msg, GetHelperTrieProofsMsg

            These requests require state access, which is heavier than chain access; the weight can be assigned as 2.

          • SendTxV2Msg, GetTxStatusMsg

            These requests require txpool access, which may be extremely slow if the txpool is busy. Perhaps we need to improve the txpool instead of assigning a weirdly high weight here?

      • The philosophy of simplification

        The original implementation is complicated but full-scale: we have the notion of a unit and meter the request cost at the unit level, so we can account for serving cost, bandwidth cost and connection time accurately. But at the same time this introduces a lot of complexity.

        The simplified version uses a simple weight strategy; it is general and may not be accurate, but it is simple. The fact is that sometimes, even when we use unit metering, the status of the system itself fluctuates heavily, so unit metering may end up no more effective than this simple approach.

        For other considerations, such as bandwidth limitation, we can use a separate mirror token bucket, where each request is metered based on the incoming request size and the outgoing response size.

      • It's easier for business logic

        With the help of the incentive module, the server can now start to sell its resources to clients. However, at the moment the server has to sell tokens that correspond to resource units. That makes the whole thing hard for a server operator to understand, and it is not easy to estimate or to build business logic on.

        Request count metering, however, is quite straightforward. The server operator can set a rule like:

        • 100 requests per second: 1 cent for 1 hour of connection
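
    To make the weight-based, request-count metering proposal above concrete, here is a small sketch of a possible weight table and bucket deduction; the message code constants stand in for the LES protocol constants, and the txpool weight is only a placeholder.

        package metersketch

        // Illustrative message codes standing in for the LES protocol constants.
        const (
        	GetBlockHeadersMsg = iota
        	GetBlockBodiesMsg
        	GetReceiptsMsg
        	GetCodeMsg
        	GetProofsV2Msg
        	GetHelperTrieProofsMsg
        	SendTxV2Msg
        	GetTxStatusMsg
        )

        // requestWeights follows the grouping suggested above: chain access = 1,
        // state access = 2; the txpool weight is a placeholder pending a decision
        // on whether the txpool itself should be improved instead.
        var requestWeights = map[int]uint64{
        	GetBlockHeadersMsg:     1,
        	GetBlockBodiesMsg:      1,
        	GetReceiptsMsg:         1,
        	GetCodeMsg:             2,
        	GetProofsV2Msg:         2,
        	GetHelperTrieProofsMsg: 2,
        	SendTxV2Msg:            2,
        	GetTxStatusMsg:         2,
        }

        // weightedCost is the bucket deduction for a batch request under the
        // request-count policy: per-type weight times the number of requested items.
        func weightedCost(msgCode int, items uint64) uint64 {
        	return requestWeights[msgCode] * items
        }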