Custom Character Replication

In this video: https://youtu.be/15TemJ1Qi9M, the author replaces Roblox's built-in character replication, which heavily reduces player-to-player latency. Roblox's character replication system uses a massive interpolation buffer so that players on slow connections (think mobile players with a total internet speed of 1 Mbps and a high router latency) can still play. That buffer is really bad for FPS games and other fast-paced shooters, where it causes things like getting shot through a wall and other latency problems.

In this gist, I will go through how to code a replication system that bypasses interpolation lag, implements a dynamic interpolation buffer, supports snapshots and latency compensation, optimizes for bandwidth, and follows best practices.

Interpolation Buffer

An interpolation buffer is a time period for holding onto snapshots. When you send CFrames rapidly over unreliable remotes, there is bound to be packet loss and delayed packets. An interpolation delay gives you the time needed to receive packets, reorder them, and deal with packet loss. Roblox has an interpolation delay of 200 ms, which is heavy overkill when you are playing at low latency without packet loss. I will be showing how to create your own "dynamic" interpolation buffer.

To implement a dynamic interpolation buffer, you need to "predict" when a packet gets dropped or arrives out of order. This may sound complicated at first, but it is actually very simple because you can easily compute an "estimated arrival" prediction. Estimated arrivals should be based on 2 things:

replication frequency
latency

Your replication frequency is the general tick rate at which you send your CFrames. I recommend a frequency of 20-40 Hz for the best results.

Your latency should be calculated as the recent average of the time differences (offsets) between when packets are sent and when they arrive. This gives you an expectation of how long packets usually take to reach you.
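A minimal sketch of that rolling average, in plain Luau. The `LatencyTracker` name and the window size are my own assumptions for illustration, and it assumes the send and receive times come from the same synchronized clock.

```luau
-- Hypothetical helper: rolling average of recent packet transit times.
local LatencyTracker = {}
LatencyTracker.__index = LatencyTracker

function LatencyTracker.new(windowSize)
    return setmetatable({
        windowSize = windowSize, -- how many recent samples to keep
        samples = {},
        sum = 0,
    }, LatencyTracker)
end

-- sentAt and receivedAt must be measured on the same clock.
function LatencyTracker:record(sentAt, receivedAt)
    local offset = receivedAt - sentAt
    table.insert(self.samples, offset)
    self.sum = self.sum + offset
    if #self.samples > self.windowSize then
        -- Drop the oldest sample so only recent packets count.
        self.sum = self.sum - table.remove(self.samples, 1)
    end
end

-- Average transit time over the window, or 0 before any packet arrived.
function LatencyTracker:average()
    if #self.samples == 0 then
        return 0
    end
    return self.sum / #self.samples
end

-- Usage: times here are in milliseconds, but any unit works.
local tracker = LatencyTracker.new(3)
tracker:record(0, 50)
tracker:record(1000, 1070)
tracker:record(2000, 2060)
print(tracker:average()) --> 60
```

A small window reacts quickly to latency changes but is noisier, so the right size depends on your send rate.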

Using the average latency, you can detect that a packet has likely been dropped if it doesn't arrive within the expected time window. Suppose a packet does eventually arrive, but in the next replication frame, along with the packet that was originally scheduled for that frame. In that case, you can assume that the late packet is the one you flagged as dropped. You can confirm this by checking whether the gap between the delayed packet and the expected one is smaller than your average latency.
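Here is a rough sketch of the estimated-arrival check and one possible way to resize the buffer from it. The function names, the `tolerance` parameter, and the grow/shrink policy (grow by one send interval on a suspected drop, shrink by a small step otherwise) are assumptions of mine, not a fixed recipe.

```luau
-- Estimated arrival of the packet that follows the one sent at lastSentAt.
local function estimateArrival(lastSentAt, sendRate, averageLatency)
    return lastSentAt + 1 / sendRate + averageLatency
end

-- A packet is treated as likely dropped (or late) once we are past its
-- estimated arrival plus a small tolerance for jitter.
local function isLikelyDropped(now, lastSentAt, sendRate, averageLatency, tolerance)
    return now > estimateArrival(lastSentAt, sendRate, averageLatency) + tolerance
end

-- Assumed policy: widen the interpolation buffer when packets go missing,
-- narrow it slowly while the connection is healthy.
local function adjustDelay(delay, dropped, sendRate, shrinkStep, minDelay, maxDelay)
    if dropped then
        delay = delay + 1 / sendRate
    else
        delay = delay - shrinkStep
    end
    return math.clamp(delay, minDelay, maxDelay)
end

-- Usage: 20 Hz send rate, 50 ms average latency, 20 ms tolerance.
local dropped = isLikelyDropped(1.2, 1.0, 20, 0.05, 0.02)
local delay = adjustDelay(0.05, dropped, 20, 0.005, 0.05, 0.2)
print(dropped, delay)
```

Clamping keeps the buffer from collapsing to zero on a clean connection or growing without bound on a bad one.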

Snapshots

Snapshots sound simple to implement, because one may think it is just keeping a table and indexing it by time. It's actually a bit more complicated than that, but still very straightforward once understood. You want to be able to get the "interpolated snapshot" at the current renderTime rather than just the stored snapshot closest to that time. The reason is pretty straightforward. Picture this: you are sending CFrames over the wire at 30 Hz, which is roughly 33.3 ms between position updates. If your render time for someone falls at 17 ms, the snapshots you actually have for them are the ones at 0 ms and 33 ms. If you just snap to either of those, motion will appear jittery or disconnected. That's why you want to interpolate between those two snapshots to estimate what the player's state was at 17 ms. There are 2 snapshot interpolation methods that people typically implement.

There is:

linear interpolation
hermite interpolation

In linear interpolation, you simply interpolate between snapshots as you move forward in time through your array. This is the most common, and recommended interpolation method for your snapshots.

Hermite interpolation is the interpolation method that Roblox uses. It constructs a polynomial that matches both the values and the derivatives (here, positions and velocities) at a set of points, so each snapshot also needs a velocity. It results in much smoother curves, especially for easing in and out of movement. This adds complexity and isn't necessary for our use case.
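The linear method can be sketched roughly like this. To keep it runnable outside Roblox, positions are plain `{x, y, z}` tables, which is an assumption of this sketch; with real CFrames you would likely lerp with `CFrame:Lerp` instead. `SnapshotBuffer` is a hypothetical name.

```luau
local SnapshotBuffer = {}
SnapshotBuffer.__index = SnapshotBuffer

function SnapshotBuffer.new()
    return setmetatable({ snapshots = {} }, SnapshotBuffer)
end

-- Insert while keeping the array sorted by time, so a late packet
-- still lands in the right slot.
function SnapshotBuffer:push(time, position)
    local list = self.snapshots
    local index = #list + 1
    while index > 1 and list[index - 1].time > time do
        index = index - 1
    end
    table.insert(list, index, { time = time, position = position })
end

local function lerp(a, b, alpha)
    return a + (b - a) * alpha
end

-- Interpolated position at renderTime, or nil if nothing is buffered.
-- Outside the buffered range it holds the oldest or newest snapshot.
function SnapshotBuffer:sample(renderTime)
    local list = self.snapshots
    if #list == 0 then
        return nil
    end
    if renderTime <= list[1].time then
        return list[1].position
    end
    for i = 1, #list - 1 do
        local older, newer = list[i], list[i + 1]
        if renderTime <= newer.time then
            local alpha = (renderTime - older.time) / (newer.time - older.time)
            return {
                x = lerp(older.position.x, newer.position.x, alpha),
                y = lerp(older.position.y, newer.position.y, alpha),
                z = lerp(older.position.z, newer.position.z, alpha),
            }
        end
    end
    return list[#list].position
end

-- Usage: snapshots at 0 ms and 33 ms, sampled at a render time of 17 ms.
local buffered = SnapshotBuffer.new()
buffered:push(33, { x = 33, y = 0, z = 0 })
buffered:push(0, { x = 0, y = 0, z = 0 }) -- arrived late, still sorted
local position = buffered:sample(17)
```

A real buffer would also prune snapshots older than the render time so the array does not grow forever.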

Additional Info: https://gafferongames.com/post/snapshot_interpolation/

Client Tick

A common thing people do when implementing custom replication is to just send CFrames to the server and have the server generate a timestamp to be forwarded to the other clients. This is bad because it only takes into account the latency of the recipient, not the sender. This may seem like a minor issue, but it will cause visible desync in faster-paced movement. Picture this: client A sends a CFrame at local time t = 100 ms, and client B sends theirs at t = 90 ms. Due to latency, both packets arrive at the server around t = 120 ms. The server, unaware of the actual send times, assigns both packets the same server timestamp and forwards them to other clients. This generalized time will cause inaccuracies in interpolation.

Simply forwarding timestamps from the client to the server will solve this.

Ex.

bad: client fires remote -> server generates new timestamp -> receiving client uses that timestamp and calculates latency from there
good: client fires remote with timestamp attached -> server forwards that timestamp -> receiving client calculates latency from it, so it accounts for the latency of both the sender and the recipient
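The difference between the two flows can be shown with a small pure-Luau sketch that reuses the numbers from the example above. The packet shape and function names are hypothetical, all clocks are assumed to be synchronized, and the clamp on the client timestamp is my own addition (a client can lie about its send time), not something the flows above require.

```luau
-- Bad: the server stamps each packet with its own arrival time.
local function relayWithServerStamp(packet, serverNow)
    return { state = packet.state, timestamp = serverNow }
end

-- Good: the server forwards the sender's timestamp.
-- Assumption: clamp it so it is never in the future or older than maxAge.
local function relayWithClientStamp(packet, serverNow, maxAge)
    local timestamp = math.clamp(packet.timestamp, serverNow - maxAge, serverNow)
    return { state = packet.state, timestamp = timestamp }
end

-- On the receiving client: sender -> server -> receiver latency.
local function totalLatency(relayed, receiverNow)
    return receiverNow - relayed.timestamp
end

-- Client A sent at 100 ms, client B at 90 ms, both reach the server at
-- 120 ms, and the receiving client gets them at 140 ms.
local fromA = { state = "A", timestamp = 100 }
local fromB = { state = "B", timestamp = 90 }

print(totalLatency(relayWithServerStamp(fromA, 120), 140)) --> 20
print(totalLatency(relayWithServerStamp(fromB, 120), 140)) --> 20
print(totalLatency(relayWithClientStamp(fromA, 120, 1000), 140)) --> 40
print(totalLatency(relayWithClientStamp(fromB, 120, 1000), 140)) --> 50
```

With server stamps both players look equally fresh, while forwarded timestamps keep the 10 ms difference between them.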

Serialization

To optimize for bandwidth, it is highly recommended to use buffers to serialize packets. Thankfully, there are networking libraries such as Bytenet and Blink to do SerDes easily.

CFrames consist of both positional and rotational components.

For position, use float32. While float16 is smaller, it runs into precision limits: it carries only about three significant decimal digits, so from 1024 studs upward it can only represent whole numbers. Many open-source replication systems use float16 for position, but those are typically for showcase.

For rotation, it's best to use quaternions and serialize them to float16. It is very simple to convert from CFrames to quaternions (using ToAxisAngle()), and even easier to convert back to CFrames using the quaternion overload of CFrame.new. Alternatively, you can just send the Y-axis rotation, as it is precise enough for most use cases in battleground games.
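As a rough sketch of the packing, here is a version built on Luau's built-in `buffer` library. That library has no 16-bit float writer, so this sketch swaps in a different technique for rotation: each quaternion component (always within -1 to 1) is quantized to a signed 16-bit integer. The layout and function names are my own assumptions; libraries like Bytenet or Blink would generate their own format.

```luau
local PACKET_SIZE = 20 -- 3 * f32 position (12 bytes) + 4 * i16 quaternion (8 bytes)
local I16_MAX = 32767

-- Map a quaternion component in [-1, 1] to a signed 16-bit integer.
local function quantize(component)
    return math.round(math.clamp(component, -1, 1) * I16_MAX)
end

local function serialize(px, py, pz, qx, qy, qz, qw)
    local packet = buffer.create(PACKET_SIZE)
    buffer.writef32(packet, 0, px)
    buffer.writef32(packet, 4, py)
    buffer.writef32(packet, 8, pz)
    buffer.writei16(packet, 12, quantize(qx))
    buffer.writei16(packet, 14, quantize(qy))
    buffer.writei16(packet, 16, quantize(qz))
    buffer.writei16(packet, 18, quantize(qw))
    return packet
end

local function deserialize(packet)
    local px = buffer.readf32(packet, 0)
    local py = buffer.readf32(packet, 4)
    local pz = buffer.readf32(packet, 8)
    local qx = buffer.readi16(packet, 12) / I16_MAX
    local qy = buffer.readi16(packet, 14) / I16_MAX
    local qz = buffer.readi16(packet, 16) / I16_MAX
    local qw = buffer.readi16(packet, 18) / I16_MAX
    return px, py, pz, qx, qy, qz, qw
end

-- Usage: a 90 degree turn around the Y axis, roughly (0, 0.7071, 0, 0.7071).
local packet = serialize(10.5, 3.25, -7.75, 0, 0.7071, 0, 0.7071)
local px, py, pz, qx, qy, qz, qw = deserialize(packet)
```

On the sending side the quaternion can be built from `ToAxisAngle()` as the axis scaled by `sin(angle / 2)` plus `cos(angle / 2)` for w, and the receiver can pass the seven numbers straight to the quaternion overload of `CFrame.new`.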

Timestamps should be sent from client-to-server and server-to-all-clients (explained in client tick). I recommend using GetServerTimeNow() for timestamps, since it provides a synchronized clock between client and server, avoiding the need to compensate for clock drift due to latency variations.

When rendering, you can compute the remote latency easily by comparing the GetServerTimeNow() timestamp attached to the packet with GetServerTimeNow() on the receiving client, then subtract your interpolation buffer from the current time to get the correct render time. However, keep in mind:

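GetServerTimeNow() returns an epoch-scale value that needs a full float64 (8 bytes) to keep sub-second precision. os.clock() also returns a float64 in Luau, but its values are much smaller, so they lose far less precision when sent as a float32.
To reduce bandwidth, you can encode timestamps. For example, I use GetServerTimeNow() % 255 to compress the timestamp to a float16 range. Keep in mind that float16 resolution gets coarser as the value grows (steps of about 0.125 s between 128 and 255).
Nezuo also has a u8 tick trick using os.clock(); you can find more on that in the Roblox OSS Community. As far as I know, kauis replication also implements this.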

If you're implementing time wrapping like this, make sure to account for circularity when comparing timestamps in your snapshot logic — otherwise, you’ll get incorrect ordering.
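A minimal sketch of wrap-aware comparison, assuming the same 255 second period as the `% 255` example. The function names are hypothetical, and the math only holds while the two timestamps are less than half a period apart.

```luau
local PERIOD = 255 -- seconds, matching the % 255 wrap

-- Signed shortest distance from b to a on the wrapped clock.
-- Positive means a is newer than b.
local function wrappedDelta(a, b)
    return (a - b + PERIOD / 2) % PERIOD - PERIOD / 2
end

local function isNewer(a, b)
    return wrappedDelta(a, b) > 0
end

-- Rebuild a full timestamp from a wrapped one, using the receiver's
-- own unwrapped clock as the reference.
local function unwrap(wrapped, nowFull)
    return nowFull + wrappedDelta(wrapped, nowFull % PERIOD)
end

-- A timestamp of 2 that arrives after 250 has wrapped around, so it is
-- newer even though it is numerically smaller.
print(wrappedDelta(2, 250)) --> 7
print(isNewer(2, 250)) --> true
print(unwrap(250, 1022)) --> 1015
```

Unwrapping on receipt lets the rest of the snapshot logic keep sorting by ordinary increasing times.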

Disabling Default Replication

There are 2 common methods for disabling default Roblox character replication, each with its own complexities.

Anchoring on the server and unanchoring on the client: This is the most straightforward way of disabling character replication. Thankfully, animations replicate fine using this method, and you already have character appearances replicated. To do collision detection with this method, each player should be represented as a dummy parented to the camera (so it will not replicate), which we simply move to the latest character CFrame with BulkMoveTo.
Parenting to the camera: Instances parented to the camera do not replicate. This also means you have to recreate, on each client, the character appearances of everyone on the server. It is best to use this technique when you wish to code your own animation system and custom character controller with collisions. This is probably the best example using custom characters.