Kindlewild

Evolution of Server-Client Communication

By Robin Dowling · 5 months ago

Building an efficient server-client communication system for my evolution simulation has been a fascinating journey. Each iteration brought new challenges and insights as I balanced performance, bandwidth usage, and user experience. Here's a look at my progression:

1. Always Full State

My first implementation was straightforward: send the entire world state to every client on each update. I created a data structure that contained all entity information, metadata about the world state, and statistics. This provided a reliable baseline but quickly revealed scalability issues as my simulation grew.

This approach worked well for testing but wasn't sustainable with hundreds of organisms and multiple clients.
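A minimal sketch of that baseline, assuming hypothetical entity fields (`id`, `species`, `x`, `y`) and a JSON wire format:

```python
import json

def build_full_state(entities, iteration):
    """Serialize the entire world state: every entity, every property,
    plus iteration metadata. Simple and reliable, but the payload grows
    linearly with the number of organisms and is resent to every client."""
    return json.dumps({
        "iteration": iteration,
        "stats": {"count": len(entities)},
        "entities": entities,  # the full list, every tick
    })

entities = [{"id": "a1", "species": "fern", "x": 10, "y": 4}]
payload = build_full_state(entities, iteration=1)
```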

2. Custom Serialization

To reduce payload size, I implemented custom serialization. Instead of sending verbose JSON with repeated property names, I used a simple positional array where values are packed in a fixed order, removing the need for keys and many control characters.

This significantly reduced my payload size, with some entities shrinking from ~200 bytes to just ~50 bytes.
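The idea can be sketched like this, assuming both sides share a fixed field order (the schema and field names here are illustrative):

```python
# Field order is fixed by a schema both server and client know,
# so property names never travel over the wire.
SCHEMA = ["id", "species", "x", "y"]

def pack(entity):
    """Flatten an entity dict into a positional array."""
    return [entity[key] for key in SCHEMA]

def unpack(values):
    """Rebuild the dict from the positional array via the shared schema."""
    return dict(zip(SCHEMA, values))

entity = {"id": "a1", "species": "fern", "x": 10, "y": 4}
packed = pack(entity)  # ["a1", "fern", 10, 4] -- no key names, no braces
```

The trade-off is that both ends must agree on the schema version; adding a field means bumping the schema on both sides.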

3. Minimized IDs

Entity IDs started as four-character base62 strings. I replaced them with incremental, session-scoped base62 IDs, so most entities need only one or two characters, reducing size even more.
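A minimal base62 encoder for such session-scoped counters might look like this (the alphabet ordering is an assumption; any fixed ordering works as long as both sides agree):

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def to_base62(n):
    """Encode a non-negative integer counter as a base62 string.
    Counters below 62 fit in one character, below 3844 in two."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))
```

Because IDs are handed out incrementally per session, the first 62 entities get one-character IDs and the first 3,844 get at most two characters.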

4. Change Deltas

Instead of sending complete entities on each update, I began tracking client state and sending only what changed. I implemented a function that compares two states of an entity and returns only the properties that differ between them. This delta approach meant that if only an entity's position changed, I would send just the position rather than the entire entity definition.

This reduced typical payloads by 70-90% in stable world states, as most entities only change position or a few properties between frames.
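The comparison function can be sketched as a shallow dict diff (field names are illustrative; the ID is kept so the client knows which entity to patch):

```python
def diff(old, new):
    """Return only the properties of `new` that differ from `old`,
    plus the entity's ID so the client can apply the patch."""
    changed = {k: v for k, v in new.items() if old.get(k) != v}
    if changed:
        changed["id"] = new["id"]
    return changed

old = {"id": "a1", "species": "fern", "x": 10, "y": 4}
new = {"id": "a1", "species": "fern", "x": 11, "y": 4}
delta = diff(old, new)  # only the changed position, not the whole entity
```

This requires the server to remember the last state it sent each client, which is what makes the missing-state feedback in the next step necessary.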

5. Notify Server of Missing States

Delta-based approaches created a new problem: what if clients miss an update? I implemented a feedback mechanism allowing clients to notify the server when they're missing entity data. When a client sends a feedback message indicating it has missing entities, the server clears its assumptions about what the client knows and sends a full update on the next iteration.
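A sketch of the server side of this feedback loop (the class and message shape are illustrative, not the actual implementation):

```python
class ClientTracker:
    """Per-client record of what the server believes the client knows."""

    def __init__(self):
        self.known_entities = {}  # entity id -> last state sent to this client

    def handle_feedback(self, message):
        """If the client reports missing entities, drop all assumptions
        so the next iteration goes out as a full update."""
        if message.get("missing_entities"):
            self.known_entities.clear()

    def needs_full_update(self):
        return not self.known_entities

tracker = ClientTracker()
tracker.known_entities = {"a1": {"x": 10}}
tracker.handle_feedback({"missing_entities": True})
```

Resetting to a full update is deliberately blunt: it costs one large payload but guarantees the delta stream is consistent again.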

6. Skip Out-of-Order State Iterations

Network delays sometimes resulted in iterations arriving out of sequence. I added a simple mechanism to track the last applied iteration and ignore older ones. In the client code, I check if the received state's iteration number is less than the already applied iteration, and if so, discard it.

This prevented visual glitches where the world would temporarily "jump backward."
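The check itself is a one-liner on the client; a sketch (names are illustrative):

```python
class WorldClient:
    """Applies incoming states, discarding any older than the last applied."""

    def __init__(self):
        self.applied_iteration = -1

    def receive(self, state):
        if state["iteration"] < self.applied_iteration:
            return False  # stale: an earlier tick arrived late, drop it
        self.applied_iteration = state["iteration"]
        # ... apply entity updates here ...
        return True

client = WorldClient()
accepted = client.receive({"iteration": 5})
rejected = client.receive({"iteration": 3})  # out of order, discarded
```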

7. Client-Side State Buffering

To further address the occasional out-of-order delivery, I implemented a buffering system in the client. The system stores incoming states in a map keyed by iteration number. When processing states, it first looks for the exact next iteration needed. If that iteration is available, it's applied and removed from the buffer. If too many states accumulate in the buffer, it falls back to using the oldest one to prevent memory issues.

This buffer allowed the client to store states that arrived early and apply them in the correct sequence, creating smoother animations.
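A sketch of that buffer, assuming an illustrative overflow cap of 10 states:

```python
MAX_BUFFER = 10  # illustrative cap before falling back to the oldest state

class StateBuffer:
    def __init__(self):
        self.buffer = {}        # iteration number -> buffered state
        self.next_iteration = 0

    def add(self, state):
        self.buffer[state["iteration"]] = state

    def pop_next(self):
        """Return the exact next iteration if buffered. If the buffer
        overflows, fall back to the oldest buffered state so memory
        stays bounded; otherwise wait (return None)."""
        if self.next_iteration in self.buffer:
            state = self.buffer.pop(self.next_iteration)
        elif len(self.buffer) > MAX_BUFFER:
            state = self.buffer.pop(min(self.buffer))
        else:
            return None
        self.next_iteration = state["iteration"] + 1
        return state

buf = StateBuffer()
buf.add({"iteration": 1})  # arrived early
buf.add({"iteration": 0})
```

Skipping ahead after an overflow pairs naturally with the feedback mechanism from step 5: the client can report the gap and receive a full update.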

8. Aggregate Similar Organisms

As my simulations grew more complex, areas with high organism density created bandwidth spikes. A solution was to aggregate similar organisms in congested positions. When the server detects multiple organisms of the same species at the same position, it adds a special "aggregate" metadata property to the first organism's state, recording how many similar organisms are at that location.

The client then expands these aggregates into individual entities. It checks for the aggregate property, and if present, creates that many individual entities with slightly modified IDs to distinguish them.
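Both halves of that exchange can be sketched as follows (field names, the `aggregate` key, and the ID-suffix scheme are illustrative):

```python
def aggregate(organisms):
    """Server side: collapse organisms of the same species at the same
    position into one entity carrying an `aggregate` count."""
    groups = {}
    for org in organisms:
        key = (org["species"], org["x"], org["y"])
        if key in groups:
            groups[key]["aggregate"] += 1
        else:
            groups[key] = {**org, "aggregate": 1}  # first organism wins
    return list(groups.values())

def expand(entities):
    """Client side: re-create individuals, suffixing IDs to keep them
    distinct. Only the first keeps the original ID."""
    out = []
    for ent in entities:
        count = ent.pop("aggregate", 1)
        for i in range(count):
            new_id = ent["id"] if i == 0 else f"{ent['id']}.{i}"
            out.append({**ent, "id": new_id})
    return out

herd = [
    {"id": "a1", "species": "fern", "x": 3, "y": 3},
    {"id": "a2", "species": "fern", "x": 3, "y": 3},
]
packed = aggregate(herd)   # one entity instead of two
```

The expansion is lossy on purpose: the individual identities of co-located organisms don't matter visually, so the client only needs the count.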

Results

The iterative improvements paid off:

  • Average payload size: reduced from ~1.5MB to ~2KB per update
  • Peak bandwidth usage: reduced from ~10MB/s to ~20KB/s

Each optimization built upon the previous one, creating a data transfer system that's efficient, resilient, and scalable.