@JonCole
Last active February 22, 2024 09:49
Redis Debugging Tips

Debugging Redis Keyspace Misses

Simply put, a keyspace "MISS" means some piece of data you tried to retrieve from Redis was not there. This usually means that one of the following things happened:

  1. The key expired
  2. The key was renamed
  3. The key was deleted
  4. The key was evicted due to memory pressure
  5. The entire database was flushed
  6. The key was never actually inserted
  7. The data was lost due to a crash, failover, etc.
  8. The data is in a different DB than the one you currently selected

Please read "What happened to my data" for related information.

For cases 1, 2, 3 and 4 (key expired, renamed, deleted or evicted)

The first thing you should do is look at diagnostic metrics for your cache to see if there is a correlation between when the key went missing and a spike in expired or evicted keys. I have seen many cases where there is a large spike in expired or evicted keys at the point in time when keys seem to have gone missing.
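One way to check for that correlation by hand is to capture the output of the INFO command before and after keys go missing and compare the counters. The sketch below assumes you have saved two such snapshots as text; the sample values are made up, and only the field names (keyspace_misses, expired_keys, evicted_keys) come from the Stats section of INFO.

```python
def parse_info(text):
    """Parse the 'field:value' lines of Redis INFO output into a dict."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#") or ":" not in line:
            continue
        field, _, value = line.partition(":")
        stats[field] = value
    return stats


COUNTERS = ("keyspace_misses", "expired_keys", "evicted_keys")


def counter_deltas(before, after):
    """Return how much each counter grew between two INFO snapshots."""
    old, new = parse_info(before), parse_info(after)
    return {name: int(new[name]) - int(old[name]) for name in COUNTERS}


# Made-up sample snapshots (illustrative values only).
BEFORE = "# Stats\r\nkeyspace_hits:900\r\nkeyspace_misses:100\r\nexpired_keys:40\r\nevicted_keys:0\r\n"
AFTER = "# Stats\r\nkeyspace_hits:950\r\nkeyspace_misses:400\r\nexpired_keys:45\r\nevicted_keys:290\r\n"

print(counter_deltas(BEFORE, AFTER))
```

In this made-up sample the growth in misses lines up with the growth in evicted keys, which would point at memory pressure rather than expiration.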

See the Appendix for information on using Keyspace Notifications or MONITOR to debug these types of issues.

For case 5 (DB was flushed)

See here for details of how to see if this happened.

For case 6 (key was never inserted)

In this case, the key was never actually inserted into Redis even though you thought it was. For example, this can happen due to a bug in the client application or due to a network connection getting dropped while the application is trying to write to Redis.

In order to debug this, you will need to debug the client application to figure out which keys are resulting in a MISS and then try to figure out which part of the application is issuing these operations. You will likely need to add additional logging to your application.

For case 7 (data loss)

This doesn't happen very often, so check all the other possible causes first. When we talk about data loss, we mean most or all of your keys disappearing, not just a few. As mentioned previously, please read "What happened to my data" for related information.

Using a Premium tier cache instance with persistence is a way to protect yourself against this. You can reach out to customer support if you have additional questions.

For case 8 (incorrect DB selected)

If you had DB 0 selected when you set the key/value into Redis, but have a different DB selected (like DB 1, 2, etc.) when you try to read it, then the system won't find the key you are looking for because each DB is its own logical data set. Use the SELECT command to switch to other DBs and then look for the key in each DB.
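To decide which DBs are worth checking, the Keyspace section of INFO lists each DB that currently holds keys. A minimal sketch that reads that section from captured text follows; the sample text is made up and the function name is only illustrative.

```python
def dbs_with_keys(keyspace_text):
    """Map DB index -> key count from the Keyspace section of INFO."""
    counts = {}
    for line in keyspace_text.splitlines():
        line = line.strip()
        # Data lines look like "db0:keys=12,expires=3,avg_ttl=0".
        if not line.startswith("db"):
            continue
        name, _, fields = line.partition(":")
        pairs = dict(item.split("=", 1) for item in fields.split(","))
        counts[int(name[2:])] = int(pairs["keys"])
    return counts


# Made-up sample of the Keyspace section (illustrative values only).
SAMPLE = "# Keyspace\r\ndb0:keys=12,expires=3,avg_ttl=0\r\ndb1:keys=1,expires=0,avg_ttl=0\r\n"

print(dbs_with_keys(SAMPLE))
```

Any DB index that shows up here other than the one your client selects is a candidate location for the "missing" key.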

Appendix

Keyspace Notifications

If you need to debug on an individual key basis, then the best option is to enable Keyspace Notifications and watch for events to fire for the key(s) in question. You can find sample code here.
Note that there is a performance impact for turning on notifications.
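As a rough illustration of what the events look like once notifications are enabled (through the notify-keyspace-events setting), the sketch below decodes the channel and message of a notification into the DB, key, and event. It only parses strings, so subscribing to the channels is left to whichever client library you use. The function name and sample keys are hypothetical.

```python
import re

# Channels look like "__keyspace@<db>__:<key>" or "__keyevent@<db>__:<event>".
_CHANNEL = re.compile(r"^__(keyspace|keyevent)@(\d+)__:(.*)$", re.DOTALL)


def parse_notification(channel, message):
    """Decode one notification into its db, key and event, or None if unrelated."""
    match = _CHANNEL.match(channel)
    if match is None:
        return None
    kind, db, rest = match.group(1), int(match.group(2)), match.group(3)
    if kind == "keyspace":
        # Keyspace channel: the channel names the key, the message is the event.
        key, event = rest, message
    else:
        # Keyevent channel: the channel names the event, the message is the key.
        key, event = message, rest
    return {"db": db, "key": key, "event": event}


print(parse_notification("__keyspace@0__:user:42", "expired"))
print(parse_notification("__keyevent@3__:evicted", "user:42"))
```

Logging the decoded events for the key(s) in question gives a record of whether each one expired, was evicted, was deleted, or was renamed.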

MONITOR

Another option for figuring out exactly what is happening on the server side is to use the MONITOR command, which allows a client to watch the stream of commands that are being executed. However, you should be aware that this can have a heavy performance impact because Redis will now send all operations back to the client that issued the MONITOR command.

If you are trying to debug a miss with MONITOR logs, then you will have to capture the logs for long enough to see everything that was done to the key in question. For instance, you will need to be able to see when the associated SET operation was done on the key, then look for later changes (like DEL, RENAME, EXPIRE, etc.). If you don't find any entries for the key in the MONITOR output, then either it was never set in the first place or you started MONITOR after the key was already inserted.
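Searching a long capture by eye is tedious, so here is a minimal sketch that pulls out every logged command naming a given key. It assumes the capture was saved as text lines in the usual MONITOR format (timestamp, then DB and client address in brackets, then quoted arguments). The sample lines are made up, and the matching is naive: a command whose value happens to equal the key name would also match.

```python
import re
import shlex

# Example line: 1700000000.000100 [0 10.0.0.5:50000] "SET" "session:1" "abc"
_LINE = re.compile(r"^(\d+\.\d+) \[(\d+) ([^\]]+)\] (.*)$")


def key_history(monitor_lines, key):
    """Return (timestamp, db, command) for each logged command that names `key`."""
    history = []
    for line in monitor_lines:
        match = _LINE.match(line.strip())
        if match is None:
            continue  # not a command line, e.g. the initial "OK" reply
        args = shlex.split(match.group(4))
        # Naive check: the key appears anywhere among the command's arguments.
        if key in args[1:]:
            history.append((float(match.group(1)), int(match.group(2)), args[0].upper()))
    return history


# Made-up capture (illustrative only).
LOG = [
    "OK",
    '1700000000.000100 [0 10.0.0.5:50000] "SET" "session:1" "abc"',
    '1700000001.000200 [0 10.0.0.5:50000] "GET" "session:2"',
    '1700000002.000300 [0 10.0.0.6:50001] "EXPIRE" "session:1" "30"',
    '1700000040.000400 [0 10.0.0.5:50000] "GET" "session:1"',
]

for timestamp, db, command in key_history(LOG, "session:1"):
    print(timestamp, db, command)
```

In this made-up capture, the key is set, given a 30-second expiry by a second client, and then read 38 seconds later, which would explain the miss.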

As an example, a customer asked why the MISS count went up by just creating a StackExchange.Redis ConnectionMultiplexer and calling GetDatabase() on it. Using MONITOR on my cache, it was pretty easy to see that StackExchange.Redis issues GET requests for a key "__Booksleeve_TieBreak", which I knew didn't exist in my cache. Apparently StackExchange.Redis uses this key for some type of state management.

Debugging Latency Issues

If you are seeing latency issues, they are almost always caused by one of the following things:

Server-side causes

Server CPU/Load - If you have hit the compute capacity of your Redis instance, it is going to take longer for it to respond to your requests. Upgrade your server to fix this.

Server Bandwidth - If you have hit the bandwidth capacity of your Redis instance, it is going to take longer for data sent by Redis to reach your client application. Upgrade your server to fix this.

Server Memory pressure - Memory pressure on the server causes page faulting, which slows down the entire system. Upgrade Redis or reduce memory usage to fix it.

Client-side causes

Client CPU/Load - High client CPU means that requests take longer to send and responses take longer to process because the CPU is busy.

Client ThreadPool Configuration - If the thread pool is not configured correctly, it will not grow fast enough for bursts of work, causing data received from the server to sit unprocessed in the client socket buffer.

Client Bandwidth - Once you have exceeded the bandwidth for the client, it takes longer to receive data from the server even if the server processed the request very quickly...

Client Memory pressure - Memory pressure on the client is similar to pressure on the server: it causes page faulting, which slows down the entire system. Upgrade your client or reduce memory usage to fix it.

Running expensive commands - An expensive command slows down not only the request that issued it, but also any other pending requests queued behind it.

Larger Request or Response Size - Large requests or responses take longer for the server to process, thus affecting that request and any other pending request in the queue.

Client Per-connection Throughput Limits - While most apps do not need to create multiple connections, there are some high throughput scenarios where you can exhaust the throughput capabilities of a single connection. This will cause latency to go up. Creating more connections can help. Just be careful not to create too many connections: each connection has overhead associated with it, so creating too many can actually make things worse. Start low and increase the number of connections slowly until you stop getting improved performance.
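The "start low and increase slowly" approach can be sketched as a simple loop. Everything here is an assumption for illustration: measure_throughput is a hypothetical callback you would write to run your own workload over n connections and return operations per second, and the 5% minimum gain is an arbitrary threshold.

```python
def pick_connection_count(measure_throughput, max_connections=16, min_gain=0.05):
    """Add connections one at a time until throughput stops improving.

    measure_throughput(n) is a hypothetical caller-supplied function that
    runs the workload over n connections and returns operations per second.
    """
    best_n = 1
    best = measure_throughput(1)
    for n in range(2, max_connections + 1):
        current = measure_throughput(n)
        # Stop as soon as an extra connection gains less than min_gain (5%).
        if current < best * (1 + min_gain):
            break
        best_n, best = n, current
    return best_n


# Made-up measurements standing in for a real benchmark run.
FAKE_RESULTS = {1: 100, 2: 190, 3: 260, 4: 265, 5: 250}

print(pick_connection_count(FAKE_RESULTS.__getitem__, max_connections=5))
```

With these made-up numbers the loop settles on three connections, because a fourth adds less than 5% more throughput.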

Note: If you are using StackExchange.Redis as your client, it has a profiling system that may be helpful when investigating performance issues.

Leonardo-Ferreira commented Aug 30, 2017

Is there a way to get a report that has 2 columns: "KeySearched" and "Found"? Redis is reporting an average of 35% cache misses (sometimes I get up to 50% misses), but looking at my own logs I just can't find out where those misses are coming from...
