Reddit Dec 5, 2022 10:46 AM:

Hi Christian,

I hope all is well!

I’m reaching out because we noticed a spike in Apollo's request rate on December 2nd. From approximately 14:17 to 14:23 UTC, requests to /message/inbox went up by around 35% before returning to baseline. We are hoping you could help us understand what might have happened.

The source IPs were in AWS us-west-2: redacted, redacted, and redacted and had the Apollo UA server:apollo-backend:v1.0 (by /u/iamthatis) contact [email protected]

Thank you!

Christian Dec 5, 2022 10:48 AM:

On it, will have an answer for you shortly.

Christian Dec 5, 2022 10:51 AM:

My partner on the server side of things is at his day job at the moment and will check it out when he’s home. I’ll CC him in.

Reddit Dec 5, 2022 10:53 AM:

Thanks again!

André Dec 5, 2022 12:47 PM:

Hi all,

I'm looking at the graphs and I am noticing a spike in error responses from Reddit on the 2nd that coincides with that UTC time window.

Was this Reddit trying to rate limit us? Or was there an outage?

Reddit Dec 5, 2022 12:50 PM:

Hi André,

This is what I received before: During that time, they would have received back a bunch of 5xx’s because the spike in traffic was more than we could handle that fast.

Does that help?

André Dec 5, 2022 12:57 PM:

Probably not the answer you want to hear but I'm... not sure?

Looking at the graphs here. It all seems "normal" (other than the spike in error responses.)

What I can tell you is that we have mechanisms in place to avoid hammering Reddit if we can avoid it. We have locks to ensure we don't unleash a thundering herd if response times ever grow on your end, and we only fire off jobs within a pre-determined period of time too.

This one is a real head scratcher, but I'd love to get to the bottom of it. Would it help if I hopped on a call with one of your SREs to dive a bit deeper? I'd like to figure out what I did wrong so I can rectify it.
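For readers unfamiliar with the safeguards André mentions, here is a minimal sketch of one way such locking could look. It is not Apollo's actual code: `InboxPoller`, `fetch`, and `max_in_flight` are hypothetical names chosen for illustration. The idea is that each account has at most one check in flight, and the total number of concurrent checks is capped, so slow upstream responses cause work to be skipped rather than piled up.

```python
import threading


class InboxPoller:
    """Hypothetical poller: one in-flight check per account, bounded overall."""

    def __init__(self, fetch, max_in_flight=4):
        self._fetch = fetch  # callable(account_id) -> result (assumed)
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._busy = set()  # accounts that currently have a check running
        self._guard = threading.Lock()  # protects self._busy

    def poll(self, account_id):
        # Skip if a check for this account is already running.
        with self._guard:
            if account_id in self._busy:
                return None
            self._busy.add(account_id)
        try:
            # Non-blocking acquire: shed load instead of queueing more requests.
            if not self._slots.acquire(blocking=False):
                return None
            try:
                return self._fetch(account_id)
            finally:
                self._slots.release()
        finally:
            with self._guard:
                self._busy.discard(account_id)
```

A skipped check simply returns `None`; in this sketch the caller is expected to retry on its next scheduled run rather than immediately.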

André Dec 5, 2022 1:21 PM:

Actually I see it now.

Let me do some digging on my end to figure out what happened. This is bizarre. Is there a deadline by which you need a definitive answer?

André Dec 5, 2022 1:28 PM:

Never mind, I got it now.

From what I gather, we had a huge influx of new accounts (via API calls) starting at that time. Effectively, every time an account gets UPSERT'ed, we check /message/inbox to grab the last message ID the user has in order to pre-populate that on our end.

Typically this isn't a huge issue, because we don't see spikes like these, but I'm going to dig more into why this happened, and most importantly, ways to prevent it from happening again.

Super sorry about this. I owe you (and your engineers) a beer.
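One possible way to keep a burst of account upserts from turning into a burst of /message/inbox requests is to put a token bucket in front of the inbox check. The sketch below only illustrates that general technique and is not the fix Apollo actually shipped; `TokenBucket`, `on_account_upsert`, `check_inbox`, and `deferred` are all hypothetical names.

```python
import time


class TokenBucket:
    """Allows up to `burst` calls at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock  # injectable so the bucket can be tested
        self.last = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def on_account_upsert(account_id, bucket, check_inbox, deferred):
    """Check the inbox now if budget allows; otherwise defer the account."""
    if bucket.try_acquire():
        check_inbox(account_id)
    else:
        deferred.append(account_id)  # to be drained later by a scheduled job
```

With a bucket such as `TokenBucket(rate_per_sec=50, burst=100)` (numbers picked arbitrarily), a sudden influx of new accounts would produce at most a bounded number of immediate inbox checks, with the rest spread out over time.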

Reddit Dec 5, 2022 2:59 PM:

Hi André,

I appreciate you investigating it so quickly. I've passed along the information, and I'll let you know if our team has any questions or suggestions.

Thanks!

end of email chain

@iamvinny

sopa de macaco

@snowmannishboy

someone tell spez, this sure as hell isn't what Aaron wanted for reddit.
