@carlhannes
Last active August 17, 2024 10:39
sj-booking-fix.js
// Paste the code below into your web browser's console and press "enter"
// In most browsers you can open the console by pressing "F12" or "Ctrl + Shift + J".
// Read more here: https://appuals.com/open-browser-console/
// Instructions video on my twitter: https://twitter.com/_carlhannes/status/1590441813445599232
// The code re-tries a request if it gets status 429 ("Too Many Requests"), which is the error the SJ page returns
// It does this with an exponential back-off delay, which is a common pattern for microservices of this type
// Because of these re-tries and the delay, the overall load on the website and the servers will be lower,
// since requests that actually succeed do not need to be fetched again. Read more on my twitter if you're interested:
// https://twitter.com/_carlhannes/status/1590605735314206721
// I do not have a soundcloud but I heard the artist Yoyo Xno on Spotify is really good:
// https://open.spotify.com/artist/0AZYrtDySQn1YqHbNzYWib?si=1335f829215942b2
window._fetch = window.fetch;
window.fetch = (...args) => {
  return new Promise((resolve, reject) => {
    const maxRetries = 10;
    let retries = 0;
    const retry = () => {
      window._fetch(...args)
        .then((data) => {
          if (data.status === 429 && retries < maxRetries) {
            // Got 429 Too Many Requests: wait with exponential back-off, then try again
            retries++;
            setTimeout(retry, 500 * Math.pow(2, retries));
          } else {
            resolve(data);
          }
        })
        .catch(reject);
    };
    retry();
  });
};
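// If you want to undo the patch without reloading the page: the original fetch is saved
// in window._fetch above, so restoring it is just:
//   window.fetch = window._fetch;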
// Add this code as the "URL" of a bookmark to quickly apply the booking fix in your browser.
// Good if you want to help parents, grandparents etc.: just tell them to click the bookmark before booking.
// Kudos to @_genesis_ai_ https://twitter.com/_genesis_ai_/status/1590778385760157696
javascript:q=window.fetch;window.fetch=(...a)=>new Promise((r,_,R=0,T=U=>q(...a).then(d=>d.status==429?setTimeout(T,500*(2**++R)):r(d)))=%3ET());
@carlhannes
Author

poor servers :)

This actually eases the pressure on the servers. The code is written with exponential back-off, which is a common method when working with this type of service. There is more info in my Twitter thread:
https://twitter.com/_carlhannes/status/1590605735314206721?s=20&t=XXMAzVorQ7bp-2YCKP1UFw

@nickeforsberg

Wihooo!!

@Hakeemsm20

Very smart :)

@recumbentbirder

Correct me if I'm wrong, but I think your script still has a problem. It worked for now because not that many people use it, and most users get 'taken out' by SJ's faulty handling. But once 'your' users become numerous enough to provoke a 429 themselves -- because too many of them sent a request at the same time -- they will all retry again at the same time, and so on.
Ethernet prevents such problems in its collision detection and avoidance by adding some randomness to the delay value. Would that be an idea, or is it just not necessary for some reason I don't see right now?

Oh, and please feel free to answer in Swedish if you like -- for me it's just quicker to write in English ;-)

@carlhannes
Author

carlhannes commented Dec 13, 2022

@recumbentbirder Sorry, I missed this comment, but yeah, there are loads of parameters to account for here and I'm not surprised that SJ missed this possible "easy fix". I'm going to record a video on how I analysed the problem and what my conclusions were, since you not only have to take into account how the servers and clients behave technically, but also the user behaviour, etc.

The fix above is essentially just one step better than doing nothing at all (like they were doing), simply because the "429 Too Many Requests" was throttled not on IP, session or anything else, but just "globally" on all traffic. My theory is that they just put a proxy between their servers and the price calculation engine to "keep it alive", and when they saw that the servers didn't die repeatedly they just went home.

In practice, their "fix" (which they themselves said blocked about 20% of the traffic to the servers) resulted in the client not behaving as expected, so users refreshed their browsers repeatedly until they could book, causing the price calculation service to be called again for departures that had already been calculated. As I said, they did not throttle on anything that I could reproduce (i.e. session id, IP or anything similar), and they did not return any header like Retry-After that I could abide by.

Hence I made it re-try 429 responses, but to prevent overloading the servers I added an exponential back-off delay to prevent exactly the thing you were talking about, i.e. a "mass-DDOS" from different clients. In practice, since there was throttling between the servers and the actual price calculation engine, that wouldn't be a problem even with a regular back-off delay; users would just get longer load times. The exponential back-off also starts at 1s, which means it's very easy on the servers; normally for these types of services you'd probably start at something like 250ms.
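For reference, a quick way to see the delay schedule produced by the wrapper above, which computes 500 * Math.pow(2, retries) after incrementing retries:

// Prints the wait before each retry: 1s, 2s, 4s, ... up to 512s at the 10th and final retry
for (let retries = 1; retries <= 10; retries++) {
  console.log(`retry ${retries}: ${(500 * Math.pow(2, retries)) / 1000}s`);
}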

A rough comparison between my fix and a user refreshing the browser, presuming a 20% loss of traffic: for every 50 requests there would be about 52 requests with my fix and 60-70 by "manually" refreshing the browser. So the overall load on the servers would be lower either way.

But this was just a 13-minute-or-so fix and it's "good enough" for what it was. If you wanted to put this into production you would probably want to iterate on it further, especially to prevent the possible danger you mentioned in your comment. One idea would be to add a jitter effect on the delays to prevent exactly that. But there are also plenty of things you could do on the back-end side that would probably be a better solution, especially if there's a "hard limit" on how many requests the price calculation service can handle: limiting by IP or session, and adding a Retry-After header (and updating the client to honour it).
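As a rough illustration of that idea, here is a minimal sketch of the same wrapper extended with random jitter and Retry-After support. The 50% jitter factor and the header handling are illustrative choices, not part of the original fix:

// Sketch: retry wrapper with random jitter and (hypothetical) Retry-After support
window._fetch = window._fetch || window.fetch; // keep the real fetch if it was already saved
window.fetch = (...args) => new Promise((resolve, reject) => {
  const maxRetries = 10;
  let retries = 0;
  const retry = () => {
    window._fetch(...args)
      .then((response) => {
        if (response.status === 429 && retries < maxRetries) {
          retries++;
          // Use the server's Retry-After value (in seconds) if it ever sends one...
          const retryAfter = Number(response.headers.get('Retry-After'));
          // ...otherwise exponential back-off plus up to 50% random jitter,
          // so clients that were throttled at the same moment don't all retry in sync
          const backoff = 500 * Math.pow(2, retries);
          const delay = retryAfter > 0 ? retryAfter * 1000 : backoff * (1 + Math.random() * 0.5);
          setTimeout(retry, delay);
        } else {
          resolve(response);
        }
      })
      .catch(reject);
  };
  retry();
});

The jitter spreads out the retries of clients that got throttled together, which is the same idea as the randomised back-off in Ethernet's collision handling mentioned above.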

@recumbentbirder
Copy link

Hehe, yes. -- or consider something really revolutionary outside all IT tech: prevent the rush on the servers by getting the train system in order and providing more trains :-D
