At Logistics Algorithms we use many machine learning models to predict time estimates for parts of the order journey, whether before, while, or after an order is placed. Everything we roll out is first tested through a traditional, if slightly elaborate, A/B testing setup (a blog post for some other time).
Sometimes we want to test a new model in a particular geographical region; other times we want a way to toggle between models over time (A/B testing), or we simply want to quickly disable a particular model everywhere. While we use and advocate global feature flags, we also need these more refined capabilities.
You can implement this pattern with the datastore of your choice: store a setting attribute that dictates which model should be used (in our case, for example, we might store it as a "Zone" attribute so we can test zones separately). The default setting should always be the "NoOp" scenario, which just returns a "no-op" value.
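As a rough illustration of the pattern described above, here is a minimal Python sketch. It is not our production code: the in-memory `ZONE_SETTINGS` dict stands in for whatever datastore you choose, the model names (`baseline`, `candidate`), the `distance_km` feature, and the choice of `None` as the "no-op" value are all hypothetical and only there to make the example runnable.

```python
from typing import Callable, Dict, Optional

# A model takes order features and returns a time estimate in minutes,
# or None when it has nothing to say (the "no-op" value in this sketch).
Model = Callable[[Dict[str, float]], Optional[float]]

NOOP = "NoOp"


def noop_model(features: Dict[str, float]) -> Optional[float]:
    """Default scenario: make no prediction at all."""
    return None


def baseline_model(features: Dict[str, float]) -> Optional[float]:
    """Hypothetical existing model."""
    return 10.0 + 2.0 * features.get("distance_km", 0.0)


def candidate_model(features: Dict[str, float]) -> Optional[float]:
    """Hypothetical new model under test."""
    return 8.0 + 2.5 * features.get("distance_km", 0.0)


# Registry of the models that a setting is allowed to point at.
MODELS: Dict[str, Model] = {
    NOOP: noop_model,
    "baseline": baseline_model,
    "candidate": candidate_model,
}

# Stand-in for the datastore: one setting per zone (assumed layout).
ZONE_SETTINGS: Dict[str, str] = {
    "zone-a": "baseline",
    "zone-b": "candidate",
}


def select_model(zone_id: str, settings: Dict[str, str]) -> Model:
    """Return the model configured for a zone, falling back to NoOp."""
    name = settings.get(zone_id, NOOP)
    # An unknown model name also falls back to NoOp rather than failing.
    return MODELS.get(name, noop_model)


def estimate(
    zone_id: str, features: Dict[str, float], settings: Dict[str, str]
) -> Optional[float]:
    """Run whichever model the zone's setting dictates."""
    return select_model(zone_id, settings)(features)
```

With this shape, testing a model in one region means changing that zone's setting, toggling between models over time means rewriting the setting on a schedule, and disabling a model everywhere means pointing every zone back at `NoOp`. Callers need to handle the "no-op" value explicitly, for instance by skipping the estimate or using a fallback.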