You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A primer/refresher on the category theory concepts that most commonly crop up in conversations about Scala or FP. (Because it's embarassing when I forget this stuff!)
I'll be assuming Scalaz imports in code samples, and some of the code may be pseudo-Scala.
PoC for approximating the median of a Stream via stochastic averaging in Redis with Lua
Approximating the median of a Stream via stochastic averaging
Often it is useful to have access to the median value for fields of a data stream since they are more robust with respect to outliers.
The median is defined as the value of a dataset such that, when sorted, 50% of the data is smaller than the value and 50% of the data is larger then the value. Ordinarily this is difficult to calculate on a stream because it requires the collection and sorting of all data.
The median of a data stream can be approximated with a technique called stochastic averaging. To approximate the median value of a data stream one could use the following approach:
Given the current estimate of the median M. If the next observed value in the stream is larger than M, increase the current estimate by r (= the learning rate). If it is smaller, decrease the estimate by r. When M is close to the median, it increases as often as it decreases, and therefore it stabilizes.
This approach was taken from the book "Real-time Analytics -
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters