@jooyunghan
Last active February 19, 2016 07:06
K-means for [(d,d)] - slightly improved for readability
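{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
module Lib
    (
      kMeans
    ) where

import GHC.Exts (groupWith)
import Data.List (transpose, sort)
import Data.List.Split (chunksOf)
import Debug.Trace (trace)

type Pt = (Double, Double)

-- Euclidean distance between two points.
dist :: Pt -> Pt -> Double
dist (a, b) (c, d) = sqrt $ (a - c) ^ 2 + (b - d) ^ 2

-- Compose a one-argument function after a two-argument one.
(.:) :: (c -> d) -> (a -> b -> c) -> a -> b -> d
(.:) = (.) (.) (.)

-- Print a value to stderr as a side effect of evaluating it.
dbg :: Show a => a -> a
dbg x = trace (show x) x

-- kMeans xs k n: cluster xs into k groups, running at most n iterations (n >= 1).
kMeans :: [Pt] -> Int -> Int -> [Pt]
kMeans xs k n = within same $ take n $ iterate (dbg . centroids . flip cpartition xs) $ dbg $ centroids $ kpartition k xs
  where
    -- Two centroid lists agree when every pair is closer than 0.1.
    same = all (<0.1) .: zipWith dist
    centroids = sort . fmap center
    nearest cs v = snd $ minimum [(dist c v, c) | c<-cs]
    -- Group points by their nearest centroid.
    cpartition cs xs = groupWith (nearest cs) xs
    -- Initial partition: deal points round-robin into n groups.
    kpartition n xs = transpose $ chunksOf n xs
    center points = (xsum / l, ysum / l)
      where xsum = sum $ fmap fst points
            ysum = sum $ fmap snd points
            l = fromIntegral $ length points
    -- Return the first centroid list that matches its successor, or the last one.
    within _ [x] = x
    within same (x:y:xs) | same x y = x
    within same (x:xs) = within same xs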
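The gist depends on the `split` package and prints every intermediate centroid list through `Debug.Trace`. For trying the idea in GHCi with only `base`, a trace-free variant might look like the sketch below. The names `kMeansPure`, `roundRobin`, and `step` are hypothetical and not part of the gist; `roundRobin` stands in for `transpose . chunksOf k`, and the sketch assumes `k >= 1` and a non-empty input list.

```haskell
import Data.List (sort, transpose)
import GHC.Exts (groupWith)

type Pt = (Double, Double)

-- Euclidean distance between two points.
dist :: Pt -> Pt -> Double
dist (a, b) (c, d) = sqrt ((a - c) ^ 2 + (b - d) ^ 2)

-- Mean of a non-empty cluster.
center :: [Pt] -> Pt
center ps = (sum (map fst ps) / l, sum (map snd ps) / l)
  where l = fromIntegral (length ps)

-- Deal the points round-robin into k groups (assumes k >= 1).
-- Hypothetical stand-in for `transpose . chunksOf k` from the split package.
roundRobin :: Int -> [a] -> [[a]]
roundRobin k xs = transpose (chunks xs)
  where
    chunks [] = []
    chunks ys = let (h, t) = splitAt k ys in h : chunks t

-- One iteration: assign every point to its nearest centroid,
-- then recompute and sort the centroids.
step :: [Pt] -> [Pt] -> [Pt]
step xs cs = sort (map center (groupWith nearest xs))
  where nearest v = snd (minimum [(dist c v, c) | c <- cs])

-- Hypothetical pure counterpart of kMeans: same argument order (points,
-- cluster count, iteration cap), same 0.1 convergence threshold, no tracing.
kMeansPure :: [Pt] -> Int -> Int -> [Pt]
kMeansPure xs k n = go (take (max 1 n) (iterate (step xs) start))
  where
    start = sort (map center (roundRobin k xs))
    go [c] = c
    go (c : rest@(c' : _))
      | and (zipWith (\p q -> dist p q < 0.1) c c') = c
      | otherwise = go rest
    go [] = start
```

With the six points `[(0,0),(0,1),(1,0),(10,10),(10,11),(11,10)]`, `kMeansPure pts 2 10` should settle on two centroids near `(0.33, 0.33)` and `(10.33, 10.33)`. Like the gist, the sketch compares centroid lists pairwise with `zipWith`, so if a cluster empties out and the lists differ in length, the extra centroids are ignored by the convergence check.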