Write a k-medians clustering algorithm #142

Boarders · 2019-06-12T16:50:32Z

Currently our clustering pass uses the algorithms in AI.Clustering.Hierarchical. As it stands there is no easy way to make use of the k-means algorithm within this library.

We should write a k-medians algorithm that runs in linear time and has good convergence properties (it should also expose a good amount of parallelism). This will entail writing a k-medians function for each of our different character types, i.e. functions that look something like:

kMedian :: (Foldable t, Traversable t) => t CharacterType -> CharacterType
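As a rough illustration of the shape above, here is a minimal sketch for one character type. The `Continuous` newtype and the name `kMedianContinuous` are hypothetical stand-ins, not types from this library. The sketch returns a `Maybe` so that empty input is handled explicitly, and it sorts the input, so it is O(n log n) rather than linear; a selection algorithm would be needed to meet the linear-time goal.

```haskell
import Data.Foldable (toList)
import Data.List (sort)

-- Hypothetical stand-in for a continuous character type.
newtype Continuous = Continuous Double
  deriving (Eq, Ord, Show)

-- Median of a collection of continuous characters.
-- Returns Nothing for an empty collection. For an even number of
-- elements the lower of the two middle values is returned, which
-- still minimises the sum of absolute deviations.
kMedianContinuous :: Foldable t => t Continuous -> Maybe Continuous
kMedianContinuous xs =
  case sort (toList xs) of
    []     -> Nothing
    sorted -> Just (sorted !! ((length sorted - 1) `div` 2))
```

For example, `kMedianContinuous [Continuous 3, Continuous 1, Continuous 2]` evaluates to `Just (Continuous 2.0)`.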

In the case of a discrete character this function will look like an inclusion-exclusion formula, and we should be able to write similar functions for the other character types.
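For comparison, here is a sketch of the simplest discrete case, assuming each observation is a single unambiguous state and the cost of a mismatch is 1. Under that assumption the median is just the most frequent state. The name `kMedianDiscrete` is hypothetical, and this does not attempt the inclusion-exclusion formulation mentioned above.

```haskell
import Data.Foldable (toList)
import Data.List (group, sort)

-- Median of single-state discrete characters under a 0/1 mismatch
-- cost: the most frequent state minimises the number of mismatches.
-- Returns Nothing for an empty collection. Ties are broken in
-- favour of the larger state (by the Ord instance).
kMedianDiscrete :: (Foldable t, Ord a) => t a -> Maybe a
kMedianDiscrete xs =
  case group (sort (toList xs)) of
    []   -> Nothing
    runs -> Just (snd (maximum [ (length r, x) | r@(x:_) <- runs ]))
```

Like the sort-based approach in general, this is O(n log n); counting into a map or array keyed by state would bring it closer to linear for a fixed alphabet.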

This would then be added as a Clustering option (currently asking for median clustering gives an error explaining that this is not yet implemented).

Boarders · 2019-07-10T19:49:56Z

This commit gives a version of k-medians that essentially uses Lloyd's algorithm for refinement and RGC (centroids of random subsamples) for initialisation. It is abstract enough that it would work as an implementation of either k-medians or k-means. It also exposes some parallelism in how assignment is done for each cluster, but finding the correct way to chunk the input will require experimentation.
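The general idea can be sketched as follows. This is not the code from the commit: all names here are hypothetical, and the initialisation uses deterministic strided subsamples as a stand-in for random ones so that the example needs no extra packages. The refinement is parameterised by a distance and a "centre" function, so passing a median gives k-medians and passing a mean gives k-means.

```haskell
import Data.List (minimumBy, sort)
import Data.Ord (comparing)

-- One Lloyd-style step: assign every point to its nearest centre,
-- then recompute each centre from the points assigned to it.
-- A centre with no assigned points is left unchanged.
lloydStep :: (a -> a -> Double) -> ([a] -> a) -> [a] -> [a] -> [a]
lloydStep dist centre centres points =
  [ if null members then c else centre members
  | (i, c) <- indexed
  , let members = [ p | p <- points, nearest p == i ]
  ]
  where
    indexed   = zip [0 :: Int ..] centres
    nearest p = fst (minimumBy (comparing (dist p . snd)) indexed)

-- Repeat the step until the centres stop changing or the
-- iteration budget is exhausted.
lloyd :: Eq a => Int -> (a -> a -> Double) -> ([a] -> a) -> [a] -> [a] -> [a]
lloyd maxIters dist centre = go maxIters
  where
    go n centres points
      | n <= 0 || next == centres = centres
      | otherwise                 = go (n - 1) next points
      where next = lloydStep dist centre centres points

-- Initial centres from k subsamples. Assumption: subsample i takes
-- every k-th point starting at offset i (a deterministic stand-in
-- for random subsampling). Empty subsamples are skipped.
initCentres :: Int -> ([a] -> a) -> [a] -> [a]
initCentres k centre points =
  [ centre sub
  | i <- [0 .. k - 1]
  , let sub = [ p | (j, p) <- zip [0 :: Int ..] points, j `mod` k == i ]
  , not (null sub)
  ]

-- Example centre functions for one-dimensional data.
-- Both are only ever called on non-empty lists by the code above.
median1D :: [Double] -> Double
median1D xs = sort xs !! ((length xs - 1) `div` 2)

mean1D :: [Double] -> Double
mean1D xs = sum xs / fromIntegral (length xs)
```

With `dist x y = abs (x - y)`, calling `lloyd 10 dist median1D (initCentres 2 median1D pts) pts` runs k-medians and swapping in `mean1D` runs k-means. The assignment of each point in `lloydStep` is independent of the others, which is where the parallelism mentioned above would come from.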

Still left on this issue is to define a k-medians function on CharacterSequences.

Boarders added enhancement low priority labels Jun 12, 2019

Boarders added this to the Version 1.0.0 milestone Jun 12, 2019

Boarders assigned recursion-ninja and Boarders Jun 12, 2019
