Having some background in computational geometry, I'm especially fond of geometric algorithms. The closest pair of points problem simply states that given a set of n points, we need to find the closest pair of points.
Let's consider the planar case of two-dimensional points. The naive solution would be to find the closest pair among all the possible pairs, which amounts to n^2 pairs. A clever observation reveals that the problem can be solved in O(n log n) using a divide and conquer approach. We recursively split the set to a left and right sets along the vertical median x point, and find the closest pair on each side, up until the trivial case of n <= 3. The trick is in the merge stage, in which we need to find if there is a closer pair where each point is from a different set (the left and the right sets). Instead of checking all possible pairs (which again would amount to n^2), it appears that each point from one sides, needs to be evaluated against up to six candidates point from the other side.
A complete explanation of the algorithm is beyond the scope of this post, as I would like to focus the discussion on Haskell. Since this is a divide-and conquer algorithm, a functional language like Haskell allows to implement it in a rather strait forward way. Well, at least part of it ;). I used this link as a pseudo code reference for the algorithm. So, I begin with a small convenient function, which returns the median point of a list points, according to the x coordinate:
Next, comes the main function, which takes a list of points and creates two sorted lists, one based on the x coordinate and one on the y coordinate and pass those lists to the main recursive function (with some initial very large minimum distance and a fictional pair)
The main recursive function, is an almost one to one mapping of the pseudo code
Lastly, I defined a function for the merge stage, which checks for pairs in the current delta (the minimum distance found thus far)
In order that I would be able to verify the correctness of the fast algorithm, I implemented also the naive version, which compares all possible pairs
Finally, a small main function which creates a random set of 200 points, and run both algorithms
Here is an example run of the program
Monday, November 29, 2010
Subscribe to:
Post Comments (Atom)
Data.List provides (!!) which returns an element of a list at a given point, so you could replace:
ReplyDeletemedian pts = last $ take (((length pts)+1) `div` 2) pts
with:
median pts = pts !! (((length pts)+1) `div` 2)
@Jonno - thanks for expanding my Haskell vocabulary.
ReplyDelete