Posts
15
31
0
January 2010 Entries
A faster method for solving the closest-pair problem

As is often the case, we’ll speed up the solver by using the divide-and-conquer method.  We’ll split the graph down the middle, solve the problem for the left side, solve the problem for the right side, and we’ll be done!

How do we solve the problem for the left side?  Easy!  We’ll split the graph down the middle, solve the problem for the left side, solve the problem for the right side, and we’ll be done!

This technique is called recursion.  The key thing to understand about recursion in general is that there has to be a stopping point.  For example, a function to calculate a factorial might look something like this:

int Factorial(int i)

{

return Factorial(i - 1) * i;

}

But this function won’t work, because it will continually call itself over and over until the machine can’t handle it any more.  So, we’ll just add a stopping point to say that the factorial value of 1 is, yes, 1.

int Factorial(int i)

{

if (i == 1) return 1;

return Factorial(i - 1) * i;

}

When will we stop our algorithm? It's a pretty fair bet that if there are three points or less left to be analyzed, the brute-force method will work just as well as any other, so we'll use that as a limit. So, here's what we have so far:

Segment Closest_Recursive(List<PointF> points)

{

if (points.Count() < 4) return Closest_BruteForce(points);

int split = points.Count() / 2;

var ordered = points.OrderBy(point => point.X);     // sort the points by their x value

var pointsOnLeft = ordered.Take(split).ToList();    // split into two groups, those to the

var pointsOnRight = ordered.Skip(split).ToList();   // left and those to the right

var leftMin = Closest_Recursive(pointsOnLeft);

var rightMin = Closest_Recursive(pointsOnRight);

if (leftMin.Length() < rightMin.Length())

return leftMin;

else

return rightMin;

}

But there’s a snag.  Do you see it?

Suppose as we recurse along we end up drawing one of our lines like this:

Now our closest pair is neither on the right or left.  It crosses between the two.  So it seems like we still have to examine each point to see if it’s closer to a point on the other side than it is to any points on its side.

But, we have some additional information to work with now.  We know the distance between the two closest points that don’t cross the line.  That means that if there are any crossing pairs that are closer, they will have to be at least that close to the line.  Let’s find all the points on the right-hand side that are that close:

float minDist = Math.Min(leftMin.Length(), rightMin.Length());

var xDivider = pointsOnLeft.Last().X;

var closeY = pointsOnRight.TakeWhile(point => point.X - xDivider < minDist).OrderBy(point => point.Y);

It’s handy the points are sorted by X value – it allows us to stop our search once we get too far away from the line.  And, note that closeY is sorted by the Y values now.  This allows us to analyze only a small number of points on the right for each point on left.  To finish off, we’ll take all the points on the left that are close to the line, and for each one, only examine the points on the right that are closer than minDist in the Y direction.

var crossingPairs = pointsOnLeft.SkipWhile(point => xDivider - point.X > minDist)

.SelectMany(p1 => closeY.SkipWhile(i => i.Y < p1.Y - minDist)

.TakeWhile(i => i.Y < p1.Y + minDist)

.Select(p2 => new Segment(p1, p2)));

And take the shortest distance out of everything.

return crossingPairs.Union(new[] { leftMin, rightMin })

.OrderBy(segment => segment.Length()).First();

Is it faster?  Well, I posted some speed comparisons on RosettaCode.  You can see there that for 10,000 points on the graph, this method ran in slightly more than a second.  The brute-force method?  Nearly two and a half minutes.  Choosing the right algorithm can make all the difference in the world :)

Source code available here: http://github.com/benfulton/Algorithmic-Alley

Author: Ben Fulton
posted @ Sunday, January 31, 2010 4:31 PM | Feedback (0)
Solving the closest-pair problem

The first problem we’ll analyze comes from computational geometry, which is a very formal way of saying that we need to do some analysis on a graph.  So, given a graph of points, like this:

We’re going to analyze the list of points to decide which two are closest together.  Eyeballing it, we can see that the closest two are the pair in red.

So how do we find the closest pair algorithmically?  Easy –we’ll simply find every pair of points, sort them in order of distance, and take the mininum pair.

Oddly, the .Net framework doesn’t have any sort of Segment class, so first we’ll define our own:

public class Segment

{

public Segment(PointF p1, PointF p2)

{

P1 = p1;

P2 = p2;

}

public float Length()

{

return (float)Math.Sqrt(LengthSquared());

}

public float LengthSquared()

{

return (P1.X - P2.X) * (P1.X - P2.X)

+ (P1.Y - P2.Y) * (P1.Y - P2.Y);

}

public override bool Equals(object obj)

{

var other = obj as Segment;

if (other == null) return false;

return P1 == other.P1 && P2 == other.P2;

}

public override string ToString()

{

return String.Format("{0} - {1}", P1, P2);

}

}

And solve the problem.

public class ClosestPointSolver

{

public Segment Closest_BruteForce(List<PointF> points)

{

int n = points.Count;

var allPairs = Enumerable.Range(0, n - 1)

.SelectMany(i => Enumerable.Range(i + 1, n - (i + 1))

.Select(j => new Segment(points[i], points[j])));

return allPairs.OrderBy(seg => seg.LengthSquared())

.First();

}

}

Is this a good solution?  Absolutely not.  Looking at every single pair of points means that number of comparisons we have to do doubles every time we add another point.  Going from 20 to 21 points still isn’t a big deal, but if you go from 100,000 to 100,001 points you’ll notice a significant difference.  (It’s O(n2) in standard parlance, which I bet you know if you’re reading this, although I’ll probably cover it later in any case.)

So can we improve it?  You bet.  Next time.

(Image thanks to Wikimedia Commons http://commons.wikimedia.org/wiki/File:ClosestPair.png )

Author: Ben Fulton
posted @ Tuesday, January 26, 2010 9:15 PM | Feedback (3)
Algorithmic Alley

Welcome to Algorithmic Alley!  I intend for this site to be a crossover site that fuses the intellectuals that are out there creating new algorithms, and the engineers that barely have time to learn anything about algorithms at all because they're too busy creating all the cool content we see on the web.

To that end, I'll be posting working demonstrations of algorithms that I hope will be more interesting or unusual than what you see every day, I'll post code snippets along with explanations on the site, and a working application in github that actually runs the algorithms.

To my mind, the language that finds the best balance between functional and procedural coding styles is C# with LINQ, so I'll be writing most of my code samples in that language.  But it's not impossible that Ruby or Groovy may rear their heads to be noticed, and certainly if you have a sample or algorithm you'd like to demonstrate in any language at all, drop me a line!  The only requirement is that it has to have code.  This site isn't going to be about theory.

Enjoy!

Author: Ben Fulton
posted @ Wednesday, January 20, 2010 10:48 PM | Feedback (1)