Closest-pair problem: Difference between revisions

Content added Content deleted

Inline

Revision as of 15:34, 11 May 2009

This page uses content from Wikipedia. The original article was at Closest-pair problem. The list of authors can be seen in the page history. As with Rosetta Code, the text of Wikipedia is available under the GNU FDL. (See links for details on variance)

The aim of this task is to provide a function to find the closest two points among a set of given points in two dimensions, i.e. to solve the Closest pair of points problem in the planar case.

The straightforward solution is a O(n²) algorithm (which we can call brute-force algorithm); the pseudocode (using indexes) could be simply:

bruteForceClosestPair of P(1), P(2), ... P(N)
if N < 2 then
  return ∞
else
  minDistance ← |P(1) - P(2)|
  foreach i ∈ [1, N-1]
    foreach j ∈ [i+1, N]
      if |P(i) - P(j)| < minDistance then
        minDistance ← |P(i) - P(j)|
      endif
  return minDistance
 endif'

A better algorithm is based on the recursive divide&conquer approach, as explained also at Wikipedia, which is O(n log n); a pseudocode could be:

closestPair of P(1), P(2), ... P(N)
if N ≤ 3 then
  return closest points of P using brute-force algorithm
else
  xP ← P ordered by the x coordinate, in ascending order
  P_L ← points of xP from 1 to ⌈N/2⌉
  P_R ← points of xP from ⌈N/2⌉+1 to N
  d_L ← closestPair of P_L
  d_R ← closestPair of P_R
  d_min ← min{d_L, d_R}
  x_M ← xP(⌈N/2⌉)
  S ← { p ∈ xP : |x_M - p_x| < d_min }
  yP ← S ordered by the y coordinate, in ascending order
  closest ← ∞
  nP ← number of points in yP
  closest ← d_min
  for i from 1 to nP - 1
    k ← i + 1
    while k ≤ nP and yP(k)_y - yP(k)_y < d_min
      closest ← min{|yP(k) - yP(i)|, closest}
      k ← k + 1
    endwhile
  return closest
endif

References

Smalltalk

Works with: GNU Smalltalk

These class methods return a three elements array, where the first two items are the points, while the third is the distance between them (which having the two points, can be computed; but it was easier to keep it already computed in the third position of the array).

<lang smalltalk>Object subclass: ClosestPair [

 ClosestPair class >> raiseInvalid: arg [
     SystemExceptions.InvalidArgument
       signalOn: arg
       reason: 'specify at least two points'
 ]

 ClosestPair class >> bruteForce: pointsList [ |dist from to points|
   (pointsList size < 2) ifTrue: [ self raiseInvalid: pointsList ].
   points := pointsList asOrderedCollection.
   from := points at: 1. to := points at: 2.
   dist := from dist: to.
   [ points isEmpty ]
   whileFalse: [ |p0|
     p0 := points removeFirst.
     points do: [ :p |
       ((p0 dist: p) <= dist)
       ifTrue: [ from := p0. to := p. dist := p0 dist: p. ]
     ]
   ].
   ^ { from. to. from dist: to }
 ]

 ClosestPair class >> orderByX: points [
   ^ points asSortedCollection: [:a :b| (a x) < (b x) ]
 ]
 ClosestPair class >> orderByY: points [
   ^ points asSortedCollection: [:a :b| (a y) < (b y) ]
 ]

 ClosestPair class >> splitLeft: pointsList [ |s|
   s := pointsList size // 2.
   s odd ifTrue: [ s := s + 1 ].
   ^ pointsList copyFrom: 1 to: s
 ]
 ClosestPair class >> splitRight: pointsList [ |s|
   s := (pointsList size) - (pointsList size // 2).
   s odd ifTrue: [ s := s - 1 ].
   ^ pointsList copyFrom: s to: (pointsList size).
 ]

 ClosestPair class >> minBetween: a and: b [
   (a at: 3) < (b at: 3)
     ifTrue: [ ^a ]
     ifFalse: [ ^b ]
 ]

 ClosestPair class >> recursiveDAndC: pointsList [
   |orderedByX pR pL minL minR minDist middleVLine joiningStrip tDist|
   (pointsList size < 2) ifTrue: [ self raiseInvalid: pointsList ].
   (pointsList size <= 3)
     ifTrue: [ ^ self bruteForce: pointsList ].
   orderedByX := self orderByX: pointsList.
   pR := self splitLeft: orderedByX.
   pL := self splitRight: orderedByX.
   minR := self recursiveDAndC: pR.
   minL := self recursiveDAndC: pL.
   minDist := self minBetween: minR and: minL.

   middleVLine := (((orderedByX at: (orderedByX size)//2+1) x) +
                  ((orderedByX at: (orderedByX size)//2) x)) / 2.
   joiningStrip := self 
                     orderByY: (pointsList 
                                  select: [ :p |
                                     ((middleVLine - (p x)) abs) < (minDist at: 3) 
                                  ]
                                ).
   tDist := { nil. nil.
              FloatD infinity
            }.
   1 to: (joiningStrip size) do: [ :i |
     ((i-3) max: 1)
       to: ((i+3) min: (joiningStrip size))
       do: [ :j |
         j ~= i ifTrue: [ |t|
           t := (joiningStrip at: i) dist: (joiningStrip at: j).
           t < (tDist at:3 ) 
             ifTrue: [ 
               tDist := { joiningStrip at: i.
                          joiningStrip at: j.
                          t }. 
             ]
         ]
       ]
   ].
   ^ self minBetween: minDist and: tDist
 ]

].</lang>

Testing

<lang smalltalk>|coll cp ran| "Let's use the same seed to be sure of the results..." ran := Random seed: 100.

coll := (1 to: 10000 collect: [ :a |

         Point x: (ran next)*10 y: (ran next)*10 ]).

cp := ClosestPair bruteForce: coll. ((cp at: 3) asScaledDecimal: 7) displayNl.

"or"

cp := ClosestPair recursiveDAndC: coll. ((cp at: 3) asScaledDecimal: 7) displayNl.</lang>

The brute-force approach with 10000 points, run with the time tool, gave

224.21user 1.31system 3:46.84elapsed 99%CPU

while the recursive divide&conquer algorithm gave

4.65user 0.00system 0:05.07elapsed 91%CPU

(Of course these results must be considered relative and taken cum grano salis; time counts also the time taken to initialize the Smalltalk environment, and I've taken no special measures to avoid the system load falsifying the results)

Tcl

Each point is represented as a list of two floating-point numbers, the first being the x coordinate, and the second being the y. <lang Tcl># A trivial helper... proc distance {p1 p2} {

   lassign $p1 x1 y1
   lassign $p2 x2 y2
   expr {
       hypot($x1-$x2, $y1-$y2)
   }

}

proc bruteForceClosestPair {points} {

   if {[package provide Tcl] >= "8.5"} {
       set min Inf
   } else {
       set min [distance [lindex $points 0] [lindex $points 1]]
   }

   for {set i 0} {$i < [llength $points]-1} {incr i} {
       for {set j $i} {[incr j] < [llength $points]} {} {
           set dist [distance [lindex $points $i] [lindex $points $j]]
           if {$dist < $min} {
               set min $dist
           }
       }
   }
   return $min

}

Some testing code:

set points {} for {set n 0} {$n < 10000} {incr n} {

   lappend points [list [expr {10*rand()}] [expr {10*rand()}]]

} set t [time {

   set d [bruteForceClosestPair $points]

}] puts "closest was $d, found in [lindex $t 0] microseconds"</lang>Example output

closest was 0.0006785546466922088, found in 94942709 microseconds

Note that the lindex and llength commands are both O(1).

@@ Line 4: / Line 4: @@
 The straightforward solution is a O(n<sup>2</sup>) algorithm (which we can call ''brute-force algorithm''); the pseudocode (using indexes) could be simply:
- '''bruteForceClosestPair''' of P(1), P(2), ... P(N), N &gt; 1
+ '''bruteForceClosestPair''' of P(1), P(2), ... P(N)
+ '''if''' N &lt; 2 '''then'''
- minDistance ← |P(1) - P(2)|
- '''foreach''' i ∈ [1, N-1]
+   '''return''' ∞
+ '''else'''
-   '''foreach''' j ∈ [i+1, N]
-     '''if''' |P(i) - P(j)| < minDistance '''then'''
+   minDistance ← |P(1) - P(2)|
+   '''foreach''' i ∈ [1, N-1]
-       minDistance ← |P(i) - P(j)|
-     '''endif'''
+     '''foreach''' j ∈ [i+1, N]
- '''return''' minDistance
+       '''if''' |P(i) - P(j)| < minDistance '''then'''
+         minDistance ← |P(i) - P(j)|
+       '''endif'''
+   '''return''' minDistance
+  ''endif'''
 A better algorithm is based on the recursive divide&amp;conquer approach, as explained also at [[wp:Closest pair of points problem#Planar_case|Wikipedia]], which is O(''n'' log ''n''); a pseudocode could be:
- '''closestPair''' of P(1), P(2), ... P(N), N &gt; 1
+ '''closestPair''' of P(1), P(2), ... P(N)
  '''if''' N ≤ 3 '''then'''
    '''return''' closest points of P using brute-force algorithm
  '''else'''
    xP ← P ordered by the x coordinate, in ascending order
-   P<sub>L</sub> ← points of xP from 1 to ⌊N/2⌋
+   P<sub>L</sub> ← points of xP from 1 to ⌈N/2⌉
-   P<sub>R</sub> ← points of xP from ⌊N/2⌋+1 to N
+   P<sub>R</sub> ← points of xP from ⌈N/2⌉+1 to N
-   '''if''' number of points in P<sub>L</sub> '''is''' odd '''then'''
+   d<sub>L</sub> ← ''closestPair'' of P<sub>L</sub>
-     '''append''' xP(⌊N/2⌋+1) '''to''' P<sub>L</sub>
+   d<sub>R</sub> ← ''closestPair'' of P<sub>R</sub>
-   '''endif'''
-   '''if''' number of points in P<sub>R</sub> '''is''' odd '''then'''
-     '''prepend''' xP(⌊N/2⌋) '''to''' P<sub>R</sub>
-   '''endif'''
-   d<sub>L</sub> ← ClosestPair of P<sub>L</sub>
-   d<sub>R</sub> ← ClosestPair of P<sub>R</sub>
    d<sub>min</sub> ← ''min''{d<sub>L</sub>, d<sub>R</sub>}
-   x<sub>M</sub> ← (xP(⌊N/2⌋+1)<sub>x</sub> - xP(⌊N/2⌋)<sub>x</sub>) / 2
+   x<sub>M</sub> ← xP(⌈N/2⌉)
-   S ← { p ∈ P : |x<sub>M</sub> - p<sub>x</sub>| &lt; d<sub>min</sub> }
+   S ← { p ∈ xP : |x<sub>M</sub> - p<sub>x</sub>| &lt; d<sub>min</sub> }
    yP ← S ordered by the y coordinate, in ascending order
    closest ← ∞
    nP ← number of points in yP
+   closest ← d<sub>min</sub>
-   '''for''' i '''from''' 1 '''to''' nP
-     '''for''' j '''from''' ''max''{i-3, 1} '''to''' ''min''{nP, i+3}
+   '''for''' i '''from''' 1 '''to''' nP - 1
-       '''if''' i ≠ j '''then'''
+     k ← i + 1
-         '''if''' |yP(i) - yP(j)| &lt; closest '''then'''
+     '''while''' k ≤ nP '''and''' yP(k)<sub>y</sub> - yP(k)<sub>y</sub> &lt; d<sub>min</sub>
-           closest ← |yP(i) - yP(j)|
+       closest ← ''min''{|yP(k) - yP(i)|, closest}
-         '''endif'''
+       k ← k + 1
-       '''endif'''
+     '''endwhile'''
-     '''endfor'''
+   '''return''' closest
-   '''endfor'''
-   '''return''' ''min''{d<sub>min</sub>, closest}
  '''endif'''
-'''Notes'''
-* The pseudocode works with at least two points. It could be rewritten so that a single point is treated like if we want to know the distance from it and a point at infinity (i.e. the distance is infinity)
 '''References'''
@@ Line 55: / Line 48: @@
 * [http://www.cs.mcgill.ca/~cs251/ClosestPair/ClosestPairDQ.html Closest Pair (McGill)]
 * [http://www.cs.ucsb.edu/~suri/cs235/ClosestPair.pdf Closest Pair (UCBS)]
-* [http://www.cs.wustl.edu/~rlg1/cse502/handouts/fastclosest.pdf Closest Pair (WUStL)]
+* [http://classes.cec.wustl.edu/~cse241/handouts/closestpair.pdf Closest pair (WUStL)]
 =={{header|Smalltalk}}==