Priority queue

From Rosetta Code
Revision as of 23:20, 9 August 2011 by Thundergnat (talk | contribs) (→‎{{header|Perl 6}}: Add Perl 6 example)
Priority queue is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

A priority queue is somewhat similar to a queue, with an important distinction: each item is added to a priority queue with a priority level, and will be later removed from the queue with the highest priority element first. That is, the items are (conceptually) stored in the queue in priority order instead of in insertion order.

Task: Create a priority queue. The queue must support at least two operations:

  1. Insertion. An element is added to the queue with a priority (a numeric value).
  2. Top item removal. Delete the element (or one of the elements) with the current top priority and return it.

Optionally, other operations may be defined, such as peeking (find what current top priority/top element is), merging (combining two priority queues into one), etc.

To test your implementation, insert a number of elements into the queue, each with some random priority. Then dequeue them sequentially; now the elements should be sorted by priority. You can use the following task/priority items as input data:

Priority    Task
  3        Clear drains
  4        Feed cat
  5        Make tea
  1        Solve RC tasks
  2        Tax return

The implementation should try to be efficient. A typical implementation has O(log n) insertion and extraction time, where n is the number of items in the queue. You may choose to impose certain limits, such as a small range of allowed priority levels or a limited capacity. If so, discuss the reasons behind them.
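To illustrate where the O(log n) bound comes from, here is a minimal sketch of an array-backed binary heap in Python. The names `BinaryHeapQueue` and `drain_tasks` are invented for this example, and treating a lower number as higher priority is an assumption, since the task leaves the direction open.

```python
class BinaryHeapQueue:
    """Array-backed binary min-heap; a lower number is served first (assumed)."""

    def __init__(self):
        self._heap = []  # (priority, task) pairs, 0-based indexing

    def __len__(self):
        return len(self._heap)

    def push(self, priority, task):
        # Append at the end, then sift up: at most about log2(n) swaps.
        heap = self._heap
        heap.append((priority, task))
        child = len(heap) - 1
        while child > 0:
            parent = (child - 1) // 2
            if heap[parent][0] <= heap[child][0]:
                break
            heap[parent], heap[child] = heap[child], heap[parent]
            child = parent

    def pop(self):
        # Move the last item to the root, then sift down: about log2(n) swaps.
        heap = self._heap
        if not heap:
            raise IndexError("pop from empty queue")
        top = heap[0]
        last = heap.pop()
        if heap:
            heap[0] = last
            parent = 0
            while True:
                child = 2 * parent + 1
                if child >= len(heap):
                    break
                # Pick the smaller of the two children, if a second one exists.
                if child + 1 < len(heap) and heap[child + 1][0] < heap[child][0]:
                    child += 1
                if heap[parent][0] <= heap[child][0]:
                    break
                heap[parent], heap[child] = heap[child], heap[parent]
                parent = child
        return top


def drain_tasks():
    """Insert the sample tasks, then dequeue them all in priority order."""
    pq = BinaryHeapQueue()
    for priority, task in [(3, "Clear drains"), (4, "Feed cat"), (5, "Make tea"),
                           (1, "Solve RC tasks"), (2, "Tax return")]:
        pq.push(priority, task)
    return [pq.pop() for _ in range(len(pq))]
```

Both `push` and `pop` walk a single root-to-leaf path, so each does work proportional to the height of the tree.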

C

Using a dynamic array as a binary heap. Each element stores an integer priority and a void data pointer. There is no limit on heap size besides integer overflow, although a very large heap will cause a lot of page faults. Supports insertion, extraction, peeking at the top element, merging and clearing.
<lang c>#include <stdio.h>
#include <stdlib.h>

typedef struct { void * data; int pri; } q_elem_t;
typedef struct { q_elem_t *buf; int n, alloc; } pri_queue_t, *pri_queue;

/* first element in array not used to simplify indices */
#define priq_purge(q) (q)->n = 1
#define priq_size(q) ((q)->n - 1)

pri_queue priq_new(int size)
{
    if (size < 4) size = 4;

    pri_queue q = malloc(sizeof(pri_queue_t));
    q->buf = malloc(sizeof(q_elem_t) * size);
    q->alloc = size;
    q->n = 1;

    return q;
}

void priq_push(pri_queue q, void *data, int pri)
{
    q_elem_t tmp, *b;
    int n, m;

    /* grow the buffer when it is full */
    if (q->n >= q->alloc) {
        q->alloc *= 2;
        b = q->buf = realloc(q->buf, sizeof(q_elem_t) * q->alloc);
    } else
        b = q->buf;

    n = q->n++;

    /* append at end, then up heap */
    b[n].data = data;
    b[n].pri = pri;

    while ((m = n / 2)) {
        if (b[n].pri >= b[m].pri) return;
        tmp = b[n]; b[n] = b[m]; b[m] = tmp;
        n = m;
    }
}

/* remove top item. returns 0 if empty. *pri can be null. */
void * priq_pop(pri_queue q, int *pri)
{
    void *out;
    if (q->n == 1) return 0;

    q_elem_t tmp, *b = q->buf;

    out = b[1].data;
    if (pri) *pri = b[1].pri;

    /* pull last item to top, then down heap. also could reclaim memory
       here if used/allocated ratio is low, just realloc() */
    b[1] = b[--(q->n)];

    int n = 1, m;
    while ((m = n * 2) < q->n) {
        /* pick the smaller child; the right child exists only if m + 1 < q->n */
        if (m + 1 < q->n && b[m].pri > b[m + 1].pri) m++;
        if (b[n].pri <= b[m].pri) break;

        tmp = b[m]; b[m] = b[n]; b[n] = tmp;
        n = m;
    }
    return out;
}

/* get the top element without removing it from queue */
void* priq_top(pri_queue q, int *pri)
{
    if (q->n == 1) return 0;
    if (pri) *pri = q->buf[1].pri;
    return q->buf[1].data;
}

/* this is O(n log n), but probably not the best */
void priq_combine(pri_queue q, pri_queue q2)
{
    int i;
    q_elem_t *e = q2->buf + 1;

    for (i = q2->n - 1; i >= 1; i--, e++)
        priq_push(q, e->data, e->pri);
    priq_purge(q2);
}

int main()
{
    int i, p;
    char *c, *tasks[] = {
        "Clear drains", "Feed cat", "Make tea", "Solve RC tasks", "Tax return" };
    int pri[] = { 3, 4, 5, 1, 2 };

    /* make two queues */
    pri_queue q = priq_new(0), q2 = priq_new(0);

    /* push all 5 tasks into q */
    for (i = 0; i < 5; i++)
        priq_push(q, tasks[i], pri[i]);

    /* pop them and print one by one */
    while ((c = priq_pop(q, &p)))
        printf("%d: %s\n", p, c);

    /* put a million random tasks in each queue */
    for (i = 0; i < 1 << 20; i++) {
        p = rand() % 5; /* random index in 0..4 */
        priq_push(q, tasks[p], pri[p]);

        p = rand() % 5;
        priq_push(q2, tasks[p], pri[p]);
    }

    printf("\nq has %d items, q2 has %d items\n", priq_size(q), priq_size(q2));

    /* merge q2 into q; q2 is empty */
    priq_combine(q, q2);
    printf("After merge, q has %d items, q2 has %d items\n",
        priq_size(q), priq_size(q2));

    /* pop q until it's empty */
    i = 0;
    while ((c = priq_pop(q, 0))) i++;
    printf("Popped %d items out of q\n", i);

    return 0;
}</lang>
output:
<lang>1: Solve RC tasks
2: Tax return
3: Clear drains
4: Feed cat
5: Make tea

q has 1048576 items, q2 has 1048576 items
After merge, q has 2097152 items, q2 has 0 items
Popped 2097152 items out of q</lang>
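The comment on `priq_combine` notes that merging by repeated pushes costs O(n log n). One alternative, sketched here in Python rather than C, is to concatenate the two backing arrays and rebuild the heap in one pass; the standard `heapq.heapify` function is documented as running in linear time. The function name `merge_heaps` is hypothetical, and this is an illustration of the idea rather than a drop-in replacement for the C code.

```python
import heapq


def merge_heaps(a, b):
    """Merge heap-ordered list b into heap-ordered list a, leaving b empty."""
    a.extend(b)       # concatenate the backing arrays
    b.clear()         # mirror priq_combine, which leaves the second queue empty
    heapq.heapify(a)  # rebuild the heap invariant in linear time
    return a
```

The same idea would apply to the C version by copying the second buffer after the first and sifting down from the last parent node to the root.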

J

Implementation:

<lang j>coclass 'priorityQueue'

PRI=: QUE=: ''

insert=:4 :0

 p=. PRI,x
 q=. QUE,y
 assert. p -:&$ q
 assert. 1 = #$q
 ord=: \: p
 QUE=: ord { q
 PRI=: ord { p
 i.0 0

)

topN=:3 :0

 assert y<:#PRI
 r=. y{.QUE
 PRI=: y}.PRI
 QUE=: y}.QUE
 r

)</lang>

Efficiency is obtained by batching requests. The batch size for insert is determined by the size of its arguments; the batch size for topN is its right argument.

Example:

<lang j>   Q=: conew'priorityQueue'
   3 4 5 1 2 insert__Q 'clear drains';'feed cat';'make tea';'solve rc task';'tax return'
   >topN__Q 1
make tea
   >topN__Q 4
feed cat
clear drains
tax return
solve rc task</lang>
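For readers unfamiliar with J, the batching approach can be sketched in Python as follows. The class `SortedBatchQueue` and its method names are invented for this illustration; like the J code, it keeps the items sorted by descending priority and re-sorts once per inserted batch.

```python
class SortedBatchQueue:
    """Keeps items sorted by descending priority, as the J version does."""

    def __init__(self):
        self._items = []  # (priority, task) pairs, highest priority first

    def insert(self, priorities, tasks):
        # One sort per batch rather than one heap operation per item.
        if len(priorities) != len(tasks):
            raise ValueError("priorities and tasks must have the same length")
        self._items.extend(zip(priorities, tasks))
        self._items.sort(key=lambda pair: pair[0], reverse=True)

    def top_n(self, n):
        # Remove and return the n highest-priority tasks with a single slice.
        if n > len(self._items):
            raise ValueError("not enough items in the queue")
        taken, self._items = self._items[:n], self._items[n:]
        return [task for _, task in taken]
```

With the sample data, `top_n(1)` followed by `top_n(4)` yields the tasks in the same order as the J session above.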

Java

Java has a PriorityQueue class. It requires either that the elements implement Comparable or that you supply a custom Comparator to compare them.

<lang java>import java.util.PriorityQueue;

class Task implements Comparable<Task> {

   final int priority;
   final String name;
   public Task(int p, String n) {
       priority = p;
       name = n;
   }
   public String toString() {
       return priority + ", " + name;
   }
   public int compareTo(Task other) {
       return priority < other.priority ? -1 : priority > other.priority ? 1 : 0;
   }
   public static final void main(String[] args) {
       PriorityQueue<Task> pq = new PriorityQueue<Task>();
       pq.add(new Task(3, "Clear drains"));
       pq.add(new Task(4, "Feed cat"));
       pq.add(new Task(5, "Make tea"));
       pq.add(new Task(1, "Solve RC tasks"));
       pq.add(new Task(2, "Tax return"));
       while (!pq.isEmpty())
           System.out.println(pq.remove());
   }

}</lang>

output:

1, Solve RC tasks
2, Tax return
3, Clear drains
4, Feed cat
5, Make tea

Perl

There are a few implementations on CPAN. The following uses Heap::Priority:
<lang perl>use 5.10.0;
use strict;
use Heap::Priority;

my $h = new Heap::Priority;

$h->highest_first(); # higher or lower number is more important
$h->add(@$_) for ["Clear drains", 3],
                 ["Feed cat", 4],
                 ["Make tea", 5],
                 ["Solve RC tasks", 1],
                 ["Tax return", 2];

say while ($_ = $h->pop);</lang>
output:
<lang>Make tea
Feed cat
Clear drains
Tax return
Solve RC tasks</lang>
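The highest-number-first ordering selected by `highest_first` can be imitated on top of a min-heap by negating the priorities. The sketch below shows that trick in Python with the standard `heapq` module; the function name `pop_highest_first` is hypothetical, and nothing here describes how Heap::Priority works internally.

```python
import heapq


def pop_highest_first(pairs):
    """Return task names ordered from highest to lowest numeric priority."""
    # heapq is a min-heap, so the negated priority is used as the sort key.
    heap = [(-priority, task) for task, priority in pairs]
    heapq.heapify(heap)
    order = []
    while heap:
        _, task = heapq.heappop(heap)
        order.append(task)
    return order
```

Called with the same (task, priority) pairs as the Perl example, it returns the tasks in the order shown in the output above.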

Perl 6

This is a rather simple implementation. It requires the priority to be a non-negative integer value, with lower values being higher priority. There isn't a hard limit on how many priority levels you can have, though more than a few dozen is probably not practical.

The tasks are stored internally as an array of FIFO buffers, so multiple tasks of the same priority level will be returned in the order they were stored.

<lang perl6>class PriorityQueue {

   has @!tasks is rw;
   method insert ( Int $priority where { $priority >= 0 }, $task ) {
       @!tasks[$priority] //= [];
       @!tasks[$priority].push: $task; 
   }
   method get { @!tasks.first({$^_}).shift }
   method is_empty { !?@!tasks.first({$^_}) }

}

my $pq = PriorityQueue.new;

for (

   3, 'Clear drains',
   4, 'Feed cat',
   5, 'Make tea',
   9, 'Sleep',
   3, 'Check email',
   1, 'Solve RC tasks',
   9, 'Exercise',
   2, 'Do taxes'

) -> $priority, $task {

   $pq.insert( $priority, $task );

}

say $pq.get until $pq.is_empty;</lang>

Output:

Solve RC tasks
Do taxes
Clear drains
Check email
Feed cat
Make tea
Sleep
Exercise
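The array-of-FIFO-buffers layout described above is sometimes called a bucket queue. A minimal Python sketch of the same idea follows; the class name `BucketQueue` is invented for this example, and it assumes, like the Perl 6 code, that priorities are small non-negative integers with lower values served first.

```python
from collections import deque


class BucketQueue:
    """One FIFO bucket per priority level; lower index is served first."""

    def __init__(self):
        self._buckets = []  # index = priority, value = deque of tasks

    def insert(self, priority, task):
        if not isinstance(priority, int) or priority < 0:
            raise ValueError("priority must be a non-negative integer")
        # Grow the bucket list on demand, as @!tasks[$priority] //= [] does.
        while len(self._buckets) <= priority:
            self._buckets.append(deque())
        self._buckets[priority].append(task)

    def get(self):
        # Linear scan over priority levels: cheap only when levels are few.
        for bucket in self._buckets:
            if bucket:
                return bucket.popleft()
        raise IndexError("get from empty queue")

    def is_empty(self):
        return not any(self._buckets)
```

Insertion is constant time apart from growing the list, while `get` costs time proportional to the number of priority levels, which is why a large number of levels is impractical.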

Python

Using PriorityQueue

Python has the class queue.PriorityQueue in its standard library.

The data structures in the "queue" module are synchronized multi-producer, multi-consumer queues for multi-threaded use. They can, however, handle this task:
<lang python>>>> import queue
>>> pq = queue.PriorityQueue()
>>> for item in ((3, "Clear drains"), (4, "Feed cat"), (5, "Make tea"), (1, "Solve RC tasks"), (2, "Tax return")):
    pq.put(item)


>>> while not pq.empty():
    print(pq.get_nowait())


(1, 'Solve RC tasks')
(2, 'Tax return')
(3, 'Clear drains')
(4, 'Feed cat')
(5, 'Make tea')
>>> </lang>

Help text for queue.PriorityQueue

<lang python>>>> import queue
>>> help(queue.PriorityQueue)
Help on class PriorityQueue in module queue:

class PriorityQueue(Queue)

|  Variant of Queue that retrieves open entries in priority order (lowest first).
|  
|  Entries are typically tuples of the form:  (priority number, data).
|  
|  Method resolution order:
|      PriorityQueue
|      Queue
|      builtins.object
|  
|  Methods inherited from Queue:
|  
|  __init__(self, maxsize=0)
|  
|  empty(self)
|      Return True if the queue is empty, False otherwise (not reliable!).
|      
|      This method is likely to be removed at some point.  Use qsize() == 0
|      as a direct substitute, but be aware that either approach risks a race
|      condition where a queue can grow before the result of empty() or
|      qsize() can be used.
|      
|      To create code that needs to wait for all queued tasks to be
|      completed, the preferred technique is to use the join() method.
|  
|  full(self)
|      Return True if the queue is full, False otherwise (not reliable!).
|      
|      This method is likely to be removed at some point.  Use qsize() >= n
|      as a direct substitute, but be aware that either approach risks a race
|      condition where a queue can shrink before the result of full() or
|      qsize() can be used.
|  
|  get(self, block=True, timeout=None)
|      Remove and return an item from the queue.
|      
|      If optional args 'block' is true and 'timeout' is None (the default),
|      block if necessary until an item is available. If 'timeout' is
|      a positive number, it blocks at most 'timeout' seconds and raises
|      the Empty exception if no item was available within that time.
|      Otherwise ('block' is false), return an item if one is immediately
|      available, else raise the Empty exception ('timeout' is ignored
|      in that case).
|  
|  get_nowait(self)
|      Remove and return an item from the queue without blocking.
|      
|      Only get an item if one is immediately available. Otherwise
|      raise the Empty exception.
|  
|  join(self)
|      Blocks until all items in the Queue have been gotten and processed.
|      
|      The count of unfinished tasks goes up whenever an item is added to the
|      queue. The count goes down whenever a consumer thread calls task_done()
|      to indicate the item was retrieved and all work on it is complete.
|      
|      When the count of unfinished tasks drops to zero, join() unblocks.
|  
|  put(self, item, block=True, timeout=None)
|      Put an item into the queue.
|      
|      If optional args 'block' is true and 'timeout' is None (the default),
|      block if necessary until a free slot is available. If 'timeout' is
|      a positive number, it blocks at most 'timeout' seconds and raises
|      the Full exception if no free slot was available within that time.
|      Otherwise ('block' is false), put an item on the queue if a free slot
|      is immediately available, else raise the Full exception ('timeout'
|      is ignored in that case).
|  
|  put_nowait(self, item)
|      Put an item into the queue without blocking.
|      
|      Only enqueue the item if a free slot is immediately available.
|      Otherwise raise the Full exception.
|  
|  qsize(self)
|      Return the approximate size of the queue (not reliable!).
|  
|  task_done(self)
|      Indicate that a formerly enqueued task is complete.
|      
|      Used by Queue consumer threads.  For each get() used to fetch a task,
|      a subsequent call to task_done() tells the queue that the processing
|      on the task is complete.
|      
|      If a join() is currently blocking, it will resume when all items
|      have been processed (meaning that a task_done() call was received
|      for every item that had been put() into the queue).
|      
|      Raises a ValueError if called more times than there were items
|      placed in the queue.
|  
|  ----------------------------------------------------------------------
|  Data descriptors inherited from Queue:
|  
|  __dict__
|      dictionary for instance variables (if defined)
|  
|  __weakref__
|      list of weak references to the object (if defined)

>>> </lang>

Using heapq

Python has the heapq module in its standard library.

Although items can be added to a heap one at a time with the heappush function, much as put was used in the PriorityQueue example above, we will instead transform the whole list of items into a heap in one go and then pop them off one at a time as before.
<lang python>>>> from heapq import heappush, heappop, heapify
>>> items = [(3, "Clear drains"), (4, "Feed cat"), (5, "Make tea"), (1, "Solve RC tasks"), (2, "Tax return")]
>>> heapify(items)
>>> while items:
    print(heappop(items))


(1, 'Solve RC tasks')
(2, 'Tax return')
(3, 'Clear drains')
(4, 'Feed cat')
(5, 'Make tea')
>>> </lang>
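For comparison, here is a sketch of the item-by-item variant using heappush. It also adds an insertion counter as a tie-breaker, so that entries with equal priority come out in insertion order and the payloads themselves are never compared. The function name `drain_in_priority_order` is made up for this example, and the counter is an addition that the session above does not use.

```python
import heapq
import itertools


def drain_in_priority_order(pairs):
    """Push (priority, task) pairs one by one, then pop them all."""
    heap = []
    counter = itertools.count()  # insertion sequence number, used as tie-breaker
    for priority, task in pairs:
        heapq.heappush(heap, (priority, next(counter), task))
    result = []
    while heap:
        priority, _, task = heapq.heappop(heap)
        result.append((priority, task))
    return result
```

Without the counter, two entries with the same priority would be ordered by comparing the tasks, which fails for payloads that do not support ordering, such as dictionaries.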

Help text for module heapq

<lang python>>>> help('heapq')
Help on module heapq:

NAME

   heapq - Heap queue algorithm (a.k.a. priority queue).

DESCRIPTION

   Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
   all k, counting elements from 0.  For the sake of comparison,
   non-existing elements are considered to be infinite.  The interesting
   property of a heap is that a[0] is always its smallest element.
   
   Usage:
   
   heap = []            # creates an empty heap
   heappush(heap, item) # pushes a new item on the heap
   item = heappop(heap) # pops the smallest item from the heap
   item = heap[0]       # smallest item on the heap without popping it
   heapify(x)           # transforms list into a heap, in-place, in linear time
   item = heapreplace(heap, item) # pops and returns smallest item, and adds
                                  # new item; the heap size is unchanged
   
   Our API differs from textbook heap algorithms as follows:
   
   - We use 0-based indexing.  This makes the relationship between the
     index for a node and the indexes for its children slightly less
     obvious, but is more suitable since Python uses 0-based indexing.
   
   - Our heappop() method returns the smallest item, not the largest.
   
   These two make it possible to view the heap as a regular Python list
   without surprises: heap[0] is the smallest item, and heap.sort()
   maintains the heap invariant!

FUNCTIONS

   heapify(...)
       Transform list into a heap, in-place, in O(len(heap)) time.
   
   heappop(...)
       Pop the smallest item off the heap, maintaining the heap invariant.
   
   heappush(...)
       Push item onto heap, maintaining the heap invariant.
   
   heappushpop(...)
       Push item on the heap, then pop and return the smallest item
       from the heap. The combined action runs more efficiently than
       heappush() followed by a separate call to heappop().
   
   heapreplace(...)
       Pop and return the current smallest value, and add the new item.
       
       This is more efficient than heappop() followed by heappush(), and can be
       more appropriate when using a fixed-size heap.  Note that the value
       returned may be larger than item!  That constrains reasonable uses of
       this routine unless written as part of a conditional replacement:
       
           if item > heap[0]:
               item = heapreplace(heap, item)
   
   merge(*iterables)
       Merge multiple sorted inputs into a single sorted output.
       
       Similar to sorted(itertools.chain(*iterables)) but returns a generator,
       does not pull the data into memory all at once, and assumes that each of
       the input streams is already sorted (smallest to largest).
       
       >>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
       [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
   
   nlargest(n, iterable, key=None)
       Find the n largest elements in a dataset.
       
       Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
   
   nsmallest(n, iterable, key=None)
       Find the n smallest elements in a dataset.
       
       Equivalent to:  sorted(iterable, key=key)[:n]

DATA

   __about__ = 'Heap queues\n\n[explanation by François Pinard]\n\nH... t...
   __all__ = ['heappush', 'heappop', 'heapify', 'heapreplace', 'merge', '...

FILE

   c:\python32\lib\heapq.py


>>> </lang>

Tcl

Library: Tcllib (Package: struct::prioqueue)

<lang tcl>package require struct::prioqueue

set pq [struct::prioqueue]
foreach {priority task} {

   3 "Clear drains"
   4 "Feed cat"
   5 "Make tea"
   1 "Solve RC tasks"
   2 "Tax return"

} {

   # Insert into the priority queue
   $pq put $task $priority

}

# Drain the queue, in priority-sorted order

while {[$pq size]} {

   # Remove the front-most item from the priority queue
   puts [$pq get]

}</lang>
Which produces this output:

Make tea
Feed cat
Clear drains
Tax return
Solve RC tasks