Averages/Mode: Difference between revisions


Revision as of 19:11, 2 July 2009

Write a program to find the mode value of a collection. The case where the collection is empty may be ignored. Care must be taken to handle the case where the mode is non-unique.

If it is not appropriate or possible to support a general collection, use a vector (array), if possible. If it is not appropriate or possible to support an unspecified value type, use integers.

All of the code above consists of utilities that keep this main function simple. First, we build a value-frequency table with one entry per distinct value, using the utility function collect_from_array. Then we sort the value-frequency pairs by frequency (descending, cf. f_cmp) and push onto a "stack" every value that shares the highest frequency.

<lang c>stack_t *mode(double *v, size_t n) {
  size_t i, f;
  stack_t *r;

  fav_t *t1 = alloc_table();
  collect_from_array(t1, v, n);
  // now let's sort according to freqs
  qsort(t1->vf, t1->n, sizeof(fav_el_t), CMP_HOOK(f_cmp));
  r = init_stack();
  // the first entry has the highest frequency; keep every entry that ties with it
  f = t1->vf[0].f;
  push_value(r, t1->vf[0].v);
  for(i=1; (i < t1->n) && (t1->vf[i].f == f); i++)
    push_value(r, t1->vf[i].v);
  free_table(t1);
  return r;
}</lang>

<lang c>double v1[] = { 1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17 };
double v2[] = { 1, 1, 2, 4, 4 };

#define PRINT_MODE(V) do { size_t i; \
    stack_t *r = mode(V, sizeof(V)/sizeof(double)); \
    for(i=0; i < r->n; i++) printf("%lf ", r->data[i]); \
    free_stack(r); printf("\n"); \
  } while(0)

int main() {
  PRINT_MODE(v1);
  PRINT_MODE(v2);
  return EXIT_SUCCESS;
}</lang>
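The sort-based approach described for the C version can be sketched compactly in Python. This is only an illustration of the same three steps (build the frequency table, sort by descending frequency, keep the leading ties); the name modes_by_sorting is hypothetical, and like the C code it assumes a non-empty input.

```python
from collections import Counter

def modes_by_sorting(values):
    """Return every value that shares the highest frequency (input must be non-empty)."""
    # Step 1: value-frequency table, one (value, count) pair per distinct value.
    table = list(Counter(values).items())
    # Step 2: sort the pairs by frequency, highest first (the sort is stable).
    table.sort(key=lambda pair: pair[1], reverse=True)
    # Step 3: collect the leading run of pairs that tie with the top frequency.
    top = table[0][1]
    result = []
    for value, freq in table:
        if freq != top:
            break
        result.append(value)
    return result

print(modes_by_sorting([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]))
print(modes_by_sorting([1, 1, 2, 4, 4]))
```

Sorting costs O(k log k) for k distinct values; a single pass that tracks the running maximum avoids the sort, at the price of slightly more bookkeeping.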

C++

Works with: g++ version 4.3.2

<lang cpp>#include <iterator>
#include <utility>
#include <algorithm>
#include <list>
#include <iostream>

// helper struct
template<typename T> struct referring {

 referring(T const& t): value(t) {}
 template<typename Iter>
  bool operator()(std::pair<Iter, int> const& p)
 {
   return *p.first == value;
 }
 T const& value;

};

// requires:
//   FwdIterator is a ForwardIterator
//   The value_type of FwdIterator is EqualityComparable
//   OutIterator is an output iterator
//   the value_type of FwdIterator is convertible to the value_type of OutIterator
//   [first, last) is a valid range
// provides:
//   the mode is written to result
template<typename FwdIterator, typename OutIterator>
 void mode(FwdIterator first, FwdIterator last, OutIterator result)
{

 typedef typename std::iterator_traits<FwdIterator>::value_type value_type;
 typedef std::list<std::pair<FwdIterator, int> > count_type;
 typedef typename count_type::iterator count_iterator;

 // count elements
 count_type counts;

 while (first != last)
 {
   count_iterator element = std::find_if(counts.begin(), counts.end(),
                                         referring<value_type>(*first));
   if (element == counts.end())
     counts.push_back(std::make_pair(first, 1));
   else
     ++element->second;
   ++first;
 }

 // find maximum
 int max = 0;
 for (count_iterator i = counts.begin(); i != counts.end(); ++i)
   if (i->second > max)
     max = i->second;

 // copy corresponding elements to output sequence
 for (count_iterator i = counts.begin(); i != counts.end(); ++i)
   if (i->second == max)
     *result++ = *i->first;

}

// example usage
int main() {
  int values[] = { 1, 2, 3, 1, 2, 4, 2, 5, 2, 3, 3, 1, 3, 6 };
  mode(values, values + sizeof(values)/sizeof(int),
       std::ostream_iterator<int>(std::cout, " "));
  std::cout << std::endl;
}</lang>
Output:

2 3

E

<lang e>pragma.enable("accumulator")
def mode(values) {

   def counts := [].asMap().diverge()
   var maxCount := 0
   for v in values {
       maxCount max= (counts[v] := counts.fetch(v, fn{0}) + 1)
   }
   return accum [].asSet() for v => ==maxCount in counts { _.with(v) }

}</lang>

<lang e>? mode([1,1,2,2,3,3,4,4,4,5,5,6,6,7,8,8,9,9,0,0,0])

value: [4, 0].asSet()</lang>

In the line "maxCount max= (counts[v] := counts.fetch(v, fn{0}) + 1)", max= is an update-assignment operation like +=. (The parentheses are unnecessary.) A more verbose version would be:

<lang e>def newCount := counts.fetch(v, fn { 0 }) + 1
counts[v] := newCount
maxCount := maxCount.max(newCount)</lang>

In for loops, each key and value from the collection are pattern matched against the specified key pattern => value pattern. In "for v => ==maxCount in counts", the == is a pattern-match operator which fails unless the value examined is equal to the specified value; so this selects only the input values (keys in counts) whose counts are equal to the maximum count.
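For readers who do not know E, the same idea can be sketched in Python: count occurrences while tracking the running maximum, then keep only the keys whose count equals that maximum. The function name mode_set is hypothetical, and the set comprehension stands in for E's accumulator syntax rather than reproducing it.

```python
def mode_set(values):
    """Return the set of values that occur most often."""
    counts = {}
    max_count = 0
    for v in values:
        # Roughly: counts[v] := counts.fetch(v, fn{0}) + 1
        counts[v] = counts.get(v, 0) + 1
        # Roughly: maxCount max= counts[v]
        max_count = max(max_count, counts[v])
    # Roughly: for v => ==maxCount in counts { _.with(v) }
    return {v for v, c in counts.items() if c == max_count}

print(mode_set([1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 8, 9, 9, 0, 0, 0]))
```

Returning a set makes it explicit that the modes have no meaningful order.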

Java

<lang java>import java.util.*;

public class Mode {

   public static <T> List<T> mode(List<? extends T> coll) {
       Map<T, Integer> seen = new HashMap<T, Integer>();
       int max = 0;
       List<T> maxElems = new ArrayList<T>();
       for (T value : coll) {
           if (seen.containsKey(value))
               seen.put(value, seen.get(value) + 1);
           else
               seen.put(value, 1);
           if (seen.get(value) > max) {
               max = seen.get(value);
               maxElems.clear();
               maxElems.add(value);
           } else if (seen.get(value) == max) {
               maxElems.add(value);
           }
       }
       return maxElems;
   }

   public static void main(String[] args) {
       System.out.println(mode(Arrays.asList(1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17))); // prints [6]
       System.out.println(mode(Arrays.asList(1, 1, 2, 4, 4))); // prints [1, 4]
   }

}</lang>

Mathematica

The built-in function Commonest returns a list of the most common element(s), even if there is only one 'commonest' number. Example for multiple 'commonest' numbers and a single 'commonest' number: <lang Mathematica>

Commonest[{b, a, c, 2, a, b, 1, 2, 3}]
Commonest[{1, 3, 2, 3}]

</lang> gives back: <lang Mathematica>

{b,a,2}
{3}

</lang>
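For comparison, Python's standard library offers a similar built-in: statistics.multimode (available from Python 3.8) also returns a list even when the mode is unique. The sketch below feeds it the same data, with the symbols a, b and c replaced by strings, which is an assumption made purely for illustration.

```python
from statistics import multimode

# Several values tie for the highest count; they are returned in the
# order in which they were first encountered.
several = multimode(['b', 'a', 'c', 2, 'a', 'b', 1, 2, 3])
# A single most common value still comes back wrapped in a list.
single = multimode([1, 3, 2, 3])

print(several)  # ['b', 'a', 2]
print(single)   # [3]
```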

Objective-C

<lang objc>#import <Foundation/Foundation.h>

@interface NSArray (Mode)
- (NSArray *)mode;
@end

@implementation NSArray (Mode)
- (NSArray *)mode {

   NSCountedSet *seen = [NSCountedSet setWithArray:self];
   int max = 0;
   NSMutableArray *maxElems = [NSMutableArray array];
   NSEnumerator *enm = [seen objectEnumerator];
   id obj;
   while( (obj = [enm nextObject]) ) {
       int count = [seen countForObject:obj];
       if (count > max) {
           max = count;
           [maxElems removeAllObjects];
           [maxElems addObject:obj];
       } else if (count == max) {
           [maxElems addObject:obj];
       }
   }
   return maxElems;

}
@end</lang>

OCaml

<lang ocaml>let mode lst =

 let seen = Hashtbl.create 42 in
   List.iter (fun x ->
                let old = if Hashtbl.mem seen x then
                  Hashtbl.find seen x
                else 0 in
                  Hashtbl.replace seen x (old + 1))
     lst;
   let best = Hashtbl.fold (fun _ -> max) seen 0 in
     Hashtbl.fold (fun k v acc ->
                     if v = best then k :: acc
                     else acc)
       seen []</lang>

# mode [1;3;6;6;6;6;7;7;12;12;17];;
- : int list = [6]
# mode [1;1;2;4;4];;
- : int list = [4; 1]

Octave

Octave has a built-in mode function, but it returns only the smallest mode when several values tie for the highest frequency.

<lang octave>function m = mode2(v)

 sv = sort(v);
 % build two vectors, vals and c, so that
 % c(i) holds how many times vals(i) appears
 i = 1; c = []; vals = [];
 while (i <= numel(v) )
   tc = sum(sv==sv(i)); % it would be faster to count
                        % them "by hand", since sv is sorted...
   c = [c, tc];
   vals = [vals, sv(i)];
   i += tc;
 endwhile
 % stack vals and c building a 2-rows matrix x
 x = cat(1,vals,c);
 % sort the second row (frequencies) into t (most frequent
 % first) and take the "original indices" i ... 
 [t, i] = sort(x(2,:), "descend");
 % ... so that we can use them to sort columns according
 % to frequencies
 nv = x(1,i);
 % at last, collect into m (the result) all the values
 % sharing the highest frequency; the bounds check stops
 % the loop when every value has that frequency
 r = t(1); i = 1;
 m = [];
 while ( i <= numel(t) && t(i) == r )
   m = [m, nv(i)];
   i++;
 endwhile

endfunction</lang>

<lang octave>a = [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17];
mode2(a)
mode(a)

a = [1, 1, 2, 4, 4];
mode2(a) % returns 1 and 4
mode(a)  % returns 1 only</lang>
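The comment inside mode2 notes that counting "by hand" would be faster because the data is already sorted. A minimal Python sketch of that idea follows: once the values are sorted, equal values sit in consecutive runs, so one pass over the runs is enough. The name modes_sorted_runs is hypothetical, and itertools.groupby is used here as a stand-in for the hand-written run counting.

```python
from itertools import groupby

def modes_sorted_runs(values):
    """Return all modes by measuring runs of equal values in sorted order."""
    best = 0
    result = []
    # groupby yields one (value, run) pair per run of consecutive equal items.
    for value, run in groupby(sorted(values)):
        count = sum(1 for _ in run)
        if count > best:
            # A longer run replaces everything collected so far.
            best, result = count, [value]
        elif count == best:
            # A run of the same length is an additional mode.
            result.append(value)
    return result

print(modes_sorted_runs([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]))
print(modes_sorted_runs([1, 1, 2, 4, 4]))
```

This variant needs orderable values but no hashing, and it returns the modes in ascending order.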

Perl

<lang perl>use strict;
use List::Util qw(max);

sub mode {
    my %c;
    foreach my $e ( @_ ) {
        $c{$e}++;
    }
    my $best = max(values %c);
    return grep { $c{$_} == $best } keys %c;
}</lang>

<lang perl>print "$_ " foreach mode(1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17);
print "\n";
print "$_ " foreach mode(1, 1, 2, 4, 4);
print "\n";</lang>

PHP

Note: this function only works with strings and integers, as those are the only things that can be used as keys of an (associative) array in PHP.
<lang php><?php
function mode($arr) {
    $count = array();
    foreach ( $arr as $e ) {
        // start the counter at 1 on first sight to avoid an undefined-index notice
        $count[$e] = isset($count[$e]) ? $count[$e] + 1 : 1;
    }
    $best = max($count);
    return array_keys($count, $best);
}

print_r(mode(array(1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17)));
print_r(mode(array(1, 1, 2, 4, 4)));
?></lang>

Python

<lang python>>>> from collections import defaultdict
>>> def modes(values):
...     count = defaultdict(int)
...     for v in values:
...         count[v] += 1
...     best = max(count.values())
...     return [k for k, v in count.items() if v == best]
...
>>> modes([1,3,6,6,6,6,7,7,12,12,17])
[6]
>>> modes((1,1,2,4,4))
[1, 4]</lang>

Ruby

Here are two methods, the first more Ruby-ish, the second perhaps a bit more efficient.
<lang ruby>def mode(ary)

 seen = Hash.new(0)
 ary.each {|value| seen[value] += 1}
 max = seen.values.max
 seen.find_all {|key,value| value == max}.map {|key,value| key}

end

def mode_one_pass(ary)

 seen = Hash.new(0)
 max = 0
 max_elems = []
 ary.each do |value|
   seen[value] += 1
   if seen[value] > max
     max = seen[value]
     max_elems = [value]
   elsif seen[value] == max
     max_elems << value
   end
 end
 max_elems

end

p mode([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]) # => [6]
p mode([1, 1, 2, 4, 4]) # => [1, 4]
p mode_one_pass([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]) # => [6]
p mode_one_pass([1, 1, 2, 4, 4]) # => [1, 4]</lang>

Smalltalk

Works with: GNU Smalltalk

This code is able to find the mode of any collection of any kind of object.
<lang smalltalk>OrderedCollection extend [

 mode [ |s|
    s := self asBag sortedByCount.
    ^ (s select: [ :k | ((s at: 1) key) = (k key) ]) collect: [:k| k value]
 ]

].

#( 1 3 6 6 6 6 7 7 12 12 17 ) asOrderedCollection
    mode displayNl.

#( 1 1 2 4 4) asOrderedCollection
    mode displayNl.</lang>

Tcl

Works with: Tcl version 8.6

<lang tcl># Can find the modal value of any vector of values
proc mode {n args} {

   foreach n [list $n {*}$args] {
       dict incr counter $n
   }
   set counts [lsort -stride 2 -index 1 -decreasing $counter]
   set best {}
   foreach {n count} $counts {
       if {[lindex $counts 1] == $count} {
           lappend best $n
       } else break
   }
   return $best

}

# Testing
puts [mode 1 3 6 6 6 6 7 7 12 12 17]; # --> 6
puts [mode 1 1 2 4 4]; # --> 1 4</lang>
Note that this works for any kind of value.

Ursala

The mode function defined below works on lists of any type and returns a list of the modes. There is no concept of a general collection in Ursala. The algorithm is to partition the list by equality, then partition the classes by their lengths, and then select a representative from each member of the set of classes with the maximum length.

<lang Ursala>#import std

mode = ~&hS+ leql$^&h+ eql|=@K2

#cast %nLW

examples = mode~~ (<1,3,6,6,6,7,7,12,12,17>,<1,1,2,4,4>)</lang>
The function is tested on a pair of lists, one with a unique mode and one with multiple modes. Here is the output.

(<6>,<4,1>)
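The partition-based algorithm described for Ursala can be spelled out step by step in Python. This is a sketch of the stated algorithm, not a translation of the Ursala operators; the name modes_by_partition is hypothetical, only equality comparison is assumed (no hashing or ordering of the values), and the input is assumed to be non-empty.

```python
def modes_by_partition(values):
    """Return one representative of each largest equivalence class (non-empty input)."""
    # Step 1: partition the list by equality into classes of equal items.
    classes = []
    for v in values:
        for cls in classes:
            if cls[0] == v:
                cls.append(v)
                break
        else:
            classes.append([v])
    # Step 2: partition the classes by their lengths.
    by_length = {}
    for cls in classes:
        by_length.setdefault(len(cls), []).append(cls)
    # Step 3: take the classes of maximum length and pick a representative of each.
    longest = by_length[max(by_length)]
    return [cls[0] for cls in longest]

print(modes_by_partition([1, 3, 6, 6, 6, 7, 7, 12, 12, 17]))
print(modes_by_partition([1, 1, 2, 4, 4]))
```

The modes come back in order of first appearance, so the order can differ from the Ursala output shown above. Because step 1 compares each item against every existing class, the sketch is quadratic in the number of distinct values.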