Set: Difference between revisions

From Rosetta Code
Content added Content deleted
(→‎{{header|C}}: bits for small integers)
m ({{omit from|GUISS}})
Line 288: Line 288:
# Getting the cardinality:
# Getting the cardinality:
puts "cardinality: [struct::set size $s3]</lang>
puts "cardinality: [struct::set size $s3]</lang>

{{omit from|GUISS}}

Revision as of 14:50, 17 September 2011

Set is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Data Structure
This illustrates a data structure, a means of storing data within a program.

You may see other such structures in the Data Structures category.

A set is a collection of elements, without duplicates and without order.

Show how to create a set, and how to test m ∈ S, true if m is an element in set S. Also show each of these set operations:

  • A ∪ B -- union; a set of all elements either in set A or in set B.
  • A ∩ B -- intersection; a set of all elements in both set A and set B.
  • A − B -- difference; a set of all elements in set A, except those in set B.
  • A ⊂ B -- subset; true if every element in set A is also in set B.
  • A = B -- equality; true if every element of set A is in set B and vice-versa.

As an option, show some other set operations. (If A ⊂ B, but A ≠ B, then A is called a true or proper subset of B, written A ⊊ B.) As another option, show how to modify a mutable set.

One might implement a set using an associative array (with set elements as array keys), or using a list, a binary search tree or a hash table. The basic test, m ∈ S, is O(n) with a list, O(log n) with a balanced binary search tree, or near O(1) with a hash table.

C

Building a set highly depends on the datatype and use case. For example, a set of string could be implemented by hash table, sort tree or trie, but if all sets are known to have very few number of elements, it might be best to use just flat arrays. There isn't, and shouldn't be, an all-purpose set type for C.

A frequent use of set is that of small, non-negative integers, implemented as a bit field as shown below. <lang c>#include <stdio.h>

typedef unsigned int set_t; /* probably 32 bits; change according to need */

void show_set(set_t x, char *name) { int i; printf("%s is:", name); for (i = 0; (1U << i) <= x; i++) if (x & (1U << i)) printf(" %d", i); putchar('\n'); }

int main(void) { int i; set_t a, b, c;

a = 0; /* empty set */ for (i = 0; i < 10; i += 3) /* add 0 3 6 9 to set a */ a |= (1U << i); show_set(a, "a");

for (i = 0; i < 5; i++) printf("\t%d%s in set a\n", i, (a & (1U << i)) ? "":" not");

b = a; b |= (1U << 5); b |= (1U << 10); /* b is a plus 5, 10 */ b &= ~(1U << 0); /* sans 0 */ show_set(b, "b");

show_set(a | b, "union(a, b)"); show_set(c = a & b, "c = common(a, b)"); show_set(a & ~b, "a - b"); /* diff, not arithmetic minus */ show_set(b & ~a, "b - a"); printf("b is%s a subset of a\n", !(b & ~a) ? "" : " not"); printf("c is%s a subset of a\n", !(c & ~a) ? "" : " not");

printf("union(a, b) - common(a, b) %s union(a - b, b - a)\n", ((a | b) & ~(a & b)) == ((a & ~b) | (b & ~a)) ? "equals" : "does not equal");

return 0; }</lang>

J

In J, we use a sequence to represent a set. This actually winds up being a family of set implementations. In this example, we chose to ignore order and specify that duplicate elements are not allowed.

Here are definitions for the required operations:

<lang j>union=: ~.@, intersection=: [ -. -. difference=: -. subset=: *./@e. equality=: -:&(/:~)</lang>

Examples:

<lang j> 2 4 6 8 ~.@, 2 3 5 7 2 4 6 8 3 5 7

  2 4 6 8 ([ -. -.) 2 3 5 7

2

  2 4 6 8 -. 2 3 5 7

4 6 8

  2 4 6 8 *./@e. 2 3 5 7

0

   *./@e. 2 3 5 7

1

  2 4 6 8 3 5 7 -:&(/:~) 8 7 6 5 4 3 2

1</lang>

Note that these operations can be combined in sentences with other operations. For example we could define

<lang j>properSubset=: subset *. 1 - equality</lang>

Perl

Primitive set implementation using hashes. <lang perl>use strict;

package Set; # likely will conflict with stuff on CPAN use overload '""' => \&str, 'bool' => \&count, '+=' => \&add, '-=' => \&del, '-' => \&diff, '==' => \&eq, '&' => \&intersection, '|' => \&union, '^' => \&xdiff;

sub str { my $set = shift; # This has drawbacks: stringification is used as set key # if the set is added to another set as an element, which # may cause inconsistencies if the element set is modified # later. In general, a hash key loses its object identity # anyway, so it's not unique to us. "Set{ ". join(", " => sort map("$_", values %$set)) . " }" }

sub new { my $pkg = shift; my $h = bless {}; $h->add($_) for @_; $h }

sub add { my ($set, $elem) = @_; $set->{$elem} = $elem; $set }

sub del { my ($set, $elem) = @_; delete $set->{$elem}; $set }

sub has { # set has element my ($set, $elem) = @_; exists $set->{$elem} }

sub union { my ($this, $that) = @_; bless { %$this, %$that } }

sub intersection { my ($this, $that) = @_; my $s = new Set; for (keys %$this) { $s->{$_} = $this->{$_} if exists $that->{$_} } $s }

sub diff { my ($this, $that) = @_; my $s = Set->new; for (keys %$this) { $s += $this->{$_} unless exists $that->{$_} } $s }

sub xdiff { # xor, symmetric diff my ($this, $that) = @_; my $s = new Set; bless { %{ ($this - $that) | ($that - $this) } } }

sub count { scalar(keys %{+shift}) }

sub eq { my ($this, $that) = @_; !($this - $that) && !($that - $this); }

sub contains { # this is a superset of that my ($this, $that) = @_; for (keys %$that) { return 0 unless $this->has($_) } return 1 }

package main; my ($x, $y, $z, $w);

$x = Set->new(1, 2, 3); $x += $_ for (5 .. 7); $y = Set->new(1, 2, 4, $x); # not the brightest idea

print "set x is: $x\nset y is: $y\n"; for (1 .. 4, $x) { print "$_ is", $y->has($_) ? "" : " not", " in y\n"; }

print "union: ", $x | $y, "\n"; print "intersect: ", $x & $y, "\n"; print "z = x - y = ", $z = $x - $y, "\n"; print "y is ", $x->contains($y) ? "" : "not ", "a subset of x\n"; print "z is ", $x->contains($z) ? "" : "not ", "a subset of x\n"; print "z = (x | y) - (x & y) = ", $z = ($x | $y) - ($x & $y), "\n"; print "w = x ^ y = ", $w = ($x ^ $y), "\n"; print "w is ", ($w == $z) ? "" : "not ", "equal to z\n"; print "w is ", ($w == $x) ? "" : "not ", "equal to x\n";</lang>

Python

In Python, the set is a standard type with support in the languages syntax. <lang python>>>> s1, s2 = {1, 2, 3, 4}, {3, 4, 5, 6} >>> s1 | s2; # Union {1, 2, 3, 4, 5, 6} >>> s1 & s2; # Intersection {3, 4} >>> s1 - s2; # Difference {1, 2} >>> s1 < s1; # True subset False >>> {3, 1} < s1; # True subset True >>> s1 <= s1; # Subset True >>> {3, 1} <= s1; # Subset True >>> {3, 2, 4, 1} == s1; # Equality True >>> s1 == s2; # Equality False >>> 2 in s1; # Membership True >>> 10 not in s1; # Non-membership True >>> {1, 2, 3, 4, 5} > s1; # True superset True >>> {1, 2, 3, 4} > s1; # True superset False >>> {1, 2, 3, 4} >= s1; # Superset True >>> s1 ^ s2; # Symmetric difference {1, 2, 5, 6} >>> len(s1); # Cardinality 4 >>> s1.add(99); # Mutability >>> s1 {99, 1, 2, 3, 4} >>> s1.discard(99); # Mutability >>> s1 {1, 2, 3, 4} >>> s1 |= s2; # Mutability >>> s1 {1, 2, 3, 4, 5, 6} >>> s1 -= s2; # Mutability >>> s1 {1, 2} >>> s1 ^= s2; # Mutability >>> s1 {1, 2, 3, 4, 5, 6} >>> </lang>

Tcl

Sets in Tcl are modeled as lists of items, with operations that preserve uniqueness of membership.

Library: Tcllib (Package: struct::set)

<lang tcl>package require struct::set

  1. Many ways to build sets

set s1 [list 1 2 3 4] set s2 {3 4 5 6} struct::set add s3 {2 3 4 3 2}; # $s3 will be proper set... set item 5

puts "union: [struct::set union $s1 $s2]" puts "intersection: [struct::set intersect $s1 $s2]" puts "difference: [struct::set difference $s1 $s2]" puts "membership predicate: [struct::set contains $s1 $item]" puts "subset predicate: [struct::set subsetof $s1 $s2]"; # NB: not strict subset test! puts "equality predicate: [struct::set equal $s1 $s2]"

  1. Adding an element to a set (note that we pass in the name of the variable holding the set):

struct::set include s3 $item

  1. Removing an element from a set:

struct::set exclude s3 $item

  1. Getting the cardinality:

puts "cardinality: [struct::set size $s3]</lang>