I'm working on modernizing Rosetta Code's infrastructure. Starting with communications. Please accept this time-limited open invite to RC's Slack.. --Michael Mol (talk) 20:59, 30 May 2020 (UTC)

# Jaccard index

Jaccard index is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
 This page uses content from Wikipedia. The original article was at Jaccard index. The list of authors can be seen in the page history. As with Rosetta Code, the text of Wikipedia is available under the GNU FDL. (See links for details on variance)

The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for gauging the similarity and diversity of sample sets. It was developed by Paul Jaccard, originally giving the French name coefficient de communauté, and independently formulated again by T. Tanimoto. Thus, the Tanimoto index or Tanimoto coefficient are also used in some fields. However, they are identical in generally taking the ratio of Intersection over Union. The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets:

J(A, B) = |A ∩ B|/|A ∪ B|

Define sets as follows, using any linear data structure:

```A = {}
B = {1, 2, 3, 4, 5}
C = {1, 3, 5, 7, 9}
D = {2, 4, 6, 8, 10}
E = {2, 3, 5, 7}
F = {8}
```

Write a program that computes the Jaccard index for every ordered pairing (to show that J(A, B) and J(B, A) are the same) of these sets, including self-pairings.

## APL

`task←{    jaccard ← (≢∩)÷(≢∪)     A ← ⍬    B ← 1 2 3 4 5    C ← 1 3 5 7 9    D ← 2 4 6 8 10    E ← 2 3 5 7    F ← ,8     '.ABCDEF' ⍪ 'ABCDEF' , ∘.jaccard⍨ A B C D E F}`
Output:
```. A            B            C     D     E   F
A 1 0            0            0     0     0
B 0 1            0.4285714286 0.25  0.5   0
C 0 0.4285714286 1            0     0.5   0
D 0 0.25         0            1     0.125 0.2
E 0 0.5          0.5          0.125 1     0
F 0 0            0            0.2   0     1  ```

## BQN

`Jaccard ← ≡◶⟨∊ ÷○(+´) ∊∘∾, 1⟩ a ← ⟨⟩b ← ⟨1,2,3,4,5⟩c ← ⟨1,3,5,7,9⟩d ← ⟨2,4,6,8,10⟩e ← ⟨2,3,5,7⟩f ← ⟨8⟩ Jaccard⌜˜ ⟨a,b,c,d,e,f⟩`
Output:
```┌─
╵ 1                   0                   0     0     0   0
0                   1 0.42857142857142855  0.25   0.5   0
0 0.42857142857142855                   1     0   0.5   0
0                0.25                   0     1 0.125 0.2
0                 0.5                 0.5 0.125     1   0
0                   0                   0   0.2     0   1
┘```

## Factor

Works with: Factor version 0.99 2021-06-02
`USING: assocs formatting grouping kernel math math.combinatoricsprettyprint sequences sequences.repeating sets ; : jaccard ( seq1 seq2 -- x )    2dup [ empty? ] both? [ 2drop 1 ]    [ [ intersect ] [ union ] 2bi [ length ] [email protected] / ] if ; { { } { 1 2 3 4 5 } { 1 3 5 7 9 } { 2 4 6 8 10 } { 2 3 5 7 } { 8 } }[ 2 <combinations> ] [ 2 repeat 2 group append ] bi[ 2dup jaccard "%u %u -> %u\n" printf ] assoc-each`
Output:
```{ } { 1 2 3 4 5 } -> 0
{ } { 1 3 5 7 9 } -> 0
{ } { 2 4 6 8 10 } -> 0
{ } { 2 3 5 7 } -> 0
{ } { 8 } -> 0
{ 1 2 3 4 5 } { 1 3 5 7 9 } -> 3/7
{ 1 2 3 4 5 } { 2 4 6 8 10 } -> 1/4
{ 1 2 3 4 5 } { 2 3 5 7 } -> 1/2
{ 1 2 3 4 5 } { 8 } -> 0
{ 1 3 5 7 9 } { 2 4 6 8 10 } -> 0
{ 1 3 5 7 9 } { 2 3 5 7 } -> 1/2
{ 1 3 5 7 9 } { 8 } -> 0
{ 2 4 6 8 10 } { 2 3 5 7 } -> 1/8
{ 2 4 6 8 10 } { 8 } -> 1/5
{ 2 3 5 7 } { 8 } -> 0
{ } { } -> 1
{ 1 2 3 4 5 } { 1 2 3 4 5 } -> 1
{ 1 3 5 7 9 } { 1 3 5 7 9 } -> 1
{ 2 4 6 8 10 } { 2 4 6 8 10 } -> 1
{ 2 3 5 7 } { 2 3 5 7 } -> 1
{ 8 } { 8 } -> 1
```

## jq

Works with: jq

Works with gojq, the Go implementation of jq In the following:

• the Jaccard index is presented as a string representing a reduced fraction, e.g. "0" or "1/7".
• sets are represented by sorted arrays with distinct elements.

Preliminaries

`def lpad(\$len): tostring | (\$len - length) as \$l | (" " * \$l)[:\$l] + .; def gcd(a; b):  # subfunction expects [a,b] as input  # i.e. a ~ .[0] and b ~ .[1]  def rgcd: if .[1] == 0 then .[0]         else [.[1], .[0] % .[1]] | rgcd         end;  [a,b] | rgcd;`

`def rjaccardIndex(x; y):  def i(a;b): a - (a-b);  def u(a;b): a + (b - i(a;b)) | unique;   def idivide(\$i; \$j):    if \$i == 0 then "0"    else gcd(\$i;\$j) as \$d    | if \$j == \$d then "\(\$i/\$d)"      else "\(\$i/\$d)/\(\$j/\$d)"      end    end;   if (x|length) == 0 and (y|length) == "0" then "1"  else idivide( i(x;y)|length; u(x;y)|length )  end; def a : [];def b : [1, 2, 3, 4, 5];def c : [1, 3, 5, 7, 9];def d : [2, 4, 6, 8, 10];def e : [2, 3, 5, 7];def f : [8]; def task:  def tidy: map(lpad(4))|join(" ");   [a,b,c,d,e,f] as \$sets  | [range(0;\$sets|length) | [. + 97] | implode] as \$names  | ([""] + \$names | tidy),    (range(0; \$sets|length) as \$i     | ([\$i + 97] | implode) as \$name     | \$sets[\$i] as \$x     | \$sets | map(rjaccardIndex(\$x; .)) | tidy     | "  \(\$name): \(.)" ) ; task`
Output:
```        a    b    c    d    e    f
a:    0    0    0    0    0    0
b:    0    1  3/7  1/4  1/2    0
c:    0  3/7    1    0  1/2    0
d:    0  1/4    0    1  1/8  1/5
e:    0  1/2  1/2  1/8    1    0
f:    0    0    0  1/5    0    1
```

## Julia

`J(A, B) = begin i, u = length(A ∩ B), length(A ∪ B); u == 0 ? 1//1 : i // u end A = Int[]B = [1, 2, 3, 4, 5]C = [1, 3, 5, 7, 9]D = [2, 4, 6, 8, 10]E = [2, 3, 5, 7]F = [8]testsets = [A, B, C, D, E, F] println("Set A             Set B             J(A, B)\n", "-"^44)for a in testsets, b in testsets    println(rpad(isempty(a) ? "[]" : a, 18), rpad(isempty(b) ? "[]" : b, 18),        replace(string(J(a, b)), "//" => "/"))end `
Output:
```Set A             Set B             J(A, B)
--------------------------------------------
[]                []                1/1
[]                [1, 2, 3, 4, 5]   0/1
[]                [1, 3, 5, 7, 9]   0/1
[]                [2, 4, 6, 8, 10]  0/1
[]                [2, 3, 5, 7]      0/1
[]                [8]               0/1
[1, 2, 3, 4, 5]   []                0/1
[1, 2, 3, 4, 5]   [1, 2, 3, 4, 5]   1/1
[1, 2, 3, 4, 5]   [1, 3, 5, 7, 9]   3/7
[1, 2, 3, 4, 5]   [2, 4, 6, 8, 10]  1/4
[1, 2, 3, 4, 5]   [2, 3, 5, 7]      1/2
[1, 2, 3, 4, 5]   [8]               0/1
[1, 3, 5, 7, 9]   []                0/1
[1, 3, 5, 7, 9]   [1, 2, 3, 4, 5]   3/7
[1, 3, 5, 7, 9]   [1, 3, 5, 7, 9]   1/1
[1, 3, 5, 7, 9]   [2, 4, 6, 8, 10]  0/1
[1, 3, 5, 7, 9]   [2, 3, 5, 7]      1/2
[1, 3, 5, 7, 9]   [8]               0/1
[2, 4, 6, 8, 10]  []                0/1
[2, 4, 6, 8, 10]  [1, 2, 3, 4, 5]   1/4
[2, 4, 6, 8, 10]  [1, 3, 5, 7, 9]   0/1
[2, 4, 6, 8, 10]  [2, 4, 6, 8, 10]  1/1
[2, 4, 6, 8, 10]  [2, 3, 5, 7]      1/8
[2, 4, 6, 8, 10]  [8]               1/5
[2, 3, 5, 7]      []                0/1
[2, 3, 5, 7]      [1, 2, 3, 4, 5]   1/2
[2, 3, 5, 7]      [1, 3, 5, 7, 9]   1/2
[2, 3, 5, 7]      [2, 4, 6, 8, 10]  1/8
[2, 3, 5, 7]      [2, 3, 5, 7]      1/1
[2, 3, 5, 7]      [8]               0/1
[8]               []                0/1
[8]               [1, 2, 3, 4, 5]   0/1
[8]               [1, 3, 5, 7, 9]   0/1
[8]               [2, 4, 6, 8, 10]  1/5
[8]               [2, 3, 5, 7]      0/1
[8]               [8]               1/1
```

## Phix

```with javascript_semantics
include sets.e

function jaccard(sequence a, b)
integer i = length(intersection(a,b)),
u = length(union(a,b))
return iff(u=0?1:i/u)
end function

constant tests = {{},               -- A
{1, 2, 3, 4, 5},  -- B
{1, 3, 5, 7, 9},  -- C
{2, 4, 6, 8, 10}, -- D
{2, 3, 5, 7},     -- E
{8}}              -- F

for i=1 to length(tests) do
for j=i to length(tests) do
string s = sprintf("J(%c,%c)",{'A'+i-1,'A'+j-1})
atom jij = jacard(tests[i],tests[j])
if i!=j then
atom jji = jacard(tests[j],tests[i])
assert(jji==jij)
s &= sprintf(" = J(%c,%c)",{'A'+j-1,'A'+i-1})
end if
printf(1,"%s = %g\n",{s,jij})
end for
end for
```
Output:
```J(A,A) = 1
J(A,B) = J(B,A) = 0
J(A,C) = J(C,A) = 0
J(A,D) = J(D,A) = 0
J(A,E) = J(E,A) = 0
J(A,F) = J(F,A) = 0
J(B,B) = 1
J(B,C) = J(C,B) = 0.428571
J(B,D) = J(D,B) = 0.25
J(B,E) = J(E,B) = 0.5
J(B,F) = J(F,B) = 0
J(C,C) = 1
J(C,D) = J(D,C) = 0
J(C,E) = J(E,C) = 0.5
J(C,F) = J(F,C) = 0
J(D,D) = 1
J(D,E) = J(E,D) = 0.125
J(D,F) = J(F,D) = 0.2
J(E,E) = 1
J(E,F) = J(F,E) = 0
J(F,F) = 1
```

## Perl

`#!/usr/bin/perl use strict;use warnings; my %sets = (  A => [],  B => [1, 2, 3, 4, 5],  C => [1, 3, 5, 7, 9],  D => [2, 4, 6, 8, 10],  E => [2, 3, 5, 7],  F => [8],  );use Data::Dump 'dd'; dd \%sets; for my \$left (sort keys %sets )  {  for my \$right (sort keys %sets )    {    my %union;    \$union{ \$_ }++ for @{ \$sets{\$left} }, @{ \$sets{\$right} };    print "J(\$left,\$right) = ",      %union ? (grep \$_ == 2, values %union) / (keys %union) : 1, "\n";    }  }`
Output:
```{
A => [],
B => [1 .. 5],
C => [1, 3, 5, 7, 9],
D => [2, 4, 6, 8, 10],
E => [2, 3, 5, 7],
F => [8],
}
J(A,A) = 1
J(A,B) = 0
J(A,C) = 0
J(A,D) = 0
J(A,E) = 0
J(A,F) = 0
J(B,A) = 0
J(B,B) = 1
J(B,C) = 0.428571428571429
J(B,D) = 0.25
J(B,E) = 0.5
J(B,F) = 0
J(C,A) = 0
J(C,B) = 0.428571428571429
J(C,C) = 1
J(C,D) = 0
J(C,E) = 0.5
J(C,F) = 0
J(D,A) = 0
J(D,B) = 0.25
J(D,C) = 0
J(D,D) = 1
J(D,E) = 0.125
J(D,F) = 0.2
J(E,A) = 0
J(E,B) = 0.5
J(E,C) = 0.5
J(E,D) = 0.125
J(E,E) = 1
J(E,F) = 0
J(F,A) = 0
J(F,B) = 0
J(F,C) = 0
J(F,D) = 0.2
J(F,E) = 0
J(F,F) = 1
```

## Prolog

` show([]).show([X|Xs]):- write(X), show(Xs). j(N,M,X):- M > 0 -> X is N/M; X is 1. task:- L = [[], [1,2,3,4,5], [1,3,5,7,9], [2,4,6,8,10], [2,3,5,7], [8]],    forall((member(A,L), member(B,L)), (        findall(X, (member(X,A), member(X,B)), I), length(I,N),        findall(X, (member(X,B), not(member(X,A))), T), append(A,T,U), length(U,M),        j(N,M,J), show(["A = ",A,", B = ",B,", J = ",J]), nl)). `
Output:
```?- task.
A = [], B = [], J = 1
A = [], B = [1,2,3,4,5], J = 0
A = [], B = [1,3,5,7,9], J = 0
A = [], B = [2,4,6,8,10], J = 0
A = [], B = [2,3,5,7], J = 0
A = [], B = [8], J = 0
A = [1,2,3,4,5], B = [], J = 0
A = [1,2,3,4,5], B = [1,2,3,4,5], J = 1
A = [1,2,3,4,5], B = [1,3,5,7,9], J = 0.42857142857142855
A = [1,2,3,4,5], B = [2,4,6,8,10], J = 0.25
A = [1,2,3,4,5], B = [2,3,5,7], J = 0.5
A = [1,2,3,4,5], B = [8], J = 0
A = [1,3,5,7,9], B = [], J = 0
A = [1,3,5,7,9], B = [1,2,3,4,5], J = 0.42857142857142855
A = [1,3,5,7,9], B = [1,3,5,7,9], J = 1
A = [1,3,5,7,9], B = [2,4,6,8,10], J = 0
A = [1,3,5,7,9], B = [2,3,5,7], J = 0.5
A = [1,3,5,7,9], B = [8], J = 0
A = [2,4,6,8,10], B = [], J = 0
A = [2,4,6,8,10], B = [1,2,3,4,5], J = 0.25
A = [2,4,6,8,10], B = [1,3,5,7,9], J = 0
A = [2,4,6,8,10], B = [2,4,6,8,10], J = 1
A = [2,4,6,8,10], B = [2,3,5,7], J = 0.125
A = [2,4,6,8,10], B = [8], J = 0.2
A = [2,3,5,7], B = [], J = 0
A = [2,3,5,7], B = [1,2,3,4,5], J = 0.5
A = [2,3,5,7], B = [1,3,5,7,9], J = 0.5
A = [2,3,5,7], B = [2,4,6,8,10], J = 0.125
A = [2,3,5,7], B = [2,3,5,7], J = 1
A = [2,3,5,7], B = [8], J = 0
A = [8], B = [], J = 0
A = [8], B = [1,2,3,4,5], J = 0
A = [8], B = [1,3,5,7,9], J = 0
A = [8], B = [2,4,6,8,10], J = 0.2
A = [8], B = [2,3,5,7], J = 0
A = [8], B = [8], J = 1
true.
```

## Raku

`sub J(\A, \B) { A ∪ B ?? (A ∩ B) / (A ∪ B) !! A ∪ B == A ∩ B ?? 1 !! 0 } my %p =  A => < >,  B => <1 2 3 4 5>,  C => <1 3 5 7 9>,  D => <2 4 6 8 10>,  E => <2 3 5 7>,  F => <8>,; .say for %p.sort;say '';say "J({.join: ','}) = ", J |%p{\$_} for [X] <A B C D E F> xx 2;`
Output:
```A => ()
B => (1 2 3 4 5)
C => (1 3 5 7 9)
D => (2 4 6 8 10)
E => (2 3 5 7)
F => 8

J(A,A) = 1
J(A,B) = 0
J(A,C) = 0
J(A,D) = 0
J(A,E) = 0
J(A,F) = 0
J(B,A) = 0
J(B,B) = 1
J(B,C) = 0.428571
J(B,D) = 0.25
J(B,E) = 0.5
J(B,F) = 0
J(C,A) = 0
J(C,B) = 0.428571
J(C,C) = 1
J(C,D) = 0
J(C,E) = 0.5
J(C,F) = 0
J(D,A) = 0
J(D,B) = 0.25
J(D,C) = 0
J(D,D) = 1
J(D,E) = 0.125
J(D,F) = 0.2
J(E,A) = 0
J(E,B) = 0.5
J(E,C) = 0.5
J(E,D) = 0.125
J(E,E) = 1
J(E,F) = 0
J(F,A) = 0
J(F,B) = 0
J(F,C) = 0
J(F,D) = 0.2
J(F,E) = 0
J(F,F) = 1```

## Wren

Library: Wren-set
Library: Wren-trait
Library: Wren-fmt

Note that the Set object in the above module is implemented as a Map and consequently the iteration order (and the order in which elements are printed) is undefined.

`import "./set" for Setimport "./trait" for Indexedimport "./fmt" for Fmt var jaccardIndex = Fn.new { |a, b|    if (a.count == 0 && b.count == 0) return 1    return a.intersect(b).count / a.union(b).count} var a = Set.new([])var b = Set.new([1, 2, 3, 4, 5])var c = Set.new([1, 3, 5, 7, 9])var d = Set.new([2, 4, 6, 8, 10])var e = Set.new([2, 3, 5, 7])var f = Set.new([8])var isets = Indexed.new([a, b, c, d, e, f])for (se in isets) {    var i = String.fromByte(se.index + 65)    var v = se.value    v = v.toList.sort() // force original sorted order    Fmt.print("\$s = \$n", i, v)}System.print()for (se1 in isets) {    var i1 = String.fromByte(se1.index + 65)    var v1 = se1.value    for (se2 in isets) {        var i2 = String.fromByte(se2.index + 65)        var v2 = se2.value        Fmt.print("J(\$s, \$s) = \$h", i1, i2, jaccardIndex.call(v1, v2))    }}`
Output:
```A = []
B = [1, 2, 3, 4, 5]
C = [1, 3, 5, 7, 9]
D = [2, 4, 6, 8, 10]
E = [2, 3, 5, 7]
F = [8]

J(A, A) = 1
J(A, B) = 0
J(A, C) = 0
J(A, D) = 0
J(A, E) = 0
J(A, F) = 0
J(B, A) = 0
J(B, B) = 1
J(B, C) = 0.428571
J(B, D) = 0.25
J(B, E) = 0.5
J(B, F) = 0
J(C, A) = 0
J(C, B) = 0.428571
J(C, C) = 1
J(C, D) = 0
J(C, E) = 0.5
J(C, F) = 0
J(D, A) = 0
J(D, B) = 0.25
J(D, C) = 0
J(D, D) = 1
J(D, E) = 0.125
J(D, F) = 0.2
J(E, A) = 0
J(E, B) = 0.5
J(E, C) = 0.5
J(E, D) = 0.125
J(E, E) = 1
J(E, F) = 0
J(F, A) = 0
J(F, B) = 0
J(F, C) = 0
J(F, D) = 0.2
J(F, E) = 0
J(F, F) = 1
```