Hash join: Difference between revisions

Content added Content deleted

Inline

Revision as of 17:00, 27 November 2013

The classic hash join algorithm for an inner join of two relations has the following steps:

Hash phase : Creating a hash table for one of the two relations by applying a hash function to the join attribute of each row. Ideally we should create a hash table for the smaller relation. Thus, optimizing for creation time and memory size of the hash table.
Join phase : Scanning the larger relation and finding the relevant rows by looking in the hash table created before.

The algorithm is as follows:

for each tuple s in S do
{ hash on join attributes s(b)
  place tuples in hash table based on hash values};
for each tuple r do
{ hash on join attributes r(a)
  if r hashes in a nonempty bucket of hash table for S
  then
    {if r hash key matches any s in bucket concatenate r and s
    place relation in Q}};

Implement the Hash Join algorithm in your programming language (optionally providing a test case as well).

@@ Line 1: / Line 1: @@
-{{task}}[[wp:Hash Join|Hash Join]]
+{{draft task}}[[wp:Hash Join|Hash Join]]
 The classic hash join algorithm for an inner join of two relations has the following steps:
 <ul>