Hash join: Difference between revisions

From Rosetta Code
Content added Content deleted
mNo edit summary
(To draft task status)
Line 1: Line 1:
{{task}}[[wp:Hash Join|Hash Join]]
{{draft task}}[[wp:Hash Join|Hash Join]]
The classic hash join algorithm for an inner join of two relations has the following steps:
The classic hash join algorithm for an inner join of two relations has the following steps:
<ul>
<ul>

Revision as of 17:00, 27 November 2013

Hash join is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Hash Join

The classic hash join algorithm for an inner join of two relations has the following steps:

  • Hash phase : Creating a hash table for one of the two relations by applying a hash function to the join attribute of each row. Ideally we should create a hash table for the smaller relation. Thus, optimizing for creation time and memory size of the hash table.
  • Join phase : Scanning the larger relation and finding the relevant rows by looking in the hash table created before.

The algorithm is as follows:

for each tuple s in S do
{ hash on join attributes s(b)
  place tuples in hash table based on hash values};
for each tuple r do
{ hash on join attributes r(a)
  if r hashes in a nonempty bucket of hash table for S
  then
    {if r hash key matches any s in bucket concatenate r and s
    place relation in Q}};

Implement the Hash Join algorithm in your programming language (optionally providing a test case as well).