Hash join: Difference between revisions
Content added Content deleted
mNo edit summary |
(To draft task status) |
||
Line 1: | Line 1: | ||
{{task}}[[wp:Hash Join|Hash Join]] |
{{draft task}}[[wp:Hash Join|Hash Join]] |
||
The classic hash join algorithm for an inner join of two relations has the following steps: |
The classic hash join algorithm for an inner join of two relations has the following steps: |
||
<ul> |
<ul> |
Revision as of 17:00, 27 November 2013
Hash join is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
The classic hash join algorithm for an inner join of two relations has the following steps:
- Hash phase : Creating a hash table for one of the two relations by applying a hash function to the join attribute of each row. Ideally we should create a hash table for the smaller relation. Thus, optimizing for creation time and memory size of the hash table.
- Join phase : Scanning the larger relation and finding the relevant rows by looking in the hash table created before.
The algorithm is as follows:
for each tuple s in S do { hash on join attributes s(b) place tuples in hash table based on hash values}; for each tuple r do { hash on join attributes r(a) if r hashes in a nonempty bucket of hash table for S then {if r hash key matches any s in bucket concatenate r and s place relation in Q}};
Implement the Hash Join algorithm in your programming language (optionally providing a test case as well).