30 Jun 2010

Types of Oracle joins

Just as a brush up of Oracle join understanding.

Suppose we are joining two tables in Oracle.

SELECT T1.X, T2.Y
FROM T1 JOIN T2 ON T1.X = T2.X

Assume, T1 is driving table (i.e. more rows than T2)

Nested Loop join
------------------

for each record in T1
find matching record in T2 where T1.X = T2.X
fetch that record into result set

Now if there is no index on column X in T2, Oracle will perform full table scan for each record in T1.

Hash join
---------

Oracle will load create an in-memory index (knowns as hash table) for T2 (i.e. smaller table)

It will still perform similar operation

for each record in T1
find matching record in T2 where T1.X = T2.X
fetch that record into result set

But even if there is no index on T2.X, Oracle will use the hash table which works similar to index!

Clearly this is suitable only when T2 is small otherwise [1] hash table may not fit in memory [2] creation of hash table may involve a long time (then we could have created the index on T2 in first place!)

Sorted Merge join
-------------------

If both tables are very big, hash join is not suitable due to memory/time requirement and nested loop will be slow or will result in full table scan.

The alternative way to speed up performance is to use sorted merge join.

In this case, both tables are sorted by Oracle (using temporary tablespace) by joining key (X in this case).

The advantage is, when joining two tables, Oracle just needs to scan only a small part of the table (e.g. as they are sorted, to find a record in T2 when T1.X = 15 will require Oracle search just from X=10 to X=20 in T2 as it cannot be beyond this range and so on).

The trade off is, it will take some resource (space/time/memory) to sort the tables.

Usually Oracle chooses best method of join based on statistics avaialble. However, user may force using a different type of join using hints like use_hash(T1,T2) etc.

No comments:

Post a Comment