Sunday 26 June 2011

The right, left outer join!

Faced an interesting scenario today while working on the correctness of a query. What I didn't know was that in an outer join, the SQL statement 

col(+) = val 

is not quite the same as

col = val
or col is NULL.

In the first case, the RDBMS will pick up all values of that column whether they are NULL or NOT NULL, and display it as NULL if the column value does not match val. The number of rows remains unaffected.

In the second case, however, the RDBMS will pick up only those rows which has the column value = val as well as those rows which are NULL (right table NULL rows which don't match due to the left outer join, OR which are just NULL);
Illustration below:

SQL> create table emp
  2  ( emp_id number,
  3  emp_name varchar2(30),
  4  dept_no number);

Table created.


SQL> create table dept
  2  (dept_id number,
  3  dept_name varchar2(30)
  4  );

Table created.



SQL> insert into emp values(100, 'Adam', 10);
1 row created.

SQL> insert into emp values(200, 'Bud', 20);
1 row created.

SQL> insert into emp values(300, 'Carling', 10);
1 row created.

SQL> insert into emp values(400,'Don', NULL);
1 row created.

SQL> insert into emp values(500, 'Emma',40);
1 row created.

SQL> insert into dept values(10, 'Accounts');
1 row created.

SQL> insert into dept values(20, 'Sales');
1 row created.


SQL> insert into dept values(30, 'IT');
1 row created.

SQL> commit;
Commit complete.


SQL> column emp_name format a8;


SQL> select *
  2  from emp,
  3  dept
  4  where emp.dept_no = dept.dept_no;
where emp.dept_no = dept.dept_id;


    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       100 Adam             10         10 Accounts
       200 Bud              20         20 Sales
       300 Carling          10         10 Accounts


SQL> select *
  2  from emp,
  3  dept
  4  where emp.dept_no = dept.dept_id(+)
  5  order by emp_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       100 Adam             10         10 Accounts
       200 Bud              20         20 Sales
       300 Carling          10         10 Accounts
       400 Don
       500 Emma             40


SQL> select *
  2  from emp,
  3  dept
  4  where emp.dept_no = dept.dept_id(+)
  5  and dept.dept_name (+) = 'Sales'
  6  order by 1;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       100 Adam             10
       200 Bud              20         20 Sales
       300 Carling          10
       400 Don
       500 Emma             40


Note that rows which have dept_name = 'Accounts' are also picked up, but the column value is displayed as NULL since it does not match 'Sales'.
SQL> select *
  2  from emp,
  3  dept
  4  where emp.dept_no = dept.dept_id(+)
  5  and (dept.dept_name = 'Sales'
  6       or
  7       dept.dept_name is null)
  8  order by emp_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       200 Bud              20         20 Sales
       400 Don
       500 Emma             40


This is what I was expecting - to filter out non-Sales rows if present.

The same example with ANSI join syntax:


SQL> select *
  2  from emp
  3  left outer join dept
  4  on emp.dept_no = dept.dept_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       300 Carling          10         10 Accounts
       100 Adam             10         10 Accounts
       200 Bud              20         20 Sales
       400 Don
       500 Emma             40

SQL> select *
  2  from emp
  3  left outer join dept
  4  on emp.dept_no = dept.dept_id
  5  where dept.dept_name = 'Sales'
  6  order by emp_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       200 Bud              20         20 Sales

SQL> select *
  2  from emp
  3  left outer join dept
  4  on emp.dept_no = dept.dept_id
  5  and dept.dept_name = 'Sales'
  6  order by emp_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       100 Adam             10
       200 Bud              20         20 Sales
       300 Carling          10
       400 Don
       500 Emma             40

Note that (+) behaviour in the ANSI join when the filter condition is placed in the join condition rather than in the where clause.


SQL> select *
  2  from emp
  3  left outer join dept
  4  on emp.dept_no = dept.dept_id
  5  where (dept_name = 'Sales'
  6        or
  7        dept_name is NULL)
  8  order by emp_id;

    EMP_ID EMP_NAME    DEPT_NO    DEPT_ID DEPT_NAME
---------- -------- ---------- ---------- ------------------------------
       200 Bud              20         20 Sales
       400 Don
       500 Emma             40

No comments: