The Outer Limits of SQL JOINs
06/27/2001First off this week, thanks for your patience during the recent breaks in this column -- I'm now the proud father of new baby (our first), and as you may imagine, I needed a little time to get into the swing of balancing new responsibilities with the rest of my life. The good news is that this is just the first of a large stack of new aboutSQL columns which should keep us on track for the next several months.
To refresh everyone's memory, we've started talking about SQL JOIN statements over the past few columns -- one of the most fundamentally important areas of SQL development. In the last column, we talked about the various types of joins, and I promised to cover "outer joins" this time around to complement the discussions of inner and cross joins from the past two columns. So with no further ado....
The SQL outer join
Every type of SQL join effectively multiplies the row in one table by the rows in the other table that is participating in the join (and which can be extended to include as many tables are participating in the JOIN operation).
When we discussed the Cartesian product (the "cross join"), we saw the raw results of a JOIN operation. The INNER JOIN statement provided a useful filter to that raw result by picking out only rows where the key fields have the same value. The result of the INNER JOIN operation is one row for each key value that exists in both joined tables. But what about situations where there are rows in one table that do not have a corresponding match in the other table? An inner join ignores these rows.
An outer join is used to include the rows that are "missing" in an inner join. Using our ongoing CD database as an example may make things clearer. So far, we've created joins that pull our artist data and CD data together to create a catalog of the CD collection. But let's say I've heard a new artist on the radio and immediately decide to put them in the database -- David Gray, for example.
|
Artists
|
|
| ArtistID | ArtistName |
| 1 | Peter Gabriel |
| 2 | Bruce Hornsby |
| 3 | Lyle Lovett |
| 4 | Beach Boys |
| 5 | David Gray |
If I create an inner join between the Artists and CD tables, I get the following result.
SELECT * FROM Artists, CDs WHERE Artists.ArtistID=CDs.ArtistID
| CDID | CDs.ArtistID | Artists.ArtistID | ArtistName | Title | Year |
| 1 | 1 | 1 | Peter Gabriel | So | 1984 |
| 2 | 1 | 1 | Peter Gabriel | Us | 1992 |
| 3 | 2 | 2 | Bruce Hornsby | The Way It Is | 1986 |
| 4 | 2 | 2 | Bruce Hornsby | Scenes from the Southside | 1990 |
| 5 | 1 | 1 | Peter Gabriel | Security | 1990 |
| 6 | 3 | 3 | Lyle Lovett | Joshua Judges Ruth | 1992 |
| 7 | 4 | 4 | Beach Boys | Pet Sounds | 1966 |
These results accurately describe my catalog. But what about David Gray and all the others I'm planning on adding to the catalog? They completely disappear from the result because there is no matching row where CDs.ArtistID=5. Big deal you may say, but what if the two tables were Customers and Orders instead? Or maybe Orders and Items?
|
Also in aboutSQL: |
In the first case, only customers with orders would show up in the report; in the second case, only items that had actually been ordered would show up. This sort of "missing" information could completely ruin some types of calculations! An "outer join" operation will fix this problem.
Using the outer join
Outer joins come in three distinct flavors:
LEFT OUTER JOIN (*=)RIGHT OUTER JOIN (=*)FULL OUTER JOIN
Knowing that the purpose of an outer join is to include the "missing" or unmatched rows, you can probably figure out what each of these flavors means. The LEFT, RIGHT, and FULL syntax all describe which of the table's unmatched columns to include. If we wanted to fix the CD collection example, we could change the inner join to the following outer join:
SELECT * FROM Artists, CDs WHERE Artists.ArtistID *= CDs.ArtistID
The LEFT OUTER JOIN operator ensures that all rows on the "left" side of the join, in this case the Artists table, will be included. It is important to note that "left" and "right" are completely dependent on the order of the tables in the SQL statement. The following SQL statements are all identical:
SELECT * FROM Artists, CDs WHERE Artists.ArtistID *= CDs.ArtistID
SELECT * FROM CDs, Artists WHERE CDs.ArtistID =* Artists.ArtistID
SELECT * FROM Artists LEFT OUTER JOIN CDs ON Artists.ArtistID = CDs.ArtistID
SELECT * FROM CDs RIGHT OUTER JOIN Artists ON CDs.ArtistID = Artists.ArtistID
And as you can probably gather, a "full outer join" includes all unmatched rows from both tables in the result.
The outer join is particularly useful for creating aggregate data reports. You could count the orders placed for each item in an inventory table using a SQL statement something like the following:
SELECT Inventory.InventoryID, COUNT(Orders.OrderID)
FROM Inventory LEFT OUTER JOIN Orders
ON Inventory.InventoryID = Orders.InventoryID
GROUP BY Inventory.InventoryID
This would count all of the orders for each inventory row, even those with no matching orders (this example assumes that the records in the Orders table contains orders for single inventory items). You could also use this technique to count the orders per customer, inventory per vendor, or any other type of relationship where it is useful to know that some rows in one table have no corresponding values.
Next steps
This week we took a look at the OUTER JOIN operator and some real-world examples. Next week, we jump into final basic type of join -- the self join. Until then, feel free to contact me with your comments and questions. Believe it or not, I read all of the email you send me and answer the vast majority of it! Some of the questions coming in will make it into future columns after we've covered the basics of SQL.
John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.
Return to ONLamp.com.
You must be logged in to the O'Reilly Network to post a talkback.
Showing messages 1 through 17 of 17.
-
Great Tutorial - Thanks!
2007-02-26 23:10:50 RiteshFromIndia [Reply | View]
This is much much better than listening my boring lectures in the class .
Thanks Again!
-
Great Tutorial - Thanks!
2007-02-26 23:10:36 RiteshFromIndia [Reply | View]
This is much much better than listening my boring lectures in the class .
Thanks Again!
-
Very good tutorial
2006-11-30 06:04:32 sync_or_swim [Reply | View]
Thank you very much, this article did exactly what it says on the tin!!!! Very easy to understand!
-
Left Join
2006-02-22 07:43:25 RathnaAsh [Reply | View]
I'm using left join to display results when there is no value in the right table. But I'm not getting any records. I'm running the following query in SQL Server:
SELECT
a.prono, a.stateabrev, a.agname, isnull(count(distinct p.policy_number),0)uploadpolicies
FROM agency_info a LEFT OUTER JOIN policy p
ON a.prono = p.agency_code
Where a.prono = '4444032'
and isnull(p.status, '') = 'C' and isnull(p.state, '') like 'VT'
and isnull(p.POS_Status, '') <> ''
and ltrim(rtrim(len(isnull(p.policy_number, ''))))=7
and isnull(p.received_at_aig, '9999-12-31') between '2006-01-16' and '2006-02-10'
group by a.prono, a.stateabrev, a.agname
-
Query
2006-02-16 12:33:57 ShaktiArora [Reply | View]
Hi,i have a problem in outer join statements.
i am trying to use OR clause with outer join and not getting a required result.what would be the correct answer for the following question
My question is:
Display the last name, job id, department id, and department name for all employees who do not work in department 110. The result will include any employee whose department id does not exist in the department table.
please reply me as soon as you can because i have my exam tomorrow.
thankyou.
Regards,
Shakti Arora
-
thank u very much
2006-01-22 19:09:32 DB2KT [Reply | View]
This article helped me to know the basics of Outerjoins.
Once again thank u very much and we are expecting many more such an articles from u.
Thank u
Ram
-
Thanks!
2004-01-20 15:06:16 anonymous2 [Reply | View]
One of the few articles google returned about joins which actually helped me understand them.
-
Thanks!
2003-12-04 08:25:47 anonymous2 [Reply | View]
This article was very helpful to me. I hadn't done outer joins in a couple years but your article refreshed my memory.
thanks again,
Andrew
-
very nice articles....
2003-08-08 07:52:16 anonymous2 [Reply | View]
HI
i'm very new to these topics.. ur articles made
me confident. can u send these articles for me....
thanks allot....
mahesh@lcsi.net
-
Cooool Explanation
2003-05-23 08:55:53 anonymous2 [Reply | View]
I really like the way you have explained the concept of outer join
-
feedback
2002-11-20 21:32:38 anonymous2 [Reply | View]
this was a very good article. joins couldn't be any simpler from now on.
-
Missed out.
2002-11-10 09:05:12 anonymous2 [Reply | View]
The article was really covering the various possibilities of the outer join except one.
There can be a requirement when the records that are in excess in one of the tables alone should be listed. Is there any method to get the records that are present in Table 1 but not in Table 2.
Note:- This does not pertain to a single column, it should be applicable over a set of columns.
Eg:- Table 1 has cols 1 and 2. Table 2 has cols 1 and 2. Requirement is to list out the records in Table 1 which does not have a matching record for cols 1 and 2 (both taken together together) in Table 2. Thanks..
-
Joins
2002-09-30 11:20:13 anonymous2 [Reply | View]
The article on joins was good. I have a question regarding the performance .how joins affects the performance of application. Can I get some guidelines on this.Thank you.
My mail Id is:
ss_rekha@hotmail.com
Thank you.
-
Outer join with >1 condition
2002-06-25 09:33:11 jonathan.spencer@citex.com [Reply | View]
I need to do an outer join between two tables where the join uses two columns in each table - eg a.company=b.company AND a.project=b.project (I know, bad design - happily not by me ;-). I can't get this to work - I've tried various combinations of syntax - and I can't find any references to this type of join anywhere, all the examples use just one condition. Is that because it isn not possible, or just because it is unusual?
Any help much appreciated!
-
Left Outer Join
2001-12-24 07:16:59 shevyb [Reply | View]
I am having a problem with a sql statement which i am trying to run. I am doing a left outer join and then i am also doing 2 other inner joins in the same select statment. when i do all this together i get the correct amount of rows however the information from the table which i am outer joining to always comes up as null. I found that when i eliminate the other 2 inner joins then i get the information.
I would appreciate if you can help me with this.
thanks.






mysql> select * from girls full outer join boys where girls.city = boys.city;
where :
drop table girls;
drop table boys;
create table girls (name varchar(12), city varchar(12));
create table boys (name varchar(12), city varchar(12));
insert into girls values('Mary', 'Boston');
insert into girls values('Nancy', null);
insert into girls values('Susan', 'Chicago');
insert into girls values('Betty', 'Chicago');
insert into girls values('Anne', 'Denver');
insert into boys values('John', 'Boston');
insert into boys values('Henry', 'Boston');
insert into boys values('George', null);
insert into boys values('Sam', 'Chicago');
insert into boys values('James', 'Dallas');
If needed, I can go into insert and create for you!
BTW left and right work OK!
BTW what do P(T1,T2), P1(T1,T2) and R( mean in the following
SELECT * T1 LEFT JOIN T2 ON P1(T1,T2)
WHERE P(T1,T2) AND R(T2)
my academic theory book doesn't cover this!