MySQL and Full Outer Join
In SQL, a full outer join is a join that returns all records from both tables, wheter there’s a match or not:
unfortunately MySQL doesn’t support this kind of join, and it has to be emulated somehow. But how?
In SQL it happens often that the same result can be achieved in different ways, and which way is the most appropriate is just a matter of taste (or of perormances).
But this time the question is a little controversial, even on StackOverflow not everyone agrees and the solution marked as correct isn’t actually the correct one.
Suppose we have the following tables:
Customers:
company_id | name |
---|---|
1 | Abc Company |
2 | Noise Inc. |
3 | Mr. Smith |
Partners:
company_id | name |
---|---|
2 | Noise Inc. |
4 | The Pages |
a full outer join would be written as:
and the expected result is:
name | name |
---|---|
Abc Company | |
Noise Inc. | Noise Inc. |
Mr. Smith | |
The Pages |
to get the same result we have to combine a left outer join query:
with a right outer join query:
(the right join is quite uncommon because it’s more difficult to read, and is equivalent to a left join with the order of the tables switched).
We could combine both queries with an UNION ALL
clause, but this would
return some duplicates (all rows where the join succeeds will be returned
twice).
We could then use an UNION
clause which will remove duplicates, but it
will fail if one of the table has no primary key or unique constraints, or
if the selected columns are not unique.
We could also use a different approach:
which is often a good solution, but would fail if we allow the company_id to be NULL in one or both tables (a full outer join will return those rows, while the previous one won’t).
The most general solution is this:
duplicates, if already present on the source tables, won’t be removed. And the anti-join pattern on the second query assures that we are not introducing new duplicated rows.