MySQL and Full Outer Join
In SQL, a full outer join is a join that returns all records from both tables, wheter there’s a match or not:
unfortunately MySQL doesn’t support this kind of join, and it has to be emulated somehow. But how?
In SQL it happens often that the same result can be achieved in different ways, and which way is the most appropriate is just a matter of taste (or of perormances).
But this time the question is a little controversial, even on StackOverflow not everyone agrees and the solution marked as correct isn’t actually the correct one.
Suppose we have the following tables:
a full outer join would be written as:
select c.name, p.name from customers c full outer join partners p on c.company_id = p.company_id
and the expected result is:
|Noise Inc.||Noise Inc.|
to get the same result we have to combine a left outer join query:
select c.name, p.name from customers c left join partners p on c.company_id = p.company_id
with a right outer join query:
select c.name, p.name from customers c right join partners p on c.company_id = p.company_id
(the right join is quite uncommon because it’s more difficult to read, and is equivalent to a left join with the order of the tables switched).
We could combine both queries with an
UNION ALL clause, but this would
return some duplicates (all rows where the join succeeds will be returned
We could then use an
UNION clause which will remove duplicates, but it
will fail if one of the table has no primary key or unique constraints, or
if the selected columns are not unique.
We could also use a different approach:
select c.name, t.name from (select company_id from customers UNION select company_id from partners) n left join customers c on n.company_id = c.company_id left join partners p on n.company_id = p.company_id
which is often a good solution, but would fail if we allow the company_id to be NULL in one or both tables (a full outer join will return those rows, while the previous one won’t).
The most general solution is this:
select c.name, p.name from customers c left join partners p on c.company_id = p.company_id union all -- don't remove duplicates select c.name, p.name from customers c right join partners p on c.company_id = p.company_id where c.company_id is null
duplicates, if already present on the source tables, won’t be removed. And the anti-join pattern on the second query assures that we are not introducing new duplicated rows.