SQL Server Physical Joins

SQL Server Physical Join (Nested Loops joins, Merge joins, Hash joins)


Advanced Query Tuning Concepts
https://technet.microsoft.com/en-us/library/ms191426(v=sql.105).aspx

SQL Server employs three types of join operations:

  • Nested loops joins
  • Merge joins
  • Hash joins

 

Nest Loops Joins

If one join input is small (fewer than 10 rows) and the other join input is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because they require the least I/O and the fewest comparisons. For more information about nested loops, see Understanding Nested Loops Joins.

Nested Loops Join
https://blogs.msdn.microsoft.com/craigfr/2006/07/26/nested-loops-join/
OPTIMIZED Nested Loops Joins
https://blogs.msdn.microsoft.com/craigfr/2009/03/18/optimized-nested-loops-joins/

Merge Joins

If the two join inputs are not small but are sorted on their join column (for example, if they were obtained by scanning sorted indexes), a merge join is the fastest join operation. If both join inputs are large and the two inputs are of similar sizes, a merge join with prior sorting and a hash join offer similar performance. However, hash join operations are often much faster if the two input sizes differ significantly from each other. For more information, see Understanding Merge Joins.

Merge Join
https://blogs.msdn.microsoft.com/craigfr/2006/08/03/merge-join/

Hash Joins

Hash joins can efficiently process large, unsorted, nonindexed inputs. They are useful for intermediate results in complex queries because:

  • Intermediate results are not indexed (unless explicitly saved to disk and then indexed) and often are not suitably sorted for the next operation in the query plan.
  • Query optimizers estimate only intermediate result sizes. Because estimates can be very inaccurate for complex queries, algorithms to process intermediate results not only must be efficient, but also must degrade gracefully if an intermediate result turns out to be much larger than anticipated.

The hash join allows reductions in the use of denormalization. Denormalization is typically used to achieve better performance by reducing join operations, in spite of the dangers of redundancy, such as inconsistent updates. Hash joins reduce the need to denormalize. Hash joins allow vertical partitioning (representing groups of columns from a single table in separate files or indexes) to become a viable option for physical database design. For more information, see Understanding Hash Joins.

Hash Join
https://blogs.msdn.microsoft.com/craigfr/2006/08/10/hash-join/