One question left unanswered was the computational complexity
of
the
`apply_permutation`

function.

Here it is again:

template<typename T> void apply_permutation( std::vector<T>& v, std::vector<int>& indices) { using std::swap; // to permit Koenig lookup for (size_t i = 0; i < indices.size(); i++) { auto current = i; while (i != indices[current]) { auto next = indices[current]; swap(v[current], v[next]); indices[current] = current; current = next; } indices[current] = current; } }

The outer `for`

loop runs `N` times;
that's easy to see.
The maximum number of iterations of the
inner `while`

loop is a bit less obvious,
but if you understood the discussion,
you'll see that it runs at most `N` times
because that's the maximum length of a cycle in the
permutation.
(Of course, this analysis requires that the `indices`

be a permutation of 0 … `N` − 1.)

Therefore,
a naïve analysis would conclude that this
has worst-case running time of `O`(`N`²)
because the outer `for`

loop runs `N` times,
and the inner `while`

loop also runs `N` times
in the worst case.

But it's not actually that bad.
The complexity is only `O`(`N`),
because not all of the worst-case scenarios can exist
simultaneously.

One way to notice this is to observe that each item
moves only once, namely to its final position.
Once an item is in the correct position, it never
moves again.
In terms of indices, observe that each swap corresponds
to an assignment `indices[current] = current`

.
Therefore, each entry in the index array gets set to its
final value.
And the `while`

loop doesn't iterate at all
if `indices[current] == current`

,
so when we revisit an element that has already moved
to its final location, we do nothing.

Since each item moves at most only once,
and there are `N` items, then the total number
of iterations of the inner `while`

loop
is at most `N`.

Another way of looking at this is that the
`while`

loop walks through every cycle.
But mathematics tell us that permutations decompose
into disjoint cycles,
so
the number of elements involved in the cycles cannot
exceed the total number of elements.

Either way, the conclusion is that there are most
`N` iterations of the inner `while`

loop
in total.
Combine this with the fixed overhead of `N` iterations
of the `for`

loop,
and you see that the total running time complexity is
`O`(`N`).

(We can determine via inspection that the algorithm consumes constant additional space.)

Friday was "part 2" and this is "part 4." Was "part 3" lost or is this just an off-by-one error?

Anyway, I've really enjoyed this recent series. Thanks!

They're actually numbered by powers of 2. Next installment would be "part 8".

Look a couple entries previous. This entry was titled "Applying a permutation to a vector". The Thursday and Friday articles were about a specific application of the general algorithm.

Raymond is interleaving the "Permutations of a Vector" series and the "Sorting by Indices" series. You have to read the whole title, not just the suffix.

Ah, thanks. I didn't realize they were being interleaved.

Another way to look at this is to observe that the permutation of a vector is actually a matrix-vector multiplication operation, with the matrix P being an NxN binary matrix such that P[i][indices[i]] = 1 (and zeros everywhere else). Then, the naive cost of the permutation is the cost of matrix-vector multiple, which is N^2. However, since this is a binary matrix, we don't need to multiply entries by 0, only those with a nonzero value. There is exactly one nonzero entry per row, hence the cost is only N. Vector permutations come up all the time in Linear Algebra operations such as Gaussian elimination with pivoting.

But it takes

O(N) operations to find that nonzero value in each row, so if you are using matrix multiplication, the cost isO(N²).Raymond, you don't need to search for the nonzero entry because we already have its location: The nonzero entry in row i indices[i]. (although if we carry the matrix analogy through , we'll need an additional O(N) storage to store the permuted vector.)

If you use a sparse matrix, you only have to operate on the n nonzero values, and it goes back to being O(n)

Noting that if you are allowed

O(N) additional storage, then the algorithm is trivial, as demonstrated in part 1.The while loop was a big hint that something was going on.