schwartzSort

Alternative sorting method that should be used when comparing keys involves an expensive computation.

Instead of using less(a, b) for comparing elements, schwartzSort uses less(transform(a), transform(b)). The values of the transform function are precomputed in a temporary array, thus saving on repeatedly computing it. Conversely, if the cost of transform is small compared to the cost of allocating and filling the precomputed array, sort may be faster and therefore preferable.

This approach to sorting is akin to the Schwartzian transform, also known as the decorate-sort-undecorate pattern in Python and Lisp. The complexity is the same as that of the corresponding sort, but schwartzSort evaluates transform only r.length times (less than half when compared to regular sorting). The usage can be best illustrated with an example.

  1. SortedRange!(R, ((a, b) => binaryFun!less(unaryFun!transform(a), unaryFun!transform(b)))) schwartzSort(R r)
    SortedRange!(R, (
    (
    a
    ,
    b
    )
    => binaryFun!less(unaryFun!transform(a), unaryFun!transform(b))
    )
    )
    schwartzSort
    (
    alias transform
    alias less = "a < b"
    R
    )
    (
    R r
    )
  2. auto schwartzSort(R r)

Parameters

transform

The transformation to apply. Either a unary function (unaryFun!transform(element)), or a binary function (binaryFun!transform(element, index)).

less

The predicate to sort the transformed elements by.

ss

The swapping strategy to use.

r R

The range to sort.

Return Value

Type: SortedRange!(R, (
(
a
,
b
)
=> binaryFun!less(unaryFun!transform(a), unaryFun!transform(b))
)
)

The initial range wrapped as a SortedRange with the predicate (a, b) => binaryFun!less(transform(a), transform(b)).

Examples

uint hashFun(string) { ... expensive computation ... }
string[] array = ...;
// Sort strings by hash, slow
sort!((a, b) => hashFun(a) < hashFun(b))(array);
// Sort strings by hash, fast (only computes arr.length hashes):
schwartzSort!(hashFun, "a < b")(array);

The schwartzSort function might require less temporary data and be faster than the Perl idiom or the decorate-sort-undecorate idiom present in Python and Lisp. This is because sorting is done in-place and only minimal extra data (one array of transformed elements) is created.

To check whether an array was sorted and benefit of the speedup of Schwartz sorting, a function schwartzIsSorted is not provided because the effect can be achieved by calling isSorted!less(map!transform(r)).

import std.algorithm.iteration : map;
import std.numeric : entropy;

auto lowEnt = [ 1.0, 0, 0 ],
     midEnt = [ 0.1, 0.1, 0.8 ],
    highEnt = [ 0.31, 0.29, 0.4 ];
auto arr = new double[][3];
arr[0] = midEnt;
arr[1] = lowEnt;
arr[2] = highEnt;

schwartzSort!(entropy, "a > b")(arr);

assert(arr[0] == highEnt);
assert(arr[1] == midEnt);
assert(arr[2] == lowEnt);
assert(isSorted!("a > b")(map!(entropy)(arr)));

Meta