GapWeightedSimilarityIncremental

Similar to gapWeightedSimilarity, just works in an incremental manner by first revealing the matches of length 1, then gapped matches of length 2, and so on. The memory requirement is O(s.length * t.length). The time complexity is O(s.length * t.length) time for computing each step. Continuing on the previous example:

The implementation is based on the pseudocode in Fig. 4 of the paper "Efficient Computation of Gapped Substring Kernels on Large Alphabets" by Rousu et al., with additional algorithmic and systems-level optimizations.

struct GapWeightedSimilarityIncremental (
Range
F = double
) if (
isRandomAccessRange!(Range) &&
hasLength!(Range)
) {}

Constructors

this
this(Range s, Range t, F lambda)

Constructs an object given two ranges s and t and a penalty lambda. Constructor completes in O(s.length * t.length) time and computes all matches of length 1.

Members

Functions

opSlice
GapWeightedSimilarityIncremental opSlice()
popFront
void popFront()

Computes the match of the popFront length. Completes in O(s.length * t.length) time.

Properties

empty
bool empty [@property getter]
front
F front [@property getter]

Examples

string[] s = ["Hello", "brave", "new", "world"];
string[] t = ["Hello", "new", "world"];
auto simIter = gapWeightedSimilarityIncremental(s, t, 1.0);
assert(simIter.front == 3); // three 1-length matches
simIter.popFront();
assert(simIter.front == 3); // three 2-length matches
simIter.popFront();
assert(simIter.front == 1); // one 3-length match
simIter.popFront();
assert(simIter.empty);     // no more match

Meta