string[] s = ["Hello", "brave", "new", "world"]; string[] t = ["Hello", "new", "world"]; auto simIter = gapWeightedSimilarityIncremental(s, t, 1.0); assert(simIter.front == 3); // three 1-length matches simIter.popFront(); assert(simIter.front == 3); // three 2-length matches simIter.popFront(); assert(simIter.front == 1); // one 3-length match simIter.popFront(); assert(simIter.empty); // no more match
Similar to gapWeightedSimilarity, just works in an incremental manner by first revealing the matches of length 1, then gapped matches of length 2, and so on. The memory requirement is O(s.length * t.length). The time complexity is O(s.length * t.length) time for computing each step. Continuing on the previous example:
The implementation is based on the pseudocode in Fig. 4 of the paper "Efficient Computation of Gapped Substring Kernels on Large Alphabets" by Rousu et al., with additional algorithmic and systems-level optimizations.