String convolution between vectors of integers representing a pattern and a text is a widely used computational primitive in string processing. In this paper, we investigate the use of an algorithmic framework which exploits sequence repetitions (identified according to the Lempel-Ziv parsing technique, i.e., the LZ78 algorithm) to speed up conventional algorithms (based on Fast Fourier Transform) for the computation of convolution between a pattern and a text, when the text is long enough and the pattern is sufficiently small. In particular, we present a deterministic algorithm which, given a text T of length n (drawn from a constant size alphabet |SigmaT|) and a pattern P of length m (drawn from a constant size alphabet |SigmaP|), computes the convolution between P and T with time and space complexity O(n + (nm/logn)h), where h is the entropy of text T .
A faster algorithm for the computation of string convolutions using LZ78 parsing
FRESCHI, VALERIO;BOGLIOLO, ALESSANDRO
2010
Abstract
String convolution between vectors of integers representing a pattern and a text is a widely used computational primitive in string processing. In this paper, we investigate the use of an algorithmic framework which exploits sequence repetitions (identified according to the Lempel-Ziv parsing technique, i.e., the LZ78 algorithm) to speed up conventional algorithms (based on Fast Fourier Transform) for the computation of convolution between a pattern and a text, when the text is long enough and the pattern is sufficiently small. In particular, we present a deterministic algorithm which, given a text T of length n (drawn from a constant size alphabet |SigmaT|) and a pattern P of length m (drawn from a constant size alphabet |SigmaP|), computes the convolution between P and T with time and space complexity O(n + (nm/logn)h), where h is the entropy of text T .I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.