Substring index

In computer science, a substring index is a data structure that supports substring search in a text or text collection in sublinear time. Given a document S of length n, or a set of documents D = {S^1, S^2, ..., S^d} of total length n, all occurrences of a pattern P can be located in o(n) time once the index has been built. (See Big O notation.)
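
As an illustration of how such an index can answer queries without scanning the whole text, the following is a minimal Python sketch of one well-known substring index, the suffix array. The function names `build_suffix_array` and `find_occurrences` are chosen here for illustration only, and the construction is deliberately naive (it sorts whole suffixes, which is far slower than the specialised algorithms used in practice). Only the query step is meant to show the sublinear behaviour: two binary searches over the sorted suffixes, each comparing at most |P| characters per step.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text, in sorted suffix order.

    Naive construction for illustration: sorting compares whole suffixes,
    so this is not an efficient way to build the index for large texts.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern in text, in increasing order."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in [first, lo) begins with pattern.
    return sorted(suffix_array[first:lo])


text = "banana"
suffix_array = build_suffix_array(text)
print(find_occurrences(text, suffix_array, "ana"))  # [1, 3]
```

Each binary search takes O(log n) steps of O(|P|) character comparisons, so locating the matching range costs O(|P| log n); reporting the positions adds time proportional to the number of occurrences.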

The phrase full-text index is also often used for an index of all substrings of a text. It is ambiguous, however, because it is also used for regular word indexes such as inverted files and for document retrieval. See full-text search.
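
The difference between a word index and an index of all substrings can be seen in a small Python sketch. The helper `build_word_index` is a hypothetical, simplified inverted file that splits the text on whitespace; real word indexes typically add tokenisation, normalisation, and ranking. Because it is keyed on whole words, it cannot answer a query for a fragment that lies inside a word, which is exactly the kind of query a substring index is meant to support.

```python
from collections import defaultdict


def build_word_index(text: str) -> dict[str, list[int]]:
    """Map each whitespace-separated word to the list of its word positions."""
    index = defaultdict(list)
    for position, word in enumerate(text.lower().split()):
        index[word].append(position)
    return dict(index)


text = "a substring index supports substring search"
word_index = build_word_index(text)

# A whole word is found through the word index.
print(word_index.get("substring", []))  # [1, 4]

# A fragment inside a word is not a key of the word index ...
print(word_index.get("string", []))  # []

# ... although it does occur in the text as a substring.
print("string" in text)  # True
```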

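Substring indexes include:

  * Suffix tree
  * Suffix array
  * N-gram index, an inverted file for all N-grams of the text
  * Compressed suffix array [1]
  * FM-index
  * LZ-index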

References

  1. R. Grossi and J. S. Vitter, Compressed Suffix Arrays and Suffix Trees, with Applications to Text Indexing and String Matching, SIAM Journal on Computing, 35(2), 2005, 378-407.
This article is issued from Wikipedia - version of the 1/5/2014. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.