NIST

string matching

(classic problem)

Definition: The problem of finding occurrence(s) of a pattern string within another string or body of text. There are many different algorithms for efficient searching.

Also known as exact string matching, string searching, text searching.

Specialization (... is a kind of me.)
brute force string search, Knuth-Morris-Pratt algorithm, Boyer-Moore, Zhu-Takaoka, quick search, deterministic finite automata string search, Karp-Rabin, Shift-Or, Aho-Corasick, Smith algorithm.

See also string matching with errors, optimal mismatch, phonetic coding, string matching on ordered alphabets, suffix tree, inverted index.

Note: For large collections that are searched often, it may be far faster, though more complicated, to start with an inverted index. The name "exact string matching" is in contrast to string matching with errors.

Author: PEB

Implementation

Christian Charras' and Thierry Lecroq's Exact String Matching Algorithms (C), (C++ and Pascal), Strmat (C) - a collection of string matching and pattern discovery programs, (Fortran), (Fortran).
Go to the Dictionary of Algorithms and Data Structures home page.

If you have suggestions, corrections, or comments, please get in touch with Paul Black.

Entry modified 2 September 2014.
HTML page formatted Mon Feb 2 13:10:40 2015.

Cite this as:
Paul E. Black, "string matching", in Dictionary of Algorithms and Data Structures [online], Vreda Pieterse and Paul E. Black, eds. 2 September 2014. (accessed TODAY) Available from: http://www.nist.gov/dads/HTML/stringMatching.html