Definition: (1) The smallest number of insertions, deletions, and substitutions required to change one string or tree into another. (2) A Θ(m × n) algorithm to compute the distance between strings, where m and n are the lengths of the strings.
Also known as edit distance.
Generalization (I am a kind of ...)
string matching with errors.
Aggregate child (... is a part of or used in me.)
See also double metaphone, soundex, Jaro-Winkler, Hamming distance.
Note: (1) From Algorithms and Theory of Computation Handbook, page 14-35, Copyright © 1999 by CRC Press LLC. Appearing in the Dictionary of Computer Science, Engineering and Technology, Copyright © 2000 CRC Press LLC.
Wikipedia entry which has links to implementations. March 2003 pictures of Levenshtein at a reception.
Vladimir I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Doklady Akademii Nauk SSSR, 163(4):845-848, 1965 (Russian). English translation in Soviet Physics Doklady, 10(8):707-710, 1966.
(Doklady is Russian for "Report". Sometimes transliterated in English as Doclady or Dokladi.)
If you have suggestions, corrections, or comments, please get in touch with Paul Black.
Entry modified 22 June 2015.
HTML page formatted Mon Jul 20 14:31:11 2015.
Cite this as:
Algorithms and Theory of Computation Handbook, CRC Press LLC, 1999, "Levenshtein distance", in Dictionary of Algorithms and Data Structures [online], Vreda Pieterse and Paul E. Black, eds. 22 June 2015. (accessed TODAY) Available from: http://www.nist.gov/dads/HTML/Levenshtein.html