Definition: An algorithm that codes surnames phonetically by reducing each name to its first letter followed by up to three digits, where each digit stands for one of six groups of consonant sounds. Names that sound alike but are spelled differently receive the same code, which reduces matching problems.
Generalization (I am a kind of ...)
phonetic coding algorithm.
See also double metaphone, Jaro-Winkler, Caverphone, NYSIIS, Levenshtein distance.
Note: The algorithm was devised to code names recorded in US census records. The standard algorithm works best on European names. Variants have been devised for names from other cultures.
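Example: The definition above can be illustrated with a minimal Python sketch. It assumes the letter grouping commonly used for American Soundex (BFPV=1, CGJKQSXZ=2, DT=3, L=4, MN=5, R=6), pads short codes with zeros to three digits, and treats H and W as not separating two letters with the same code. These details are assumptions drawn from common descriptions of the census variant, not from this entry, and other variants differ. The names `CODES` and `soundex` are illustrative only.

```python
# Assumed letter groups for American Soundex; vowels, H, W and Y get no digit.
CODES = {}
for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in group:
        CODES[letter] = str(digit)


def soundex(name: str) -> str:
    """Return a four-character code: first letter plus three digits."""
    # Keep only the letters A-Z; anything else (spaces, hyphens, accents) is dropped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's own code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # assumption: H and W do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel sets prev to "", so a repeated code after it counts again

    # Pad with zeros, then cut to the first letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    for surname in ("Robert", "Rupert", "Ashcraft", "Tymczak", "Pfister", "Lee"):
        print(surname, soundex(surname))
```

Under these assumptions "Robert" and "Rupert" both map to R163, which is the kind of spelling variation the code is meant to absorb.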
Overview of Soundex.
Entry modified 13 December 2010.
HTML page formatted Tue Dec 6 16:16:32 2011.
Cite this as:
Paul E. Black, "soundex", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed., U.S. National Institute of Standards and Technology. 13 December 2010. (accessed TODAY) Available from: http://www.nist.gov/dads/HTML/soundex.html