Soundex Matching
From Erlang Community
Problem
You want to generate Soundex hashes of surnames, either for "sounds-like" indexing in databases or for retrieving information from US Census records and similar pre-existing databases.
Solution
Note: This library does not exist yet. Scheme examples are shown for the time being.
Use the soundex library:
> (soundex "Smith")
"S530"
> (soundex "Smyth")
"S530"
Both current NARA Soundex and "old" Soundex are supported (soundex is an alias for soundex-nara):
> (soundex-nara "Ashcraft")
"A261"
> (soundex-old "Ashcraft")
"A226"
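Since the library above is not yet available, the difference between the two variants can be illustrated with a minimal Python sketch. The function name `soundex_sketch` and its `nara` flag are hypothetical and are not part of any existing library. The sketch assumes the commonly documented rules: under the NARA rule, letters with the same code separated by H or W are coded once, while the "old" rule is assumed here to treat H and W like vowels.

```python
# Letter-to-digit table for Soundex; vowels, Y, H and W carry no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(name, nara=True):
    """Return a four-character Soundex key, or None if name has no letters.

    Hypothetical helper for illustration only.
    """
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return None
    key = letters[0]
    prev = CODES.get(letters[0], "")  # code of the last letter that counted
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code:
            if code != prev:
                key += code
            prev = code
        elif ch in "HW" and nara:
            # NARA rule: H and W are transparent, so equal codes on
            # either side of them still collapse into one digit.
            continue
        else:
            # Vowels (and H/W under the assumed old rule) separate
            # repeated codes, so the next equal code is emitted again.
            prev = ""
        if len(key) == 4:
            break
    return (key + "000")[:4]  # pad short keys with zeros
```

With these assumptions, `soundex_sketch("Ashcraft")` yields "A261" and `soundex_sketch("Ashcraft", nara=False)` yields "A226", matching the keys shown above.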
Multiple Soundex keys based on prefix-skipping can be generated with the soundex-nara/prefixing, soundex-old/prefixing, and soundex/p procedures:
> (soundex/p "vanderlinden")
("V536" "D645" "L535")
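The prefix-skipping step can be sketched separately from the hashing itself. The Python fragment below only produces the candidate name forms; each form would then be passed to a Soundex function to obtain its key. The function `prefix_variants` and the `PREFIXES` list are hypothetical and chosen for illustration; the prefix list a real library would use is not specified in this recipe.

```python
# Assumed prefix list, for illustration only.
PREFIXES = ("van", "der", "de", "la", "le")


def prefix_variants(name, prefixes=PREFIXES):
    """Return the lowercased name plus each form left after stripping
    one more leading prefix, in order."""
    rest = name.lower()
    variants = [rest]
    stripped = True
    while stripped:
        stripped = False
        # Try longer prefixes first so "der" wins over "de".
        for prefix in sorted(prefixes, key=len, reverse=True):
            if rest.startswith(prefix) and len(rest) > len(prefix):
                rest = rest[len(prefix):]
                variants.append(rest)
                stripped = True
                break
    return variants
```

For "vanderlinden" this gives "vanderlinden", "derlinden" and "linden", which correspond to the three keys in the example above. A plain string match like this will also strip letters that merely look like a prefix (for example the "le" in "Lee"), so a real implementation would need a more careful rule.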
Soundex is a string hash historically used by the US Census for indexing surnames by a function of what they "sound" like, rather than their precise spelling. Further general information on Soundex is available at http://www.archives.gov/research_room/genealogy/census/soundex.html.
Soundex keys are represented as four-character strings, so the equal? procedure can be used to compare them:
> (equal? (soundex "Johnson") (soundex "Jackson"))
#f
> (equal? (soundex "Johnson") (soundex "JANZEN"))
#t
This doesn't apply to Erlang, and is only here as a placeholder until the library is implemented. Coming to a Jungerl near you...


