Phonics in JSS

Monday October 12, 2020

• phonics • articles • R • XSEDE • data science • demography • computational linguistics •

I am very happy to say my new article, “Phonetic Spelling Algorithm Implementations for R,” was published in the Journal of Statistical Software this morning. This article is the culmination of a few years of work, off and on, with phonetic spelling algorithms for record linkage. The package supports a wide variety of historical and current phonetic spelling algorithms:

- Caverphone
  - Original Caverphone
  - Caverphone 2
- Cologne (Kölner)
- Lein
- Match Rating Approach
  - Encoder
  - Comparison
- Metaphone
- New York State Identification and Intelligence System
  - NYSIIS
  - Modified NYSIIS
- Oxford Name Compression Algorithm
- Phonex
- Roger Root
- Soundex
  - Original Soundex
  - Apache Refined Soundex
- Statistics Canada
  - Census Modified
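
For readers who have not met these algorithms before, here is a minimal sketch of the classic American Soundex rules, written in plain Python. It is an illustration only and not the package's implementation: the function name `soundex`, the `length` parameter, and the choice to drop non-ASCII letters are assumptions made for this example, and the package may treat edge cases differently.

```python
# Illustrative sketch of classic American Soundex; not the package's code.
# Map each coded consonant to its Soundex digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str, length: int = 4) -> str:
    """Return a Soundex-style code for `name` (hypothetical helper)."""
    # Keep only ASCII letters; accented characters are dropped (an assumption).
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    out = [first]
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    # Truncate or zero-pad to the requested length.
    return "".join(out)[:length].ljust(length, "0")


if __name__ == "__main__":
    for example in ["Robert", "Rupert", "Ashcraft", "Tymczak"]:
        print(example, soundex(example))
```

Names that sound alike collapse to the same code (here both "Robert" and "Rupert" map to `R163`), which is what makes these encodings useful as blocking or matching keys in record linkage.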

It’s been pleasing to get the occasional email from a user who has been putting it to use. It all started because I needed a high-speed Soundex, and I just let it grow from there. This yak may not be shaved yet. Maybe a version in Rust, next…

This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. In particular, it used the Comet system at the San Diego Supercomputer Center (SDSC) through allocations TG-DBS170012 and TG-ASC150024.