The function metaphone phonetically encodes the given string using the metaphone algorithm.

metaphone(word, maxCodeLen = 10L, clean = TRUE)

Arguments

word

string or vector of strings to encode

maxCodeLen

maximum length of the resulting encodings, in characters

clean

if TRUE, return NA for unknown alphabetical characters

Value

a character vector containing the metaphones of word, with NA for any element of word that is NA

Details

There is some discrepancy over how the metaphone algorithm actually works. For instance, the Apache Commons library for Java provides one version and PHP provides another, and the two do not produce the same results. On the questionable theory that the PHP implementation is probably better known, this code should match its output.

This implementation is based on a JavaScript implementation, which is itself based on the PHP internal implementation.

The argument maxCodeLen limits the length of the returned metaphone.
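
As a rough illustration of maxCodeLen, the sketch below encodes the same words with the default limit and with a shorter one, then prints the code lengths side by side. It assumes the phonics package is installed; the word list and the limit of 2 are arbitrary choices for illustration, and no particular output is asserted.

```r
# Illustrative inputs; any character vector would do.
words <- c("wheel", "school", "benji")

if (requireNamespace("phonics", quietly = TRUE)) {
  # Default limit (maxCodeLen = 10L) versus a deliberately short one.
  full_codes  <- phonics::metaphone(words)
  short_codes <- phonics::metaphone(words, maxCodeLen = 2L)

  # Compare the encodings and their lengths side by side.
  print(data.frame(word = words,
                   full = full_codes, full_len = nchar(full_codes),
                   short = short_codes, short_len = nchar(short_codes)))
}
```

A short limit makes more words share a code, which trades precision for recall when matching names.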

The metaphone algorithm is only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to metaphone. For inputs outside of its known range, the output is undefined, so NA is returned and a warning is thrown. If clean is FALSE, metaphone attempts to process the strings anyway. The default is TRUE.
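
The cleaning behavior described above can be approximated in base R to check inputs before encoding. This is a minimal sketch under stated assumptions, not the package's internal code: the variable names and sample words are hypothetical, and the two regular expressions are one plausible reading of "locale-dependent stripping" followed by an "A-Z" check.

```r
# Hypothetical inputs chosen for illustration.
words <- c("New-York 42", "M\u00fcller", "wheel")

# Step 1: drop anything the current locale does not treat as a letter
# (spaces, hyphens, digits). What counts as a letter depends on the locale.
stripped <- gsub("[^[:alpha:]]", "", words)

# Step 2: flag strings that still contain letters outside A-Z (either case).
# Assumption: these correspond to the inputs that yield NA when clean = TRUE.
outside_az <- grepl("[^A-Za-z]", stripped)

print(data.frame(word = words, stripped = stripped, outside_az = outside_az))
```

Entries flagged in outside_az are the ones that, per the description above, should come back as NA with a warning under the default clean = TRUE. Transliterating them to plain "A-Z" beforehand is one way to avoid that.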

References

James P. Howard, II, "Phonetic Spelling Algorithm Implementations for R," Journal of Statistical Software, vol. 95, no. 8, (2020), pp. 1-21, <doi:10.18637/jss.v095.i08>.

See also

Other phonics: caverphone(), cologne(), lein(), mra_encode(), nysiis(), onca(), phonex(), phonics(), rogerroot(), soundex(), statcan()

Examples

metaphone("wheel")
#> [1] "WL"
metaphone(c("school", "benji"))
#> [1] "SXL" "BNJ"