The Lein name coding procedure.
lein(word, maxCodeLen = 4, clean = TRUE)
Argument | Description |
---|---|
word | string or vector of strings to encode |
maxCodeLen | maximum length of the resulting encodings, in characters |
clean | if TRUE, return NA for inputs containing unknown alphabetical characters |
the Lein encoded character vector
The variable word is the name to be encoded. The variable maxCodeLen is the limit on how long the returned name code should be. The default is 4.
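As a rough illustration of what a length limit like maxCodeLen implies, the base R sketch below truncates a code that is too long and pads a short one with zeros. The helper fit_code and the raw codes passed to it are hypothetical, and the zero padding is an assumption inferred from the shape of the package's example output, not taken from the lein source.

```r
# Hypothetical helper, not a phonics function: fit a raw code to a fixed length.
# Assumes short codes are right-padded with "0".
fit_code <- function(code, maxCodeLen = 4) {
  code <- substr(code, 1, maxCodeLen)           # truncate codes that are too long
  pad <- strrep("0", maxCodeLen - nchar(code))  # zeros needed to reach the limit
  paste0(code, pad)
}

fit_code("W32")                     # "W320"
fit_code("S14252", maxCodeLen = 8)  # "S1425200"
fit_code("S14252")                  # "S142"
```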
The lein algorithm is only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to lein. For inputs outside of its known range, the output is undefined: NA is returned and a warning is thrown.
If clean is FALSE, lein attempts to process the strings. The default is TRUE.
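The input handling described above can be approximated in base R. This is a minimal sketch under stated assumptions, not the package's implementation: check_lein_input is a hypothetical helper that strips non-letters using the locale's own notion of a letter, then flags any name that still contains a character outside A-Z.

```r
# Hypothetical helper for illustration; not part of the phonics package.
check_lein_input <- function(word) {
  word <- toupper(word)
  # Remove characters the current locale does not classify as letters
  # (spaces, hyphens, digits, punctuation).
  stripped <- gsub("[^[:alpha:]]", "", word)
  # Letters that survive stripping but fall outside A-Z are treated as unknown.
  unknown <- grepl("[^A-Z]", stripped, perl = TRUE)
  data.frame(stripped = stripped, unknown = unknown, stringsAsFactors = FALSE)
}

# The result for the second name depends on the current locale.
check_lein_input(c("O'Brien-Smith 3rd", "M\u00fcller"))
```

With clean = TRUE, names flagged as unknown correspond to the cases where lein returns NA with a warning. With clean = FALSE, they are passed on to the encoder regardless.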
James P. Howard, II, "Phonetic Spelling Algorithm Implementations for R," Journal of Statistical Software, vol. 95, no. 8, (2020), p. 1--21, <10.18637/jss.v095.i08>.
Billy T. Lynch and William L. Arends. "Selection of surname coding procedure for the SRS record linkage system." United States Department of Agriculture, Sample Survey Research Branch, Research Division, Washington, 1977.
Other phonics: caverphone(), cologne(), metaphone(), mra_encode(), nysiis(), onca(), phonex(), phonics(), rogerroot(), soundex(), statcan()
lein("William")
#> [1] "W320"
lein(c("Peter", "Peady"))
#> [1] "P130" "P100"
lein("Stevenson", maxCodeLen = 8)
#> [1] "S1425200"