In the previous post we used regular expressions to rewrite word affixes in Grascii notation. Our solution mostly works, but there are a couple of further considerations.
One case we didn't explore is when a word has both a disjoined affix and a joined affix on the same end. For example, the word relationship has the disjoined suffix -ship as well as the joined suffix -tion. Our regexes only match at the very beginning and end of words, so once -ship has been rewritten to ^SH, -tion is no longer at the end of the word, and relationship becomes relation^SH instead of relaSH^SH. We can account for disjoined Grascii affixes when we generate regexes:
(defn gregg [s]
  (-> (name s)
      (str/replace #"\*[aeiouy]|[aeiouy]\*" "[aeiouy]+")
      (str/replace #"^(.+)-$" "^([A-Z()|',._~-]+\\\\^)?$1")
      (str/replace #"^-(.+)$" "$1(\\\\^[A-Z()|',._~-]+)?\\$")
      re-pattern))
This is another reason to denote Grascii forms with upper-case, distinct from the original word's lower-case characters.
With this change we can match prefixes that are preceded by Grascii and suffixes that are followed by Grascii:
(gregg '-tion)
#"tion(\^[A-Z()|',._~-]+)?$"
(re-find (gregg '-tion) "relation^SH")
["tion^SH" "^SH"]
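The prefix side works the same way in mirror image. Here is a small sketch, assuming clojure.string is aliased as str and reusing the gregg definition from above so the snippet runs on its own. The de- affix and the input "K^decree" are illustrative choices, not taken from the replacements table.

```clojure
(require '[clojure.string :as str])

;; Same definition as above, repeated so this snippet is self-contained.
(defn gregg [s]
  (-> (name s)
      (str/replace #"\*[aeiouy]|[aeiouy]\*" "[aeiouy]+")
      (str/replace #"^(.+)-$" "^([A-Z()|',._~-]+\\\\^)?$1")
      (str/replace #"^-(.+)$" "$1(\\\\^[A-Z()|',._~-]+)?\\$")
      re-pattern))

;; A prefix pattern optionally captures a leading disjoined form ending in ^.
(gregg 'de-)
;; => #"^([A-Z()|',._~-]+\^)?de"

(re-find (gregg 'de-) "K^decree")
;; => ["K^de" "K^"]

;; With no disjoined form in front, the optional group is nil.
(re-find (gregg 'de-) "decree")
;; => ["de" nil]
```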
We want to carry any captured affix over into the result, which we can do with $1 in the replacement string:
(str/replace "relation^SH" (gregg '-tion) "SH$1")
"relaSH^SH"
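It is worth checking that the same replacement string still behaves when no disjoined suffix follows. The sketch below writes the -tion pattern as a literal (the name tion-pattern is ours, for illustration) and relies on the JVM's behavior, as far as we know, of substituting an empty string for a group that did not take part in the match.

```clojure
(require '[clojure.string :as str])

;; The suffix pattern for -tion, written out as a literal.
(def tion-pattern #"tion(\^[A-Z()|',._~-]+)?$")

;; A disjoined suffix follows: $1 carries it over unchanged.
(str/replace "relation^SH" tion-pattern "SH$1")
;; => "relaSH^SH"

;; Nothing follows: the optional group does not match,
;; so $1 contributes an empty string.
(str/replace "relation" tion-pattern "SH$1")
;; => "relaSH"
```

This is why a single replacement string can serve both cases without a separate branch for words that lack a disjoined suffix.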
Here's a revised grascii function:
(defn grascii [word]
  (reduce (fn [word [affix replacement]]
            (str/replace word (gregg affix)
                         (str replacement "$1")))
          (str/lower-case word)
          (partition 2 replacements)))
(grascii "relationship")
"relaSH^SH"
While this works for suffixes, prefixes end up in the wrong order:
(grascii "counterdecree")
"DK^cree"
We also have to check whether an affix is a prefix or a suffix and put the captured group on the correct side in the replacement string:
(defn grascii [word]
  (reduce (fn [word [affix replacement]]
            (str/replace word (gregg affix)
                         (if (str/ends-with? (name affix) "-")
                           (str "$1" replacement)
                           (str replacement "$1"))))
          (str/lower-case word)
          (partition 2 replacements)))
(grascii "counterdecree")
"K^Dcree"
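To try the final version end to end, here is a self-contained sketch. The replacements vector is a hypothetical four-entry stand-in for the full table from the previous post, and it assumes disjoined affixes are listed before the joined ones they sit next to, so that the joined patterns see the ^ forms already in place.

```clojure
(require '[clojure.string :as str])

;; gregg as defined earlier in this post.
(defn gregg [s]
  (-> (name s)
      (str/replace #"\*[aeiouy]|[aeiouy]\*" "[aeiouy]+")
      (str/replace #"^(.+)-$" "^([A-Z()|',._~-]+\\\\^)?$1")
      (str/replace #"^-(.+)$" "$1(\\\\^[A-Z()|',._~-]+)?\\$")
      re-pattern))

;; Hypothetical stand-in for the real table: affix symbol, then Grascii form.
(def replacements
  ['counter- "K^"
   '-ship    "^SH"
   'de-      "D"
   '-tion    "SH"])

;; Final grascii: the captured disjoined form goes before a prefix
;; replacement and after a suffix replacement.
(defn grascii [word]
  (reduce (fn [word [affix replacement]]
            (str/replace word (gregg affix)
                         (if (str/ends-with? (name affix) "-")
                           (str "$1" replacement)
                           (str replacement "$1"))))
          (str/lower-case word)
          (partition 2 replacements)))

(grascii "relationship")  ;; => "relaSH^SH"
(grascii "counterdecree") ;; => "K^Dcree"
(grascii "decree")        ;; => "Dcree"
```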