In the previous post we handled stacked affixes in Grascii notation. Another case to handle is affixes that have more than one valid form: the prefix sub- and the suffix -self can each be written as either S) or S(, and the suffix -selves as either S(S) or S)S(.
To handle multiple possibilities, we can use regex matches to split the string, then insert a collection where each match originally occurred. For that we can use the re-map function I shared in this post:
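(require '[clojure.string :as str])

(defn re-map [re f s]
  ;; Interleave the unmatched pieces of s with (f match) for each match.
  ;; ::padding keeps interleave from dropping the final unmatched piece.
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

(re-map #"^sub" (constantly #{"S)" "S("}) "subject")
;; => (#{"S)" "S("} "ject")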
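To see why re-map needs the ::padding keyword, it may help to try it on inputs where the match falls in the middle or at the end of the string. The following is a small sketch, not part of the Grascii code; the regexes and input strings are made up for illustration, and re-map is repeated so the snippet runs on its own.

```clojure
(require '[clojure.string :as str])

;; Same re-map as above, repeated so this snippet is self-contained.
(defn re-map [re f s]
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

;; Matches in the middle: str/split yields one more piece than there are
;; matches, so without ::padding interleave would stop before "c".
(re-map #"\d+" #(Long/parseLong %) "a1b22c")
;; => ("a" 1 "b" 22 "c")

;; Match at the very end: str/split drops the trailing empty string, so
;; there are as many pieces as matches and ::padding is never emitted.
(re-map #"[aeiou]" str/upper-case "banana")
;; => ("b" "A" "n" "A" "n" "A")
```

In both cases the unmatched pieces and the mapped matches alternate in their original order, which is what lets us swap a match for a set without losing its position.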
This complicates the grascii function somewhat, but not unbearably. Instead of reducing a string, we'll reduce a sequence whose items are either strings or sets, starting with a one-item sequence containing the lower-cased word. Each replacement maps over that sequence: a set is returned unchanged, while a string is passed through re-map to replace any matches with a set, and the resulting pieces are spliced back into the sequence we're reducing:
(def replacements
  '[,,,
    sub- #{"S(" "S)"}
    ,,,
    -selves #{"S(S)" "S)S("}
    ,,,])

(defn grascii [word]
  (reduce (fn [segments [affix replacement]]
            (mapcat (fn [segment]
                      (if (string? segment)
                        ;; strings may still contain affixes to replace
                        (re-map (gregg affix) (constantly replacement) segment)
                        ;; sets are earlier replacements; pass them through
                        [segment]))
                    segments))
          [(str/lower-case word)]
          (partition 2 replacements)))

(grascii "subject")
;; => (#{"S)" "S("} "ject")

(grascii "themselves")
;; => ("them" #{"S)S(" "S(S)"})
That complexity allows us to handle multiple matches:
(grascii "subject themselves")
;; => (#{"S)" "S("} "ject them" #{"S)S(" "S(S)"})
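To watch the sequence grow one replacement at a time, we can swap reduce for reductions. This is only a sketch: gregg isn't defined in this post, so the hypothetical sketch-replacements table below pairs hand-written regexes with their sets directly, and replace-pass is an illustrative name for the reducing function. The real table and regexes may differ.

```clojure
(require '[clojure.string :as str])

;; Same re-map as above, repeated so this snippet is self-contained.
(defn re-map [re f s]
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

;; Hypothetical stand-in for the replacements table: regexes written by
;; hand instead of affix symbols converted by gregg.
(def sketch-replacements
  [[#"\bsub"    #{"S)" "S("}]
   [#"selves\b" #{"S(S)" "S)S("}]])

;; One pass of the reduction: split strings around matches, keep sets as-is.
(defn replace-pass [segments [re replacement]]
  (mapcat (fn [segment]
            (if (string? segment)
              (re-map re (constantly replacement) segment)
              [segment]))
          segments))

;; reductions returns the initial value followed by every intermediate result.
(reductions replace-pass ["subject themselves"] sketch-replacements)
;; => (["subject themselves"]
;;     (#{"S)" "S("} "ject themselves")
;;     (#{"S)" "S("} "ject them" #{"S(S)" "S)S("}))
```

The set inserted by the first pass is skipped by the second, so a later replacement can never rewrite the output of an earlier one.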
To get back a list of all possible combinations, we can use a Cartesian product. Here's a function taken from this StackOverflow answer:
(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))
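As a quick check of what this function returns, here is a small sketch that calls it on ordered collections, where the output order is predictable. The inputs are made up for illustration, and the function is repeated so the snippet runs on its own.

```clojure
;; Same cartesian-product as above, repeated so this snippet is self-contained.
(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))

;; One item from each collection; the first collection varies fastest.
(cartesian-product [[1 2] [:a :b]])
;; => ((1 :a) (2 :a) (1 :b) (2 :b))

;; No collections at all: a single, empty combination.
(cartesian-product [])
;; => [[]]

;; An empty collection anywhere means there are no combinations.
(cartesian-product [[1 2] []])
;; => ()
```

When the collections are sets rather than vectors, the same combinations come back, but their order follows the sets' iteration order, which Clojure does not guarantee.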
The final grascii function wraps each string in the reduced sequence in a vector, creates a Cartesian product of all the collections in the sequence, and maps those products to strings:
(defn grascii [word]
  (->> replacements
       (partition 2)
       (reduce (fn [segments [affix replacement]]
                 (mapcat (fn [segment]
                           (if (string? segment)
                             (re-map (gregg affix) (constantly replacement) segment)
                             [segment]))
                         segments))
               [(str/lower-case word)])
       ;; wrap plain strings so every segment is a collection of options
       (map (fn [segment]
              (if (string? segment)
                [segment]
                segment)))
       cartesian-product
       ;; join each combination of options back into one string
       (map (partial apply str))))

(grascii "subject themselves")
;; => ("S)ject themS)S(" "S(ject themS)S(" "S)ject themS(S)" "S(ject themS(S)")