Handling multiple Grascii options
In the previous post we handled stacked affixes in Grascii notation. Another case to handle is affixes that can take any of several forms, namely the prefix sub- and the suffix -self, which can both be represented by either S) or S(, as well as the suffix -selves, which can be represented by either S(S) or S)S(.
To handle multiple possibilities, we can use regex matches to split the string, then insert a collection where each match originally occurred. For that we can use the re-map function I shared in this post:
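(require '[clojure.string :as str])

;; Split s on re, apply f to each match, and interleave the results
;; with the unmatched pieces. ::padding keeps interleave from dropping
;; the final unmatched piece; empty strings and the padding are removed.
(defn re-map [re f s]
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

(re-map #"^sub" (constantly #{"S)" "S("}) "subject")
;; => (#{"S)" "S("} "ject")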
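To see why the padding is needed, it can help to look at the intermediate values. The sketch below repeats the re-map definition so that it runs on its own; the suffix pattern #"selves$" is an assumption chosen for illustration, not necessarily the regex the real replacement table produces.

```clojure
(require '[clojure.string :as str])

(defn re-map [re f s]
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

;; Prefix match: split leaves an empty string before the match,
;; so there is one more piece than there are matches.
(str/split "subject" #"^sub")        ;; => ["" "ject"]
(re-seq #"^sub" "subject")           ;; => ("sub")

;; Suffix match: split drops the trailing empty string,
;; so there are as many pieces as matches.
(str/split "themselves" #"selves$")  ;; => ["them"]

;; interleave stops at the shorter sequence, so without ::padding
;; the "ject" piece in the prefix case would be lost.
(re-map #"selves$" (constantly #{"S(S)" "S)S("}) "themselves")
;; => ("them" #{"S(S)" "S)S("})
```

In the prefix case the padding pairs up with the last piece and is then removed; in the suffix case it is simply never reached.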
This complicates the grascii function somewhat, but not unbearably. Instead of reducing a string, we'll reduce a sequence of strings and sets, starting with a one-element sequence containing the original word. Each replacement maps over the sequence: a set is returned as-is, while a string is passed to re-map, which replaces any matches with the replacement set. The resulting sequences are then spliced back into the sequence we're reducing:
(def replacements
  '[,,,
    sub- #{"S(" "S)"}
    ,,,
    -selves #{"S(S)" "S)S("}
    ,,,])
(defn grascii [word]
  (reduce (fn [segments [affix replacement]]
            (mapcat (fn [segment]
                      (if (string? segment)
                        (re-map (gregg affix) (constantly replacement) segment)
                        [segment]))
                    segments))
          [(str/lower-case word)]
          (partition 2 replacements)))

(grascii "subject")
;; => (#{"S)" "S("} "ject")

(grascii "themselves")
;; => ("them" #{"S)S(" "S(S)"})
That complexity allows us to handle multiple matches:
(grascii "subject themselves")
;; => (#{"S)" "S("} "ject them" #{"S)S(" "S(S)"})
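The snippets above depend on the gregg function and the full replacement table from the earlier posts. For readers who want to try the segment-splitting step in isolation, here is a minimal sketch with stand-ins: affix->re, sample-replacements, and segments are hypothetical names, and the rule that a prefix anchors at the start of the string and a suffix at the end is an assumption, not necessarily how gregg builds its patterns.

```clojure
(require '[clojure.string :as str])

(defn re-map [re f s]
  (remove #{"" ::padding}
          (interleave
           (str/split s re)
           (concat (map f (re-seq re s)) [::padding]))))

;; HYPOTHETICAL stand-in for gregg: a trailing hyphen marks a prefix
;; (anchored with ^), a leading hyphen marks a suffix (anchored with $).
(defn affix->re [affix]
  (let [n (name affix)]
    (if (str/ends-with? n "-")
      (re-pattern (str "^" (subs n 0 (dec (count n)))))
      (re-pattern (str (subs n 1) "$")))))

;; Only the two entries shown in the post; the real table has more.
(def sample-replacements
  '[sub-    #{"S(" "S)"}
    -selves #{"S(S)" "S)S("}])

(defn segments [word]
  (reduce (fn [segs [affix replacement]]
            (mapcat (fn [seg]
                      (if (string? seg)
                        ;; strings may still contain matches
                        (re-map (affix->re affix) (constantly replacement) seg)
                        ;; sets were produced by an earlier replacement
                        [seg]))
                    segs))
          [(str/lower-case word)]
          (partition 2 sample-replacements)))

(segments "Subject themselves")
;; => (#{"S(" "S)"} "ject them" #{"S(S)" "S)S("})
```

Because sets are passed through untouched, a later replacement can never match inside text that an earlier replacement already produced.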
To get back a list of all possible combinations, we can use a Cartesian product. Here's a function taken from this StackOverflow answer:
(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))
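As a quick check of what this function returns, here is a small worked example. It repeats the definition so it runs on its own, and uses vectors rather than sets as input so that the order of the results is predictable.

```clojure
(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))

;; An empty input yields a single empty combination.
(cartesian-product [])
;; => [[]]

;; The first collection varies fastest, because x is the inner binding.
(cartesian-product [["S)" "S("] ["ject them"] ["S)S(" "S(S)"]])
;; => (("S)" "ject them" "S)S(")
;;     ("S(" "ject them" "S)S(")
;;     ("S)" "ject them" "S(S)")
;;     ("S(" "ject them" "S(S)"))
```

A one-element collection such as ["ject them"] contributes the same value to every combination, which is why wrapping plain strings in a vector is enough to mix them with sets of alternatives.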
The final grascii function wraps each string in the reduced sequence in a vector, creates a Cartesian product of all the collections in the sequence, and maps those products to strings:
(defn grascii [word]
  (->> replacements
       (partition 2)
       (reduce (fn [segments [affix replacement]]
                 (mapcat (fn [segment]
                           (if (string? segment)
                             (re-map (gregg affix) (constantly replacement) segment)
                             [segment]))
                         segments))
               [(str/lower-case word)])
       (map (fn [segment]
              (if (string? segment)
                [segment]
                segment)))
       cartesian-product
       (map (partial apply str))))
(grascii "subject themselves")
;; => ("S)ject themS)S("
;;     "S(ject themS)S("
;;     "S)ject themS(S)"
;;     "S(ject themS(S)")