exupero's blog

Handling multiple Grascii options

In the previous post we handled stacked affixes in Grascii notation. Another case to handle is affixes that have more than one valid form: the prefix sub- and the suffix -self can each be written as either S) or S(, and the suffix -selves can be written as either S(S) or S)S(.

To handle multiple possibilities, we can use regex matches to split the string, then insert a set of the alternative forms where each match originally occurred. For that we can use the re-map function I shared in this post:

(require '[clojure.string :as str])
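(defn re-map [re f s]
  (remove #{"" ::padding}
    (interleave
      ;; A limit of -1 keeps trailing empty strings, so a match spanning
      ;; all of s, or adjacent matches at the end of s, aren't dropped.
      (str/split s re -1)
      ;; interleave stops at the shorter seq, so pad the matches by one.
      (concat (map f (re-seq re s)) [::padding]))))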
(re-map #"^sub" (constantly #{"S)" "S("}) "subject")
(#{"S)" "S("} "ject")
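
To make the role of ::padding clearer, here's a sketch that pulls the intermediate steps of that call apart. The names pieces, matches, woven, and result are only for illustration and aren't part of re-map itself:

```clojure
(require '[clojure.string :as str])

;; The text between matches. A match at the start leaves an empty first piece.
(def pieces (str/split "subject" #"^sub"))
;; ["" "ject"]

;; One replacement per match.
(def matches (map (constantly #{"S)" "S("}) (re-seq #"^sub" "subject")))
;; (#{"S)" "S("})

;; interleave stops at the shorter seq, so without the padding
;; the trailing "ject" would be lost.
(def woven (interleave pieces (concat matches [::padding])))
;; ("" #{"S)" "S("} "ject" ::padding)

;; Drop the empty pieces and the padding marker.
(def result (remove #{"" ::padding} woven))
;; (#{"S)" "S("} "ject")
```

There's always one more piece than there are matches, which is why a single padding value is enough.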

This complicates the grascii function somewhat, but not unbearably. Instead of reducing over a string, we'll reduce over a sequence of strings and sets, starting with a one-element sequence containing the original word. Each replacement maps over that sequence. A set is returned unchanged, since it already stands for a replaced affix. A string is passed to re-map, which replaces matches with a set, and the resulting pieces are spliced back into the sequence we're reducing:

(def replacements
  '[,,,
    sub- #{"S(" "S)"}
    ,,,
    -selves #{"S(S)" "S)S("}
    ,,,])
(defn grascii [word]
  (reduce (fn [segments [affix replacement]]
            (mapcat (fn [segment]
                      (if (string? segment)
                        (re-map (gregg affix) (constantly replacement) segment)
                        [segment]))
                 segments))
          [(str/lower-case word)]
          (partition 2 replacements)))
(grascii "subject")
(#{"S)" "S("} "ject")
(grascii "themselves")
("them" #{"S)S(" "S(S)"})

That extra complexity lets us handle multiple matches in the same input:

(grascii "subject themselves")
(#{"S)" "S("} "ject them" #{"S)S(" "S(S)"})
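
The gregg function used above comes from the previous post and isn't reproduced here. For anyone experimenting with these examples on their own, here's a rough stand-in. It's only a sketch: it assumes affixes are symbols whose hyphen marks where the rest of the word attaches, and that affixes match at word boundaries. The real implementation may well differ:

```clojure
;; Hypothetical stand-in for the gregg helper from the previous post.
;; Assumption: a trailing hyphen marks a prefix, a leading hyphen a suffix.
(defn gregg [affix]
  (let [s (name affix)]
    (cond
      ;; Prefix such as sub- : match at the start of a word.
      (= \- (last s))  (re-pattern (str "\\b" (subs s 0 (dec (count s)))))
      ;; Suffix such as -selves : match at the end of a word.
      (= \- (first s)) (re-pattern (str (subs s 1) "\\b"))
      ;; Anything else: match the text wherever it appears.
      :else            (re-pattern s))))

(re-find (gregg 'sub-) "subject")        ; matches "sub"
(re-find (gregg '-selves) "themselves")  ; matches "selves"
(re-find (gregg 'sub-) "unsubscribe")    ; nil, "sub" isn't at the start of a word
```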

To get back a list of all possible combinations, we can use a Cartesian product. Here's a function taken from this Stack Overflow answer:

(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))
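
To see what order the combinations come out in, here's a small check. It uses vectors in place of sets so the iteration order is predictable, and the segments value is just the earlier result for "subject themselves" written out by hand:

```clojure
(defn cartesian-product [colls]
  (if (empty? colls)
    [[]]
    (for [more (cartesian-product (rest colls))
          x (first colls)]
      (cons x more))))

;; Every element has to be a collection. A bare string would be
;; iterated character by character, so "ject them" sits in its own vector.
(def segments [["S)" "S("] ["ject them"] ["S)S(" "S(S)"]])

(cartesian-product segments)
;; (("S)" "ject them" "S)S(")
;;  ("S(" "ject them" "S)S(")
;;  ("S)" "ject them" "S(S)")
;;  ("S(" "ject them" "S(S)"))
```

The first collection varies fastest, because the for binds more in its outer loop and x in its inner one. The number of results is the product of the collection sizes, here 2 × 1 × 2 = 4.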

The final grascii function wraps each string in the reduced sequence in a vector, creates a Cartesian product of all the collections in the sequence, and maps those products to strings:

(defn grascii [word]
  (->> replacements
       (partition 2)
       (reduce (fn [segments [affix replacement]]
                 (mapcat (fn [segment]
                           (if (string? segment)
                             (re-map (gregg affix) (constantly replacement) segment)
                             [segment]))
                      segments))
               [(str/lower-case word)])
       (map (fn [segment]
              (if (string? segment)
                [segment]
                segment)))
       cartesian-product
       (map (partial apply str))))
(grascii "subject themselves")
("S)ject themS)S("
 "S(ject themS)S("
 "S)ject themS(S)"
 "S(ject themS(S)")