exupero's blog

Improved re-map

While writing the previous post I discovered an edge case not handled by the re-map function (originally introduced in this post). If the entire string matches the regex, we get an empty sequence:

(require '[clojure.string :as str])

(defn re-map [re f s]
  (remove #{"" ::padding}
    (interleave
      (str/split s re)
      (concat (map f (re-seq re s)) [::padding]))))

(re-map #"hello" (constantly :match) "hello")
;; => ()

This happens because clojure.string/split drops trailing empty strings, so a string that matches in its entirety splits into an empty vector:

(str/split "hello" #"hello")
;; => []
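
The empty result comes from how split treats trailing empty strings. Here's a small sketch of that behavior, assuming the default limit used by the underlying java.util.regex.Pattern.split; the -1 limit appears only to show what gets discarded and isn't part of the original function:

```clojure
(require '[clojure.string :as str])

;; By default, empty strings at the end of the result are dropped.
(str/split "a,b,," #",")
;; => ["a" "b"]

;; A negative limit keeps them.
(str/split "a,b,," #"," -1)
;; => ["a" "b" "" ""]

;; When the whole string matches, both pieces are empty, so the default
;; call discards everything.
(str/split "hello" #"hello" -1)
;; => ["" ""]
```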

It doesn't matter that re-seq finds matches:

(re-seq #"hello" "hello")
;; => ("hello")

because interleave quits when it reaches the end of the shortest sequence.
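
A quick illustration of that truncation, using made-up inputs rather than anything from re-map itself:

```clojure
;; interleave stops as soon as any of its collections runs out,
;; so the extra "b" is never emitted.
(interleave ["a" "b"] [:x])
;; => ("a" :x)

;; With an empty first collection nothing is emitted at all, which is
;; the situation re-map ends up in when split returns [].
(interleave [] [:match :padding])
;; => ()
```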

To fix this, we can handle that edge case explicitly:

(defn re-map [re f s]
  (if (re-matches re s)
    [(f s)]
    (remove #{"" ::padding}
      (interleave
        (str/split s re)
        (concat (map f (re-seq re s)) [::padding])))))

(re-map #"hello" (constantly :match) "hello")
;; => [:match]

re-matches returns non-nil only when the entire string matches the regex, so the ^ and $ regex anchors aren't necessary.
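
To see that whole-string behavior concretely, here's a short comparison with re-find. re-find isn't used in re-map; it's included only for contrast, and the inputs are made up for illustration:

```clojure
;; re-matches succeeds only when the regex covers the entire string.
(re-matches #"hello" "hello")
;; => "hello"

(re-matches #"hello" "hello world")
;; => nil

;; re-find, by contrast, is satisfied by a match anywhere in the string.
(re-find #"hello" "hello world")
;; => "hello"

;; With capture groups, re-matches returns a vector instead of a string,
;; which is still truthy in the if test.
(re-matches #"h(e)llo" "hello")
;; => ["hello" "e"]
```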