exupero's blog

Tips for long-running nREPL jobs

I've occasionally worked with moderate to large data-processing jobs in an nREPL, jobs that take long enough that the basic read-eval-print loop starts to be cumbersome. Here are a few tricks I've learned to improve the experience.

Depending on the nREPL, a long-running evaluation may block evaluation of other forms, preventing you from doing other work until it finishes. To avoid that, wrap the long-running form in future and assign the result to a var. Instead of:

(something-that-runs-for-a-while)

use

(def result (future (something-that-runs-for-a-while)))

The nREPL will return immediately and the function will run in the background. The downside is that the nREPL will no longer print the result when execution finishes to notify you that it's done; instead you'll have to poll with future-done? and get the result manually by dereferencing the var. On the other hand, you gain future-cancel to terminate processing, a capability not all nREPL clients offer for ordinary evaluations.
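As a minimal sketch of that workflow (using a stand-in job that just sleeps briefly rather than real processing):

```clojure
;; Stand-in for a long-running job: sleep, then return a value
(def result (future (Thread/sleep 100) :finished))

;; Poll without blocking; false while the job is still running
(future-done? result)

;; Dereferencing blocks until the value is ready
@result

;; Or abandon the job entirely:
;; (future-cancel result)
```

future-done?, deref, and future-cancel are all in clojure.core, so nothing needs to be required.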

The main problem with using a future is that futures tend to be opaque: you can't see how far along your job is until it finishes. Depending on the type of job, a long-running future may have hung, and it would be difficult to know. To see if a job is actually making progress, you can use a simple counter:

(def counter (atom 0))
(def result
  (future
    ;; doall forces the lazy sequence; without it, for produces a
    ;; lazy seq and no processing happens until the var is dereferenced
    (doall
      (for [item items]
        (do
          (swap! counter inc)
          (process item))))))

Then, while processing runs in another thread, you can occasionally eval @counter to check if it's increasing or if processing has somehow gotten stuck. If processing does get stuck, you may be able to use the count of processed items to find the one that's a problem, but often processing logic is more complicated and it's helpful to have the most recently handled item itself:

(def most-recent-item (atom nil))
(def result
  (future
    (doall
      (for [item items]
        (do
          (reset! most-recent-item item)
          (process item))))))

Or, instead of one item, you can collect intermediate results:

(def processed (atom []))
(def result
  (future
    (doseq [item items]
      (swap! processed conj (process item)))))

Collecting intermediate results allows you to access processed items before the future completes, which is especially helpful if the future hits an error and terminates prematurely. Once you've handled the error, you can restart the job but skip the items you've already processed.
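A sketch of that restart, assuming items are processed in order and the processed atom holds one result per completed item (the items and process here are hypothetical stand-ins):

```clojure
(def items (range 10))
(defn process [item] (* item item))  ; stand-in for real work
(def processed (atom [0 1 4]))       ; suppose three items finished before an error

;; Restart, skipping the items already handled
(def result
  (future
    (doseq [item (drop (count @processed) items)]
      (swap! processed conj (process item)))))

@result             ; block until the remaining items are done
(count @processed)  ; all ten results, old and new
```

Because doseq is eager and returns nil, dereferencing result here only serves to wait for completion; the accumulated values live in the atom.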

To me, affordances like this are a big argument in favor of REPL-based exploration. You can approximate these capabilities without a REPL by manually forking processes and flushing intermediate data to files that can be read from other threads, but the friction of serializing, deserializing, managing concurrent access, and resuming a script at the appropriate point is usually enough to reserve such workflows for only the most thorny situations. In Clojure, however, these tricks become almost too trivial to mention.