I keep some loosely structured habit and travel information in a set of EDN files. For a while I used an actual database, but I quickly discovered that I spent a lot of time tinkering with the UI for inputting data. I also tried spreadsheets, but the data is sparse and awkward in a tabular format, and spreadsheets aren't very easy to query. The scheme I've settled on is a folder of EDN files containing Clojure code that, when evaluated, produces a list of maps.
To parse an individual file, I wrap its contents in square brackets and read it as a vector of EDN forms:
(defn parse-file [s]
  (clojure.edn/read-string {} (format "[%s]" s)))
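To make the wrapping step concrete, here is a small check of what parse-file returns for a two-form file. The sample string is made up for illustration. The point is that the result is plain data (a vector of lists holding symbols and numbers) and that nothing has been evaluated yet.

```clojure
(require '[clojure.edn :as edn])

;; Same definition as above, repeated so the snippet stands alone.
(defn parse-file [s]
  (edn/read-string {} (format "[%s]" s)))

;; Hypothetical file contents: two top-level forms.
(def sample "(location 43 -85)\n(no-tv)")

(def forms (parse-file sample))

;; forms is a vector of unevaluated lists, one per top-level form.
(println (count forms))
(println (first forms))
```

Because the reader only produces data, functions such as location don't need to exist at this stage; they are resolved later, when each form is evaluated.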
Before evaluating a file, I bind the date in the filename to a dynamic var, then I evaluate the EDN forms as if they were Clojure code:
(def ^:dynamic *date*)

(defn eval-file [nmsp file]
  (binding [*date* (re-find #"\d{4}-\d{2}-\d{2}" (.getName file))]
    (->> (slurp file)
         parse-file
         (mapv (fn [form]
                 (binding [*ns* nmsp]
                   (eval form))))
         (remove var?)
         (mapv #(cond-> %
                  (not (:date %)) (assoc :date *date*))))))
In reality, EDN is only a subset of Clojure syntax. I can't use Clojure's reader macros, such as @ for dereferencing atoms, nor can I create anonymous functions with #(...). But ordinary forms such as def and defn do work. Because they evaluate to vars rather than maps, I discard any vars after evaluating the forms. Finally, for any maps without a :date field, I add the date given by the filename.
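The difference between the EDN reader and the full Clojure reader can be checked directly. The sketch below uses a hypothetical helper, try-read, that reports whether clojure.edn/read-string accepts a string. The sample strings are illustrative only.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical helper: :ok if s reads as EDN, :rejected if the reader throws.
(defn try-read [s]
  (try
    (edn/read-string s)
    :ok
    (catch Exception _ :rejected)))

;; A defn form is just lists, symbols, vectors, and maps, so it reads fine.
(println (try-read "(defn location [lat lon] {:lat lat :lon lon})"))

;; The deref reader macro is not part of EDN.
(println (try-read "@counter"))

;; Neither is the anonymous-function literal.
(println (try-read "#(inc %)"))
```

The equivalent long forms, (deref counter) and (fn [x] (inc x)), are ordinary lists, so they remain available in the data files.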
A lot of power hides in that eval. Though I'm generating a list of maps, I typically don't write out the maps explicitly; instead, having Clojure available, I define functions that return maps, then call those functions. For example, instead of writing
{:lat 43 :lon -85}
I'll write
(defn location [lat lon]
  {:lat lat :lon lon})

(location 43 -85)
Defining and invoking functions like this keeps the maps for specific pieces of information structured consistently, and allows me to easily change details of a map, such as the names of keys (e.g., changing :lat to :latitude). It also avoids typos in the names of keys: eval doesn't detect a misspelled key, but it will throw an error if I mistype the name of a function.
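The typo argument can be demonstrated with a short sketch. The helper eval-form and the var this-ns are introduced here for illustration; they mirror the way eval-file binds *ns* before evaluating a form.

```clojure
;; Capture the namespace where `location` is defined so that eval resolves
;; symbols there, mirroring the nmsp argument of eval-file.
(def this-ns *ns*)

(defn location [lat lon]
  {:lat lat :lon lon})

;; Hypothetical helper: evaluate one form in the captured namespace.
(defn eval-form [form]
  (binding [*ns* this-ns]
    (eval form)))

;; A misspelled key (:lno instead of :lon) evaluates without complaint.
(def typo-in-key (eval-form '{:lat 43 :lno -85}))

;; A misspelled function name fails as soon as the form is evaluated.
(def typo-in-fn
  (try
    (eval-form '(loaction 43 -85))
    (catch Exception _ :error)))

(println typo-in-key)
(println typo-in-fn)
```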
Using this style, I've built up a DSL of common entries. A typical file looks something like this:
(location 43 -85)
(outdoors 1030 1130)
(-> (walk)
    (route "park.geojson")
    (note "warm day, unusually busy"))
(no-tv)
Functions like route and note are modifier functions that add optional data to maps, and having most of the power of Clojure makes it easy to chain modifiers with ->.
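A minimal sketch of how such modifiers might be written follows. The function bodies and the :activity, :route, and :note keys are assumptions for illustration, not the actual definitions. The one property that matters is that each modifier takes the entry map as its first argument, which is what lets -> thread the map through.

```clojure
;; Hypothetical base entry; the :activity key is an assumption.
(defn walk []
  {:activity :walk})

;; Modifiers take the entry map first so that they thread with ->.
(defn route [entry file]
  (assoc entry :route file))

(defn note [entry text]
  (assoc entry :note text))

(def entry
  (-> (walk)
      (route "park.geojson")
      (note "warm day, unusually busy")))

(println entry)
```

Since each modifier returns a new map, modifiers can be applied in any order and omitted freely, which suits sparse data.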
Another subtlety in eval-file is that forms can refer to *date*. I don't reference it directly, but I do have tagged-literal readers called #yesterday and #tomorrow that read *date* and rebind it to the date before or after, in case I want to refer to events that extend beyond the date named by the file. To use them, I supply :readers in parse-file:
(defn parse-file [s]
  (clojure.edn/read-string
    {:readers {'yesterday yesterday
               'tomorrow tomorrow}}
    (format "[%s]" s)))
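The reader functions themselves aren't shown, so the following is only one way a #yesterday reader might work, under stated assumptions. It runs at read time, while *date* is bound to the file's date, and returns a form that rebinds *date* around the tagged form. The helpers shift-date and stamp, the this-ns var, and the sample entry are all hypothetical.

```clojure
(require '[clojure.edn :as edn])
(import '(java.time LocalDate))

(def ^:dynamic *date*)
(def this-ns *ns*)

;; Hypothetical helper: shift an ISO date string by n days (negative n goes back).
(defn shift-date [date-str n]
  (str (.plusDays (LocalDate/parse ^String date-str) (long n))))

;; Assumed reader: wraps the tagged form in a binding of *date* to the
;; previous day. Syntax-quote qualifies binding and *date*, so the result
;; evaluates correctly in any namespace.
(defn yesterday [form]
  `(binding [*date* ~(shift-date *date* -1)] ~form))

;; Hypothetical entry function that reads *date* when evaluated.
(defn stamp [m]
  (assoc m :date *date*))

(defn parse-file [s]
  (edn/read-string {:readers {'yesterday yesterday}} (format "[%s]" s)))

;; Read with the file's date bound, as eval-file does.
(def forms
  (binding [*date* "2024-03-01"]
    (parse-file "#yesterday (stamp {:event \"late flight\"})")))

(def result
  (binding [*ns* this-ns]
    (eval (first forms))))

(println result)
```

In this sketch the entry carries its own :date, so the final step of eval-file, which only fills in a missing :date, leaves it alone.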
Here's the main entry point, which reads the files in a directory:
(defn read-data [directory]
  (let [nmsp *ns*]
    (->> (file-seq (clojure.java.io/file directory))
         (filter #(re-find #"\d{4}-\d{2}-\d{2}" (.getName %)))
         (sort-by #(.getName %))
         (mapcat #(eval-file nmsp %)))))
Note that the files are evaluated in the current namespace, which means a function defined within one file is available in all subsequent files. Alternatively, you can group function definitions in a library file and evaluate it before processing the data files.
Querying this DB is as easy as getting a list of maps from read-data and filtering them with Clojure's built-in collection functions. The above code can all be run in Babashka, so I have a suite of bb tasks that print reports on various aspects of the data (such as how much time I spend outdoors). Those plain-text reports can then be piped to commands that generate data visualizations.
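As a rough illustration of such a query, the sketch below totals time spent outdoors. The entry shape is an assumption: it supposes that outdoors produces maps with a :type key plus :start and :end times written as HHMM integers, and the sample entries stand in for the output of read-data.

```clojure
;; Hypothetical entries, shaped as read-data might return them.
(def entries
  [{:date "2024-03-09" :type :outdoors :start 1030 :end 1130}
   {:date "2024-03-09" :type :walk :route "park.geojson"}
   {:date "2024-03-10" :type :outdoors :start 900 :end 945}])

;; Convert an HHMM integer such as 1030 to minutes since midnight.
(defn hhmm->minutes [t]
  (+ (* 60 (quot t 100)) (rem t 100)))

;; Duration of one entry in minutes.
(defn duration [{:keys [start end]}]
  (- (hhmm->minutes end) (hhmm->minutes start)))

(defn outdoors? [entry]
  (= :outdoors (:type entry)))

;; Total minutes outdoors across all entries.
(defn minutes-outdoors [entries]
  (->> entries
       (filter outdoors?)
       (map duration)
       (reduce + 0)))

;; Minutes outdoors per date, as a map of date string to minutes.
(defn minutes-outdoors-by-date [entries]
  (reduce (fn [acc entry]
            (update acc (:date entry) (fnil + 0) (duration entry)))
          {}
          (filter outdoors? entries)))

(println (minutes-outdoors entries))
(println (minutes-outdoors-by-date entries))
```

A report task could print the per-date map one line at a time, which keeps the output easy to pipe into a plotting command.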
Overall, this setup allows me to have a low-maintenance log that produces well-structured, queryable data, yet is also sufficiently human-readable.