This one is small, and the project around it is not mine. Holding Time is a public-art project on grief, breath, and tenderness, led by artists (I’m not one) and taken to Bradford for the Ablaze in Bradford exhibition. My role was consultation and a single small tool, nothing more. The page is here because the tool was a useful early experiment in what would later become a much bigger pattern in my work.
What the tool was
A Google Apps Script extension wired into the response sheet the project was using to collect visitor reflections. The extension exposed a custom sheet function and a small menu. You selected a range of responses and picked a prompt (summarise themes, extract verbatim quotes worth keeping, cluster by emotional tone, flag responses worth a longer read); the script then batched the calls to the OpenAI API and wrote the results into the destination cells.
That’s it. No bespoke interface, no dashboard, no AI evaluation framework. A few hundred lines of Apps Script, an API key in the script properties, a one-page cheat sheet for the team on how to use the menu, and a deliberate refusal to over-build.
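For a sense of how little code that shape needs, here is a minimal sketch in Apps Script. It is not the script the project ran. The prompt wording, the function names, the `OPENAI_API_KEY` property name, the `gpt-4o-mini` model, and the batch size of 20 are all assumptions made for illustration. It also covers only the menu path, not the custom sheet function.

```javascript
// Illustrative sketch only: every name and value below is an assumption.
const PROMPTS = {
  themes: 'Summarise the recurring themes in these visitor reflections.',
  quotes: 'Extract verbatim quotes worth keeping from these visitor reflections.',
  tone: 'Cluster these visitor reflections by emotional tone.',
  flag: 'Flag the reflections that deserve a longer read, and say why.',
};

// Split a list into fixed-size batches so each API call stays small.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build a Chat Completions request body for one batch of responses.
function buildPayload(promptKey, responses, model) {
  const numbered = responses.map((text, i) => `${i + 1}. ${text}`).join('\n');
  return {
    model: model,
    messages: [
      { role: 'system', content: PROMPTS[promptKey] },
      { role: 'user', content: numbered },
    ],
  };
}

// Apps Script only: send one payload to the API and return the reply text.
// The key is read from script properties, never from the sheet itself.
function callOpenAI(payload) {
  const key = PropertiesService.getScriptProperties().getProperty('OPENAI_API_KEY');
  const res = UrlFetchApp.fetch('https://api.openai.com/v1/chat/completions', {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Bearer ' + key },
    payload: JSON.stringify(payload),
    muteHttpExceptions: true,
  });
  if (res.getResponseCode() !== 200) {
    throw new Error('OpenAI request failed: ' + res.getContentText());
  }
  return JSON.parse(res.getContentText()).choices[0].message.content;
}

// Menu handler: read the first column of the selection, batch it, and write
// one result per batch into the column to the right of the selection.
function summariseSelection() {
  const range = SpreadsheetApp.getActiveRange();
  const responses = range.getValues()
    .map(row => String(row[0]))
    .filter(text => text.trim() !== '');
  chunk(responses, 20).forEach((batch, i) => {
    const reply = callOpenAI(buildPayload('themes', batch, 'gpt-4o-mini'));
    range.offset(i, range.getNumColumns(), 1, 1).setValue(reply);
  });
}

// Simple trigger: add the menu when the spreadsheet opens.
function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu('Reflections')
    .addItem('Summarise themes', 'summariseSelection')
    .addToUi();
}
```

The two pure helpers, `chunk` and `buildPayload`, carry the logic and can be checked outside Apps Script. Everything that touches `SpreadsheetApp`, `UrlFetchApp`, or `PropertiesService` only runs inside the sheet.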
Why this was a thing in 2024
In mid-2024, ChatGPT-via-API was new enough that using it to read survey data felt slightly magical to anyone who hadn’t seen it done before. The interesting thing wasn’t the model; it was the change in what was easy. Survey processing used to mean either (a) read every response yourself, or (b) hand them to a researcher with the time to code them. The thin layer between Google Sheets and the API turned it into a third option: read across the responses at speed, then go back and read the individual ones the model flagged as interesting.
The tool wasn’t trying to replace the team’s reading. It was trying to make the first pass fast enough that the team had time and energy left for the second pass. That distinction matters. Most of the worst applied-AI work I’ve seen tries to skip the second pass.
What it surfaced
The team used the extension across the Bradford run. It was good at the things you’d expect a 2024 LLM to be good at (clustering, pulling verbatim quotes, identifying recurring nouns and emotional registers) and bad at the things you’d expect it to be bad at (judging which responses were genuinely the most interesting versus which merely used the most distinctive language). The team kept doing the second pass themselves. That was the design.
What I took from it
The pattern (thin custom UI over an LLM, doing one job for a real team, deliberately small) is one I’ve reused several times since. The honest version of the page is this: I built a small tool for someone else’s project, and I learned that the LLM-over-spreadsheet pattern is more useful than it looks once you accept that the model is doing the boring 80 per cent of the read and the humans are doing the interesting 20.
For more on the project itself, see Holding Time’s own write-up of the Bradford exhibition. For more on the pattern, see Everything I vibe-coded this year.