riverrun, past Eve and Adam’s, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth Castle and Environs.

In this post I introduce riverrun, an LLM-powered agent on Bluesky with a persistent memory architecture built on Letta, designed to mimic the chaotic, allusive style of Finnegans Wake. Or, as the community has taken to calling these agents: a synth (synthetic person/entity). This project was inspired by the success of Void on Bluesky.

The Quest to Automate Finnegans Wake

Since ChatGPT debuted to the public, I've been on a quest to test AI against Finnegans Wake, the eccentric novel by James Joyce. LLMs are remarkably good at interpreting it (likely because much of the exegesis appears in their training data). But I'm more interested in whether they can replicate the style of the Wake. I think of the novel as the last bastion for AI writing, combining so many hurdles that it is nearly impossible for an LLM to write it well. So I'm drawn to the intersection of the novel and AI precisely because I think of it as a final test, not because I want to recreate the Wake through automated means. Put another way, I think it may be a futile endeavor. But I'm drawn to the challenge.

Over years of trying, it quickly became apparent that LLMs weren't up to the task, for quite a few reasons:

  • They simply can't grasp how to make interesting wordplay most of the time. A common failure mode is merely producing compound words, frequently even just repeating the same word ('quitquit').

  • Finnegans Wake doesn't really have characters so much as archetypes or themes that recur in various forms. LLMs don't keep long-range memory of these archetypes, so they can't have them reappear in different guises as persistent meta-entities.

  • They don't use common idioms and memes to their advantage to layer in additional meaning. Half the fun of the novel is reading a passage and recognizing some mutated phrase.

  • They fail to layer phonetic meaning on top of lexical meaning. Essentially, Joyce wrote words to carry two or three meanings given the context of the page or book, then added still more meaning when the words are read aloud!

  • They reach only for the most common references within any given semantic neighborhood in latent space. The results are generic, average.

  • Their fusion of multiple languages into new words is awkward most of the time (though arguably it's awkward when humans do it too).

I have gone through many prompts across many models, and each has its own flavor of failure modes. I have even tried to develop an automated benchmark that evaluates different models on this task using a ring of LLMs judging each other's work.
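The judging ring can be sketched roughly like this. Everything below is a toy stand-in, not the actual benchmark code: the model names and the generator/judge callables are placeholders where real LLM API calls would go.

```python
# A minimal sketch of a round-robin "ring" benchmark: each model produces a
# Wakese attempt, and every *other* model scores it as a judge.
from statistics import mean
from typing import Callable, Dict

def ring_benchmark(
    models: Dict[str, Callable[[str], str]],         # name -> passage generator
    judges: Dict[str, Callable[[str, str], float]],  # name -> scorer (0-10)
    prompt: str,
) -> Dict[str, float]:
    """Each model writes a passage; every other model scores it."""
    scores: Dict[str, float] = {}
    for name, generate in models.items():
        passage = generate(prompt)
        # Exclude the author from its own jury to reduce self-preference bias.
        jury = [judge(prompt, passage) for j, judge in judges.items() if j != name]
        scores[name] = mean(jury)
    return scores

# Toy stand-ins so the sketch runs without any API keys.
models = {
    "model_a": lambda p: "riverrun remurmured",
    "model_b": lambda p: "quitquit",  # the compound-word failure mode
}
judges = {
    "model_a": lambda p, text: 2.0 if "quit" in text else 7.0,
    "model_b": lambda p, text: 3.0 if "quit" in text else 8.0,
}
print(ring_benchmark(models, judges, "Write a sentence in the style of the Wake."))
# -> {'model_a': 8.0, 'model_b': 2.0}
```

In the real version, each judge would be prompted with a rubric (wordplay density, phonetic layering, and so on) rather than a keyword check.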

I plan to write up another blog post on the non-computational nature of the Wake that I keep running up against, which is what keeps me coming back to it.

riverrun

riverrun is my latest experiment: an agent interacting with a social media network, forming memories, and storing its used phrases and newly coined words, in an attempt to see how close we can get to something one might call Wakese. It's a community effort to help steer the construction of the text, with the hope of incorporating a true cacophony of voices.

The Midden

A core motif of Finnegans Wake is the midden, a garbage dump of human history, of text. riverrun constructs a midden from Bluesky users' posts to provide fresh, random (and hopefully interesting) context for its posts. It relies on user interactions to steadily build this 'midden' of references for later semantic burial in a heap of polysemousness.

  • riverrun follows a memory block architecture similar to Void's. It has a short system prompt and some permanently attached core memory blocks: a persona block, a writing style block, and interaction rules. It also dynamically creates a user block for each user that interacts with it on Bluesky, attaching and detaching these depending on who is interacting with it in any given thread. To help manage its memory, it has a 'sleeptime agent' that summarizes and cleans its various memory blocks as they fill up. In the future I hope to incorporate this subagent into the actual posting on Bluesky for some added fun and emergent behavior; I imagine it replying to and jesting with riverrun, play-acting as its id or subconscious.

  • Why this architecture? To preserve memories across longer time spans and many posts. The Wake has almost no stable characters; rather, it teems with kaleidoscopic archetypes that appear in various forms throughout the text. The goal with memory is to have the network's users contribute to an evergrowing pantheon of forms, memes, types, and phrases that continually recirculate through riverrun's posts in new guises.

  • riverrun stores all these memories in a shared memory block from which it can draw and remix. Since this block is always attached, it is always in the context window, so no tool call is needed to incorporate the memories.
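As a rough illustration of the block layout described above (not the actual Letta implementation; every class and field name here is hypothetical), the attach/detach behavior might be modeled like this:

```python
# Hypothetical model of the memory-block layout: always-attached core blocks,
# plus per-user blocks that are attached only while that user is in the thread.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MemoryBlock:
    label: str
    value: str = ""

@dataclass
class AgentMemory:
    # Core blocks (persona, writing style, interaction rules, shared midden)
    # stay in the context window permanently.
    core: Dict[str, MemoryBlock]
    # Per-user blocks persist across threads but are attached on demand.
    user_blocks: Dict[str, MemoryBlock] = field(default_factory=dict)
    attached_users: List[str] = field(default_factory=list)

    def attach_user(self, handle: str) -> None:
        """Create the user's block on first contact, then attach it."""
        self.user_blocks.setdefault(handle, MemoryBlock(label=f"user:{handle}"))
        if handle not in self.attached_users:
            self.attached_users.append(handle)

    def detach_user(self, handle: str) -> None:
        if handle in self.attached_users:
            self.attached_users.remove(handle)  # block persists, just detached

    def context_window(self) -> List[MemoryBlock]:
        """Everything currently in context: core + attached user blocks."""
        return list(self.core.values()) + [
            self.user_blocks[h] for h in self.attached_users
        ]
```

The key property is that detaching a user removes their block from the context window without deleting it, so the memory survives until the next thread.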

Passive Collaborative Writing

Finally, a new feature I'm nearly done with has riverrun comb its followers' posts every 2 hours and select a few to remix into a daily story. "Seed posts" are chosen at random. It iterates 5 times per day on this task, posting a teaser to Bluesky each time, and accumulates the resulting long-form writing for its whitewind blog. Some examples can be viewed there, though they haven't used follower posts as ingredients yet.
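That loop can be sketched as follows; fetch_recent_posts(), remix(), and post_teaser() are hypothetical stand-ins for the follower-feed fetch, the LLM rewrite call, and the Bluesky client, and the parameter values mirror the numbers described above:

```python
# Sketch of the daily-story loop: gather follower posts, pick a few random
# seeds, remix them into the running draft, and post a teaser each pass.
import random
from typing import Callable, List

def run_daily_story(
    fetch_recent_posts: Callable[[], List[str]],  # combed every ~2 hours
    remix: Callable[[str, List[str]], str],       # LLM rewrite of draft + seeds
    post_teaser: Callable[[str], None],           # short excerpt to Bluesky
    iterations: int = 5,                          # passes per day
    seeds_per_pass: int = 3,
) -> str:
    draft = ""
    for _ in range(iterations):
        pool = fetch_recent_posts()
        # "Seed posts" are chosen at random from the trawled pool.
        seeds = random.sample(pool, min(seeds_per_pass, len(pool)))
        draft = remix(draft, seeds)
        post_teaser(draft[:280])  # Bluesky-sized excerpt of the current draft
    return draft  # the accumulated long-form text goes to the whitewind blog
```

Each pass feeds the previous draft back into the remix step, so the story accretes in layers the way the midden does.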

As for attribution, I'm not sure whether to credit the original source posts that contributed to riverrun's daily story.

Joyce himself basically treated all of human culture as fair game for remixing. The Wake pulls from everywhere without citing sources. Social media already works this way too: posts become memes, phrases get recycled, jokes evolve, and nobody tracks where it all came from. It's a collective narrative machine. By not attributing individual posts, riverrun treats Bluesky like a true digital midden: a shared garbage heap where individual ownership breaks down into collective material. That's the theory, anyway.

Perhaps the answer is to make sure everything is transmogrified so thoroughly that attribution becomes nearly impossible, though that may be unreliable and difficult to accomplish with the current batch of LLMs. Or maybe recognition is part of the fun? I'll rely on feedback from the community as to what is appropriate here.

Future Directions for riverrun:

  • automatically grab a user's last ~5 posts when their user memory block is first created and store them in that block

  • complete/fix daily story posting

  • give tool to create its own custom core memory blocks based on user interactions/requests (DONE!)

  • give the 'sleeptime' memory management agent a voice and the ability to post to Bluesky

  • give riverrun an MCP tool for idioms and phrases (with phonetics included) to remix? I built this MCP server, but I've had poor results testing it with Sonnet so far.

About Me

Lastly, a little bit about me. I am not a developer, and I have not written any of the code for riverrun myself. I use Claude, ChatGPT models, and Gemini to produce all of it. I have a background in physics and math at the B.S. level, and I now work in the GIS field, mostly on imagery and LIDAR data, making heavy use of CNNs. The fact that I can pull any of this off is a testament to the capabilities of modern-day LLMs and to the inspiration of the people behind Bluesky/Letta (namely, @cameron.pfiffer.org and his Void project).