I just completed some serious upgrades to a long-running project of mine for notebooks. It's a coding assistant that reads all the cells in the notebook to build an LLM context, and answers questions by inserting cells. This has a few nice features: you can correct its code and get immediate feedback on whether it works, your conversation always stays in sync with the codebase, and you can paste cells from other pre-trained notebooks to add in-context learned skills.
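At its core the loop is simple. Here is a minimal sketch of the idea, with a made-up cell shape and helper names, not Roboco-op's actual API:

```js
// Sketch only: the cell shape and helper names here are illustrative.
function buildContext(cells, question) {
  // Every cell's source goes into the prompt, so the conversation
  // can never drift out of sync with the actual code.
  const source = cells
    .map((cell, i) => `// cell ${i}\n${cell.source}`)
    .join("\n\n");
  return `${source}\n\n// User request:\n// ${question}`;
}

// The reply comes back as a new cell inserted into the notebook,
// where running it gives immediate feedback on whether it works.
```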
The latest upgrade was adding a function called highlight() that programmatically adds its arguments to the LLM context, so now Roboco-op can use both source code cells AND runtime values to reply.
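In spirit, highlight() is an escape hatch from code-space into value-space. A hedged sketch of what it might look like (the real signature and serialization may differ):

```js
// Hedged sketch: one possible shape for highlight(), not the actual implementation.
const highlights = [];

function highlight(...values) {
  // Serialize each runtime value so it can be spliced into the next prompt.
  for (const v of values) {
    highlights.push(JSON.stringify(v, null, 2));
  }
  return values[0]; // pass the value through so it can wrap expressions inline
}

// Usage: surface a runtime value to the assistant.
highlight({ rows: 120, columns: ["date", "price"] });
```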
Getting runtime values into the LLM context is a game changer. By feeding a test-suite report into its context, it can do incremental test-driven development. It produces tested code!
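For example, a tiny test runner that reports through the hypothetical highlight() sketched above closes the loop: the next generated cell sees exactly which assertions failed.

```js
// Illustrative only: a minimal test runner whose report lands in the
// LLM context via the highlight() sketched above.
function runTests(tests) {
  const report = tests.map(({ name, fn }) => {
    try {
      fn();
      return { name, ok: true };
    } catch (e) {
      return { name, ok: false, error: String(e) };
    }
  });
  return highlight(report); // failures become context for the next attempt
}

runTests([
  { name: "adds", fn: () => { if (1 + 1 !== 2) throw new Error("bad add"); } },
]);
```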
Roboco-op improves within a session, once it has enough context to learn your programming style and Observable idioms. There is an annoying cold-start problem, though: on a fresh notebook it doesn't have enough context to program well.
So now when you ask for a cell, the prompt always includes four extra example cells, retrieved from a RAG index, to guide it a little. Hopefully they are relevant examples, but even if they aren't, they still help just by being concrete Observable programming examples.
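The retrieval step itself is straightforward. A sketch, assuming each indexed cell already has a precomputed embedding vector (names and shapes are illustrative):

```js
// Illustrative retrieval: score every indexed cell against the request
// embedding and prepend the top four as few-shot examples.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topExamples(queryVec, index, k = 4) {
  return index
    .map(({ source, vec }) => ({ source, score: cosine(queryVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((e) => e.source);
}
```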
Building the RAG index optimized for client-side use was interesting. My first attempt weighed 14MB, but with PCA I golfed it down to 3.5MB, which is not bad for a 3000-cell knowledge base.
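The arithmetic roughly works out if the index stores reduced float32 vectors; the dimensions below are my illustrative guesses, not measured values:

```js
// Back-of-envelope (assumed dimensions, for illustration only):
// after PCA to ~300 dims: 3000 cells x 300 dims x 4 bytes ≈ 3.6 MB,
// in the right ballpark for the 3.5 MB index.
// The projection itself is cheap enough to run client-side at query time:
function project(vec, components, mean) {
  // components: k x d PCA matrix, mean: d-dim vector, both shipped with the index
  return components.map((row) => {
    let s = 0;
    for (let i = 0; i < row.length; i++) s += row[i] * (vec[i] - mean[i]);
    return s;
  });
}
```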