July 25, 2025 · 8 min read
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Explaining the Mixture-of-Recursions model, starting from kv-cache basics.
AI Research, Adaptive Computation
Writing
A running log of essays on AI engineering, open-source work, and whatever else I think is interesting at the time. Some posts live directly on this site, and others still link out to the canonical version in case you prefer reading in your app of choice.