Aditya Karnam
Building the infrastructure layer for world-model-driven AI.
Current Systems

Research artifacts, not portfolio cards.

These systems map to the infrastructure layer behind world-model-driven AI: runtime control, memory, retrieval, interface design, and practical evaluation. Each one is grounded in work already published in this codebase.

Observe → Model → Simulate → Act → Evaluate → Update
runtime control through subagent-fleet
retrieval and memory abstractions through embenx
prompt shaping and review loops through AI Toolkit
ecosystem mapping through awesome-agentic-memory
Stack Fit

The systems ladder

The projects do different jobs, but they fit together as one thesis. Local runtime control, durable retrieval, operator-facing interfaces, and memory research are separate layers of the same emerging stack.

Runtime + routing

subagent-fleet is the clearest runtime artifact: role-aware routing, warmup, health checks, and a visible control plane for local coding agents.

Memory + retrieval

embenx and awesome-agentic-memory cover the retrieval and memory layer from two sides: implementation and landscape mapping.

Interfaces + eval hints

AI Toolkit keeps human operators in the loop with explicit structure, scoring heuristics, and reproducible prompt surfaces.

Artifact Index

Current systems

Each page below is framed as a system with a question, a build artifact, and a claim about what the next AI infrastructure layer needs.

subagent-fleet

local inference · model routing · coding agents · ollama · litellm
Research question: Can local machines become a coordinated compute fleet for coding agents instead of a pile of disconnected Ollama endpoints?
System built: An open-source control plane that generates LiteLLM routing config, Claude Code-style agent definitions, environment files, model warmup flows, and a live SSE dashboard from one fleet topology.
Why it matters: It turns agent role design into infrastructure. Planner, implementer, reviewer, and summarizer workloads can route to different models and machines with visible runtime behavior.
Status: Active experiment · open source
Proof points
One declarative fleet.yaml drives routes, agent files, and warmup flows.
Live dashboard exposes node health, routing, trace stream, and warm model state.
Published eval compares the local fleet against Sonnet 5 and GPT-4o-mini.
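
As a rough illustration of the role-aware routing idea described above, the sketch below picks a node and model for each agent role from one topology object. It is not the subagent-fleet implementation: the dictionary layout, the node and model names, and the `route` helper are all assumptions made up for this example, and the real fleet.yaml schema may look quite different.

```python
# Hypothetical topology. Field names, node names, and model names are
# illustrative only and are NOT the real fleet.yaml schema.
FLEET = {
    "nodes": {
        "workstation": {"healthy": True, "models": ["big-coder"]},
        "laptop": {"healthy": True, "models": ["small-coder"]},
    },
    # Each role lists its preferred models, most preferred first.
    "roles": {
        "planner": ["big-coder", "small-coder"],
        "implementer": ["big-coder", "small-coder"],
        "reviewer": ["small-coder", "big-coder"],
        "summarizer": ["small-coder"],
    },
}


def route(role, fleet):
    """Return (node, model) for the first preferred model on a healthy node."""
    for model in fleet["roles"][role]:
        for name, node in fleet["nodes"].items():
            if node["healthy"] and model in node["models"]:
                return name, model
    raise LookupError(f"no healthy node can serve role {role!r}")


if __name__ == "__main__":
    for role in FLEET["roles"]:
        print(role, "->", route(role, FLEET))
```

Keeping preferences per role, rather than per request, is what lets a failed health check fall through to the next model without touching the agent definitions.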

embenx

retrieval · memory layer · hybrid search · temporal memory · mcp
Research question: Can retrieval infrastructure become backend-agnostic without giving up the features agents need for durable memory?
System built: A Python retrieval library with a unified Collection API across 15+ vector backends, plus hybrid search, metadata filtering, reranking, temporal memory, self-healing retrieval, and a built-in MCP server.
Why it matters: World-model systems need a memory layer that survives backend changes and exposes retrieval behavior explicitly instead of burying it in one-off adapters.
Status: Shipping library · active development
Proof points
One API spans FAISS, pgvector, LanceDB, Milvus, Qdrant, and more.
TemporalCollection adds recency-aware retrieval for session and episodic memory.
The roadmap already points toward state hydration and trajectory retrieval.
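
To make "recency-aware retrieval" concrete, here is a minimal sketch that scales cosine similarity by an exponential decay on item age. It does not use the embenx API: `recency_search`, the `half_life` parameter, and the tuple layout are assumptions chosen for illustration, and TemporalCollection may weight recency in a different way.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recency_search(query, items, now, half_life=3600.0, top_k=3):
    """Rank items by cosine similarity scaled by an exponential recency decay.

    items is a list of (item_id, vector, timestamp) tuples; timestamps and
    half_life share one unit (seconds here). This helper is hypothetical.
    """
    scored = []
    for item_id, vector, timestamp in items:
        age = max(0.0, now - timestamp)
        decay = 0.5 ** (age / half_life)  # score halves every half_life
        scored.append((cosine(query, vector) * decay, item_id))
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:top_k]]


if __name__ == "__main__":
    memory = [
        ("old", [1.0, 0.0], 0.0),
        ("new", [1.0, 0.0], 7200.0),
        ("off-topic", [0.0, 1.0], 7200.0),
    ]
    print(recency_search([1.0, 0.0], memory, now=7200.0, top_k=2))
```

With a one-hour half-life, two equally similar memories separate cleanly: the one written two hours ago keeps a quarter of its score, so the fresh one ranks first.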

AI Toolkit

workflow tooling · prompt evaluation · prompt rewriting · interfaces
Research question: What lightweight tooling helps make LLM workflows more inspectable before they become larger agent systems?
System built: A practical tool suite with an Intelligent Prompt Composer, a Prompt Grader & Rewriter, and a Tweet Thread Generator for shaping prompts, checking quality dimensions, and creating repeatable outputs.
Why it matters: Even small tools reinforce the same thesis: useful agent systems need explicit structure, evaluation hints, guardrails, and visible operator control.
Status: Live product page
Proof points
Prompt Grader scores prompts across goals, constraints, output format, evaluation hints, and guardrails.
Composer exposes role, tone, output format, verbosity, and thinking mode as controllable inputs.
The toolkit acts as a low-friction proving ground for runtime ergonomics.
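
The five grading dimensions named above can be turned into a toy coverage check, shown below. This is a sketch under loud assumptions, not how the Prompt Grader works: the cue words, the `grade` function, and the equal weighting are invented for this example, and a real grader would likely use a model rather than keyword matching.

```python
# Hypothetical cue words per dimension. The dimension names come from the
# page; the cues and the scoring rule are assumptions for illustration.
DIMENSIONS = {
    "goals": ("goal", "objective", "task"),
    "constraints": ("must", "do not", "limit"),
    "output format": ("format", "json", "bullet"),
    "evaluation hints": ("success", "criteria", "evaluate"),
    "guardrails": ("never", "avoid", "refuse"),
}


def grade(prompt):
    """Return (score, missing): the share of dimensions covered, and the gaps."""
    text = prompt.lower()
    covered = {
        dimension: any(cue in text for cue in cues)
        for dimension, cues in DIMENSIONS.items()
    }
    score = sum(covered.values()) / len(covered)
    missing = [dimension for dimension, ok in covered.items() if not ok]
    return score, missing


if __name__ == "__main__":
    score, missing = grade("Task: summarize the report. Return JSON.")
    print(f"score={score:.1f} missing={missing}")
```

Returning the missing dimensions alongside the score is the useful part for an operator: it says what to add to the prompt, not just how it rated.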

awesome-agentic-memory

memory research · mcp · knowledge graphs · field mapping
Research question: What does the current memory landscape look like when you compare agent frameworks, MCP servers, vector databases, and temporal-memory approaches side by side?
System built: A curated, continuously updated landscape page covering agent memory frameworks, MCP memory servers, vector backends, and research directions relevant to long-horizon agents.
Why it matters: Owning a category requires mapping the space, not just shipping one project. This page makes the memory layer legible and helps place new work in context.
Status: Public resource · active curation
Proof points
Includes a framework comparison across memory types, backends, and MCP support.
Highlights embenx inside a wider ecosystem instead of treating it in isolation.
Connects production tooling to research-oriented memory patterns.
Next Reads

Continue through the lab

The systems page is the artifact index. The other two pages in this slice explain the architecture and what is currently being pushed forward.

Read the stack

The stack page turns the thesis into a concrete systems map: runtime, memory, retrieval, simulation, tools, routing, and evaluation.

Read the now board

The now page captures the current active fronts: local compute orchestration, memory infrastructure, and practical builder tooling.

© 2026 Aditya Karnam. World Model Infrastructure Lab.