Current Systems

Research artifacts, not portfolio cards.

These systems map to the infrastructure layer behind world-model-driven AI: runtime control, memory, retrieval, interface design, and practical evaluation. Each one is grounded in work already published in this codebase.

Observe → Model → Simulate → Act → Evaluate → Update

• runtime control through subagent-fleet

• retrieval and memory abstractions through embenx

• prompt shaping and review loops through AI Toolkit

• ecosystem mapping through awesome-agentic-memory
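The Observe → Model → Simulate → Act → Evaluate → Update loop can be read as an ordinary control loop. The toy sketch below is not taken from any of the projects listed here. It assumes the "model" is a single running estimate of a numeric signal, and every name in it (`LoopState`, `run_loop`, `learning_rate`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class LoopState:
    """Hypothetical state: the 'model' is just a running estimate."""
    estimate: float = 0.0
    errors: List[float] = field(default_factory=list)


def run_loop(observations: Iterable[float], learning_rate: float = 0.5) -> LoopState:
    state = LoopState()
    for observed in observations:                 # Observe: read the next signal
        predicted = state.estimate                # Model + Simulate: predict it from the estimate
        action = predicted                        # Act: commit to the prediction
        error = observed - action                 # Evaluate: compare outcome with action
        state.estimate += learning_rate * error   # Update: move the estimate toward the outcome
        state.errors.append(error)
    return state


if __name__ == "__main__":
    final = run_loop([1.0, 1.0, 1.0])
    print(final.estimate, final.errors)
```

Each project in the list above covers one part of such a loop in a real system; the sketch only shows how the six steps hang together.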

Stack Fit

The systems ladder

The projects do different jobs, but they fit together as one thesis. Local runtime control, durable retrieval, operator-facing interfaces, and memory research are separate layers of the same emerging stack.

Runtime + routing

subagent-fleet is the clearest runtime artifact: role-aware routing, warmup, health checks, and a visible control plane for local coding agents.

Memory + retrieval

embenx and awesome-agentic-memory cover the retrieval and memory layer from two sides: implementation and landscape mapping.

Interfaces + eval hints

AI Toolkit keeps human operators in the loop with explicit structure, scoring heuristics, and reproducible prompt surfaces.

Artifact Index

Current systems

Each page below is framed as a system with a question, a build artifact, and a claim about what the next AI infrastructure layer needs.

subagent-fleet

local inference · model routing · coding agents · ollama · litellm

Research question: Can local machines become a coordinated compute fleet for coding agents instead of a pile of disconnected Ollama endpoints?

System built: An open-source control plane that generates LiteLLM routing config, Claude Code-style agent definitions, environment files, model warmup flows, and a live SSE dashboard from one fleet topology.

Why it matters: It turns agent role design into infrastructure. Planner, implementer, reviewer, and summarizer workloads can route to different models and machines with visible runtime behavior.

Status: Active experiment · open source

Proof points

• One declarative fleet.yaml drives routes, agent files, and warmup flows.

• Live dashboard exposes node health, routing, trace stream, and warm model state.

• Published eval compares the local fleet against Sonnet 5 and GPT-4o-mini.
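To illustrate how one declarative topology could drive both routes and warmup flows, here is a minimal sketch. It is not subagent-fleet's code: the `FLEET` dictionary stands in for a fleet.yaml whose real schema is not shown on this page, the node hostnames and model names are placeholders, and `build_routes` and `warmup_plan` are hypothetical helpers. The emitted entries loosely follow the `model_list` shape used in LiteLLM proxy configs.

```python
from typing import Dict, List, Tuple

# Hypothetical topology: every key and value here is an illustrative assumption.
FLEET: Dict = {
    "nodes": {
        "workstation": {"api_base": "http://workstation.local:11434"},
        "laptop": {"api_base": "http://laptop.local:11434"},
    },
    "roles": {
        "planner": {"node": "workstation", "model": "large-model"},
        "implementer": {"node": "workstation", "model": "code-model"},
        "reviewer": {"node": "laptop", "model": "small-model"},
        "summarizer": {"node": "laptop", "model": "small-model"},
    },
}


def build_routes(fleet: Dict) -> List[Dict]:
    """Derive one route entry per agent role from the topology."""
    routes = []
    for role, spec in fleet["roles"].items():
        node = fleet["nodes"].get(spec["node"])
        if node is None:
            raise ValueError(f"role {role!r} references unknown node {spec['node']!r}")
        routes.append({
            "model_name": role,  # agents address the role, not the machine
            "litellm_params": {
                "model": f"ollama/{spec['model']}",
                "api_base": node["api_base"],
            },
        })
    return routes


def warmup_plan(fleet: Dict) -> List[Tuple[str, str]]:
    """Unique (node, model) pairs, so a model shared by two roles is warmed once."""
    return sorted({(s["node"], s["model"]) for s in fleet["roles"].values()})


if __name__ == "__main__":
    for route in build_routes(FLEET):
        print(route)
    print(warmup_plan(FLEET))
```

Keeping roles as the routing key means a planner or reviewer can move to a different machine by editing the topology alone, which is the design point the proof points describe.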

System write-up · Docs site · GitHub

embenx

retrieval · memory layer · hybrid search · temporal memory · mcp

Research question: Can retrieval infrastructure become backend-agnostic without giving up the features agents need for durable memory?

System built: A Python retrieval library with a unified Collection API across 15+ vector backends, plus hybrid search, metadata filtering, reranking, temporal memory, self-healing retrieval, and a built-in MCP server.

Why it matters: World-model systems need a memory layer that survives backend changes and exposes retrieval behavior explicitly instead of burying it in one-off adapters.

Status: Shipping library · active development

Proof points

• One API spans FAISS, pgvector, LanceDB, Milvus, Qdrant, and more.

• TemporalCollection adds recency-aware retrieval for session and episodic memory.

• The roadmap already points toward state hydration and trajectory retrieval.
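Recency-aware retrieval, as mentioned in the proof points above, can be sketched in a few lines. This is not embenx's TemporalCollection API, which is not shown on this page. It is a generic illustration that assumes similarity is cosine similarity and that recency is applied as an exponential decay with a half-life; `Memory`, `search`, and `half_life_seconds` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Memory:
    text: str
    vector: Sequence[float]
    timestamp: float  # seconds since an arbitrary epoch


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recency_weight(age_seconds: float, half_life_seconds: float) -> float:
    """Weight halves every half_life_seconds; negative ages are clamped to zero."""
    return 0.5 ** (max(age_seconds, 0.0) / half_life_seconds)


def search(
    memories: List[Memory],
    query: Sequence[float],
    now: float,
    half_life_seconds: float = 3600.0,
    top_k: int = 3,
) -> List[Tuple[float, Memory]]:
    """Rank memories by similarity multiplied by a recency weight."""
    scored = [
        (cosine(m.vector, query) * recency_weight(now - m.timestamp, half_life_seconds), m)
        for m in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    store = [
        Memory("old note", [1.0, 0.0], timestamp=0.0),
        Memory("fresh note", [1.0, 0.0], timestamp=3600.0),
    ]
    for score, memory in search(store, [1.0, 0.0], now=3600.0):
        print(round(score, 3), memory.text)
```

With identical vectors, the fresher memory ranks first because the older one has aged by exactly one half-life. A multiplicative decay is only one option; an additive blend of similarity and recency is an equally plausible design.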

Guide · Docs site · GitHub

AI Toolkit

workflow tooling · prompt evaluation · prompt rewriting · interfaces

Research question: What lightweight tooling helps make LLM workflows more inspectable before they become larger agent systems?

System built: A practical tool suite with an Intelligent Prompt Composer, a Prompt Grader & Rewriter, and a Tweet Thread Generator for shaping prompts, checking quality dimensions, and creating repeatable outputs.

Why it matters: Even small tools reinforce the same thesis: useful agent systems need explicit structure, evaluation hints, guardrails, and visible operator control.

Status: Live product page

Proof points

• Prompt Grader scores prompts across goals, constraints, output format, evaluation hints, and guardrails.

• Composer exposes role, tone, output format, verbosity, and thinking mode as controllable inputs.

• The toolkit acts as a low-friction proving ground for runtime ergonomics.
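The five grading dimensions named above lend themselves to a simple checklist-style scorer. The sketch below is an assumption about how such a heuristic might look, not the Prompt Grader's actual logic: the keyword patterns in `DIMENSIONS` are invented for illustration, and `grade_prompt` and `score_prompt` are hypothetical names.

```python
import re
from typing import Dict

# Hypothetical keyword heuristics, one pattern per dimension.
DIMENSIONS: Dict[str, str] = {
    "goals": r"\b(goal|objective|task|you should|your job)\b",
    "constraints": r"\b(must|at most|no more than|only|never)\b",
    "output_format": r"\b(json|markdown|table|bullet|format)\b",
    "evaluation_hints": r"\b(criteria|rubric|success|graded|evaluate)\b",
    "guardrails": r"\b(do not|don't|refuse|if unsure|avoid)\b",
}


def grade_prompt(prompt: str) -> Dict[str, bool]:
    """Report, per dimension, whether the prompt contains any matching cue."""
    text = prompt.lower()
    return {name: bool(re.search(pattern, text)) for name, pattern in DIMENSIONS.items()}


def score_prompt(prompt: str) -> float:
    """Fraction of dimensions covered, between 0.0 and 1.0."""
    results = grade_prompt(prompt)
    return sum(results.values()) / len(results)


if __name__ == "__main__":
    prompt = "Your task is to summarize the report. Use at most 100 words. Return JSON."
    print(grade_prompt(prompt))
    print(score_prompt(prompt))
```

Returning the per-dimension breakdown alongside the score keeps the result inspectable: an operator can see which dimension is missing and rewrite the prompt accordingly.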

Toolkit page · Prompt Composer · Prompt Grader

awesome-agentic-memory

memory research · mcp · knowledge graphs · field mapping

Research question: What does the current memory landscape look like when you compare agent frameworks, MCP servers, vector databases, and temporal-memory approaches side by side?

System built: A curated, continuously updated landscape page covering agent memory frameworks, MCP memory servers, vector backends, and research directions relevant to long-horizon agents.

Why it matters: Owning a category requires mapping the space, not just shipping one project. This page makes the memory layer legible and helps place new work in context.

Status: Public resource · active curation

Proof points

• Includes a framework comparison across memory types, backends, and MCP support.

• Highlights embenx inside a wider ecosystem instead of treating it in isolation.

• Connects production tooling to research-oriented memory patterns.

Resource page · GitHub

Next Reads

Continue through the lab

The systems page is the artifact index. The other two pages in this slice explain the architecture and what is currently being pushed forward.

Read the stack

The stack page turns the thesis into a concrete systems map: runtime, memory, retrieval, simulation, tools, routing, and evaluation.

Read the now board

The now page captures the current active fronts: local compute orchestration, memory infrastructure, and practical builder tooling.