A runtime for AI systems · Open source, Apache 2.0 · v0.1 (early)

Most AI applications aren't
under-engineered.
They're under-governed.

Catalyst is a runtime and contract layer for retrieval, agents, and tool-using workflows. It assumes you'll write your own logic — and gives you tracing, evaluation, versioning, and policy around it, regardless of which framework sits underneath.

buildwithcatalyst.dev
The shape of a Catalyst application.
YOUR CODE: Knowledge agents · Planner agents · Tool agents · RAG flows
CATALYST RUNTIME: Contracts · Context · Policy · Telemetry · Evaluation · Versioning
ADAPTERS: LangChain · LangGraph · OpenAI · Anthropic · Vector stores · Tools
01 — Thesis

Composition isn't the problem.
Operation is.

LangChain helps you wire components together. LangGraph helps you orchestrate them. Neither was designed to answer the question every enterprise eventually asks: how do we run this safely, observe it honestly, and change it without breaking everything? — The Catalyst position
Context

Every team building AI applications eventually invents the same scaffolding: prompt versioning, trace IDs, evaluation harnesses, PII redaction, cost accounting, a way to know which version of which prompt produced which answer for which user.

It gets built badly, three times per company, scattered across notebooks and side projects, glued onto whichever framework was fashionable that quarter.

Catalyst is the answer to that — built once, properly, as a runtime contract rather than a wrapper.

02 — Architecture

A small, opinionated core.
Everything else is an adapter.

Catalyst is built on hexagonal architecture. Your business logic stays in plain Python. Frameworks live at the edges. The runtime sits in the middle and enforces consistency.

i.

Your core logic is yours

Custom chunkers, retrieval strategies, planner logic, tool routing, decision rules — none of this should depend on a framework's base classes. Catalyst gives you stable contracts (Retriever, QueryFlow, AgentFlow) so your code outlives any specific dependency.
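
One way to read "stable contracts" is as structural interfaces that plain Python classes satisfy without inheriting from anything. The sketch below is illustrative only: the `Retriever` name comes from the text above, but the `retrieve` method, its signature, the `Document` type, and `KeywordRetriever` are assumptions made for this example, not Catalyst's published API.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Document:
    """Hypothetical retrieval result; the field names are assumptions."""
    doc_id: str
    text: str
    score: float


@runtime_checkable
class Retriever(Protocol):
    """Assumed shape of a retriever contract: a query in, ranked documents out."""

    def retrieve(self, query: str, k: int = 3) -> list[Document]: ...


class KeywordRetriever:
    """Plain Python retriever: no framework base class, only the contract's shape."""

    def __init__(self, corpus: dict[str, str]) -> None:
        self._corpus = corpus

    def retrieve(self, query: str, k: int = 3) -> list[Document]:
        terms = set(query.lower().split())
        scored = []
        for doc_id, text in self._corpus.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append(Document(doc_id, text, float(overlap)))
        # Highest term overlap first; ties broken by doc_id for determinism.
        scored.sort(key=lambda d: (-d.score, d.doc_id))
        return scored[:k]


corpus = {
    "hr-1": "unused leave may carry forward for one year",
    "hr-2": "expense reports are due monthly",
}
retriever = KeywordRetriever(corpus)
hits = retriever.retrieve("carry forward leave")
```

Because the contract is structural, `KeywordRetriever` never imports or subclasses anything framework-specific; swapping the scoring logic or the backing store would not change the code that calls `retrieve`.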

ii.

The runtime owns governance

Execution context, policy enforcement, telemetry emission, prompt resolution, evaluation hooks — these are runtime concerns, not framework concerns. Centralizing them means one place to fix bugs, one schema to query, one model your security team can review.
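
To make the idea of centralized governance concrete, here is a minimal sketch of a runtime that wraps an arbitrary flow with one policy step and one telemetry schema. `MiniRuntime`, `RunResult`, the span fields, and the email-redaction rule are all hypothetical stand-ins chosen for illustration; they are not Catalyst's actual `Runtime` or policy implementation.

```python
import re
import time
import uuid
from dataclasses import dataclass
from typing import Any, Callable

# Deliberately simple pattern, used only to illustrate a redaction policy.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class RunResult:
    output: str
    trace_id: str


class MiniRuntime:
    """Toy stand-in for a governing runtime; not Catalyst's actual Runtime."""

    def __init__(self, sink: Callable[[dict[str, Any]], None]) -> None:
        self._sink = sink  # telemetry destination, e.g. a list's append method

    def run(self, flow: Callable[[str], str], question: str, tenant_id: str) -> RunResult:
        trace_id = uuid.uuid4().hex
        started = time.perf_counter()
        raw = flow(question)                   # the caller's own logic, untouched
        output = EMAIL.sub("[REDACTED]", raw)  # policy: redact email addresses
        self._sink({                           # telemetry: one schema for every flow
            "trace_id": trace_id,
            "tenant_id": tenant_id,
            "flow": getattr(flow, "__name__", type(flow).__name__),
            "redactions": len(EMAIL.findall(raw)),
            "duration_s": time.perf_counter() - started,
        })
        return RunResult(output=output, trace_id=trace_id)


def answer(question: str) -> str:
    return "Ask hr@example.com about carry-forward."


events: list[dict[str, Any]] = []
runtime = MiniRuntime(sink=events.append)
result = runtime.run(answer, "What is the carry-forward policy?", tenant_id="acme")
```

The flow itself contains no logging or redaction code. Both live in one place, so a fix to either applies to every flow that runs through the same runtime.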

iii.

Frameworks plug in through adapters

LangChain middleware, LangGraph runtime context, raw provider SDKs — all reachable through thin translators that satisfy Catalyst's contracts. Swap a framework without rewriting your application. Run two frameworks side-by-side without duplicating governance.
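
The adapter idea can be sketched as a thin class that translates a framework's call shape into the contract's shape. To keep the example dependency-free, `FakeFrameworkRetriever` and `FrameworkDoc` are stand-ins loosely modeled on the `invoke` / `page_content` shape that LangChain retrievers expose; the `Retriever` contract signature and the `FrameworkRetrieverAdapter` name are assumptions, not Catalyst's shipped adapters.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class Retriever(Protocol):
    """Assumed contract shape: query in, list of plain-text passages out."""

    def retrieve(self, query: str, k: int = 3) -> list[str]: ...


@dataclass
class FrameworkDoc:
    """Stand-in for a framework document type (page_content plus metadata)."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class FakeFrameworkRetriever:
    """Stand-in for a framework retriever so the sketch runs with no dependencies."""

    def __init__(self, docs: list[FrameworkDoc]) -> None:
        self._docs = docs

    def invoke(self, query: str) -> list[FrameworkDoc]:
        words = query.lower().split()
        return [d for d in self._docs if any(w in d.page_content.lower() for w in words)]


class FrameworkRetrieverAdapter:
    """Thin translator: framework call in, contract-shaped result out."""

    def __init__(self, inner: FakeFrameworkRetriever) -> None:
        self._inner = inner

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        return [doc.page_content for doc in self._inner.invoke(query)[:k]]


def answer(question: str, retriever: Retriever) -> str:
    """Application code: depends only on the contract, never on the framework."""
    passages = retriever.retrieve(question, k=1)
    return passages[0] if passages else "No source found."


framework = FakeFrameworkRetriever([
    FrameworkDoc("Leave carries forward for one year."),
    FrameworkDoc("Expenses are due monthly."),
])
reply = answer("carry forward", FrameworkRetrieverAdapter(framework))
```

Only the adapter knows about `invoke` and `page_content`. Replacing the framework means writing another adapter with the same `retrieve` method, while `answer` stays as it is.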

03 — Diagram

What it looks like, drawn honestly.

EXTERNAL · frameworks, models, stores, tools
LangChain · LangGraph · OpenAI · Anthropic · Chroma · Pinecone · Internal APIs
ADAPTERS · translate to Catalyst contracts
LangChainRetrieverAdapter · LangGraphAgentAdapter · OpenAIModelAdapter · …
CATALYST RUNTIME
Contracts · ExecutionContext · Policy · Telemetry · Evaluation · ArtifactRegistry
YOUR CODE · domain logic, framework-free
BetterChunker · HybridRetriever · KnowledgeAgent · PlannerFlow · ClaimsRAGFlow
Fig. 2 — Dependency flows inward. The runtime depends on nothing. Adapters translate externals into runtime contracts. Your code implements the contracts.
04 — Example

One invocation,
every concern accounted for.

A knowledge agent answering a policy question. The LangChain chain inside is unchanged. Catalyst wraps the invocation with everything you'd otherwise build by hand.

app/main.py python
from catalyst import Runtime, ExecutionContext, PromptRef
from catalyst.policy import EnterprisePolicy
from app.agents import knowledge_agent   # plain LangChain inside

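# otel_sink and production_evals are not defined in this excerpt;
# they are assumed to be configured elsewhere in the application.
runtime = Runtime(
    telemetry=otel_sink,                               # where spans are emitted
    policy=EnterprisePolicy(pii=True, injection=True), # checks applied on every call
    evaluator=production_evals,                        # evaluations run after each invocation
)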

ctx = ExecutionContext(
    app_name="support-assistant",
    tenant_id="acme",
    user_id="u-1029",
)

result = runtime.run(
    flow=knowledge_agent,
    input={"question": "What is the carry-forward policy?"},
    prompt=PromptRef("hr_rag", version="2.3.1"),
    ctx=ctx,
)

result.output      # the grounded answer
result.evals       # {groundedness: 0.94, pii_leakage: 0.0, ...}
result.trace.id    # auditable, queryable, attributable

Listing 1 — A single runtime.run() handles prompt resolution, policy checks, span emission, and post-hoc evaluation.

05 — Contrast

What changes, in practice.

Before

Prompts in code, version controlled by git blame.

Logging bolted on per app; every team writes its own format.

PII redaction added after the first incident.

Evaluation is a Jupyter notebook someone ran last quarter.

Swapping LangChain for raw SDKs means rewriting the app.

After

Prompts are registered artifacts. Versions, owners, diffs.

One telemetry schema. Query it once, answer everywhere.

Policy is declared at the runtime; enforced on every call.

Evaluation runs on every invocation. Trends are visible.

Frameworks become adapters. Swap them; your app doesn't notice.
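
As a rough illustration of "registered artifacts" with versions, owners, and diffs, the sketch below keeps prompts in a small in-memory registry and compares two versions with the standard library's `difflib`. The `PromptArtifact` and `PromptRegistry` names, the owner value, the `2.3.0` version, and both template texts are hypothetical; only the `hr_rag` name and version `2.3.1` appear in the listing above.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptArtifact:
    name: str
    version: str
    owner: str
    template: str


class PromptRegistry:
    """Toy in-memory registry; a real one would persist and access-control entries."""

    def __init__(self) -> None:
        self._items: dict[tuple[str, str], PromptArtifact] = {}

    def register(self, artifact: PromptArtifact) -> None:
        key = (artifact.name, artifact.version)
        if key in self._items:
            # Published versions are immutable: a change requires a new version.
            raise ValueError(f"{artifact.name}@{artifact.version} already registered")
        self._items[key] = artifact

    def resolve(self, name: str, version: str) -> PromptArtifact:
        return self._items[(name, version)]

    def diff(self, name: str, old: str, new: str) -> list[str]:
        a = self.resolve(name, old).template.splitlines()
        b = self.resolve(name, new).template.splitlines()
        return list(difflib.unified_diff(a, b, f"{name}@{old}", f"{name}@{new}", lineterm=""))


registry = PromptRegistry()
registry.register(PromptArtifact("hr_rag", "2.3.0", "hr-platform", "Answer from the context.\nBe brief."))
registry.register(PromptArtifact("hr_rag", "2.3.1", "hr-platform", "Answer from the context.\nCite sources."))
changes = registry.diff("hr_rag", "2.3.0", "2.3.1")
```

Refusing to overwrite an existing version is what makes a trace that records `hr_rag@2.3.1` meaningful later: the text behind that identifier cannot silently change.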

06 — Scope

What people build with it.

Knowledge agents
Grounded answers over internal documentation, with citations, audit trails, and groundedness scoring on every response.
RAG · Retrieval · Eval
Planner agents
Multi-step decomposition with explicit tool routing, retry policy, and full execution traces from goal to outcome.
LangGraph · State
Tool-using agents
Agents that interact with internal APIs, databases, and services — under policy, with per-tool authorization and audit.
Tools · Policy · Audit
Custom retrieval pipelines
Bring your own chunking, hybrid search, reranking. Catalyst doesn't care how you retrieve — only that the runtime can see it.
Ingestion · Custom
Multi-framework fleets
Unify governance across teams using LangChain, LangGraph, raw SDKs, and custom code. One runtime, many implementations.
Platform · Governance
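
The "per-tool authorization and audit" mentioned under tool-using agents could look something like the following. This is a toy sketch under stated assumptions: `ToolGate`, `ToolDenied`, the role names, and both example tools are invented for illustration and do not describe how Catalyst enforces policy.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ToolDenied(PermissionError):
    """Raised when a caller's roles do not permit a tool."""


@dataclass
class ToolGate:
    """Toy policy gate: maps tool names to the roles allowed to call them."""
    allowed_roles: dict[str, set[str]]
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, tool: Callable[..., Any], user_id: str, roles: set[str], **kwargs: Any) -> Any:
        name = tool.__name__
        permitted = bool(roles & self.allowed_roles.get(name, set()))
        # Every attempt is recorded, whether or not it is allowed.
        self.audit_log.append({"user_id": user_id, "tool": name, "allowed": permitted})
        if not permitted:
            raise ToolDenied(f"{user_id} may not call {name}")
        return tool(**kwargs)


def lookup_balance(employee_id: str) -> int:
    return 12  # placeholder value standing in for an internal API call


def delete_record(employee_id: str) -> None:
    raise RuntimeError("should never run in this example")


gate = ToolGate(allowed_roles={"lookup_balance": {"support"}, "delete_record": {"admin"}})
balance = gate.call(lookup_balance, user_id="u-1029", roles={"support"}, employee_id="e-7")

try:
    gate.call(delete_record, user_id="u-1029", roles={"support"}, employee_id="e-7")
    denied = False
except ToolDenied:
    denied = True
```

Tools with no entry in `allowed_roles` are denied by default, and the denied attempt still lands in the audit log, which is usually the record a security review asks for first.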
07 — Begin

Built for the part of AI engineering
nobody markets.

Catalyst is early. The contracts are stable. The runtime works. The adapters cover LangChain and LangGraph today; provider SDKs and vector stores are next. If you're tired of rebuilding the same scaffolding on top of every new framework, this is for you.
