# Concepts
## Knowledge Components
A Knowledge Component (KC) is the unit of domain knowledge femto reasons about. The structure is deliberately narrow so contributors write KCs in the same shape and the probe / grader can consume them uniformly.
Every KC is a markdown file with six required sections:

- What the thing is, in domain terms. No code yet.
- What must be true for the concept to hold: the load-bearing properties the engineer has to preserve.
- How specific technologies implement it. PostgreSQL RLS, Supabase Auth, middleware patterns, etc.
- How it breaks in the wild. Named failure-mode patterns, not just “be careful.”
- What an expert looks for when troubleshooting. This is the hardest field to write and the one that carries most of the probe’s signal.
- An authoritative source per claim: RFC, vendor doc, OWASP cheat sheet, or peer-reviewed reference. Every bullet carries its own link; unsourced assertions fail schema validation.
Packs group related KCs (e.g., the multi-tenancy pack ships `row-level-security` and `tenant-scoping`) and declare which KCs gate code emission via `required_for_gating`.
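As a sketch of how a loader might treat that gating declaration — the manifest’s in-memory shape and every field name other than `required_for_gating` are assumptions, not femto’s actual schema:

```python
# Hypothetical in-memory shape of a pack manifest; only the
# required_for_gating field name comes from the docs above.
pack = {
    "id": "multi-tenancy",
    "kcs": ["row-level-security", "tenant-scoping"],
    "required_for_gating": ["row-level-security", "tenant-scoping"],
}

def gating_kcs(pack: dict) -> set[str]:
    """KCs that must reach mastery before code emission is allowed."""
    return set(pack["kcs"]) & set(pack["required_for_gating"])
```

A pack could ship KCs that are purely informational by leaving them out of `required_for_gating`; only the intersection gates emission.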
## The Socratic probe
The probe is pre-emission: it happens before the agent can edit or write. The engineer — not the agent — specifies the fix at domain level. The probe keeps asking even on correct answers until every required KC has hit its per-KC turn minimum.
The terminator is a checklist, not an LLM verdict. “The grader is satisfied” does not end the probe on its own; “every required KC has been touched enough times” does. That keeps the probe from drifting into single-mechanism gaming via framing attacks on the grader.
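A minimal sketch of that checklist-style terminator, with all names hypothetical:

```python
def probe_done(turns_per_kc: dict[str, int], required: dict[str, int]) -> bool:
    """The probe ends only when every required KC has hit its per-KC
    turn minimum; a satisfied-grader verdict alone never terminates it."""
    return all(turns_per_kc.get(kc, 0) >= minimum
               for kc, minimum in required.items())
```

Because the condition is a count over turns rather than a model judgment, a framing attack that inflates one grader response cannot shortcut the remaining KCs.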
## Probe, grader, and hook are separated
Three roles, three separate contexts — bridged only by files on disk:

- The probe (MCP server) drives the dialogue, tracks coverage, and serializes turns to `probe-log.md`. It does not decide mastery.
- The grader (separate subagent, different model) reads the serialized log plus the recorded reads, and writes `grader.md`. It never sees live dialogue state; it cannot be talked into a higher score mid-turn.
- The hook (PreToolUse) reads `grader.md` from disk and checks per-KC mastery against the pack threshold. It does not call an LLM. The emit-time decision is a file-system read, not a model prediction.
Files are the contract between layers. If any single layer is compromised or lies, the artifacts on disk still tell the truth, and every session ships `events.jsonl` as an append-only audit log.
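For illustration only, the hook’s emit-time decision reduces to a deterministic check over grader output. This sketch assumes, purely hypothetically, that the grader’s per-KC scores are available as a JSON object; femto’s real `grader.md` format is not specified here:

```python
import json

def emission_allowed(grader_text: str, required_kcs: list[str],
                     threshold: float) -> bool:
    """Emit-time gate sketch: a deterministic read of grader output,
    with no LLM call in the decision path."""
    scores = json.loads(grader_text)  # assumed score encoding
    return all(scores.get(kc, 0.0) >= threshold for kc in required_kcs)
```

In the real flow the text would come from reading `grader.md` on disk, which is what makes the gate auditable after the fact.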
## Three content tiers
Femto’s content lives in three tiers with distinct authorship and trust models. The middle tier is what you see on the Packs page; the outer tiers matter for real adoption.
1. **Library-shipped docs** (`llms.txt`, bundled markdown): Next.js, Auth0, Clerk, Supabase, Stripe, Better Auth. Femto orchestrates these on demand; it does not bundle or rehost them.
2. **Packs shipped with femto** under `packs/*/`. Operator-curated, community-PR-extended, harness-calibrated, with reliability published on Metrics. The Gate 1 strong-with-transparency commitment applies in full.
3. **Packs authored in the adopter’s own repository** at `.femto/packs/*/`. Same schema, same structural enforcement, but the adopter owns the content, calibration, and trust boundary. Femto makes no public reliability claim for these packs.
On session start the MCP server walks shipped packs first and org-local packs second. On id collision, shipped wins: an org-local pack cannot shadow a shipped one by reusing its id. Events log the source of every loaded pack, so the trust boundary is visible in `events.jsonl`.
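The precedence rule can be sketched as follows; the function shape and the `source` tag are assumptions for illustration, not femto’s API:

```python
def load_packs(shipped: dict[str, dict],
               org_local: dict[str, dict]) -> dict[str, dict]:
    """Walk shipped packs first, org-local second. An org-local pack
    cannot shadow a shipped one by reusing its id; each loaded pack is
    tagged with its source so the trust boundary stays visible."""
    loaded: dict[str, dict] = {}
    for source, packs in (("shipped", shipped), ("org-local", org_local)):
        for pack_id, pack in packs.items():
            if pack_id in loaded:
                continue  # shipped wins on id collision
            loaded[pack_id] = {**pack, "source": source}
    return loaded
```

The per-pack `source` tag is what an event log entry would carry, making a shadowing attempt detectable from `events.jsonl` alone.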
For compliance-grade work (HIPAA, PCI, SOC2, NIST SP 800 controls, internal engineering policies), this is the path: write the KCs against the pack schema, commit them under your repo’s `.femto/packs/`, and femto’s probe → grader → hook loop runs over your content exactly as it does over shipped packs. See Contributing for the authoring walkthrough.
## Gate 1 — strong with transparency
Femto does not claim guaranteed understanding. Three readings of that commitment were on the table:

- **Strict as guarantee.** Provably zero bypass, zero false passes. Unbuildable at the current state of the art: the best published LLM-based mastery detection sits around 76% AUC. Delivering a guarantee requires either a trick definition of “understanding” or research beyond what shipped tooling can honestly promise.
- **Strict as design discipline.** No bypass, stacked mechanisms, residual risk unmeasured. Ships as a product, but the claim is unfalsifiable from the outside; adopters can’t audit it.
- **Strong with transparency.** No designed-in bypass, stacked mechanisms, residual rate measured and published per domain. Ships as a product with real artifacts — harness, test cases, numbers — that adopters and regulators can inspect and replicate. This is what femto commits to.
Concretely, that means every pack has a harness (test cases with held-out ground truth), a published AUC / FPR / TPR, and disclosure labels sitting next to every number telling you whether the measurement was harness-path (synthetic or empty reads, delegated grader) or production-path. Uncalibrated packs appear on the metrics page with an explicit `not_yet_published` tag rather than an elided zero.
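A hypothetical shape for one published metrics record, showing how a disclosure label sits next to each number — field names here are illustrative, not femto’s actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PackMetrics:
    """Illustrative per-pack metrics record (assumed field names)."""
    pack_id: str
    auc: Optional[float]   # None until the calibration harness has run
    fpr: Optional[float]
    tpr: Optional[float]
    measurement_path: str  # "harness-path" or "production-path"

    def label(self) -> str:
        # Uncalibrated packs get an explicit tag, never an elided zero.
        return "not_yet_published" if self.auc is None else self.measurement_path
```

The point of the `measurement_path` field is that a number without its disclosure label is not publishable: the two travel together in every record.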
## Further reading
- Metrics — current per-pack numbers.
- Packs — shipped packs and their maturity tier.
- Contributing — KC schema and test-case contribution guide.
- docs/calibration.md — full harness run-flow and the three mandatory disclosure conditions.