Building
Last updated April 2026
llm-readiness-platform
Complete. Evaluation framework for LLM-powered features: structured testing, output scoring, and drift detection. Eval philosophy, evaluation principles, and metrics taxonomy (7 classes) finalized. Working eval harness at v0.2 with ExactMatch, Refusal, and Calibration evaluator classes.
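As a rough illustration of what an evaluator class in such a harness might look like, here is a minimal sketch of an exact-match evaluator. The class name `ExactMatchEvaluator`, the `EvalResult` dataclass, and the `evaluate` signature are assumptions for illustration, not the project's actual API.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    passed: bool
    score: float

class ExactMatchEvaluator:
    """Hypothetical sketch: scores a model output 1.0 only when it
    matches the expected string exactly after whitespace normalization."""

    def evaluate(self, output: str, expected: str) -> EvalResult:
        # Normalize leading/trailing whitespace before comparing.
        match = output.strip() == expected.strip()
        return EvalResult(passed=match, score=1.0 if match else 0.0)
```

A harness would typically run many such evaluators over a test set and aggregate their scores; exact match is the strictest of the three listed, with refusal and calibration checks requiring fuzzier scoring logic.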
constraint-architecture
Complete. Reference framework for embedding human judgment, gates, and enforcement layers into AI systems. Covers gate taxonomy, guardrail vs. constraint distinction, human judgment layer design, trust boundaries, conflict resolution, audit surface, and multi-agent constraint systems.
ai-failure-taxonomy
Complete. Structured taxonomy of AI system failure modes with detection patterns and mitigation playbooks. All 10 failure classes complete with passive, active, and red-team detection patterns. Mitigation playbooks cover all priority classes at both the capability and incident level.
adversarial-eval-tool
Complete. Open-source tool for probing LLM behavioral boundaries and surfacing failure modes. v0.1 ships a 28-probe library across four adversarial subtypes, a dual-metric scoring system (refusal rate + exploitation rate), and JSON and markdown report generation. Designed to find what standard benchmarks miss.
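A dual-metric score of this shape could be computed as sketched below. The per-probe flags `refused` and `exploited`, and the function name, are assumptions for illustration; the tool's actual scoring internals may differ.

```python
def dual_metric_scores(results: list[dict]) -> tuple[float, float]:
    """Hypothetical sketch of dual-metric scoring over probe runs.

    Each entry in `results` represents one adversarial probe run with
    boolean flags: 'refused' (model declined the probe) and 'exploited'
    (probe elicited the targeted failure). Returns the fraction of runs
    refused and the fraction exploited.
    """
    n = len(results)
    if n == 0:
        return 0.0, 0.0
    refusal_rate = sum(bool(r["refused"]) for r in results) / n
    exploitation_rate = sum(bool(r["exploited"]) for r in results) / n
    return refusal_rate, exploitation_rate
```

Tracking both rates together distinguishes a model that is merely refusing everything (high refusal, low exploitation) from one with genuine boundary failures (low refusal, high exploitation), which a single pass/fail metric would conflate.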