ai-reliability
Here are 82 public repositories matching this topic...
zer0dex is a local dual-layer memory pattern for AI agents: a compressed, human-readable markdown index plus a vector store queried automatically before each message. Built for cross-project recall and cross-reference where flat memory files or vector-only RAG fall short. Local-first, low-latency. Reference implementation by Hermes Labs.
Updated Jun 7, 2026 - Python
lintlang is a static linter for AI agent configs, tool descriptions, and system prompts that runs zero-LLM quality gating in CI. Catches language-level failures (vague tool descriptions, missing stop conditions, schema gaps) before they reach runtime, with deterministic regex + structural detectors and no model calls.
Updated Jun 2, 2026 - Python
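As a rough illustration of what deterministic, zero-LLM linting of a tool description can look like, here is a minimal sketch. It is not lintlang's actual rule set or API: the `lint_tool` function, the finding names, and the list of vague terms are all hypothetical and chosen only to show the regex-plus-structural-check idea.

```python
import re

# Hypothetical list of vague words; a real linter would use its own rules.
VAGUE_TERMS = re.compile(
    r"\b(stuff|things|etc|various|somehow|as needed)\b", re.IGNORECASE
)


def lint_tool(tool: dict) -> list[str]:
    """Return a list of finding IDs for one tool definition (illustrative only)."""
    findings = []
    desc = tool.get("description", "")

    # Structural check: very short descriptions rarely tell an agent when to call the tool.
    if len(desc.split()) < 5:
        findings.append("description-too-short")

    # Regex check: flag wording that leaves behavior unspecified.
    if VAGUE_TERMS.search(desc):
        findings.append("vague-wording")

    # Structural check: a tool with no parameter schema is a schema gap.
    if "parameters" not in tool:
        findings.append("missing-schema")

    return findings


if __name__ == "__main__":
    print(lint_tool({"name": "helper", "description": "Does stuff"}))
```

Because every check is a pure function of the config text, the same input always produces the same findings, which is what makes this style of gate usable in CI without model calls.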
The open-source MultiAgentOps evaluation and verification harness for any industry business workflow.
Updated Jun 7, 2026 - Python
fidelis is zero-LLM agent memory for Claude Code and AI agents: a local-first memory layer whose default retrieval path uses BM25, dense vectors, and reciprocal rank fusion with no LLM call. It returns your original passages verbatim instead of paraphrasing and runs fully local. Benchmarked on LongMemEval-S. MIT, by Hermes Labs.
Updated Jun 7, 2026 - Python
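For readers unfamiliar with reciprocal rank fusion, the technique named in the fidelis description, here is a minimal sketch of the general algorithm. It is not fidelis's code: the function name, the passage IDs, and the constant `k=60` (a commonly used default for this technique) are assumptions for illustration. Each input is a ranked list of passage IDs, for example one from BM25 and one from a dense-vector search.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. No model call is involved.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical lexical ranking
    dense_hits = ["b", "c", "a"]   # hypothetical vector ranking
    print(reciprocal_rank_fusion([bm25_hits, dense_hits]))  # ['b', 'a', 'c']
```

The fused list only reorders IDs, so the passages behind them can be returned verbatim, consistent with the description's claim of no paraphrasing.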
Turn failed AI agent runs into replayable regression tests. Catch regressions before you ship.
Updated Jun 4, 2026 - Python
The "Cloudflare for AI Agents". 7-layer security interceptor, real-time observability dashboard, and automated reliability testing for MCP and AI tool chains. Prevent hallucinations, prompt injection, and destructive tool calls.
Updated May 4, 2026 - Python
Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic testing — built for teams shipping AI in production.
Updated Jun 4, 2026 - TypeScript
MCP server for the Ejentum API. 8 cognitive operations across 4 harnesses (reasoning, code, anti-deception, memory) in dynamic and adaptive modes.
Updated May 31, 2026 - JavaScript
Open-source AI model evaluation and benchmarking framework for LLMs (OpenAI, Ollama, Claude, Gemini)
Updated Jun 4, 2026 - Python
Architectural standards and best practices for building reliable AI Agents and LLM workflows. Defining the framework for AI Reliability Engineering (AIRE).
Updated Feb 14, 2026 - Dockerfile
Context-compensation scaffold for LLM evaluation prompts. A short language prefix you prepend so the model discloses prior exposure, scores on quoted evidence only, and hedges on thin evidence — for scorers that can see your CLAUDE.md, memory, or session context. Backend-agnostic. Experimental: variance-reduction effect not yet measured.
Updated May 27, 2026 - Python
quick-gate-js (npm: quick-gate) is a deterministic JS/TS CI quality gate that unifies ESLint, TypeScript, build, and Lighthouse checks into one fail-fast result, with bounded auto-repair and structured escalation evidence for humans or agents. Works with Next.js, React, Vue, Svelte, or any Node project. A gate-and-escalate wrapper, not a dashboard.
Updated Jun 1, 2026 - JavaScript
Benchmark for evaluating advanced reasoning, recursive dependency resolution, and robustness capabilities of large language models in dynamic, noisy, and structurally challenging environments.
Updated May 15, 2026 - Python
Sheldon K. Salmon — AI Reliability Architect. Creator of the AION Constitutional Stack and the CERTUS certainty‑engineering methodology. He designed, directed, and red‑teamed VERITAS — applying epistemic scoring, Uncertainty Mass, and permanent STP seals to community crisis data. Code is open source. The judgment is not.
Updated May 16, 2026 - JavaScript
Orchestration runtime for AI agent workflows that preserves task-state fidelity, prevents reasoning drift, and reduces wasted computation in long-horizon pipelines.
Updated Mar 19, 2026 - JavaScript
Enterprise AI system for decision intelligence — transforming research into scalable, context-aware insights at production scale | AditiKhare.com — AI Product Ecosystem
Updated Apr 20, 2026
AION Scaffold — Intelligent tree-to-filesystem generator. Built by Sheldon K. Salmon, AI Reliability Architect. Part of the AION Constitutional Stack. Free forever. No tracking.
Updated May 6, 2026 - HTML
UAICP (Universal Agentic Interoperability Control Protocol): open reliability contract for AI agent workflows with evidence gating, policy controls, and auditability.
Updated Feb 27, 2026 - TypeScript
Span-level hallucination detection for LLM-generated business analysis on Online Retail transaction data.
Updated May 26, 2026 - Python