Flamehaven projects
Core tracks, system domains, and the technical thesis behind the work.
Projects is the domain map for Flamehaven: AI governance systems, reasoning and verification engines, scientific and BioAI infrastructure, cloud and engineering foundations, and AI market signals. Selected Work is the proof layer beneath it. This page defines the territory first, then points to the artifacts that make it concrete.
Control, auditability, and safe boundaries
AI Governance Systems
This track focuses on the layers that make AI behavior inspectable before it reaches production: policy boundaries, fail-closed gates, and governance logic that can survive legal, operational, or safety review.
The goal is not to add superficial compliance language after a model is already wired into your workflow. The goal is to define where the system may act, when it must stop, and what evidence exists for those decisions.
Flamehaven treats governance as a systems problem: constraints, audit trails, review surfaces, and runtime behavior should align. If they do not, the architecture is still fragile even if the demo looks polished.
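The core pattern is simple to state: default deny, explicit allow, and an audit record for every decision. As a minimal sketch of that fail-closed shape (illustrative names only, not the CCGE interface):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    action: str
    allowed: bool
    reason: str
    timestamp: str

class PolicyGate:
    """Fail-closed gate: any action without an explicit policy is denied."""

    def __init__(self, allowed_actions: set[str]):
        self.allowed_actions = allowed_actions
        self.audit_log: list[AuditRecord] = []

    def check(self, action: str) -> bool:
        allowed = action in self.allowed_actions  # default deny
        reason = "explicit allow" if allowed else "no matching policy (fail closed)"
        self.audit_log.append(AuditRecord(
            action=action,
            allowed=allowed,
            reason=reason,
            timestamp=datetime.now(timezone.utc).isoformat(),
        ))
        return allowed

gate = PolicyGate(allowed_actions={"summarize_record"})
assert gate.check("summarize_record")      # inside the boundary
assert not gate.check("prescribe_dosage")  # outside it: denied, and logged
```

The point is not the few lines of Python; it is that the boundary, the stop condition, and the evidence live in the same place the runtime decision is made.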
Related Selected Work
CCGE: Fail-Closed Governance Engine
Fail-closed governance engine for healthcare AI systems, ensuring deterministic boundaries around probabilistic models.
AI-SLOP-Detector
A long-running code review and anti-slop inspection system designed to surface low-integrity patterns before they harden into production debt.
Flamehaven-Tensor-Canon
Universal Data Governance Engine ∴ Enforcing structural covenants and detecting distribution drift (maximum mean discrepancy, MMD) for PyTorch & NumPy pipelines.
Related Writing
Short writing list focused on governance, safety, and architectural control.
How Auditing 10 Bio-AI Repositories Shaped STEM-AI
After auditing 10 open-source Bio-AI repositories, we found blind spots in STEM-AI and expanded it from text-only review to code-aware trust evaluation.
After Auditing 10 Bio-AI Repositories, I Think We're Scaling the Wrong Layer
After auditing 10 open-source Bio-AI repositories, one pattern stood out: the field is scaling packaging faster than verification. Here is what that gap actually costs.
Everyone Was Talking About Context Engineering. Nobody Had Solved Governance.
Inference quality, validation, and proof surfaces
Reasoning / Verification Engines
This track covers systems that inspect claims, reasoning steps, and structural integrity. The emphasis is not “can the model answer” but “can the system justify, verify, and reject weak output.”
Reasoning infrastructure matters when downstream decisions are expensive, regulated, or irreversible. In those environments, plausible output without verification is just delayed failure.
Flamehaven treats verification as part of the product architecture itself: not a QA afterthought, but a required layer that shapes which outputs are allowed to survive.
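A verification layer in this sense is a pipeline of rejection, not a score. A minimal sketch, with hypothetical check names (this is not the ProofCore or AI-SLOP-Detector API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Claim:
    text: str
    evidence: list[str]

Check = Callable[[Claim], bool]

def has_evidence(claim: Claim) -> bool:
    # A claim with no cited source fails outright.
    return len(claim.evidence) > 0

def is_specific(claim: Claim) -> bool:
    # Crude placeholder heuristic; a real check would be far stricter.
    return len(claim.text.split()) > 3

def verify(claim: Claim, checks: list[Check]) -> bool:
    """An output survives only if every check passes; any failure rejects it."""
    return all(check(claim) for check in checks)

claim = Claim(text="Latency dropped 40% after batching", evidence=["bench/run_12.json"])
if not verify(claim, [has_evidence, is_specific]):
    raise ValueError("unverified output rejected before it reaches downstream use")
```

The structural choice is that verification gates the output path itself: a weak claim is not flagged for later review, it simply does not survive.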
Related Selected Work
AI-SLOP-Detector
A long-running code review and anti-slop inspection system designed to surface low-integrity patterns before they harden into production debt.
ProofCore-AI-Benchmark
ProofCore is a browser-native, 100% offline-first, hybrid mathematical proof verification engine. It combines rigorous symbolic math with semantic understanding to reliably verify mathematical proofs, offering zero ex...
HRPO-X
Hybrid Reasoning Policy Optimization (HRPO): a research prototype for hybrid latent reasoning with RL.
Related Writing
Posts linked to reasoning quality, verification, proof, and evaluation.
I Built an Ecosystem of 46 AI-Assisted Repos. Then I Realized It Might Be Eating Itself.
An ecosystem of 46 AI-assisted repos can become a closed loop. This article explores structural blind spots, self-validating toolchains, and the need for external validators to create intentional friction.
Why Reasoning Models Die in Production (and the Test Harness I Ship Now)
Project note from the Flamehaven writing archive.
Implementing "Refusal-First" RAG: Why We Architected Our AI to Say 'I Don't Know'
Implementing refusal-first RAG means teaching AI to say “I don’t know.” This article explains evidence atomization, Slop Gates, and grounding checks that favor verifiable answers over plausible hallucinations.
Evidence-aware scientific systems
Scientific & BioAI Infrastructure
This track is for scientific and BioAI environments where reproducibility, validation boundaries, and explicit methodological structure matter more than generic model enthusiasm.
Scientific systems need more than automation. They need traceable assumptions, screened hypotheses, and outputs that can be inspected by technical stakeholders without hand-waving.
Flamehaven approaches BioAI and scientific infrastructure as high-stakes engineering: evidence pathways, reviewable artifacts, and architectures that stay useful when the domain becomes more demanding.
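One concrete consequence of traceable assumptions is that a hypothesis carries its assumptions and their sources as data, so unsupported claims surface mechanically. A minimal sketch with hypothetical field names (not an actual Flamehaven-TOE or RExSyn-Nexus schema):

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    statement: str
    assumptions: list[str]                                   # written down, not implied
    evidence: dict[str, str] = field(default_factory=dict)   # assumption -> source

    def screen(self) -> list[str]:
        """Return the assumptions that still lack a traceable source."""
        return [a for a in self.assumptions if a not in self.evidence]

h = Hypothesis(
    statement="Compound X inhibits CYP3A4 at therapeutic concentrations",
    assumptions=["assay conditions match in vivo pH", "metabolite M1 is inactive"],
)
h.evidence["assay conditions match in vivo pH"] = "doi:10.0000/placeholder"  # placeholder source
print(h.screen())  # the unsupported assumption surfaces for review instead of disappearing
```

Nothing here is sophisticated; the value is that a reviewer can query what a result rests on instead of reconstructing it from the author's memory.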
Related Selected Work
RExSyn-Nexus
A governance-aware orchestration framework for AI systems that need structured reasoning, explicit controls, and traceable decision paths.
Flamehaven-TOE
A research-side validation engine for structured hypothesis extraction, experimental framing, and multi-step reasoning review.
ARR-medic-cyp3a4
Research-side CYP3A4 interaction prediction system for pharmacology education, exploratory screening, and BioAI workflow design.
Related Writing
Posts connected to scientific workflows, BioAI, and evidence-bound research systems.
I Audited 10 Open-Source Bio-AI Repos. Most Could Produce Outputs. Few Could Establish Trust.
I audited 10 visible Bio-AI repositories. Most could produce outputs. Very few could establish what those outputs meant.
Bio-AI Repository Audit 2026: A Technical Report on 10 Open-Source Systems
We audited 10 prominent open-source Bio-AI repositories using code inspection and STEM-AI trust scoring. 8 of 10 scored T0: trust not established. Here is what the code actually shows.
Medical AI Repositories Need More Than Benchmarks. We Built STEM-AI to Audit Trust
STEM-AI is a governance audit framework for public medical AI repositories. It scores README integrity, cross-platform consistency, and code infrastructure — because benchmarks alone don't tell you if a bio-AI tool is safe to trust.
Operational surfaces that survive real deployment
Cloud & Engineering Foundations
This track covers the engineering foundations that hold everything else up: deployment surfaces, delivery tooling, developer infrastructure, and the production scaffolding that turns concept work into systems teams can operate.
A strong idea still fails if the surrounding engineering is weak. Infrastructure, automation, and delivery logic determine whether the system can be sustained after the initial build.
Flamehaven treats operational foundations as part of the same thesis: architecture should be governable, observable, and practical to evolve under real production pressure.
Related Selected Work
Flamehaven-Filesearch
Open-source semantic document search (RAG) engine with FastAPI and instant self-hosted deployment.
copilot-guardian
Autonomous CI/CD recovery tool powered by GitHub Copilot CLI. Analyzes failures with multi-hypothesis reasoning, generates risk-stratified patches (Conservative/Balanced/Aggressive), and auto-applies fixes with full t...
Dir2md
CLI pipeline that converts codebases into structured markdown context for AI-assisted engineering, review, and documentation workflows.
FlashRecord
The fastest Python-first CLI screen recorder ∴ Instant screenshots (@sc) and lightweight GIF recording (@sv) for developer automation. No GUI, just speed.
Related Writing
Posts tied to engineering practice, deployment, and production infrastructure.
The Stake Was Governance Outside the Schema. MICA v0.1.5 Pulled It In
v0.1.0 through v0.1.4 made the schema more implementable. v0.1.5 was the first version to ask a different question — what if governance itself belongs inside the schema? Here is what that looked like, and what it still could not do.
The Schema Existed. The Model Had No Way to Know.
v0.0.1 proved that context could be structured. It did not prove that the structure could govern what shaped the session. Three failures — and why only one made the others meaningless.
95% of AI Businesses Will Die. Here’s How to Not Be One of Them.
What the data, a founder’s confession, and 70 years of tech history tell us about who actually survives.
Trend shifts, market movement, and strategic signals
AI Signals & Market Shifts
This track covers meaningful AI market movement, platform shifts, product signals, and operational changes that matter to teams building under real constraints.
The goal is not to repost headlines. The goal is to surface changes that affect architecture, risk posture, product timing, and strategic decision-making.
Flamehaven treats AI signals as decision inputs: market structure, platform behavior, and ecosystem drift all matter when systems need to hold up beyond the current cycle.
Related Writing
Posts connected to AI trend shifts, platform movement, and market-relevant signals.
The Centaur’s Equation: Why the Stubborn Expert Wins in the Era of Infinite AI
Why Evaluation Ownership is the Ultimate Defensive Asset in the AGI Economy
What Anthropic’s 81k Survey Reveals About What the AI Market Still Gets Wrong
Users Don’t Want Faster AI — They Want AI That Helps Them Live Better Without Losing Their Humanity.
The Repo Is Right There. Why Are You Checking Their CV?
In 2026, AI researchers and engineers use the same words to mean opposite things. This is not a communication problem. It is an incentive problem with a vocabulary leak, and it is where most AI projects actually fail.