On 10 March 2026, Singapore's Ministry of Health and Health Sciences Authority published AIHGle 2.0 — the most comprehensive national guideline yet for artificial intelligence in healthcare. Buried in Chapter 8.2, under a section on Generative AI, is a sentence that should stop every healthcare AI developer in their tracks.
"Perhaps the most promising recent technique is to compute 'certainty' of the outputs of LLMs. This works by examining the falloff of probabilities in the 'logits' in the output layer of the LLM."
The guideline goes further. It distinguishes between aleatoric uncertainty — noise inherent in data collection — and epistemic uncertainty, which it describes as shortcomings of the model itself, "often due to improper or incomplete summarisation of the knowledge which it is trying to represent."
In other words: Singapore's regulators are now explicitly saying that measuring whether an AI knows what it doesn't know is the most promising frontier in healthcare AI safety.
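The logit-based approach the guideline describes can be made concrete with a small sketch. This is an illustrative reading of "examining the falloff of probabilities", not the guideline's (or any vendor's) prescribed method: here, certainty is scored as the average margin between the top two token probabilities at each generation step — a sharp falloff from the top token signals high model certainty.

```python
def logit_certainty(token_probs: list[list[float]]) -> float:
    """Score certainty as the mean top-token probability margin over a
    generated sequence. Each inner list is one step's output-layer
    probability distribution (post-softmax). Illustrative only."""
    margins = []
    for dist in token_probs:
        top, runner_up = sorted(dist, reverse=True)[:2]
        margins.append(top - runner_up)  # sharp falloff -> high certainty
    return sum(margins) / len(margins)

# Two generation steps: one confident, one close call.
score = logit_certainty([[0.9, 0.05, 0.05], [0.6, 0.3, 0.1]])  # 0.575
```

Other summary statistics (per-step entropy, minimum token probability) serve the same purpose; the point is that the signal comes from a single model's own output distribution.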
We agree. Because we've been building exactly this — and have the data to prove it works.
What AIHGle 2.0 Gets Right
Let's be clear: this is an exceptional document. It moves beyond the usual "be responsible with AI" language and gets specific about the real dangers.
The guideline identifies four amplified risks of Generative AI in healthcare: hallucination, where models present fictitious content as fact; undesirable content that may compromise clinical outcomes; data disclosure through inadvertent leaking of sensitive information; and vulnerability to adversarial prompts that manipulate model behaviour.
It also establishes a three-tier human oversight framework — human-in-the-loop (active control), human-over-the-loop (supervisory monitoring), and human-out-of-the-loop (autonomous) — and makes a critical declaration: autonomous AI should not make clinical decisions without human oversight.
Most importantly, the guideline explicitly calls for developers to implement epistemic uncertainty measurement as a core safety mechanism, and recommends techniques like Retrieval-Augmented Generation (RAG), red teaming, and source-citing architectures.
This is significant. It means the regulatory direction is no longer about "don't use AI in healthcare." It's about "measure what the AI doesn't know, and build systems that catch failure before it reaches the patient."
Where AIHGle 2.0 Stops — and Where EBP Begins
The guideline describes what needs to happen. It does not prescribe how. This is appropriate for a governance document — regulators should set the target, not mandate the engineering.
But someone has to build the engineering. And the gap between "compute certainty" and a working system that actually does it in clinical contexts is enormous.
The Epistemic Bridge Protocol (EBP) was designed specifically to close this gap.
AIHGle 2.0 recommends examining logit probabilities from a single LLM. EBP goes further: instead of trusting one model's self-reported confidence, it measures consensus across multiple independent models and uses the agreement signal (σ) to classify epistemic states — producing a verifiable, reproducible certainty score without requiring access to any model's internals.
This matters because logit-based confidence and actual factual reliability are only loosely correlated. A model can be confidently wrong. EBP's multi-model consensus avoids this failure mode entirely: if four independently-trained models agree on a clinical assessment, the probability that all four share the same hallucination is vanishingly small.
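A minimal sketch of the cross-model consensus idea, with a hypothetical helper (the function name and pairwise-agreement formula are our illustration, not EBP's published internals): σ is taken as the fraction of model pairs whose answers agree, so it needs no access to any model's logits.

```python
from itertools import combinations

def consensus_sigma(answers: list[str]) -> float:
    """Fraction of model pairs that agree on the same answer — a simple
    proxy for the consensus signal sigma (0 = total disagreement,
    1 = unanimity). Illustrative, model-agnostic: only outputs needed."""
    if len(answers) < 2:
        raise ValueError("need at least two model outputs")
    pairs = list(combinations(answers, 2))
    agree = sum(a.strip().lower() == b.strip().lower() for a, b in pairs)
    return agree / len(pairs)

# Four independently queried models triaging the same vignette:
votes = ["emergency", "emergency", "emergency", "urgent"]
sigma = consensus_sigma(votes)  # 3 of 6 pairs agree -> 0.5
```

In practice the comparison would be semantic rather than exact string matching, but the structure is the same: disagreement among independently trained models is itself the uncertainty signal.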
| Dimension | AIHGle 2.0 Recommends | EBP / eptim.health Delivers |
|---|---|---|
| Uncertainty measurement | "Compute certainty via logit probabilities" | Multi-model consensus signal (σ) — model-agnostic, no logit access required |
| Epistemic vs. aleatoric | Distinguishes the two conceptually | Formally modelled: P(hallucination) = (1−σ) × η, validated across 13,728 responses |
| Hallucination mitigation | RAG, red teaming, source citation | All of the above + multi-model verification achieving 0% emergency under-triage |
| Human oversight model | Three tiers: in-the-loop, over-the-loop, out-of-the-loop | Three epistemic states: EXPLORE → PROVISIONAL → COMMIT with mandatory doctor validation at COMMIT |
| Clinical incompleteness | Not addressed as distinct from hallucination | 4-way outcome taxonomy separating hallucination from clinically insufficient responses |
| Guardrails | "Rule-based constraints to filter inappropriate input and output" | 8 Clinical Consistency Rules (CR-1 → CR-8) catching physiologically impossible outputs |
| DTC safety | "Should not generate outputs requiring clinical expertise to interpret" | Epistemic state labelling ensures users always see confidence level; escalation to doctor built in |
| Continuous learning risk | Monitor for model drift post-deployment | Drift detection via consensus divergence — patented (USPTO provisional) |
| Validation scale | No benchmark specified | 13,728 responses, 4 models, 3 domains; 1,248 clinical vignettes replicating GPT-Health-Eval |
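The table's hallucination model and epistemic-state ladder can be sketched as follows. Only the formula P(hallucination) = (1−σ) × η comes from the source; the threshold values and the base-rate figure in the example are illustrative assumptions, not EBP's calibrated parameters.

```python
def hallucination_probability(sigma: float, eta: float) -> float:
    """P(hallucination) = (1 - sigma) * eta, where sigma is the
    consensus signal and eta a domain-calibrated base rate."""
    return (1.0 - sigma) * eta

def epistemic_state(sigma: float,
                    provisional: float = 0.5,
                    commit: float = 0.9) -> str:
    """Map consensus to the EXPLORE -> PROVISIONAL -> COMMIT ladder.
    Threshold values here are illustrative, not the published ones."""
    if sigma >= commit:
        return "COMMIT"       # still subject to mandatory doctor validation
    if sigma >= provisional:
        return "PROVISIONAL"
    return "EXPLORE"

# At sigma = 0.5 with an assumed base rate eta = 0.5:
p = hallucination_probability(0.5, 0.5)   # 0.25
state = epistemic_state(0.5)              # "PROVISIONAL"
```

The key design choice mirrors the guideline's oversight tiers: even a COMMIT-state response is gated by human validation, so rising machine confidence never removes the clinician from the loop.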
The Numbers That Matter
When we stress-tested EBP against 1,248 clinical vignettes from the GPT-Health-Eval benchmark — the same scenarios used to evaluate ChatGPT's clinical safety — the results were unambiguous.
The headline result: a 0% emergency under-triage rate.
In the broader Epistemic Field Theory (EFT) validation study across 13,728 AI responses, hallucination rates dropped from approximately 51.9% at low model consensus to 5.9% at perfect consensus — a clean monotonic relationship that held across medical, legal, and technical domains.
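A monotonic check of this kind reduces to binning labelled responses by their consensus score and computing a per-bin hallucination rate. The sketch below shows the computation on generic (σ, hallucinated) records; the function name and binning scheme are our illustration, not the study's published analysis code.

```python
from collections import defaultdict

def hallucination_rate_by_consensus(records, n_bins=5):
    """Bin (sigma, hallucinated) records by consensus level and return
    the hallucination rate per bin, keyed by the bin's lower edge."""
    bins = defaultdict(list)
    for sigma, hallucinated in records:
        idx = min(int(sigma * n_bins), n_bins - 1)  # clamp sigma = 1.0
        bins[idx].append(hallucinated)
    return {i / n_bins: sum(v) / len(v) for i, v in sorted(bins.items())}

# On labelled data, a strictly declining sequence of per-bin rates is
# what "monotonic relationship" means operationally.
rates = hallucination_rate_by_consensus(
    [(0.10, 1), (0.15, 0), (0.95, 0), (0.99, 0)]
)  # {0.0: 0.5, 0.8: 0.0}
```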
Critically, we also identified a class of failure that AIHGle 2.0 does not yet distinguish: clinical incompleteness. An AI that responds "administer IV fluids" for sepsis is not hallucinating — but without specifying the 30 mL/kg bolus within a 3-hour window, the response is clinically dangerous. In our dataset, incomplete responses were 7× more common than hallucinations. No existing safety framework catches them. EBP does.
The Timeline Tells the Story
The convergence between regulatory direction and our technical work isn't coincidental. It reflects a shared recognition that the fundamental problem of healthcare AI safety isn't about blocking bad outputs — it's about measuring epistemic reliability.
We didn't build EBP in response to AIHGle 2.0. We built it because the problem was obvious to anyone deploying AI in clinical contexts. The fact that Singapore's regulators have now arrived at the same conclusion independently is validation — not of our specific solution, but of the problem space itself.
What This Means for ASEAN Healthcare AI
Singapore's regulatory framework carries outsized influence in the region. When MOH and HSA set direction, Malaysia's NAIO, Thailand's MOPH, and Indonesia's regulatory bodies take notice. AIHGle 2.0 is explicitly aligned with ASEAN's Guide on AI Governance and Ethics, and its emphasis on epistemic uncertainty measurement signals where regional regulation is heading.
For healthcare AI developers across ASEAN, the message is clear: uncertainty quantification is no longer optional. It is becoming a regulatory expectation.
Three Implications for the Sector
1. "We use RAG" is no longer sufficient. AIHGle 2.0 treats RAG as one technique among many. The guideline's emphasis on computing certainty of outputs suggests regulators expect something more fundamental — an architectural commitment to measuring what the AI doesn't know.
2. Clinical incompleteness needs its own category. The guideline's distinction between hallucination and epistemic uncertainty is a start, but the 7× prevalence gap we found between incomplete and hallucinated responses suggests regulators will eventually need to address this explicitly.
3. Multi-model consensus is the natural architecture. If the goal is model-agnostic certainty measurement that doesn't depend on any single vendor's logit access, cross-model verification is the scalable solution. EBP demonstrates this is technically feasible and clinically effective today.
The Regulatory Sandbox Opportunity
AIHGle 2.0 introduces regulatory sandboxes for healthcare AI — a mechanism for testing AI solutions in real-world clinical settings with reduced licensing requirements. HSA launched the AI-MD Exemption Sandbox in February 2026, specifically for low-to-moderately low-risk AI medical devices.
This is precisely the kind of environment where EBP's capabilities can be demonstrated at scale. A system that can verifiably measure its own uncertainty, that escalates appropriately, and that achieves 0% emergency under-triage in benchmark testing is exactly what a regulatory sandbox should be evaluating.
The infrastructure is ready. The evidence base exists. The regulatory direction is aligned.
We welcome AIHGle 2.0 as a validation of the problem space that EBP was designed to address. Singapore has set the standard for what healthcare AI governance should demand. Now the question is: who will build the systems that meet it?
We believe we already have.
The Epistemic Bridge Protocol (EBP) and Epistemic Field Theory (EFT) are developed by Eptim.ai Sdn. Bhd. (Malaysia). The EBP paper is under review at Discover Artificial Intelligence (Springer Nature). The EFT manuscript is in preparation for Nature Communications. eptim.health is currently in beta.
For enquiries on partnership, regulatory consultation, or the EBP architecture, contact us at eptim.ai.
The future of healthcare AI is verifiable trust
If you're building, deploying, or regulating AI in healthcare — let's talk about how epistemic verification changes the equation.
Learn more about EBP