
2 posts tagged with "Research"

Academic research and empirical studies


We Simulated 5,400 AI Agent Scenarios. Here Is What Broke.

· 8 min read
Frank Luong
Founder & CEO, FAOSX | CIO 100 Asia 2025 | AI & Digital Transformation Leader

Your AI agent passes every unit test. It handles demo scenarios perfectly. Then it goes to production in a regulated industry and recommends a solvency ratio that violates Vietnamese insurance law, or bypasses an approval chain that exists for compliance reasons, or confidently cites a regulation that was repealed two years ago.

The problem isn't that your agent is broken. The problem is that nobody tested the scenarios that matter — the regulatory edge cases, the cross-role handoffs, the adversarial inputs that exploit domain-specific knowledge gaps.

We built a system that generates these scenarios automatically from enterprise ontologies, then ran 5,400 simulations across 3 LLMs and 5 regulated industries. Here's what we learned about verifying AI agents before they touch production.

We Ran 1,800 Enterprise AI Experiments. Ontology Beat RAG Every Time.

· 6 min read
Frank Luong
Founder & CEO, FAOSX | CIO 100 Asia 2025 | AI & Digital Transformation Leader

Most enterprise AI teams are building RAG pipelines: retrieval-augmented generation that fetches relevant documents and stuffs them into prompts. It works. But when we tested it against structured ontological grounding across 1,800 controlled experiments, 3 LLMs, and 5 regulated industries, ontology-grounded agents consistently outperformed RAG-augmented ones on the metrics that matter most in the enterprise: metric accuracy, regulatory compliance, and role consistency.
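The retrieve-then-stuff pattern described above can be sketched in a few lines. This is a minimal illustration, not the pipeline from the experiments: the corpus, the naive keyword-overlap scorer, and the prompt template are all hypothetical stand-ins (a real RAG system would use embedding similarity and a vector store).

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (illustrative only)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved documents into the prompt as context."""
    context = "\n".join(f"- {d}" for d in docs)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        f"Answer using only the context above."
    )

# Toy corpus standing in for an enterprise document store.
corpus = [
    "The minimum solvency ratio for insurers is set by regulation.",
    "Quarterly filings are due within 30 days of quarter end.",
    "Claims above the approval threshold require senior sign-off.",
]

prompt = build_prompt(
    "What is the minimum solvency ratio?",
    retrieve("minimum solvency ratio", corpus),
)
print(prompt)
```

The failure mode the experiments probe follows directly from this structure: the agent's grounding is only as good as whatever the retriever happens to surface, with no structured model of roles, metrics, or regulations behind it.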

Here's what the data shows — and why it matters for anyone building AI agents for regulated industries.