
Lessons from the Trenches: What We'd Do Differently

· 12 min read
Frank Luong
Founder & CEO, FAOSX | CIO 100 Asia 2025 | AI & Digital Transformation Leader

We've made every mistake in the book—and a few that aren't in any book yet.

Some of these mistakes cost us months. Others cost us team members. A few almost cost us the company.

I'm sharing them because the AI agent space is so new that there's no playbook. Every team is learning by doing. If our scars can save you some pain, they're worth exposing.

This isn't a success story disguised as humility. We're still making mistakes. We're still learning. But we're further along than we were, and the lessons we've accumulated might be useful to someone building something similar.

Here's the honest version of what we got wrong, what we over-built, what the community taught us, and what we'd tell our past selves.


Early Mistakes: What We Got Wrong from the Start

The Foundation Errors

Mistake 1: Building for Power Users First

We built FAOSX for people like us—engineers who love configuration, who read documentation for fun, who want maximum flexibility. We created an infinitely configurable system with endless options.

The result: new users bounced before seeing any value. The learning curve was too steep. The time-to-first-success was too long. People who would have loved the product never got far enough to discover that.

What we learned: Simplicity first, power later. The user who needs 47 configuration options is willing to learn where to find them. The user who needs to see value in 5 minutes isn't willing to wade through complexity to get there.

What we'd do differently: Progressive disclosure from day one. The simple path is obvious and fast. Advanced options exist but don't clutter the beginner experience.

Mistake 2: Underestimating the Importance of Defaults

Related to the above, but distinct: we made everything configurable without providing good defaults. Every setting required a decision. Users faced choice paralysis before they could do anything.

"What model should I use?" "What token limit?" "What temperature?" "What output format?" Users who just wanted to try the thing were forced to make decisions they weren't equipped to make.

What we learned: Great defaults are a product decision, not an abdication of responsibility. Saying "it depends on your use case" is true but unhelpful. Pick the right answer for 80% of users and make it the default.

What we'd do differently: Opinionated defaults with escape hatches. Make decisions for users, but let them override when they know better.
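For illustration, a hypothetical defaults file might look like the sketch below. The keys and values are invented, not FAOSX's actual schema; the point is that everything works out of the box and any single value can be overridden.

```yaml
# Hypothetical defaults (illustrative keys and values, not the actual FAOSX schema).
# Every setting ships with a working value; users only change what they need to.
model: standard-model        # the right answer for ~80% of users
max_tokens: 4096
temperature: 0.2             # conservative, predictable output
output_format: markdown
# Escape hatch: a project can override any single key (e.g. temperature: 0.7
# for creative work) without restating the rest.
```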

Mistake 3: Treating Documentation as an Afterthought

We planned to write documentation "when things stabilize." Things never stabilized. Documentation was always outdated, incomplete, or missing entirely.

The impact was worse than we expected. Features we built went unused because people didn't know they existed. Support requests flooded in for things that were documented (poorly) somewhere. Engineers spent time answering questions that docs should have answered.

What we learned: Undocumented features don't exist. If users can't find it, learn it, and use it, you haven't shipped it—you've just written code.

What we'd do differently: Documentation as part of the feature definition. A feature isn't complete until it's documented. Documentation is a release requirement, not a nice-to-have.


Features We Over-Engineered

When Less Would Have Been More

The Configuration System

What we built: An infinitely nested, inheritance-based configuration system. Configurations could inherit from other configurations, which could inherit from others. Override rules were complex. Merge behavior was configurable.

What we needed: Simple, flat configuration files that anyone could understand.

Time wasted: Three months building, two months debugging edge cases, one month simplifying.

The fix: We ripped it out and replaced it with a system capped at three layers: core defaults, module settings, and project customization. Each layer overrides the one before it. Simple to understand, simple to debug.
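For the curious, here is a rough sketch of the idea in Python. It is illustrative rather than our actual implementation, but it captures the whole mental model: later layers simply win.

```python
# Sketch of a three-layer configuration merge (illustrative, not FAOSX source).
def merge_layers(core: dict, module: dict, project: dict) -> dict:
    """Merge configuration layers; each layer overrides the one before it."""
    merged = dict(core)        # 1. core defaults shipped with the product
    merged.update(module)      # 2. module settings
    merged.update(project)     # 3. project customization wins last
    return merged

core = {"model": "default-model", "temperature": 0.2, "output_format": "markdown"}
module = {"temperature": 0.5}                  # a module tweaks one value
project = {"model": "project-approved-model"}  # a project pins its own model

print(merge_layers(core, module, project))
# {'model': 'project-approved-model', 'temperature': 0.5, 'output_format': 'markdown'}
```

No inheritance graph to trace, no configurable merge behavior to reason about.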

The Plugin Architecture

What we built: Full plugin isolation with sandboxing, version management, dependency resolution, and hot-reloading. Plugins could have plugins. The architecture could theoretically support an ecosystem of thousands of plugins.

What we needed: Simple extension points where developers could add functionality without modifying core code.

Time wasted: Two months building infrastructure that no one used.

The fix: We created the skills system instead—simple, single-purpose extensions with minimal overhead. No sandboxing (skills run in the same context). No complex versioning (just use git). The 95% case is covered. The 5% who need more can contribute to core.
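As a rough illustration of what "simple, single-purpose extension" means, here is a hypothetical skills-style registry in Python. The decorator and registry are invented for this sketch; they are not the actual FAOSX API.

```python
# Hypothetical sketch of a skills-style extension point (not the real FAOSX API).
# A skill is just a named function registered in the same process: no sandbox,
# no version resolver, no plugins-of-plugins.
from typing import Callable

SKILLS: dict[str, Callable] = {}

def skill(name: str):
    """Register a single-purpose function under a name the runtime can invoke."""
    def register(fn: Callable) -> Callable:
        SKILLS[name] = fn
        return fn
    return register

@skill("summarize_ticket")
def summarize_ticket(ticket_text: str) -> str:
    # Real logic would call a model; the sketch just truncates.
    return ticket_text[:200]

# The runtime looks a skill up by name and calls it directly.
result = SKILLS["summarize_ticket"]("Customer reports intermittent login failures...")
```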

The Workflow Engine v1

What we built: A full Turing-complete workflow language. Loops, conditionals, parallel execution, exception handling, compensating transactions. You could express any computation as a workflow.

What we needed: Linear steps with occasional branches.

Time wasted: Four months building a programming language when we should have been building product.

The fix: YAML-based simple workflows. Steps execute in order. Conditions branch. Loops repeat. That's it. The rare case that needs more complexity can use code, not configuration.
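To give a feel for the simplified workflows, here is an illustrative example. Field names are approximate rather than the exact schema.

```yaml
# Illustrative workflow (field names are approximate, not the exact schema).
name: triage-support-ticket
steps:
  - id: classify
    action: classify_ticket               # steps execute in order
  - id: escalate
    action: notify_on_call
    when: classify.severity == "high"     # conditions branch
  - id: draft_replies
    action: draft_response
    for_each: classify.related_tickets    # loops repeat over a list
```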

The pattern we noticed:

We consistently overestimated how much complexity users needed. We imagined sophisticated users with sophisticated requirements. In reality, 90% of use cases are simple. We built for the 10% and made the 90% harder.

We built for hypothetical future requirements. "What if someone needs X?" became justification for building X before anyone actually needed it. Most of those hypotheticals never materialized.

We ignored the cost of complexity on ourselves. Complex systems are hard to maintain, hard to debug, hard to explain. Every hour spent on over-engineering was an hour stolen from features users actually wanted.


Things We Underestimated

Harder Than We Thought

Context Management Complexity

What we expected: Just use bigger context windows. Problem solved.

Reality: Context is a strategic resource requiring careful management. Context windows are limited. Token costs add up. Not all context is equally valuable. Summarization loses information. Retrieval adds latency.

What we learned: Context management is a core competency, not a solved problem. The teams that do it well have significant advantages over those who don't.
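One way to picture "context as a strategic resource": rank candidate snippets by estimated value and pack them into a fixed token budget. The sketch below is a simplification with placeholder scoring and token counting, not how FAOSX handles it internally.

```python
# Simplified sketch of budgeted context packing (illustrative only).
def estimate_tokens(text: str) -> int:
    # Crude placeholder: roughly 4 characters per token; real systems use a tokenizer.
    return max(1, len(text) // 4)

def pack_context(candidates: list[tuple[float, str]], budget: int) -> list[str]:
    """candidates are (relevance_score, snippet) pairs; keep the most valuable ones that fit."""
    packed, used = [], 0
    for score, snippet in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(snippet)
        if used + cost <= budget:
            packed.append(snippet)
            used += cost
    return packed

context = pack_context(
    [(0.9, "Relevant runbook section..."),
     (0.8, "Error log excerpt..."),
     (0.4, "Old meeting notes...")],
    budget=500,
)
```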

The Debugging Challenge

What we expected: Standard debugging techniques would work. Set breakpoints, step through, find bugs.

Reality: Non-deterministic systems need new approaches. The same input produces different output. Bugs are probabilistic, not deterministic. You can't reproduce issues reliably.

What we learned: Invest in observability from day one. Structured logging, distributed tracing, state snapshots. When you can't step through, you need to reconstruct.
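A minimal sketch of what that reconstruction relies on in practice, assuming JSON-lines logging and a per-run correlation ID (a convention invented for this example, not a standard):

```python
# Sketch of structured, correlated logging for non-deterministic runs (illustrative).
import json, logging, time, uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("agent")

def log_event(run_id: str, step: str, **fields) -> None:
    """Emit one JSON line per event so a run can be reconstructed after the fact."""
    log.info(json.dumps({"ts": time.time(), "run_id": run_id, "step": step, **fields}))

run_id = str(uuid.uuid4())   # correlation ID shared by everything in this run
log_event(run_id, "plan", prompt_tokens=812)
log_event(run_id, "tool_call", tool="search", status="ok", latency_ms=420)
log_event(run_id, "respond", output_tokens=236)
# Filtering logs by run_id later reconstructs the whole run, even though
# re-running the same input would produce a different trace.
```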

User Trust Requirements

What we expected: Demos would convince people. Show them what the system can do, and they'll adopt.

Reality: Enterprise trust requires track records. Seeing a demo is not the same as trusting a system with your data and reputation. Organizations need to see consistent, reliable behavior over time before they'll depend on it.

What we learned: Trust-building is a multi-year journey. There are no shortcuts. Demonstrated reliability over time is the only thing that works.

The Pace of AI Model Improvement

What we expected: Build for current model capabilities. If models improve, that's a bonus.

Reality: Models improved faster than our architecture could evolve. Capabilities we designed around became obsolete. Limitations we worked around disappeared. Our "clever" solutions sometimes became liabilities.

What we learned: Build for model capabilities 2 generations ahead. Abstract model interactions so you can swap. Don't bake current limitations into your architecture.
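One way to keep today's model assumptions out of the architecture is a thin provider interface that the rest of the system depends on. This is a generic sketch, not FAOSX's actual abstraction.

```python
# Generic sketch of a swappable model interface (not FAOSX's actual abstraction).
from typing import Protocol

class ModelProvider(Protocol):
    def complete(self, prompt: str, max_tokens: int = 1024) -> str: ...

class StubProvider:
    """Stand-in provider; a real one would call a vendor SDK behind this method."""
    def complete(self, prompt: str, max_tokens: int = 1024) -> str:
        return f"[stub completion for: {prompt[:40]}...]"

def run_step(provider: ModelProvider, instruction: str) -> str:
    # Business logic depends only on `complete`, never on a vendor SDK directly,
    # so providers can be swapped as models improve.
    return provider.complete(instruction)

print(run_step(StubProvider(), "Summarize the incident report"))
```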


What Community Taught Us

Feedback That Changed Our Direction

"Your quickstart takes too long"

Original quickstart: 15 minutes, multiple steps, several decisions required.

Community feedback: People were dropping off at step 3. By step 5, we'd lost most of them.

What we changed: Redesigned for 5-minute success. Fewer steps. Fewer decisions. More defaults. The goal: get value before attention wanes.

Impact: 3x increase in quickstart completion rate. More importantly, those completers were more likely to continue using the product.

"We don't want to write code"

Original assumption: Developers want to code. Give them APIs and SDKs and they'll build.

Community feedback: Many developers preferred configuration over code. They wanted to describe what they wanted, not implement it themselves.

What we changed: Configuration-first design. Most agent behavior is defined in YAML and Markdown, not code. Code is available when needed, but not required for common cases.

Impact: Broader audience adoption. Product managers, technical writers, and domain experts could contribute to agent definitions without engineering help.
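To make "configuration over code" concrete, here is a hypothetical agent definition. The fields are illustrative, not the exact format.

```yaml
# Hypothetical agent definition (fields are illustrative, not the exact format).
agent: release-notes-writer
description: Drafts release notes from merged pull requests.
inputs:
  - merged_pull_requests
instructions: |
  Summarize each change in one sentence, group changes by feature area,
  and flag anything labeled "breaking-change".
output_format: markdown
# A product manager or technical writer can edit this file directly;
# no code change or deployment is required.
```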

"Show us the failures"

Original assumption: Success stories sell. Polish everything. Hide the rough edges.

Community feedback: Polished success stories felt inauthentic. People wanted to know what didn't work, not just what did.

What we changed: Open discussion of challenges. Blog posts about problems (like this one). Honest assessment of limitations in documentation.

Impact: Deeper community engagement. Counter-intuitively, being honest about weaknesses built more trust than hiding them.

"Make it work with what we have"

Original assumption: Users would deploy FAOSX in greenfield environments. Clean slate, no legacy.

Community feedback: Everyone has existing systems. Existing data. Existing processes. A new tool needs to integrate, not replace.

What we changed: Integration-first architecture. We prioritized connectors, import/export, and compatibility with existing tools over proprietary features.

Impact: Enterprise adoption path. Organizations could adopt FAOSX incrementally, alongside existing systems, rather than ripping and replacing.


The Dogfooding Revelation

Using FAOSX to Build FAOSX

About six months in, we made a decision that changed everything: we started using FAOSX to build FAOSX.

Our agents helped write documentation. Our workflows managed our release process. Our C-Suite agents participated in planning discussions.

The immediate discovery: pain points we'd been ignoring were impossible to ignore when we experienced them ourselves.

What we found by dogfooding:

Pain points we'd minimized: "Yeah, that workflow is a bit clunky, but it works." When we had to use it daily, "a bit clunky" became unacceptable. We fixed things we'd been tolerating.

Features we thought we needed but didn't: We'd built capabilities based on what we imagined users wanted. Using the product ourselves, we discovered we didn't use many of them. We deprecated or simplified.

Features we desperately needed but hadn't built: Using the product revealed gaps we hadn't anticipated. We built the workflow editor because we needed it ourselves. We added the debugging tools after losing days to issues. We wrote the documentation because we couldn't remember our own system.

The feedback loop:

Daily use created an immediate feedback loop. Build something → use it → discover problems → fix them → use the improved version. Iteration speed increased dramatically.

We now mandate dogfooding for all features. If the team won't use it, users won't either. If the team can't figure it out, users definitely won't.


Advice for Teams Starting Their Journey

What We'd Tell Our Past Selves

1. Start with one agent doing one thing well

Don't build a platform. Build a solution. One agent, one problem, one user. Validate that you can deliver real value before building flexibility.

The platform emerges from solving real problems. If you build the platform first, you're guessing at what problems matter. If you solve problems first, the patterns become obvious.

2. Invest in observability before you need it

You can't debug what you can't see. Structured logging from line one. Correlation IDs for every request. State snapshots at checkpoints.

The time you spend setting up observability will be repaid 10x when something goes wrong. And something will go wrong.
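A minimal sketch of the state-snapshot part, assuming JSON files on disk (the convention here is invented for illustration, not prescribed tooling):

```python
# Sketch of state snapshots at checkpoints (illustrative convention, not FAOSX code).
# When a failure can't be reproduced, a saved snapshot is often the only record
# of what the agent believed at each step.
import json, pathlib

SNAPSHOT_DIR = pathlib.Path("snapshots")

def snapshot(run_id: str, checkpoint: str, state: dict) -> None:
    """Write the agent's state at a named checkpoint to disk for later inspection."""
    SNAPSHOT_DIR.mkdir(exist_ok=True)
    (SNAPSHOT_DIR / f"{run_id}_{checkpoint}.json").write_text(
        json.dumps(state, indent=2, default=str)
    )

snapshot("run-42", "after_planning", {"goal": "triage ticket", "plan": ["classify", "reply"]})
```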

3. Make your first user someone who will complain

Friendly users hide problems. They work around issues. They give polite feedback that doesn't tell you what's actually broken.

Find someone who will tell you when things are bad. Pain is information. The earlier you get painful feedback, the less expensive it is to fix.

4. Plan for model changes

Today's best model is tomorrow's baseline. In six months, capabilities you're working around will be solved. Limitations you're optimizing for will be irrelevant.

Abstract your model interactions. Make it easy to swap models. Don't bake current model behavior into your architecture. The model landscape changes fast.

5. Trust takes longer than technology

You can build features in weeks. Trust takes months or years. Start building trust before you need it. Be transparent about limitations. Respond quickly to problems. Demonstrate reliability over time.

Enterprise adoption is relationship-dependent. Technology is necessary but not sufficient. Trust is what converts pilots to production.

6. Simple > Flexible > Powerful

This is the order of priorities. Not Powerful > Flexible > Simple.

Simple means users can understand and use it. Flexible means users can adapt it. Powerful means users can do sophisticated things.

If you build powerful but not simple, only experts will use it. If you build flexible but not simple, users will be overwhelmed by options. Start simple. Add flexibility for real needs. Pursue power only when simpler approaches genuinely fail.


The Learning Never Stops

We're still making mistakes. We'll make new ones we haven't imagined yet.

The difference between now and a year ago isn't that we've stopped making mistakes. It's that we've built systems to learn from them faster. Observability shows us what's happening. User feedback tells us what matters. Dogfooding keeps us honest.

The key is learning fast and sharing openly. The AI agent space is young. Every team is pioneering. When we share what works and what doesn't, everyone benefits. The field advances faster than any individual team could push it.

To our community:

Thank you for the feedback that shaped this product. Thank you for the bug reports, the feature requests, the honest criticism. You've made FAOSX better than we could have made it alone.

To teams building in this space:

Share your lessons too. Write the blog posts. Answer the forum questions. Contribute the documentation. The more we share, the faster we all learn.

In our final post, we'll look forward—where we think agentic AI is heading and what we're building toward.


Join the conversation: What lessons have you learned? Share in our GitHub Discussions or Discord community.

Subscribe: Get the series finale: Post 10 — The Future of Agentic Enterprise.

Next in the series: Post 10: The Future — Where Agentic Enterprise Is Heading


This is Post 9 of 10 in the series "Building the Agentic Enterprise: The FAOSX Journey."


Ready to see agentic AI in action? Request a Workshop and let's build the future together.