My Road to i80: Building AI Agents That Know Their Domain

The Spark

I love to cook. Not by the book, but with intuition, memory, and whatever ingredients are at hand. Over the years, I've built a rich personal archive: dishes invented on the fly, surprising flavor combinations, and moments of improvisation captured in photos, notes, and conversations.

Then, in January 2023, I encountered ChatGPT. I was blown away by its ability to answer questions and generate ideas. I'd ask about cuisines, techniques, or flavor pairings, and the responses were impressive. But when I asked, "What did I cook that night with salmon and miso?", the reply was: "I don't have access to your personal cooking history." For all their brilliance, these large language models (LLMs) didn't know my stories. They couldn't access the unique experiences, preferences, and memories that make me who I am.

The Turning Point

That's when it hit me: What if I could build an AI that feels like an LLM but knows what I know? Not a generic model trained on the internet, but a personal system capturing my cooking stories, my thought processes, my unique way of seeing the world. I imagined an AI that could answer, "What would I have cooked with those ingredients?" or even "How would I have approached that problem?" - not based on generic data, but on my own lived experience.

This led me to the idea of a domain-aware AI Agent - not just a chatbot, but a system that can understand a question, search trusted knowledge, decide how to respond, and stay grounded in the right context. Underneath, it may use retrieval, embeddings, orchestration, and LLMs. But the goal is easier to understand: build an AI Agent that knows a specific world deeply. I started with a personal knowledge base - not of recipes, but of my cooking stories, complete with ingredients, intentions, and moments of inspiration. This agent wouldn't just recite facts; it would reflect my style, my creativity, and my voice.

That's also where the name "i80" comes from - inspired by the interstate highway built for clarity, structure, and speed. Just like that road, this system is meant to navigate complexity with confidence. No guesswork. No detours. Just grounded, context-aware answers that get you where you need to go.

My vision goes beyond cooking. It's about creating an AI that acts like an extension of myself - one that preserves my memories and thought patterns for future reflection, or for others to access my perspective. It's a step toward digital continuity, a way to keep that perspective alive. And this approach isn't just personal. The same technology can empower organizations - hotel guest services, HR help desks, new-employee onboarding assistants, and many others - where public LLMs fall short due to a lack of domain-specific, trustworthy knowledge.

The Challenge Ahead

Building an AI Agent that knows a person, a business, or a domain is no small task. It requires blending the expressive power of LLMs with curated knowledge, reliable retrieval, careful orchestration, and clear boundaries. But that's the road I'm on: a path to an AI that doesn't just answer, but remembers, reasons, acts, reflects, and resonates.

I don't know how far I'll go, or exactly what the final destination might look like. But that's part of the journey. What started as a spark - an idea to build an AI that truly understands me - has become a deeply personal exploration of memory, language, and technology.

A Living Journal

This website will document my road to i80. Through this space, I'll share what I'm learning along the way - the breakthroughs and the dead ends, the tools and techniques, the insights that emerge when intuition meets iteration. Whether it leads to a fully functioning personal AI Agent, a new kind of storytelling engine, trusted domain agents for businesses, or something I haven't imagined yet, I'm here to explore it - and you're welcome to follow along.

- Alex P. Wang
October 28, 2024

How I Got Started

I jumped in after the idea struck me - despite not knowing much about large language models. Back in graduate school, I studied expert systems, AI, and neural networks, but LLMs were a whole new world. I knew I'd need Python, but at that point, I hadn't written a single line of it.

Thankfully, there's an abundance of resources online - and even better, we now have LLMs to help along the way. It didn't take long for me to set up my initial Python environment using VS Code. And just like that, I was on my way. (Set Up Your Python App Environment)

At first, I tried downloading open-source LLMs and fine-tuning them with my own knowledge base. That quickly turned into a dead end. Fine-tuning is resource-intensive, hard to iterate, and - most importantly - poorly suited for domain knowledge that changes frequently or needs precise control. It simply wasn't the right tool for a task where accuracy, flexibility, and explainability matter. Eventually, I came across retrieval, embeddings, and knowledge-grounded generation. That opened the door to something broader: building AI Agents that can use trusted knowledge, make decisions, and respond with context.

How an AI Agent Works

Not all AI agents are the same. Some agents execute tasks like booking meetings or sending emails. Some automate workflows across systems. Some operate autonomously with minimal human input.

The agents I focus on are domain-specific AI Agents — agents powered by trusted private knowledge. These agents are designed to answer accurately, reason within a specific domain, and stay grounded in real business knowledge rather than generic internet knowledge.

Examples include hotel guest assistants, onboarding assistants, internal support agents, and personal knowledge agents built from cooking stories, memories, and lived experience.

I think the best way to explain how this AI Agent works is with an analogy - how a human answers a question and decides what to do next.

Imagine you're answering a question from a friend. First, you think back through your memory to find anything relevant - that's retrieval. Then, based on what you remember, your brain puts the answer together in your own words - that's generation.

That is essentially what a trusted domain-specific AI Agent does. It has a curated knowledge base (its memory). The retriever pulls in the most relevant information. The orchestrator - like your brain - decides how to respond: answer directly, ask for clarification, call on an LLM, or eventually trigger an action through an API or workflow.

  • Curate a knowledge base (memory) - Use your private, domain-specific content - facts, stories, documentation - to build a structured foundation.
  • Retrieve relevant content (retrieval) - When a question is asked, the retriever searches the knowledge base for snippets that are semantically related and have high similarity scores.
  • Orchestrate a response or action (brain):
    • If the similarity score is high and the answer is straightforward, it returns the response directly.
    • If the similarity is lower or the question is more nuanced, it sends the retrieved content to the LLM - along with clear instructions to stay grounded in the facts and avoid hallucination.
    • If the task requires action, the agent can route the request to a defined workflow, API, or human handoff.
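The orchestration logic above can be sketched as a small routing function. The thresholds (0.85, 0.60) and route names here are illustrative assumptions for the sketch, not values from the actual i80 system:

```python
# Toy sketch of the orchestration step. Thresholds and route names
# are illustrative assumptions, not the i80 implementation.

def route(similarity: float, needs_action: bool = False) -> str:
    """Decide how the agent should handle a retrieved match."""
    if needs_action:
        return "workflow"   # hand off to an API, workflow, or human
    if similarity >= 0.85:
        return "direct"     # trusted content: return it as-is
    if similarity >= 0.60:
        return "llm"        # ask the LLM to synthesize, grounded in context
    return "clarify"        # weak match: ask the user to rephrase

print(route(0.92))                     # direct
print(route(0.70))                     # llm
print(route(0.30))                     # clarify
print(route(0.95, needs_action=True))  # workflow
```

In practice, the thresholds would be tuned against real questions, and the "workflow" branch would carry structured parameters rather than a bare label.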

The result is an AI Agent that behaves more like a well-informed assistant: thoughtful, relevant, context-aware, and capable of moving from answers toward action.

Key Building Blocks of an AI Agent

It didn't take me long to piece together the basic building blocks and get things up and running. At the heart of what I built is a modular agent system - one that reflects how we, as humans, remember, reason, and respond. I set up a curated knowledge base to serve as memory. A retriever fetches the most relevant information based on your question. An orchestrator decides whether to respond directly, synthesize information from multiple sources, call on an LLM, or route the request toward an action.

Once I had these core parts working together, things really started to click. The setup gave me a solid foundation to start experimenting - exactly the kind of system I had envisioned from the very beginning.

Knowledge Base

A structured memory system built from your content. Text is transformed into semantic vectors and stored in a vector database for fast, meaningful retrieval.

Retriever

Finds the most relevant entries from the knowledge base by comparing the question to stored vectors using similarity scores.
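The comparison is typically cosine similarity between the question's vector and each stored vector. A minimal sketch, using toy 3-dimensional vectors as stand-ins for real embedding output (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, kb, k=2):
    """Return the k knowledge-base entries most similar to the query vector."""
    return sorted(kb, key=lambda e: cosine(query_vec, e["vector"]), reverse=True)[:k]

# Toy entries; the vectors are made up for illustration.
kb = [
    {"text": "Miso-glazed salmon story",    "vector": [0.9, 0.1, 0.0]},
    {"text": "Improvised mushroom risotto", "vector": [0.1, 0.8, 0.2]},
    {"text": "Late-night ramen experiment", "vector": [0.2, 0.2, 0.9]},
]
top = retrieve([0.85, 0.15, 0.05], kb, k=1)
print(top[0]["text"])  # Miso-glazed salmon story
```

A vector database does the same ranking, but with approximate-nearest-neighbor indexes so it stays fast at scale.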

Orchestrator

Decides how the agent should respond - return trusted content directly, ask for clarification, invoke the LLM, or route the request to a defined action.

LLM

Produces natural-language answers using retrieved context, instructions, and boundaries so the agent can respond clearly and accurately.

Key Challenges in Building Trusted AI Agents

The concept sounded simple - until I actually started building and testing my first domain-aware agent. One of the first things I had to understand was how the system makes sense of conversational language. It does this through a technique called embeddings. In simple terms, embeddings convert text into numbers (technically, vectors) - representations that capture meaning, not just literal words. This allows the system to compare concepts and retrieve content that's semantically relevant to the question being asked, even when the wording or language differs.

With that foundation in place, a new challenge emerged: what content should I embed? At first, I followed the conventional method of breaking documents into evenly sized chunks. But in practice, this approach often missed the essence of what users were really asking. What turned out to be far more effective - especially in a focused domain - was creating query-focused embeddings that align more closely with the actual questions people tend to ask. That shift significantly improved the system's performance and reliability. It laid the foundation for my solution. Of course, embeddings are just one part of the puzzle - many other challenges still remain.
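As a toy illustration of the query-focused idea: each knowledge-base entry is keyed by the question it answers rather than by a document chunk, and an incoming question is matched against those stored questions. Here `difflib.SequenceMatcher` stands in for real embedding similarity, purely for illustration:

```python
from difflib import SequenceMatcher

# Each entry is indexed by the question it answers, not by a raw chunk.
# The entries are invented examples.
kb = [
    {"question": "What did I cook with salmon and miso?",
     "answer": "A miso-glazed salmon improvised one March evening."},
    {"question": "How do I balance sweet and sour flavors?",
     "answer": "Add acid gradually and taste against the sweetness."},
]

def best_match(user_question):
    """Return the entry whose stored question best matches the user's question.

    SequenceMatcher is a crude stand-in for embedding similarity here."""
    score = lambda e: SequenceMatcher(None, user_question.lower(),
                                      e["question"].lower()).ratio()
    return max(kb, key=score)

hit = best_match("what did i make that night with miso and salmon?")
print(hit["answer"])
```

With real embeddings, rephrasings like "make" vs. "cook" match on meaning, not just character overlap, which is exactly why query-focused entries align so well with how people actually ask.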

Embedding Quality

Poor or inconsistent embedding quality due to vague text, evolving language, or model drift. Embedding mismatches lead to retrieval errors - especially over time.

Query Understanding

Interpreting vague or conversational questions - especially when context is implied or missing. Handling multi-turn conversations or follow-ups that reference previous questions or answers.

Retrieval Accuracy

Ensuring the right pieces of information are found, ranked by relevance, and passed correctly to the language model.

Knowledge Base Coverage

Capturing enough structured, relevant knowledge to confidently answer the real-world questions users actually ask. Keeping the knowledge base up to date and pruning outdated information.

Multilingual Support

Accurately retrieving and generating across multiple languages. Embeddings and language models trained in one language often degrade in performance with others.

Hallucination Control

Making sure the model generates responses based only on trusted knowledge, avoiding plausible-sounding fabrications.
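One common control is wrapping the retrieved content in a prompt that explicitly restricts the model to that context. A hedged sketch - the wording below is an illustrative template, not the actual i80 prompt:

```python
# Illustrative grounding-prompt template; the exact wording is an assumption.
GROUNDED_PROMPT = """You are a domain assistant. Answer ONLY from the context below.
If the context does not contain the answer, say "I don't know."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    """Assemble a grounded prompt from retrieved context and the user's question."""
    return GROUNDED_PROMPT.format(context=context, question=question)

p = build_prompt("Check-in time is 3 PM.", "When can I check in?")
print(p)
```

Prompting alone isn't a guarantee; in practice it's paired with similarity thresholds and, where needed, a verification pass that checks the answer against the retrieved sources.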

Research & Reflections

Based on the challenges I've encountered while building private-domain AI Agents - from query understanding and embedding design to orchestration, confidence routing, and agent actions - I've identified several areas that deserve deeper exploration. This section is where I document that journey. It is a place for papers, practical articles, technical notes, product thinking, and reflections on current AI progress. Some pieces will be formal and evidence-driven, exploring core architectural questions such as retrieval strategies, embedding methods, confidence routing, and how private knowledge systems perform across different domains. Others will be shorter observations from building real systems, responding to new AI developments, or thinking through how agents may change software, knowledge work, and personal memory. All of this is a work in progress - continuously updated as I discover new ideas, run new experiments, and refine the path forward.

Articles & Reflections

The Last Line of Code

April 3, 2026

After two years of working with AI tools and large language models, one thing has become increasingly clear: software engineering is entering its next major transition. This article explores the shift from writing code to designing intent—where engineers move beyond implementation and become AI Product Architects, defining systems through structured natural language and precise specifications.

Selecting the Best Embedding Method for Limited Domain Knowledge

May 21, 2025

This article compares chunk-based and query-focused embedding strategies for private-domain AI Agents in limited-domain settings. It outlines 15 chunking methods and highlights why query-based embedding offers greater precision and efficiency for structured knowledge bases.

Papers

Enhancing Query Retrieval Precision Through Optimized Embedding Text Selection

Nov 16, 2024

This paper analyzes how embedding construction affects semantic search precision in private-domain applications. It evaluates normalization, synonyms, typo handling, and conversational phrasing using text-embedding-3-large and ChromaDB, providing evidence-backed recommendations to improve retrieval accuracy.

Optimizing Retrieval in Private Knowledge Agents: A Comparison of Query-Focused and Chunk-Based Embedding Strategies

This paper compares query-focused embedding and chunk-based retrieval in private knowledge agent systems, using personal cooking stories. It evaluates precision, latency, and efficiency, showing how query-focused methods can improve accuracy and reduce overhead in structured, domain-specific AI Agent applications.

Want to Collaborate?

This is an ongoing personal research and building journey - not a finished product, but a path of discovery. There are many challenges ahead: from capturing intent to handling ambiguity, from structuring knowledge to routing actions and scaling across subjects within a domain. I'm approaching it step by step, using a divide-and-conquer mindset to explore, test, and refine. Is the end goal too ambitious? Perhaps. But I'm confident that I'll make progress, uncover interesting insights, and grow along the way.

If you're working on similar problems or have ideas to share, I welcome collaboration and conversation. Let's learn from each other.

Want to collaborate or learn more?
Email me at alex@i80.com