Before You Start: The AI Concepts That Actually Matter for Builders

Most introductions to AI for developers are written from the outside. Here’s what transformers are. Here’s how attention works. Here’s a taxonomy of model families.

This series is written from the inside. Every piece is grounded in a real project — a mixin-based business framework called DarJS, built over several months of AI-assisted development. The patterns here weren’t derived from theory. They were discovered by hitting the problems that theory doesn’t warn you about.

Before the series begins, it helps to know which AI concepts you’ll actually use as a builder. Not the full landscape — the specific handful that recur in every serious project.

The Concepts (In Order of When You’ll Need Them)

1. Context Window Management

The AI doesn’t remember. Every conversation starts cold. The context window is everything the AI knows right now — your code, your instructions, your conversation history.

This shapes every decision about how you structure a codebase for AI work. Where you put documentation matters because the AI reads it every session. How much you load matters because every token costs. What you index matters because the AI can’t grep a file it hasn’t been given.

The good news: context window management is mostly a file organization problem. Solve it once, benefit every session.
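To make the "every token costs" point concrete, here is a minimal sketch of loading files under a fixed token budget. Everything in it is an assumption for illustration: the file paths other than CLAUDE.md, the priority numbers, the 500-token budget, and especially the four-characters-per-token estimate, which is a rough heuristic and not a real tokenizer.

```typescript
// Hypothetical sketch: decide which files fit in a fixed context budget.
interface ContextFile {
  path: string;
  text: string;
  priority: number; // lower number = load first
}

// Rough heuristic only: about 4 characters per token.
// A real tokenizer will give different counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface ContextPlan {
  loaded: string[];
  skipped: string[];
  used: number;
}

// Greedy selection: walk files in priority order and load each one that still fits.
function selectForContext(files: ContextFile[], budgetTokens: number): ContextPlan {
  const plan: ContextPlan = { loaded: [], skipped: [], used: 0 };
  const ordered = [...files].sort((a, b) => a.priority - b.priority);
  for (const file of ordered) {
    const cost = estimateTokens(file.text);
    if (plan.used + cost <= budgetTokens) {
      plan.loaded.push(file.path);
      plan.used += cost;
    } else {
      plan.skipped.push(file.path);
    }
  }
  return plan;
}

// Placeholder contents sized to 100, 200, and 1000 estimated tokens.
const plan = selectForContext(
  [
    { path: "CLAUDE.md", text: "x".repeat(400), priority: 0 },
    { path: "docs/contract-index.md", text: "x".repeat(800), priority: 1 },
    { path: "src/legacy/big-module.js", text: "x".repeat(4000), priority: 2 },
  ],
  500,
);
```

With these placeholder sizes, the two small documentation files fit and the large module is skipped. The point of the sketch is that the priority order comes from how the project is organized, not from anything the model does.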

Covered in: 013 — Your Directory Layout Is Now a Routing Table

2. Retrieval-Augmented Generation (RAG)

The pattern: before asking the AI to generate something, retrieve the relevant existing code, documentation, or context and include it in the prompt.

The basic version is copy-paste. The intermediate version is structured annotations on every public function (@reuse-when, @does, @tags) that a search tool can query. The advanced version is vector embeddings — semantic search over your entire codebase.

The important insight: most of what developers ask an AI to generate already exists somewhere in the codebase. RAG turns “generate this from scratch” into “find this and adapt it.” In a well-annotated codebase, retrieval finds a usable match often enough that many requests never need fresh generation at all.
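The intermediate version, annotations that a search tool can query, can be sketched in a few lines. The tag names (@does, @reuse-when, @tags) come from the text above; the comment layout, the two example functions, and the scoring weights are assumptions made for this sketch, not the format DarJS actually uses.

```typescript
// Hypothetical sketch: keyword retrieval over annotated doc comments.
interface Contract {
  name: string;
  does: string;
  reuseWhen: string;
  tags: string[];
}

// Pull the rest of the line that follows "@<tag> " out of a doc comment.
function parseContract(name: string, docComment: string): Contract {
  const field = (tag: string): string => {
    const match = docComment.match(new RegExp(`@${tag}[ \\t]+(.*)`));
    return (match?.[1] ?? "").trim();
  };
  const tags = field("tags")
    .split(",")
    .map((t) => t.trim().toLowerCase())
    .filter((t) => t.length > 0);
  return { name, does: field("does"), reuseWhen: field("reuse-when"), tags };
}

// Score each contract: 2 points per query term that is a tag,
// 1 point per term found in the @does or @reuse-when text.
function search(contracts: Contract[], query: string): Contract[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return contracts
    .map((contract) => {
      const text = `${contract.does} ${contract.reuseWhen}`.toLowerCase();
      let score = 0;
      for (const term of terms) {
        if (contract.tags.includes(term)) score += 2;
        if (text.includes(term)) score += 1;
      }
      return { contract, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.contract);
}

// Two made-up annotated functions.
const contracts: Contract[] = [
  parseContract("formatMoney", `
 * @does Format an amount in minor units as a localized currency string
 * @reuse-when You need to display a price or total
 * @tags currency, format, locale
`),
  parseContract("slugify", `
 * @does Convert a title to a URL-safe slug
 * @reuse-when You need a stable identifier from user text
 * @tags string, url
`),
];

const hits = search(contracts, "format currency");
```

Here `hits` contains only the formatMoney contract, which is what would be pasted into the prompt in place of a request to write a currency formatter from scratch. Swapping the scoring function for vector similarity gives the advanced version without changing the surrounding shape.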

Covered in: 011 — The NLP-First Codebase, 012 — Writing Code for Machines

3. Tool Use / Function Calling

The AI can call functions. You define the functions; the AI decides when and how to call them.

This changes what’s possible. An AI without tools can only produce text: code, explanations, suggestions. An AI with tools can inspect your app’s live structure, write files, run tests, and verify the result. That is the difference between advising and acting.

The two-part structure: read tools give the AI eyes (introspect the app, search contracts, check health). Write tools give it hands (create models, scaffold apps, fix locale keys). Read tools alone make the AI a better advisor. Read and write tools together make it a working collaborator.
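One way to picture the read/write split is a tool registry where each tool declares its kind and a single switch decides whether write tools are allowed to run. The tool names, the in-memory model list, and the `allowWrites` flag below are all hypothetical; they stand in for whatever a real framework exposes.

```typescript
// Hypothetical sketch: a tool registry that separates read tools from write tools.
type ToolKind = "read" | "write";

interface Tool {
  name: string;
  kind: ToolKind;
  description: string; // what the model sees when deciding which tool to call
  run: (args: Record<string, string>) => string; // returns a JSON string
}

// In-memory stand-in for the app the tools wrap.
const appModels: string[] = ["Customer"];

const tools: Tool[] = [
  {
    name: "inspect_models",
    kind: "read",
    description: "List the models currently defined in the app.",
    run: () => JSON.stringify({ ok: true, models: appModels }),
  },
  {
    name: "create_model",
    kind: "write",
    description: "Create a new, empty model with the given name.",
    run: (args) => {
      const name = args["name"] ?? "";
      if (name === "" || appModels.includes(name)) {
        return JSON.stringify({ ok: false, error: "name is missing or already taken" });
      }
      appModels.push(name);
      return JSON.stringify({ ok: true, created: name });
    },
  },
];

// Dispatch a tool call requested by the model.
// With allowWrites = false the AI is an advisor: it can look but not touch.
function callTool(name: string, args: Record<string, string>, allowWrites: boolean): string {
  const tool = tools.find((t) => t.name === name);
  if (!tool) {
    return JSON.stringify({ ok: false, error: `unknown tool: ${name}` });
  }
  if (tool.kind === "write" && !allowWrites) {
    return JSON.stringify({ ok: false, error: `${name} is a write tool and writes are disabled` });
  }
  return tool.run(args);
}

const advisorAttempt = callTool("create_model", { name: "Invoice" }, false);
const builderAttempt = callTool("create_model", { name: "Invoice" }, true);
const afterWrite = callTool("inspect_models", {}, false);
```

The first call is refused, the second creates the model, and the final read call shows both models. Keeping the kind on the tool definition means the same registry serves the advisor and the collaborator.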

The key design problem is stability: the tools wrap a codebase that changes. Solving that problem is most of what articles 017-018 are about.

Covered in: 014 — Your Framework Needs a dar inspect, 017 — The Stable Adapter Layer, 018 — From Oracle to Builder

4. Structured Outputs

Models produce text. Text is ambiguous. Structured outputs constrain the model to emit JSON, or a specific schema, or a tool call — something a program can parse without guessing.

Every place you accept free text from an AI is a point of failure in an automated workflow. Every place you enforce a schema, error handling shrinks to a single question: did the output validate or not?

In practice this means: tool return values should be typed JSON. AI-generated config should conform to a spec. The DSL between AI and your app should have defined valid values, not arbitrary strings.
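As a sketch of "defined valid values, not arbitrary strings", the snippet below validates a model's raw output against a small spec before anything downstream touches it. The `FieldSpec` shape and the three allowed types are invented for the example; a real project would define its own spec or use a schema library.

```typescript
// Hypothetical sketch: accept AI output only if it matches a small, closed spec.
const FIELD_TYPES = ["string", "number", "date"] as const;
type FieldType = (typeof FIELD_TYPES)[number];

interface FieldSpec {
  name: string;
  type: FieldType;
}

type ParseResult =
  | { ok: true; value: FieldSpec }
  | { ok: false; error: string };

function parseFieldSpec(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "output is not a JSON object" };
  }
  const record = data as Record<string, unknown>;
  const name = record["name"];
  if (typeof name !== "string" || name.length === 0) {
    return { ok: false, error: "name must be a non-empty string" };
  }
  // Only values from the closed list are accepted; anything else is rejected.
  const type = FIELD_TYPES.find((t) => t === record["type"]);
  if (type === undefined) {
    return { ok: false, error: `type must be one of: ${FIELD_TYPES.join(", ")}` };
  }
  return { ok: true, value: { name, type } };
}

const valid = parseFieldSpec('{"name": "total", "type": "number"}');
const wrongValue = parseFieldSpec('{"name": "total", "type": "money"}');
const freeText = parseFieldSpec("Sure! I would add a numeric field called total.");
```

Only the first call succeeds. The other two fail with a specific message that can be fed back to the model for a retry, which is the one piece of error handling the workflow still needs.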

Covered in: 010 — The DSL Layer Between AI and Your App

5. Agentic Loops

A loop: the AI calls a tool, observes the result, decides what to call next, repeats until done.

The scaffold workflow in this series is a minimal example: suggest mixins → scaffold app (dry run) → user confirms → scaffold (write) → generate PageDef → health check → fix locale. Six tool calls, one confirmation, and a runnable app. No other human intervention between steps.

The safety primitive for loops is a confirmation gate: a way to pause the loop and show the human what’s about to happen before it happens. Without it, an agentic loop does whatever it decides, with no oversight. With it, the loop handles the mechanical work and the human handles the judgment calls.
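The loop and its gate fit in a short sketch. To keep it runnable without a model, a scripted planner stands in for the AI choosing the next call, and the tool names mirror the scaffold workflow above but are hypothetical, as are the `needsConfirmation` flag and the step cap.

```typescript
// Hypothetical sketch: an agentic loop with a confirmation gate.
interface Step {
  tool: string;
  needsConfirmation: boolean; // true for the step the human must approve
}

// Stand-in for the model: given what has happened so far, pick the next call.
type Planner = (history: string[]) => Step | null;

function runLoop(
  planner: Planner,
  execute: (step: Step) => string,
  confirm: (step: Step) => boolean,
  maxSteps: number,
): string[] {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = planner(history);
    if (step === null) break; // the planner says the work is done
    if (step.needsConfirmation && !confirm(step)) {
      history.push(`${step.tool}: declined by user`);
      break; // stop before anything is written
    }
    history.push(`${step.tool}: ${execute(step)}`);
  }
  return history;
}

// Scripted version of the scaffold workflow: six tool calls, one gate.
const script: Step[] = [
  { tool: "suggest_mixins", needsConfirmation: false },
  { tool: "scaffold_app_dry_run", needsConfirmation: false },
  { tool: "scaffold_app", needsConfirmation: true },
  { tool: "generate_pagedef", needsConfirmation: false },
  { tool: "health_check", needsConfirmation: false },
  { tool: "fix_locale", needsConfirmation: false },
];

function runScaffold(userApproves: boolean): { history: string[]; executed: string[] } {
  const executed: string[] = [];
  const history = runLoop(
    (past) => script[past.length] ?? null,
    (step) => {
      executed.push(step.tool);
      return "ok";
    },
    () => userApproves,
    10,
  );
  return { history, executed };
}

const approved = runScaffold(true);
const declined = runScaffold(false);
```

When the user approves, all six tools run. When the user declines, only the two steps before the gate have executed and nothing has been written. The `maxSteps` cap is a second, cruder safety net for a planner that never says it is done.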

Covered in: 018 — From Oracle to Builder

6. Evals

The thing most developers skip until something breaks badly.

An eval is a deterministic check over AI-generated output. Not “does this look right?” but “does this pass these specific assertions?” It is a test suite for the AI’s output, not just for your code.

In a DarJS app, the scenario runner is the beginning of an eval framework: scaffold a model, run the scenario, assert the expected records exist. The scenario doesn’t know or care that the model was AI-generated. It just checks the behavior.
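A minimal sketch of that idea follows: a list of named, deterministic checks run against whatever records exist after a scenario. The record shape and both checks are assumptions for illustration and are not the DarJS scenario runner's API.

```typescript
// Hypothetical sketch: deterministic checks over the result of a scenario run.
interface StoredRecord {
  model: string;
  fields: Record<string, unknown>;
}

interface Check {
  name: string;
  pass: (records: StoredRecord[]) => boolean;
}

interface EvalReport {
  passed: string[];
  failed: string[];
}

// Run every check; the report never depends on who or what produced the records.
function runEval(records: StoredRecord[], checks: Check[]): EvalReport {
  const report: EvalReport = { passed: [], failed: [] };
  for (const check of checks) {
    (check.pass(records) ? report.passed : report.failed).push(check.name);
  }
  return report;
}

const invoiceChecks: Check[] = [
  {
    name: "an Invoice record exists",
    pass: (records) => records.some((r) => r.model === "Invoice"),
  },
  {
    name: "every Invoice total is a non-negative number",
    pass: (records) =>
      records
        .filter((r) => r.model === "Invoice")
        .every((r) => {
          const total = r.fields["total"];
          return typeof total === "number" && total >= 0;
        }),
  },
];

// Records as they might look after a scenario ran against a scaffolded model.
const goodRun = runEval([{ model: "Invoice", fields: { total: 120 } }], invoiceChecks);
// Same scenario, but the generated model stored the total as a string.
const badRun = runEval([{ model: "Invoice", fields: { total: "120" } }], invoiceChecks);
```

The first run passes both checks; the second fails the total check even though the record "looks right" to a human skimming it. That gap is what the eval exists to close.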

As write-capable AI tools mature, evals become load-bearing. The only way to trust an AI that writes files is to have something that verifies what it wrote.

Covered in: 009 — How to Make Your App AI-Testable, 010 — The DSL Layer

7. Prompt Caching

A practical detail that compounds into significant savings.

Long system prompts — your CLAUDE.md, your KNOWLEDGE.md, your contract index — can be cached by the model provider. The first call pays full price; subsequent calls within the cache window pay a fraction. For a well-structured project where the same context is loaded every session, caching is the difference between expensive and affordable at scale.

The implication: putting more into a well-structured system prompt is often cheaper than loading it piecemeal, because it’s cached once and reused many times. The architecture of your project instructions is also a cost architecture.
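The arithmetic behind that claim can be sketched as follows. The numbers are placeholders, not any provider's actual prices: one cost unit per input token, a first call at full price as described above, and cached reads at an assumed one tenth of that. The sketch also assumes every call lands inside the cache window.

```typescript
// Hypothetical sketch: cost of re-sending the same system prompt, with and without caching.
interface Pricing {
  inputPerToken: number; // cost units per uncached input token (placeholder)
  cacheWriteMultiplier: number; // applied to the first call, which fills the cache
  cacheReadMultiplier: number; // applied to later calls that hit the cache
}

function promptCost(promptTokens: number, calls: number, pricing: Pricing, cached: boolean): number {
  if (calls <= 0) return 0;
  const fullPrice = promptTokens * pricing.inputPerToken;
  if (!cached) {
    return calls * fullPrice;
  }
  // First call writes the cache; the remaining calls read from it.
  const firstCall = fullPrice * pricing.cacheWriteMultiplier;
  const laterCalls = (calls - 1) * fullPrice * pricing.cacheReadMultiplier;
  return firstCall + laterCalls;
}

// Illustrative values only.
const pricing: Pricing = {
  inputPerToken: 1,
  cacheWriteMultiplier: 1.0,
  cacheReadMultiplier: 0.1,
};

// A 20,000-token system prompt sent on 50 calls.
const uncachedCost = promptCost(20_000, 50, pricing, false);
const cachedCost = promptCost(20_000, 50, pricing, true);
const savingsRatio = cachedCost / uncachedCost;
```

Under these assumed multipliers the cached total is a small fraction of the uncached one, and the gap widens as the number of calls grows. Plugging in a provider's published rates and cache lifetime turns the sketch into a real estimate.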

The Through-Line

Read the list again. Context management is a file organization problem. RAG is a search problem. Tool use is an API design problem. Structured outputs are a schema problem. Agentic loops are a workflow design problem. Evals are a testing problem. Prompt caching is a cost architecture problem.

Every AI concept, when it touches real code, becomes a software engineering problem. The developers who will get the most out of AI-assisted development are not the ones who understand transformers best. They’re the ones who can recognize which software problem each AI concept reduces to, and then solve that problem well.

That’s what this series is about.

How to Read It

The articles are mostly self-contained. You can read them in order or jump to the topic you’re facing.

A few landmarks:

If you’re starting a new AI-assisted project: 001 (instructions), 003 (framework prompt), 013 (directory layout)

If you want less LLM dependency: 011 (NLP contracts), 012 (code for machines), 014 (dar inspect)

If you’re building AI tooling over a framework: 017 (adapter pattern), 018 (write tools), 014 (inspect layer)

If you want AI-verifiable apps: 009 (AI-testable design), 010 (DSL layer), 015 (PageDef autofill)

All code examples are from DarJS. The patterns transfer to any framework that takes itself seriously enough to be worth wrapping.