05 | Giving AI a Bird's-eye View

When working with AI coding assistants on a large codebase, there's a fundamental issue: the AI can only realistically read a handful of files, but your system might span hundreds or thousands of files across multiple services. The AI sees a slice, which is fine most of the time, but sometimes you need it to understand the whole system, especially during planning. If you want the AI to help you think through a new feature, it can't do much when it doesn't know what APIs already exist, what tables are in your database, or how your services communicate. It's like asking someone to design a room addition when they've never seen the floor plan.

I've been using AGENTS.md files and DDRs to give AI context about patterns and decisions. But there's a gap: these documents describe how to work within the codebase, not what the current codebase contains. They describe the constraints for adding new functionality, but they don't say what else is in the system.

What I really needed was a way to give AI a bird's-eye view of the entire system.

The Problem: AI Sees Trees, Not Forest

When I ask an AI to help plan a feature I often need to give it lots of extra context about what we've already got, getting it up to speed with where we are.

The things I found myself explaining over and over were what APIs we have, what database tables we have, and how the services communicate. I was patching over this by adding it to the AGENTS.md files, but the information is only sometimes relevant, so most of the time it just wastes context.

I used to ask it to read the important controller files individually before being able to plan with it, but that gets quite impractical for anything other than small targeted changes. On any large project it's easy to have 100s of APIs and loads of database tables, not to mention integration events. No AI is going to read all of that into its context. The most eager AI I've dealt with would easily give up after about 10 files, confidently telling me that it's read all of the files.

The Solution: Code-Generated Reference Documentation

What I needed was an information-dense reference document that captures the metadata of the system without requiring the AI to read every source file.

Since I work in C# for the backend, I built a tool using Roslyn (Microsoft's .NET compiler platform) to statically analyse my codebase and generate three key reference files:

API Routes (api-routes.md) - Every endpoint with method, route, controller, request/response types, and descriptions
Database Schema (database-schema.md) - Every table with columns, types, indexes, foreign keys, relationships and table descriptions
Event Flow (event-flow.md) - Every integration event with what method publishes the event, what method consumes the event, and what services those are both in.

It does take a while to run, so for now I re-create the files by running it from the command line. A project that has been on the to-do list for a while is to move this into a GitHub Action that runs on every commit to main and commits the regenerated files back if anything changed.
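The "only commit if there were changes" part is easy to get right if the generator itself reports whether anything changed. Here's a minimal sketch in Python (not the Roslyn tool, which is C#); the function name `write_if_changed` is my own invention for illustration:

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path only when it differs; return True if written."""
    new_bytes = content.encode("utf-8")
    if path.exists() and path.read_bytes() == new_bytes:
        return False  # identical output, so there is nothing to commit
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(new_bytes)
    return True
```

A CI job could call this once per generated file and only create a commit when at least one call returns `True`. That also keeps file timestamps and diffs quiet on runs where nothing moved.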

What the Generated Docs Look Like

FYI, all of the tables below have been helpfully populated with dummy data by my friend Claude.

API Routes Reference

markdown

| Method | Route                         | Controller                  | Request Body         | Return Type      | Auth           | Description                |
| ------ | ----------------------------- | --------------------------- | -------------------- | ---------------- | -------------- | -------------------------- |
| POST   | `products`                    | CreateProductController     | CreateProductRequest | ProductResponse  | Authorized     | Creates a new product      |
| GET    | `products/{id}`               | GetProductController        | -                    | ProductResponse  | Authorized     | Retrieves product details  |
| DELETE | `products/{id}`               | DeleteProductController     | -                    | Ok               | IsProductOwner | Deletes a product          |
| GET    | `products/{id}/reviews`       | GetProductReviewsController | -                    | ReviewResponse[] | -              | Retrieves product reviews  |
| PATCH  | `products/{id}/reviews/{rid}` | UpdateReviewController      | UpdateReviewRequest  | ReviewResponse   | CanEditReview  | Updates an existing review |

The routes documentation is a single markdown table with every endpoint.

Database Schema Reference

markdown

### `products`

> Central product catalog entity. Maintains product information

#### Columns

| Column      | CLR Type       | DB Type                  | Nullable | Max Length | Default | Description                                 | Notes |
| ----------- | -------------- | ------------------------ | -------- | ---------- | ------- | ------------------------------------------- | ----- |
| Id          | long           | bigint                   | No       | -          | -       | Gets the unique identifier for this entity. | PK    |
| Name        | string         | TEXT                     | Yes      | -          | -       | The product display name.                   | -     |
| Description | string         | TEXT                     | Yes      | -          | -       | Full product description.                   | -     |
| Price       | decimal        | numeric(18,2)            | No       | -          | -       | Current product price.                      | -     |
| CategoryId  | long           | bigint                   | No       | -          | -       | Foreign key to product category.            | FK    |
| CreatedAt   | DateTimeOffset | timestamp with time zone | No       | -          | -       | When the product was created.               | -     |

#### Indexes

| Columns      | Unique | Purpose                                                               |
| ------------ | ------ | --------------------------------------------------------------------- |
| `CategoryId` | No     | ProductRepository.GetByCategoryAsync - filtering products by category |
| `CreatedAt`  | No     | ProductRepository.GetRecentProducts - ordering by creation date       |

#### Foreign Keys

| Column(s)    | References | On Delete | Relationship |
| ------------ | ---------- | --------- | ------------ |
| `CategoryId` | Categories | Restrict  | Many-to-One  |

The database documentation is a single markdown file with a section like this for every database table.
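If you don't have a compiler platform to lean on, the database itself can be the source of truth. The sketch below is an assumption-heavy stand-in: it uses SQLite's `PRAGMA table_info` from Python's standard library rather than anything from my setup, and it only covers columns, not indexes, foreign keys or descriptions.

```python
import sqlite3


def describe_table(conn: sqlite3.Connection, table: str) -> str:
    """Render one table's columns as a markdown section."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    lines = [
        f"### `{table}`",
        "",
        "| Column | DB Type | Nullable | Default | Notes |",
        "| --- | --- | --- | --- | --- |",
    ]
    for _cid, name, db_type, notnull, default, pk in rows:
        nullable = "No" if notnull else "Yes"
        default_text = default if default is not None else "-"
        notes = "PK" if pk else "-"
        lines.append(f"| {name} | {db_type} | {nullable} | {default_text} | {notes} |")
    return "\n".join(lines)


def describe_schema(conn: sqlite3.Connection) -> str:
    """Render every user table in the database, sorted by name."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return "\n\n".join(describe_table(conn, t) for t in tables)
```

The trade-off is that a live database knows types and constraints but not intent. Column descriptions and the "purpose" of an index have to come from the code or its comments.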

Event Flow Reference

markdown

| Event                    | Queue                      | Publisher(s)     | Publisher Source File(s)                 | Consumer(s)                           | Consumer Class                                              |
| ------------------------ | -------------------------- | ---------------- | ---------------------------------------- | ------------------------------------- | ----------------------------------------------------------- |
| `OrderPlacedEvent`       | `order-placed-event`       | OrderService     | OrderAggregateAddedDomainEventHandler.cs | InventoryService, NotificationService | InventoryReservationHandler, OrderNotificationHandler       |
| `PaymentProcessedEvent`  | `payment-processed-event`  | PaymentService   | ProcessPaymentCommandHandler.cs          | OrderService                          | OrderConfirmationHandler                                    |
| `InventoryReservedEvent` | `inventory-reserved-event` | InventoryService | ReserveInventoryCommandHandler.cs        | ShippingService, AnalyticsService     | ShippingNotificationHandler, InventoryAnalyticsHandler      |
| `OrderShippedEvent`      | `order-shipped-event`      | ShippingService  | ShipOrderCommandHandler.cs               | NotificationService, CustomerService  | ShippingNotificationHandler, CustomerOrderUpdateHandler     |
| `PaymentFailedEvent`     | `payment-failed-event`     | PaymentService   | ProcessPaymentCommandHandler.cs          | NotificationService, OrderService     | PaymentFailureNotificationHandler, OrderCancellationHandler |

The event flow documentation is a single markdown table with every event in the system.

When the AI understands this event flow, it can reason about async processes: "When an order is placed, the inventory is reserved, then shipping is notified." That kind of system-level understanding is impractical without seeing the full event map.
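The same table can be walked mechanically. Here's a small Python sketch that does so, with the rows copied from the dummy event table above. It works at service level only: it assumes that a service consuming one event may go on to publish any of its own events, which over-approximates the real causal chain.

```python
from collections import deque

# (event, publisher service, consumer services), copied from the sample table
EVENTS = [
    ("OrderPlacedEvent", "OrderService", ["InventoryService", "NotificationService"]),
    ("PaymentProcessedEvent", "PaymentService", ["OrderService"]),
    ("InventoryReservedEvent", "InventoryService", ["ShippingService", "AnalyticsService"]),
    ("OrderShippedEvent", "ShippingService", ["NotificationService", "CustomerService"]),
    ("PaymentFailedEvent", "PaymentService", ["NotificationService", "OrderService"]),
]


def downstream_events(start: str, events=EVENTS) -> list[str]:
    """Breadth-first list of events that may follow `start` (service-level reachability)."""
    consumers = {name: cons for name, _pub, cons in events}
    published_by: dict[str, list[str]] = {}
    for name, pub, _cons in events:
        published_by.setdefault(pub, []).append(name)

    seen, order, queue = {start}, [], deque([start])
    while queue:
        event = queue.popleft()
        for service in consumers.get(event, []):
            # Any event this consuming service publishes is a possible next step
            for nxt in published_by.get(service, []):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
    return order
```

With the sample rows, `downstream_events("OrderPlacedEvent")` returns `["InventoryReservedEvent", "OrderShippedEvent"]`, which matches the order, inventory, shipping chain described above. Tightening this to handler level would need the consumer class and publisher source file columns as well.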

Benefits for AI-Assisted Development

When I want to plan a new feature, I tell the AI to read the three reference files first. The AI can then make informed suggestions instead of generic recommendations.

When I ask the AI to help with an unfamiliar part of the codebase, the schema documentation provides instant context. "Oh, Product has an IsDiscontinued field and DiscontinuedReason - I can see the lifecycle model without reading the entity class."

The event flow documentation is invaluable for debugging and extension. The AI can trace: "When an image uploads, what happens?" It sees the full chain without reading handler implementations.

Starting Your Own Documentation Generator

I'm not able to share or release my generator as it's not a generic thing - it's tied directly into how my project works. I think any generator like this would be tightly coupled to the codebase, or at least ones built on Roslyn would be.

If you're working with a large codebase, consider building something similar. The core approach:

Identify what structure matters and isn't easily documented in AGENTS.md - APIs? Database? Message queues?
Use your language's analysis tools - Roslyn for C#, ASTs for TypeScript, reflection for Python
Generate structured, scannable output - Tables work great for AI consumption

The goal isn't comprehensive documentation for humans - it's compressed reference material that lets AI understand your system's shape without reading every file.
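As a taste of the reflection route, here's a minimal Python sketch that turns a dataclass into a compact table at runtime. `CreateProductRequest` here is a hypothetical type defined purely for the example, and the column choices are mine; adapt them to whatever your request and response types actually carry.

```python
import dataclasses
import typing


@dataclasses.dataclass
class CreateProductRequest:  # hypothetical request type, for illustration only
    name: str
    price: float
    description: typing.Optional[str] = None


def describe_dataclass(cls) -> str:
    """Render a dataclass's fields as a markdown table using runtime reflection."""
    hints = typing.get_type_hints(cls)
    lines = [f"### `{cls.__name__}`", "", "| Field | Type | Required |", "| --- | --- | --- |"]
    for field in dataclasses.fields(cls):
        # A field is required when it has neither a default nor a default factory
        required = (
            field.default is dataclasses.MISSING
            and field.default_factory is dataclasses.MISSING
        )
        hint = hints[field.name]
        type_name = hint.__name__ if isinstance(hint, type) else str(hint)
        lines.append(f"| {field.name} | {type_name} | {'Yes' if required else 'No'} |")
    return "\n".join(lines)
```

Reflection needs the code to be importable, which static analysis doesn't, so it suits smaller projects or ones where imports have no side effects.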

Bringing it all together

I've added references to the generated docs in my root AGENTS.md:

markdown

## Generated References

- **[API Routes](docs/generated/api-routes.md)** - All API endpoints with methods, routes, controllers, and request/response types
- **[Database Schema](docs/generated/database-schema.md)** - All tables with columns, types, indexes, foreign keys, and EF configurations
- **[Event Flow](docs/generated/event-flow.md)** - All RabbitMQ integration events with publishers, consumers, and service flows

This is another piece of the puzzle I've been building: making AI coding assistants genuinely useful on large codebases.

AGENTS.md files explain patterns, conventions, and "how to work here"
DDRs explain decisions, rationale, and constraints
Generated references explain "what exists" at a structural level

Together, they give the AI both the wisdom (patterns/decisions) and the knowledge (current state) needed to work effectively.

Combined with automatic context loading and change detection, I'm getting closer to AI that truly understands the project it's working on.

What approaches have you tried for giving AI visibility into large systems?