Tool Design Principles

Our MCP tools are designed for LLM consumption: each one is safe, deterministic, and composable.

Core Principles

Single Responsibility

Each tool does one thing well:

Tool	Responsibility
`read_dag`	Read raw DAG source code
`lookup_concept`	Query Airflow→Prefect translation knowledge
`search_prefect_docs`	Search live Prefect documentation
`validate`	Check syntax and return both sources for comparison
`scaffold`	Create Prefect project directory structure

Structured Outputs

All tools return JSON for predictable parsing:

{
  "source": "...",
  "file_path": "/absolute/path/to/dag.py",
  "file_size_bytes": 1234,
  "line_count": 45
}

This allows LLMs to reliably extract information and chain tool calls.
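
As a rough illustration of the response shape above, here is a minimal sketch of a function that builds such a payload. The name `read_dag_sketch` and its internals are assumptions made for this example, not the project's actual implementation; only the four JSON keys come from the sample response.

```python
import json
import tempfile
from pathlib import Path


def read_dag_sketch(path: str) -> str:
    """Hypothetical stand-in for read_dag: raw source plus metadata, as JSON."""
    file_path = Path(path).resolve()
    source = file_path.read_text(encoding="utf-8")
    payload = {
        "source": source,                       # unmodified file contents
        "file_path": str(file_path),            # always absolute
        "file_size_bytes": file_path.stat().st_size,
        "line_count": len(source.splitlines()),
    }
    return json.dumps(payload)


# Usage: write a tiny file, read it back, and parse the JSON as a caller would.
with tempfile.TemporaryDirectory() as tmp:
    dag_file = Path(tmp) / "dag.py"
    dag_file.write_text("# my dag\nprint('hello')\n", encoding="utf-8")
    result = json.loads(read_dag_sketch(str(dag_file)))
```

Because every field has a fixed key, a caller can read `result["line_count"]` or `result["source"]` without scraping free-form text.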

Explicit Error Handling

Errors include context and suggestions:

{
  "error": "Cannot connect to Prefect MCP at https://docs.prefect.io/mcp. Run 'colin run' for cached context."
}
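
One way to produce an error like the one above is to catch the failure at the tool boundary and return JSON instead of raising. The sketch below assumes a hypothetical `fetch` callable and hypothetical helper names; only the error text is taken from the example.

```python
import json


def error_response(problem: str, suggestion: str) -> str:
    """Build a JSON error whose message names the failure and a next step."""
    return json.dumps({"error": f"{problem} {suggestion}"})


def search_docs_sketch(query: str, fetch) -> str:
    """Hypothetical wrapper: `fetch` is any callable that may raise ConnectionError."""
    try:
        return json.dumps({"results": fetch(query)})
    except ConnectionError:
        # Return context (what failed, where) plus a suggestion (what to try).
        return error_response(
            "Cannot connect to Prefect MCP at https://docs.prefect.io/mcp.",
            "Run 'colin run' for cached context.",
        )


def offline_fetch(query: str):
    """Simulates a network failure for the usage example."""
    raise ConnectionError("network unreachable")


response = json.loads(search_docs_sketch("task retries", offline_fetch))
```

Returning the error as structured output keeps the failure visible to the LLM, which can then act on the suggestion rather than stopping at an exception.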

Composability

Tools can be chained in a predictable workflow:

read_dag → lookup_concept → search_prefect_docs → LLM generates → validate

Each tool's output feeds into the LLM's understanding for the next step.
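
The chain can be pictured as a loop that appends each tool's parsed output to a shared context. The sketch below uses hypothetical stubs throughout: the function signatures, the return fields, and `generate_with_llm` are placeholders, and in practice the LLM, not a fixed script, chooses which concepts to look up.

```python
import json


# All four functions are hypothetical stubs standing in for the real MCP tools.
def read_dag(path: str) -> str:
    return json.dumps({"source": "# placeholder DAG source", "file_path": path})


def lookup_concept(concept: str) -> str:
    return json.dumps({"concept": concept, "source": "fallback"})


def search_prefect_docs(query: str) -> str:
    return json.dumps({"query": query, "results": []})


def validate(original: str, generated: str) -> str:
    return json.dumps({"original": original, "generated": generated})


def generate_with_llm(context: list) -> str:
    """Placeholder for the LLM step; a real client would read `context`."""
    return "# placeholder Prefect flow"


def run_chain(path: str) -> list:
    """Call each tool in order, appending its parsed JSON output to the context."""
    context = []
    dag = json.loads(read_dag(path))
    context.append(("read_dag", dag))
    context.append(("lookup_concept", json.loads(lookup_concept("BashOperator"))))
    context.append(("search_prefect_docs", json.loads(search_prefect_docs("task retries"))))
    generated = generate_with_llm(context)
    context.append(("generate", {"code": generated}))
    context.append(("validate", json.loads(validate(dag["source"], generated))))
    return context


steps = [name for name, _ in run_chain("/absolute/path/to/dag.py")]
```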

Design Rationale

Why Raw Code Instead of AST Analysis?

We intentionally do not parse DAGs into AST structures. LLMs are better at reading code directly than processing structured AST representations.

AST-based approach (deprecated):

Lossy — drops comments, formatting, context
Fragile — breaks on dynamic DAGs, custom operators
Incomplete — can't represent all Python patterns

Raw code approach (current):

Lossless — LLM sees exactly what the developer wrote
Robust — handles any valid Python
Contextual — comments and structure preserved
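
The "lossy" point is easy to reproduce with Python's standard `ast` module (3.9 or later for `ast.unparse`). This is only a small demonstration of the general behavior, not the project's earlier AST implementation; the sample source is made up for the example.

```python
import ast

# A made-up fragment whose comments carry information a migrator would want.
dag_source = '''\
# Owner: data-platform team. Do not reschedule without asking.
schedule = "@daily"  # must finish before the 06:00 report
'''

# Parsing to an AST and back keeps the assignment but discards both comments.
round_tripped = ast.unparse(ast.parse(dag_source))

comments_survived = "Owner" in round_tripped or "06:00" in round_tripped
```

Here `comments_survived` is `False`: the ownership note and the deadline never reach anything downstream of the parser, whereas the raw source passes them to the LLM unchanged.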

Why Separate Lookup and Search?

lookup_concept and search_prefect_docs serve different purposes:

lookup_concept — Pre-compiled, instant, works offline. 78 entries covering common operators, patterns, and connections. Compiled from live Airflow source and Prefect docs via Colin.
search_prefect_docs — Live, comprehensive, requires network. Queries the real Prefect documentation for anything not pre-compiled.
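
A common way to combine the two is to try the pre-compiled lookup first and fall back to live search only on a miss. The sketch below assumes hypothetical callables and placeholder data; `resolve_concept`, the `via` field, and `CustomSensor` are illustrative names, not part of the real tools.

```python
from typing import Callable, Optional

# Placeholder table standing in for the pre-compiled entries.
COMPILED = {"BashOperator": "placeholder compiled mapping"}


def fake_lookup(name: str) -> Optional[str]:
    """Offline, instant: returns None when the concept was not pre-compiled."""
    return COMPILED.get(name)


def fake_search(query: str) -> list:
    """Stands in for a live documentation query that needs the network."""
    return [f"placeholder search hit for {query}"]


def resolve_concept(
    name: str,
    lookup: Callable[[str], Optional[str]],
    search: Callable[[str], list],
) -> dict:
    """Prefer the pre-compiled answer; search live docs only for gaps."""
    hit = lookup(name)
    if hit is not None:
        return {"concept": name, "via": "lookup_concept", "answer": hit}
    return {"concept": name, "via": "search_prefect_docs", "answer": search(name)}


known = resolve_concept("BashOperator", fake_lookup, fake_search)
unknown = resolve_concept("CustomSensor", fake_lookup, fake_search)
```

Keeping the two paths separate means the common case costs no network round trip, while uncommon patterns still get an answer when a connection is available.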

What Is Colin?

Colin is the build-time tool that compiles translation knowledge from live sources. It reads Airflow operator source code and Prefect documentation, summarizes the mappings with an LLM, and outputs JSON files that lookup_concept serves at runtime.

The compiled knowledge ships with the PyPI package — you don't need to run Colin yourself. If you're developing locally and want to recompile after upstream changes, run cd colin && colin run.

When you see "source": "colin" in a lookup_concept response, it means the answer came from pre-compiled knowledge. "source": "fallback" means Colin output wasn't available and the server used its minimal built-in mappings.
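
The "colin" versus "fallback" distinction could be implemented along these lines. This is a sketch under assumptions: the directory layout, the file name `operators.json`, and the contents of the built-in table are invented for illustration, and only the two source labels come from the text above.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Tuple

# Placeholder entry; the real built-in mappings are not shown in this document.
BUILTIN_FALLBACK: Dict[str, str] = {"BashOperator": "placeholder built-in mapping"}


def load_knowledge(compiled_dir: Path) -> Tuple[str, Dict[str, str]]:
    """Return (source_label, entries): compiled JSON if present, else built-ins."""
    entries: Dict[str, str] = {}
    if compiled_dir.is_dir():
        for json_file in sorted(compiled_dir.glob("*.json")):
            entries.update(json.loads(json_file.read_text(encoding="utf-8")))
    if entries:
        return "colin", entries
    return "fallback", dict(BUILTIN_FALLBACK)


with tempfile.TemporaryDirectory() as tmp:
    compiled = Path(tmp)
    # No compiled output yet, so the loader reports the fallback source.
    source_before, _ = load_knowledge(compiled)
    (compiled / "operators.json").write_text(
        json.dumps({"BashOperator": "placeholder compiled mapping"}), encoding="utf-8"
    )
    # With compiled output present, the same call reports "colin".
    source_after, entries_after = load_knowledge(compiled)
```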

Why Not Generate Code?

The tools provide knowledge, not code; the LLM does the generation. This split has three benefits:

Source code and translation rules in, working Prefect code out
Idiomatic output that matches how a Prefect developer would write it
Improve results by improving knowledge, not maintaining templates

Tool Categories

Input Tools

Read source code without side effects:

read_dag — Read raw DAG source

Knowledge Tools

Query translation knowledge:

lookup_concept — Pre-compiled Airflow→Prefect mappings
search_prefect_docs — Live Prefect documentation

Output Tools

Verify and scaffold:

validate — Syntax check + comparison guidance
scaffold — Project structure generation

Best Practices for Tool Usage

Start with read_dag — Let the LLM see the raw code first
Look up concepts as needed — Query lookup_concept for each Airflow pattern found
Search docs for gaps — Use search_prefect_docs when lookup doesn't cover something
Let the LLM generate — Don't constrain its output
Always validate — Catch syntax errors before deployment
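
For the final step, a syntax check can be as small as parsing the generated code and reporting the first error. The sketch below uses the standard `ast` module; the function name and the `valid`/`error` fields are assumptions for this example and may not match what validate actually returns.

```python
import ast
import json


def check_syntax(generated_code: str) -> str:
    """Hypothetical syntax check: parse the code and report the first error."""
    try:
        ast.parse(generated_code)
    except SyntaxError as exc:
        # Include the line number so the LLM knows where to look.
        return json.dumps({"valid": False, "error": f"{exc.msg} (line {exc.lineno})"})
    return json.dumps({"valid": True})


good = json.loads(check_syntax("def extract():\n    return 1\n"))
bad = json.loads(check_syntax("def extract(:\n    return 1\n"))
```

A parse-only check catches malformed code without importing or executing it, so it is safe to run on untrusted LLM output. It does not confirm that the flow behaves like the original DAG, which is why comparing both sources remains part of validation.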