Tool Design Principles

Our MCP tools are designed for LLM consumption—safe, deterministic, and composable.

Core Principles

Single Responsibility

Each tool does one thing well:

Tool                  Responsibility
read_dag              Read raw DAG source code
lookup_concept        Query Airflow→Prefect translation knowledge
search_prefect_docs   Search live Prefect documentation
validate              Syntax check + return both sources for comparison
scaffold              Create Prefect project directory structure

Structured Outputs

All tools return JSON for predictable parsing:

{
  "source": "...",
  "file_path": "/absolute/path/to/dag.py",
  "file_size_bytes": 1234,
  "line_count": 45
}

This allows LLMs to reliably extract information and chain tool calls.
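
As a rough illustration, a response of this shape could be assembled as shown here. This is a minimal sketch, not the server's actual implementation: the helper name build_read_dag_response is hypothetical, and only the four field names are taken from the example above.

```python
import json
from pathlib import Path


def build_read_dag_response(path: str) -> str:
    """Return a JSON string shaped like the example response (hypothetical helper)."""
    file_path = Path(path).resolve()  # report an absolute path, as in the example
    source = file_path.read_text(encoding="utf-8")
    payload = {
        "source": source,  # raw, unmodified DAG code
        "file_path": str(file_path),
        "file_size_bytes": file_path.stat().st_size,
        "line_count": len(source.splitlines()),
    }
    return json.dumps(payload, indent=2)
```

Because every field has a fixed name and type, a caller can parse the result with json.loads and read the fields directly rather than scraping free text.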

Explicit Error Handling

Errors include context and suggestions:

{
  "error": "Cannot connect to Prefect MCP at https://docs.prefect.io/mcp. Run 'colin run' for cached context."
}
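
On the consuming side, a client might check for the "error" key before using a response. The sketch below assumes that failures are always reported as a JSON object with an "error" field, as in the example above. The ToolError class and the unwrap function are hypothetical names, not part of the server.

```python
import json


class ToolError(Exception):
    """Raised when a tool response carries an "error" field (hypothetical type)."""


def unwrap(raw: str) -> dict:
    """Parse a tool's JSON response and raise if it reports an error."""
    data = json.loads(raw)
    if isinstance(data, dict) and "error" in data:
        # Surface the tool's own message, which already contains the suggestion.
        raise ToolError(data["error"])
    return data
```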

Composability

Tools can be chained in a predictable workflow:

read_dag → lookup_concept → search_prefect_docs → LLM generates → validate

Each tool's output feeds into the LLM's understanding for the next step.
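
The read-only part of this chain can be sketched as a small driver that collects each tool's output in order. This is only an illustration under stated assumptions: the tools are passed in as plain callables that take one string and return a JSON string, which may not match the real tool signatures, and gather_context is a hypothetical name.

```python
from typing import Callable, Dict, List

# Assumed shape: each tool takes a single string argument and returns JSON text.
Tool = Callable[[str], str]


def gather_context(dag_path: str, concepts: List[str], tools: Dict[str, Tool]) -> List[str]:
    """Call read_dag, then lookup_concept once per concept, keeping outputs in order."""
    outputs = [tools["read_dag"](dag_path)]
    for concept in concepts:
        outputs.append(tools["lookup_concept"](concept))
    return outputs
```

In practice the LLM decides which concepts to look up after reading the DAG, so the concepts list stands in for that decision.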

Design Rationale

Why Raw Code Instead of AST Analysis?

We intentionally do not parse DAGs into AST structures. LLMs are better at reading code directly than processing structured AST representations.

AST-based approach (deprecated):

  • Lossy — drops comments, formatting, context
  • Fragile — breaks on dynamic DAGs, custom operators
  • Incomplete — can't represent all Python patterns

Raw code approach (current):

  • Lossless — LLM sees exactly what the developer wrote
  • Robust — handles any valid Python
  • Contextual — comments and structure preserved
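
The "lossy" point is easy to check with Python's standard ast module (ast.unparse requires Python 3.9 or later): a parse and unparse round trip drops every comment. The sample source below is invented for illustration.

```python
import ast

# Invented two-line snippet; the comments carry the reasoning a developer left behind.
source = (
    "# Retry twice because the upstream API is flaky.\n"
    "retries = 2  # tuned in production\n"
)

# Round-trip through the AST: comments are not part of the tree, so they vanish.
round_tripped = ast.unparse(ast.parse(source))
print(round_tripped)  # prints: retries = 2
```

The assignment survives, but the explanation of why the value is 2 does not, and that kind of context often matters when choosing a Prefect equivalent.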

Why Separate Lookup and Search?

lookup_concept and search_prefect_docs serve different purposes:

  1. lookup_concept — Pre-compiled, instant, works offline. 78 entries covering common operators, patterns, and connections. Compiled from live Airflow source and Prefect docs via Colin.
  2. search_prefect_docs — Live, comprehensive, requires network. Queries the real Prefect documentation for anything not pre-compiled.
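
One way to combine the two is to try the offline lookup first and only go to the network on a miss. The sketch below uses stand-in callables rather than the real tools, and it assumes a miss is signalled by None, which is an assumption about the interface rather than documented behaviour. The name find_guidance is hypothetical.

```python
from typing import Callable, Optional


def find_guidance(
    concept: str,
    lookup: Callable[[str], Optional[dict]],  # stand-in for lookup_concept
    search: Callable[[str], dict],  # stand-in for search_prefect_docs
) -> dict:
    """Prefer the pre-compiled answer; fall back to the live search on a miss."""
    hit = lookup(concept)
    if hit is not None:  # assumed miss signal: None
        return hit
    return search(concept)
```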

What Is Colin?

Colin is the build-time tool that compiles translation knowledge from live sources. It reads Airflow operator source code and Prefect documentation, summarizes the mappings with an LLM, and outputs JSON files that lookup_concept serves at runtime.

The compiled knowledge ships with the PyPI package, so you don't need to run Colin yourself. If you're developing locally and want to recompile after upstream changes, run 'cd colin && colin run'.

When you see "source": "colin" in a lookup_concept response, it means the answer came from pre-compiled knowledge. "source": "fallback" means Colin output wasn't available and the server used its minimal built-in mappings.

Why Not Generate Code?

The tools provide knowledge, not code. The LLM does the generation:

  • Source code and translation rules in, working Prefect code out
  • Idiomatic output that matches how a Prefect developer would write it
  • Improve results by improving knowledge, not maintaining templates

Tool Categories

Input Tools

Read source code without side effects:

  • read_dag — Read raw DAG source

Knowledge Tools

Query translation knowledge:

  • lookup_concept — Pre-compiled Airflow→Prefect mappings
  • search_prefect_docs — Live Prefect documentation

Output Tools

Verify and scaffold:

  • validate — Syntax check + comparison guidance
  • scaffold — Project structure generation

Best Practices for Tool Usage

  1. Start with read_dag — Let the LLM see the raw code first
  2. Look up concepts as needed — Query lookup_concept for each Airflow pattern found
  3. Search docs for gaps — Use search_prefect_docs when lookup doesn't cover something
  4. Let the LLM generate — Don't constrain its output
  5. Always validate — Catch syntax errors before deployment
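
For step 5, a syntax check of generated code needs nothing beyond Python's built-in compile, which parses the code without executing it. The sketch below shows the idea only. It is not the validate tool's implementation, and the check_syntax name and the "valid" field are hypothetical.

```python
import json


def check_syntax(code: str) -> str:
    """Compile (but never run) the generated code and report the result as JSON."""
    try:
        compile(code, "<generated_flow>", "exec")
    except SyntaxError as exc:
        # Include the line number so the LLM can locate and fix the problem.
        return json.dumps({"valid": False, "error": f"{exc.msg} (line {exc.lineno})"})
    return json.dumps({"valid": True})
```

A check like this catches only syntax errors. Whether the generated flow behaves like the original DAG still has to be judged by comparing the two sources.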