When AI agents fail mid-execution, they often lose their entire context and any work completed up to that point. API rate limits, network timeouts, and infrastructure failures can turn a sophisticated multi-step agent into an expensive waste of tokens. What if your agent could survive these failures and resume exactly where it left off?

This post walks through how I built a multi-model AI intelligence system using Temporal’s AI SDK integration for TypeScript. We’ll examine the architecture, explore how tools become Temporal Activities, and show how the scatter/gather pattern enables parallel queries across Claude’s model family.

What is NetWatch?

NetWatch is a cyberpunk-themed intelligence analysis system set in Night City, 2077. It demonstrates several enterprise integration patterns running on Temporal:

  • Multi-Model Scatter/Gather: Query Haiku 4.5, Sonnet 4.5, and Opus 4.5 in parallel, then aggregate results
  • Tool-Equipped AI Agents: Agents with access to corporate databases, runner profiles, and threat analysis tools
  • Durable Execution: Every LLM call and tool invocation is automatically persisted and retryable

[SCREENSHOT: NetWatch frontend terminal interface showing the cyberpunk-themed UI with query input and model selection]

Temporal’s AI SDK Integration

Temporal’s integration with the Vercel AI SDK lets you write AI agent code that looks almost identical to standard AI SDK usage, but with one critical difference: every LLM call becomes durable.

LLM API calls are fundamentally non-deterministic. In a Temporal Workflow, non-deterministic operations must run as Activities. The AI SDK plugin handles this automatically. When you call generateText(), the plugin wraps those calls in Activities behind the scenes.

This means your agent survives:

  • Infrastructure failures (process crashes, container restarts)
  • API rate limits (automatic retries with backoff)
  • Long-running operations (agents can run for hours or days)
  • Network timeouts (graceful retry handling)

The workflow code maintains the familiar Vercel AI SDK developer experience:

import { generateText, stepCountIs } from 'ai';
import { temporalProvider } from '@temporalio/ai-sdk';

// Simplified single-model version; the full multi-model workflow appears later in this post
export async function netwatchIntelAgent(request: IntelRequest): Promise<string> {
  const result = await generateText({
    model: temporalProvider.languageModel('claude-sonnet-4-5-20250929'),
    prompt: request.query,
    system: NETWATCH_SYSTEM_PROMPT,
    tools: createTools(),
    stopWhen: stepCountIs(10),
  });
  return result.text;
}

The only change from non-Temporal code is using temporalProvider.languageModel() instead of importing the model directly. This single change gives you durable execution, automatic retries, timeouts, and full observability.
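
For comparison, here is the equivalent non-durable call with the plain Vercel AI SDK (a minimal sketch; the prompt is illustrative):

import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

// Plain AI SDK call: no durability, retries, or event history
const result = await generateText({
  model: anthropic('claude-sonnet-4-5-20250929'),
  prompt: 'What do we know about Arasaka?',
});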

Architecture Overview

The NetWatch system consists of four main components that communicate through Temporal:

flowchart TB
    subgraph Client["Client Layer"]
        FE[Frontend UI]
        API[Express API Server]
    end

    subgraph Temporal["Temporal Server"]
        TS[Temporal Service]
        TQ[Task Queue: netwatch-intel]
        WS[Workflow State Store]
    end

    subgraph Worker["Worker Layer"]
        W[NetWatch Worker]
        WF[Workflows]
        ACT[Activities]
        AI[AI SDK Plugin]
    end

    subgraph External["External Services"]
        ANTH[Anthropic API]
        DB[(Intel Databases)]
    end

    FE -->|HTTP POST /api/intel| API
    API -->|gRPC: Start Workflow| TS
    TS -->|Queue Tasks| TQ
    TQ -->|Poll for Work| W
    W --> WF
    WF --> ACT
    WF --> AI
    AI -->|API Calls| ANTH
    ACT -->|Query Data| DB
    W -->|Report Completion| TS
    TS -->|Return Result| API
    API -->|JSON Response| FE

The Communication Flow

  1. Client to Temporal Server: The Express API server connects to Temporal via gRPC (default port 7233). When a request arrives, it starts a workflow execution with client.workflow.start().

  2. Temporal Server to Worker: Temporal doesn’t push work to workers. Instead, workers poll the Task Queue for work. This pull-based model means workers can scale independently and Temporal handles load distribution.

  3. Worker Execution: When the worker picks up a task, it executes the workflow code. Any LLM calls through temporalProvider.languageModel() are automatically wrapped as Activities.

  4. Result Propagation: The workflow result flows back through Temporal to the waiting client via handle.result().

How the Server Communicates with Temporal

The NetWatch server demonstrates a clean separation between HTTP handling and workflow orchestration:

import express from 'express';
import { Client, Connection } from '@temporalio/client';
import { netwatchIntelAgent } from './workflows/netwatch-agent';

const app = express();
app.use(express.json());

async function main() {
  // Establish gRPC connection to Temporal Server
  const connection = await Connection.connect({
    address: process.env.TEMPORAL_ADDRESS || 'localhost:7233',
  });
  const client = new Client({ connection });

  // Express route handler
  app.post('/api/intel', async (req, res) => {
    const { query, priority, requester } = req.body;
    const requestId = `REQ-${Date.now()}`;

    // Start workflow execution
    const handle = await client.workflow.start(netwatchIntelAgent, {
      taskQueue: 'netwatch-intel',
      workflowId: `netwatch-${requestId}`,
      args: [{ requestId, query, requester, priority }],
    });

    // Wait for workflow completion
    const result = await handle.result();
    res.json(result);
  });

  app.listen(3000);
}

main();

The server never handles API keys or makes LLM calls directly. It simply tells Temporal “run this workflow with these arguments” and waits for the result. This separation means:

  • API credentials only exist on worker nodes
  • The server scales independently from AI processing
  • Multiple servers can start workflows that any worker can process
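
Because all workflow state lives in Temporal, any client process can attach to an execution by its workflow ID, not just the server that started it. A minimal sketch (the ID below is illustrative):

import { Client, Connection } from '@temporalio/client';

const connection = await Connection.connect({ address: 'localhost:7233' });
const client = new Client({ connection });

// Attach to a workflow another server started and await its result
const handle = client.workflow.getHandle('netwatch-REQ-1730000000000');
const result = await handle.result();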

The Worker: Where AI Execution Happens

This is where the AI work actually runs. The worker is configured with the AiSdkPlugin, which enables durable LLM execution:

import path from 'path';
import { Worker, NativeConnection, bundleWorkflowCode } from '@temporalio/worker';
import { AiSdkPlugin } from '@temporalio/ai-sdk';
import { anthropic } from '@ai-sdk/anthropic';
// Path assumed; adjust to wherever netwatch-activities.ts lives in your repo
import * as netwatchActivities from './activities/netwatch-activities';

async function run() {
  const connection = await NativeConnection.connect({
    address: process.env.TEMPORAL_ADDRESS || 'localhost:7233',
  });

  const workflowBundle = await bundleWorkflowCode({
    workflowsPath: path.resolve(__dirname, '../workflows/index.ts'),
    workflowInterceptorModules: [path.resolve(__dirname, '../workflows/interceptors.ts')],
  });

  const worker = await Worker.create({
    connection,
    namespace: 'default',
    taskQueue: 'netwatch-intel',
    workflowBundle,
    activities: netwatchActivities,
    plugins: [
      new AiSdkPlugin({
        modelProvider: anthropic,
      }),
    ],
  });

  await worker.run();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});

The AiSdkPlugin configuration specifies Anthropic as the model provider. This is the only place where the Anthropic SDK is configured, and consequently, the only place that needs the ANTHROPIC_API_KEY environment variable.
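
Since the worker is the only process that talks to Anthropic, a startup guard there makes the requirement explicit; a small sketch (not from the project):

// Fail fast if the worker is misconfigured; the Anthropic provider
// reads ANTHROPIC_API_KEY from the environment by default
if (!process.env.ANTHROPIC_API_KEY) {
  throw new Error('ANTHROPIC_API_KEY must be set on worker nodes');
}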

flowchart LR
    subgraph Worker["NetWatch Worker Process"]
        direction TB
        WC[Worker Core]
        
        subgraph Plugins
            AIP[AiSdkPlugin]
            ANTH[Anthropic Provider]
        end
        
        subgraph Execution
            WF[Workflow Executor]
            AE[Activity Executor]
        end
        
        WC --> Plugins
        WC --> Execution
        AIP --> ANTH
    end
    
    ENV[ANTHROPIC_API_KEY] -.->|Environment| ANTH
    
    TQ[Task Queue] -->|Poll| WC
    AE -->|API Calls| API[Anthropic API]

Tools as Temporal Activities

In the NetWatch system, AI agents have access to five intelligence-gathering tools. Each tool is implemented as a Temporal Activity, which means every tool invocation gets the same durability guarantees as the LLM calls themselves.

The activities are defined in netwatch-activities.ts:

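// corporateIntel: in-memory record of known corporations, keyed by lowercased name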
export async function queryCorporateIntel(input: { corporation: string }): Promise<object> {
  console.log(`[NETWATCH] Querying corporate intel: ${input.corporation}`);
  const corp = input.corporation.toLowerCase();
  const intel = corporateIntel[corp];
  
  if (!intel) {
    return {
      error: 'Corporation not found in database',
      available: Object.keys(corporateIntel),
    };
  }
  return intel;
}

export async function analyzeThreat(input: {
  target: string;
  operation_type: string;
}): Promise<ThreatAssessment> {
  console.log(`[NETWATCH] Analyzing threat: ${input.target} - ${input.operation_type}`);
  // Threat analysis logic (elided here) derives threatLevel and recommendations
  return {
    target: input.target,
    threat_level: threatLevel,
    summary: `Threat assessment for ${input.operation_type} targeting ${input.target}`,
    recommendations,
  };
}

These activities are then wrapped as AI SDK tools in the workflow using proxyActivities:

// Workflow-side imports (paths assumed; adjust to your layout)
import { proxyActivities } from '@temporalio/workflow';
import { tool } from 'ai';
import { z } from 'zod';
import type * as activities from '../activities/netwatch-activities';

const {
  queryCorporateIntel,
  queryRunnerProfile,
  checkSecurityClearance,
  analyzeThreat,
  searchIncidentReports,
} = proxyActivities<typeof activities>({
  startToCloseTimeout: '60 seconds',
  retry: {
    initialInterval: '1 second',
    maximumAttempts: 3,
  },
});

function createTools(toolsUsed: string[]) {
  return {
    queryCorporateIntel: tool({
      description: 'Query the corporate intelligence database for information about a specific corporation',
      inputSchema: z.object({
        corporation: z.string().describe('The name of the corporation to query'),
      }),
      execute: async (input) => {
        toolsUsed.push('queryCorporateIntel');
        return await queryCorporateIntel(input);
      },
    }),
    analyzeThreat: tool({
      description: 'Analyze the threat level for a specific target or operation',
      inputSchema: z.object({
        target: z.string().describe('The target of the operation'),
        operation_type: z.string().describe('Type of operation'),
      }),
      execute: async (input) => {
        toolsUsed.push('analyzeThreat');
        return await analyzeThreat(input);
      },
    }),
    // ... additional tools
  };
}

When the LLM decides to call a tool, execution flows through Temporal's activity system.

If the activity fails (network error, database timeout), Temporal automatically retries it according to the configured retry policy. The workflow doesn’t need any error handling code for transient failures.
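
The retry policy shown earlier can be tuned per activity. A hedged sketch with a more explicit configuration (the backoff values and error type are illustrative, not from the project):

const { analyzeThreat } = proxyActivities<typeof activities>({
  startToCloseTimeout: '60 seconds',
  retry: {
    initialInterval: '1 second',
    backoffCoefficient: 2, // double the wait between attempts
    maximumInterval: '30 seconds',
    maximumAttempts: 3,
    nonRetryableErrorTypes: ['CorporationNotFoundError'], // hypothetical permanent-failure type
  },
});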

The Scatter/Gather Pattern for Multi-Model Queries

The most interesting architectural pattern in NetWatch is the scatter/gather approach to multi-model intelligence analysis. Rather than querying a single model, the workflow dispatches the same query to three different Claude models simultaneously and aggregates the results.

const CLAUDE_MODELS = {
  haiku: {
    id: 'claude-haiku-4-5-20251001',
    name: 'Haiku 4.5',
    tier: 'Fastest',
  },
  sonnet: {
    id: 'claude-sonnet-4-5-20250929',
    name: 'Sonnet 4.5',
    tier: 'Balanced',
  },
  opus: {
    id: 'claude-opus-4-5-20251101',
    name: 'Opus 4.5',
    tier: 'Most Capable',
  },
};

export async function netwatchIntelAgent(request: IntelRequest): Promise<IntelResponse> {
  const startTime = Date.now();

  // SCATTER: Query all three models in parallel
  const modelPromises = [
    queryModel('haiku', request.query),
    queryModel('sonnet', request.query),
    queryModel('opus', request.query),
  ];

  // Wait for all models; queryModel catches its own errors, so Promise.all never rejects
  const analyses = await Promise.all(modelPromises);

  // GATHER: Aggregate results
  const successCount = analyses.filter((a) => a.success).length;
  
  // Determine classification based on tools used
  const allToolsUsed = analyses.flatMap((a) => a.toolsUsed);
  let classification: IntelResponse['classification'] = 'PUBLIC';
  if (allToolsUsed.includes('analyzeThreat')) {
    classification = 'CLASSIFIED';
  }

  return {
    requestId: request.requestId,
    analyses,
    totalProcessingTime: Date.now() - startTime,
    classification,
  };
}

The queryModel function handles the individual LLM call:

async function queryModel(
  modelKey: keyof typeof CLAUDE_MODELS,
  query: string
): Promise<ModelAnalysis> {
  const modelConfig = CLAUDE_MODELS[modelKey];
  const startTime = Date.now();
  const toolsUsed: string[] = [];

  try {
    const result = await generateText({
      model: temporalProvider.languageModel(modelConfig.id),
      prompt: query,
      system: NETWATCH_SYSTEM_PROMPT,
      tools: createTools(toolsUsed),
      stopWhen: stepCountIs(10),
    });

    return {
      model: modelKey,
      modelName: modelConfig.name,
      modelTier: modelConfig.tier,
      analysis: result.text,
      toolsUsed,
      processingTime: Date.now() - startTime,
      success: true,
    };
  } catch (error) {
    return {
      model: modelKey,
      modelName: modelConfig.name,
      modelTier: modelConfig.tier,
      analysis: '',
      toolsUsed,
      processingTime: Date.now() - startTime,
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error',
    };
  }
}
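
Note that Promise.all is safe here only because queryModel catches its own errors and always resolves. If you'd rather not rely on that invariant, Promise.allSettled tolerates rejections directly; a minimal sketch (failedAnalysis is a hypothetical helper that builds a success: false ModelAnalysis):

const modelKeys = ['haiku', 'sonnet', 'opus'] as const;
const settled = await Promise.allSettled(
  modelKeys.map((key) => queryModel(key, request.query))
);
const analyses = settled.map((outcome, i) =>
  outcome.status === 'fulfilled'
    ? outcome.value
    : failedAnalysis(modelKeys[i], String(outcome.reason)) // hypothetical helper
);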

The scatter/gather flow fans the same query out from the workflow, through Temporal and the worker, to the Anthropic API, then gathers the per-model results back into a single response.

This pattern provides several benefits:

  1. Comparative Analysis: See how different capability tiers approach the same problem
  2. Redundancy: If one model fails, the others still provide results
  3. Cost Optimization: Compare the fast, inexpensive tier (Haiku) against the slower, more capable tier (Opus) for your use case
  4. Parallel Execution: All three queries run simultaneously, reducing total latency

Observability in Temporal

One of the most powerful aspects of running AI agents in Temporal is the built-in observability. Every workflow execution, activity invocation, and state change is recorded in Temporal’s event history.

The event history shows:

  • When each model query started and completed
  • Which tools each model decided to use
  • The exact inputs and outputs of every activity
  • Retry attempts if any calls failed
  • Total execution time and latency breakdowns

This visibility is invaluable for debugging AI agent behavior. When an agent makes unexpected tool calls or produces surprising results, you can replay the exact sequence of events to understand what happened.
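
The Web UI isn't the only way in; you can also pull the event history programmatically. A minimal sketch with the TypeScript client (the workflow ID is illustrative):

import { Client, Connection } from '@temporalio/client';

async function inspectHistory() {
  const connection = await Connection.connect({ address: 'localhost:7233' });
  const client = new Client({ connection });

  // Fetch the recorded event history for a workflow execution
  const handle = client.workflow.getHandle('netwatch-REQ-1730000000000');
  const history = await handle.fetchHistory();
  console.log(`${history.events?.length ?? 0} events recorded`);
}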

Try It Yourself

The complete project is available at github.com/jamescarr/night-city-services.

# Clone the repo
git clone https://github.com/jamescarr/night-city-services.git
cd night-city-services

# Start Temporal server + services
docker compose up -d

# Install dependencies
pnpm install

# Start the worker (requires ANTHROPIC_API_KEY)
export ANTHROPIC_API_KEY=<your-key>
pnpm run netwatch:worker

# Start the API server (in another terminal)
pnpm run netwatch:server

Open http://localhost:3000 and submit an intelligence query. Try queries like:

  • “What do we know about Arasaka? I need intel for a potential job.”
  • “I need a threat assessment for an extraction operation at Biotechnica Flats.”
  • “Give me everything you have on the runner known as V.”

[SCREENSHOT: NetWatch frontend showing side-by-side results from all three Claude models with processing times and tools used]

How This Fits the Cookbook

Temporal’s AI Cookbook documents several patterns for building durable AI systems. NetWatch combines a few of them:

| Pattern | NetWatch Implementation |
| --- | --- |
| Basic Agentic Loop with Tool Calling | Each model runs an agent loop with tools |
| Scatter-Gather | Parallel queries to Haiku, Sonnet, Opus |
| Durable Agent with Tools | Activities as tools with automatic retries |

The TypeScript AI SDK integration is currently in Public Preview. The @temporalio/ai-sdk package wraps Vercel’s AI SDK, making durable AI agents feel like writing normal code.

Wrapping Up

Building durable AI agents with Temporal provides guarantees that are difficult to achieve otherwise:

  1. Automatic Durability: LLM calls become Activities with built-in retry logic. No manual error handling required for transient failures.

  2. Clean Separation: API credentials stay on workers. Clients only need to know workflow names and Task Queues.

  3. Observable by Default: Every step of agent execution is recorded and can be inspected through Temporal’s UI.

  4. Familiar Developer Experience: The code looks almost identical to standard Vercel AI SDK usage. The temporalProvider.languageModel() wrapper is the only change.

  5. Pattern Support: Complex patterns like scatter/gather work naturally with Temporal’s parallel execution model.

Production-ready AI agents don’t require reinventing infrastructure. Temporal provides the durable execution layer, letting you focus on the agent logic itself.


“In Night City, information is currency. Every piece of intel you provide could mean the difference between a successful run and a body bag.”