OpenAI API with Claude Code: TypeScript Integration 2026

The OpenAI API is the backbone of many modern AI applications, from simple chatbots to complex RAG pipelines and autonomous agents. Combining it with TypeScript and Claude Code gives you full type safety and a noticeably faster development loop: Claude Code generates, refactors, and debugs the API layer while you focus on architecture and product logic.

In this article we walk through six central OpenAI APIs with complete, production-oriented TypeScript examples, as they are used in real projects in 2026. The code snippets originated in Claude Code sessions and cover setup, error handling, and cost optimization.

Tags: TypeScript, OpenAI API, Claude Code, GPT-4o, Function Calling, RAG, Streaming, Batch API

Section 1

OpenAI Client Setup & Chat Completions

Getting started begins with the official openai npm package. It ships complete TypeScript types, so you benefit from autocompletion and compile-time checks right away. Claude Code reads these types and can derive correct API calls from them, often without you having to open the documentation.

Installation

# npm
npm install openai

# pnpm
pnpm add openai

# yarn
yarn add openai

# TypeScript types ship with the package, so no @types package is needed

After installation, create the client. Best practice: never hardcode the API key; read it from an environment variable instead. Claude Code can add .env validation for you.

Client Setup

// lib/openai-client.ts
import OpenAI from 'openai';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';

if (!process.env.OPENAI_API_KEY) {
  throw new Error('OPENAI_API_KEY is not set');
}

export const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  maxRetries: 3,
  timeout: 30_000, // 30 seconds
});

// Typed message helpers
export function userMessage(content: string): ChatCompletionMessageParam {
  return { role: 'user', content };
}

export function systemMessage(content: string): ChatCompletionMessageParam {
  return { role: 'system', content };
}

export function assistantMessage(content: string): ChatCompletionMessageParam {
  return { role: 'assistant', content };
}
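
The single check for OPENAI_API_KEY above can be generalized once more variables are involved. The following is a minimal sketch of such a validation step, not part of the openai package: `requireEnv` and the `Env` type are hypothetical names, and the environment is passed in as a plain record so the helper stays easy to test (in Node.js you would pass `process.env`).

```typescript
// Hypothetical helper (assumption, not an SDK function): checks that all
// required variables are present and returns them with non-optional types.
type Env = Record<string, string | undefined>;

function requireEnv<K extends string>(
  env: Env,
  names: readonly K[]
): Record<K, string> {
  // Treat unset and whitespace-only values as missing
  const missing = names.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  const result = {} as Record<K, string>;
  for (const name of names) {
    result[name] = env[name] as string;
  }
  return result;
}

// Usage with an inline record; in an application, pass process.env instead
const sketchConfig = requireEnv(
  { OPENAI_API_KEY: 'sk-test' },
  ['OPENAI_API_KEY'] as const
);
```

Reporting all missing variables in one error, rather than failing on the first, tends to shorten the fix-and-restart loop during deployment.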

Chat Completions

// services/chat.ts
import { openai, userMessage, systemMessage } from '../lib/openai-client';
import type { ChatCompletionMessageParam } from 'openai/resources';

interface ChatOptions {
  model?: string;
  temperature?: number;
  maxTokens?: number;
  systemPrompt?: string;
}

interface ChatResult {
  content: string;
  inputTokens: number;
  outputTokens: number;
  model: string;
}

export async function chat(
  messages: ChatCompletionMessageParam[],
  options: ChatOptions = {}
): Promise<ChatResult> {
  const {
    model = 'gpt-4o',
    temperature = 0.7,
    maxTokens = 2048,
    systemPrompt,
  } = options;

  const fullMessages: ChatCompletionMessageParam[] = systemPrompt
    ? [systemMessage(systemPrompt), ...messages]
    : messages;

  const response = await openai.chat.completions.create({
    model,
    messages: fullMessages,
    temperature,
    max_tokens: maxTokens,
  });

  const choice = response.choices[0];
  if (!choice?.message?.content) {
    throw new Error(`No content in response: ${response.id}`);
  }

  return {
    content: choice.message.content,
    inputTokens: response.usage?.prompt_tokens ?? 0,
    outputTokens: response.usage?.completion_tokens ?? 0,
    model: response.model,
  };
}

// Usage
const result = await chat(
  [userMessage('Explain the difference between REST and GraphQL')],
  { systemPrompt: 'You are an experienced backend developer.', temperature: 0.3 }
);
console.log(result.content);

Claude Code tip: Give Claude Code your ChatOptions interface and ask it to add retry logic with exponential backoff for rate-limit errors (429). Claude Code recognizes the OpenAI error classes and usually generates type-safe retry code in under 30 seconds.
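
To make the retry idea concrete, here is a minimal sketch of exponential backoff with jitter. It is an illustration under stated assumptions, not the SDK's own mechanism: `withRetry`, `backoffDelayMs`, and `isRateLimitError` are hypothetical names, and the rate-limit check only assumes that the thrown error carries a numeric `status` property of 429.

```typescript
// Sketch only: generic retry wrapper with exponential backoff and jitter.
interface RetryOptions {
  maxAttempts?: number; // total attempts, including the first one
  baseDelayMs?: number; // upper bound of the delay before the first retry
  maxDelayMs?: number;  // cap for any single delay
}

// Delay bound for a given retry (0-based): base * 2^retry, capped at maxDelayMs
function backoffDelayMs(retry: number, baseDelayMs = 500, maxDelayMs = 8_000): number {
  return Math.min(baseDelayMs * 2 ** retry, maxDelayMs);
}

// Assumption: rate-limit errors expose the HTTP status as `status === 429`
function isRateLimitError(err: unknown): boolean {
  return typeof err === 'object' && err !== null &&
    (err as { status?: unknown }).status === 429;
}

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const { maxAttempts = 4, baseDelayMs = 500, maxDelayMs = 8_000 } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const isLastAttempt = attempt >= maxAttempts - 1;
      // Only rate-limit errors are retried; everything else is rethrown
      if (!isRateLimitError(err) || isLastAttempt) throw err;

      // Full jitter: wait a random time between 0 and the exponential bound
      const delay = Math.random() * backoffDelayMs(attempt, baseDelayMs, maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A call would look like `await withRetry(() => chat([userMessage('...')]))`. Because the client above is created with maxRetries: 3, the SDK already retries certain failed requests on its own; if you layer a wrapper like this on top, consider lowering maxRetries so the attempts do not multiply.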

Model selection 2026: GPT-4o is the default for complex reasoning tasks. GPT-4o-mini suits simple classification and short answers at a fraction of the cost. Claude Code can recommend a model based on your use case and set the model parameter accordingly.

Section 2

Function Calling & Tool Use

Function calling is one of the most powerful features of the OpenAI API: the model decides on its own when and how to call external functions. With TypeScript and Claude Code, the entire tool system can be built in a type-safe way, from the JSON Schema definition to parsing the results.

Tool Definition

// types/tools.ts
import type { ChatCompletionTool } from 'openai/resources';

export const weatherTool: ChatCompletionTool = {
  type: 'function',
  function: {
    name: 'get_current_weather',
    description: 'Gets the current weather for a city',
    parameters: {
      type: 'object',
      properties: {
        city: {
          type: 'string',
          description: 'Name of the city, e.g. "Berlin"',
        },
        unit: {
          type: 'string',
          enum: ['celsius', 'fahrenheit'],
          description: 'Temperature unit',
        },
      },
      required: ['city'],
    },
  },
};

export const databaseTool: ChatCompletionTool = {
  type: 'function',
  function: {
    name: 'query_database',
    description: 'Runs a safe database query',
    parameters: {
      type: 'object',
      properties: {
        table: { type: 'string', description: 'Table name' },
        filter: { type: 'object', description: 'WHERE conditions' },
        limit: { type: 'number', description: 'Max. results (default 10)' },
      },
      required: ['table'],
    },
  },
};

Tool Execution

// services/tool-runner.ts
import { openai } from '../lib/openai-client';
import { weatherTool, databaseTool } from '../types/tools';
import type { ChatCompletionMessageParam } from 'openai/resources';

// Typed tool handler map
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

const toolHandlers: Record<string, ToolHandler> = {
  get_current_weather: async ({ city, unit = 'celsius' }) => {
    // Placeholder: call a real weather API here
    return { city, temperature: 22, unit, condition: 'sunny' };
  },
  query_database: async ({ table, filter, limit = 10 }) => {
    // Placeholder: run a type-safe DB query here
    return { rows: [], table, filter, limit };
  },
};

export async function runWithTools(
  userQuery: string,
  tool_choice: 'auto' | 'none' | 'required' = 'auto'
): Promise<string> {
  const messages: ChatCompletionMessageParam[] = [
    { role: 'user', content: userQuery },
  ];

  // First API call: the model decides whether to call a tool
  let response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages,
    tools: [weatherTool, databaseTool],
    tool_choice,
    parallel_tool_calls: true, // Allow several tool calls in one turn
  });

  // Process tool calls until the model returns a final answer
  while (response.choices[0].finish_reason === 'tool_calls') {
    const assistantMsg = response.choices[0].message;
    messages.push(assistantMsg);

    // Run all tool calls in parallel
    const toolResults = await Promise.all(
      (assistantMsg.tool_calls ?? []).map(async (toolCall) => {
        const handler = toolHandlers[toolCall.function.name];
        if (!handler) {
          throw new Error(`Unknown tool: ${toolCall.function.name}`);
        }
        const args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>;
        const result = await handler(args);
        return {
          role: 'tool' as const,
          tool_call_id: toolCall.id,
          content: JSON.stringify(result),
        };
      })
    );

    messages.push(...toolResults);

    // Follow-up call with the tool results
    response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages,
      tools: [weatherTool, databaseTool],
    });
  }

  return response.choices[0].message.content ?? '';
}
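
The runner above passes toolCall.function.arguments straight to JSON.parse, which throws if the model emits malformed JSON. One way to handle this, sketched here under the assumption that you would rather report the problem back to the model than abort the whole loop, is to parse defensively and return a result object. `parseToolArguments` and `ParsedArgs` are hypothetical names introduced for illustration.

```typescript
// Sketch: parse tool-call arguments defensively instead of throwing.
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string };

function parseToolArguments(raw: string): ParsedArgs {
  try {
    const value: unknown = JSON.parse(raw);
    // Tool arguments are expected to be a JSON object, not an array or primitive
    if (typeof value !== 'object' || value === null || Array.isArray(value)) {
      return { ok: false, error: 'Arguments must be a JSON object' };
    }
    return { ok: true, args: value as Record<string, unknown> };
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${(err as Error).message}` };
  }
}

// Usage inside a tool loop (illustrative):
//   const parsed = parseToolArguments(toolCall.function.arguments);
//   const content = parsed.ok
//     ? JSON.stringify(await handler(parsed.args))
//     : JSON.stringify({ error: parsed.error });
```

Sending the error text back as the content of the tool message gives the model a chance to correct its arguments on the next turn.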

Structured Output

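// Structured Outputs with a JSON Schema: the response is constrained to the schema,
// although the extracted values themselves can still be wrong
import { zodResponseFormat } from 'openai/helpers/zod';
import { z } from 'zod';
import { openai } from '../lib/openai-client'; // adjust the path to your project layout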

const ProductSchema = z.object({
  name: z.string(),
  price: z.number().positive(),
  category: z.enum(['software', 'hardware', 'service']),
  features: z.array(z.string()).max(5),
  inStock: z.boolean(),
});

const response = await openai.beta.chat.completions.parse({
  model: 'gpt-4o-2024-08-06',
  messages: [
    { role: 'system', content: 'Extract product data from the text.' },
    { role: 'user', content: 'MacBook Pro M4, €2,499, laptop, 20h battery, M4 chip, available' },
  ],
  response_format: zodResponseFormat(ProductSchema, 'product'),
});

const product = response.choices[0].message.parsed;
// product is fully typed (or null if nothing could be parsed): no any, no cast
console.log(product?.name); // e.g. 'MacBook Pro M4'
console.log(product?.price); // e.g. 2499

Pro tip for function calling: Claude Code can generate the matching JSON Schema for an OpenAI tool from an existing TypeScript interface. Describe your data structure and let Claude Code produce the complete ChatCompletionTool object, including descriptions and required fields.
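
If you prefer to build tool objects by hand, a small helper keeps the property list and the required array in sync. The sketch below is an illustration, not an SDK feature: `defineFunctionTool`, `PropertySpec`, and the local `FunctionTool` type are hypothetical, and the local type is a simplified mirror of the tool shape used in types/tools.ts (assumed, not verified here, to be structurally compatible with ChatCompletionTool).

```typescript
// Simplified local types (assumption: compatible with the SDK's tool shape)
interface JsonSchemaProperty {
  type: 'string' | 'number' | 'boolean' | 'object' | 'array';
  description?: string;
  enum?: string[];
}

interface FunctionTool {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: {
      type: 'object';
      properties: Record<string, JsonSchemaProperty>;
      required: string[];
    };
  };
}

// A property plus a flag saying whether it belongs in `required`
interface PropertySpec extends JsonSchemaProperty {
  required?: boolean;
}

function defineFunctionTool(
  name: string,
  description: string,
  spec: Record<string, PropertySpec>
): FunctionTool {
  const properties: Record<string, JsonSchemaProperty> = {};
  const required: string[] = [];

  for (const [key, { required: isRequired, ...schema }] of Object.entries(spec)) {
    properties[key] = schema; // the flag itself is not part of the JSON Schema
    if (isRequired) required.push(key);
  }

  return {
    type: 'function',
    function: { name, description, parameters: { type: 'object', properties, required } },
  };
}

// Rebuilds the weather tool from the Tool Definition example
const weatherToolSketch = defineFunctionTool(
  'get_current_weather',
  'Gets the current weather for a city',
  {
    city: { type: 'string', description: 'Name of the city, e.g. "Berlin"', required: true },
    unit: { type: 'string', enum: ['celsius', 'fahrenheit'], description: 'Temperature unit' },
  }
);
```

Declaring `required` next to each property avoids the common mistake of renaming a property and forgetting to update the separate required list.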

Section 3

Streaming Responses

Streaming transforms the user experience: instead of waiting for the complete answer, the user sees the text appear token by token. The OpenAI API provides an AsyncIterable stream for this, which TypeScript handles neatly with for await.

Basic Streaming

// services/streaming.ts
import { openai } from '../lib/openai-client';

export async function streamChat(
  userMessage: string,
  onToken: (token: string) => void,
  signal?: AbortSignal
): Promise<string> {
  let fullContent = '';

  // create() with stream: true returns an async iterable of chunks
  const stream = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: userMessage }],
    stream: true,
  }, { signal });

  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) {
      fullContent += delta;
      onToken(delta);
    }

    // Check finish_reason
    const finishReason = chunk.choices[0]?.finish_reason;
    if (finishReason === 'length') {
      console.warn('Token limit reached: the answer may be truncated');
    }
  }

  return fullContent;
}

// Usage with abort
const controller = new AbortController();
setTimeout(() => controller.abort(), 10_000); // Abort after 10 s

const full = await streamChat(
  'Write a detailed article about AI in software development',
  (token) => process.stdout.write(token),
  controller.signal
);

Server-Sent Events (SSE)

// api/stream-endpoint.ts (Next.js / Hono / Express)
import { openai } from '../lib/openai-client';

// Next.js App Router Route Handler
export async function POST(req: Request): Promise<Response> {
  const { message } = await req.json() as { message: string };

  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      try {
        // create() with stream: true returns an async iterable of chunks
        const stream = await openai.chat.completions.create({
          model: 'gpt-4o',
          messages: [{ role: 'user', content: message }],
          stream: true,
        });

        for await (const chunk of stream) {
          const delta = chunk.choices[0]?.delta?.content ?? '';
          if (delta) {
            // SSE format: "data: <JSON payload>\n\n"
            controller.enqueue(encoder.encode(`data: ${JSON.stringify({ delta })}\n\n`));
          }
        }
        controller.enqueue(encoder.encode('data: [DONE]\n\n'));
        controller.close();
      } catch (err) {
        // error() already terminates the stream; calling close() afterwards would throw
        controller.error(err);
      }
    },
  });

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}

Client-Side Processing

// hooks/useStreamingChat.ts (React)
import { useState, useCallback } from 'react';

export function useStreamingChat() {
  const [response, setResponse] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  const sendMessage = useCallback(async (message: string) => {
    setIsStreaming(true);
    setResponse('');

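    try {
      const res = await fetch('/api/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message }),
      });
      if (!res.ok || !res.body) {
        throw new Error(`Stream request failed with status ${res.status}`);
      }

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // A chunk can end mid-line, so keep the incomplete tail for the next read
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? '';

        for (const line of lines) {
          if (line.startsWith('data: ') && line !== 'data: [DONE]') {
            const { delta } = JSON.parse(line.slice(6)) as { delta: string };
            setResponse(prev => prev + delta);
          }
        }
      }
    } finally {
      // Reset the flag even if the request or parsing fails
      setIsStreaming(false);
    }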
  }, []);

  return { response, isStreaming, sendMessage };
}
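
Network chunks do not necessarily line up with SSE event boundaries: a single read can deliver half a line or several events at once. The following sketch isolates the buffering logic in a small class so it can be tested without a browser. `SseDataBuffer` is a hypothetical name, and the parser only handles the simple `data: ...` lines produced by the endpoint in this section, not the full SSE specification.

```typescript
// Sketch: accumulates decoded text and returns only complete `data:` payloads.
class SseDataBuffer {
  private pending = '';

  // Feed one decoded chunk; returns the payloads of all lines completed so far
  push(chunk: string): string[] {
    this.pending += chunk;
    const lines = this.pending.split('\n');
    // The last element is '' or an incomplete line: keep it for the next chunk
    this.pending = lines.pop() ?? '';

    return lines
      .map((line) => line.trimEnd()) // tolerate \r\n line endings
      .filter((line) => line.startsWith('data: '))
      .map((line) => line.slice('data: '.length))
      .filter((payload) => payload !== '[DONE]');
  }
}

// Usage: an event split across two chunks is only emitted once it is complete
const sseBuffer = new SseDataBuffer();
const firstPayloads = sseBuffer.push('data: {"delta":"Hel');
const secondPayloads = sseBuffer.push('lo"}\n\ndata: [DONE]\n\n');
```

When feeding this from a fetch reader, decoding with `decoder.decode(value, { stream: true })` additionally prevents multi-byte characters from being cut at chunk boundaries.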

Streaming + abort: Claude Code helps you integrate AbortController correctly into React components, including cleanup in useEffect. This is especially important so that no memory leaks occur when a component unmounts.

Section 4

Assistants API

OpenAI's Assistants API enables persistent conversation threads, file uploads, and built-in tools such as file_search and code_interpreter. It is well suited to complex multi-turn dialogs in which the assistant needs to access documents and execute code.

Creating the Assistant

// services/assistant-manager.ts
import { openai } from '../lib/openai-client';
import type { Assistant, Thread, Run } from 'openai/resources/beta';

// Create the assistant once (store its ID in a DB or config)
export async function createAssistant(): Promise<Assistant> {
  return await openai.beta.assistants.create({
    name: 'Agentic Movers Support',
    instructions: `You are a helpful AI assistant for Agentic Movers.
You help users with questions about AI tools, Claude Code, and TypeScript.
Use file_search to access the documentation.
Always answer in German, precisely and with code examples where useful.`,
    model: 'gpt-4o',
    tools: [
      { type: 'file_search' },
      { type: 'code_interpreter' },
    ],
    temperature: 0.3,
    top_p: 0.95,
  });
}

Thread & Run

// Create a thread and send a message
export async function createThread(): Promise<Thread> {
  return await openai.beta.threads.create();
}

export async function addUserMessage(
  threadId: string,
  content: string,
  fileIds: string[] = []
): Promise<void> {
  await openai.beta.threads.messages.create(threadId, {
    role: 'user',
    content,
    attachments: fileIds.map(fileId => ({
      file_id: fileId,
      tools: [{ type: 'file_search' }],
    })),
  });
}

// Create a run and wait for it to finish (polling)
export async function runAndWait(
  assistantId: string,
  threadId: string,
  maxWaitMs = 120_000
): Promise<string> {
  let run: Run = await openai.beta.threads.runs.create(threadId, {
    assistant_id: assistantId,
  });

  const deadline = Date.now() + maxWaitMs;

  // Poll until the run has finished
  while (['queued', 'in_progress', 'cancelling'].includes(run.status)) {
    if (Date.now() > deadline) {
      await openai.beta.threads.runs.cancel(threadId, run.id);
      // Backticks are required here; single quotes would not interpolate maxWaitMs
      throw new Error(`Run timed out after ${maxWaitMs}ms`);
    }
    await new Promise(resolve => setTimeout(resolve, 1500));
    run = await openai.beta.threads.runs.retrieve(threadId, run.id);
  }

  if (run.status !== 'completed') {
    throw new Error(`Run failed: ${run.status} (${run.last_error?.message ?? 'no error details'})`);
  }

  // Fetch the latest assistant message
  const messages = await openai.beta.threads.messages.list(threadId, {
    order: 'desc',
    limit: 1,
  });

  const lastMsg = messages.data[0];
  if (lastMsg?.role !== 'assistant') return '';

  return lastMsg.content
    .filter(c => c.type === 'text')
    .map(c => c.type === 'text' ? c.text.value : '')
    .join('\n');
}

Complete Workflow

// Putting it together: upload → thread → message → run → response
export async function assistantQA(
  assistantId: string,
  question: string,
  pdfPath?: string
): Promise<string> {
  const thread = await createThread();
  let fileIds: string[] = [];

  if (pdfPath) {
    const { createReadStream } = await import('fs');
    const file = await openai.files.create({
      file: createReadStream(pdfPath),
      purpose: 'assistants',
    });
    fileIds = [file.id];
  }

  try {
    await addUserMessage(thread.id, question, fileIds);
    return await runAndWait(assistantId, thread.id);
  } finally {
    // Clean up: delete the thread and uploaded files, even if the run failed
    await openai.beta.threads.del(thread.id);
    await Promise.all(fileIds.map(id => openai.files.del(id)));
  }
}

Assistants vs. Chat Completions: use the Assistants API when you need persistent threads, file uploads, or code_interpreter. For simple one-off requests, Chat Completions is faster and cheaper. Claude Code helps you pick the right API for your use case.

Streaming with Assistants: the v4 OpenAI SDK also supports streaming for the Assistants API via openai.beta.threads.runs.stream(), which removes the need for polling. Claude Code can generate the stream handlers for you.

Section 5

Embeddings & Semantic Search

Embeddings are the core of every RAG pipeline: text is converted into high-dimensional vectors that capture semantic similarity. In 2026, text-embedding-3-small offers the best price-performance ratio: 1536 dimensions at a fraction of the cost of text-embedding-ada-002.

Generating Embeddings

// services/embeddings.ts
import { openai } from '../lib/openai-client';

type EmbeddingModel = 'text-embedding-3-small' | 'text-embedding-3-large';

export async function embed(
  text: string,
  model: EmbeddingModel = 'text-embedding-3-small'
): Promise<number[]> {
  const response = await openai.embeddings.create({
    model,
    input: text,
    encoding_format: 'float',
  });
  return response.data[0].embedding;
}

// Batch embedding for multiple texts (more efficient)
export async function embedBatch(
  texts: string[],
  model: EmbeddingModel = 'text-embedding-3-small'
): Promise<number[][]> {
  // OpenAI allows up to 2048 inputs per request; a smaller batch keeps each request well within token limits
  const BATCH_SIZE = 100;
  const results: number[][] = [];

  for (let i = 0; i < texts.length; i += BATCH_SIZE) {
    const batch = texts.slice(i, i + BATCH_SIZE);
    const response = await openai.embeddings.create({ model, input: batch });
    results.push(...response.data.map(d => d.embedding));
  }

  return results;
}

Cosine Similarity

// utils/similarity.ts

// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite direction
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Vector dimensions do not match: ${a.length} vs ${b.length}`);
  }

  let dotProduct = 0;
  let normA = 0;
  let normB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] ** 2;
    normB += b[i] ** 2;
  }

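  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat it as dissimilar instead of returning NaN
  return denominator === 0 ? 0 : dotProduct / denominator;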
}

// Semantic search over an array of documents
interface Document {
  id: string;
  content: string;
  embedding: number[];
  metadata?: Record<string, unknown>;
}

export function semanticSearch(
  queryEmbedding: number[],
  documents: Document[],
  topK = 5,
  threshold = 0.7
): Array<Document & { score: number }> {
  return documents
    .map(doc => ({
      ...doc,
      score: cosineSimilarity(queryEmbedding, doc.embedding),
    }))
    .filter(doc => doc.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
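
When all vectors have length 1, cosine similarity reduces to a plain dot product, which saves the two square roots per comparison. OpenAI's documentation describes its embeddings as normalized to length 1; treat that as an assumption to verify for the model you use. The sketch below shows the equivalence on a tiny worked example, with `normalize` and `dot` as hypothetical helper names.

```typescript
// Scales a vector to length 1 (unit norm)
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) throw new Error('Cannot normalize a zero vector');
  return v.map((x) => x / norm);
}

// Dot product; equals cosine similarity if both inputs have unit length
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Vector dimensions do not match: ${a.length} vs ${b.length}`);
  }
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Worked example: a = (3, 4) and b = (4, 3), both of length 5
// cosine = (3*4 + 4*3) / (5 * 5) = 24 / 25 = 0.96
const unitA = normalize([3, 4]); // (0.6, 0.8)
const unitB = normalize([4, 3]); // (0.8, 0.6)
const similarity = dot(unitA, unitB); // approximately 0.96
```

For a store with many documents this matters mostly at scale; for a few hundred vectors the full cosine function is just as practical.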

RAG Pipeline

// services/rag.ts: complete RAG pipeline
import { embed, embedBatch } from './embeddings';
import { semanticSearch } from '../utils/similarity';
import { chat } from './chat';
import { userMessage } from '../lib/openai-client';

interface RagDocument {
  id: string;
  content: string;
  source: string;
  embedding?: number[];
}

export class VectorStore {
  private documents: Required<RagDocument>[] = [];

  async addDocuments(docs: RagDocument[]): Promise<void> {
    const texts = docs.map(d => d.content);
    const embeddings = await embedBatch(texts);

    this.documents.push(
      ...docs.map((doc, i) => ({ ...doc, embedding: embeddings[i] }))
    );
  }

  async query(question: string, topK = 3): Promise<string> {
    const queryEmbedding = await embed(question);
    const relevant = semanticSearch(queryEmbedding, this.documents, topK);

    if (relevant.length === 0) {
      return 'No relevant documents found.';
    }

    const context = relevant
      .map((doc, i) => `[${i + 1}] (Score: ${doc.score.toFixed(3)}) ${doc.content}`)
      .join('\n\n');

    const prompt = `Context from the knowledge base:

${context}

Question: ${question}

Answer based on the context. If the context does not answer the question, say so explicitly.`;

    const result = await chat(
      [userMessage(prompt)],
      { temperature: 0.1, systemPrompt: 'You are a precise assistant. Answer only on the basis of the given context.' }
    );

    return result.content;
  }
}

Model	Dimensions	Price / 1M tokens (USD)	Recommendation
`text-embedding-3-small`	1536	$0.02	Standard RAG, production
`text-embedding-3-large`	3072	$0.13	Highest quality
`text-embedding-ada-002`	1536	$0.10	Legacy, no longer recommended
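
To see what these prices mean for a concrete corpus, the per-token arithmetic can be sketched as follows. The prices are copied from the table above; the corpus size (10,000 documents of roughly 500 tokens each) is a made-up illustration, and `embeddingCostUsd` is a hypothetical helper name.

```typescript
// Prices in USD per 1M tokens, taken from the table above
const EMBEDDING_PRICE_PER_MILLION = {
  'text-embedding-3-small': 0.02,
  'text-embedding-3-large': 0.13,
  'text-embedding-ada-002': 0.10,
} as const;

type PricedEmbeddingModel = keyof typeof EMBEDDING_PRICE_PER_MILLION;

// cost = (tokens / 1,000,000) * price per 1M tokens
function embeddingCostUsd(totalTokens: number, model: PricedEmbeddingModel): number {
  return (totalTokens / 1_000_000) * EMBEDDING_PRICE_PER_MILLION[model];
}

// Illustrative corpus: 10,000 documents * 500 tokens = 5M tokens
const corpusTokens = 10_000 * 500;
const costSmall = embeddingCostUsd(corpusTokens, 'text-embedding-3-small'); // about $0.10
const costLarge = embeddingCostUsd(corpusTokens, 'text-embedding-3-large'); // about $0.65
```

At this scale the one-time indexing cost is small for either model, so the choice is usually driven more by retrieval quality and vector storage size than by the embedding price.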

Section 6

Batch API & Cost Optimization

The OpenAI Batch API is one of the most underrated features: if you do not depend on immediate answers, asynchronous batch requests can cut costs by up to 50%. It is ideal for data processing, large-scale content generation, and nightly analysis jobs.

When to use the Batch API? Whenever you have more than about 20 independent requests and do not need the results immediately. The API processes batches asynchronously within 24 hours; in practice this usually takes a few minutes to a few hours.

Creating the JSONL File

// services/batch.ts
import { openai } from '../lib/openai-client';
import { writeFileSync, createReadStream } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';

interface BatchRequest {
  customId: string;
  prompt: string;
  systemPrompt?: string;
}

interface BatchLine {
  custom_id: string;
  method: 'POST';
  url: '/v1/chat/completions';
  body: {
    model: string;
    messages: Array<{ role: string; content: string }>;
    max_tokens: number;
  };
}

export function createBatchJsonl(
  requests: BatchRequest[],
  model = 'gpt-4o-mini'
): string {
  const lines: BatchLine[] = requests.map(req => ({
    custom_id: req.customId,
    method: 'POST',
    url: '/v1/chat/completions',
    body: {
      model,
      messages: [
        ...(req.systemPrompt
          ? [{ role: 'system', content: req.systemPrompt }]
          : []),
        { role: 'user', content: req.prompt },
      ],
      max_tokens: 1024,
    },
  }));

  const jsonl = lines.map(line => JSON.stringify(line)).join('\n');
  const tmpPath = join(tmpdir(), `batch-${Date.now()}.jsonl`);
  writeFileSync(tmpPath, jsonl, 'utf-8');
  return tmpPath;
}
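
createBatchJsonl above writes all requests into a single file. OpenAI documents per-batch limits on the number of requests and on file size (at the time of writing, 50,000 requests and 200 MB per input file; treat these numbers as an assumption and check the current documentation). For larger workloads, a simple approach is to split the request list into chunks and submit one batch per chunk. `chunkRequests` is a hypothetical helper, sketched generically.

```typescript
// Sketch: split a list into consecutive chunks of at most `size` items.
function chunkRequests<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError(`size must be a positive integer, got ${size}`);
  }

  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Usage with small numbers for illustration: five requests, two per batch
const requestIds = ['r1', 'r2', 'r3', 'r4', 'r5'];
const requestChunks = chunkRequests(requestIds, 2);
// requestChunks is [['r1', 'r2'], ['r3', 'r4'], ['r5']]
```

Each chunk would then go through createBatchJsonl and the upload step separately. Keep the custom_id values unique across chunks so results can be merged afterwards without collisions.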

Submitting & Monitoring a Batch

// Upload the batch file and start the job
export async function submitBatch(
  requests: BatchRequest[],
  model = 'gpt-4o-mini'
): Promise<string> {
  const jsonlPath = createBatchJsonl(requests, model);

  // Upload the JSONL file to OpenAI
  const uploadedFile = await openai.files.create({
    file: createReadStream(jsonlPath),
    purpose: 'batch',
  });

  // Start the batch job
  const batch = await openai.batches.create({
    input_file_id: uploadedFile.id,
    endpoint: '/v1/chat/completions',
    completion_window: '24h',
    metadata: {
      description: `Batch with ${requests.length} requests`,
      created_at: new Date().toISOString(),
    },
  });

  console.log(`Batch started: ${batch.id} | Status: ${batch.status}`);
  return batch.id;
}

// Poll the status and fetch the results
export async function waitForBatch(
  batchId: string,
  pollIntervalMs = 30_000
): Promise<Map<string, string>> {
  let batch = await openai.batches.retrieve(batchId);

  // 'expired' is also terminal; without it an expired batch would be polled forever
  while (!['completed', 'failed', 'expired', 'cancelled'].includes(batch.status)) {
    const progress = batch.request_counts;
    console.log(`Batch ${batchId}: ${batch.status} | ` +
      `${progress?.completed ?? 0}/${progress?.total ?? 0} completed`);
    await new Promise(r => setTimeout(r, pollIntervalMs));
    batch = await openai.batches.retrieve(batchId);
  }

  if (batch.status !== 'completed') {
    throw new Error(`Batch failed: ${batch.status}`);
  }

  return parseBatchOutput(batch.output_file_id!);
}

Parsing the Output

// Parse the batch output JSONL
async function parseBatchOutput(
  outputFileId: string
): Promise<Map<string, string>> {
  const fileResponse = await openai.files.content(outputFileId);
  const rawText = await fileResponse.text();

  const results = new Map<string, string>();

  for (const line of rawText.split('\n').filter(Boolean)) {
    const parsed = JSON.parse(line) as {
      custom_id: string;
      response: {
        status_code: number;
        body: {
          choices: Array<{ message: { content: string } }>;
          usage: { total_tokens: number };
        };
      };
      error?: { code: string; message: string };
    };

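    // Skip lines with a top-level error or a non-200 response for that request
    if (parsed.error || parsed.response?.status_code !== 200) {
      const reason = parsed.error?.message ?? `HTTP ${parsed.response?.status_code}`;
      console.error(`Error for ${parsed.custom_id}: ${reason}`);
      continue;
    }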

    const content = parsed.response.body.choices[0]?.message?.content ?? '';
    results.set(parsed.custom_id, content);
  }

  return results;
}

// Usage: categorize 1000 products
const products = [/* ... 1000 products ... */];
const requests = products.map((p, i) => ({
  customId: `product-${i}`,
  prompt: `Categorize this product: ${p.name}\nDescription: ${p.description}`,
  systemPrompt: 'Answer with exactly one category: Electronics/Clothing/Food/Sports/Other',
}));

const batchId = await submitBatch(requests, 'gpt-4o-mini');
const results = await waitForBatch(batchId);
// Cost: up to ~50% cheaper than individual API calls

Approach	Cost (1000 requests)	Latency	Use case
Individual Chat Completions	$3.00	<2s per request	Interactive chats
Batch API (gpt-4o-mini)	$0.15	Minutes to 24h	Data processing, analysis
Batch API (gpt-4o)	$1.25	Minutes to 24h	Complex batch tasks

Cost optimization strategy: Claude Code can analyze your existing synchronous API code and identify which calls are suitable for the Batch API. Typically, 30-60% of all API calls can be moved to batch, with corresponding cost savings.
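
What "corresponding savings" means in numbers can be sketched with a short calculation. It assumes the batch discount of up to 50% mentioned above and, as a simplification, that every call costs roughly the same, so the share of calls equals the share of cost. `overallSavingsFraction` is a hypothetical helper name.

```typescript
// Fraction of the total bill saved when `batchableShare` of the spend moves
// to the Batch API at the given discount (0.5 = the "up to 50%" figure).
function overallSavingsFraction(batchableShare: number, batchDiscount = 0.5): number {
  if (batchableShare < 0 || batchableShare > 1) {
    throw new RangeError('batchableShare must be between 0 and 1');
  }
  if (batchDiscount < 0 || batchDiscount > 1) {
    throw new RangeError('batchDiscount must be between 0 and 1');
  }
  return batchableShare * batchDiscount;
}

// 30% of calls moved to batch: 0.30 * 0.5 = 0.15, i.e. 15% lower total cost
const savingsLow = overallSavingsFraction(0.3);
// 60% of calls moved to batch: 0.60 * 0.5 = 0.30, i.e. 30% lower total cost
const savingsHigh = overallSavingsFraction(0.6);
```

Under these assumptions, moving 30-60% of calls to batch lowers the total bill by roughly 15-30%, not by the full 50%, because only the migrated share is discounted.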

Summary: Mastering the OpenAI API with Claude Code

In 2026 the OpenAI API offers a complete ecosystem for AI applications of any size. The key to using it productively is choosing the right API for each use case:

Chat Completions

Simple one-shot requests, streaming, maximum flexibility. The default for roughly 80% of use cases.

Function Calling

Tool use and structured outputs. The basis for agents and AI-driven workflows.

Assistants API

Persistent threads, file uploads, Code Interpreter. For complex multi-turn applications.

Embeddings + RAG

Semantic search and knowledge bases. The foundation for document-based AI systems.

Streaming

Token-by-token output for responsive UIs. A markedly better user experience for long answers.

Batch API

Up to 50% cost savings with asynchronous processing. Ideal for data processing and content pipelines.

Claude Code as a development partner: all of the patterns shown can be generated, adapted, and debugged with Claude Code. Describe your use case in natural language and Claude Code generates type-safe TypeScript code.

Type-first development: the openai npm package ships complete TypeScript types. Use them consistently; TypeScript errors are cheaper than runtime errors in production.

Keep an eye on costs: start with gpt-4o-mini for prototypes and switch to gpt-4o only where quality requires it. Use the Batch API for asynchronous workloads and text-embedding-3-small as the default for embeddings.
