Model Cascade / Fallback Chain
AI ↔ AI · Try fast model first; escalate to larger model if confidence is low.
5 nodes · 5 edges · Tags: ml-infra, agent, system
Visual

Receive Query (event)
  Incoming user request enters the cascade pipeline.
  ↓ sequential → Fast Model (Haiku)

Fast Model (Haiku) (agent)
  Low-cost, low-latency first attempt.
  ↓ sequential → Confidence Check

Confidence Check (system)
  Route based on model confidence score threshold.
  ↓ conditional → Return Response
  ↓ conditional → Large Model (Opus)

Large Model (Opus) (agent)
  High-capability fallback for complex queries.
  ↓ sequential → Return Response

Return Response (api)
  Deliver final answer to the caller.
uc-model-cascade.osop.yaml
osop_version: "1.0"
id: "model-cascade"
name: "Model Cascade / Fallback Chain"
description: "Try fast model first; escalate to larger model if confidence is low."
nodes:
  - id: "receive"
    type: "event"
    name: "Receive Query"
    description: "Incoming user request enters the cascade pipeline."
  - id: "fast_model"
    type: "agent"
    subtype: "llm"
    name: "Fast Model (Haiku)"
    description: "Low-cost, low-latency first attempt."
    timeout_sec: 5
  - id: "check_confidence"
    type: "system"
    name: "Confidence Check"
    description: "Route based on model confidence score threshold."
  - id: "large_model"
    type: "agent"
    subtype: "llm"
    name: "Large Model (Opus)"
    description: "High-capability fallback for complex queries."
    timeout_sec: 30
  - id: "respond"
    type: "api"
    name: "Return Response"
    description: "Deliver final answer to the caller."
edges:
  - from: "receive"
    to: "fast_model"
    mode: "sequential"
  - from: "fast_model"
    to: "check_confidence"
    mode: "sequential"
  - from: "check_confidence"
    to: "respond"
    mode: "conditional"
    when: "confidence >= 0.8"
  - from: "check_confidence"
    to: "large_model"
    mode: "conditional"
    when: "confidence < 0.8"
  - from: "large_model"
    to: "respond"
    mode: "sequential"
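The cascade above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the model calls are stubs (a real system would call an LLM API and derive a confidence score, e.g. from log-probabilities or a verifier), the simple word-count heuristic is invented for the example, and the `timeout_sec` handling from the spec is omitted. The `0.8` threshold matches the `when` conditions on the two conditional edges.

```python
# Sketch of the cascade routing logic. Model calls are stubbed; the
# confidence heuristic below is hypothetical and exists only so the
# example runs end-to-end.

from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # mirrors 'confidence >= 0.8' in the edge conditions


@dataclass
class ModelResult:
    answer: str
    confidence: float  # 0.0-1.0, higher means more certain


def fast_model(query: str) -> ModelResult:
    """Stub for the low-cost first attempt (the 'Fast Model (Haiku)' node)."""
    # Hypothetical heuristic: treat short queries as easy and answer confidently.
    confidence = 0.9 if len(query.split()) < 8 else 0.5
    return ModelResult(answer=f"fast answer to: {query}", confidence=confidence)


def large_model(query: str) -> ModelResult:
    """Stub for the high-capability fallback (the 'Large Model (Opus)' node)."""
    return ModelResult(answer=f"thorough answer to: {query}", confidence=0.95)


def handle_query(query: str) -> str:
    """Receive Query -> Fast Model -> Confidence Check -> (Large Model) -> Return Response."""
    result = fast_model(query)
    if result.confidence >= CONFIDENCE_THRESHOLD:
        return result.answer        # conditional edge: confidence >= 0.8
    return large_model(query).answer  # conditional edge: confidence < 0.8
```

A short query stays on the cheap path, while a harder one escalates: `handle_query("What is 2+2?")` returns the fast model's answer, whereas a long analytical question falls below the threshold and is re-answered by the large model.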