Model Cascade / Fallback Chain

ML Infra

Try a fast, cheap model first. If its confidence is low, escalate to a larger, more capable model. This reduces cost while maintaining quality.

agent, system
Why OSOP matters here

Running every query through your most expensive model wastes money. OSOP records which model handled each query, the confidence threshold, and how often escalation occurs, so you can tune the threshold for the best cost/quality trade-off.
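As a rough illustration of the tuning such records make possible, the sketch below summarizes a handful of per-query records into an escalation rate and an average cost per query. The `QueryRecord` fields, the `summarize` helper, and the cost figures are all hypothetical; they are not OSOP's actual schema or real model prices.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    """Hypothetical per-query record; not the actual OSOP schema."""
    confidence: float  # confidence reported for the fast model's answer
    escalated: bool    # True if the large model was also called


def summarize(records, fast_cost, large_cost):
    """Return (escalation_rate, average_cost_per_query).

    Every query pays for the fast model; escalated queries also pay
    for the large model. Costs are in arbitrary, assumed units.
    """
    if not records:
        raise ValueError("need at least one record")
    escalations = sum(1 for r in records if r.escalated)
    rate = escalations / len(records)
    avg_cost = fast_cost + rate * large_cost
    return rate, avg_cost


# Made-up sample data for illustration only.
records = [
    QueryRecord(confidence=0.93, escalated=False),
    QueryRecord(confidence=0.61, escalated=True),
    QueryRecord(confidence=0.88, escalated=False),
    QueryRecord(confidence=0.74, escalated=True),
]
rate, avg_cost = summarize(records, fast_cost=1.0, large_cost=10.0)
print(f"escalation rate: {rate:.0%}, average cost per query: {avg_cost:.2f}")
```

Under these assumed costs, the cascade is cheaper than always calling the large model as long as the escalation rate stays below 90%, since `1.0 + rate * 10.0 < 10.0` only when `rate < 0.9`.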

Workflow Steps (5)

1. Receive Query: event
2. Fast Model (Haiku): agent
3. Confidence Check: system
4. Large Model (Opus): agent
5. Return Response: api

Connections (5)

Receive Query -> Fast Model (Haiku): sequential
Fast Model (Haiku) -> Confidence Check: sequential
Confidence Check -> Return Response: conditional, confidence >= 0.8
Confidence Check -> Large Model (Opus): conditional, confidence < 0.8
Large Model (Opus) -> Return Response: sequential
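The routing described by these connections can be sketched in a few lines of Python. This is a minimal sketch, not the workflow's actual implementation: `run_cascade`, `stub_fast`, and `stub_large` are hypothetical names, and the source does not say how the fast model produces its confidence score, so the stub simply invents one. Only the 0.8 threshold and the branch conditions come from the workflow itself.

```python
CONFIDENCE_THRESHOLD = 0.8  # taken from the conditional connections


def run_cascade(query, fast_model, large_model, threshold=CONFIDENCE_THRESHOLD):
    """Return (answer, handled_by) for a single query.

    fast_model(query)  -> (answer, confidence)   # assumed interface
    large_model(query) -> answer                 # assumed interface
    """
    answer, confidence = fast_model(query)
    if confidence >= threshold:
        # Confidence Check -> Return Response
        return answer, "fast"
    # Confidence Check -> Large Model -> Return Response
    return large_model(query), "large"


def stub_fast(query):
    # Stand-in for the fast model: pretends short queries are easy.
    confidence = 0.9 if len(query) < 20 else 0.5
    return f"fast answer to {query!r}", confidence


def stub_large(query):
    # Stand-in for the large model.
    return f"large answer to {query!r}"


print(run_cascade("2 + 2?", stub_fast, stub_large))
print(run_cascade("Explain the trade-offs of model cascades", stub_fast, stub_large))
```

Passing the models in as callables keeps the routing logic independent of any particular client library, and returning `handled_by` alongside the answer gives each query the per-model attribution needed to measure how often escalation occurs.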
5 Steps, 5 Connections, 4 Node Types