Model Cascade / Fallback Chain
ML Infra
Try a fast, cheap model first. If its confidence is low, escalate to a larger, more capable model. The goal is to reduce cost while maintaining quality.
agent, system
Why OSOP matters here
Running every query through your most expensive model wastes money. OSOP records which model handled each query, the confidence threshold in effect, and how often escalation occurs, so you can tune the cascade for the best cost/quality trade-off.
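To make the tuning idea concrete, here is a minimal sketch of how recorded fast-model confidences could be used to estimate escalation rate and average cost at different thresholds. The function names, the sample confidences, and the cost figures (arbitrary units) are all hypothetical and are not part of OSOP itself.

```python
from typing import Sequence


def escalation_rate(confidences: Sequence[float], threshold: float) -> float:
    """Fraction of queries whose fast-model confidence falls below the threshold."""
    if not confidences:
        raise ValueError("need at least one recorded confidence")
    escalated = sum(1 for c in confidences if c < threshold)
    return escalated / len(confidences)


def expected_cost_per_query(
    confidences: Sequence[float],
    threshold: float,
    fast_cost: float,
    large_cost: float,
) -> float:
    """Average cost per query for a two-stage cascade.

    Every query pays for the fast model; escalated queries
    additionally pay for the large model.
    """
    return fast_cost + escalation_rate(confidences, threshold) * large_cost


# Hypothetical recorded confidences and costs (arbitrary units).
recorded = [0.95, 0.85, 0.75, 0.6]
cost_at_08 = expected_cost_per_query(recorded, 0.8, fast_cost=1.0, large_cost=10.0)
cost_at_07 = expected_cost_per_query(recorded, 0.7, fast_cost=1.0, large_cost=10.0)
```

Lowering the threshold reduces cost because fewer queries escalate, but it also lets more low-confidence answers through, so any cost estimate like this should be read alongside a quality measure.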
Workflow Steps (5)
1. Receive Query (event)
2. Fast Model (Haiku) (agent)
3. Confidence Check (system)
4. Large Model (Opus) (agent)
5. Return Response (api)
Connections (5)
Receive Query → Fast Model (Haiku): sequential
Fast Model (Haiku) → Confidence Check: sequential
Confidence Check → Return Response: conditional, confidence >= 0.8
Confidence Check → Large Model (Opus): conditional, confidence < 0.8
Large Model (Opus) → Return Response: sequential
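The steps and connections listed here can be sketched in code roughly as follows. This is an illustrative sketch, not an OSOP API: the `Model` callable type, `CascadeResult`, `run_cascade`, and the stub models are hypothetical names, and it assumes each model returns an answer together with a confidence score in [0, 1].

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumption: a model is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class CascadeResult:
    answer: str
    handled_by: str         # "fast" or "large"
    fast_confidence: float  # confidence reported by the fast model
    escalated: bool


def run_cascade(
    query: str,
    fast_model: Model,
    large_model: Model,
    threshold: float = 0.8,
) -> CascadeResult:
    """Try the fast model first; escalate when its confidence is below threshold."""
    answer, confidence = fast_model(query)
    if confidence >= threshold:
        # Confidence Check -> Return Response
        return CascadeResult(answer, "fast", confidence, False)
    # Confidence Check -> Large Model -> Return Response
    answer, _ = large_model(query)
    return CascadeResult(answer, "large", confidence, True)


# Hypothetical stand-ins for real model calls, for demonstration only.
def fast_stub(query: str) -> Tuple[str, float]:
    confidence = 0.9 if len(query) < 20 else 0.4
    return f"fast answer to: {query}", confidence


def large_stub(query: str) -> Tuple[str, float]:
    return f"large answer to: {query}", 0.95


easy = run_cascade("What is 2 + 2?", fast_stub, large_stub)
hard = run_cascade("Summarize the trade-offs of model cascades.", fast_stub, large_stub)
```

Returning `handled_by`, `fast_confidence`, and `escalated` with each result keeps the information needed to tune the threshold later.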
5 Steps, 5 Connections, 4 Node Types