Apr 3, 2026 · Engineering · 7 min read

Building a Workflow Executor in 48 Hours

By OSOP Team

Two days ago, OSOP could only validate workflows. You could write a .osop file, check it against the JSON Schema, render a diagram — but you could not run it. The executor was a placeholder that returned mock data. On April 1st we decided to change that. 48 hours later, osop run executes real workflows with conditional edges, fallback paths, security gates, cost controls, and agent nodes that call actual LLMs.

This is the story of that sprint — what we built, why each piece exists, and what we learned about executing AI agent workflows.

What We Built

The executor needed to handle everything the OSOP spec promises. That meant six major features in 48 hours:

  • WorkflowContext: Inter-node data flow. Each node reads inputs from previous nodes and writes outputs for downstream nodes. A shared context object carries state through the entire execution.
  • Conditional edges: Edges can have when: expressions that reference node outputs. The executor evaluates these at runtime to decide which path to take.
  • Fallback edges: If a node fails, the executor follows fallback edges instead of crashing. This enables retry patterns and graceful degradation.
  • Security gates (--allow-exec): CLI nodes that run shell commands require explicit opt-in. Without --allow-exec, the executor refuses to run them. No accidental rm -rf in production.
  • Cost controls (--max-cost): Agent nodes that call LLMs accumulate cost. The executor tracks spending and aborts if the total exceeds the --max-cost limit.
  • Agent nodes: Nodes with type: agent actually call LLMs via a pluggable client. The executor passes the node's prompt, collects the response, and records token usage in the execution log.
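To make the data-flow idea concrete, here is a minimal sketch of what a shared execution context can look like. This is illustrative only: the class and method names (WorkflowContext, record, inputs_for) are assumptions, not OSOP's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowContext:
    """Hypothetical shared state carried through a workflow run."""
    outputs: dict = field(default_factory=dict)  # node_id -> output payload
    total_cost: float = 0.0                      # accumulated LLM spend

    def record(self, node_id, output, cost=0.0):
        # Every node writes its output here for downstream nodes to read.
        self.outputs[node_id] = output
        self.total_cost += cost

    def inputs_for(self, upstream_ids):
        # A node's inputs are simply the outputs of its upstream nodes.
        return {uid: self.outputs[uid] for uid in upstream_ids}

ctx = WorkflowContext()
ctx.record("build", {"artifact": "dist/app.tar.gz"})
ctx.record("security_scan", {"verdict": "safe"}, cost=0.03)
deploy_inputs = ctx.inputs_for(["build", "security_scan"])
```

The point is that nodes never talk to each other directly; all data flows through one object, which also makes cost tracking a one-liner.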

The Workflow: Conditional Edges and Security Gates

Here is a real workflow that uses conditional edges and security gates. The security scan node produces a verdict, and the deploy node only runs if the verdict is not danger:

deploy-with-gates.osop.yaml:

  Build Artifacts (cli)        → sequential  → Run Test Suite (cicd)
  Run Test Suite (cicd)        → sequential  → AI Security Review (agent)
  AI Security Review (agent)   → conditional → Deploy to Production (cli)
  Deploy to Production (cli)   → fallback    → Rollback (cli)

Running It

The osop run command walks the workflow graph, executes each node, evaluates edge conditions, and produces a .osoplog.yaml file with full execution details:

terminal
$ osop run deploy-with-gates.osop.yaml \
    --allow-exec \
    --max-cost 2.00 \
    --env KUBE_CONTEXT=staging

[executor] Walking graph: build -> test -> security_scan -> deploy
[build]    npm run build                        OK  4.2s
[test]     36 passed, 0 failed                  OK  12.1s
[security] AI review (gpt-4o, $0.03)            OK  3.8s
[security] Verdict: safe (score: 12/100)
[deploy]   Approval gate: --allow-exec granted
[deploy]   kubectl apply -f k8s/               OK  6.3s

Workflow COMPLETED in 26.4s
Cost: $0.03 (limit: $2.00)
Log written: deploy-with-gates.osoplog.yaml

The Condition Evaluator

The hardest design decision was the condition evaluator. Edge conditions like security_scan.output.verdict != 'danger' need to be evaluated at runtime. The obvious approach — eval() — is a security disaster. We built a simple expression parser instead.

The evaluator supports dot-notation property access, string and number literals, comparison operators (==, !=, >, <, >=, <=), and boolean operators (and, or, not). It resolves references against the WorkflowContext, which holds all node outputs. No eval(), no arbitrary code execution, no injection attacks. The entire evaluator is under 120 lines of Python.
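The approach can be sketched in a few lines. This is not the real evaluator: it handles only a single comparison (no and/or/not), and the helper names are made up for illustration. But it shows the core idea of resolving dot-notation references against the context instead of calling eval().

```python
import operator
import re

# Comparison operators, checked longest-first so ">=" wins over ">".
_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}

def _resolve(token, context):
    """Turn a token into a value: number literal, quoted string, or dot path."""
    token = token.strip()
    if re.fullmatch(r"-?\d+(\.\d+)?", token):
        return float(token)
    if token[0] in "'\"" and token[-1] == token[0]:
        return token[1:-1]
    value = context
    for part in token.split("."):   # e.g. security_scan.output.verdict
        value = value[part]
    return value

def evaluate(expr, context):
    """Evaluate one comparison like "a.b != 'danger'" against the context."""
    for symbol in ("==", "!=", ">=", "<=", ">", "<"):
        if symbol in expr:
            left, right = expr.split(symbol, 1)
            return _OPS[symbol](_resolve(left, context), _resolve(right, context))
    return bool(_resolve(expr, context))  # bare reference: truthiness

ctx = {"security_scan": {"output": {"verdict": "safe", "score": 12}}}
print(evaluate("security_scan.output.verdict != 'danger'", ctx))  # True
print(evaluate("security_scan.output.score > 50", ctx))           # False
```

Because references can only be looked up, never executed, a malicious workflow file cannot escalate a condition string into code execution.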

The Architecture

Three new modules power the executor:

  • execute.py: The graph walker and node dispatcher. Resolves execution order, evaluates edge conditions, runs each node, handles fallbacks.
  • llm_client.py: Pluggable LLM client for agent nodes. Supports OpenAI-compatible APIs with model selection, token tracking, and cost calculation.
  • osoplog.py: Execution log writer. Captures every node's inputs, outputs, duration, status, and AI metadata into a standards-compliant .osoplog.yaml.
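The dispatch loop at the heart of a walker like execute.py can be sketched as follows. This is an assumed shape, not the actual module: the edge dictionaries, the kind values, and the walk/run_node names are all illustrative. It shows how conditional edges, fallback edges, and the cost limit interact in one loop.

```python
def walk(nodes, edges, run_node, max_cost=None):
    """Walk from the first node, following conditional and fallback edges.

    edges: list of dicts with 'from', 'to', 'kind' ('sequential' |
    'conditional' | 'fallback') and an optional 'when' predicate that
    takes the outputs accumulated so far.
    """
    outputs, cost = {}, 0.0
    current = next(iter(nodes))
    while current:
        try:
            result = run_node(current)
        except Exception:
            # On failure, follow a fallback edge instead of crashing.
            fallbacks = [e for e in edges
                         if e["from"] == current and e["kind"] == "fallback"]
            if not fallbacks:
                raise
            current = fallbacks[0]["to"]
            continue
        outputs[current] = result
        cost += result.get("cost", 0.0)
        if max_cost is not None and cost > max_cost:
            raise RuntimeError(f"cost limit exceeded at {current}: ${cost:.2f}")
        # Pick the first non-fallback edge whose condition holds.
        nxt = None
        for e in edges:
            if e["from"] == current and e["kind"] != "fallback":
                if e.get("when") is None or e["when"](outputs):
                    nxt = e["to"]
                    break
        current = nxt
    return outputs, cost

# Tiny demo: the scan succeeds, the deploy fails, the fallback fires.
def fake_run(node_id):
    if node_id == "deploy":
        raise RuntimeError("kubectl failed")
    return {"verdict": "safe", "cost": 0.03 if node_id == "scan" else 0.0}

demo_edges = [
    {"from": "scan", "to": "deploy", "kind": "conditional",
     "when": lambda out: out["scan"]["verdict"] != "danger"},
    {"from": "deploy", "to": "rollback", "kind": "fallback"},
]
outputs, cost = walk({"scan": None, "deploy": None, "rollback": None},
                     demo_edges, fake_run, max_cost=2.00)
```

In the demo, the failed deploy node leaves no entry in outputs; execution continues through the rollback node and the run still completes.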

Testing

We wrote 36 executor-specific tests covering:

  • Basic sequential execution and inter-node data flow
  • Conditional edge evaluation with various operators and data types
  • Fallback edge triggering on node failure
  • Security gate enforcement and cost limit enforcement

What's Next

The executor is live in osop v0.3.0. There are now 9 CLI commands — validate, run, render, test, report, optimize, import, export, and risk-assess — with 196 tests passing across the entire CLI. Next up: parallel node execution, streaming output, and webhook triggers. The spec supports all of these; the executor just needs to catch up.


Try it: pip install osop && osop run your-workflow.osop.yaml --allow-exec