Blog

Research · Jun 24, 2026

Behind the Myth: Can One API Call Rule Them All? (Part 1/2)

OpenRouter's Fusion plugin collapses a three-judge deliberation panel into a single model call. Part 1 puts our old and new checker agents head to head on speed, and asks whether one call is really faster than three.

Vladimir Vučković

Research · Apr 16, 2026

Mission defines strategy, and strategy defines structure

How our five-phase pipeline revealed a bigger picture on its own, and what that teaches us about context engineering across agent boundaries.

Vladimir Vučković

Research · Apr 10, 2026

Robustness through meaning - one triple at a time

Exploring how cross-validating ontologies with LLMs enables robust failure detection in agentic systems, and why this approach differs from Palantir's operational ontology.

Vladimir Vučković

Research · Feb 27, 2026

Can Your Prompts Optimize Themselves?

Exploring how DSPy's declarative approach to prompt engineering replaces hand-crafted templates with Bayesian-optimized programs, and what happens when you apply it to a real failure detection pipeline.

Vladimir Vučković

Research · Feb 6, 2026

Are Your AI Agents Reliable?

Exploring how frameworks like τ²-bench and Pydantic Evals are shaping the science of evaluating AI agent reliability in production.

Vladimir Vučković
