BOHM: Zero-Cost Hierarchical Attribution for Compound AI Systems
Original reporting by arXiv (cs.AI)

Compound AI systems, which route tasks through hierarchies of specialized components, are rapidly becoming the backbone of advanced applications. Pinpointing which components actually drive performance and output remains difficult. Attribution in these systems has traditionally relied on Shapley-based methods (SHAP), which estimate each component's contribution by evaluating the system on many subsets of its components. That approach has three limitations. It breaks down for opaque third-party APIs, black-box endpoints, and agentic orchestrators that concentrate routing on a few tools, where evaluating every subset is impractical or impossible. It is computationally expensive. And it offers only a flat view of component contributions, with no multi-resolution insight.
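To make the cost concrete, here is a minimal sketch of exact Shapley attribution by full subset enumeration. The component names, the additive scores, and the `system_value` function are hypothetical stand-ins, not taken from the paper; in a real compound system each call to the value function would be a full evaluation run with only that subset of components enabled.

```python
from itertools import combinations
from math import factorial


def shapley_values(components, value_fn):
    """Exact Shapley values by enumerating every subset of components.

    Returns (values, n_evaluations). value_fn is called once per distinct
    subset, i.e. 2**n times for n components.
    """
    n = len(components)
    cache = {}

    def v(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = value_fn(key)
        return cache[key]

    values = {}
    for c in components:
        others = [x for x in components if x != c]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude c.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                total += weight * (v(subset + (c,)) - v(subset))
        values[c] = total
    return values, len(cache)


# Hypothetical additive system: each enabled component adds a fixed score.
SCORES = {"retriever": 0.5, "reranker": 0.2, "generator": 0.3}


def system_value(enabled):
    return sum(SCORES[name] for name in enabled)


values, n_evaluations = shapley_values(list(SCORES), system_value)
print(values, n_evaluations)
```

Three components already require 8 system evaluations, and the count doubles with each added component (20 components would need 2**20 = 1,048,576). Practical SHAP implementations sample subsets rather than enumerate them, but they still need the ability to run the system with components switched off, which is exactly what a black-box endpoint does not allow.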
A New Attribution Primitive
BOHM is an attribution method designed to overcome these hurdles. It extracts a hierarchical attribution tree directly from the routing weights these systems already maintain, so it incurs zero marginal cost and requires no access to component internals. It also provides attribution at every level of the hierarchy simultaneously. BOHM and SHAP answer distinct questions, but they converge closely when routing is optimal: BOHM achieves a Kendall tau of 0.928 on LLM benchmarks, approaching SHAP's 0.980 with 9,000 times fewer evaluations. Where the two diverge, the gap is itself diagnostic, showing how well an orchestrator's routing choices align with the empirically best tools. BOHM is therefore best read as a complementary primitive for understanding and diagnosing the internal logic of modern AI systems.
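The summary does not give BOHM's exact formulation, so the following is only an illustrative sketch of the general idea of reading attribution off routing weights. It assumes, hypothetically, that each router exposes non-negative weights over its children and that a node's attribution is the product of normalized weights along its path from the root. The tree layout, the tool names, and the `attribute` helper are all invented for illustration.

```python
def attribute(tree, mass=1.0, path=()):
    """Yield (path, attribution) for every node below the root.

    tree maps child name -> (routing_weight, subtree_or_None).
    ASSUMPTION: attribution = product of normalized routing weights along
    the path. This is an illustrative rule, not the paper's definition.
    """
    total = sum(weight for weight, _ in tree.values())
    if total <= 0:
        raise ValueError("routing weights must sum to a positive value")
    for name, (weight, subtree) in tree.items():
        share = mass * weight / total
        node_path = path + (name,)
        yield node_path, share
        if subtree:
            yield from attribute(subtree, share, node_path)


# Hypothetical two-level orchestrator: a top-level router over tool groups,
# each with its own router over individual tools.
routing = {
    "search": (0.6, {"web": (0.75, None), "docs": (0.25, None)}),
    "code": (0.4, {"python": (0.5, None), "sql": (0.5, None)}),
}

scores = dict(attribute(routing))
for node_path, share in scores.items():
    print("/".join(node_path), round(share, 3))
```

A single traversal yields scores for both the tool groups and the individual tools, which is what multi-resolution means here. No component is ever invoked, so under these assumptions the cost is one pass over weights the system already stores.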
BOHM's main advance is practical. Because it reads routing weights the system already maintains, it still works where Shapley-based methods such as SHAP are impractical or impossible to run, including opaque third-party APIs and agentic orchestrators. It adds no marginal cost, reports attribution at multiple resolutions, and needs no access to component internals. It is most useful as a diagnostic lens when routing cannot be assumed optimal.
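Agreement between two attributions can be measured with Kendall tau, the rank correlation used for the convergence figures above. The sketch below is a plain tau-a implementation that also returns the pairs on which the two rankings disagree. The score values are made up for illustration, and the function assumes both dictionaries share the same keys and contain no tied scores.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two score dicts over the same keys.

    Returns (tau, discordant_pairs). Assumes no tied scores.
    """
    keys = sorted(scores_a)
    concordant = 0
    discordant_pairs = []
    for x, y in combinations(keys, 2):
        sign = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant_pairs.append((x, y))
    n_pairs = len(keys) * (len(keys) - 1) // 2
    tau = (concordant - len(discordant_pairs)) / n_pairs
    return tau, discordant_pairs


# Hypothetical scores: what the router favors vs. what measured best.
routing_based = {"web": 0.40, "python": 0.30, "docs": 0.20, "sql": 0.10}
reference = {"web": 0.35, "python": 0.40, "docs": 0.15, "sql": 0.10}

tau, disagreements = kendall_tau(routing_based, reference)
print(round(tau, 3), disagreements)
```

A tau of 1.0 means the two rankings agree on every pair. The discordant pairs are the diagnostic part: in this toy data the router ranks `web` above `python` although the reference scores say the opposite, which points at the routing decision worth inspecting.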
Unlocking System Insight
The broader implications of BOHM matter for the development and deployment of advanced AI. As AI systems grow more intricate, combining specialized modules and external services, accurately identifying the influence of each component becomes essential. BOHM's multi-resolution analysis, which reports contributions at several hierarchical levels at once, could help developers and researchers debug failures more precisely, optimize resource allocation, and study emergent behaviors. This transparency has practical value beyond research: it supports building more robust, reliable, and trustworthy AI. By making the "why" behind a system's output more accessible, BOHM may also help practitioners identify biases, check fairness, and meet regulatory requirements. As complex, black-box AI becomes more common, attribution of this kind offers a route to greater accountability and a clearer view of how such systems operate and how well they actually perform.