Towards Symbolic XAI -- Explanation Through Human Understandable Logical Relationships Between Features

Thomas Schnake
Farnoush Rezaei Jafari
Jonas Lederer
Ping Xiong
Shinichi Nakajima
Stefan Gugler
Grégoire Montavon
Klaus-Robert Müller

January 20, 2025

Explainable Artificial Intelligence (XAI) plays a crucial role in fostering transparency and trust in AI systems. Traditional XAI methods typically explain a prediction at a single level of abstraction, most often as heatmaps produced by post-hoc attribution methods. Rule-based explanations, by contrast, are expressive and composed of logical formulas, but they often either fail to faithfully capture the model’s decision-making process or restrict the model’s learning capabilities by requiring it to be inherently self-explainable. We aim to bridge these two approaches by developing post-hoc explanations that attribute relevance to complex logical relationships between input features while remaining faithful to the model’s intricate prediction process and imposing no restrictions on its architecture. To this end, we propose a framework called Symbolic XAI, which attributes relevance to symbolic formulas expressing logical relationships between input features. Our method naturally extends propagation-based explanation approaches, such as layer-wise relevance propagation (LRP) or GNN-LRP, and perturbation-based approaches, such as Shapley values. Beyond attributing relevance to given logical formulas, our framework introduces a strategy to automatically identify the logical formulas that best summarize the model’s decision strategy, eliminating the need to predefine them. We demonstrate the effectiveness of our framework in domains such as natural language processing (NLP), computer vision, and chemistry, where abstract symbolic domain knowledge is abundant and critically valuable to users. In summary, the Symbolic XAI framework provides a local understanding of the model’s decision-making process that is both customizable by the user and human-readable through logical formulas.
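
As a rough illustration of what attributing relevance to a logical formula over input features can look like in the perturbation-based setting, the sketch below decomposes a toy model into interaction terms (Harsanyi dividends, the quantities that Shapley values distribute among features) and sums the terms of those feature subsets for which a formula holds. The toy model and all names (`toy_value`, `harsanyi_dividends`, `formula_relevance`) are hypothetical, and the aggregation rule is an assumption made for illustration; it should not be read as the framework's exact definition.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_dividends(value, features):
    """Interaction term of each subset S:
    d(S) = sum over T subset of S of (-1)^(|S|-|T|) * value(T)."""
    return {
        s: sum((-1) ** (len(s) - len(t)) * value(t) for t in subsets(s))
        for s in subsets(features)
    }


def formula_relevance(dividends, formula):
    """Assumed aggregation rule: add up the interaction terms of all
    feature subsets for which the logical formula evaluates to True."""
    return sum(d for s, d in dividends.items() if formula(s))


def toy_value(present):
    """Hypothetical sentiment model evaluated on the words kept in the input:
    'good' adds +1, and 'not' together with 'good' flips the score."""
    score = 0.0
    if "good" in present:
        score += 1.0
    if "not" in present and "good" in present:
        score -= 2.0
    return score


features = ("not", "good", "movie")
dividends = harsanyi_dividends(toy_value, features)

# Relevance of the conjunction "not AND good"
rel_not_and_good = formula_relevance(dividends, lambda s: {"not", "good"} <= s)
# Relevance of "good AND NOT not" ('good' without the negation)
rel_good_without_not = formula_relevance(
    dividends, lambda s: "good" in s and "not" not in s
)

print(rel_not_and_good)      # -2.0
print(rel_good_without_not)  # 1.0
```

In this toy setting the conjunction of "not" and "good" receives negative relevance while "good" without "not" receives positive relevance, and the two together account for the full prediction of -1.0. A plain per-word heatmap would have to spread the interaction effect over the individual words.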