Occlusion-Based Explanations of Black-Box Models for Deep Learning

Occlusion sensitivity analysis allows us to better understand trained machine learning models by identifying which parts of a given input are most salient in determining the model's prediction. It is particularly valuable because it can be applied to black-box models, i.e. models whose internal representations are difficult to access or interpret.

Occlusion sensitivity works by iteratively occluding small regions of a single input sample and recording the change in the model's prediction. The resulting matrix of prediction differences can be visualized as a heatmap overlaid on the original sample, allowing for an intuitive visual evaluation of how successfully the model identifies salient regions. This not only facilitates better understanding of the model and useful feature localization, but crucially provides visual validation that the model is indeed learning meaningful patterns rather than spurious correlations.
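
As a rough illustration of the procedure, the sketch below implements this sliding-window loop in plain NumPy. It is a minimal example rather than the project's implementation: predict_fn stands in for any black-box classifier that maps a batch of H x W x C images to class probabilities, and the patch size, stride and zero fill are arbitrary choices made for the illustration.

    import numpy as np

    def occlusion_sensitivity(image, predict_fn, target_class, patch=8, stride=8, fill=0.0):
        """Slide an occluding patch over `image` and record how much the
        target-class score drops relative to the unoccluded prediction."""
        h, w = image.shape[:2]
        baseline = predict_fn(image[None])[0, target_class]  # score on the intact sample
        rows = (h - patch) // stride + 1
        cols = (w - patch) // stride + 1
        heatmap = np.zeros((rows, cols))
        for i, y in enumerate(range(0, h - patch + 1, stride)):
            for j, x in enumerate(range(0, w - patch + 1, stride)):
                occluded = image.copy()
                occluded[y:y + patch, x:x + patch] = fill  # replace the patch
                score = predict_fn(occluded[None])[0, target_class]
                heatmap[i, j] = baseline - score  # large drop = salient region
        return heatmap

The resulting heatmap can then be upsampled to the input resolution and overlaid on the original image (e.g. with matplotlib's imshow and an alpha channel) to produce the visualization described above.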

An important question is what the occluded region should be replaced with. This choice can significantly affect both the speed and the accuracy of the technique, but it has not yet been adequately explored. In the literature, zeros, the image mean, blurred data and a few other fills have been used, but there is little systematic study of how these choices behave across different occlusion techniques and datasets. Other approaches could include different forms of permutation, such as shuffling or otherwise rearranging the pixels within the occluded area; a few of these options are sketched below.
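
To make those options concrete, the following sketch (a hypothetical helper, not taken from the literature or from the notebook linked below) shows a few candidate replacement strategies for a single patch of an H x W x C image: constant zeros, the per-channel image mean, a Gaussian-blurred copy of the patch, and a random shuffle of the patch's own pixels. It assumes SciPy is available for the blur.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def occlude_patch(image, y, x, patch, method="zeros", rng=None):
        """Return a copy of `image` with the patch at (y, x) replaced using `method`."""
        if rng is None:
            rng = np.random.default_rng()
        occluded = image.copy()
        region = occluded[y:y + patch, x:x + patch]  # view into the copy
        if method == "zeros":
            region[...] = 0.0
        elif method == "mean":
            region[...] = image.mean(axis=(0, 1))  # per-channel mean of the whole image
        elif method == "blur":
            region[...] = gaussian_filter(region, sigma=(3, 3, 0))  # blur spatially, not across channels
        elif method == "shuffle":
            flat = region.reshape(-1, region.shape[-1])  # one row per pixel
            region[...] = rng.permutation(flat).reshape(region.shape)
        else:
            raise ValueError(f"unknown occlusion method: {method}")
        return occluded

Each strategy could be plugged into the sliding-window loop above in place of the constant fill, which makes a side-by-side comparison of the resulting heatmaps straightforward.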

In summary, the main aim of this project is to provide a systematic analysis of different permutation methods, to quantify their effect on the quality of the heatmaps produced (and to seek to understand the underlying reasons), and possibly to propose novel approaches based on the insights gained.

A notebook of interest: https://github.com/jessicamcooper/Binary-Occlusion