purify
- interpret.utils.purify(scores, weights, tolerance=0.0, is_randomized=True)
- Purifies a multi-dimensional tensor into its pure component and a series of impure components. For pairs, the result is a pure pair tensor in which the weighted sum along any row or column is zero, plus two main effects that hold the impurities removed from the pair. The main effects are then further purified into zero-centered graphs and an intercept. This function also handles multiclass, which is detected when the scores tensor has one more dimension than the weights tensor. For multiclass, the class scores are adjusted to sum to 0, which can be done without generating any impurities.
The purification algorithm is based on the paper “Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models”: https://arxiv.org/abs/1911.04974
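To make the pair case concrete, here is a minimal NumPy sketch of the idea described above: weighted row and column means are repeatedly moved out of the pair tensor into two main effects, and the main effects are then centered into an intercept. The function name `purify_pair_sketch` and its `max_iter` and `tol` arguments are hypothetical, and this is an illustration under the assumption of strictly positive weights, not the implementation used by `interpret.utils.purify`.

```python
import numpy as np


def purify_pair_sketch(scores, weights, max_iter=1000, tol=1e-12):
    """Illustrative 2-D purification (hypothetical helper, not the library code)."""
    pure = np.array(scores, dtype=float)  # copy so the caller's array is untouched
    weights = np.asarray(weights, dtype=float)
    row_effect = np.zeros(pure.shape[0])
    col_effect = np.zeros(pure.shape[1])

    for _ in range(max_iter):
        # Move the weighted mean of every row into the first main effect.
        row_mean = np.average(pure, axis=1, weights=weights)
        pure -= row_mean[:, None]
        row_effect += row_mean

        # Move the weighted mean of every column into the second main effect.
        col_mean = np.average(pure, axis=0, weights=weights)
        pure -= col_mean[None, :]
        col_effect += col_mean

        # Stop once neither pass removes a meaningful amount.
        if max(np.abs(row_mean).max(), np.abs(col_mean).max()) <= tol:
            break

    # Center each main effect with its marginal weights; the removed
    # constants together form the intercept.
    row_center = np.average(row_effect, weights=weights.sum(axis=1))
    col_center = np.average(col_effect, weights=weights.sum(axis=0))
    row_effect -= row_center
    col_effect -= col_center
    intercept = row_center + col_center
    return pure, row_effect, col_effect, intercept


scores = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 7.0]])
weights = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
pure, row_effect, col_effect, intercept = purify_pair_sketch(scores, weights)

# The original scores are recovered by adding all components back together.
rebuilt = pure + row_effect[:, None] + col_effect[None, :] + intercept
```

In this sketch nothing is lost: `rebuilt` matches `scores`, while `pure * weights` sums to approximately zero along every row and column.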
- Parameters:
scores – The score tensor to be purified.
weights – Weights for each element in the scores tensor.
tolerance – An optional tolerance (default 0.0) that lets the algorithm exit earlier at the expense of purity. The algorithm exits as soon as the tolerance is reached or purity stops improving.
is_randomized – If True (the default), dimensions are purified in a random order, which removes the bias that might be introduced by always processing one dimension first. If False, purification happens in a predictable order.
- Returns:
The purified tensor.
A list of tuples of the impurities generated. The first item in the tuple will be a tuple of indexes which indicates which dimensions the impurity applies to. The second item is the impurity tensor for those dimensions.
The intercept generated from purification.
- Return type:
A tuple containing 3 values
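The multiclass adjustment mentioned in the description can be illustrated with a short NumPy sketch. It assumes the class dimension is the last axis of the scores tensor, which the text above does not state, and the helper names `center_classes` and `softmax` are hypothetical. One way to see why no impurities arise is that shifting all class scores in a cell by the same constant leaves the softmax probabilities unchanged.

```python
import numpy as np


def center_classes(scores):
    """Shift class scores so they sum to zero along the last axis.

    Assumption: the last axis is the class axis (hypothetical layout).
    """
    scores = np.asarray(scores, dtype=float)
    return scores - scores.mean(axis=-1, keepdims=True)


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


# A 2x2 pair tensor with 3 classes, so scores has one more dimension
# than a matching 2x2 weights tensor would have.
scores = np.array([
    [[1.0, 2.0, 6.0], [0.5, 0.5, 2.0]],
    [[3.0, -1.0, 1.0], [4.0, 4.0, 1.0]],
])
centered = center_classes(scores)

# Class probabilities before and after centering.
probs_before = softmax(scores)
probs_after = softmax(centered)
```

Here `centered` sums to zero across classes in every cell, and `probs_before` equals `probs_after` up to floating-point error.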