dice_ml.explainer_interfaces package
Submodules
dice_ml.explainer_interfaces.dice_KD module
Module to generate counterfactual explanations from a KD-Tree. This code is similar to ‘Interpretable Counterfactual Explanations Guided by Prototypes’: https://arxiv.org/pdf/1907.02584.pdf
- class dice_ml.explainer_interfaces.dice_KD.DiceKD(data_interface, model_interface)[source]
Bases: ExplainerBase
- find_counterfactuals(data_df_copy, query_instance, query_instance_orig, desired_range, desired_class, total_CFs, features_to_vary, permitted_range, sparsity_weight, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm, verbose, limit_steps_ls)[source]
Finds counterfactuals by querying a K-D tree for the nearest data points in the desired class from the dataset.
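In practice, DiceKD is reached through the top-level dice_ml.Dice wrapper rather than instantiated directly. A minimal usage sketch, assuming a trained scikit-learn classifier clf and a dataframe df with an ‘income’ outcome column (all placeholder names):

```python
import dice_ml

# Wrap the training data and the trained model (df, clf, and the
# feature/outcome names below are placeholders).
d = dice_ml.Data(dataframe=df,
                 continuous_features=['age', 'hours_per_week'],
                 outcome_name='income')
m = dice_ml.Model(model=clf, backend='sklearn')

# method='kdtree' routes generate_counterfactuals through DiceKD, which
# returns the nearest training points of the desired class.
exp = dice_ml.Dice(d, m, method='kdtree')
cf = exp.generate_counterfactuals(df.drop(columns='income').iloc[0:1],
                                  total_CFs=4, desired_class='opposite')
cf.visualize_as_dataframe(show_only_changes=True)
```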
dice_ml.explainer_interfaces.dice_genetic module
Module to generate diverse counterfactual explanations based on a genetic algorithm. This code is similar to ‘GeCo: Quality Counterfactual Explanations in Real Time’: https://arxiv.org/pdf/2101.01292.pdf
- class dice_ml.explainer_interfaces.dice_genetic.DiceGenetic(data_interface, model_interface)[source]
Bases: ExplainerBase
- compute_proximity_loss(x_hat_unnormalized, query_instance_normalized)[source]
Computes the weighted distance between two vectors.
- compute_yloss(cfs, desired_range, desired_class)[source]
Computes the first part (y-loss) of the loss function.
- do_cf_initializations(total_CFs, initialization, algorithm, features_to_vary, desired_range, desired_class, query_instance, query_instance_df_dummies, verbose)[source]
Initializes CFs and other related variables.
- do_loss_initializations(yloss_type, diversity_loss_type, feature_weights, encoding='one-hot')[source]
Initializes variables related to the main loss function.
- do_param_initializations(total_CFs, initialization, desired_range, desired_class, query_instance, query_instance_df_dummies, algorithm, features_to_vary, permitted_range, yloss_type, diversity_loss_type, feature_weights, proximity_weight, sparsity_weight, diversity_weight, categorical_penalty, verbose)[source]
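The genetic explainer is likewise selected through the dice_ml.Dice wrapper. A minimal sketch (d, m, and df are the placeholder data interface, model interface, and dataframe from the sketch under dice_KD), showing how the loss weights are passed through generate_counterfactuals:

```python
# method='genetic' routes generation through DiceGenetic; the weights
# below correspond to the proximity, sparsity, and diversity terms of
# its loss function.
exp = dice_ml.Dice(d, m, method='genetic')
cf = exp.generate_counterfactuals(df.drop(columns='income').iloc[0:1],
                                  total_CFs=4,
                                  desired_class='opposite',
                                  proximity_weight=0.2,
                                  sparsity_weight=0.2,
                                  diversity_weight=5.0)
cf.visualize_as_dataframe(show_only_changes=True)
```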
dice_ml.explainer_interfaces.dice_pytorch module
Module to generate diverse counterfactual explanations based on the PyTorch framework.
- class dice_ml.explainer_interfaces.dice_pytorch.DicePyTorch(data_interface, model_interface)[source]
Bases: ExplainerBase
- compute_regularization_loss()[source]
Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.
- do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]
Initializes CFs and other related variables.
- do_loss_initializations(yloss_type, diversity_loss_type, feature_weights)[source]
Initializes variables related to the main loss function.
- do_optimizer_initializations(optimizer, learning_rate)[source]
Initializes gradient-based PyTorch optimizers.
- find_counterfactuals(query_instance, desired_class, optimizer, learning_rate, min_iter, max_iter, project_iter, loss_diff_thres, loss_converge_maxiter, verbose, init_near_query_instance, tie_random, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm, limit_steps_ls)[source]
Finds counterfactuals by gradient descent.
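A minimal sketch of using the PyTorch explainer, assuming a trained torch.nn.Module named torch_model and the data interface d from the earlier sketch (placeholder names; depending on the library version, method='gradient' may need to be passed to dice_ml.Dice explicitly):

```python
import dice_ml

# backend='PYT' selects the PyTorch model interface; with a
# gradient-based backend, dice_ml.Dice resolves to DicePyTorch.
m = dice_ml.Model(model=torch_model, backend='PYT')
exp = dice_ml.Dice(d, m)
cf = exp.generate_counterfactuals(query_instance, total_CFs=4,
                                  desired_class='opposite',
                                  learning_rate=0.05,
                                  min_iter=500, max_iter=5000)
```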
dice_ml.explainer_interfaces.dice_random module
Module to generate diverse counterfactual explanations based on random sampling. A simple implementation.
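A minimal sketch (d, m, and df as in the earlier sketches); random sampling needs no model gradients, so it works with any backend, and constraints are expressed through features_to_vary and permitted_range:

```python
# method='random' samples feature values and keeps samples that reach
# the desired class; here only 'age' may vary, within [30, 50].
exp = dice_ml.Dice(d, m, method='random')
cf = exp.generate_counterfactuals(df.drop(columns='income').iloc[0:1],
                                  total_CFs=4,
                                  desired_class='opposite',
                                  features_to_vary=['age'],
                                  permitted_range={'age': [30, 50]})
```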
dice_ml.explainer_interfaces.dice_tensorflow1 module
Module to generate diverse counterfactual explanations based on TensorFlow 1.x.
- class dice_ml.explainer_interfaces.dice_tensorflow1.DiceTensorFlow1(data_interface, model_interface)[source]
Bases: ExplainerBase
- compute_regularization_loss()[source]
Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.
- do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]
Initializes TF variables required for CF generation.
- do_loss_initializations(yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad')[source]
Defines the optimization loss.
- find_counterfactuals(query_instance, limit_steps_ls, desired_class='opposite', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=False, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear')[source]
Finds counterfactuals by gradient descent.
- generate_counterfactuals(query_instance, total_CFs, desired_class='opposite', proximity_weight=0.5, diversity_weight=1.0, categorical_penalty=0.1, algorithm='DiverseCF', features_to_vary='all', permitted_range=None, yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad', optimizer='tensorflow:adam', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=True, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear', limit_steps_ls=10000)[source]
Generates diverse counterfactual explanations.
- Parameters
query_instance – Test point of interest. A dictionary of feature names and values or a single row dataframe.
total_CFs – Total number of counterfactuals required.
desired_class – Desired counterfactual class; can take 0 or 1. The default value “opposite” flips the outcome class of query_instance for binary classification.
proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance.
diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are.
categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1.
algorithm – Counterfactual generation algorithm. Either “DiverseCF” or “RandomInitCF”.
features_to_vary – Either a string “all” or a list of feature names to vary.
permitted_range – Dictionary with continuous feature names as keys and the permitted [min, max] range as a list of values. If None, defaults to the range inferred from the training data in data_interface.
yloss_type – Metric for y-loss of the optimization function. Takes “l2_loss” or “log_loss” or “hinge_loss”.
diversity_loss_type – Metric for diversity loss of the optimization function. Takes “avg_dist” or “dpp_style:inverse_dist”.
feature_weights – Either “inverse_mad” or a dictionary with feature names as keys and corresponding weights as values. The default option “inverse_mad” weights a continuous feature by the inverse of the Median Absolute Deviation (MAD) of the feature’s values in the training set; the weight for a categorical feature is 1 by default.
optimizer – TensorFlow optimization algorithm. Currently tested only with “tensorflow:adam”.
learning_rate – Learning rate for optimizer.
min_iter – Min iterations to run gradient descent for.
max_iter – Max iterations to run gradient descent for.
project_iter – Project the gradients at intervals of this many iterations.
loss_diff_thres – Minimum difference between successive loss values to check convergence.
loss_converge_maxiter – Maximum number of iterations for loss_diff_thres to hold to declare convergence. Defaults to 1, but we assigned a more conservative value of 2 in the paper.
verbose – Print intermediate loss values.
init_near_query_instance – Boolean to indicate if counterfactuals are to be initialized near query_instance.
tie_random – Used in rounding off CFs and intermediate projection.
stopping_threshold – Minimum threshold for the counterfactual target class probability.
posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.
posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.
limit_steps_ls – Defines an upper limit on the number of steps in the linear search performed during post-hoc sparsity enhancement.
- Returns
A CounterfactualExamples object to store and visualize the resulting counterfactual explanations (see diverse_counterfactuals.py).
dice_ml.explainer_interfaces.dice_tensorflow2 module
Module to generate diverse counterfactual explanations based on TensorFlow 2.x.
- class dice_ml.explainer_interfaces.dice_tensorflow2.DiceTensorFlow2(data_interface, model_interface)[source]
Bases: ExplainerBase
- compute_regularization_loss()[source]
Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.
- do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]
Initializes CFs and other related variables.
- do_loss_initializations(yloss_type, diversity_loss_type, feature_weights)[source]
Initializes variables related to the main loss function.
- do_optimizer_initializations(optimizer, learning_rate)[source]
Initializes gradient-based TensorFlow optimizers.
- find_counterfactuals(query_instance, desired_class, optimizer, learning_rate, min_iter, max_iter, project_iter, loss_diff_thres, loss_converge_maxiter, verbose, init_near_query_instance, tie_random, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm, limit_steps_ls)[source]
Finds counterfactuals by gradient descent.
- generate_counterfactuals(query_instance, total_CFs, desired_class='opposite', proximity_weight=0.5, diversity_weight=1.0, categorical_penalty=0.1, algorithm='DiverseCF', features_to_vary='all', permitted_range=None, yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad', optimizer='tensorflow:adam', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=True, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear', limit_steps_ls=10000)[source]
Generates diverse counterfactual explanations.
- Parameters
query_instance – Test point of interest. A dictionary of feature names and values or a single row dataframe.
total_CFs – Total number of counterfactuals required.
desired_class – Desired counterfactual class; can take 0 or 1. The default value “opposite” flips the outcome class of query_instance for binary classification.
proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance.
diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are.
categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1.
algorithm – Counterfactual generation algorithm. Either “DiverseCF” or “RandomInitCF”.
features_to_vary – Either a string “all” or a list of feature names to vary.
permitted_range – Dictionary with continuous feature names as keys and the permitted [min, max] range as a list of values. If None, defaults to the range inferred from the training data in data_interface.
yloss_type – Metric for y-loss of the optimization function. Takes “l2_loss” or “log_loss” or “hinge_loss”.
diversity_loss_type – Metric for diversity loss of the optimization function. Takes “avg_dist” or “dpp_style:inverse_dist”.
feature_weights – Either “inverse_mad” or a dictionary with feature names as keys and corresponding weights as values. The default option “inverse_mad” weights a continuous feature by the inverse of the Median Absolute Deviation (MAD) of the feature’s values in the training set; the weight for a categorical feature is 1 by default.
optimizer – TensorFlow optimization algorithm. Currently tested only with “tensorflow:adam”.
learning_rate – Learning rate for optimizer.
min_iter – Min iterations to run gradient descent for.
max_iter – Max iterations to run gradient descent for.
project_iter – Project the gradients at intervals of this many iterations.
loss_diff_thres – Minimum difference between successive loss values to check convergence.
loss_converge_maxiter – Maximum number of iterations for loss_diff_thres to hold to declare convergence. Defaults to 1, but we assigned a more conservative value of 2 in the paper.
verbose – Print intermediate loss values.
init_near_query_instance – Boolean to indicate if counterfactuals are to be initialized near query_instance.
tie_random – Used in rounding off CFs and intermediate projection.
stopping_threshold – Minimum threshold for the counterfactual target class probability.
posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.
posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.
limit_steps_ls – Defines an upper limit on the number of steps in the linear search performed during post-hoc sparsity enhancement.
- Returns
A CounterfactualExamples object to store and visualize the resulting counterfactual explanations (see diverse_counterfactuals.py).
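A minimal sketch for the TF2 gradient-based explainer; the model path and feature names are placeholders, and the weights shown mirror the defaults in the signature above (in recent library versions, method='gradient' may need to be passed to dice_ml.Dice explicitly):

```python
import dice_ml

d = dice_ml.Data(dataframe=df,
                 continuous_features=['age', 'hours_per_week'],
                 outcome_name='income')
# 'adult_model.h5' is a placeholder path to a trained Keras model.
m = dice_ml.Model(model_path='adult_model.h5', backend='TF2')

# With a TF2 backend, dice_ml.Dice resolves to DiceTensorFlow2.
exp = dice_ml.Dice(d, m)
cf = exp.generate_counterfactuals(query_instance, total_CFs=4,
                                  desired_class='opposite',
                                  proximity_weight=0.5,
                                  diversity_weight=1.0)
cf.visualize_as_dataframe()
```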
dice_ml.explainer_interfaces.explainer_base module
Module containing a template class to generate counterfactual explanations. Subclasses implement interfaces for different ML frameworks such as TensorFlow or PyTorch. All methods are in dice_ml.explainer_interfaces.
- class dice_ml.explainer_interfaces.explainer_base.ExplainerBase(data_interface, model_interface=None)[source]
Bases: ABC
- check_permitted_range(permitted_range)[source]
Checks the permitted range for continuous features. TODO: add comments as to where this is used if this function is necessary, else remove.
- check_query_instance_validity(features_to_vary, permitted_range, query_instance, feature_ranges_orig)[source]
- static deserialize_explainer(path)[source]
Reloads the explainer into memory by reading the file specified by path.
- do_binary_search(diff, decimal_prec, query_instance, cf_ix, feature, final_cfs_sparse, current_pred)[source]
Performs a binary search between continuous features of a CF and corresponding values in query_instance until the prediction class changes.
- do_linear_search(diff, decimal_prec, query_instance, cf_ix, feature, final_cfs_sparse, current_pred_orig, limit_steps_ls)[source]
Performs a greedy linear search: moves the continuous features in CFs toward their original values in query_instance until the prediction class changes or the maximum number of steps is reached.
- do_posthoc_sparsity_enhancement(final_cfs_sparse, query_instance, posthoc_sparsity_param, posthoc_sparsity_algorithm, limit_steps_ls)[source]
Post-hoc method to encourage sparsity in the generated counterfactuals.
- Parameters
final_cfs_sparse – Final CFs in original user-fed format, in a pandas dataframe.
query_instance – Query instance in original user-fed format, in a pandas dataframe.
posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.
posthoc_sparsity_algorithm – Perform either linear or binary search. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.
limit_steps_ls – Defines the maximum number of steps in the linear search; necessary to avoid infinite loops.
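To illustrate the greedy linear search that do_linear_search performs, here is a simplified sketch, not the library's exact implementation; predict_class is a hypothetical helper that classifies a single candidate row:

```python
def sparsify_feature(cf_row, feature, query_value, step, predict_class,
                     target_class, limit_steps_ls):
    """Greedily move cf_row[feature] back toward query_value in
    increments of `step`, keeping the last value whose prediction stays
    in target_class. limit_steps_ls bounds the loop, which is what
    prevents an infinite search."""
    direction = 1 if query_value > cf_row[feature] else -1
    for _ in range(limit_steps_ls):
        candidate = cf_row.copy()
        candidate[feature] += direction * step
        # Stop if we would overshoot the original value or if the
        # prediction would leave the counterfactual's target class.
        overshot = (candidate[feature] - query_value) * direction > 0
        if overshot or predict_class(candidate) != target_class:
            break
        cf_row = candidate
    return cf_row
```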
- feature_importance(query_instances, cf_examples_list=None, total_CFs=10, local_importance=True, global_importance=True, desired_class='opposite', desired_range=None, permitted_range=None, features_to_vary='all', stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear', **kwargs)[source]
Estimate feature importance scores for the given inputs.
- Parameters
query_instances – A list of inputs for which to compute the feature importances. These can be provided as a dataframe.
cf_examples_list – If precomputed, a list of counterfactual examples for every input point. If cf_examples_list is provided, then all the following parameters are ignored.
total_CFs – The number of counterfactuals to generate per input (default is 10).
other_parameters – These are the same as for the generate_counterfactuals method.
- Returns
An object of class CounterfactualExplanations that includes the list of counterfactuals per input, local feature importances per input, and the global feature importance summarized over all inputs.
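A minimal sketch of calling feature_importance on the explainer built in the earlier sketches; the summary_importance and local_importance attribute names below follow the library's CounterfactualExplanations class but should be treated as assumptions if your version differs:

```python
# Ten query rows with the outcome column dropped (df and exp as in the
# earlier sketches). The library typically requires total_CFs >= 10 for
# importance scores.
queries = df.drop(columns='income').iloc[0:10]
imp = exp.feature_importance(queries, total_CFs=10,
                             desired_class='opposite')
print(imp.summary_importance)   # global scores, one per feature
print(imp.local_importance[0])  # per-feature scores for the first query
```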
- generate_counterfactuals(query_instances, total_CFs, desired_class='opposite', desired_range=None, permitted_range=None, features_to_vary='all', stopping_threshold=0.5, posthoc_sparsity_param=0.1, proximity_weight=0.2, sparsity_weight=0.2, diversity_weight=5.0, categorical_penalty=0.1, posthoc_sparsity_algorithm='linear', verbose=False, **kwargs)[source]
General method for generating counterfactuals.
- Parameters
query_instances – Input point(s) for which counterfactuals are to be generated. This can be a dataframe with one or more rows.
total_CFs – Total number of counterfactuals required.
desired_class – Desired counterfactual class; can take 0 or 1. The default value “opposite” flips the outcome class of query_instance for binary classification.
desired_range – For regression problems. Contains the outcome range to generate counterfactuals in. This should be a list of two numbers in ascending order.
permitted_range – Dictionary with feature names as keys and the permitted range as a list of values. If None, defaults to the range inferred from the training data in data_interface.
features_to_vary – Either a string “all” or a list of feature names to vary.
stopping_threshold – Minimum threshold for the counterfactual target class probability.
proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
sparsity_weight – A positive float. The larger this weight, the fewer features are changed from the query_instance. Used by [‘genetic’, ‘kdtree’], ignored by [‘random’, ‘gradientdescent’] methods.
diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.
posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.
verbose – Whether to output detailed messages.
sample_size – Sampling size.
random_seed – Random seed for reproducibility.
kwargs – Other parameters accepted by the specific explanation method.
- Returns
A CounterfactualExplanations object that contains the list of counterfactual examples per query_instance as one of its attributes.
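A minimal sketch of this general entry point, combining the constraint arguments described above (exp and df are the placeholders from the earlier sketches):

```python
cf = exp.generate_counterfactuals(
    df.drop(columns='income').iloc[0:2],  # one or more query rows
    total_CFs=4,
    desired_class='opposite',
    features_to_vary=['age', 'hours_per_week'],
    permitted_range={'age': [25, 60]},
    posthoc_sparsity_algorithm='binary',  # binary search suits wide ranges
)
cf.visualize_as_dataframe(show_only_changes=True)
```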
- global_feature_importance(query_instances, cf_examples_list=None, total_CFs=10, local_importance=True, desired_class='opposite', desired_range=None, permitted_range=None, features_to_vary='all', stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear', **kwargs)[source]
Estimate global feature importance scores for the given inputs.
- Parameters
query_instances – A list of inputs for which to compute the feature importances. These can be provided as a dataframe.
cf_examples_list – If precomputed, a list of counterfactual examples for every input point. If cf_examples_list is provided, then all the following parameters are ignored.
total_CFs – The number of counterfactuals to generate per input (default is 10).
local_importance – Binary flag indicating whether local feature importance values should also be returned for each query instance.
other_parameters – These are the same as for the generate_counterfactuals method.
- Returns
An object of class CounterfactualExplanations that includes the list of counterfactuals per input, local feature importances per input, and the global feature importance summarized over all inputs.
- infer_target_cfs_class(desired_class_input, original_pred, num_output_nodes)[source]
Infers the target class for generating CFs. Only called when model_type == “classifier”. TODO: Add support for opposite desired class in multiclass. Downstream methods should decide whether it is allowed or not.
- local_feature_importance(query_instances, cf_examples_list=None, total_CFs=10, desired_class='opposite', desired_range=None, permitted_range=None, features_to_vary='all', stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear', **kwargs)[source]
Estimate local feature importance scores for the given inputs.
- Parameters
query_instances – A list of inputs for which to compute the feature importances. These can be provided as a dataframe.
cf_examples_list – If precomputed, a list of counterfactual examples for every input point. If cf_examples_list is provided, then all the following parameters are ignored.
total_CFs – The number of counterfactuals to generate per input (default is 10).
other_parameters – These are the same as for the generate_counterfactuals method.
- Returns
An object of class CounterfactualExplanations that includes the list of counterfactuals per input, local feature importances per input, and the global feature importance summarized over all inputs.
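When counterfactuals have already been generated, the precomputed examples can be reused so that no new generation runs. A sketch, with exp and queries as in the earlier examples; the cf_examples_list attribute name is taken from the library's CounterfactualExplanations class and should be treated as an assumption if your version differs:

```python
cf = exp.generate_counterfactuals(queries, total_CFs=10)
# Passing cf_examples_list skips regeneration; as documented above, the
# remaining generation parameters are then ignored.
loc = exp.local_feature_importance(queries,
                                   cf_examples_list=cf.cf_examples_list)
print(loc.local_importance)
```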
dice_ml.explainer_interfaces.feasible_base_vae module
- class dice_ml.explainer_interfaces.feasible_base_vae.FeasibleBaseVAE(data_interface, model_interface, **kwargs)[source]
Bases: ExplainerBase
- generate_counterfactuals(query_instance, total_CFs, desired_class='opposite')[source]
General method for generating counterfactuals.
- Parameters
query_instances – Input point(s) for which counterfactuals are to be generated. This can be a dataframe with one or more rows.
total_CFs – Total number of counterfactuals required.
desired_class – Desired counterfactual class; can take 0 or 1. The default value “opposite” flips the outcome class of query_instance for binary classification.
desired_range – For regression problems. Contains the outcome range to generate counterfactuals in. This should be a list of two numbers in ascending order.
permitted_range – Dictionary with feature names as keys and the permitted range as a list of values. If None, defaults to the range inferred from the training data in data_interface.
features_to_vary – Either a string “all” or a list of feature names to vary.
stopping_threshold – Minimum threshold for the counterfactual target class probability.
proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
sparsity_weight – A positive float. The larger this weight, the fewer features are changed from the query_instance. Used by [‘genetic’, ‘kdtree’], ignored by [‘random’, ‘gradientdescent’] methods.
diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1. Used by [‘genetic’, ‘gradientdescent’], ignored by [‘random’, ‘kdtree’] methods.
posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.
posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.
verbose – Whether to output detailed messages.
sample_size – Sampling size.
random_seed – Random seed for reproducibility.
kwargs – Other parameters accepted by specific explanation method.
- Returns
A CounterfactualExplanations object that contains the list of counterfactual examples per query_instance as one of its attributes.
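FeasibleBaseVAE is constructed directly (not via dice_ml.Dice) and must be trained before generating counterfactuals. A hypothetical sketch; the VAE hyperparameter names passed as kwargs below are assumptions rather than a confirmed signature, and train() may accept additional flags depending on the version:

```python
from dice_ml.explainer_interfaces.feasible_base_vae import FeasibleBaseVAE

# d and m are the data/model interfaces from the earlier sketches; the
# kwargs are assumed VAE hyperparameters and may differ by version.
exp = FeasibleBaseVAE(d, m, encoded_size=10, lr_dice=1e-2,
                      batch_size_dice=64, validity_reg=42.0, epochs=25)
exp.train()
cf = exp.generate_counterfactuals(query_instance, total_CFs=4,
                                  desired_class='opposite')
```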
dice_ml.explainer_interfaces.feasible_model_approx module
- class dice_ml.explainer_interfaces.feasible_model_approx.FeasibleModelApprox(data_interface, model_interface, **kwargs)[source]
Bases: FeasibleBaseVAE, ExplainerBase
- train(constraint_type, constraint_variables, constraint_direction, constraint_reg, pre_trained=False)[source]
- Parameters
pre_trained – Bool variable indicating whether a pre-trained model exists, to avoid training again.
constraint_type – Binary variable; currently (1) unary / (0) monotonic.
constraint_variables – List of lists: [[Effect, Cause1, Cause2, …]].
constraint_direction – -1: negative, 1: positive (must be 1 for monotonic constraints by default).
constraint_reg – Tunable hyperparameter.
- Returns
None.
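A hypothetical call matching the parameter descriptions above; the constraint variable name and the regularization value are placeholders, and the construction kwargs follow the assumed FeasibleBaseVAE sketch:

```python
from dice_ml.explainer_interfaces.feasible_model_approx import \
    FeasibleModelApprox

# d and m as in the earlier sketches; kwargs are assumed hyperparameters.
exp = FeasibleModelApprox(d, m, encoded_size=10, epochs=25)
exp.train(
    1,                  # constraint_type: 1 => unary constraint
    [['education']],    # constraint_variables: [[Effect, Cause1, ...]]
    1,                  # constraint_direction: 1 => positive
    8.0,                # constraint_reg: tunable hyperparameter
    pre_trained=False,  # train from scratch
)
```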