Estimating local and global feature importance scores using DiCE
Summaries of counterfactual examples can be used to estimate the importance of features. Intuitively, a feature that is changed more often to generate a proximal counterfactual is an important feature. We use this intuition to build a feature importance score.
This score can be interpreted as a measure of how necessary a feature is for a particular model output. That is, if the feature’s value changes, then it is likely that the model’s output class will also change (or, for a regression model, that the predicted value will change significantly).
Below we show how counterfactuals can be used to provide local feature importance scores for any input, and how those scores can be combined to yield a global importance score for each feature.
[1]:
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
import dice_ml
from dice_ml import Dice
from dice_ml.utils import helpers # helper functions
[2]:
%load_ext autoreload
%autoreload 2
Preliminaries: Loading the data and ML model
[3]:
dataset = helpers.load_adult_income_dataset().sample(5000) # downsampling to reduce ML model fitting time
helpers.get_adult_data_info()
[3]:
{'age': 'age',
'workclass': 'type of industry (Government, Other/Unknown, Private, Self-Employed)',
'education': 'education level (Assoc, Bachelors, Doctorate, HS-grad, Masters, Prof-school, School, Some-college)',
'marital_status': 'marital status (Divorced, Married, Separated, Single, Widowed)',
'occupation': 'occupation (Blue-Collar, Other/Unknown, Professional, Sales, Service, White-Collar)',
'race': 'white or other race?',
'gender': 'male or female?',
'hours_per_week': 'total work hours per week',
'income': '0 (<=50K) vs 1 (>50K)'}
[4]:
target = dataset["income"]
# Split data into train and test
datasetX = dataset.drop("income", axis=1)
x_train, x_test, y_train, y_test = train_test_split(datasetX,
                                                    target,
                                                    test_size=0.2,
                                                    random_state=0,
                                                    stratify=target)
numerical = ["age", "hours_per_week"]
categorical = x_train.columns.difference(numerical)
# We create the preprocessing pipelines for both numeric and categorical data.
numeric_transformer = Pipeline(steps=[
    ('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
transformations = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numerical),
        ('cat', categorical_transformer, categorical)])
# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor', transformations),
                      ('classifier', RandomForestClassifier())])
model = clf.fit(x_train, y_train)
[5]:
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
m = dice_ml.Model(model=model, backend="sklearn")
Local feature importance
We first generate counterfactuals for a given input point.
[6]:
exp = Dice(d, m, method="random")
query_instance = x_train[1:2]
e1 = exp.generate_counterfactuals(query_instance, total_CFs=10, desired_range=None,
                                  desired_class="opposite",
                                  permitted_range=None, features_to_vary="all")
e1.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 51 | Self-Employed | Masters | Married | White-Collar | White | Male | 50 | 1 |
Diverse Counterfactual set (new outcome: 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 67 | - | School | - | - | - | - | - | 0 |
| 1 | - | - | Assoc | Single | - | - | - | - | 0 |
| 2 | - | - | Assoc | - | - | - | - | 96 | 0 |
| 3 | - | - | Assoc | - | - | - | - | 95 | 0 |
| 4 | - | Government | - | - | - | - | - | 27 | 0 |
| 5 | - | - | - | - | Blue-Collar | - | - | 64 | 0 |
| 6 | 27 | Other/Unknown | - | - | - | - | - | - | 0 |
| 7 | - | - | School | Divorced | - | - | - | - | 0 |
| 8 | - | - | - | Separated | - | - | Female | - | 0 |
| 9 | - | - | HS-grad | - | - | Other | - | - | 0 |
These can now be used to calculate the feature importance scores.
[7]:
imp = exp.local_feature_importance(query_instance, cf_examples_list=e1.cf_examples_list)
print(imp.local_importance)
[{'education': 0.6, 'hours_per_week': 0.4, 'marital_status': 0.3, 'workclass': 0.2, 'age': 0.2, 'occupation': 0.1, 'race': 0.1, 'gender': 0.1}]
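The scores printed above agree with a simple count over the counterfactual table: each score is the fraction of the 10 counterfactuals in which that feature differs from the query instance. The sketch below reproduces that count in plain Python from the values shown in the table. It illustrates the intuition only and is not DiCE's internal implementation; the names `query`, `changes`, `counterfactuals`, and `local_scores` are illustrative.

```python
# Query instance from the table above.
query = {"age": 51, "workclass": "Self-Employed", "education": "Masters",
         "marital_status": "Married", "occupation": "White-Collar",
         "race": "White", "gender": "Male", "hours_per_week": 50}

# One entry per counterfactual row: only the features that differ from the query.
changes = [
    {"age": 67, "education": "School"},
    {"education": "Assoc", "marital_status": "Single"},
    {"education": "Assoc", "hours_per_week": 96},
    {"education": "Assoc", "hours_per_week": 95},
    {"workclass": "Government", "hours_per_week": 27},
    {"occupation": "Blue-Collar", "hours_per_week": 64},
    {"age": 27, "workclass": "Other/Unknown"},
    {"education": "School", "marital_status": "Divorced"},
    {"marital_status": "Separated", "gender": "Female"},
    {"education": "HS-grad", "race": "Other"},
]

# Rebuild the full counterfactual rows by overlaying each change on the query.
counterfactuals = [{**query, **delta} for delta in changes]

# Fraction of counterfactuals in which each feature differs from the query.
local_scores = {
    feature: sum(cf[feature] != value for cf in counterfactuals) / len(counterfactuals)
    for feature, value in query.items()
}
print(local_scores)
```

Because the score depends on which counterfactuals happen to be generated, a different counterfactual set for the same input can give different scores.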
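Feature importance can also be estimated directly by omitting the cf_examples_list argument, in which case DiCE generates the counterfactuals itself. Because the random method produces a new set of counterfactuals on each call, the scores below differ from those above.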
[8]:
imp = exp.local_feature_importance(query_instance, posthoc_sparsity_param=None)
print(imp.local_importance)
[{'age': 0.5, 'education': 0.4, 'marital_status': 0.4, 'hours_per_week': 0.3, 'occupation': 0.2, 'workclass': 0.1, 'gender': 0.1, 'race': 0.0}]
Global importance
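For global importance, we need to generate counterfactuals for a representative sample of the dataset. The cell below uses just the first 10 training rows, so the resulting scores are illustrative rather than stable estimates.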
[9]:
cobj = exp.global_feature_importance(x_train[0:10], total_CFs=10, posthoc_sparsity_param=None)
print(cobj.summary_importance)
{'education': 0.52, 'age': 0.49, 'hours_per_week': 0.35, 'marital_status': 0.23, 'occupation': 0.18, 'workclass': 0.11, 'gender': 0.1, 'race': 0.09}
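The summary scores above are consistent with averaging the per-instance local scores over the 10 query instances. The sketch below checks this using the local scores from this run, copied from the `local_importance` field of the JSON output in the next section. It assumes every query instance contributes the same number of counterfactuals (10 here); it is a consistency check on the printed numbers, not a statement of how DiCE computes the summary internally.

```python
feature_names = ["age", "workclass", "education", "marital_status",
                 "occupation", "race", "gender", "hours_per_week"]

# One row per query instance, columns in feature_names order.
local_importance = [
    [0.4, 0.0, 0.3, 0.6, 0.0, 0.2, 0.0, 0.5],
    [0.4, 0.0, 0.4, 0.3, 0.5, 0.2, 0.1, 0.1],
    [0.4, 0.0, 0.8, 0.0, 0.1, 0.0, 0.0, 0.6],
    [0.4, 0.0, 0.8, 0.1, 0.3, 0.0, 0.0, 0.3],
    [0.4, 0.3, 0.2, 0.3, 0.0, 0.0, 0.1, 0.5],
    [0.1, 0.1, 0.5, 0.4, 0.2, 0.1, 0.0, 0.6],
    [0.8, 0.2, 0.5, 0.2, 0.0, 0.2, 0.4, 0.5],
    [0.9, 0.2, 0.8, 0.1, 0.0, 0.0, 0.0, 0.0],
    [0.4, 0.2, 0.2, 0.0, 0.2, 0.1, 0.3, 0.4],
    [0.7, 0.1, 0.7, 0.3, 0.5, 0.1, 0.1, 0.0],
]

# Average each column over the query instances, rounded to two decimals.
n_instances = len(local_importance)
global_scores = {
    name: round(sum(row[i] for row in local_importance) / n_instances, 2)
    for i, name in enumerate(feature_names)
}
print(global_scores)
```

With only 10 query instances, a single unusual input can shift a feature's average noticeably, which is why a larger representative sample is preferable in practice.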
Convert the counterfactual output to JSON
[10]:
json_str = cobj.to_json()
print(json_str)
{"test_data": [[[47, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "White", "Male", 36, 1]], [[51, "Self-Employed", "Masters", "Married", "White-Collar", "White", "Male", 50, 1]], [[38, "Self-Employed", "HS-grad", "Married", "Service", "White", "Male", 80, 0]], [[66, "Other/Unknown", "HS-grad", "Married", "Other/Unknown", "White", "Female", 20, 0]], [[32, "Private", "HS-grad", "Married", "Service", "White", "Male", 60, 1]], [[74, "Other/Unknown", "Bachelors", "Widowed", "Other/Unknown", "Other", "Female", 35, 0]], [[26, "Private", "HS-grad", "Single", "White-Collar", "White", "Female", 40, 0]], [[27, "Self-Employed", "HS-grad", "Single", "Sales", "White", "Male", 45, 0]], [[36, "Private", "Some-college", "Married", "Blue-Collar", "White", "Male", 60, 1]], [[32, "Private", "Assoc", "Single", "Service", "White", "Female", 60, 0]]], "cfs_list": [[[23, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "Other", "Male", 36, 0], [47, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "Other", "Male", 12, 0], [47, "Other/Unknown", "Bachelors", "Separated", "Other/Unknown", "White", "Male", 55, 0], [28, "Other/Unknown", "Bachelors", "Separated", "Other/Unknown", "White", "Male", 36, 0], [64, "Other/Unknown", "Bachelors", "Widowed", "Other/Unknown", "White", "Male", 36, 0], [47, "Other/Unknown", "School", "Married", "Other/Unknown", "White", "Male", 96, 0], [47, "Other/Unknown", "School", "Married", "Other/Unknown", "White", "Male", 34, 0], [47, "Other/Unknown", "HS-grad", "Widowed", "Other/Unknown", "White", "Male", 36, 0], [47, "Other/Unknown", "Bachelors", "Single", "Other/Unknown", "White", "Male", 73, 0], [19, "Other/Unknown", "Bachelors", "Separated", "Other/Unknown", "White", "Male", 36, 0]], [[51, "Self-Employed", "HS-grad", "Married", "Blue-Collar", "White", "Male", 50, 0], [28, "Self-Employed", "Masters", "Married", "White-Collar", "White", "Female", 50, 0], [51, "Self-Employed", "School", "Married", "White-Collar", "Other", "Male", 50, 
0], [51, "Self-Employed", "Masters", "Married", "Blue-Collar", "White", "Male", 24, 0], [21, "Self-Employed", "Masters", "Married", "Other/Unknown", "White", "Male", 50, 0], [51, "Self-Employed", "Masters", "Widowed", "Service", "White", "Male", 50, 0], [28, "Self-Employed", "Bachelors", "Married", "White-Collar", "White", "Male", 50, 0], [51, "Self-Employed", "Masters", "Separated", "Blue-Collar", "White", "Male", 50, 0], [51, "Self-Employed", "HS-grad", "Married", "White-Collar", "Other", "Male", 50, 0], [85, "Self-Employed", "Masters", "Single", "White-Collar", "White", "Male", 50, 0]], [[78, "Self-Employed", "Some-college", "Married", "Service", "White", "Male", 80, 1], [38, "Self-Employed", "Masters", "Married", "Service", "White", "Male", 87, 1], [38, "Self-Employed", "Doctorate", "Married", "Service", "White", "Male", 86, 1], [38, "Self-Employed", "Some-college", "Married", "Service", "White", "Male", 90, 1], [77, "Self-Employed", "Masters", "Married", "Service", "White", "Male", 80, 1], [87, "Self-Employed", "HS-grad", "Married", "Service", "White", "Male", 61, 1], [78, "Self-Employed", "HS-grad", "Married", "Service", "White", "Male", 63, 1], [38, "Self-Employed", "Doctorate", "Married", "Service", "White", "Male", 80, 1], [38, "Self-Employed", "Masters", "Married", "White-Collar", "White", "Male", 80, 1], [38, "Self-Employed", "Masters", "Married", "Service", "White", "Male", 65, 1]], [[66, "Other/Unknown", "Doctorate", "Married", "Other/Unknown", "White", "Female", 20, 1], [66, "Other/Unknown", "Doctorate", "Married", "White-Collar", "White", "Female", 20, 1], [66, "Other/Unknown", "Doctorate", "Married", "Other/Unknown", "White", "Female", 15, 1], [37, "Other/Unknown", "Some-college", "Married", "Other/Unknown", "White", "Female", 20, 1], [66, "Other/Unknown", "Doctorate", "Widowed", "Other/Unknown", "White", "Female", 20, 1], [73, "Other/Unknown", "Doctorate", "Married", "Other/Unknown", "White", "Female", 20, 1], [56, "Other/Unknown", "HS-grad", 
"Married", "Other/Unknown", "White", "Female", 40, 1], [66, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "White", "Female", 86, 1], [50, "Other/Unknown", "HS-grad", "Married", "White-Collar", "White", "Female", 20, 1], [66, "Other/Unknown", "Bachelors", "Married", "Professional", "White", "Female", 20, 1]], [[40, "Other/Unknown", "HS-grad", "Married", "Service", "White", "Male", 60, 0], [80, "Private", "HS-grad", "Married", "Service", "White", "Male", 34, 0], [32, "Private", "HS-grad", "Separated", "Service", "White", "Male", 76, 0], [32, "Private", "HS-grad", "Married", "Service", "White", "Male", 47, 0], [32, "Private", "Doctorate", "Single", "Service", "White", "Male", 60, 0], [90, "Private", "HS-grad", "Single", "Service", "White", "Male", 60, 0], [55, "Private", "HS-grad", "Married", "Service", "White", "Male", 60, 0], [32, "Private", "Doctorate", "Married", "Service", "White", "Female", 60, 0], [32, "Government", "HS-grad", "Married", "Service", "White", "Male", 14, 0], [32, "Other/Unknown", "HS-grad", "Married", "Service", "White", "Male", 94, 0]], [[74, "Other/Unknown", "Doctorate", "Widowed", "Other/Unknown", "Other", "Female", 27, 1], [74, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "Other", "Female", 93, 1], [74, "Other/Unknown", "Doctorate", "Widowed", "Other/Unknown", "Other", "Female", 55, 1], [74, "Other/Unknown", "Bachelors", "Married", "Professional", "Other", "Female", 35, 1], [74, "Other/Unknown", "Doctorate", "Widowed", "Blue-Collar", "Other", "Female", 35, 1], [74, "Other/Unknown", "Doctorate", "Widowed", "Other/Unknown", "Other", "Female", 31, 1], [74, "Private", "Bachelors", "Married", "Other/Unknown", "Other", "Female", 35, 1], [74, "Other/Unknown", "Doctorate", "Widowed", "Other/Unknown", "Other", "Female", 79, 1], [74, "Other/Unknown", "Bachelors", "Widowed", "Other/Unknown", "White", "Female", 23, 1], [56, "Other/Unknown", "Bachelors", "Married", "Other/Unknown", "Other", "Female", 35, 1]], [[71, "Government", 
"Doctorate", "Single", "White-Collar", "White", "Female", 40, 1], [87, "Private", "Doctorate", "Single", "White-Collar", "White", "Female", 40, 1], [78, "Private", "HS-grad", "Single", "White-Collar", "White", "Male", 67, 1], [32, "Private", "HS-grad", "Married", "White-Collar", "White", "Female", 40, 1], [26, "Private", "HS-grad", "Single", "White-Collar", "Other", "Male", 94, 1], [44, "Other/Unknown", "Doctorate", "Single", "White-Collar", "White", "Female", 40, 1], [49, "Private", "Masters", "Married", "White-Collar", "White", "Female", 40, 1], [44, "Private", "HS-grad", "Single", "White-Collar", "White", "Male", 48, 1], [49, "Private", "Doctorate", "Single", "White-Collar", "White", "Female", 55, 1], [26, "Private", "HS-grad", "Single", "White-Collar", "Other", "Male", 97, 1]], [[50, "Self-Employed", "Doctorate", "Single", "Sales", "White", "Male", 45, 1], [27, "Self-Employed", "Masters", "Married", "Sales", "White", "Male", 45, 1], [67, "Self-Employed", "Prof-school", "Single", "Sales", "White", "Male", 45, 0], [50, "Self-Employed", "Prof-school", "Single", "Sales", "White", "Male", 45, 1], [72, "Self-Employed", "Prof-school", "Single", "Sales", "White", "Male", 45, 0], [48, "Self-Employed", "Prof-school", "Single", "Sales", "White", "Male", 45, 1], [48, "Private", "HS-grad", "Single", "Sales", "White", "Male", 45, 1], [51, "Private", "HS-grad", "Single", "Sales", "White", "Male", 45, 0], [48, "Self-Employed", "Masters", "Single", "Sales", "White", "Male", 45, 1], [49, "Self-Employed", "Doctorate", "Single", "Sales", "White", "Male", 45, 1]], [[65, "Private", "Some-college", "Married", "Blue-Collar", "White", "Female", 60, 0], [36, "Private", "HS-grad", "Married", "Other/Unknown", "White", "Male", 60, 0], [36, "Private", "Some-college", "Married", "Blue-Collar", "White", "Male", 50, 0], [36, "Private", "Some-college", "Married", "Blue-Collar", "White", "Female", 23, 0], [90, "Private", "Some-college", "Married", "Blue-Collar", "Other", "Male", 60, 0], [36, 
"Private", "Some-college", "Married", "Blue-Collar", "White", "Male", 38, 0], [36, "Private", "School", "Married", "Professional", "White", "Male", 60, 0], [36, "Self-Employed", "Some-college", "Married", "Blue-Collar", "White", "Male", 49, 0], [89, "Private", "Some-college", "Married", "Blue-Collar", "White", "Female", 60, 0], [66, "Other/Unknown", "Some-college", "Married", "Blue-Collar", "White", "Male", 60, 0]], [[46, "Private", "Assoc", "Single", "Sales", "White", "Male", 60, 1], [69, "Private", "Prof-school", "Single", "Professional", "White", "Female", 60, 1], [83, "Private", "Doctorate", "Single", "Service", "White", "Female", 60, 1], [32, "Private", "Assoc", "Married", "White-Collar", "White", "Female", 60, 1], [50, "Private", "Doctorate", "Single", "Service", "White", "Female", 60, 1], [55, "Private", "Doctorate", "Single", "Service", "Other", "Female", 60, 1], [44, "Private", "Doctorate", "Single", "Service", "White", "Female", 60, 1], [32, "Private", "Masters", "Divorced", "White-Collar", "White", "Female", 60, 1], [32, "Self-Employed", "Assoc", "Married", "White-Collar", "White", "Female", 60, 0], [55, "Private", "Doctorate", "Single", "Service", "White", "Female", 60, 1]]], "local_importance": [[0.4, 0.0, 0.3, 0.6, 0.0, 0.2, 0.0, 0.5], [0.4, 0.0, 0.4, 0.3, 0.5, 0.2, 0.1, 0.1], [0.4, 0.0, 0.8, 0.0, 0.1, 0.0, 0.0, 0.6], [0.4, 0.0, 0.8, 0.1, 0.3, 0.0, 0.0, 0.3], [0.4, 0.3, 0.2, 0.3, 0.0, 0.0, 0.1, 0.5], [0.1, 0.1, 0.5, 0.4, 0.2, 0.1, 0.0, 0.6], [0.8, 0.2, 0.5, 0.2, 0.0, 0.2, 0.4, 0.5], [0.9, 0.2, 0.8, 0.1, 0.0, 0.0, 0.0, 0.0], [0.4, 0.2, 0.2, 0.0, 0.2, 0.1, 0.3, 0.4], [0.7, 0.1, 0.7, 0.3, 0.5, 0.1, 0.1, 0.0]], "summary_importance": [0.49, 0.11, 0.52, 0.23, 0.18, 0.09, 0.1, 0.35], "data_interface": {"outcome_name": "income", "data_df": "dummy_data"}, "feature_names": ["age", "workclass", "education", "marital_status", "occupation", "race", "gender", "hours_per_week"], "feature_names_including_target": ["age", "workclass", "education", "marital_status", 
"occupation", "race", "gender", "hours_per_week", "income"], "model_type": "classifier", "desired_class": 1, "desired_range": null, "metadata": {"version": "2.0"}}
Convert the JSON output to a counterfactual object
[11]:
imp_r = imp.from_json(json_str)
print([o.visualize_as_dataframe(show_only_changes=True) for o in imp_r.cf_examples_list])
print(imp_r.local_importance)
print(imp_r.summary_importance)
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 47 | Other/Unknown | Bachelors | Married | Other/Unknown | White | Male | 36 | 1 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 23 | - | - | - | - | Other | - | - | 0 |
| 1 | - | - | - | - | - | Other | - | 12 | 0 |
| 2 | - | - | - | Separated | - | - | - | 55 | 0 |
| 3 | 28 | - | - | Separated | - | - | - | - | 0 |
| 4 | 64 | - | - | Widowed | - | - | - | - | 0 |
| 5 | - | - | School | - | - | - | - | 96 | 0 |
| 6 | - | - | School | - | - | - | - | 34 | 0 |
| 7 | - | - | HS-grad | Widowed | - | - | - | - | 0 |
| 8 | - | - | - | Single | - | - | - | 73 | 0 |
| 9 | 19 | - | - | Separated | - | - | - | - | 0 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 51 | Self-Employed | Masters | Married | White-Collar | White | Male | 50 | 1 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | HS-grad | - | Blue-Collar | - | - | - | 0 |
| 1 | 28 | - | - | - | - | - | Female | - | 0 |
| 2 | - | - | School | - | - | Other | - | - | 0 |
| 3 | - | - | - | - | Blue-Collar | - | - | 24 | 0 |
| 4 | 21 | - | - | - | Other/Unknown | - | - | - | 0 |
| 5 | - | - | - | Widowed | Service | - | - | - | 0 |
| 6 | 28 | - | Bachelors | - | - | - | - | - | 0 |
| 7 | - | - | - | Separated | Blue-Collar | - | - | - | 0 |
| 8 | - | - | HS-grad | - | - | Other | - | - | 0 |
| 9 | 85 | - | - | Single | - | - | - | - | 0 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 38 | Self-Employed | HS-grad | Married | Service | White | Male | 80 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 78 | - | Some-college | - | - | - | - | - | 1 |
| 1 | - | - | Masters | - | - | - | - | 87 | 1 |
| 2 | - | - | Doctorate | - | - | - | - | 86 | 1 |
| 3 | - | - | Some-college | - | - | - | - | 90 | 1 |
| 4 | 77 | - | Masters | - | - | - | - | - | 1 |
| 5 | 87 | - | - | - | - | - | - | 61 | 1 |
| 6 | 78 | - | - | - | - | - | - | 63 | 1 |
| 7 | - | - | Doctorate | - | - | - | - | - | 1 |
| 8 | - | - | Masters | - | White-Collar | - | - | - | 1 |
| 9 | - | - | Masters | - | - | - | - | 65 | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 66 | Other/Unknown | HS-grad | Married | Other/Unknown | White | Female | 20 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | Doctorate | - | - | - | - | - | 1 |
| 1 | - | - | Doctorate | - | White-Collar | - | - | - | 1 |
| 2 | - | - | Doctorate | - | - | - | - | 15 | 1 |
| 3 | 37 | - | Some-college | - | - | - | - | - | 1 |
| 4 | - | - | Doctorate | Widowed | - | - | - | - | 1 |
| 5 | 73 | - | Doctorate | - | - | - | - | - | 1 |
| 6 | 56 | - | - | - | - | - | - | 40 | 1 |
| 7 | - | - | Bachelors | - | - | - | - | 86 | 1 |
| 8 | 50 | - | - | - | White-Collar | - | - | - | 1 |
| 9 | - | - | Bachelors | - | Professional | - | - | - | 1 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 32 | Private | HS-grad | Married | Service | White | Male | 60 | 1 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | Other/Unknown | - | - | - | - | - | - | 0 |
| 1 | 80 | - | - | - | - | - | - | 34 | 0 |
| 2 | - | - | - | Separated | - | - | - | 76 | 0 |
| 3 | - | - | - | - | - | - | - | 47 | 0 |
| 4 | - | - | Doctorate | Single | - | - | - | - | 0 |
| 5 | 90 | - | - | Single | - | - | - | - | 0 |
| 6 | 55 | - | - | - | - | - | - | - | 0 |
| 7 | - | - | Doctorate | - | - | - | Female | - | 0 |
| 8 | - | Government | - | - | - | - | - | 14 | 0 |
| 9 | - | Other/Unknown | - | - | - | - | - | 94 | 0 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 74 | Other/Unknown | Bachelors | Widowed | Other/Unknown | Other | Female | 35 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | Doctorate | - | - | - | - | 27 | 1 |
| 1 | - | - | - | Married | - | - | - | 93 | 1 |
| 2 | - | - | Doctorate | - | - | - | - | 55 | 1 |
| 3 | - | - | - | Married | Professional | - | - | - | 1 |
| 4 | - | - | Doctorate | - | Blue-Collar | - | - | - | 1 |
| 5 | - | - | Doctorate | - | - | - | - | 31 | 1 |
| 6 | - | Private | - | Married | - | - | - | - | 1 |
| 7 | - | - | Doctorate | - | - | - | - | 79 | 1 |
| 8 | - | - | - | - | - | White | - | 23 | 1 |
| 9 | 56 | - | - | Married | - | - | - | - | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 26 | Private | HS-grad | Single | White-Collar | White | Female | 40 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 71 | Government | Doctorate | - | - | - | - | - | 1 |
| 1 | 87 | - | Doctorate | - | - | - | - | - | 1 |
| 2 | 78 | - | - | - | - | - | Male | 67 | 1 |
| 3 | 32 | - | - | Married | - | - | - | - | 1 |
| 4 | - | - | - | - | - | Other | Male | 94 | 1 |
| 5 | 44 | Other/Unknown | Doctorate | - | - | - | - | - | 1 |
| 6 | 49 | - | Masters | Married | - | - | - | - | 1 |
| 7 | 44 | - | - | - | - | - | Male | 48 | 1 |
| 8 | 49 | - | Doctorate | - | - | - | - | 55 | 1 |
| 9 | - | - | - | - | - | Other | Male | 97 | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 27 | Self-Employed | HS-grad | Single | Sales | White | Male | 45 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 50 | - | Doctorate | - | - | - | - | - | 1 |
| 1 | - | - | Masters | Married | - | - | - | - | 1 |
| 2 | 67 | - | Prof-school | - | - | - | - | - | - |
| 3 | 50 | - | Prof-school | - | - | - | - | - | 1 |
| 4 | 72 | - | Prof-school | - | - | - | - | - | - |
| 5 | 48 | - | Prof-school | - | - | - | - | - | 1 |
| 6 | 48 | Private | - | - | - | - | - | - | 1 |
| 7 | 51 | Private | - | - | - | - | - | - | - |
| 8 | 48 | - | Masters | - | - | - | - | - | 1 |
| 9 | 49 | - | Doctorate | - | - | - | - | - | 1 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 36 | Private | Some-college | Married | Blue-Collar | White | Male | 60 | 1 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 65 | - | - | - | - | - | Female | - | 0 |
| 1 | - | - | HS-grad | - | Other/Unknown | - | - | - | 0 |
| 2 | - | - | - | - | - | - | - | 50 | 0 |
| 3 | - | - | - | - | - | - | Female | 23 | 0 |
| 4 | 90 | - | - | - | - | Other | - | - | 0 |
| 5 | - | - | - | - | - | - | - | 38 | 0 |
| 6 | - | - | School | - | Professional | - | - | - | 0 |
| 7 | - | Self-Employed | - | - | - | - | - | 49 | 0 |
| 8 | 89 | - | - | - | - | - | Female | - | 0 |
| 9 | 66 | Other/Unknown | - | - | - | - | - | - | 0 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 32 | Private | Assoc | Single | Service | White | Female | 60 | 0 |
Counterfactual set (new outcome: 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 46 | - | - | - | Sales | - | Male | - | 1 |
| 1 | 69 | - | Prof-school | - | Professional | - | - | - | 1 |
| 2 | 83 | - | Doctorate | - | - | - | - | - | 1 |
| 3 | - | - | - | Married | White-Collar | - | - | - | 1 |
| 4 | 50 | - | Doctorate | - | - | - | - | - | 1 |
| 5 | 55 | - | Doctorate | - | - | Other | - | - | 1 |
| 6 | 44 | - | Doctorate | - | - | - | - | - | 1 |
| 7 | - | - | Masters | Divorced | White-Collar | - | - | - | 1 |
| 8 | - | Self-Employed | - | Married | White-Collar | - | - | - | - |
| 9 | 55 | - | Doctorate | - | - | - | - | - | 1 |
[None, None, None, None, None, None, None, None, None, None]
[{'marital_status': 0.6, 'hours_per_week': 0.5, 'age': 0.4, 'education': 0.3, 'race': 0.2, 'workclass': 0.0, 'occupation': 0.0, 'gender': 0.0}, {'occupation': 0.5, 'age': 0.4, 'education': 0.4, 'marital_status': 0.3, 'race': 0.2, 'gender': 0.1, 'hours_per_week': 0.1, 'workclass': 0.0}, {'education': 0.8, 'hours_per_week': 0.6, 'age': 0.4, 'occupation': 0.1, 'workclass': 0.0, 'marital_status': 0.0, 'race': 0.0, 'gender': 0.0}, {'education': 0.8, 'age': 0.4, 'occupation': 0.3, 'hours_per_week': 0.3, 'marital_status': 0.1, 'workclass': 0.0, 'race': 0.0, 'gender': 0.0}, {'hours_per_week': 0.5, 'age': 0.4, 'workclass': 0.3, 'marital_status': 0.3, 'education': 0.2, 'gender': 0.1, 'occupation': 0.0, 'race': 0.0}, {'hours_per_week': 0.6, 'education': 0.5, 'marital_status': 0.4, 'occupation': 0.2, 'age': 0.1, 'workclass': 0.1, 'race': 0.1, 'gender': 0.0}, {'age': 0.8, 'education': 0.5, 'hours_per_week': 0.5, 'gender': 0.4, 'workclass': 0.2, 'marital_status': 0.2, 'race': 0.2, 'occupation': 0.0}, {'age': 0.9, 'education': 0.8, 'workclass': 0.2, 'marital_status': 0.1, 'occupation': 0.0, 'race': 0.0, 'gender': 0.0, 'hours_per_week': 0.0}, {'age': 0.4, 'hours_per_week': 0.4, 'gender': 0.3, 'workclass': 0.2, 'education': 0.2, 'occupation': 0.2, 'race': 0.1, 'marital_status': 0.0}, {'age': 0.7, 'education': 0.7, 'occupation': 0.5, 'marital_status': 0.3, 'workclass': 0.1, 'race': 0.1, 'gender': 0.1, 'hours_per_week': 0.0}]
{'education': 0.52, 'age': 0.49, 'hours_per_week': 0.35, 'marital_status': 0.23, 'occupation': 0.18, 'workclass': 0.11, 'gender': 0.1, 'race': 0.09}
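The JSON payload stores the scores as plain lists in `feature_names` order, whereas the dictionaries printed above are keyed by feature and sorted by score. For consumers that do not have DiCE installed, a minimal sketch of reading the payload with the standard-library `json` module might look like the following. The string `trimmed_json` is a hand-trimmed copy of the output above that keeps only the two fields used here, and the variable names are illustrative.

```python
import json

# Hand-trimmed copy of the payload printed earlier (only the fields used below).
trimmed_json = """
{
  "summary_importance": [0.49, 0.11, 0.52, 0.23, 0.18, 0.09, 0.1, 0.35],
  "feature_names": ["age", "workclass", "education", "marital_status",
                    "occupation", "race", "gender", "hours_per_week"]
}
"""

payload = json.loads(trimmed_json)

# Pair each score with its feature name, then rank from most to least important.
by_feature = dict(zip(payload["feature_names"], payload["summary_importance"]))
ranked = dict(sorted(by_feature.items(), key=lambda item: item[1], reverse=True))
print(ranked)
```

The resulting ranking agrees with the `summary_importance` dictionary printed above, with education and age at the top and race at the bottom.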