holisticai.bias.metrics.silhouette_diff

holisticai.bias.metrics.silhouette_diff(group_a, group_b, data, y_pred)

Silhouette Difference

Computes the difference between the mean silhouette score of group_a and the mean silhouette score of group_b.

Interpretation

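Each group's mean silhouette score lies between -1 and 1, so the difference is bounded by -2 and 2. A value of 0 means both groups are clustered equally well on average. Positive values mean the points in group_a have a higher mean silhouette score (are clustered more cohesively) than those in group_b; negative values mean the reverse.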

Parameters

group_a : array-like

Group membership vector (binary)

group_b : array-like

Group membership vector (binary)

data : matrix-like

Data matrix of shape (num_inst, dim)

y_pred : array-like

Cluster predictions (categorical)

Returns

float

Silhouette difference

Notes

:math:`\texttt{mean\_silhouette\_a} - \texttt{mean\_silhouette\_b}`
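
The formula above can be sketched in plain NumPy. This is an illustrative reimplementation, not the library's source: the helper names `silhouette_samples_sketch` and `silhouette_diff_sketch` are hypothetical, and the sketch assumes Euclidean distance, at least two clusters, and that per-sample silhouette scores are computed over the full dataset before being averaged within each group.

```python
import numpy as np


def silhouette_samples_sketch(data, labels):
    """Per-sample silhouette scores using Euclidean distance (illustrative only)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=-1)
    clusters = np.unique(labels)  # assumes at least two distinct clusters
    scores = np.zeros(len(data))
    for i in range(len(data)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton clusters keep a score of 0 by convention
        # a: mean distance to the other members of the same cluster
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores


def silhouette_diff_sketch(group_a, group_b, data, y_pred):
    """mean_silhouette_a - mean_silhouette_b, following the Notes formula."""
    scores = silhouette_samples_sketch(data, y_pred)
    mask_a = np.asarray(group_a) == 1
    mask_b = np.asarray(group_b) == 1
    return float(scores[mask_a].mean() - scores[mask_b].mean())


# Toy data (made up for illustration): group_a forms a tight cluster,
# group_b a looser one, so the difference is positive.
toy_data = np.array([[0, 0], [0, 0], [4, 0], [6, 0]])
toy_pred = np.array([0, 0, 1, 1])
toy_a = np.array([1, 1, 0, 0])
toy_b = np.array([0, 0, 1, 1])
print(round(silhouette_diff_sketch(toy_a, toy_b, toy_data, toy_pred), 4))  # 0.4167
```

In the toy data both group_a points score 1, while the group_b points score 0.5 and about 0.667, so the positive result reflects the tighter clustering of group_a.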

Examples

>>> import numpy as np
>>> from holisticai.bias.metrics import silhouette_diff
>>> group_a = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
>>> group_b = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
>>> data = np.array(
...     [
...         [-1, 1],
...         [1, 1],
...         [1, 1],
...         [0, -1],
...         [-1, 1],
...         [-1, 1],
...         [-1, 1],
...         [-1, 1],
...         [1, 1],
...         [0, -1],
...     ]
... )
>>> y_pred = np.array([0, 1, 1, 2, 0, 0, 0, 0, 1, 2])
>>> silhouette_diff(group_a, group_b, data, y_pred)
0.0