holisticai.bias.metrics.silhouette_diff
- holisticai.bias.metrics.silhouette_diff(group_a, group_b, data, y_pred)
Silhouette Difference
Computes the difference between the mean silhouette score of group_a and the mean silhouette score of group_b.
Interpretation
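A value of 0 means both groups have the same mean silhouette score. Negative values indicate bias against group_a (its samples are, on average, less well clustered than those of group_b), and positive values indicate bias against group_b. Because each mean silhouette score lies in [-1, 1], the difference is bounded by -2 and 2.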
Parameters
- group_a : array-like
Group membership vector (binary)
- group_b : array-like
Group membership vector (binary)
- data : matrix-like
Data matrix of shape (num_inst, dim)
- y_pred : array-like
Cluster predictions (categorical)
Returns
- float
Silhouette difference
Notes
:math:`\texttt{mean_silhouette_a - mean_silhouette_b}`
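The formula above can be illustrated with a small dependency-free sketch. This is not the library's implementation. It assumes Euclidean distance and assumes that per-sample silhouette scores are computed over the whole dataset and then averaged within each group. The helper names `silhouette_samples_sketch` and `silhouette_diff_sketch` are hypothetical.

```python
from math import dist


def silhouette_samples_sketch(data, labels):
    """Per-sample silhouette scores (illustrative; needs at least 2 clusters)."""
    n = len(data)
    clusters = sorted(set(labels))
    scores = []
    for i in range(n):
        # Mean distance from sample i to the members of each cluster,
        # leaving sample i out of its own cluster.
        mean_dist = {}
        for c in clusters:
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                total = sum(dist(data[i], data[j]) for j in members)
                mean_dist[c] = total / len(members)
        own = labels[i]
        if own not in mean_dist:
            # Singleton cluster: score is 0 by the usual convention.
            scores.append(0.0)
            continue
        a = mean_dist[own]  # cohesion: mean distance within own cluster
        b = min(d for c, d in mean_dist.items() if c != own)  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def silhouette_diff_sketch(group_a, group_b, data, y_pred):
    """mean_silhouette_a - mean_silhouette_b, under the assumptions stated above."""
    scores = silhouette_samples_sketch(data, y_pred)
    sil_a = [s for s, m in zip(scores, group_a) if m == 1]
    sil_b = [s for s, m in zip(scores, group_b) if m == 1]
    return sum(sil_a) / len(sil_a) - sum(sil_b) / len(sil_b)


# Same inputs as the documented example: every cluster is perfectly tight,
# so every sample scores 1 and the difference is 0.
group_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
group_b = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
data = [[-1, 1], [1, 1], [1, 1], [0, -1], [-1, 1],
        [-1, 1], [-1, 1], [-1, 1], [1, 1], [0, -1]]
y_pred = [0, 1, 1, 2, 0, 0, 0, 0, 1, 2]
balanced = silhouette_diff_sketch(group_a, group_b, data, y_pred)

# Made-up toy data where group_a sits in the looser cluster,
# which gives a negative difference.
skewed = silhouette_diff_sketch(
    [1, 1, 0, 0], [0, 0, 1, 1], [[0], [2], [10], [10]], [0, 0, 1, 1]
)
print(balanced, skewed)
```

In the second call, the two group_a samples score 0.8 and 0.75 while both group_b samples score 1.0, so the result is about -0.225. This matches the reading that negative values arise when group_a is less well clustered.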
Examples
>>> import numpy as np
>>> from holisticai.bias.metrics import silhouette_diff
>>> group_a = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
>>> group_b = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
>>> data = np.array(
...     [
...         [-1, 1],
...         [1, 1],
...         [1, 1],
...         [0, -1],
...         [-1, 1],
...         [-1, 1],
...         [-1, 1],
...         [-1, 1],
...         [1, 1],
...         [0, -1],
...     ]
... )
>>> y_pred = np.array([0, 1, 1, 2, 0, 0, 0, 0, 1, 2])
>>> silhouette_diff(group_a, group_b, data, y_pred)
0.0