holisticai.bias.metrics.cluster_balance

holisticai.bias.metrics.cluster_balance(group_a, group_b, y_pred)

Cluster Balance

Given a clustering and a protected attribute, the cluster balance is the minimum, over all groups and clusters, of the ratio between a group's share of a cluster and that group's share of the whole dataset.
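
Read literally, and assuming each sample belongs to exactly one of the two groups, the definition can be written as below. The symbols are introduced here for illustration only: C is the set of clusters, n is the total number of samples, and |g ∩ c| is the number of members of group g assigned to cluster c.

\[
\text{balance} = \min_{g \in \{a, b\}} \; \min_{c \in C} \; \frac{|g \cap c| \, / \, |c|}{|g| \, / \, n}
\]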

Interpretation

A value of 1 is desired: it occurs when every cluster has exactly the same group representation as the full dataset. Lower values indicate that at least one cluster underrepresents either group_a or group_b.

Parameters

group_a : array-like

Group membership vector (binary)

group_b : array-like

Group membership vector (binary)

y_pred : array-like

Cluster predictions (categorical)

Returns

float

Cluster Balance

Examples

>>> import numpy as np
>>> from holisticai.bias.metrics import cluster_balance
>>> group_a = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
>>> group_b = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
>>> y_pred_cluster = np.array([0, 1, 1, 2, 0, 0, 0, 0, 1, 2])
>>> cluster_balance(group_a, group_b, y_pred_cluster)
0.5
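
The value in the example above can be checked with a minimal sketch that follows the written definition step by step. `cluster_balance_sketch` is a hypothetical helper written for this illustration, not the library's implementation. It assumes that every sample belongs to exactly one of the two groups and that both groups are non-empty.

```python
import numpy as np


def cluster_balance_sketch(group_a, group_b, y_pred):
    """Illustrative re-derivation of the definition (hypothetical helper)."""
    group_a = np.asarray(group_a).astype(bool)
    group_b = np.asarray(group_b).astype(bool)
    y_pred = np.asarray(y_pred)

    ratios = []
    for group in (group_a, group_b):
        # Share of the whole dataset that belongs to this group.
        # Assumption: the group is non-empty, so this is never zero.
        overall_share = group.mean()
        for cluster in np.unique(y_pred):
            in_cluster = y_pred == cluster
            # Share of this cluster that belongs to the group.
            cluster_share = group[in_cluster].mean()
            ratios.append(cluster_share / overall_share)

    # The balance is the worst (smallest) ratio over all groups and clusters.
    return float(min(ratios))


group_a = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
group_b = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_pred_cluster = np.array([0, 1, 1, 2, 0, 0, 0, 0, 1, 2])

balance = cluster_balance_sketch(group_a, group_b, y_pred_cluster)
print(balance)  # 0.5
```

The minimum comes from cluster 0, which holds five samples, only one of them from group_a. That group makes up 20% of the cluster but 40% of the data, giving a ratio of 0.2 / 0.4 = 0.5.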