Bias mitigation with “Popularity propensity” and “Two-sided fairness”#

This demo shows how to apply the “Popularity propensity” and “Two-sided fairness” methods to mitigate bias in recommender systems.

First, install the holisticai package if you haven’t already:

!pip install holisticai[all]

Then, import the necessary libraries.

[2]:
import warnings

import numpy as np
import pandas as pd
from holisticai.datasets import load_dataset
from holisticai.bias.metrics import recommender_bias_metrics
from holisticai.bias.mitigation import PopularityPropensityMF

np.random.seed(0)  # fix the seed for reproducible results
warnings.filterwarnings("ignore")

Next, load the preprocessed “LastFM” dataset.

[3]:
dataset = load_dataset('lastfm')
df_pivot, p_attr = dataset['data_pivot'], dataset['p_attr']
[4]:
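def explode(arr, num_items):
    """Convert an array of item indices into a binary vector of length num_items."""
    out = np.zeros(num_items)
    out[arr] = 1  # mark the recommended items
    return out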
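As a quick illustration of what the explode helper does, the sketch below applies it to a small, made-up list of item indices. The toy input is an assumption and is not taken from the LastFM data.

```python
import numpy as np

def explode(arr, num_items):
    out = np.zeros(num_items)
    out[arr] = 1
    return out

# Hypothetical toy input: a user was recommended items 0, 3 and 4 out of 6.
vec = explode(np.array([0, 3, 4]), num_items=6)
print(vec)  # [1. 0. 0. 1. 1. 0.]
```

Stacking one such vector per user yields the binary user-item matrix that the metrics are computed on later in this demo.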

Bias mitigation#

Method: Popularity propensity#

Traditional implementation#

First, we show the traditional implementation of the “Popularity propensity” method, in which the model is fitted directly on the user-item matrix.

[5]:
mf = PopularityPropensityMF(K=40, beta=0.02, steps=100, verbose=1)
data_matrix = df_pivot.fillna(0).to_numpy()
mf.fit(data_matrix)
[5]:
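def recommended_items(model, data_matrix, k):
    """Return the indices of the top-k items each user has not interacted with yet."""
    recommended_items_mask = data_matrix > 0  # items the user already interacted with
    candidate_index = ~recommended_items_mask  # items still available to recommend
    candidate_rating = model.pred * candidate_index  # zero out scores of seen items
    return np.argsort(-candidate_rating, axis=1)[:, :k]  # sort by descending score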
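The masking and top-k selection can be tried out without a fitted model. The sketch below is a minimal, self-contained variant on made-up data: the `ratings` and `pred` arrays and the name `top_k_unseen` are assumptions for illustration. Unlike the function above, it masks seen items with `-inf` rather than multiplying by the mask, a swapped-in variant that also keeps seen items out of the ranking when predicted scores are negative.

```python
import numpy as np

# Toy stand-ins (assumptions): 2 users x 4 items.
ratings = np.array([[5.0, 0.0, 0.0, 3.0],
                    [0.0, 4.0, 0.0, 0.0]])
pred = np.array([[4.8, 2.0, 3.5, 3.1],
                 [1.0, 4.2, 0.5, 2.5]])

def top_k_unseen(pred, ratings, k):
    """Rank only items the user has not interacted with, best score first."""
    unseen = ratings == 0
    # Seen items get -inf so they can never enter the top-k.
    scores = np.where(unseen, pred, -np.inf)
    return np.argsort(-scores, axis=1)[:, :k]

top2 = top_k_unseen(pred, ratings, k=2)
print(top2)
# [[2 1]
#  [3 0]]
```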
[6]:
# Top-10 new items per user, converted to a binary user-item matrix
new_items = recommended_items(mf, data_matrix, 10)
new_recs = [explode(new_items[u], len(df_pivot.columns)) for u in range(df_pivot.shape[0])]
new_df_pivot_db = pd.DataFrame(new_recs, columns=df_pivot.columns)

# Non-recommended entries become NaN before the item-based metrics are computed
mat = new_df_pivot_db.replace(0, np.nan).to_numpy()
df_popularity = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_popularity
[6]:
Metric                               Value       Reference
Aggregate Diversity                  0.999004    1
GINI index                           0.440891    0
Exposure Distribution Entropy        6.579432    -
Average Recommendation Popularity    278.321600  -
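To give some intuition for the item-based metrics in the table, the sketch below computes textbook versions of aggregate diversity, the Gini index, and exposure entropy on a small, made-up recommendation matrix. These are common definitions written out by hand; the exact formulas, normalisation, and logarithm base used by `recommender_bias_metrics` may differ, so treat the helper names `gini` and `entropy` and their outputs as assumptions rather than a reimplementation of the library.

```python
import numpy as np

# Toy binary recommendation matrix (assumption): 3 users x 4 items, 1 = recommended.
recs = np.array([[1, 1, 0, 0],
                 [1, 0, 1, 0],
                 [1, 1, 0, 0]])

# Exposure: how many times each item was recommended.
exposure = recs.sum(axis=0)

# Aggregate diversity: share of the catalogue recommended at least once.
aggregate_diversity = np.count_nonzero(exposure) / recs.shape[1]

def gini(counts):
    """Gini index of an exposure distribution (0 = perfectly even)."""
    x = np.sort(np.asarray(counts, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(((2 * ranks - n - 1) * x).sum() / (n * x.sum()))

def entropy(counts):
    """Shannon entropy (natural log) of an exposure distribution."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum())

print(aggregate_diversity, gini(exposure), entropy(exposure))
```

Under these definitions, an even spread of exposure across items gives a Gini index of 0 and the maximum entropy, which matches the reference values of 1 for Aggregate Diversity and 0 for the GINI index shown in the table.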

Pipeline implementation#

[7]:
from holisticai.pipeline import Pipeline

inprocessing_model = PopularityPropensityMF(K=40, beta=0.02, steps=100, verbose=1)

pipeline = Pipeline(
    steps=[
        ("bm_inprocessing", inprocessing_model),
    ]
)

pipeline.fit(data_matrix)

rankings = pipeline.predict(data_matrix, top_n=10)
mat = rankings.pivot(columns='Y', index='X', values='score').replace(np.nan, 0).to_numpy()
df = recommender_bias_metrics(mat_pred=mat > 0, metric_type='item_based')
df_pop_pipeline = df.copy()
df_pop_pipeline
[7]:
Metric                               Value       Reference
Aggregate Diversity                  1.000000    1
GINI index                           0.441953    0
Exposure Distribution Entropy        6.578349    -
Average Recommendation Popularity    275.996493  -
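The pivot step in the pipeline cell turns long-format rankings into a user-item matrix. The sketch below reproduces that step on a small, made-up DataFrame; the column names `X`, `Y`, and `score` follow the pivot call above, while the values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format rankings: one row per (user X, item Y, score).
rankings = pd.DataFrame({
    "X": [0, 0, 1, 1],
    "Y": [2, 3, 0, 2],
    "score": [0.9, 0.4, 0.7, 0.6],
})

# One row per user, one column per item that appears in the rankings.
mat = rankings.pivot(index="X", columns="Y", values="score").fillna(0).to_numpy()
recommended = mat > 0
print(recommended.astype(int))
# [[0 1 1]
#  [1 1 0]]
```

Note that `pivot` only creates columns for items that appear in at least one ranking (item 1 is missing in this toy example). Items that are never recommended are therefore absent from the matrix, which may influence coverage-type metrics such as Aggregate Diversity.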

Method: Two-sided fairness#

Traditional implementation for FairRec#

Now, we show the traditional implementation of the “Two-sided fairness” method, using FairRec.

[8]:
from holisticai.bias.mitigation import FairRec

fr = FairRec(rec_size=10, MMS_fraction=0.5)
fr.fit(data_matrix)
[8]:
[FairRec]
FairRec(rec_size=10, MMS_fraction=0.5)

Type: Bias Mitigation Inprocessing
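The `MMS_fraction` parameter relates to the producer-side guarantee of FairRec. As a rough sketch, assuming the formulation in the FairRec paper, each item is guaranteed a minimum number of exposures equal to the integer part of MMS_fraction × (users × rec_size) / items. Whether holisticai computes the bound in exactly this way is an assumption, and the helper `min_exposure_guarantee` and the toy allocation below are hypothetical.

```python
import numpy as np

def min_exposure_guarantee(num_users, num_items, rec_size, mms_fraction):
    """Minimum exposures per item under the assumed FairRec-style bound."""
    return int(mms_fraction * num_users * rec_size / num_items)

# Toy allocation (assumption): 4 users, 4 items, 2 recommendations per user.
allocation = {0: [0, 1], 1: [2, 3], 2: [0, 2], 3: [1, 3]}

# Count how often each item appears across all users' recommendation lists.
exposure = np.bincount(np.concatenate(list(allocation.values())), minlength=4)

guarantee = min_exposure_guarantee(num_users=4, num_items=4, rec_size=2, mms_fraction=0.5)
print(exposure, guarantee, bool((exposure >= guarantee).all()))
```

A check like this can be run on the fitted model's recommendations to see how many items actually reach the assumed minimum exposure.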
[9]:
# fr.recommendation maps each user key to that user's recommended item indices
recommendations = fr.recommendation
new_recs = [explode(recommendations[key], len(df_pivot.columns)) for key in recommendations.keys()]

new_df_pivot_db = pd.DataFrame(new_recs, columns=df_pivot.columns)

# Non-recommended entries become NaN before the item-based metrics are computed
mat = new_df_pivot_db.replace(0, np.nan).to_numpy()

df_tsf = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_tsf
[9]:
Metric                               Value       Reference
Aggregate Diversity                  1.000000    1
GINI index                           0.421428    0
Exposure Distribution Entropy        6.567894    -
Average Recommendation Popularity    317.154227  -

Pipeline implementation for FairRec#

[10]:
from holisticai.pipeline import Pipeline

inprocessing_model = FairRec(rec_size=10, MMS_fraction=0.5)

pipeline = Pipeline(
    steps=[
        ("bm_inprocessing", inprocessing_model),
    ]
)

pipeline.fit(data_matrix)

rankings = pipeline.predict(data_matrix, top_n=10)
mat = rankings.pivot(columns='Y', index='X', values='score').replace(np.nan, 0).to_numpy()
df_tsf_pipeline = recommender_bias_metrics(mat_pred=mat > 0, metric_type='item_based')
df_tsf_pipeline
[10]:
Metric                               Value       Reference
Aggregate Diversity                  1.000000    1
GINI index                           0.421428    0
Exposure Distribution Entropy        6.567894    -
Average Recommendation Popularity    317.154227  -