holisticai.robustness.attackers.RidgeGDPoisoner
- class holisticai.robustness.attackers.RidgeGDPoisoner(poison_proportion=0.2, num_inits=1, max_iter=15, initializer='inf_flip', eta=0.01, beta=0.05, sigma=0.9, eps=0.001, objective=0, opty=True)
RidgeGDPoisoner implements a gradient-based poisoning attack on regression models. It injects malicious data points into the training dataset to manipulate the model’s learned parameters and degrade its predictive performance. Unlike LinRegGDPoisoner, it includes regularization terms in the computations that generate the poisoned points, which are chosen to maximize their impact on the regression line.
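To make the idea of parameter manipulation concrete, the sketch below fits a ridge regression in closed form on clean data and again after appending hand-made poisoned rows. It does not use RidgeGDPoisoner: the helper `fit_ridge`, the penalty value, and the label-flipping rule are assumptions chosen for illustration, and the class itself optimizes its points by gradient steps rather than taking them as fixed flips.

```python
import numpy as np


def fit_ridge(X, y, lam=0.1):
    """Closed-form ridge fit with an unpenalised intercept (hypothetical helper)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    penalty = lam * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # do not regularise the intercept
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


rng = np.random.default_rng(0)
X_clean = rng.uniform(0.0, 1.0, size=(200, 2))
y_clean = X_clean @ np.array([0.7, 0.2]) + 0.05  # targets stay inside [0, 1]

# Hand-made poison (an assumption, not the library's procedure):
# copy 40 rows and push each target to the opposite end of [0, 1].
idx = rng.choice(len(X_clean), size=40, replace=False)
X_poison = X_clean[idx]
y_poison = 1.0 - np.round(y_clean[idx])

w_clean = fit_ridge(X_clean, y_clean)
w_poisoned = fit_ridge(
    np.vstack([X_clean, X_poison]),
    np.concatenate([y_clean, y_poison]),
)

# Distance between the two weight vectors: how far the poison moved the model.
shift = np.linalg.norm(w_poisoned - w_clean)
print("clean weights:   ", w_clean)
print("poisoned weights:", w_poisoned)
print("parameter shift: ", shift)
```

Even these crude flipped points move the fitted weights; a gradient-based attack searches for points that move them further.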
Parameters
- poison_proportion : float
The proportion of points to flip. Default is 0.2.
- num_inits : int
The number of initializations. Default is 1.
- max_iter : int
The maximum number of iterations. Default is 15.
- initializer : str
The initialization method. Default is ‘inf_flip’. Options are ‘inf_flip’, ‘adaptive’, ‘randflip’ and ‘randflipnobd’.
- eta : float
Gradient descent step size. Default is 0.01.
- beta : float
Decay rate for line search. Default is 0.05.
- sigma : float
Line search stop condition. Default is 0.9.
- eps : float
Poisoning stop condition. Default is 1e-3.
- objective : int
Objective function to optimize. Default is 0.
- opty : bool
Whether to optimize y. Default is True.
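One way to picture how a step size, a decay rate, and a stop threshold interact is a generic gradient-ascent loop with a shrinking step. The sketch below is a toy, not the library's implementation: the helper `ascend_with_decay`, the one-dimensional objective, and the exact rules for shrinking the step and stopping are assumptions, and sigma is left out because its precise use is not described here.

```python
def ascend_with_decay(objective, gradient, x0, eta=0.01, beta=0.05, eps=1e-3, max_iter=15):
    """Toy gradient ascent with a decaying step (hypothetical helper).

    Argument names mirror the class parameters for illustration only.
    """
    x = x0
    best = objective(x)
    for _ in range(max_iter):
        step = eta
        g = gradient(x)
        candidate = x + step * g
        # Shrink the step by `beta` while the move makes the objective worse.
        while objective(candidate) < best and step > 1e-12:
            step *= beta
            candidate = x + step * g
        new_val = objective(candidate)
        improvement = new_val - best
        if new_val >= best:
            x, best = candidate, new_val
        # Stop once an outer iteration gains less than `eps`.
        if improvement < eps:
            break
    return x, best


# Maximise -(x - 1)^2, whose maximum is at x = 1.
x_opt, val = ascend_with_decay(
    objective=lambda x: -(x - 1.0) ** 2,
    gradient=lambda x: -2.0 * (x - 1.0),
    x0=0.0,
    eta=0.1,
    eps=1e-6,
    max_iter=200,
)
print(x_opt, val)
```

A larger eta takes bolder steps, a smaller beta backs off more sharply when a step overshoots, and a smaller eps lets the loop run longer before it stops.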
- generate(X_train, y_train, categorical_mask=None, return_only_poisoned=False)
Parameters
- X_train : pandas.DataFrame
The training data features.
- y_train : pandas.Series
The training data labels.
- categorical_mask : numpy.ndarray, optional
A boolean mask indicating which columns in X_train are categorical.
- return_only_poisoned : bool, optional
If True, return only the poisoned data points. Otherwise, return the entire dataset including the poisoned points.
Returns
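- pandas.DataFrame
The features of the dataset including the poisoned points, or of the poisoned points alone if return_only_poisoned is True.
- pandas.Series
The labels of the dataset including the poisoned points, or of the poisoned points alone if return_only_poisoned is True.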
Notes
If return_only_poisoned is True, the original dataset is not modified and only the poisoned points are returned. Otherwise, the poisoned points are concatenated with the original dataset.
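A minimal usage sketch follows, based only on the signatures documented on this page. The helper names `make_toy_regression` and `poison_training_set`, the synthetic data, and the choice to scale features and targets to [0, 1] are assumptions for illustration. holisticai is imported lazily inside the wrapper so the data helper runs without the package installed.

```python
import numpy as np
import pandas as pd


def make_toy_regression(n_rows=200, n_features=3, seed=0):
    """Build a small synthetic regression set with features and target in [0, 1]."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame(
        rng.uniform(0.0, 1.0, size=(n_rows, n_features)),
        columns=[f"x{i}" for i in range(n_features)],
    )
    weights = np.linspace(0.2, 0.6, n_features)
    # Normalised weighted sum keeps the target inside [0, 1].
    y = pd.Series(X.to_numpy() @ weights / weights.sum(), name="y")
    return X, y


def poison_training_set(X_train, y_train, only_poisoned=False):
    """Thin wrapper around RidgeGDPoisoner.generate (hypothetical helper)."""
    # Lazy import: requires the holisticai package to be installed.
    from holisticai.robustness.attackers import RidgeGDPoisoner

    attacker = RidgeGDPoisoner(poison_proportion=0.2, max_iter=15, initializer="inf_flip")
    return attacker.generate(X_train, y_train, return_only_poisoned=only_poisoned)


X_train, y_train = make_toy_regression()

# With holisticai installed:
# X_all, y_all = poison_training_set(X_train, y_train)
# X_bad, y_bad = poison_training_set(X_train, y_train, only_poisoned=True)
```

Going by the parameter descriptions above, the first call should return the training set with the poisoned rows included, and the second only the crafted rows.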