holisticai.datasets.load_dataset

holisticai.datasets.load_dataset(dataset_name: Literal['adult', 'law_school', 'student_multiclass', 'student', 'lastfm', 'us_crime', 'us_crime_multiclass', 'clinical_records', 'german_credit', 'census_kdd', 'bank_marketing', 'compas_two_year_recid', 'compas_is_recid', 'diabetes', 'acsincome', 'acspublic', 'mw_medium', 'mw_small'], preprocessed: bool = True, protected_attribute: str | None = None, target: str | None = None) -> Dataset

Load a specific dataset based on the given dataset name.

Parameters

dataset_name: ProcessedDatasets

The name of the dataset to load. The supported names are listed under Processed Datasets.

preprocessed: (bool, Optional)

Whether to return the preprocessed X and y. Defaults to True.

protected_attribute: (str, Optional)

If set, the dataset is returned with the protected attribute encoded as the binary columns group_a and group_b. Otherwise, the dataset is returned with the protected attribute in a p_attrs column. Defaults to None.

Returns

Dataset: The loaded dataset.

Raises

NotImplementedError: If the specified dataset name is not supported.
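
Example

The following is a minimal sketch, not part of the library, of how a caller might check a dataset name before loading. The names `DatasetName`, `SUPPORTED_DATASETS`, and `load_checked` are hypothetical helpers introduced for illustration; the list of names is copied from the signature above, and only `holisticai.datasets.load_dataset` itself is taken from this page.

```python
from typing import Literal, get_args

# Hypothetical alias: the names are copied from the load_dataset signature.
DatasetName = Literal[
    'adult', 'law_school', 'student_multiclass', 'student', 'lastfm',
    'us_crime', 'us_crime_multiclass', 'clinical_records', 'german_credit',
    'census_kdd', 'bank_marketing', 'compas_two_year_recid',
    'compas_is_recid', 'diabetes', 'acsincome', 'acspublic',
    'mw_medium', 'mw_small',
]

# Tuple of the supported names, extracted from the Literal type.
SUPPORTED_DATASETS = get_args(DatasetName)


def load_checked(dataset_name, **kwargs):
    """Hypothetical wrapper: validate the name, then call load_dataset.

    Extra keyword arguments (preprocessed, protected_attribute, target)
    are passed through unchanged.
    """
    if dataset_name not in SUPPORTED_DATASETS:
        # Mirrors the exception type documented under "Raises".
        raise NotImplementedError(f"Unsupported dataset: {dataset_name!r}")
    # Imported lazily so the name check works without holisticai installed.
    from holisticai.datasets import load_dataset
    return load_dataset(dataset_name, **kwargs)
```

With holisticai installed, `load_checked("adult")` would be expected to return the same Dataset as calling `load_dataset("adult")` directly; an unknown name fails before any loading is attempted.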