beta_rec.data package

beta_rec.data.auxiliary_data module

beta_rec.data.base_data module

class beta_rec.data.base_data.BaseData(split_dataset, intersect=True, binarize=True, bin_thld=0.0, normalize=False)[source]

Bases: object

A plain BaseData object modeling general recommendation data. Re-indexes all users and items from the raw dataset.

Parameters:
  • split_dataset (train,valid,test) – the split dataset, a tuple consisting of the training set (DataFrame), the validation set or a list of validation sets (DataFrame), and the test set or a list of test sets (DataFrame).
  • intersect (bool, optional) – remove users and items from the test/valid sets that do not exist in the train set. If the model is able to predict for new users and new items, this can be False. (default: True).
  • binarize (bool, optional) – binarize the rating column of the train set to 0 or 1, i.e. implicit feedback. (default: True).
  • bin_thld (float, optional) – the binarization threshold. (default: 0.0).
  • normalize (bool, optional) – normalize the rating column of the train set into [0, 1], i.e. explicit feedback. (default: False).
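
The effect of the intersect, binarize and normalize switches can be illustrated without the library. The following is a minimal sketch, not beta_rec's implementation: it assumes rows are (user, item, rating) tuples, that binarization maps ratings strictly above bin_thld to 1 and everything else to 0, and that normalization is min-max scaling. All helper names are hypothetical.

```python
def intersect(train, other):
    """Keep only rows of `other` whose user and item both occur in `train`."""
    users = {u for u, _, _ in train}
    items = {i for _, i, _ in train}
    return [(u, i, r) for u, i, r in other if u in users and i in items]


def binarize(rows, bin_thld=0.0):
    """Assumed rule: a rating above bin_thld becomes 1, anything else 0."""
    return [(u, i, 1 if r > bin_thld else 0) for u, i, r in rows]


def normalize(rows):
    """Min-max scale ratings into [0, 1] (assumed normalization scheme)."""
    ratings = [r for _, _, r in rows]
    lo, hi = min(ratings), max(ratings)
    span = (hi - lo) or 1.0  # avoid division by zero when all ratings are equal
    return [(u, i, (r - lo) / span) for u, i, r in rows]


train = [("u1", "i1", 5.0), ("u1", "i2", 0.0), ("u2", "i1", 3.0)]
test = [("u2", "i2", 4.0), ("u3", "i1", 2.0), ("u1", "i9", 1.0)]

filtered_test = intersect(train, test)  # drops unseen user u3 and unseen item i9
binary_train = binarize(train)          # implicit feedback view
scaled_train = normalize(train)         # explicit feedback view
```

In this toy data only the (u2, i2) test row survives the intersect step, because u3 and i9 never appear in the training rows.
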
create_adj_mat()[source]

Create the adjacency matrix from the user-item interaction matrix.
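
For graph-based recommenders such as NGCF, the adjacency matrix is commonly built by treating users and items as nodes of a single bipartite graph, which gives the block form A = [[0, R], [R^T, 0]] for an interaction matrix R. The sketch below assumes that layout and uses dense nested lists for readability; the library itself may well use sparse matrices, and build_bipartite_adj is a hypothetical name.

```python
def build_bipartite_adj(interactions, n_users, n_items):
    """Return the (n_users + n_items) square adjacency matrix as nested lists."""
    n = n_users + n_items
    adj = [[0] * n for _ in range(n)]
    for user, item in interactions:
        item_node = n_users + item  # item nodes are offset past the user nodes
        adj[user][item_node] = 1    # upper-right block: R
        adj[item_node][user] = 1    # lower-left block: R transposed
    return adj


# Two users, three items: user 0 interacted with items 0 and 2, user 1 with item 1.
adj = build_bipartite_adj([(0, 0), (0, 2), (1, 1)], n_users=2, n_items=3)
```

The result is symmetric, and the user-user and item-item blocks stay zero because only user-item interactions are observed.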

create_constraint_mat()[source]

Create the adjacency matrix from the user-item interaction matrix.

create_sgl_mat(config)[source]

Create the adjacency matrix from the user-item interaction matrix.

get_adj_mat(config)[source]

Get the adjacency matrix; if it has not been stored previously, call the function that creates it.

This method is for the NGCF model.

Returns: Different types of adjacency matrix.
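
The "get, or create if not stored" behaviour described here is a standard cache-on-miss pattern. The sketch below shows that pattern only: the pickle format, the file name and the get_or_create helper are assumptions made for illustration, not the storage scheme beta_rec uses.

```python
import os
import pickle
import tempfile


def get_or_create(path, create_fn):
    """Load a pickled object from `path`, creating and storing it on a miss."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    obj = create_fn()
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj


calls = []


def expensive_create():
    calls.append(1)  # record every real computation
    return [[0, 1], [1, 0]]


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "adj_mat.pkl")  # hypothetical file name
    first = get_or_create(cache_path, expensive_create)   # computes and stores
    second = get_or_create(cache_path, expensive_create)  # served from the file
```

The creation function runs once; the second call reads the stored copy.
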
get_constraint_mat(config)[source]

Get the adjacency matrix; if it has not been stored previously, call the function that creates it.

This method is for the NGCF model.

Returns: Different types of adjacency matrix.
instance_bce_loader(batch_size, device, num_negative)[source]

Instantiate a training DataLoader that includes ratings.

instance_bpr_loader(batch_size, device)[source]

Instantiate a pairwise DataLoader for training.

Sample ONE negative item for each user-item pair, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss. TODO: implement item popularity-biased sampling.
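
Sampling one negative per observed pair can be sketched as follows. This is an illustration under stated assumptions, not the library's code: negatives are drawn uniformly (popularity-biased sampling is listed as a TODO), interactions are (user, item) index pairs, and sample_bpr_triples is a hypothetical name.

```python
import random


def sample_bpr_triples(interactions, n_items, seed=0):
    """Pair every observed (user, item) with one item the user has not seen."""
    rng = random.Random(seed)
    seen = {}
    for user, item in interactions:
        seen.setdefault(user, set()).add(item)

    triples = []
    for user, pos_item in interactions:
        # Rejection sampling; assumes each user has at least one unseen item.
        neg_item = rng.randrange(n_items)
        while neg_item in seen[user]:
            neg_item = rng.randrange(n_items)
        triples.append((user, pos_item, neg_item))
    rng.shuffle(triples)  # mix the order before batching
    return triples


interactions = [(0, 0), (0, 1), (1, 2)]
triples = sample_bpr_triples(interactions, n_items=4)
```

Each resulting <user, pos_item, neg_item> triple matches the shape that PairwiseNegativeDataset wraps.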

instance_mul_neg_loader(batch_size, device, num_negative)[source]

Instantiate a pairwise DataLoader for training.

Sample multiple negative items for each user-item pair, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss.
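
One way to read this description is that every positive pair is accompanied by num_negative unobserved items, and that positives and negatives are then shuffled together with 1/0 labels so a binary cross-entropy loss applies. The sketch below follows that reading; the row layout, the uniform sampling and the function name are assumptions, and the tensors the library actually emits may be organised differently.

```python
import random


def sample_labeled_instances(interactions, n_items, num_negative, seed=0):
    """Return shuffled (user, item, label) rows: 1.0 for observed, 0.0 for sampled."""
    rng = random.Random(seed)
    seen = {}
    for user, item in interactions:
        seen.setdefault(user, set()).add(item)

    rows = []
    for user, pos_item in interactions:
        rows.append((user, pos_item, 1.0))
        candidates = [i for i in range(n_items) if i not in seen[user]]
        # Sample without replacement, capped at the number of available negatives.
        k = min(num_negative, len(candidates))
        for neg_item in rng.sample(candidates, k):
            rows.append((user, neg_item, 0.0))
    rng.shuffle(rows)
    return rows


interactions = [(0, 0), (1, 2)]
rows = sample_labeled_instances(interactions, n_items=4, num_negative=2)
```

With two positives and two negatives each, the toy input yields six labeled rows.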

instance_vae_loader(device)[source]

Instantiate a training DataLoader that includes ratings.

randint_choice(high, size=None, replace=True, p=None, exclusion=None)[source]

Return random integers from 0 (inclusive) to high (exclusive).
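
The parameter names suggest behaviour close to NumPy's random choice with an extra exclusion list. A standard-library sketch of that idea is given below; it is an assumption about the semantics rather than the library's implementation, and randint_choice_sketch and its rng argument are hypothetical.

```python
import random


def randint_choice_sketch(high, size=None, replace=True, p=None,
                          exclusion=None, rng=random):
    """Draw integers from [0, high), never returning values in `exclusion`."""
    excluded = set(exclusion or ())
    population = [i for i in range(high) if i not in excluded]
    # Keep only the weights of the values that are still eligible.
    weights = [p[i] for i in population] if p is not None else None

    count = 1 if size is None else size
    draws = []
    for _ in range(count):
        pick = rng.choices(population, weights=weights, k=1)[0]
        draws.append(pick)
        if not replace:
            # Remove the pick so it cannot be drawn again.
            idx = population.index(pick)
            population.pop(idx)
            if weights is not None:
                weights.pop(idx)
    return draws[0] if size is None else draws


rng = random.Random(0)
no_repeat = randint_choice_sketch(5, size=3, replace=False, exclusion=[0, 1], rng=rng)
single = randint_choice_sketch(3, exclusion=[0, 2], rng=rng)
weighted = randint_choice_sketch(3, size=4, p=[0.0, 1.0, 0.0], rng=rng)
```

Excluding already-seen items in this way is what makes such a helper convenient for negative sampling.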

beta_rec.data.data_loaders module

class beta_rec.data.data_loaders.PairwiseNegativeDataset(user_tensor, pos_item_tensor, neg_item_tensor)[source]

Bases: torch.utils.data.dataset.Dataset

Wrapper that converts <user, pos_item, neg_item> tensors into a PyTorch Dataset.

class beta_rec.data.data_loaders.RatingDataset(user_tensor, item_tensor, target_tensor)[source]

Bases: torch.utils.data.dataset.Dataset

Wrapper that converts <user, item, rating> tensors into a PyTorch Dataset.
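
A wrapper of this kind typically implements the map-style dataset protocol: __len__ plus an index-based __getitem__ over aligned tensors. The following is a sketch of that protocol, not the library's source. The class name is hypothetical, plain lists stand in for tensors, and the fallback base class exists only so the example also runs where PyTorch is not installed.

```python
try:
    from torch.utils.data import Dataset  # real base class when PyTorch is available
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


class RatingDatasetSketch(Dataset):
    """Map-style dataset over three aligned sequences."""

    def __init__(self, user_tensor, item_tensor, target_tensor):
        self.user_tensor = user_tensor
        self.item_tensor = item_tensor
        self.target_tensor = target_tensor

    def __getitem__(self, index):
        # One training example: the index-th entry of each sequence.
        return (
            self.user_tensor[index],
            self.item_tensor[index],
            self.target_tensor[index],
        )

    def __len__(self):
        return len(self.user_tensor)


dataset = RatingDatasetSketch([0, 0, 1], [10, 11, 12], [1.0, 0.0, 1.0])
example = dataset[1]
```

An object with these two methods can be passed to torch.utils.data.DataLoader, which handles batching and shuffling.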

beta_rec.data.deprecated_data module

beta_rec.data.deprecated_data_base module

class beta_rec.data.deprecated_data_base.DataLoaderBase(ratings)[source]

Bases: object

Construct dataset for NCF.

create_adj_mat()[source]

Create the adjacency matrix from the user-item interaction matrix.

create_graph_embeddings(config)[source]

Create graph embeddings from the user and item hypergraph.

evaluate_data

Create evaluation data.

get_adj_mat(config)[source]

Get the adjacency matrix; if it has not been stored previously, call the function that creates it.

This method is for the NGCF model.

Returns: Different types of adjacency matrix.
get_graph_embeddings(config)[source]

Get the graph embedding; if it has not been stored previously, call the function that creates it.

This method is for the LCFN model.

Returns: eigsh of the graph matrix.
instance_a_train_loader(num_negatives, batch_size)[source]

Instantiate a train loader for one training epoch.

pairwise_negative_train_loader(batch_size, device)[source]

Instantiate a pairwise DataLoader for training.

Sample ONE negative item for each user-item pair, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss. TODO: implement item popularity-biased sampling.

uniform_negative_train_loader(num_negatives, batch_size, device)[source]

Instantiate a DataLoader for training.

Sample ‘num_negatives’ negative items for each user, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss. TODO: implement item popularity-biased sampling.

class beta_rec.data.deprecated_data_base.PairwiseNegativeDataset(user_tensor, pos_item_tensor, neg_item_tensor)[source]

Bases: torch.utils.data.dataset.Dataset

Wrapper that converts <user, pos_item, neg_item> tensors into a PyTorch Dataset.

class beta_rec.data.deprecated_data_base.RatingNegativeDataset(user_tensor, item_tensor, rating_tensor)[source]

Bases: torch.utils.data.dataset.Dataset

RatingNegativeDataset.

Wrapper that converts <user, item, rating> tensors into a PyTorch Dataset, which contains negative items with a rating of 0.0.

class beta_rec.data.deprecated_data_base.UserItemRatingDataset(user_tensor, item_tensor, target_tensor)[source]

Bases: torch.utils.data.dataset.Dataset

Wrapper that converts <user, item, rating> tensors into a PyTorch Dataset.

beta_rec.data.grocery_data module

Module contents

Data Module.

class beta_rec.data.BaseData(split_dataset, intersect=True, binarize=True, bin_thld=0.0, normalize=False)[source]

Bases: object

A plain BaseData object modeling general recommendation data. Re-indexes all users and items from the raw dataset.

Parameters:
  • split_dataset (train,valid,test) – the split dataset, a tuple consisting of the training set (DataFrame), the validation set or a list of validation sets (DataFrame), and the test set or a list of test sets (DataFrame).
  • intersect (bool, optional) – remove users and items from the test/valid sets that do not exist in the train set. If the model is able to predict for new users and new items, this can be False. (default: True).
  • binarize (bool, optional) – binarize the rating column of the train set to 0 or 1, i.e. implicit feedback. (default: True).
  • bin_thld (float, optional) – the binarization threshold. (default: 0.0).
  • normalize (bool, optional) – normalize the rating column of the train set into [0, 1], i.e. explicit feedback. (default: False).
create_adj_mat()[source]

Create the adjacency matrix from the user-item interaction matrix.

create_constraint_mat()[source]

Create the adjacency matrix from the user-item interaction matrix.

create_sgl_mat(config)[source]

Create the adjacency matrix from the user-item interaction matrix.

get_adj_mat(config)[source]

Get the adjacency matrix; if it has not been stored previously, call the function that creates it.

This method is for the NGCF model.

Returns: Different types of adjacency matrix.
get_constraint_mat(config)[source]

Get the adjacency matrix; if it has not been stored previously, call the function that creates it.

This method is for the NGCF model.

Returns: Different types of adjacency matrix.
instance_bce_loader(batch_size, device, num_negative)[source]

Instantiate a training DataLoader that includes ratings.

instance_bpr_loader(batch_size, device)[source]

Instantiate a pairwise DataLoader for training.

Sample ONE negative item for each user-item pair, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss. TODO: implement item popularity-biased sampling.

instance_mul_neg_loader(batch_size, device, num_negative)[source]

Instantiate a pairwise DataLoader for training.

Sample multiple negative items for each user-item pair, and shuffle them with the positive items. A batch of data in this DataLoader is suitable for a binary cross-entropy loss.

instance_vae_loader(device)[source]

Instantiate a training DataLoader that includes ratings.

randint_choice(high, size=None, replace=True, p=None, exclusion=None)[source]

Return random integers from 0 (inclusive) to high (exclusive).