easy_rec.python.core

easy_rec.python.core.learning_schedules

Library of common learning rate schedules.

easy_rec.python.core.learning_schedules.exponential_decay_with_burnin(global_step, learning_rate_base, learning_rate_decay_steps, learning_rate_decay_factor, burnin_learning_rate=0.0, burnin_steps=0, min_learning_rate=0.0, staircase=True)[source]

Exponential decay schedule with burn-in period.

In this schedule, the learning rate is held fixed at burnin_learning_rate for the first burnin_steps steps before transitioning to a regular exponential decay schedule.

Parameters:
  • global_step – int tensor representing global step.

  • learning_rate_base – base learning rate.

  • learning_rate_decay_steps – number of steps between successive decays of the learning rate. Note that this count includes the burn-in steps.

  • learning_rate_decay_factor – multiplicative factor by which to decay learning rate.

  • burnin_learning_rate – initial learning rate during burn-in period. If 0.0 (which is the default), then the burn-in learning rate is simply set to learning_rate_base.

  • burnin_steps – number of steps for which the burn-in learning rate is used.

  • min_learning_rate – the minimum learning rate.

  • staircase – whether to use staircase decay.

Returns:

a (scalar) float tensor representing the learning rate.
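
A minimal usage sketch (TF1-style graph mode; the hyperparameter values below are illustrative, not recommendations):

    import tensorflow as tf
    from easy_rec.python.core import learning_schedules

    global_step = tf.compat.v1.train.get_or_create_global_step()
    lr = learning_schedules.exponential_decay_with_burnin(
        global_step,
        learning_rate_base=1e-3,
        learning_rate_decay_steps=10000,
        learning_rate_decay_factor=0.95,
        burnin_learning_rate=1e-4,  # held for the first burnin_steps steps
        burnin_steps=2000,
        min_learning_rate=1e-6)
    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=lr)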

easy_rec.python.core.learning_schedules.cosine_decay_with_warmup(global_step, learning_rate_base, total_steps, warmup_learning_rate=0.0, warmup_steps=0, hold_base_rate_steps=0)[source]

Cosine decay schedule with warm-up period.

Cosine annealing learning rate as described in:

Loshchilov and Hutter, SGDR: Stochastic Gradient Descent with Warm Restarts. ICLR 2017. https://arxiv.org/abs/1608.03983

In this schedule, the learning rate grows linearly from warmup_learning_rate to learning_rate_base over the first warmup_steps steps, then transitions to a cosine decay schedule.

Parameters:
  • global_step – int64 (scalar) tensor representing global step.

  • learning_rate_base – base learning rate.

  • total_steps – total number of training steps.

  • warmup_learning_rate – initial learning rate for the warm-up phase.

  • warmup_steps – number of warmup steps.

  • hold_base_rate_steps – Optional number of steps to hold base learning rate before decaying.

Returns:

a (scalar) float tensor representing the learning rate.

Raises:

ValueError – if warmup_learning_rate is larger than learning_rate_base, or if warmup_steps is larger than total_steps.
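
For intuition, a pure-Python sketch of the schedule's shape. It mirrors the common warmup-plus-cosine formulation (linear ramp, optional hold, then cosine annealing); treat it as an illustration, not the exact implementation:

    import math

    def cosine_with_warmup(step, lr_base, total_steps,
                           warmup_lr=0.0, warmup_steps=0, hold_steps=0):
      if step < warmup_steps:
        # linear ramp from warmup_lr up to lr_base
        slope = (lr_base - warmup_lr) / warmup_steps
        return warmup_lr + slope * step
      if step < warmup_steps + hold_steps:
        return lr_base  # hold the base rate before decaying
      progress = (step - warmup_steps - hold_steps) / float(
          total_steps - warmup_steps - hold_steps)
      return 0.5 * lr_base * (1.0 + math.cos(math.pi * progress))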

easy_rec.python.core.learning_schedules.manual_stepping(global_step, boundaries, rates, warmup=False)[source]

Manually stepped learning rate schedule.

This function provides fine-grained control over learning rates. One must specify a sequence of learning rates and a set of integer steps at which the current learning rate transitions to the next. For example, if boundaries = [5, 10] and rates = [.1, .01, .001], then the learning rate returned by this function is .1 for global_step = 0,…,4, .01 for global_step = 5,…,9, and .001 for global_step = 10 and onward.

Parameters:
  • global_step – int64 (scalar) tensor representing global step.

  • boundaries – a list of global steps at which to switch learning rates. This list is assumed to consist of increasing positive integers.

  • rates – a list of (float) learning rates corresponding to intervals between the boundaries. The length of this list must be exactly len(boundaries) + 1.

  • warmup – Whether to linearly interpolate learning rate for steps in [0, boundaries[0]].

Returns:

a (scalar) float tensor representing the learning rate.

Raises:

ValueError – if one of the following checks fails:
  1. boundaries is a strictly increasing list of positive integers;
  2. len(rates) == len(boundaries) + 1;
  3. boundaries[0] != 0.
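
A usage sketch matching the example above:

    import tensorflow as tf
    from easy_rec.python.core import learning_schedules

    global_step = tf.compat.v1.train.get_or_create_global_step()
    # 0.1 for steps 0..4, 0.01 for steps 5..9, 0.001 from step 10 onward;
    # with warmup=True the rate would instead ramp linearly over [0, 5)
    lr = learning_schedules.manual_stepping(
        global_step, boundaries=[5, 10], rates=[0.1, 0.01, 0.001])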

easy_rec.python.core.learning_schedules.transformer_policy(global_step, learning_rate, d_model, warmup_steps, step_scaling_rate=1.0, max_lr=None, coefficient=1.0, dtype=tf.float32)[source]

Transformer’s learning rate schedule.

Transformer’s learning rate policy from https://arxiv.org/pdf/1706.03762.pdf with a hat (max_lr); also called the “noam” learning rate decay scheme.

Parameters:
  • global_step – global step TensorFlow tensor.

  • learning_rate (float) – initial learning rate to use.

  • d_model (int) – model dimensionality.

  • warmup_steps (int) – number of warm-up steps.

  • step_scaling_rate (float) – scale factor applied to the step count before computing the schedule.

  • max_lr (float) – maximal learning rate, i.e. the hat.

  • coefficient (float) – optimizer adjustment. Recommended 0.002 if using “Adam” else 1.0.

  • dtype – dtype for this policy.

Returns:

learning rate at step global_step.
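
For reference, the core "noam" shape from the paper is sketched below in pure Python. How this function folds learning_rate and step_scaling_rate into the formula is an implementation detail, so take this only as an approximation of the schedule:

    import math

    def noam_lr(step, d_model, warmup_steps, coefficient=1.0, max_lr=None):
      # lr = coefficient * d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)
      step = max(step, 1)  # avoid dividing by zero at step 0
      lr = coefficient * d_model ** -0.5 * min(
          step ** -0.5, step * warmup_steps ** -1.5)
      if max_lr is not None:
        lr = min(lr, max_lr)  # apply the hat
      return lr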

easy_rec.python.core.metrics

easy_rec.python.core.metrics.max_f1(label, predictions)[source]

Calculate the largest F1 score across different decision thresholds.

Parameters:
  • label – Ground truth (correct) target values.

  • predictions – Estimated targets as returned by a model.
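
A rough NumPy sketch of the idea (the actual function is a TensorFlow metric op; the threshold sweep below is illustrative):

    import numpy as np

    def max_f1_sketch(label, predictions, num_thresholds=200):
      best = 0.0
      for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = predictions >= t
        tp = np.sum(pred & (label == 1))
        precision = tp / max(np.sum(pred), 1)
        recall = tp / max(np.sum(label == 1), 1)
        if precision + recall > 0:
          best = max(best, 2 * precision * recall / (precision + recall))
      return best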

easy_rec.python.core.metrics.fast_auc(labels, predictions, name, num_thresholds=100000.0)[source]
easy_rec.python.core.metrics.gauc(labels, predictions, uids, reduction='mean')[source]

Computes the AUC for each user separately, then aggregates across users.

Parameters:
  • labels – A Tensor whose shape matches predictions. Will be cast to bool.

  • predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].

  • uids – user ids; an int or string Tensor whose shape matches predictions.

  • reduction – reduction method for the AUCs of different users:
      * “mean”: simple mean over users
      * “mean_by_sample_num”: mean weighted by each user’s sample count
      * “mean_by_positive_num”: mean weighted by each user’s positive sample count
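
A NumPy sketch of the computation (illustrative only; the real implementation is a TensorFlow metric). Users whose labels are all positive or all negative are skipped, since AUC is undefined for them:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def gauc_sketch(labels, predictions, uids, reduction='mean'):
      aucs, weights = [], []
      for uid in np.unique(uids):
        mask = uids == uid
        y, p = labels[mask], predictions[mask]
        if y.min() == y.max():
          continue  # AUC undefined for single-class users
        aucs.append(roc_auc_score(y, p))
        if reduction == 'mean_by_sample_num':
          weights.append(mask.sum())
        elif reduction == 'mean_by_positive_num':
          weights.append(y.sum())
        else:  # 'mean'
          weights.append(1.0)
      return np.average(aucs, weights=weights)

session_auc below performs the same computation with session_ids in place of uids.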

easy_rec.python.core.metrics.session_auc(labels, predictions, session_ids, reduction='mean')[source]

Computes the AUC for each session separately, then aggregates across sessions.

Parameters:
  • labels – A Tensor whose shape matches predictions. Will be cast to bool.

  • predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].

  • session_ids – session ids; an int or string Tensor whose shape matches predictions.

  • reduction – reduction method for the AUCs of different sessions:
      * “mean”: simple mean over sessions
      * “mean_by_sample_num”: mean weighted by each session’s sample count
      * “mean_by_positive_num”: mean weighted by each session’s positive sample count

easy_rec.python.core.metrics.metric_learning_recall_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]

Computes the recall_at_k metric for metric learning.

Parameters:
  • k – an int scalar, or a tuple of ints

  • embeddings – the output of the last hidden layer, a tf.float32 Tensor with shape [batch_size, embedding_size]

  • labels – a Tensor with shape [batch_size]

  • session_ids – session ids, a Tensor with shape [batch_size]

  • embed_normed – whether the input embeddings are already l2-normalized
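
A NumPy sketch of the common recall@k definition for metric learning: a query counts as a hit if any of its k nearest neighbors (excluding itself) shares its label. The exact treatment of session_ids is omitted here:

    import numpy as np

    def recall_at_k_sketch(k, embeddings, labels, embed_normed=False):
      emb = embeddings if embed_normed else (
          embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True))
      sim = emb @ emb.T                # cosine similarity matrix
      np.fill_diagonal(sim, -np.inf)   # exclude self-matches
      topk = np.argsort(-sim, axis=1)[:, :k]
      hits = (labels[topk] == labels[:, None]).any(axis=1)
      return hits.mean()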

easy_rec.python.core.metrics.metric_learning_average_precision_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]

easy_rec.python.core.sampler

class easy_rec.python.core.sampler.BaseSampler(fields, num_sample, num_eval_sample=None)[source]

Bases: object

__init__(fields, num_sample, num_eval_sample=None)[source]
set_eval_num_sample()[source]
classmethod instance(*args, **kwargs)[source]
class easy_rec.python.core.sampler.NegativeSampler(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]

Bases: BaseSampler

Negative Sampler.

Weighted random sampling of items not in the current batch.

Parameters:
  • data_path – item feature data path. id:int64 | weight:float | attrs:string.

  • fields – item input fields.

  • num_sample – number of negative samples.

  • batch_size – mini-batch size.

  • attr_delimiter – delimiter of feature string.

  • num_eval_sample – number of negative samples for evaluator.

__init__(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]
get(ids)[source]

Sampling method.

Parameters:

ids – item id tensor.

Returns:

Negative sampled feature dict.
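
A hedged usage sketch. The path and field names are placeholders, and in practice the sampler is usually constructed from the dataset config via build(data_config) below rather than instantiated directly:

    import tensorflow as tf
    from easy_rec.python.core.sampler import NegativeSampler

    # Each row of the item feature table is: id:int64 | weight:float | attrs:string,
    # where attrs packs the item features separated by attr_delimiter.
    sampler = NegativeSampler(
        data_path='item_feature_table',          # placeholder path
        fields=['item_id', 'cate_id', 'brand'],  # placeholder item fields
        num_sample=1024,
        batch_size=256)
    item_ids = tf.constant([1001, 1002, 1003], dtype=tf.int64)  # ids in batch
    neg_fea_dict = sampler.get(item_ids)  # dict of negative item features

The remaining samplers below follow the same pattern; the edge-based variants take get(src_ids, dst_ids) with both user and item ids.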

class easy_rec.python.core.sampler.NegativeSamplerInMemory(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]

Bases: BaseSampler

Negative Sampler.

Weighted random sampling of items not in the current batch.

Parameters:
  • data_path – item feature data path. id:int64 | weight:float | attrs:string.

  • fields – item input fields.

  • num_sample – number of negative samples.

  • batch_size – mini-batch size.

  • attr_delimiter – delimiter of feature string.

  • num_eval_sample – number of negative samples for evaluator.

__init__(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]
get(ids)[source]

Sampling method.

Parameters:

ids – item id tensor.

Returns:

Negative sampled feature dict.

class easy_rec.python.core.sampler.NegativeSamplerV2(user_data_path, item_data_path, edge_data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]

Bases: BaseSampler

Negative Sampler V2.

Weighted random sampling of items that have no positive edge with the user.

Parameters:
  • user_data_path – user node data path. id:int64 | weight:float.

  • item_data_path – item feature data path. id:int64 | weight:float | attrs:string.

  • edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float

  • fields – item input fields.

  • num_sample – number of negative samples.

  • batch_size – mini-batch size.

  • attr_delimiter – delimiter of feature string.

  • num_eval_sample – number of negative samples for evaluator.

__init__(user_data_path, item_data_path, edge_data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]
get(src_ids, dst_ids)[source]

Sampling method.

Parameters:
  • src_ids – user id tensor.

  • dst_ids – item id tensor.

Returns:

Negative sampled feature dict.

class easy_rec.python.core.sampler.HardNegativeSampler(user_data_path, item_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]

Bases: BaseSampler

HardNegativeSampler.

Weighted random sampling of items not in the batch as negative samples, plus sampling of destination nodes in hard_neg_edge as hard negative samples.

Parameters:
  • user_data_path – user node data path. id:int64 | weight:float.

  • item_data_path – item feature data path. id:int64 | weight:float | attrs:string.

  • hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float

  • fields – item input fields.

  • num_sample – number of negative samples.

  • num_hard_sample – maximum number of hard negative samples.

  • batch_size – mini-batch size.

  • attr_delimiter – delimiter of feature string.

  • num_eval_sample – number of negative samples for evaluator.

__init__(user_data_path, item_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]
get(src_ids, dst_ids)[source]

Sampling method.

Parameters:
  • src_ids – user id tensor.

  • dst_ids – item id tensor.

Returns:

Sampled feature dict. The first batch_size rows are negative samples; the remainder are hard negative samples.

class easy_rec.python.core.sampler.HardNegativeSamplerV2(user_data_path, item_data_path, edge_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]

Bases: BaseSampler

HardNegativeSamplerV2.

Weighted random sampling of items that have no positive edge with the user, plus sampling of destination nodes in hard_neg_edge as hard negative samples.

Parameters:
  • user_data_path – user node data path. id:int64 | weight:float.

  • item_data_path – item feature data path. id:int64 | weight:float | attrs:string.

  • edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float

  • hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float

  • fields – item input fields.

  • num_sample – number of negative samples.

  • num_hard_sample – maximum number of hard negative samples.

  • batch_size – mini-batch size.

  • attr_delimiter – delimiter of feature string.

  • num_eval_sample – number of negative samples for evaluator.

__init__(user_data_path, item_data_path, edge_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]
get(src_ids, dst_ids)[source]

Sampling method.

Parameters:
  • src_ids – user id tensor.

  • dst_ids – item id tensor.

Returns:

Sampled feature dict. The first batch_size rows are negative samples; the remainder are hard negative samples.

easy_rec.python.core.sampler.build(data_config)[source]
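
build constructs one of the sampler classes above from the sampler settings in the dataset config. A hedged sketch (the exact proto fields consumed are an implementation detail):

    from easy_rec.python.core import sampler as sampler_lib

    # data_config: the parsed dataset config proto from the pipeline config;
    # its sampler block (e.g. a negative_sampler section) selects which
    # sampler class is instantiated and with what arguments.
    neg_sampler = sampler_lib.build(data_config)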