easy_rec.python.core¶
easy_rec.python.core.learning_schedules¶
Library of common learning rate schedules.
- easy_rec.python.core.learning_schedules.exponential_decay_with_burnin(global_step, learning_rate_base, learning_rate_decay_steps, learning_rate_decay_factor, burnin_learning_rate=0.0, burnin_steps=0, min_learning_rate=0.0, staircase=True)[source]¶
Exponential decay schedule with burn-in period.
In this schedule, the learning rate is held at burnin_learning_rate for a fixed number of steps before transitioning to a regular exponential decay schedule.
- Parameters:
global_step – int tensor representing global step.
learning_rate_base – base learning rate.
learning_rate_decay_steps – number of steps between successive decays of the learning rate. Note that this count includes the burn-in steps.
learning_rate_decay_factor – multiplicative factor by which to decay learning rate.
burnin_learning_rate – initial learning rate during burn-in period. If 0.0 (which is the default), then the burn-in learning rate is simply set to learning_rate_base.
burnin_steps – number of steps to use the burn-in learning rate.
min_learning_rate – the minimum learning rate.
staircase – whether to use staircase decay.
- Returns:
a (scalar) float tensor representing the learning rate.
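A minimal usage sketch in TF1-style graph mode (the hyperparameter values below are illustrative, not recommendations):

```python
import tensorflow as tf

from easy_rec.python.core.learning_schedules import exponential_decay_with_burnin

global_step = tf.train.get_or_create_global_step()
# hold the lr at 1e-3 for the first 2000 steps, then decay 0.01 by a
# factor of 0.95 every 10000 steps, never dropping below 1e-6
lr = exponential_decay_with_burnin(
    global_step,
    learning_rate_base=0.01,
    learning_rate_decay_steps=10000,
    learning_rate_decay_factor=0.95,
    burnin_learning_rate=0.001,
    burnin_steps=2000,
    min_learning_rate=1e-6,
    staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
```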
- easy_rec.python.core.learning_schedules.cosine_decay_with_warmup(global_step, learning_rate_base, total_steps, warmup_learning_rate=0.0, warmup_steps=0, hold_base_rate_steps=0)[source]¶
Cosine decay schedule with warm up period.
- Cosine annealing learning rate as described in:
Loshchilov and Hutter, SGDR: Stochastic Gradient Descent with Warm Restarts. ICLR 2017. https://arxiv.org/abs/1608.03983
In this schedule, the learning rate grows linearly from warmup_learning_rate to learning_rate_base over the first warmup_steps steps, then transitions to a cosine decay schedule.
- Parameters:
global_step – int64 (scalar) tensor representing global step.
learning_rate_base – base learning rate.
total_steps – total number of training steps.
warmup_learning_rate – initial learning rate for warm up.
warmup_steps – number of warmup steps.
hold_base_rate_steps – Optional number of steps to hold base learning rate before decaying.
- Returns:
a (scalar) float tensor representing learning rate.
- Raises:
ValueError – if warmup_learning_rate is larger than learning_rate_base, or if warmup_steps is larger than total_steps.
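For intuition, a pure-Python reference of the behavior described above, assuming the standard SGDR-style cosine formula; this is a sketch, not the library implementation:

```python
import math

def cosine_with_warmup_ref(step, lr_base, total_steps,
                           warmup_lr=0.0, warmup_steps=0, hold_steps=0):
    if step < warmup_steps:
        # linear warm-up from warmup_lr to lr_base
        return warmup_lr + (lr_base - warmup_lr) * step / warmup_steps
    if step < warmup_steps + hold_steps:
        return lr_base  # hold the base rate before decaying
    # cosine decay over the remaining steps
    progress = (step - warmup_steps - hold_steps) / float(
        total_steps - warmup_steps - hold_steps)
    return 0.5 * lr_base * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```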
- easy_rec.python.core.learning_schedules.manual_stepping(global_step, boundaries, rates, warmup=False)[source]¶
Manually stepped learning rate schedule.
This function provides fine-grained control over learning rates. One must specify a sequence of learning rates as well as a set of integer steps at which the current learning rate must transition to the next. For example, if boundaries = [5, 10] and rates = [.1, .01, .001], then the learning rate returned by this function is .1 for global_step=0,…,4, .01 for global_step=5,…,9, and .001 for global_step=10 and onward.
- Parameters:
global_step – int64 (scalar) tensor representing global step.
boundaries – a list of global steps at which to switch learning rates. This list is assumed to consist of increasing positive integers.
rates – a list of (float) learning rates corresponding to intervals between the boundaries. The length of this list must be exactly len(boundaries) + 1.
warmup – Whether to linearly interpolate learning rate for steps in [0, boundaries[0]].
- Returns:
a (scalar) float tensor representing the learning rate.
- Raises:
ValueError – if one of the following checks fails:
1. boundaries is a strictly increasing list of positive integers
2. len(rates) == len(boundaries) + 1
3. boundaries[0] != 0
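A usage sketch mirroring the example in the description:

```python
import tensorflow as tf

from easy_rec.python.core.learning_schedules import manual_stepping

global_step = tf.train.get_or_create_global_step()
# 0.1 for steps 0-4, 0.01 for steps 5-9, 0.001 from step 10 onward;
# warmup=True would instead interpolate linearly over steps [0, 5]
lr = manual_stepping(global_step, boundaries=[5, 10],
                     rates=[0.1, 0.01, 0.001])
```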
- easy_rec.python.core.learning_schedules.transformer_policy(global_step, learning_rate, d_model, warmup_steps, step_scaling_rate=1.0, max_lr=None, coefficient=1.0, dtype=tf.float32)[source]¶
Transformer’s learning rate schedule.
Transformer’s learning rate policy from https://arxiv.org/pdf/1706.03762.pdf, capped at max_lr (the “hat”); also known as the “noam” learning rate decay scheme.
- Parameters:
global_step – global step TensorFlow tensor.
learning_rate (float) – initial learning rate to use.
d_model (int) – model dimensionality.
warmup_steps (int) – number of warm-up steps.
step_scaling_rate (float) – scaling factor applied to the step number.
max_lr (float) – maximal learning rate, i.e. hat.
coefficient (float) – optimizer-dependent adjustment factor; 0.002 is recommended for “Adam”, otherwise 1.0.
dtype – dtype for this policy.
- Returns:
learning rate at step global_step.
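A plain-Python sketch of the classic “noam” formula with the max_lr cap. It omits the learning_rate and step_scaling_rate arguments from the signature above, so treat it as an approximation of the documented policy rather than its exact implementation:

```python
def noam_ref(step, d_model, warmup_steps, coefficient=1.0, max_lr=None):
    step = max(float(step), 1.0)  # avoid division by zero at step 0
    # lr rises proportionally to step during warm-up, then decays as
    # step**-0.5 (Vaswani et al., 2017)
    lr = coefficient * d_model ** -0.5 * min(
        step ** -0.5, step * warmup_steps ** -1.5)
    if max_lr is not None:
        lr = min(lr, max_lr)  # apply the "hat"
    return lr
```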
easy_rec.python.core.metrics¶
- easy_rec.python.core.metrics.max_f1(label, predictions)[source]¶
Calculates the largest F1 score attainable over a range of decision thresholds.
- Parameters:
label – Ground truth (correct) target values.
predictions – Estimated targets as returned by a model.
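A NumPy sketch of the idea, sweeping a fixed threshold grid (the library may choose thresholds differently):

```python
import numpy as np

def max_f1_ref(labels, predictions, num_thresholds=101):
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = predictions >= t
        tp = np.sum(pred & (labels == 1))
        precision = tp / max(np.sum(pred), 1)
        recall = tp / max(np.sum(labels == 1), 1)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```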
- easy_rec.python.core.metrics.gauc(labels, predictions, uids, reduction='mean')[source]¶
Computes the AUC for each user separately, then reduces across users.
- Parameters:
labels – A Tensor whose shape matches predictions. Will be cast to bool.
predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
uids – user ids, an int or string Tensor whose shape matches predictions.
reduction – reduction method over the per-user AUCs:
- “mean”: simple mean over users
- “mean_by_sample_num”: mean weighted by each user’s sample count
- “mean_by_positive_num”: mean weighted by each user’s positive-sample count
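A NumPy/scikit-learn sketch of the documented semantics, not the TensorFlow implementation; users whose labels are all positive or all negative are skipped, since AUC is undefined for them:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def gauc_ref(labels, predictions, uids, reduction='mean'):
    labels, predictions, uids = map(np.asarray, (labels, predictions, uids))
    aucs, weights = [], []
    for uid in np.unique(uids):
        mask = uids == uid
        y = labels[mask]
        if y.min() == y.max():
            continue  # AUC undefined for single-class users
        aucs.append(roc_auc_score(y, predictions[mask]))
        if reduction == 'mean':
            weights.append(1.0)
        elif reduction == 'mean_by_sample_num':
            weights.append(float(mask.sum()))
        else:  # 'mean_by_positive_num'
            weights.append(float(y.sum()))
    return float(np.average(aucs, weights=weights))
```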
- easy_rec.python.core.metrics.session_auc(labels, predictions, session_ids, reduction='mean')[source]¶
Computes the AUC for each session separately, then reduces across sessions.
- Parameters:
labels – A Tensor whose shape matches predictions. Will be cast to bool.
predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
session_ids – session ids, an int or string Tensor whose shape matches predictions.
reduction – reduction method over the per-session AUCs:
- “mean”: simple mean over sessions
- “mean_by_sample_num”: mean weighted by each session’s sample count
- “mean_by_positive_num”: mean weighted by each session’s positive-sample count
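session_auc mirrors gauc with sessions as the grouping key; a minimal call sketch with illustrative tensors:

```python
import tensorflow as tf

from easy_rec.python.core.metrics import session_auc

labels = tf.constant([1, 0, 1, 0, 1])
predictions = tf.constant([0.9, 0.2, 0.8, 0.4, 0.7])
session_ids = tf.constant(['s1', 's1', 's1', 's2', 's2'])
metric = session_auc(labels, predictions, session_ids,
                     reduction='mean_by_sample_num')
```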
- easy_rec.python.core.metrics.metric_learning_recall_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]¶
Computes the recall_at_k metric for metric learning.
- Parameters:
k – an int scalar, or a tuple of ints
embeddings – the output of the last hidden layer, a tf.float32 Tensor with shape [batch_size, embedding_size]
labels – a Tensor with shape [batch_size]
session_ids – session ids, a Tensor with shape [batch_size]
embed_normed – whether the input embeddings are already l2-normalized
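A NumPy sketch under the common definition: a query counts as a hit if any of its k nearest neighbors (by cosine similarity, self excluded) shares its label. k is a single int here for simplicity, and the library may differ in details such as session handling:

```python
import numpy as np

def recall_at_k_ref(embeddings, labels, k=1, embed_normed=False):
    x = np.asarray(embeddings, dtype=np.float32)
    labels = np.asarray(labels)
    if not embed_normed:
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T                   # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = (labels[topk] == labels[:, None]).any(axis=1)
    return float(hits.mean())
```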
- easy_rec.python.core.metrics.metric_learning_average_precision_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]¶
Computes the average_precision_at_k metric for metric learning; parameters match metric_learning_recall_at_k above.
easy_rec.python.core.sampler¶
- class easy_rec.python.core.sampler.BaseSampler(fields, num_sample, num_eval_sample=None)[source]¶
Bases: object
- class easy_rec.python.core.sampler.NegativeSampler(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler.
Weighted random sampling of items not in the current batch.
- Parameters:
data_path – item feature data path. id:int64 | weight:float | attrs:string.
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
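A hypothetical instantiation sketch; the path is a placeholder, and fields is left as a placeholder for the item input fields described above:

```python
from easy_rec.python.core.sampler import NegativeSampler

# 'item_table' is a placeholder path; each row holds
# id:int64 | weight:float | attrs:string, with attrs split on ':'
sampler = NegativeSampler(
    data_path='item_table',
    fields=None,  # placeholder: the item input fields
    num_sample=1024,
    batch_size=256,
    attr_delimiter=':')
```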
- class easy_rec.python.core.sampler.NegativeSamplerInMemory(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler (in-memory variant).
Weighted random sampling of items not in the current batch.
- Parameters:
data_path – item feature data path. id:int64 | weight:float | attrs:string.
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
- class easy_rec.python.core.sampler.NegativeSamplerV2(user_data_path, item_data_path, edge_data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler V2.
Weighted random sampling of items that do not have a positive edge with the user.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
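Relative to NegativeSampler, V2 additionally consumes user nodes and positive edges so that sampled items avoid those with a positive edge to the user. A hypothetical instantiation (all paths are placeholders):

```python
from easy_rec.python.core.sampler import NegativeSamplerV2

sampler = NegativeSamplerV2(
    user_data_path='user_table',   # id:int64 | weight:float
    item_data_path='item_table',   # id:int64 | weight:float | attrs:string
    edge_data_path='click_edges',  # userid:int64 | itemid:int64 | weight:float
    fields=None,                   # placeholder: the item input fields
    num_sample=1024,
    batch_size=256,
    attr_delimiter=':')
```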
- class easy_rec.python.core.sampler.HardNegativeSampler(user_data_path, item_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
HardNegativeSampler.
Weighted random sampling of items not in the current batch as negative samples, plus sampling of destination nodes in hard_neg_edge as hard negative samples.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
num_hard_sample – maximum number of hard negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
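A hypothetical instantiation showing the extra hard-negative arguments (paths are placeholders):

```python
from easy_rec.python.core.sampler import HardNegativeSampler

sampler = HardNegativeSampler(
    user_data_path='user_table',
    item_data_path='item_table',
    hard_neg_edge_data_path='hard_neg_edges',  # userid | itemid | weight
    fields=None,          # placeholder: the item input fields
    num_sample=1024,      # random negatives per batch
    num_hard_sample=32,   # cap on hard negatives
    batch_size=256,
    attr_delimiter=':')
```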
- class easy_rec.python.core.sampler.HardNegativeSamplerV2(user_data_path, item_data_path, edge_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
HardNegativeSamplerV2.
Weighted random sampling of items that do not have a positive edge with the user, plus sampling of destination nodes in hard_neg_edge as hard negative samples.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float
hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
num_hard_sample – maximum number of hard negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.