easy_rec.python.core¶
easy_rec.python.core.learning_schedules¶
Library of common learning rate schedules.
- easy_rec.python.core.learning_schedules.exponential_decay_with_burnin(global_step, learning_rate_base, learning_rate_decay_steps, learning_rate_decay_factor, burnin_learning_rate=0.0, burnin_steps=0, min_learning_rate=0.0, staircase=True)[source]¶
Exponential decay schedule with burn-in period.
In this schedule, the learning rate is held at burnin_learning_rate for a fixed number of steps before transitioning to a regular exponential decay schedule.
- Parameters:
global_step – int tensor representing global step.
learning_rate_base – base learning rate.
learning_rate_decay_steps – number of steps between successive decays of the learning rate. Note that this count includes the burn-in steps.
learning_rate_decay_factor – multiplicative factor by which to decay learning rate.
burnin_learning_rate – initial learning rate during burn-in period. If 0.0 (which is the default), then the burn-in learning rate is simply set to learning_rate_base.
burnin_steps – number of steps to use the burn-in learning rate.
min_learning_rate – the minimum learning rate.
staircase – whether to use staircase decay.
- Returns:
a (scalar) float tensor representing the learning rate.
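A minimal usage sketch in TF1-style graph mode (the hyperparameter values below are illustrative, not recommendations):

```python
import tensorflow as tf

from easy_rec.python.core.learning_schedules import exponential_decay_with_burnin

global_step = tf.train.get_or_create_global_step()
# hold the lr at 1e-3 for the first 2000 steps, then decay 0.01 by a
# factor of 0.95 every 10000 steps, never dropping below 1e-6
lr = exponential_decay_with_burnin(
    global_step,
    learning_rate_base=0.01,
    learning_rate_decay_steps=10000,
    learning_rate_decay_factor=0.95,
    burnin_learning_rate=0.001,
    burnin_steps=2000,
    min_learning_rate=1e-6,
    staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
```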
- easy_rec.python.core.learning_schedules.cosine_decay_with_warmup(global_step, learning_rate_base, total_steps, warmup_learning_rate=0.0, warmup_steps=0, hold_base_rate_steps=0)[source]¶
Cosine decay schedule with warm up period.
- Cosine annealing learning rate as described in:
Loshchilov and Hutter, SGDR: Stochastic Gradient Descent with Warm Restarts. ICLR 2017. https://arxiv.org/abs/1608.03983
In this schedule, the learning rate grows linearly from warmup_learning_rate to learning_rate_base over the first warmup_steps steps, then transitions to a cosine decay schedule.
- Parameters:
global_step – int64 (scalar) tensor representing global step.
learning_rate_base – base learning rate.
total_steps – total number of training steps.
warmup_learning_rate – initial learning rate for warm up.
warmup_steps – number of warmup steps.
hold_base_rate_steps – Optional number of steps to hold base learning rate before decaying.
- Returns:
a (scalar) float tensor representing learning rate.
- Raises:
ValueError – if warmup_learning_rate is larger than learning_rate_base, or if warmup_steps is larger than total_steps.
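For intuition, a pure-Python reference of the behavior described above, assuming the standard SGDR-style cosine formula; this is a sketch, not the library implementation:

```python
import math

def cosine_with_warmup_ref(step, lr_base, total_steps,
                           warmup_lr=0.0, warmup_steps=0, hold_steps=0):
    if step < warmup_steps:
        # linear warm-up from warmup_lr to lr_base
        return warmup_lr + (lr_base - warmup_lr) * step / warmup_steps
    if step < warmup_steps + hold_steps:
        return lr_base  # hold the base rate before decaying
    # cosine decay over the remaining steps
    progress = (step - warmup_steps - hold_steps) / float(
        total_steps - warmup_steps - hold_steps)
    return 0.5 * lr_base * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```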
- easy_rec.python.core.learning_schedules.manual_stepping(global_step, boundaries, rates, warmup=False)[source]¶
Manually stepped learning rate schedule.
This function provides fine-grained control over learning rates. One must specify a sequence of learning rates as well as a set of integer steps at which the current learning rate must transition to the next. For example, if boundaries = [5, 10] and rates = [.1, .01, .001], then the learning rate returned by this function is .1 for global_step=0,…,4, .01 for global_step=5,…,9, and .001 for global_step=10 and onward.
- Parameters:
global_step – int64 (scalar) tensor representing global step.
boundaries – a list of global steps at which to switch learning rates. This list is assumed to consist of increasing positive integers.
rates – a list of (float) learning rates corresponding to intervals between the boundaries. The length of this list must be exactly len(boundaries) + 1.
warmup – Whether to linearly interpolate learning rate for steps in [0, boundaries[0]].
- Returns:
a (scalar) float tensor representing the learning rate.
- Raises:
ValueError – if one of the following checks fails:
1. boundaries is a strictly increasing list of positive integers
2. len(rates) == len(boundaries) + 1
3. boundaries[0] != 0
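A usage sketch mirroring the example in the description:

```python
import tensorflow as tf

from easy_rec.python.core.learning_schedules import manual_stepping

global_step = tf.train.get_or_create_global_step()
# 0.1 for steps 0-4, 0.01 for steps 5-9, 0.001 from step 10 onward;
# warmup=True would instead interpolate linearly over steps [0, 5]
lr = manual_stepping(global_step, boundaries=[5, 10],
                     rates=[0.1, 0.01, 0.001])
```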
- easy_rec.python.core.learning_schedules.transformer_policy(global_step, learning_rate, d_model, warmup_steps, step_scaling_rate=1.0, max_lr=None, coefficient=1.0, dtype=tf.float32)[source]¶
Transformer’s learning rate schedule.
Transformer’s learning rate policy from https://arxiv.org/pdf/1706.03762.pdf, capped at max_lr (the “hat”); also known as the “noam” learning rate decay scheme.
- Parameters:
global_step – global step TensorFlow tensor.
learning_rate (float) – initial learning rate to use.
d_model (int) – model dimensionality.
warmup_steps (int) – number of warm-up steps.
step_scaling_rate (float) – scaling factor applied to the step number.
max_lr (float) – maximal learning rate, i.e. hat.
coefficient (float) – optimizer-dependent adjustment factor; 0.002 is recommended for “Adam”, otherwise 1.0.
dtype – dtype for this policy.
- Returns:
learning rate at step global_step.
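A plain-Python sketch of the classic “noam” formula with the max_lr cap. It omits the learning_rate and step_scaling_rate arguments from the signature above, so treat it as an approximation of the documented policy rather than its exact implementation:

```python
def noam_ref(step, d_model, warmup_steps, coefficient=1.0, max_lr=None):
    step = max(float(step), 1.0)  # avoid division by zero at step 0
    # lr rises proportionally to step during warm-up, then decays as
    # step**-0.5 (Vaswani et al., 2017)
    lr = coefficient * d_model ** -0.5 * min(
        step ** -0.5, step * warmup_steps ** -1.5)
    if max_lr is not None:
        lr = min(lr, max_lr)  # apply the "hat"
    return lr
```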
easy_rec.python.core.metrics¶
- easy_rec.python.core.metrics.max_f1(label, predictions)[source]¶
Calculates the largest F1 score attainable over a range of decision thresholds.
- Parameters:
label – Ground truth (correct) target values.
predictions – Estimated targets as returned by a model.
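A NumPy sketch of the idea, sweeping a fixed threshold grid (the library may choose thresholds differently):

```python
import numpy as np

def max_f1_ref(labels, predictions, num_thresholds=101):
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = predictions >= t
        tp = np.sum(pred & (labels == 1))
        precision = tp / max(np.sum(pred), 1)
        recall = tp / max(np.sum(labels == 1), 1)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```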
- easy_rec.python.core.metrics.gauc(labels, predictions, uids, reduction='mean')[source]¶
Computes the AUC for each user separately, then reduces across users.
- Parameters:
labels – A Tensor whose shape matches predictions. Will be cast to bool.
predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
uids – user ids, an int or string Tensor whose shape matches predictions.
reduction – reduction method over the per-user AUCs:
- “mean”: simple mean over users
- “mean_by_sample_num”: mean weighted by each user’s sample count
- “mean_by_positive_num”: mean weighted by each user’s positive-sample count
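A NumPy/scikit-learn sketch of the documented semantics, not the TensorFlow implementation; users whose labels are all positive or all negative are skipped, since AUC is undefined for them:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def gauc_ref(labels, predictions, uids, reduction='mean'):
    labels, predictions, uids = map(np.asarray, (labels, predictions, uids))
    aucs, weights = [], []
    for uid in np.unique(uids):
        mask = uids == uid
        y = labels[mask]
        if y.min() == y.max():
            continue  # AUC undefined for single-class users
        aucs.append(roc_auc_score(y, predictions[mask]))
        if reduction == 'mean':
            weights.append(1.0)
        elif reduction == 'mean_by_sample_num':
            weights.append(float(mask.sum()))
        else:  # 'mean_by_positive_num'
            weights.append(float(y.sum()))
    return float(np.average(aucs, weights=weights))
```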
- easy_rec.python.core.metrics.session_auc(labels, predictions, session_ids, reduction='mean')[source]¶
Computes the AUC for each session separately, then reduces across sessions.
- Parameters:
labels – A Tensor whose shape matches predictions. Will be cast to bool.
predictions – A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
session_ids – session ids, an int or string Tensor whose shape matches predictions.
reduction – reduction method over the per-session AUCs:
- “mean”: simple mean over sessions
- “mean_by_sample_num”: mean weighted by each session’s sample count
- “mean_by_positive_num”: mean weighted by each session’s positive-sample count
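session_auc mirrors gauc with sessions as the grouping key; a minimal call sketch with illustrative tensors:

```python
import tensorflow as tf

from easy_rec.python.core.metrics import session_auc

labels = tf.constant([1, 0, 1, 0, 1])
predictions = tf.constant([0.9, 0.2, 0.8, 0.4, 0.7])
session_ids = tf.constant(['s1', 's1', 's1', 's2', 's2'])
metric = session_auc(labels, predictions, session_ids,
                     reduction='mean_by_sample_num')
```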
- easy_rec.python.core.metrics.metric_learning_recall_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]¶
Computes the recall_at_k metric for metric learning.
- Parameters:
k – an int scalar, or a tuple of ints
embeddings – the output of the last hidden layer, a tf.float32 Tensor with shape [batch_size, embedding_size]
labels – a Tensor with shape [batch_size]
session_ids – session ids, a Tensor with shape [batch_size]
embed_normed – whether the input embeddings are already l2-normalized
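A NumPy sketch under the common definition: a query counts as a hit if any of its k nearest neighbors (by cosine similarity, self excluded) shares its label. k is a single int here for simplicity, and the library may differ in details such as session handling:

```python
import numpy as np

def recall_at_k_ref(embeddings, labels, k=1, embed_normed=False):
    x = np.asarray(embeddings, dtype=np.float32)
    labels = np.asarray(labels)
    if not embed_normed:
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T                   # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = (labels[topk] == labels[:, None]).any(axis=1)
    return float(hits.mean())
```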
- easy_rec.python.core.metrics.metric_learning_average_precision_at_k(k, embeddings, labels, session_ids=None, embed_normed=False)[source]¶
Computes the average_precision_at_k metric for metric learning; parameters match metric_learning_recall_at_k above.
easy_rec.python.core.sampler¶
- class easy_rec.python.core.sampler.BaseSampler(fields, num_sample, num_eval_sample=None)[source]¶
Bases: object
- class easy_rec.python.core.sampler.NegativeSampler(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler.
Weighted random sampling of items not in the current batch.
- Parameters:
data_path – item feature data path. id:int64 | weight:float | attrs:string.
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
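A hypothetical instantiation sketch; the path is a placeholder, and fields is left as a placeholder for the item input fields described above:

```python
from easy_rec.python.core.sampler import NegativeSampler

# 'item_table' is a placeholder path; each row holds
# id:int64 | weight:float | attrs:string, with attrs split on ':'
sampler = NegativeSampler(
    data_path='item_table',
    fields=None,  # placeholder: the item input fields
    num_sample=1024,
    batch_size=256,
    attr_delimiter=':')
```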
- class easy_rec.python.core.sampler.NegativeSamplerInMemory(data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler (in-memory variant).
Weighted random sampling of items not in the current batch.
- Parameters:
data_path – item feature data path. id:int64 | weight:float | attrs:string.
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
- class easy_rec.python.core.sampler.NegativeSamplerV2(user_data_path, item_data_path, edge_data_path, fields, num_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
Negative Sampler V2.
Weighted random sampling of items that do not have a positive edge with the user.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
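Relative to NegativeSampler, V2 additionally consumes user nodes and positive edges so that sampled items avoid those with a positive edge to the user. A hypothetical instantiation (all paths are placeholders):

```python
from easy_rec.python.core.sampler import NegativeSamplerV2

sampler = NegativeSamplerV2(
    user_data_path='user_table',   # id:int64 | weight:float
    item_data_path='item_table',   # id:int64 | weight:float | attrs:string
    edge_data_path='click_edges',  # userid:int64 | itemid:int64 | weight:float
    fields=None,                   # placeholder: the item input fields
    num_sample=1024,
    batch_size=256,
    attr_delimiter=':')
```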
- class easy_rec.python.core.sampler.HardNegativeSampler(user_data_path, item_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
HardNegativeSampler.
Weighted random sampling of items not in the current batch as negative samples, plus sampling of destination nodes in hard_neg_edge as hard negative samples.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
num_hard_sample – maximum number of hard negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.
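A hypothetical instantiation showing the extra hard-negative arguments (paths are placeholders):

```python
from easy_rec.python.core.sampler import HardNegativeSampler

sampler = HardNegativeSampler(
    user_data_path='user_table',
    item_data_path='item_table',
    hard_neg_edge_data_path='hard_neg_edges',  # userid | itemid | weight
    fields=None,          # placeholder: the item input fields
    num_sample=1024,      # random negatives per batch
    num_hard_sample=32,   # cap on hard negatives
    batch_size=256,
    attr_delimiter=':')
```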
- class easy_rec.python.core.sampler.HardNegativeSamplerV2(user_data_path, item_data_path, edge_data_path, hard_neg_edge_data_path, fields, num_sample, num_hard_sample, batch_size, attr_delimiter=':', num_eval_sample=None)[source]¶
Bases: BaseSampler
HardNegativeSamplerV2.
Weighted random sampling of items that do not have a positive edge with the user, plus sampling of destination nodes in hard_neg_edge as hard negative samples.
- Parameters:
user_data_path – user node data path. id:int64 | weight:float.
item_data_path – item feature data path. id:int64 | weight:float | attrs:string.
edge_data_path – positive edge data path. userid:int64 | itemid:int64 | weight:float
hard_neg_edge_data_path – hard negative edge data path. userid:int64 | itemid:int64 | weight:float
fields – item input fields.
num_sample – number of negative samples.
num_hard_sample – maximum number of hard negative samples.
batch_size – mini-batch size.
attr_delimiter – delimiter of feature string.
num_eval_sample – number of negative samples for evaluator.