Protocol Documentation
easy_rec/python/protos/autoint.proto
AutoInt
Field | Type | Label | Description |
multi_head_num |
uint32 |
required |
The number of heads Default: 1 |
multi_head_size |
uint32 |
required |
The dimension of heads |
interacting_layer_num |
uint32 |
required |
The number of interacting layers Default: 1 |
l2_regularization |
float |
required |
Default: 0.0001 |
easy_rec/python/protos/cmbf.proto
CMBF
Field | Type | Label | Description |
config |
CMBFTower |
required |
|
final_dnn |
DNN |
required |
|
easy_rec/python/protos/collaborative_metric_learning.proto
CoMetricLearningI2I
easy_rec/python/protos/data_source.proto
Field | Type | Label | Description |
category_path |
string |
repeated |
support gfile.Glob |
dense_path |
string |
repeated |
|
label_path |
string |
repeated |
|
DatahubServer
Field | Type | Label | Description |
akId |
string |
required |
|
akSecret |
string |
required |
|
endpoint |
string |
required |
|
project |
string |
required |
|
topic |
string |
required |
|
offset_info |
string |
optional |
in json format: {"0":{"cursor": ""}, "1":{"cursor":""}} |
offset_time |
string |
optional |
offset_time could be two formats:
1: %Y%m%d %H:%M:%S "20220508 12:00:00"
2: %s "1651982400" |
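The two offset_time formats above can be normalized to a single Unix timestamp. The following is a minimal Python sketch, not EasyRec's own parser: the helper name parse_offset_time is hypothetical, and the formatted string is assumed to be in UTC+8, which is the time zone under which the two documented examples denote the same instant.

```python
from datetime import datetime, timedelta, timezone

# Assumption: formatted timestamps are interpreted as UTC+8.
ASSUMED_TZ = timezone(timedelta(hours=8))


def parse_offset_time(value, tz=ASSUMED_TZ):
    """Return a Unix timestamp (seconds) for either offset_time format."""
    value = value.strip()
    if value.isdigit():
        # Format 2: %s, already a Unix timestamp in seconds.
        return int(value)
    # Format 1: %Y%m%d %H:%M:%S
    parsed = datetime.strptime(value, "%Y%m%d %H:%M:%S").replace(tzinfo=tz)
    return int(parsed.timestamp())
```

Under the UTC+8 assumption, parse_offset_time("20220508 12:00:00") and parse_offset_time("1651982400") return the same value.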
KafkaServer
Field | Type | Label | Description |
server |
string |
required |
|
topic |
string |
required |
|
group |
string |
required |
|
offset_info |
string |
optional |
in json format: {'0':10, '1':20} |
offset_time |
string |
optional |
offset_time could be two formats:
1: %Y%m%d %H:%M:%S '20220508 12:00:00'
2: %s '1651982400' |
config_global |
string |
repeated |
kafka global config, such as: fetch.max.bytes=1024 |
config_topic |
string |
repeated |
kafka topic config, such as: max.partition.fetch.bytes=1024 |
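The config_global and config_topic entries are repeated strings of the form key=value. As a rough illustration of how such entries could be turned into a dictionary, here is a small Python sketch; parse_kafka_config is a hypothetical helper rather than part of EasyRec, and values are deliberately kept as strings.

```python
def parse_kafka_config(entries):
    """Split 'key=value' strings into a dict; values stay as strings."""
    config = {}
    for entry in entries:
        key, sep, value = entry.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected key=value, got: %r" % entry)
        config[key.strip()] = value.strip()
    return config


# Example entries taken from the field descriptions above.
global_config = parse_kafka_config(["fetch.max.bytes=1024"])
topic_config = parse_kafka_config(["max.partition.fetch.bytes=1024"])
```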
easy_rec/python/protos/dataset.proto
DatasetConfig
Field | Type | Label | Description |
batch_size |
uint32 |
optional |
mini batch size to use for training and evaluation. Default: 32 |
auto_expand_input_fields |
bool |
optional |
set auto_expand_input_fields to true to expand a range pattern such as
field[1-21] into field1, field2, ..., field21 Default: false |
label_fields |
string |
repeated |
label fields, normally only one field is used.
For multiple target models such as MMOE
multiple label_fields will be set. |
label_sep |
string |
repeated |
label separator |
label_dim |
uint32 |
repeated |
label dimensions, which need to be set when
some labels have dimension > 1 |
shuffle |
bool |
optional |
whether to shuffle data Default: true |
shuffle_buffer_size |
int32 |
optional |
shuffle buffer for better performance; even if a shuffle buffer is set,
it is suggested to do a full data shuffle before training,
especially when the performance of models is not good. Default: 32 |
num_epochs |
uint32 |
optional |
The number of times a data source is read. If set to zero, the data source
will be reused indefinitely. Default: 0 |
prefetch_size |
uint32 |
optional |
Number of decoded batches to prefetch. Default: 32 |
shard |
bool |
optional |
shard dataset to 1/num_workers in distributed mode;
this param is not used anymore Default: false |
file_shard |
bool |
optional |
shard by file, not by sample, valid only for CSVInput Default: false |
input_type |
DatasetConfig.InputType |
required |
|
separator |
string |
optional |
separator of column features, only used for CSVInput*
not used in OdpsInput*
binary separators are supported:
CTRL+A could be set as '\001'
CTRL+B could be set as '\002'
CTRL+C could be set as '\003'
for RTPInput and OdpsRTPInput it is usually set
to '\002' Default: , |
num_parallel_calls |
uint32 |
optional |
parallel preprocessing of raw data; avoid using numbers that are too small
or too large (suggested to be smaller than the
number of cores) Default: 8 |
selected_cols |
string |
optional |
only used for OdpsInput/OdpsInputV2/OdpsRTPInput, comma separated;
for RTPInput, selected_cols uses indices as column names,
such as '1,2,4', where 1,2 are label columns and
4 is the feature column; columns 0,3 are not used |
selected_col_types |
string |
optional |
selected col types, only used for OdpsInput/OdpsInputV2,
to avoid setting data types incorrectly |
input_fields |
DatasetConfig.Field |
repeated |
the input fields must be the same number and in the
same order as data in csv files or odps tables |
rtp_separator |
string |
optional |
for RTPInput only Default: ; |
ignore_error |
bool |
optional |
ignore some data errors
it is not suggested to set this parameter Default: false |
pai_worker_queue |
bool |
optional |
whether to use pai global shuffle queue, only for OdpsInput,
OdpsInputV2, OdpsRTPInputV2 Default: false |
pai_worker_slice_num |
int32 |
optional |
Default: 100 |
chief_redundant |
bool |
optional |
if true, one worker will duplicate the data of the chief node
and undertake the gradient computation of the chief node Default: false |
sample_weight |
string |
optional |
input field for sample weight |
data_compression_type |
string |
optional |
the compression type of tfrecord |
n_data_batch_tfrecord |
uint32 |
optional |
n data for one feature in tfrecord |
with_header |
bool |
optional |
csv files may optionally have a header;
in that case, input_name must match the header name,
and the number and the order of input_fields
may differ from those in the csv files. Default: false |
feature_fields |
string |
repeated |
|
negative_sampler |
NegativeSampler |
optional |
|
negative_sampler_v2 |
NegativeSamplerV2 |
optional |
|
hard_negative_sampler |
HardNegativeSampler |
optional |
|
hard_negative_sampler_v2 |
HardNegativeSamplerV2 |
optional |
|
negative_sampler_in_memory |
NegativeSamplerInMemory |
optional |
|
eval_batch_size |
uint32 |
optional |
Default: 4096 |
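The auto_expand_input_fields option above describes expanding a pattern such as field[1-21] into field1 through field21. A minimal Python sketch of that expansion follows; expand_field_name is a hypothetical helper and the regular expression reflects only the single pattern given in the description, so the behavior of EasyRec on other inputs is not implied.

```python
import re

# Matches names of the form prefix[start-end], e.g. field[1-21].
_RANGE_PATTERN = re.compile(r"^(?P<prefix>.*)\[(?P<start>\d+)-(?P<end>\d+)\]$")


def expand_field_name(name):
    """Expand 'field[1-3]' to ['field1', 'field2', 'field3']."""
    match = _RANGE_PATTERN.match(name)
    if match is None:
        # Names without a range pattern are returned unchanged.
        return [name]
    start, end = int(match["start"]), int(match["end"])
    return ["%s%d" % (match["prefix"], i) for i in range(start, end + 1)]
```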
DatasetConfig.Field
HardNegativeSampler
Weighted Random Sampling ItemID not in Batch and Sampling Hard Edge
Field | Type | Label | Description |
user_input_path |
string |
required |
user data path
userid weight |
item_input_path |
string |
required |
item data path
itemid weight attrs |
hard_neg_edge_input_path |
string |
required |
hard negative edge path
userid itemid weight |
num_sample |
uint32 |
required |
number of negative sample |
num_hard_sample |
uint32 |
required |
max number of hard negative sample |
attr_fields |
string |
repeated |
field names of attrs in train data or eval data |
item_id_field |
string |
required |
field name of item_id in train data or eval data |
user_id_field |
string |
required |
field name of user_id in train data or eval data |
attr_delimiter |
string |
optional |
Default: : |
num_eval_sample |
uint32 |
optional |
Default: 0 |
field_delimiter |
string |
optional |
only works on DataScience/Local Default: |
HardNegativeSamplerV2
Weighted Random Sampling ItemID not with Edge and Sampling Hard Edge
Field | Type | Label | Description |
user_input_path |
string |
required |
user data path
userid weight |
item_input_path |
string |
required |
item data path
itemid weight attrs |
pos_edge_input_path |
string |
required |
positive edge path
userid itemid weight |
hard_neg_edge_input_path |
string |
required |
hard negative edge path
userid itemid weight |
num_sample |
uint32 |
required |
number of negative sample |
num_hard_sample |
uint32 |
required |
max number of hard negative sample |
attr_fields |
string |
repeated |
field names of attrs in train data or eval data |
item_id_field |
string |
required |
field name of item_id in train data or eval data |
user_id_field |
string |
required |
field name of user_id in train data or eval data |
attr_delimiter |
string |
optional |
Default: : |
num_eval_sample |
uint32 |
optional |
Default: 0 |
field_delimiter |
string |
optional |
only works on DataScience/Local Default: |
NegativeSampler
Weighted Random Sampling ItemID not in Batch
Field | Type | Label | Description |
input_path |
string |
required |
sample data path
itemid weight attrs |
num_sample |
uint32 |
required |
number of negative sample |
attr_fields |
string |
repeated |
field names of attrs in train data or eval data |
item_id_field |
string |
required |
field name of item_id in train data or eval data |
attr_delimiter |
string |
optional |
Default: : |
num_eval_sample |
uint32 |
optional |
Default: 0 |
field_delimiter |
string |
optional |
only works on DataScience/Local Default: |
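The NegativeSampler description, "Weighted Random Sampling ItemID not in Batch", can be illustrated with a small in-memory sketch. This is an assumption-laden Python illustration and not the sampler EasyRec uses: sample_negatives, its arguments, and the choice to sample with replacement are all hypothetical.

```python
import random


def sample_negatives(item_weights, batch_item_ids, num_sample, seed=None):
    """Draw num_sample item ids by weight, skipping ids present in the batch.

    item_weights: dict mapping item id -> sampling weight.
    Sampling is with replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    exclude = set(batch_item_ids)
    candidates = [item for item in item_weights if item not in exclude]
    if not candidates:
        return []
    weights = [item_weights[item] for item in candidates]
    return rng.choices(candidates, weights=weights, k=num_sample)


negatives = sample_negatives(
    {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}, ["a", "b"], num_sample=5, seed=0)
```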
NegativeSamplerInMemory
Field | Type | Label | Description |
input_path |
string |
required |
sample data path
itemid weight attrs |
num_sample |
uint32 |
required |
number of negative sample |
attr_fields |
string |
repeated |
field names of attrs in train data or eval data |
item_id_field |
string |
required |
field name of item_id in train data or eval data |
attr_delimiter |
string |
optional |
Default: : |
num_eval_sample |
uint32 |
optional |
Default: 0 |
field_delimiter |
string |
optional |
only works on DataScience/Local Default: |
NegativeSamplerV2
Weighted Random Sampling ItemID not with Edge
Field | Type | Label | Description |
user_input_path |
string |
required |
user data path
userid weight |
item_input_path |
string |
required |
item data path
itemid weight attrs |
pos_edge_input_path |
string |
required |
positive edge path
userid itemid weight |
num_sample |
uint32 |
required |
number of negative sample |
attr_fields |
string |
repeated |
field names of attrs in train data or eval data |
item_id_field |
string |
required |
field name of item_id in train data or eval data |
user_id_field |
string |
required |
field name of user_id in train data or eval data |
attr_delimiter |
string |
optional |
Default: : |
num_eval_sample |
uint32 |
optional |
Default: 0 |
field_delimiter |
string |
optional |
only works on DataScience/Local Default: |
DatasetConfig.FieldType
Name | Number | Description |
INT32 |
0 |
|
INT64 |
1 |
|
STRING |
2 |
|
FLOAT |
4 |
|
DOUBLE |
5 |
|
BOOL |
6 |
|
DatasetConfig.InputType
Name | Number | Description |
CSVInput |
10 |
csv format input, could be used in local or hdfs
support .gz compression(but not .tar.gz files) |
CSVInputV2 |
11 |
@Deprecated |
CSVInputEx |
12 |
extended csv format, allow quote in fields |
OdpsInput |
2 |
@Deprecated, has memory leak problem |
OdpsInputV2 |
3 |
odps input, used on pai |
DataHubInput |
15 |
|
OdpsInputV3 |
9 |
|
RTPInput |
4 |
|
RTPInputV2 |
5 |
|
OdpsRTPInput |
601 |
|
OdpsRTPInputV2 |
602 |
|
TFRecordInput |
7 |
|
BatchTFRecordInput |
14 |
|
DummyInput |
8 |
for debugging performance bottlenecks of
input pipelines |
KafkaInput |
13 |
|
HiveInput |
16 |
|
HiveRTPInput |
17 |
|
HiveParquetInput |
18 |
|
CriteoInput |
1001 |
|
easy_rec/python/protos/dbmtl.proto
DBMTL
Field | Type | Label | Description |
bottom_cmbf |
CMBFTower |
optional |
shared bottom cmbf layer |
bottom_uniter |
UniterTower |
optional |
shared bottom uniter layer |
bottom_dnn |
DNN |
optional |
shared bottom dnn layer |
expert_dnn |
DNN |
optional |
mmoe expert dnn layer definition |
num_expert |
uint32 |
optional |
number of mmoe experts Default: 0 |
task_towers |
BayesTaskTower |
repeated |
bayes task tower |
l2_regularization |
float |
optional |
l2 regularization Default: 0.0001 |
easy_rec/python/protos/dcn.proto
CrossTower
Field | Type | Label | Description |
input |
string |
required |
|
cross_num |
uint32 |
required |
The number of cross layers Default: 3 |
DCN
Field | Type | Label | Description |
deep_tower |
Tower |
required |
|
cross_tower |
CrossTower |
required |
|
final_dnn |
DNN |
required |
|
l2_regularization |
float |
required |
Default: 0.0001 |
easy_rec/python/protos/deepfm.proto
DeepFM
Field | Type | Label | Description |
dnn |
DNN |
required |
|
final_dnn |
DNN |
optional |
|
wide_output_dim |
uint32 |
optional |
Default: 1 |
wide_regularization |
float |
optional |
deprecated Default: 0.0001 |
dense_regularization |
float |
optional |
deprecated Default: 0.0001 |
l2_regularization |
float |
optional |
Default: 0.0001 |
easy_rec/python/protos/dlrm.proto
DLRM
Field | Type | Label | Description |
top_dnn |
DNN |
required |
|
bot_dnn |
DNN |
required |
|
arch_interaction_op |
string |
optional |
options are: dot and cat Default: dot |
arch_interaction_itself |
bool |
optional |
whether a feature will interact with itself Default: false |
arch_with_dense_feature |
bool |
optional |
whether to include dense features after interaction Default: false |
l2_regularization |
float |
optional |
Default: 1e-05 |
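The arch_interaction_op and arch_interaction_itself options can be pictured with plain lists. The sketch below is a hedged Python illustration of the usual DLRM-style interaction (pairwise dot products, or concatenation for cat); the function name interact and the output ordering are assumptions, not EasyRec's implementation.

```python
from itertools import combinations, combinations_with_replacement


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def interact(embeddings, op="dot", interact_itself=False):
    """Combine equally sized feature embeddings.

    op="cat": concatenate all embeddings.
    op="dot": pairwise dot products; self pairs are included only
    when interact_itself is True.
    """
    if op == "cat":
        return [x for emb in embeddings for x in emb]
    if op != "dot":
        raise ValueError("op must be 'dot' or 'cat'")
    pairs = combinations_with_replacement if interact_itself else combinations
    return [_dot(a, b) for a, b in pairs(embeddings, 2)]
```

With three embeddings, dot yields 3 values without self-interaction and 6 values with it.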
easy_rec/python/protos/dnn.proto
DNN
Field | Type | Label | Description |
hidden_units |
uint32 |
repeated |
hidden units for each layer |
dropout_ratio |
float |
repeated |
ratio of dropout |
activation |
string |
optional |
activation function Default: tf.nn.relu |
use_bn |
bool |
optional |
use batch normalization Default: true |
easy_rec/python/protos/dropoutnet.proto
DropoutNet
Field | Type | Label | Description |
user_content |
DNN |
required |
|
user_preference |
DNN |
required |
|
item_content |
DNN |
required |
|
item_preference |
DNN |
required |
|
user_tower |
DNN |
required |
|
item_tower |
DNN |
required |
|
l2_regularization |
float |
required |
Default: 0 |
user_dropout_rate |
float |
required |
Default: 0 |
item_dropout_rate |
float |
required |
Default: 0.5 |
softmax_loss |
SoftmaxCrossEntropyWithNegativeMining |
optional |
|
easy_rec/python/protos/dssm.proto
DSSM
Field | Type | Label | Description |
user_tower |
DSSMTower |
required |
|
item_tower |
DSSMTower |
required |
|
l2_regularization |
float |
required |
Default: 0.0001 |
simi_func |
Similarity |
optional |
Default: COSINE |
scale_simi |
bool |
optional |
add a layer for scaling the similarity Default: true |
item_id |
string |
optional |
|
ignore_in_batch_neg_sam |
bool |
required |
Default: false |
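For the default simi_func of COSINE with scale_simi enabled, the score between a user vector and an item vector can be sketched as below. This is an illustrative Python sketch only: the scaling layer is modeled here as a scalar weight and bias, which is an assumption about its form, and both function names are hypothetical.

```python
import math


def cosine_similarity(user_vec, item_vec, eps=1e-12):
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(u * i for u, i in zip(user_vec, item_vec))
    norm_u = math.sqrt(sum(u * u for u in user_vec))
    norm_i = math.sqrt(sum(i * i for i in item_vec))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_i, eps)


def scaled_similarity(sim, weight=1.0, bias=0.0):
    """Assumed form of the scaling layer: a learned scalar weight and bias."""
    return weight * sim + bias
```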
DSSMTower
Field | Type | Label | Description |
id |
string |
required |
|
dnn |
DNN |
required |
|
easy_rec/python/protos/easy_rec_model.proto
DummyModel
for input performance test
EasyRecModel
KD
for knowledge distillation
Field | Type | Label | Description |
loss_name |
string |
optional |
|
pred_name |
string |
required |
|
pred_is_logits |
bool |
optional |
default to be logits Default: true |
soft_label_name |
string |
required |
for CROSS_ENTROPY_LOSS, soft_label must be logits instead of probs |
label_is_logits |
bool |
optional |
default to be logits Default: true |
loss_type |
LossType |
required |
currently only support CROSS_ENTROPY_LOSS and L2_LOSS |
loss_weight |
float |
optional |
Default: 1 |
temperature |
float |
optional |
only for loss_type == CROSS_ENTROPY_LOSS Default: 1 |
easy_rec/python/protos/esmm.proto
ESMM
Field | Type | Label | Description |
groups |
Tower |
repeated |
|
ctr_tower |
TaskTower |
required |
|
cvr_tower |
TaskTower |
required |
|
l2_regularization |
float |
required |
Default: 0.0001 |
easy_rec/python/protos/eval.proto
AUC
Field | Type | Label | Description |
num_thresholds |
uint32 |
optional |
Default: 200 |
Accuracy
AvgPrecisionAtTopK
Field | Type | Label | Description |
topk |
uint32 |
optional |
Default: 5 |
EvalConfig
Message for configuring EasyRecModel evaluation jobs (eval.py).
Field | Type | Label | Description |
num_examples |
uint32 |
optional |
Number of examples to process for evaluation. Default: 0 |
eval_interval_secs |
uint32 |
optional |
How often to run evaluation. Default: 300 |
max_evals |
uint32 |
optional |
Maximum number of times to run evaluation. If set to 0, will run forever. Default: 0 |
save_graph |
bool |
optional |
Whether the TensorFlow graph used for evaluation should be saved to disk. Default: false |
metrics_set |
EvalMetrics |
repeated |
Type of metrics to use for evaluation.
possible values: |
eval_online |
bool |
optional |
Online evaluation using the forward batch data of training Default: false |
EvalMetrics
GAUC
Field | Type | Label | Description |
uid_field |
string |
required |
uid field name |
reduction |
string |
optional |
reduction method for auc of different users
* "mean": simple mean of different users
* "mean_by_sample_num": weighted mean with sample num of different users
* "mean_by_positive_num": weighted mean with positive sample num of different users Default: mean |
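The three reduction modes listed for GAUC differ only in how per-user AUC values are weighted. The following Python sketch shows one way to compute them; it is not EasyRec's metric code, the function names are hypothetical, and skipping users whose samples are all positive or all negative is an assumption of this sketch.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC; returns None when only one class is present."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gauc(uids, labels, scores, reduction="mean"):
    groups = defaultdict(lambda: ([], []))
    for uid, label, score in zip(uids, labels, scores):
        groups[uid][0].append(label)
        groups[uid][1].append(score)
    total, weight_sum = 0.0, 0.0
    for user_labels, user_scores in groups.values():
        value = auc(user_labels, user_scores)
        if value is None:
            continue  # assumption: users with a single class are skipped
        if reduction == "mean":
            weight = 1.0
        elif reduction == "mean_by_sample_num":
            weight = float(len(user_labels))
        elif reduction == "mean_by_positive_num":
            weight = float(sum(user_labels))
        else:
            raise ValueError("unknown reduction: %s" % reduction)
        total += weight * value
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```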
Max_F1
MeanAbsoluteError
MeanSquaredError
Precision
Recall
RecallAtTopK
Field | Type | Label | Description |
topk |
uint32 |
optional |
Default: 5 |
RootMeanSquaredError
SessionAUC
Field | Type | Label | Description |
session_id_field |
string |
required |
session id field name |
reduction |
string |
optional |
reduction: reduction method for auc of different sessions
* "mean": simple mean of different sessions
* "mean_by_sample_num": weighted mean with sample num of different sessions
* "mean_by_positive_num": weighted mean with positive sample num of different sessions Default: mean |
easy_rec/python/protos/export.proto
ExportConfig
Message for configuring exporting models.
Field | Type | Label | Description |
batch_size |
int32 |
optional |
batch size used for exported model, -1 indicates batch_size is None
which is only supported by classification model right now, while
other models support static batch_size Default: -1 |
exporter_type |
string |
optional |
type of exporter [final | latest | best | none] used during train_and_evaluation
final: performs a single export at the end of training
latest: regularly exports the serving graph and checkpoints
best: export the best model according to best_exporter_metric
none: do not perform export Default: final |
best_exporter_metric |
string |
optional |
the metric used to determine the best checkpoint Default: auc |
metric_bigger |
bool |
optional |
whether a bigger metric value is better Default: true |
enable_early_stop |
bool |
optional |
enable early stop Default: false |
early_stop_func |
string |
optional |
custom early stop function, format:
early_stop_func(eval_results, early_stop_params)
return True if should stop |
early_stop_params |
string |
optional |
custom early stop parameters |
max_check_steps |
int32 |
optional |
early stop max check steps Default: 10000 |
multi_placeholder |
bool |
optional |
each feature has a placeholder Default: true |
exports_to_keep |
int32 |
optional |
number of exports to keep, only for exporter_type in [best, latest] Default: 1 |
multi_value_fields |
MultiValueFields |
optional |
multi value field list |
placeholder_named_by_input |
bool |
optional |
whether placeholders are named by input Default: false |
filter_inputs |
bool |
optional |
filter out inputs, only keep effective ones Default: true |
export_features |
bool |
optional |
export the original feature values as string Default: false |
export_rtp_outputs |
bool |
optional |
export the outputs required by RTP Default: false |
asset_files |
string |
repeated |
export asset files |
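The early_stop_func description gives only the call format, early_stop_func(eval_results, early_stop_params), and that it returns True to stop. A hedged Python sketch of a patience-style function with that signature is shown below. The structure of eval_results (assumed here to be a list of metric dicts, oldest first), the "auc" key, and the use of early_stop_params as an integer patience are all assumptions made for illustration.

```python
def early_stop_func(eval_results, early_stop_params):
    """Return True when the metric has not improved recently.

    Assumptions of this sketch:
      eval_results: list of dicts such as {"auc": 0.8}, oldest first.
      early_stop_params: string holding the patience, e.g. "2".
    """
    patience = int(early_stop_params)
    values = [result["auc"] for result in eval_results]
    if len(values) <= patience:
        return False  # not enough evaluations to decide yet
    best_before = max(values[:-patience])
    # Stop if none of the last `patience` evaluations beat the earlier best.
    return max(values[-patience:]) <= best_before
```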
MultiValueFields
Field | Type | Label | Description |
input_name |
string |
repeated |
|
easy_rec/python/protos/feature_config.proto
AttentionCombiner
EVParams
Field | Type | Label | Description |
filter_freq |
uint64 |
optional |
Default: 0 |
steps_to_live |
uint64 |
optional |
Default: 0 |
FeatureConfig
Field | Type | Label | Description |
feature_name |
string |
optional |
|
input_names |
string |
repeated |
input field names: must be included in DatasetConfig.input_fields |
feature_type |
FeatureConfig.FeatureType |
required |
Default: IdFeature |
embedding_name |
string |
optional |
|
embedding_dim |
uint32 |
optional |
Default: 0 |
hash_bucket_size |
uint64 |
optional |
Default: 0 |
num_buckets |
uint64 |
optional |
for categorical_column_with_identity Default: 0 |
boundaries |
double |
repeated |
only for raw features |
separator |
string |
optional |
separator within features Default: | |
kv_separator |
string |
optional |
delimiter separating key from value |
seq_multi_sep |
string |
optional |
delimiter separating sequence multi-values |
max_seq_len |
uint32 |
optional |
truncate sequence data to max_seq_len |
vocab_file |
string |
optional |
|
vocab_list |
string |
repeated |
|
shared_names |
string |
repeated |
many other fields share this config |
lookup_max_sel_elem_num |
int32 |
optional |
lookup max select element number Default: 10 |
max_partitions |
int32 |
optional |
max_partitions Default: 1 |
combiner |
string |
optional |
combiner Default: sum |
initializer |
Initializer |
optional |
embedding initializer |
precision |
int32 |
optional |
number of digits kept after the decimal point when formatting float/double to string;
scientific format is not used.
By default, converting float/double to string is not allowed Default: -1 |
min_val |
double |
optional |
normalize raw feature to [0-1] Default: 0 |
max_val |
double |
optional |
Default: 0 |
normalizer_fn |
string |
optional |
normalization function for raw features:
such as: tf.math.log1p |
raw_input_dim |
uint32 |
optional |
raw feature of multiple dimensions Default: 1 |
sequence_combiner |
SequenceCombiner |
optional |
sequence feature combiner |
sub_feature_type |
FeatureConfig.FeatureType |
optional |
sub feature type for sequence feature Default: IdFeature |
sequence_length |
uint32 |
optional |
sequence length Default: 1 |
expression |
string |
optional |
for expr feature |
ev_params |
EVParams |
optional |
embedding variable params |
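For raw features, min_val and max_val describe normalization to [0-1] and boundaries describes discretization. The Python sketch below illustrates both under stated assumptions: normalization is taken to be (x - min_val) / (max_val - min_val) without clipping, and bucket indices follow the left-closed interval convention used by TensorFlow bucketized columns. Both helper names are hypothetical.

```python
from bisect import bisect_right


def min_max_normalize(value, min_val, max_val):
    """Assumed normalization: (value - min_val) / (max_val - min_val)."""
    if max_val <= min_val:
        raise ValueError("max_val must be greater than min_val")
    return (value - min_val) / (max_val - min_val)


def bucketize(value, boundaries):
    """Bucket index for sorted boundaries.

    With boundaries [0, 1, 2] the buckets are
    (-inf, 0), [0, 1), [1, 2), [2, +inf).
    """
    return bisect_right(boundaries, value)
```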
FeatureConfigV2
FeatureGroupConfig
Field | Type | Label | Description |
group_name |
string |
optional |
|
feature_names |
string |
repeated |
|
wide_deep |
WideOrDeep |
optional |
Default: DEEP |
sequence_features |
SeqAttGroupConfig |
repeated |
|
negative_sampler |
bool |
optional |
Default: false |
MultiHeadAttentionCombiner
SeqAttGroupConfig
Field | Type | Label | Description |
group_name |
string |
optional |
|
seq_att_map |
SeqAttMap |
repeated |
|
tf_summary |
bool |
optional |
Default: false |
seq_dnn |
DNN |
optional |
|
allow_key_search |
bool |
optional |
Default: false |
need_key_feature |
bool |
optional |
Default: true |
allow_key_transform |
bool |
optional |
Default: false |
SeqAttMap
Field | Type | Label | Description |
key |
string |
repeated |
|
hist_seq |
string |
repeated |
|
aux_hist_seq |
string |
repeated |
|
SequenceCombiner
TextCnnCombiner
Field | Type | Label | Description |
filter_sizes |
uint32 |
repeated |
|
num_filters |
uint32 |
repeated |
|
FeatureConfig.FeatureType
Name | Number | Description |
IdFeature |
0 |
|
RawFeature |
1 |
|
TagFeature |
2 |
|
ComboFeature |
3 |
|
LookupFeature |
4 |
|
SequenceFeature |
5 |
|
ExprFeature |
6 |
|
FeatureConfig.FieldType
Name | Number | Description |
INT32 |
0 |
|
INT64 |
1 |
|
STRING |
2 |
|
FLOAT |
4 |
|
DOUBLE |
5 |
|
BOOL |
6 |
|
WideOrDeep
Name | Number | Description |
DEEP |
0 |
|
WIDE |
1 |
|
WIDE_AND_DEEP |
2 |
|
easy_rec/python/protos/fm.proto
FM
Field | Type | Label | Description |
l2_regularization |
float |
optional |
Default: 0.0001 |
easy_rec/python/protos/hive_config.proto
HiveConfig
Field | Type | Label | Description |
host |
string |
required |
hive master's ip |
port |
uint32 |
required |
hive port Default: 10000 |
username |
string |
required |
hive username Default: admin |
database |
string |
required |
hive database Default: default |
table_name |
string |
required |
|
easy_rec/python/protos/hyperparams.proto
ConstantInitializer
Field | Type | Label | Description |
consts |
float |
repeated |
|
GlorotNormalInitializer
Initializer
Proto with one-of field for initializers.
L1L2Regularizer
Configuration proto for L1 and L2 Regularizer.
Field | Type | Label | Description |
scale_l1 |
float |
optional |
Default: 1 |
scale_l2 |
float |
optional |
Default: 1 |
L1Regularizer
Configuration proto for L1 Regularizer.
Field | Type | Label | Description |
scale |
float |
optional |
Default: 1 |
L2Regularizer
Configuration proto for L2 Regularizer.
Field | Type | Label | Description |
scale |
float |
optional |
Default: 1 |
RandomNormalInitializer
Configuration proto for random normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
Field | Type | Label | Description |
mean |
float |
optional |
Default: 0 |
stddev |
float |
optional |
Default: 1 |
Regularizer
Proto with one-of field for regularizers.
TruncatedNormalInitializer
Configuration proto for truncated normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
Field | Type | Label | Description |
mean |
float |
optional |
Default: 0 |
stddev |
float |
optional |
Default: 1 |
easy_rec/python/protos/layer.proto
CMBFTower
Field | Type | Label | Description |
multi_head_num |
uint32 |
required |
The number of heads of cross modal fusion layer Default: 1 |
image_multi_head_num |
uint32 |
required |
The number of heads of image feature learning layer Default: 1 |
text_multi_head_num |
uint32 |
required |
The number of heads of text feature learning layer Default: 1 |
text_head_size |
uint32 |
required |
The dimension of text heads |
image_head_size |
uint32 |
required |
The dimension of image heads Default: 64 |
image_feature_patch_num |
uint32 |
required |
The number of patches of the image feature; takes effect only when there is a single image feature Default: 1 |
image_feature_dim |
uint32 |
required |
Reduce the image feature to this dimension before the single modal learning module Default: 0 |
image_self_attention_layer_num |
uint32 |
required |
The number of self attention layers for image features Default: 0 |
text_self_attention_layer_num |
uint32 |
required |
The number of self attention layers for text features Default: 1 |
cross_modal_layer_num |
uint32 |
required |
The number of cross modal layers Default: 1 |
image_cross_head_size |
uint32 |
required |
The dimension of image cross modal heads |
text_cross_head_size |
uint32 |
required |
The dimension of text cross modal heads |
hidden_dropout_prob |
float |
required |
Dropout probability for hidden layers Default: 0 |
attention_probs_dropout_prob |
float |
required |
Dropout probability of the attention probabilities Default: 0 |
use_token_type |
bool |
required |
Whether to add embeddings for different text sequence features Default: false |
use_position_embeddings |
bool |
required |
Whether to add position embeddings for the position of each token in the text sequence Default: true |
max_position_embeddings |
uint32 |
required |
Maximum sequence length that might ever be used with this model Default: 0 |
text_seq_emb_dropout_prob |
float |
required |
Dropout probability for text sequence embeddings Default: 0.1 |
other_feature_dnn |
DNN |
optional |
dnn layers for other features |
HighWayTower
Field | Type | Label | Description |
input |
string |
required |
|
emb_size |
uint32 |
required |
|
UniterTower
Field | Type | Label | Description |
hidden_size |
uint32 |
required |
Size of the encoder layers and the pooler layer |
num_hidden_layers |
uint32 |
required |
Number of hidden layers in the Transformer encoder |
num_attention_heads |
uint32 |
required |
Number of attention heads for each attention layer in the Transformer encoder |
intermediate_size |
uint32 |
required |
The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder |
hidden_act |
string |
required |
The non-linear activation function (function or string) in the encoder and pooler.
"gelu", "relu", "tanh" and "swish" are supported. Default: gelu |
hidden_dropout_prob |
float |
required |
The dropout probability for all fully connected layers in the embeddings, encoder, and pooler Default: 0.1 |
attention_probs_dropout_prob |
float |
required |
The dropout ratio for the attention probabilities Default: 0.1 |
max_position_embeddings |
uint32 |
required |
The maximum sequence length that this model might ever be used with Default: 512 |
use_position_embeddings |
bool |
required |
Whether to add position embeddings for the position of each token in the text sequence Default: true |
initializer_range |
float |
required |
The stddev of the truncated_normal_initializer for initializing all weight matrices Default: 0.02 |
other_feature_dnn |
DNN |
optional |
dnn layers for other features |
easy_rec/python/protos/loss.proto
CircleLoss
Field | Type | Label | Description |
margin |
float |
required |
Default: 0.25 |
gamma |
float |
required |
Default: 32 |
F1ReweighedLoss
Field | Type | Label | Description |
f1_beta_square |
float |
required |
Default: 1 |
label_smoothing |
float |
required |
Default: 0 |
Loss
MultiSimilarityLoss
Field | Type | Label | Description |
alpha |
float |
required |
Default: 2 |
beta |
float |
required |
Default: 50 |
lamb |
float |
required |
Default: 1 |
eps |
float |
required |
Default: 0.1 |
SoftmaxCrossEntropyWithNegativeMining
Field | Type | Label | Description |
num_negative_samples |
uint32 |
required |
|
margin |
float |
required |
Default: 0 |
gamma |
float |
required |
Default: 1 |
coefficient_of_support_vector |
float |
required |
Default: 1 |
LossType
Name | Number | Description |
CLASSIFICATION |
0 |
|
L2_LOSS |
1 |
|
SIGMOID_L2_LOSS |
2 |
|
CROSS_ENTROPY_LOSS |
3 |
crossentropy loss/log loss |
SOFTMAX_CROSS_ENTROPY |
4 |
|
CIRCLE_LOSS |
5 |
|
MULTI_SIMILARITY_LOSS |
6 |
|
SOFTMAX_CROSS_ENTROPY_WITH_NEGATIVE_MINING |
7 |
|
PAIR_WISE_LOSS |
8 |
|
F1_REWEIGHTED_LOSS |
9 |
|
easy_rec/python/protos/mind.proto
Capsule
Field | Type | Label | Description |
max_k |
uint32 |
optional |
max number of high capsules Default: 5 |
max_seq_len |
uint32 |
required |
max behaviour sequence length |
high_dim |
uint32 |
required |
high capsule embedding vector dimension |
num_iters |
uint32 |
optional |
number of EM iterations Default: 3 |
routing_logits_scale |
float |
optional |
routing logits scale Default: 20 |
routing_logits_stddev |
float |
optional |
routing logits initial stddev Default: 1 |
squash_pow |
float |
optional |
squash power Default: 1 |
scale_ratio |
float |
optional |
output ratio Default: 1 |
const_caps_num |
bool |
optional |
constant interest number
in default, use log(seq_len) Default: false |
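The const_caps_num description says the interest number defaults to log(seq_len) unless a constant number is requested. A possible reading is sketched in Python below. The base-2 logarithm and the clamping to the range [1, max_k] follow the dynamic interest number rule of the MIND paper and are assumptions here, since the description above does not state the base or the bounds; num_interests is a hypothetical helper.

```python
import math


def num_interests(seq_len, max_k, const_caps_num=False):
    """Number of high capsules for one user.

    Assumption: when const_caps_num is False the count is
    max(1, min(max_k, floor(log2(seq_len)))).
    """
    if const_caps_num:
        return max_k
    dynamic = int(math.log2(max(seq_len, 1)))
    return max(1, min(max_k, dynamic))
```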
MIND
Field | Type | Label | Description |
pre_capsule_dnn |
DNN |
optional |
preprocessing dnn before entering capsule layer |
user_dnn |
DNN |
required |
dnn layers applied on user_context (non-sequence features) |
concat_dnn |
DNN |
required |
concat user and capsule dnn |
user_seq_combine |
MIND.UserSeqCombineMethod |
optional |
method to combine several user sequences
such as item_ids, category_ids Default: SUM |
item_dnn |
DNN |
required |
dnn layers applied on item features |
capsule_config |
Capsule |
required |
|
simi_pow |
float |
optional |
similarity power; the paper says that the bigger
the better Default: 10 |
simi_func |
Similarity |
optional |
Default: COSINE |
scale_simi |
bool |
optional |
add a layer for scaling the similarity Default: true |
l2_regularization |
float |
required |
Default: 0.0001 |
time_id_fea |
string |
optional |
|
item_id |
string |
optional |
|
ignore_in_batch_neg_sam |
bool |
optional |
Default: false |
max_interests_simi |
float |
optional |
if smaller than 1.0, then a loss will be added to
limit the maximal interest similarities, but
in experiments, adding such a loss leads to a low hit rate. Default: 1 |
MIND.UserSeqCombineMethod
Name | Number | Description |
CONCAT |
0 |
|
SUM |
1 |
|
easy_rec/python/protos/mmoe.proto
ExpertTower
Field | Type | Label | Description |
expert_name |
string |
required |
|
dnn |
DNN |
required |
|
MMoE
Field | Type | Label | Description |
experts |
ExpertTower |
repeated |
deprecated: original mmoe experts config |
expert_dnn |
DNN |
optional |
mmoe expert dnn layer definition |
num_expert |
uint32 |
optional |
number of mmoe experts Default: 0 |
task_towers |
TaskTower |
repeated |
task tower |
l2_regularization |
float |
required |
l2 regularization Default: 0.0001 |
easy_rec/python/protos/multi_tower.proto
BSTTower
Field | Type | Label | Description |
input |
string |
required |
|
seq_len |
uint32 |
required |
Default: 5 |
multi_head_size |
uint32 |
required |
Default: 4 |
DINTower
Field | Type | Label | Description |
input |
string |
required |
|
dnn |
DNN |
required |
|
MultiTower
Field | Type | Label | Description |
towers |
Tower |
repeated |
|
final_dnn |
DNN |
required |
|
l2_regularization |
float |
required |
Default: 0.0001 |
din_towers |
DINTower |
repeated |
|
bst_towers |
BSTTower |
repeated |
|
easy_rec/python/protos/multi_tower_recall.proto
Top
MultiTowerRecall
Field | Type | Label | Description |
user_tower |
RecallTower |
required |
|
item_tower |
RecallTower |
required |
|
l2_regularization |
float |
required |
Default: 0.0001 |
final_dnn |
DNN |
required |
|
ignore_in_batch_neg_sam |
bool |
required |
Default: false |
RecallTower
Field | Type | Label | Description |
dnn |
DNN |
required |
|
easy_rec/python/protos/optimizer.proto
Top
AdagradOptimizer
Configuration message for the AdagradOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/AdagradOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
AdamAsyncOptimizer
Only available on pai-tf; performs better than AdamOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
beta1 |
float |
optional |
Default: 0.9 |
beta2 |
float |
optional |
Default: 0.999 |
AdamAsyncWOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
weight_decay |
float |
optional |
Default: 1e-06 |
beta1 |
float |
optional |
Default: 0.9 |
beta2 |
float |
optional |
Default: 0.999 |
AdamOptimizer
Configuration message for the AdamOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
beta1 |
float |
optional |
Default: 0.9 |
beta2 |
float |
optional |
Default: 0.999 |
AdamWOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
weight_decay |
float |
optional |
Default: 1e-06 |
beta1 |
float |
optional |
Default: 0.9 |
beta2 |
float |
optional |
Default: 0.999 |
ConstantLearningRate
Configuration message for a constant learning rate.
Field | Type | Label | Description |
learning_rate |
float |
optional |
Default: 0.002 |
CosineDecayLearningRate
Configuration message for a cosine decaying learning rate as defined in
utils/learning_schedules.py
Field | Type | Label | Description |
learning_rate_base |
float |
optional |
Default: 0.002 |
total_steps |
uint32 |
optional |
Default: 4000000 |
warmup_learning_rate |
float |
optional |
Default: 0.0002 |
warmup_steps |
uint32 |
optional |
Default: 10000 |
hold_base_rate_steps |
uint32 |
optional |
Default: 0 |
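The fields above describe a cosine schedule with a linear warmup phase. The following is a minimal sketch of how such a schedule is commonly computed. The function name `cosine_decay_with_warmup` and the exact formula are assumptions for illustration and are not taken from utils/learning_schedules.py; only the parameter names and defaults come from the table.

```python
import math


def cosine_decay_with_warmup(step,
                             learning_rate_base=0.002,
                             total_steps=4000000,
                             warmup_learning_rate=0.0002,
                             warmup_steps=10000,
                             hold_base_rate_steps=0):
    """Assumed formula; defaults are copied from the table above."""
    if step < warmup_steps:
        # Linear ramp from warmup_learning_rate up to learning_rate_base.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    if step <= warmup_steps + hold_base_rate_steps:
        # Optionally hold the base rate before the decay starts.
        return learning_rate_base
    if step >= total_steps:
        return 0.0
    # Fraction of the decay phase that has elapsed, in (0, 1).
    progress = (step - warmup_steps - hold_base_rate_steps) / float(
        total_steps - warmup_steps - hold_base_rate_steps)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))
```

With the defaults, the rate starts at 0.0002, reaches 0.002 at step 10000, and decays toward 0 at step 4000000.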
ExponentialDecayLearningRate
Configuration message for an exponentially decaying learning rate.
See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
Field | Type | Label | Description |
initial_learning_rate |
float |
optional |
Default: 0.002 |
decay_steps |
uint32 |
optional |
Default: 4000000 |
decay_factor |
float |
optional |
Default: 0.95 |
staircase |
bool |
optional |
Default: true |
burnin_learning_rate |
float |
optional |
Default: 0 |
burnin_steps |
uint32 |
optional |
Default: 0 |
min_learning_rate |
float |
optional |
Default: 0 |
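As a rough illustration of how the fields above could interact, the sketch below computes an exponentially decayed rate. The function name `exponential_decay` is hypothetical, and the handling of burn-in (a constant rate for the first burnin_steps) and of min_learning_rate (a lower bound) are assumptions; the table itself only gives names and defaults.

```python
import math


def exponential_decay(step,
                      initial_learning_rate=0.002,
                      decay_steps=4000000,
                      decay_factor=0.95,
                      staircase=True,
                      burnin_learning_rate=0.0,
                      burnin_steps=0,
                      min_learning_rate=0.0):
    """Assumed semantics; defaults are copied from the table above."""
    if step < burnin_steps:
        # Assumption: a constant rate is used during burn-in.
        return burnin_learning_rate
    exponent = step / float(decay_steps)
    if staircase:
        # Decay in discrete jumps, once every decay_steps steps.
        exponent = math.floor(exponent)
    rate = initial_learning_rate * decay_factor ** exponent
    # Assumption: min_learning_rate acts as a floor.
    return max(rate, min_learning_rate)
```

With staircase enabled and the defaults, the rate stays at 0.002 until step 4000000 and then drops to 0.002 * 0.95.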
FtrlOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
optional float learning_rate = 1 [default=1e-4]; |
learning_rate_power |
float |
optional |
Default: -0.5 |
initial_accumulator_value |
float |
optional |
Default: 0.1 |
l1_reg |
float |
optional |
Default: 0 |
l2_reg |
float |
optional |
Default: 0 |
l2_shrinkage_reg |
float |
optional |
Default: 0 |
LearningRate
Configuration message for optimizer learning rate.
ManualStepLearningRate
Configuration message for a manually defined learning rate schedule.
ManualStepLearningRate.LearningRateSchedule
Field | Type | Label | Description |
step |
uint32 |
optional |
|
learning_rate |
float |
optional |
Default: 0.002 |
MomentumOptimizer
Configuration message for the MomentumOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
momentum_optimizer_value |
float |
optional |
Default: 0.9 |
MomentumWOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
weight_decay |
float |
optional |
Default: 1e-06 |
momentum_optimizer_value |
float |
optional |
Default: 0.9 |
Optimizer
Top level optimizer message.
PolyDecayLearningRate
Configuration message for a poly decaying learning rate.
See https://www.tensorflow.org/api_docs/python/tf/train/polynomial_decay.
Field | Type | Label | Description |
learning_rate_base |
float |
required |
|
total_steps |
int64 |
required |
|
power |
float |
required |
|
end_learning_rate |
float |
optional |
Default: 0 |
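The referenced tf.train.polynomial_decay computes the rate as (base - end) * (1 - step / total) ^ power + end, with the step clipped to the total. A minimal sketch follows; mapping learning_rate_base and total_steps onto that formula is an assumption, and the function name `polynomial_decay` is only illustrative.

```python
def polynomial_decay(step, learning_rate_base, total_steps, power,
                     end_learning_rate=0.0):
    """Polynomial decay without cycling (assumed field mapping)."""
    # After total_steps the rate stays at end_learning_rate.
    step = min(step, total_steps)
    remaining = 1.0 - step / float(total_steps)
    return ((learning_rate_base - end_learning_rate) * remaining ** power
            + end_learning_rate)
```

For example, with power set to 1.0 the rate falls linearly from learning_rate_base to end_learning_rate over total_steps.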
RMSPropOptimizer
Configuration message for the RMSPropOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Field | Type | Label | Description |
learning_rate |
LearningRate |
optional |
|
momentum_optimizer_value |
float |
optional |
Default: 0.9 |
decay |
float |
optional |
Default: 0.9 |
epsilon |
float |
optional |
Default: 1 |
Field | Type | Label | Description |
learning_rate_base |
float |
required |
|
hidden_size |
int32 |
required |
|
warmup_steps |
int32 |
required |
|
step_scaling_rate |
float |
optional |
Default: 1 |
easy_rec/python/protos/pipeline.proto
Top
EasyRecConfig
Field | Type | Label | Description |
train_input_path |
string |
optional |
|
kafka_train_input |
KafkaServer |
optional |
|
datahub_train_input |
DatahubServer |
optional |
|
hive_train_input |
HiveConfig |
optional |
|
binary_train_input |
BinaryDataInput |
optional |
|
eval_input_path |
string |
optional |
|
kafka_eval_input |
KafkaServer |
optional |
|
datahub_eval_input |
DatahubServer |
optional |
|
hive_eval_input |
HiveConfig |
optional |
|
binary_eval_input |
BinaryDataInput |
optional |
|
model_dir |
string |
required |
|
train_config |
TrainConfig |
optional |
train config, including optimizer, weight decay, num_steps and so on |
eval_config |
EvalConfig |
optional |
|
data_config |
DatasetConfig |
optional |
|
feature_configs |
FeatureConfig |
repeated |
for compatibility |
feature_config |
FeatureConfigV2 |
optional |
|
model_config |
EasyRecModel |
required |
recommendation model config |
export_config |
ExportConfig |
optional |
|
fg_json_path |
string |
optional |
JSON file (RTP FG) that defines input data and features:
* In easy_rec.python.utils.fg_util.load_fg_json_to_config,
data_config and feature_config are generated
from fg_json.
* After generation, a prefix '!' is added:
fg_json_path = '!' + fg_json_path
The prefix indicates that the config has already been updated and must not
be updated again. This makes the load_fg_json_to_config
function reentrant.
This step runs before edit_config_json so that it takes effect. |
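The '!' prefix described above works as an "already processed" marker. The sketch below shows the idea only; it is not the code of load_fg_json_to_config. The helper name `mark_fg_json_loaded`, the plain dict standing in for the pipeline config, and the `generate_fn` callback are all hypothetical.

```python
def mark_fg_json_loaded(config, generate_fn):
    """Run generate_fn at most once per config (illustrative sketch).

    config: dict with an optional 'fg_json_path' entry (stand-in for the
        real pipeline config).
    generate_fn: hypothetical callable that fills data_config and
        feature_config from the JSON file at the given path.
    """
    path = config.get('fg_json_path', '')
    if not path or path.startswith('!'):
        # Nothing configured, or the config was already updated.
        return config
    generate_fn(config, path)
    # Mark the config so that a second call becomes a no-op.
    config['fg_json_path'] = '!' + path
    return config
```

Calling the helper twice on the same config invokes `generate_fn` only the first time.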
easy_rec/python/protos/ple.proto
Top
ExtractionNetwork
Field | Type | Label | Description |
network_name |
string |
required |
|
expert_num_per_task |
uint32 |
required |
number of experts per task |
share_num |
uint32 |
optional |
number of shared experts.
Not needed for the last extraction_network |
task_expert_net |
DNN |
required |
dnn network of experts per task |
share_expert_net |
DNN |
optional |
dnn network of the shared experts.
Not needed for the last extraction_network |
PLE
Field | Type | Label | Description |
extraction_networks |
ExtractionNetwork |
repeated |
extraction network |
task_towers |
TaskTower |
repeated |
task tower |
l2_regularization |
float |
optional |
l2 regularization Default: 0.0001 |
easy_rec/python/protos/rocket_launching.proto
Top
RocketLaunching
Field | Type | Label | Description |
share_dnn |
DNN |
required |
|
booster_dnn |
DNN |
required |
|
light_dnn |
DNN |
required |
|
l2_regularization |
float |
optional |
Default: 0.0001 |
feature_based_distillation |
bool |
optional |
Default: false |
feature_distillation_function |
Similarity |
optional |
COSINE = 0; EUCLID = 1; Default: COSINE |
easy_rec/python/protos/simi.proto
Top
Similarity
Name | Number | Description |
COSINE |
0 |
|
INNER_PRODUCT |
1 |
|
EUCLID |
2 |
|
easy_rec/python/protos/simple_multi_task.proto
Top
SimpleMultiTask
Field | Type | Label | Description |
task_towers |
TaskTower |
repeated |
|
l2_regularization |
float |
required |
Default: 0.0001 |
easy_rec/python/protos/tf_predict.proto
Top
ArrayProto
Protocol buffer representing an array
Field | Type | Label | Description |
dtype |
ArrayDataType |
|
Data Type. |
array_shape |
ArrayShape |
|
Shape of the array. |
float_val |
float |
repeated |
DT_FLOAT. |
double_val |
double |
repeated |
DT_DOUBLE. |
int_val |
int32 |
repeated |
DT_INT32, DT_INT16, DT_INT8, DT_UINT8. |
string_val |
bytes |
repeated |
DT_STRING. |
int64_val |
int64 |
repeated |
DT_INT64. |
bool_val |
bool |
repeated |
DT_BOOL. |
ArrayShape
Dimensions of an array
Field | Type | Label | Description |
dim |
int64 |
repeated |
|
PredictRequest
PredictRequest specifies which TensorFlow model to run, as well as
how inputs are mapped to tensors and how outputs are filtered before
returning to user.
Field | Type | Label | Description |
signature_name |
string |
|
A named signature to evaluate. If unspecified, the default signature
will be used |
inputs |
PredictRequest.InputsEntry |
repeated |
Input tensors.
Names of input tensor are alias names. The mapping from aliases to real
input tensor names is expected to be stored as named generic signature
under the key "inputs" in the model export.
Each alias listed in a generic signature named "inputs" should be provided
exactly once in order to run the prediction. |
output_filter |
string |
repeated |
Output filter.
Names specified are alias names. The mapping from aliases to real output
tensor names is expected to be stored as named generic signature under
the key "outputs" in the model export.
Only tensors specified here will be run/fetched and returned, with the
exception that when none is specified, all tensors specified in the
named signature will be run/fetched and returned. |
debug_level |
int32 |
|
|
PredictRequest.InputsEntry
PredictResponse
Response for PredictRequest on successful run.
PredictResponse.OutputsEntry
ArrayDataType
Name | Number | Description |
DT_INVALID |
0 |
Not a legal value for DataType. Used to indicate a DataType field
has not been set. |
DT_FLOAT |
1 |
Data types that all computation devices are expected to be
capable to support. |
DT_DOUBLE |
2 |
|
DT_INT32 |
3 |
|
DT_UINT8 |
4 |
|
DT_INT16 |
5 |
|
DT_INT8 |
6 |
|
DT_STRING |
7 |
|
DT_COMPLEX64 |
8 |
Single-precision complex |
DT_INT64 |
9 |
|
DT_BOOL |
10 |
|
DT_QINT8 |
11 |
Quantized int8 |
DT_QUINT8 |
12 |
Quantized uint8 |
DT_QINT32 |
13 |
Quantized int32 |
DT_BFLOAT16 |
14 |
Float32 truncated to 16 bits. Only for cast ops. |
DT_QINT16 |
15 |
Quantized int16 |
DT_QUINT16 |
16 |
Quantized uint16 |
DT_UINT16 |
17 |
|
DT_COMPLEX128 |
18 |
Double-precision complex |
DT_HALF |
19 |
|
DT_RESOURCE |
20 |
|
DT_VARIANT |
21 |
Arbitrary C++ data types |
easy_rec/python/protos/tower.proto
Top
BayesTaskTower
Field | Type | Label | Description |
tower_name |
string |
required |
task name for the task tower |
label_name |
string |
optional |
label for the task, default is label_fields by order |
metrics_set |
EvalMetrics |
repeated |
metrics for the task |
loss_type |
LossType |
optional |
loss for the task Default: CLASSIFICATION |
num_class |
uint32 |
optional |
num_class for multi-class classification loss Default: 1 |
dnn |
DNN |
optional |
task specific dnn |
relation_tower_names |
string |
repeated |
related tower names |
relation_dnn |
DNN |
optional |
relation dnn |
weight |
float |
optional |
training loss weights Default: 1 |
task_space_indicator_label |
string |
optional |
label name indicating the sample space of the task tower |
in_task_space_weight |
float |
optional |
the loss weight for samples inside the task space Default: 1 |
out_task_space_weight |
float |
optional |
the loss weight for samples outside the task space Default: 1 |
losses |
Loss |
repeated |
level for prediction
required uint32 prediction_level = 13;
prediction weights
optional float prediction_weight = 14 [default = 1.0];
multiple losses |
TaskTower
Field | Type | Label | Description |
tower_name |
string |
required |
task name for the task tower |
label_name |
string |
optional |
label for the task, default is label_fields by order |
metrics_set |
EvalMetrics |
repeated |
metrics for the task |
loss_type |
LossType |
optional |
loss for the task Default: CLASSIFICATION |
num_class |
uint32 |
optional |
num_class for multi-class classification loss Default: 1 |
dnn |
DNN |
optional |
task specific dnn |
weight |
float |
optional |
training loss weights Default: 1 |
task_space_indicator_label |
string |
optional |
label name indicating the sample space of the task tower |
in_task_space_weight |
float |
optional |
the loss weight for samples inside the task space Default: 1 |
out_task_space_weight |
float |
optional |
the loss weight for samples outside the task space Default: 1 |
losses |
Loss |
repeated |
multiple losses |
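To illustrate how task_space_indicator_label, in_task_space_weight and out_task_space_weight could combine, the sketch below derives a per-sample loss weight. It assumes that an indicator value greater than 0 means the sample is inside the task space; both that rule and the function name `task_space_loss_weights` are assumptions, not taken from the proto.

```python
def task_space_loss_weights(indicator_labels,
                            in_task_space_weight=1.0,
                            out_task_space_weight=1.0):
    """Return one loss weight per sample (assumed rule: label > 0 is 'in')."""
    return [
        in_task_space_weight if label > 0 else out_task_space_weight
        for label in indicator_labels
    ]


# Example: down-weight samples outside the task space.
weights = task_space_loss_weights([1, 0, 1], 1.0, 0.2)
```

With both weights left at their default of 1, every sample contributes equally, which matches the behavior one would expect when no task space is configured.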
Tower
Field | Type | Label | Description |
input |
string |
required |
|
dnn |
DNN |
required |
|
easy_rec/python/protos/train.proto
Top
IncrementSaveConfig
IncrementSaveConfig.Datahub
IncrementSaveConfig.Datahub.Consumer
Field | Type | Label | Description |
offset |
int64 |
optional |
Default: 0 |
timeout |
int32 |
optional |
Default: 600 |
IncrementSaveConfig.File
Field | Type | Label | Description |
incr_save_dir |
string |
optional |
Default: incr_save |
relative |
bool |
optional |
relative to model_dir Default: true |
mount_path |
string |
optional |
for online inference, set storage.mount_path to mount_path;
otherwise the online service will fail Default: /home/admin/docker_ml/workspace/incr_save/ |
IncrementSaveConfig.Kafka
IncrementSaveConfig.Kafka.Consumer
Field | Type | Label | Description |
config_topic |
string |
optional |
|
config_global |
string |
optional |
|
offset |
int64 |
optional |
Default: 0 |
timeout |
int32 |
optional |
Default: 600 |
TrainConfig
Message for configuring EasyRecModel training jobs (train.py).
Next id: 25
Field | Type | Label | Description |
optimizer_config |
Optimizer |
repeated |
optimizer options |
gradient_clipping_by_norm |
float |
optional |
If greater than 0, clips gradients by this value. Default: 0 |
num_steps |
uint32 |
optional |
Number of steps to train the models: if 0, will train the model
indefinitely. Default: 0 |
fine_tune_checkpoint |
string |
optional |
Checkpoint to restore variables from. |
fine_tune_ckpt_var_map |
string |
optional |
|
sync_replicas |
bool |
optional |
Whether to synchronize replicas during training.
If so, a SyncReplicasOptimizer is built Default: true |
sparse_accumulator_type |
string |
optional |
only takes effect on pai-tf when sync_replicas is set;
options are:
raw, hash, multi_map, list, parallel
in general, multi_map runs faster than the other options. Default: multi_map |
startup_delay_steps |
float |
optional |
Number of training steps between replica startup.
This flag must be set to 0 if sync_replicas is set to true. Default: 15 |
save_checkpoints_steps |
uint32 |
optional |
Step interval for saving checkpoint Default: 1000 |
save_checkpoints_secs |
uint32 |
optional |
Seconds interval for saving checkpoint |
keep_checkpoint_max |
uint32 |
optional |
Max checkpoints to keep Default: 10 |
save_summary_steps |
uint32 |
optional |
Save summaries every this many steps. Default: 1000 |
log_step_count_steps |
uint32 |
optional |
The frequency, in steps, at which global step/sec and the loss are logged during training. Default: 10 |
is_profiling |
bool |
optional |
profiling or not Default: false |
force_restore_shape_compatible |
bool |
optional |
if variable shape is incompatible, clip or pad variables in checkpoint Default: false |
train_distribute |
DistributionStrategy |
optional |
DistributionStrategy; available values are 'mirrored', 'collective' and 'ess'
- mirrored: MirroredStrategy, single machine with multiple devices;
- collective: CollectiveAllReduceStrategy, multiple machines with multiple devices. Default: NoStrategy |
num_gpus_per_worker |
int32 |
optional |
Number of gpus per machine Default: 1 |
summary_model_vars |
bool |
optional |
summary model variables or not Default: false |
protocol |
string |
optional |
distributed training protocol [grpc++ | star_server]
grpc++: https://help.aliyun.com/document_detail/173157.html?spm=5176.10695662.1996646101.searchclickresult.3ebf450evuaPT3
star_server: https://help.aliyun.com/document_detail/173154.html?spm=a2c4g.11186623.6.627.39ad7e3342KOX4 |
inter_op_parallelism_threads |
int32 |
optional |
inter_op_parallelism_threads Default: 0 |
intra_op_parallelism_threads |
int32 |
optional |
intra_op_parallelism_threads Default: 0 |
tensor_fuse |
bool |
optional |
tensor fusion on PAI-TF Default: false |
write_graph |
bool |
optional |
write graph into graph.pbtxt and summary or not Default: true |
freeze_gradient |
string |
repeated |
match variable patterns to freeze |
incr_save_config |
IncrementSaveConfig |
optional |
increment save config |
enable_oss_stop_signal |
bool |
optional |
enable the OSS stop signal:
stop training by creating OSS_STOP_SIGNAL under model_dir Default: false |
dead_line |
string |
optional |
stop training after dead_line time, format:
20220508 23:59:59 |
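The dead_line example above corresponds to the strptime format `%Y%m%d %H:%M:%S`. A minimal sketch of how such a deadline could be checked follows; the helper name `past_dead_line` and the use of naive local time are assumptions, not a description of the training loop.

```python
from datetime import datetime

# Format inferred from the documented example "20220508 23:59:59".
DEAD_LINE_FORMAT = '%Y%m%d %H:%M:%S'


def past_dead_line(dead_line, now=None):
    """Return True if `now` is later than the configured dead_line."""
    if not dead_line:
        # An unset dead_line never stops training.
        return False
    limit = datetime.strptime(dead_line, DEAD_LINE_FORMAT)
    if now is None:
        now = datetime.now()  # assumption: naive local time
    return now > limit
```

Passing `now` explicitly keeps the helper easy to test with fixed timestamps.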
DistributionStrategy
Name | Number | Description |
NoStrategy |
0 |
use old SyncReplicasOptimizer for ParameterServer training |
PSStrategy |
1 |
PSStrategy with multiple GPUs on one node does not work
on pai-tf; it only works on TF >= 1.15 |
MirroredStrategy |
2 |
only works on PaiTF or TF >= 1.15;
single-worker, multi-GPU mode |
CollectiveAllReduceStrategy |
3 |
Deprecated |
ExascaleStrategy |
4 |
currently not working well |
MultiWorkerMirroredStrategy |
5 |
multi-worker, multi-GPU mode
see tf.distribute.experimental.MultiWorkerMirroredStrategy |
easy_rec/python/protos/uniter.proto
Top
Uniter
Field | Type | Label | Description |
config |
UniterTower |
required |
|
final_dnn |
DNN |
required |
|
easy_rec/python/protos/variational_dropout.proto
Top
VariationalDropoutLayer
Field | Type | Label | Description |
regularization_lambda |
float |
optional |
regularization coefficient lambda Default: 0.01 |
embedding_wise_variational_dropout |
bool |
optional |
variational_dropout dimension Default: false |
easy_rec/python/protos/wide_and_deep.proto
Top
WideAndDeep
Field | Type | Label | Description |
wide_output_dim |
uint32 |
required |
Default: 1 |
dnn |
DNN |
required |
|
final_dnn |
DNN |
optional |
if set, the outputs of the dnn and wide parts are concatenated and
passed to the final_dnn; otherwise, they are summed |
l2_regularization |
float |
optional |
Default: 0.0001 |
Scalar Value Types
.proto Type | Notes | C++ Type | Java Type | Python Type |
double |
|
double |
double |
float |
float |
|
float |
float |
float |
int32 |
Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. |
int32 |
int |
int |
int64 |
Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. |
int64 |
long |
int/long |
uint32 |
Uses variable-length encoding. |
uint32 |
int |
int/long |
uint64 |
Uses variable-length encoding. |
uint64 |
long |
int/long |
sint32 |
Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. |
int32 |
int |
int |
sint64 |
Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. |
int64 |
long |
int/long |
fixed32 |
Always four bytes. More efficient than uint32 if values are often greater than 2^28. |
uint32 |
int |
int |
fixed64 |
Always eight bytes. More efficient than uint64 if values are often greater than 2^56. |
uint64 |
long |
int/long |
sfixed32 |
Always four bytes. |
int32 |
int |
int |
sfixed64 |
Always eight bytes. |
int64 |
long |
int/long |
bool |
|
bool |
boolean |
boolean |
string |
A string must always contain UTF-8 encoded or 7-bit ASCII text. |
string |
String |
str/unicode |
bytes |
May contain any arbitrary sequence of bytes. |
string |
ByteString |
str |
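The notes on variable-length encoding in the table above can be made concrete with a short sketch of the protobuf wire encodings: base-128 varints for int32/uint32 and ZigZag for sint32. The helper names are illustrative and not part of any protobuf library API.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            # More bytes follow: set the continuation bit.
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value):
    """int32: negative values are sign-extended to 64 bits first."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value):
    """Map signed to unsigned: 0, -1, 1, -2 become 0, 1, 2, 3."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value):
    """sint32: ZigZag mapping followed by varint encoding."""
    return encode_varint(zigzag32(value))
```

Under this encoding, -1 takes ten bytes as an int32 but one byte as a sint32, and a uint32 needs five bytes once the value reaches 2^28, which is why fixed32 (always four bytes) is the better choice for values that are often that large.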