sequence_feature

功能介绍

⽤户的历史⾏为也是⼀个很重要的 feature。历史⾏为通常是⼀个序列,例如点击序列、购买序列等,组成这个序列的实体可能是商品本身。

配置方法

例如我们需要对⽤户的点击序列进⾏fg,序列⻓度为50,每个序列提取item_id, price和ts特征,其中ts=请求时间(request_time) - 用户行为时间(event_time)。 配置如下:

{
    "features":[
        {
            "feature_type":"raw_feature",
            "feature_name":"feat0",
            "expression":"user:feat0"
        },
        ...
        {
            "sequence_name":"click_50_seq",
            "sequence_column":"click_50_seq",
            "sequence_length":10,
            "sequence_delim":";",
            "attribute_delim":"#",
            "sequence_table":"item",
            "sequence_pk":"user:click_50_seq",
            "features":[
                {
                    "feature_name":"item_id",
                    "feature_type":"id_feature",
                    "value_type":"String",
                    "expression":"item:item_id"
                },
                {
                    "feature_name":"price",
                    "feature_type":"raw_feature",
                    "expression":"item:price"
                },
                {
                    "feature_name":"ts",
                    "feature_type":"raw_feature",
                    "expression":"user:ts"
                }
            ]
        }
    ]
}
  • sequence_name: sequence名称

  • sequence_column: sequence输出名成

  • sequence_length: sequence的最大长度

  • sequence_delim: sequence元素之间的分隔符

  • attribute_delim: sequence元素内部各个属性之间的分隔符, 仅离线需要

  • sequence_pk: sequence primary key, 主键, 如user:click_50_seq, 里面保存了user点击的最近的50个itemId;

  • features: sequence的sideinfo, 包含item的静态属性值和行为时间信息等

在线 FG

⽀持两种⽅式获取⾏为sideinfo信息,⼀种是从EasyRec Processor的item cache获取sideinfo信息, 以sequence_pk 配置的字段为主键,EasyRec Processor 从item cache中查找item的属性信息; 另⼀种⽤户在请求中填充对应的字段值, 如上述配置中的”ts”字段, 其含义是(request_time - event_time), 即推荐请求时间 - 用户行为时间, 这个是随请求时间变化的, 因此需要从请求中获取:

user_features {
  key: "click_50_seq"
  value {
    string_feature: "9008721;34926279;22487529;73379;840804;911247;31999202;7421440;4911004;40866551"
  }
}

user_features {
  key: "click__ts"
  value {
    string_feature: "23;113;401363;401369;401375;401405;486678;486803;486922;486969"
  }
}