2024 Annual Reflection: Working on Content at Taobao

2025-02-22T13:41:15.000Z

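This post was written in February 2025. For one thing, I recently read several Google papers on short-video recommendation; for another, just before the Lunar New Year I chatted with a former colleague about what they are working on in their new job. Both broadened my thinking, so I am taking the chance to dig up my past work and give it another "post-mortem flogging".

Throughout this post, "content" refers only to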

LiRank: Industrial Large Scale Ranking Models at LinkedIn

2024-09-08T13:51:52.000Z

TL;DR

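This paper is the LinkedIn modeling team's "year-end report" on model iteration, covering the team's hands-on experience with a range of techniques for improving model metrics.

Abstract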

We present LiRank, a

3 Years in Hangzhou

2024-05-25T08:57:50.000Z

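Three years have gone by in the blink of an eye. I should still write something down, however scattered; most of it is taken from my 2023 retrospective.

On "Work"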

An Empirical Study of Selection Bias in Pinterest Ads Retrieval

2023-08-19T17:09:03.000Z

Abstract

Data selection bias has been a long-lasting challenge in the machine learning domain, especially in

Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data

2023-08-06T13:21:28.000Z

Abstract

The Click-Through Rate (CTR) prediction task is critical in industrial recommender systems, where

Fresh Content Needs More Attention: Multi-funnel Fresh Content Recommendation

2023-07-02T02:47:37.000Z

Abstract

Recommendation system serves as a conduit connecting users to an incredibly large, diverse and ever

Attention is all you need

2023-07-01T10:24:02.000Z

Abstract

The dominant sequence transduction models are based on complex recurrent or convolutional neural

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models

2022-10-07T13:37:16.000Z

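A note up front:

I originally meant to simply excerpt the highlights of this paper, but halfway through I felt it deserved more than that: it ought to rank alongside Wide&Deep. If Wide&Deep told the industry that recommendation is all about Embedding and E2E, then this paper may be telling everyone that CTR models are all about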

Impression Pacing for Jobs Marketplace at LinkedIn

2022-09-15T02:51:22.000Z

TL;DR

This paper was published by LinkedIn at CIKM 2020. It performs pacing based on impressions to control budget spend and help advertisers reach a broader audience.

What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work?

2022-07-03T04:39:13.000Z

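TL;DR

This paper reads like a course-project write-up and is probably not the final draft: it has many typos and is not easy to follow, but Stefan Wager is one of the authors. It mainly discusses two common forest models for estimating heterogeneous treatment effects (HTE): causal forest and model-based for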

1 Year in HangZhou

2022-06-18T14:51:20.000Z

A Unified Solution to Constrained Bidding in Online Display Advertising

2022-06-18T03:26:45.000Z

TL;DR

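This paper proposes a general auto-bidding framework that accommodates different advertisers' bidding objectives and constraints. The bid-control algorithm is implemented with DDPG, and it has been deployed on the Taobao advertising platform.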

Ads Allocation in Feed via Constrained Optimization

2022-06-10T14:38:43.000Z

TL;DR

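A LinkedIn paper published at KDD 2020. It addresses how to blend organic recommendations and ads in a feed. The paper formulates this as a constrained optimization problem and proposes a merge-sort-style approach.

This method, along with its derived and simplified versions, is very common in day-to-day work,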
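The merge-style blending described above can be sketched roughly as follows. This is a minimal illustration, not the paper's exact formulation: the function name `blend_feed`, the `shadow_price` multiplier on ad revenue, and the `min_gap` spacing rule are all assumptions made for this example.

```python
from typing import List, Tuple


def blend_feed(
    organic: List[Tuple[str, float]],  # (item_id, engagement score), sorted descending
    ads: List[Tuple[str, float]],      # (ad_id, expected revenue), sorted descending
    shadow_price: float,               # hypothetical weight converting revenue to engagement units
    min_gap: int,                      # hypothetical minimum number of organic items between ads
) -> List[str]:
    """Merge two ranked lists, taking an ad only when it wins and the gap rule allows it."""
    feed: List[str] = []
    i = j = 0
    since_last_ad = min_gap  # lets an ad compete for the very first slot
    while i < len(organic) or j < len(ads):
        ad_allowed = j < len(ads) and since_last_ad >= min_gap
        ad_wins = ad_allowed and (
            i >= len(organic) or shadow_price * ads[j][1] > organic[i][1]
        )
        if ad_wins:
            feed.append(ads[j][0])
            j += 1
            since_last_ad = 0
        elif i < len(organic):
            feed.append(organic[i][0])
            i += 1
            since_last_ad += 1
        else:
            break  # only ads remain and the gap rule blocks them
    return feed
```

Because both inputs are already ranked, a single pass with two pointers is enough, which is why this kind of blending is cheap to run online.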

Bid Optimization by Multivariable Control in Display Advertising

2022-06-06T13:38:19.000Z

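TL;DR

Optimal ad bidding under a budget constraint plus an efficiency-cost constraint, with two PID controllers adjusting the hyperparameters (the dual variables) online.

Abstract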
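To make the dual-PID idea concrete, here is a minimal sketch using a textbook discrete PID controller. The names `PID` and `update_duals`, the gains, the relative-error signals, and the multiplicative (exponential) update are assumptions for illustration; the paper's actual control signals and bid formula are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PID:
    """Textbook discrete PID controller; gains are illustrative only."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def update_duals(
    p: float, q: float,
    actual_spend: float, planned_spend: float,
    actual_cpc: float, target_cpc: float,
    budget_pid: PID, cost_pid: PID,
) -> Tuple[float, float]:
    """Raise a dual variable when its constraint is being violated, lower it otherwise."""
    spend_error = (actual_spend - planned_spend) / planned_spend  # > 0 means overspending
    cpc_error = (actual_cpc - target_cpc) / target_cpc            # > 0 means cost too high
    # Exponential updates keep both dual variables strictly positive.
    p_new = p * math.exp(budget_pid.step(spend_error))
    q_new = q * math.exp(cost_pid.step(cpc_error))
    return p_new, q_new
```

In this sketch a larger dual variable means the corresponding constraint is tighter, so a bid function that decreases in `p` and `q` would slow spending when the campaign runs ahead of plan.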

Real-Time Bidding (RTB) is an important

Optimized Cost per Click in Taobao Display Advertising

2022-05-29T13:07:35.000Z

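TL;DR

A very early piece of Alimama's work on ad bidding. It contains plenty of detail and makes a good baseline.

That said, many of its methods and solutions look dated today. I will gradually write about how bidding problems have been handled in recent years.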

Smart Pacing for Effective Online Ad Campaign Optimization

2022-05-21T07:04:53.000Z

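This is my second walkthrough of a classic paper on smooth ad budget pacing; it comes from Yahoo, KDD 2015.

Like the earlier LinkedIn paper, it controls budget through probabilistic throttling. It also considers how to smooth budget spend both with and without performance guarantees.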
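Probabilistic throttling, as mentioned above, can be sketched in a few lines. This is a generic illustration under stated assumptions, not either paper's exact rule: the pass-through rate `ptr`, the fixed adjustment `rate`, and the `floor` are hypothetical names and values chosen for this example.

```python
import random


def adjust_pass_through_rate(
    ptr: float,
    actual_spend: float,
    planned_spend: float,
    rate: float = 0.1,    # hypothetical adjustment step
    floor: float = 0.01,  # hypothetical lower bound so the campaign keeps some traffic
) -> float:
    """Lower the rate when spend is ahead of plan, raise it otherwise."""
    if actual_spend > planned_spend:
        ptr *= 1.0 - rate
    else:
        ptr *= 1.0 + rate
    return min(1.0, max(floor, ptr))


def should_participate(ptr: float, rng: random.Random) -> bool:
    """Let the campaign enter this auction with probability ptr."""
    return rng.random() < ptr
```

A typical loop would call `adjust_pass_through_rate` once per time slot and `should_participate` once per ad request, so spend is smoothed by changing how often the campaign bids rather than how much it bids.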

Budget Pacing for Targeted Online Advertisements at LinkedIn

2022-05-21T04:48:17.000Z

I have recently been cramming ads-related papers. This one is among the classics on smooth budget pacing, published by the LinkedIn team at KDD 2014.

Abstract

Targeted online advertising is a prime

Field-aware Calibration: A Simple and Empirically Strong Method for Reliable Probabilistic Predictions

2022-05-08T05:19:30.000Z

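I will soon be working on ad bidding, so I have been cramming ads-related knowledge. This paper, from Tencent at WWW 2020, addresses probability calibration in advertising scenarios.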

A Deep Probabilistic Model for Customer Lifetime Value Prediction

2022-04-26T11:41:12.000Z

TL;DR

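Proposes a new regression loss for modeling users' long-term value. It addresses the fact that the LTV distribution is not Gaussian: part of the mass sits at zero, and the rest follows a log-normal distribution.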
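A per-example sketch of such a zero-inflated log-normal loss, assuming the model outputs a purchase probability `p` and log-normal parameters `mu` and `sigma` (these names and the clipping constant are choices made for this example):

```python
import math


def ziln_nll(y: float, p: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of one label under a zero-inflated log-normal.

    y     : observed LTV (0 means no purchase)
    p     : predicted probability that y > 0
    mu    : predicted mean of log(y) given y > 0
    sigma : predicted standard deviation of log(y) given y > 0, must be > 0
    """
    eps = 1e-12  # guards against log(0)
    if y <= 0:
        return -math.log(max(1.0 - p, eps))
    classification = -math.log(max(p, eps))
    lognormal = (
        math.log(y * sigma * math.sqrt(2.0 * math.pi))
        + (math.log(y) - mu) ** 2 / (2.0 * sigma ** 2)
    )
    return classification + lognormal


def ziln_mean(p: float, mu: float, sigma: float) -> float:
    """Expected LTV: probability of purchase times the log-normal mean."""
    return p * math.exp(mu + 0.5 * sigma ** 2)
```

The loss splits into a classification term for "zero or not" and a regression term that is only active for positive labels, which matches the two-part shape of the distribution described above.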

ate_decomposition

2022-04-24T10:42:43.000Z

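Today I came across a post on our internal tech forum about estimating the ATE with tree methods. In the spirit of rigor (read: nitpicking), I read it through carefully and found one formula that looked rather odd. I worked through it on the spot and it turned out to be quite interesting, so I am noting it down here.

I won't go into the POM framework; the goal of this post is to estimate the average causal effect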

Arvin's Blog
