XJTU team achieves breakthrough in AI-driven operations management

Figure: The deep neural network proposed by XJTU's research team.
Online assortment optimization has emerged as a critical research focus in operations management. It studies how a platform with limited inventory can select a product assortment to recommend to each of its diverse, sequentially arriving customers so as to maximize total platform revenue over a defined period.
However, prevailing model-driven approaches often rely on restrictive assumptions that do not match real user behavior, and solving the corresponding high-dimensional dynamic programming problems incurs significant computational costs.
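To make the computational burden concrete, the following is a textbook-style sketch of the dynamic program behind such model-driven approaches. It is not taken from the paper. All symbols are assumptions introduced for illustration: x is the vector of remaining inventory, e_i is the i-th unit vector, lambda_z is the arrival probability of customer type z, r_i is the revenue of product i, and p_i(S | z) is the probability that a type-z customer buys product i when shown assortment S (with p_0 denoting no purchase).

```latex
V_t(\mathbf{x}) = \sum_{z} \lambda_z \max_{S \subseteq \{i \,:\, x_i > 0\}}
  \Big[ \sum_{i \in S} p_i(S \mid z)\,\big(r_i + V_{t+1}(\mathbf{x} - \mathbf{e}_i)\big)
        + p_0(S \mid z)\, V_{t+1}(\mathbf{x}) \Big],
\qquad V_{T+1}(\mathbf{x}) = 0 .
```

Under these assumptions, a platform with n products and initial inventories c_1, ..., c_n has (c_1 + 1)(c_2 + 1)...(c_n + 1) inventory states per period, so exact backward induction quickly becomes impractical as the catalog grows. The choice probabilities p_i(S | z) must also be specified by a behavioral model, which is where restrictive assumptions enter.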
To address these challenges, Professor Wang Yao of the Center for Intelligent Decision and Machine Learning at the School of Management, Xi'an Jiaotong University (XJTU), together with his former master's student Li Tao (now a PhD student at HKUST) and Wang Chenhao (soon to join Tongji University), collaborated with Professor Tang Shaojie from the University at Buffalo, SUNY, and Professor Chen Ningyuan from the University of Toronto.
The team pioneered an artificial intelligence-based research strategy, proposing a model-free deep reinforcement learning (DRL) method. The approach uses a specially designed deep neural network (DNN) to represent the assortment policy. Using a simulator built from historical transaction data, it updates the DNN parameters with the Advantage Actor-Critic (A2C) algorithm, which avoids the prohibitively large transaction datasets that traditional RL would otherwise require.
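The general idea of training an assortment policy against a simulator with an advantage-based actor-critic update can be illustrated with a deliberately small sketch. It is not the team's implementation, and it makes several simplifying assumptions. A linear softmax policy and a linear value function stand in for the specialized DNN. A hand-written multinomial logit (MNL) choice function stands in for the simulator built from historical transaction data. The update is a single-environment, one-step advantage actor-critic rule rather than a full batched A2C. All prices, utilities, inventories, and function names are hypothetical.

```python
import math
import random
from itertools import combinations

# Hypothetical toy instance (all numbers are illustrative assumptions).
PRICES = [4.0, 3.0, 2.0, 1.0]          # revenue per product
UTILS = [[0.2, 0.5, 1.0, 1.5],         # customer type 0 prefers cheap items
         [1.5, 1.0, 0.5, 0.2]]         # customer type 1 prefers expensive items
START_INV = [3, 3, 3, 3]               # initial inventory per product
HORIZON = 10                           # customers per episode
ACTIONS = list(combinations(range(4), 2))  # every 2-product assortment


def simulate_choice(assortment, ctype, inv, rng):
    """MNL stand-in for a data-driven simulator; returns a product or None."""
    offered = [i for i in assortment if inv[i] > 0]
    weights = [math.exp(UTILS[ctype][i]) for i in offered]
    draw = rng.random() * (1.0 + sum(weights))  # 1.0 is the no-purchase weight
    for i, w in zip(offered, weights):
        draw -= w
        if draw < 0:
            return i
    return None


def features(inv, ctype, t):
    """State features: bias, scaled inventory, customer type, time remaining."""
    x = [1.0] + [v / 3.0 for v in inv]
    x += [1.0 if ctype == k else 0.0 for k in range(2)]
    return x + [(HORIZON - t) / HORIZON]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def train(episodes=2000, lr_actor=0.01, lr_critic=0.05, seed=0):
    rng = random.Random(seed)
    dim = 8                                  # length of the feature vector
    actor = [[0.0] * dim for _ in ACTIONS]   # one weight row per assortment
    critic = [0.0] * dim                     # linear state-value estimate
    returns = []
    for _ in range(episodes):
        inv, total = list(START_INV), 0.0
        ctype = rng.randrange(2)
        for t in range(HORIZON):
            x = features(inv, ctype, t)
            probs = softmax([dot(w, x) for w in actor])
            a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            item = simulate_choice(ACTIONS[a], ctype, inv, rng)
            reward = 0.0
            if item is not None:
                inv[item] -= 1
                reward = PRICES[item]
            total += reward
            next_type = rng.randrange(2)
            done = t == HORIZON - 1
            v_next = 0.0 if done else dot(critic, features(inv, next_type, t + 1))
            advantage = reward + v_next - dot(critic, x)  # one-step TD advantage
            for j in range(dim):                          # critic: TD(0) step
                critic[j] += lr_critic * advantage * x[j]
            for k in range(len(ACTIONS)):                 # actor: policy gradient
                grad = (1.0 if k == a else 0.0) - probs[k]  # d log pi / d logit_k
                for j in range(dim):
                    actor[k][j] += lr_actor * advantage * grad * x[j]
            ctype = next_type
        returns.append(total)
    return actor, critic, returns
```

Calling `train()` returns the learned weights and the per-episode revenue, so the average of the first and last few hundred episodes can be compared to see whether the policy improves on this toy instance. Enumerating every assortment as a separate action only works for tiny catalogs. Handling realistic catalog sizes is the role of the specialized network architecture described above.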
The paper, titled "Deep Reinforcement Learning for Online Assortment Customization: A Data-Driven Approach," was published online in June 2025 in Production and Operations Management, a premier journal in the field of operations management.

