{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T05:12:34Z","timestamp":1780636354528,"version":"3.54.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2018,10,30]],"date-time":"2018-10-30T00:00:00Z","timestamp":1540857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shanghai Sailing Program","award":["17YF1428200"],"award-info":[{"award-number":["17YF1428200"]}]},{"name":"Huawei Innovation Research Program"},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61632017, 61702327, 61772333"],"award-info":[{"award-number":["61632017, 61702327, 61772333"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2019,1,31]]},"abstract":"<jats:p>User response prediction is a crucial component for personalized information retrieval and filtering scenarios, such as recommender system and web search. The data in user response prediction is mostly in a multi-field categorical format and transformed into sparse representations via one-hot encoding. Due to the sparsity problems in representation and optimization, most research focuses on feature engineering and shallow modeling. Recently, deep neural networks have attracted research attention on such a problem for their high capacity and end-to-end training scheme. In this article, we study user response prediction in the scenario of click prediction. We first analyze a coupled gradient issue in latent vector-based models and propose kernel product to learn field-aware feature interactions. 
Then, we discuss an insensitive gradient issue in DNN-based models and propose Product-based Neural Network, which adopts a feature extractor to explore feature interactions. Generalizing the kernel product to a net-in-net architecture, we further propose Product-network in Network (PIN), which can generalize previous models. Extensive experiments on four industrial datasets and one contest dataset demonstrate that our models consistently outperform eight baselines on both area under curve and log loss. Besides, PIN makes great click-through rate improvement (relatively 34.67%) in online A\/B test.<\/jats:p>","DOI":"10.1145\/3233770","type":"journal-article","created":{"date-parts":[[2018,10,30]],"date-time":"2018-10-30T12:11:00Z","timestamp":1540901460000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":163,"title":["Product-Based Neural Networks for User Response Prediction over Multi-Field Categorical Data"],"prefix":"10.1145","volume":"37","author":[{"given":"Yanru","family":"Qu","sequence":"first","affiliation":[{"name":"Shanghai Jiao Tong University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bohui","family":"Fang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Weinan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ruiming","family":"Tang","sequence":"additional","affiliation":[{"name":"Noah\u2019s Ark Lab, Huawei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Minzhe","family":"Niu","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Huifeng","family":"Guo","sequence":"additional","affiliation":[{"name":"Shenzhen 
Graduate School, Harbin Institute of Technology, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yong","family":"Yu","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiuqiang","family":"He","sequence":"additional","affiliation":[{"name":"Data service center, MIG, Tencent, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2018,10,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835834"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148175"},{"key":"e_1_2_1_3_1","volume-title":"Jamie Ryan Kiros, and Geoffrey E. Hinton","author":"Ba Jimmy Lei","year":"2016"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098086"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526711"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020454"},{"key":"e_1_2_1_10_1","first-page":"2121","article-title":"Adaptive subgradient methods for online learning and stochastic optimization","author":"Duchi John","year":"2011","journal-title":"J. Mach. Learn. Res. 12"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/64.236478"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. 
249--256","author":"Glorot Xavier","year":"2010"},{"key":"e_1_2_1_13_1","volume-title":"Deep Learning","author":"Goodfellow Ian"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201910)","author":"Graepel Thore"},{"key":"e_1_2_1_15_1","unstructured":"Huifeng Guo Ruiming Tang Yunming Ye Zhenguo Li and Xiuqiang He. 2017. DeepFM: A factorization-machine based neural network for CTR prediction. arXiv preprint arXiv:1703.04247."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080777"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052569"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2648584.2648589"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(89)90020-8"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the International Conference on Machine Learning. 448--456","author":"Ioffe Sergey","year":"2015"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959134"},{"key":"e_1_2_1_22_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.","author":"Kingma Diederik","year":"2014"},{"key":"e_1_2_1_23_1","unstructured":"G\u00fcnter Klambauer Thomas Unterthiner Andreas Mayr and Sepp Hochreiter. 2017. Self-normalizing neural networks. arXiv preprint arXiv:1706.02515."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2009.263"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2339530.2339651"},{"key":"e_1_2_1_26_1","unstructured":"Min Lin Qiang Chen and Shuicheng Yan. 2013. 
Network in network. arXiv preprint arXiv:1312.4400."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806603"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018661.3018716"},{"key":"e_1_2_1_29_1","article-title":"Visualizing data using t-SNE","author":"van der Maaten Laurens","year":"2008","journal-title":"J. Mach. Learn. Res. 9"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2488200"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020436"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2339530.2339655"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2016.0151"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622737.1622742"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983347"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2010.127"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242643"},{"key":"e_1_2_1_38_1","volume-title":"Kingma","author":"Salimans Tim","year":"2016"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the International Conference on Machine Learning. 
3067--3075","author":"Shalev-Shwartz Shai","year":"2017"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939704"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2015.7364112"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783273"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098096"},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Jun Xiao Hao Ye Xiangnan He Hanwang Zhang Fei Wu and Tat-Seng Chua. 2017. Attentional factorization machines: Learning the weight of feature interactions via attention networks. arXiv preprint arXiv:1708.04617.","DOI":"10.24963\/ijcai.2017\/435"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the International Conference on Machine Learning. 2048--2057","author":"Xu Kelvin","year":"2015"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1031171.1031192"},{"key":"e_1_2_1_49_1","unstructured":"Manzil Zaheer Satwik Kottur Siamak Ravanbakhsh Barnabas Poczos Ruslan Salakhutdinov and Alexander Smola. 2017. Deep sets. arXiv preprint arXiv:1703.06114."},{"key":"e_1_2_1_50_1","unstructured":"Shuai Zhang Lina Yao and Aixin Sun. 2017. Deep learning based recommender system: A survey and new perspectives. 
arXiv preprint arXiv:1707.07435."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the European Conference on Information Retrieval (ECIR\u201916)","author":"Zhang Weinan","year":"2016"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623633"},{"key":"e_1_2_1_53_1","volume-title":"Chang Xu et al","author":"Zhang Yuyu","year":"2014"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3233770","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3233770","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:07:03Z","timestamp":1750212423000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3233770"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,10,30]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1,31]]}},"alternative-id":["10.1145\/3233770"],"URL":"https:\/\/doi.org\/10.1145\/3233770","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,10,30]]},"assertion":[{"value":"2017-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-10-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}