{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:10:34Z","timestamp":1760235034681,"version":"build-2065373602"},"reference-count":37,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T00:00:00Z","timestamp":1625788800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Customer activity (CA) in retail environments, which ranges over various shopper situations in store spaces, provides valuable information for store management and marketing planning. Several systems have been proposed for customer activity recognition (CAR) from in-store camera videos, and most of them use machine learning based end-to-end (E2E) CAR models, due to their remarkable performance. Usually, such E2E models are trained for target conditions (i.e., particular CA types in specific store spaces). Accordingly, the existing systems are not malleable to fit the changes in target conditions because they require entire retraining of their specialized E2E models and concurrent use of additional E2E models for new target conditions. This paper proposes a novel CAR system based on a hierarchy that organizes CA types into different levels of abstraction from lowest to highest. The proposed system consists of multiple CAR models, each of which performs CAR tasks that belong to a certain level of the hierarchy on the lower level\u2019s output, and thus conducts CAR for videos through the models level by level. Since these models are separated, this system can deal efficiently with the changes in target conditions by modifying some models individually. Experimental results show the effectiveness of the proposed system in adapting to different target conditions.<\/jats:p>","DOI":"10.3390\/s21144712","type":"journal-article","created":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T10:50:38Z","timestamp":1625827838000},"page":"4712","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Hierarchy-Based System for Recognizing Customer Activity in Retail Environments"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7384-0957","authenticated-orcid":false,"given":"Jiahao","family":"Wen","sequence":"first","affiliation":[{"name":"Graduate School of Information Sciences, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8287-5964","authenticated-orcid":false,"given":"Luis","family":"Guillen","sequence":"additional","affiliation":[{"name":"Research Institute of Electrical Communication, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3786-0122","authenticated-orcid":false,"given":"Toru","family":"Abe","sequence":"additional","affiliation":[{"name":"Cyberscience Center, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5798-5125","authenticated-orcid":false,"given":"Takuo","family":"Suganuma","sequence":"additional","affiliation":[{"name":"Cyberscience Center, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,7,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.patrec.2016.04.008","article-title":"Purchase behavior analysis through gaze and gesture observation","volume":"81","author":"Merad","year":"2016","journal-title":"Pattern Recognit. Lett."},{"key":"ref_2","unstructured":"Hernandez, D.A.M., Nalbach, O., and Werth, D. (2019, January 15\u201317). How computer vision provides physical retail with a better view on customers. Proceedings of the 2019 IEEE 21st Conference on Business Informatics (CBI), Moscow, Russia."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Liu, J., Gu, Y., and Kamijo, S. (2015, January 14\u201316). Customer behavior recognition in retail store from surveillance camera. Proceedings of the 2015 IEEE International Symposium on Multimedia (ISM), Miami, FL, USA.","DOI":"10.1109\/ISM.2015.52"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00138-020-01118-w","article-title":"Deep understanding of shopper behaviours and interactions using RGB-D vision","volume":"31","author":"Paolanti","year":"2020","journal-title":"Mach. Vision Appl."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"11992","DOI":"10.1016\/j.eswa.2012.03.038","article-title":"A novel mobile recommender system for indoor shopping","volume":"39","author":"Fang","year":"2012","journal-title":"Expert Syst. Appl."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"17436","DOI":"10.1109\/ACCESS.2017.2744263","article-title":"Mining customer preference in physical stores from interaction behavior","volume":"5","author":"Chen","year":"2017","journal-title":"IEEE Access"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Chen, Y., Chen, S., Sun, L., and Chen, D. (2017). Location-aware POI recommendation for indoor space by exploiting WiFi logs. Mobile Inf. Syst., 2017.","DOI":"10.1155\/2017\/9601404"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chawathe, S.S. (2008, January 12\u201315). Beacon placement for indoor localization using bluetooth. Proceedings of the 2008 11th International IEEE Conference on Intelligent Transportation Systems, Beijing, China.","DOI":"10.1109\/ITSC.2008.4732690"},{"key":"ref_9","unstructured":"Lacic, E., Kowald, D., Traub, M., Luzhnica, G., Simon, J., and Lex, E. (2015, January 16\u201320). Tackling cold-start users in recommender systems with indoor positioning systems. Proceedings of the 9th ACM Conference Recommender Systems, Vienna, Austria."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Christodoulou, P., Christodoulou, K., and Andreou, A.S. (2017, January 26\u201329). A real-time targeted recommender system for supermarkets. Proceedings of the 19th International Conference on Enterprise Information Systems, Porto, Portugal.","DOI":"10.5220\/0006309907030712"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"So, W.T., and Yada, K. (2017, January 17\u201319). A framework of recommendation system based on in-store behavior. Proceedings of the 4th Multidisciplinary International Social Networks Conference, Bangkok, Thailand.","DOI":"10.1145\/3092090.3092130"},{"key":"ref_12","first-page":"197","article-title":"Recommending stores for shopping mall customers with RecStore","volume":"9","year":"2018","journal-title":"J. Inf. Data Manag."},{"key":"ref_13","unstructured":"Lee, K., Choo, C.Y., See, H.Q., Tan, Z.J., and Lee, Y. (2010, January 9\u201311). Human detection using histogram of oriented gradients and human body ratio estimation. Proceedings of the 2010 3rd International Conference on Computer Science and Information Technology, Chengdu, China."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhang, S., and Wang, X. (2013, January 23\u201325). Human detection and object tracking based on histograms of oriented gradients. Proceedings of the 2013 9th International Conference on Natural Computation (ICNC), Shenyang, China.","DOI":"10.1109\/ICNC.2013.6818189"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Frontoni, E., Raspa, P., Mancini, A., Zingaretti, P., and Placidi, V. (2013, January 9\u201313). Customers\u2019 activity recognition in intelligent retail environments. Proceedings of the International Conference on Image Analysis and Processing, Naples, Italy.","DOI":"10.1007\/978-3-642-41190-8_55"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liciotti, D., Contigiani, M., Frontoni, E., Mancini, A., Zingaretti, P., and Placidi, V. (2014, January 24). Shopper analytics: A customer activity recognition system using a distributed RGB-D camera network. Proceedings of the International Workshop on Video Analytics for Audience Measurement in Retail and Digital Signage, Stockholm, Sweden.","DOI":"10.1007\/978-3-319-12811-5_11"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.patrec.2014.09.013","article-title":"Detecting and tracking people in real time with RGB-D camera","volume":"53","author":"Liu","year":"2015","journal-title":"Pattern Recognit. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.patrec.2016.02.010","article-title":"Robust and affordable retail customer profiling by vision and radio beacon sensor fusion","volume":"81","author":"Sturari","year":"2016","journal-title":"Pattern Recognit. Lett."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yamamoto, J., Inoue, K., and Yoshioka, M. (2017, January 24\u201331). Investigation of customer behavior analysis based on top-view depth camera. Proceedings of the 2017 IEEE Winter Applications of Computer Vision Workshops (WACVW), Santa Rosa, CA, USA.","DOI":"10.1109\/WACVW.2017.18"},{"key":"ref_20","unstructured":"Alex, L., and Mihran, T. (2007, January 5\u20137). Detecting shopper groups in video sequences. Proceedings of the 2007 IEEE Conference on Advanced Video and Signal Based Surveillance, London, UK."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Generosi, A., Ceccacci, S., and Mengoni, M. (2018, January 2\u20135). A deep learning-based system to track and analyze customer behavior in retail store. Proceedings of the 2018 IEEE 8th International Conference on Consumer Electronics\u2014Berlin (ICCE-Berlin), Berlin, Germany.","DOI":"10.1109\/ICCE-Berlin.2018.8576169"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1007\/s10846-017-0674-7","article-title":"Modelling and forecasting customer navigation in intelligent retail environments","volume":"91","author":"Paolanti","year":"2018","journal-title":"J. Intell. Rob. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhao, L., Yao, J., Du, H., Zhao, J., and Zhang, R. (2019, January 22\u201325). A unified object detection framework for intelligent retail container commodities. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803536"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Popa, M.C., Gritti, T., Rothkrantz, L.J.M., Shan, C., and Wiggers, P. (2011, January 29\u201331). Detecting customers\u2019 buying events on a real-life database. Proceedings of the International Conference on Computer Analysis of Images and Patterns, Seville, Spain.","DOI":"10.1007\/978-3-642-23672-3_3"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1879","DOI":"10.1016\/j.patrec.2012.11.015","article-title":"Shopping behavior recognition using a language modeling analogy","volume":"34","author":"Popa","year":"2013","journal-title":"Pattern Recognit. Lett."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Singh, B., Marks, T.K., Jones, M., Tuzel, O., and Shao, M. (2016, January 27\u201330). A multi-stream bi-directional recurrent neural network for fine-grained action detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.216"},{"key":"ref_27","first-page":"567","article-title":"Person detection from overhead view: A survey","volume":"10","author":"Ahmad","year":"2019","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_28","unstructured":"Kong, Y., and Fu, Y. (2018). Human action recognition and prediction: A survey. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1115\/1.3662552","article-title":"A new approach to linear filtering and prediction problems","volume":"82","author":"Kalman","year":"1960","journal-title":"Trans. ASME J. Basic Eng."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1109\/TITS.2009.2030963","article-title":"Understanding transit scenes: A survey on human behavior-recognition algorithms","volume":"11","author":"Candamo","year":"2009","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lee, J.G., Han, J., and Whang, K.Y. (2007, January 11\u201314). Trajectory clustering: A partition-and-group framework. Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, Beijing, China.","DOI":"10.1145\/1247480.1247546"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Leal, E., and Gruenwald, L. (2018, January 2\u20137). DynMDL: A parallel trajectory segmentation algorithm. Proceedings of the 2018 IEEE International Congress on Big Data (BigData Congress), San Francisco, CA, USA.","DOI":"10.1109\/BigDataCongress.2018.00036"},{"key":"ref_34","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B. (2016, January 25\u201328). Simple Online and Realtime Tracking. Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1080\/09511920802527582","article-title":"Flexibility evaluation: A toolbox approach","volume":"22","author":"Georgoulias","year":"2009","journal-title":"Int. J. Comput. Integr. Manuf."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.procs.2014.12.007","article-title":"Software architecture and detailed design evaluation","volume":"43","author":"Vishnyakov","year":"2015","journal-title":"Procedia Comput. Sci."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/14\/4712\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:28:33Z","timestamp":1760164113000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/14\/4712"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,9]]},"references-count":37,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2021,7]]}},"alternative-id":["s21144712"],"URL":"https:\/\/doi.org\/10.3390\/s21144712","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,7,9]]}}}