{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,18]],"date-time":"2026-01-18T11:43:51Z","timestamp":1768736631882,"version":"3.49.0"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,6,18]],"date-time":"2022-06-18T00:00:00Z","timestamp":1655510400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,18]],"date-time":"2022-06-18T00:00:00Z","timestamp":1655510400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002920","name":"Research Grants Council, University Grants Committee","doi-asserted-by":"publisher","award":["16201718"],"award-info":[{"award-number":["16201718"]}],"id":[{"id":"10.13039\/501100002920","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002920","name":"Research Grants Council, University Grants Committee","doi-asserted-by":"publisher","award":["16216119"],"award-info":[{"award-number":["16216119"]}],"id":[{"id":"10.13039\/501100002920","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Min Knowl Disc"],"published-print":{"date-parts":[[2022,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Individual passenger travel patterns have significant value in understanding passenger\u2019s behavior, such as learning the hidden clusters of locations, time, and passengers. The learned clusters further enable commercially beneficial actions such as customized services, promotions, data-driven urban-use planning, peak hour discovery, and so on. 
However, individualized passenger modeling is very challenging for the following reasons: 1) individual passenger travel data are multi-dimensional spatiotemporal big data, including at least the origin, destination, and time dimensions; 2) individualized passenger travel patterns usually depend on the external environment, such as the distances and functions of locations, which are ignored in most current works. This work proposes a multi-clustering model to learn the latent clusters along the multiple dimensions of Origin, Destination, Time, and eventually, Passenger (ODT-P). We develop a graph-regularized tensor Latent Dirichlet Allocation (LDA) model by first extending the traditional LDA model into a tensor version and then applying it to individual travel data. Then, the external information of stations is formulated as semantic graphs and incorporated as Laplacian regularizations. Furthermore, to improve the model scalability when dealing with massive data, an online stochastic learning method based on a tensorized variational Expectation-Maximization algorithm is developed. 
Finally, a case study based on passengers in the Hong Kong metro system is conducted and demonstrates that a better clustering performance is achieved compared to state-of-the-arts with the improvement in point-wise mutual information index and algorithm convergence speed by a factor of two.<\/jats:p>","DOI":"10.1007\/s10618-022-00842-3","type":"journal-article","created":{"date-parts":[[2022,6,18]],"date-time":"2022-06-18T06:02:33Z","timestamp":1655532153000},"page":"1247-1278","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Individualized passenger travel pattern multi-clustering based on graph regularized tensor latent dirichlet allocation"],"prefix":"10.1007","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4983-9352","authenticated-orcid":false,"given":"Ziyue","family":"Li","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4322-7323","authenticated-orcid":false,"given":"Hao","family":"Yan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4767-9597","authenticated-orcid":false,"given":"Chen","family":"Zhang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0575-8254","authenticated-orcid":false,"given":"Fugee","family":"Tsung","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,6,18]]},"reference":[{"issue":"Jan","key":"842_CR1","first-page":"993","volume":"3","author":"DM Blei","year":"2003","unstructured":"Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. 
J Mach Learn Res 3(Jan):993\u20131022","journal-title":"J Mach Learn Res"},{"key":"842_CR2","doi-asserted-by":"publisher","first-page":"274","DOI":"10.1016\/j.trc.2017.03.021","volume":"79","author":"AS Briand","year":"2017","unstructured":"Briand AS, C\u00f4me E, Tr\u00e9panier M et al (2017) Analyzing year-to-year changes in public transport passenger behaviour using smart card data. Transp Res Part C: Emerg Technol 79:274\u2013289","journal-title":"Transp Res Part C: Emerg Technol"},{"key":"842_CR3","unstructured":"Chang J, Gerrish S, Wang C, et\u00a0al (2009) Reading tea leaves: How humans interpret topic models. In: Advances in neural information processing systems, pp 288\u2013296"},{"key":"842_CR4","doi-asserted-by":"crossref","unstructured":"Chen L, Jose JM, Yu H, et\u00a0al (2016) A semantic graph based topic model for question retrieval in community question answering. In: Proceedings of the ninth ACM international conference on web search and data mining, pp 287\u2013296","DOI":"10.1145\/2835776.2835809"},{"issue":"4","key":"842_CR5","doi-asserted-by":"publisher","first-page":"2035","DOI":"10.1007\/s11116-020-10120-0","volume":"48","author":"Z Cheng","year":"2020","unstructured":"Cheng Z, Tr\u00e9panier M, Sun L (2020) Probabilistic model for destination inference and travel pattern mining from smart card data. Transportation 48(4):2035\u20132053","journal-title":"Transportation"},{"issue":"11","key":"842_CR6","doi-asserted-by":"publisher","first-page":"2765","DOI":"10.1109\/TPAMI.2013.57","volume":"35","author":"E Elhamifar","year":"2013","unstructured":"Elhamifar E, Vidal R (2013) Sparse subspace clustering: Algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell 35(11):2765\u20132781","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"842_CR7","doi-asserted-by":"crossref","unstructured":"Gao H, Nie F, Li X, et\u00a0al (2015) Multi-view subspace clustering. 
In: Proceedings of the IEEE international conference on computer vision, pp 4238\u20134246","DOI":"10.1109\/ICCV.2015.482"},{"key":"842_CR8","doi-asserted-by":"crossref","unstructured":"Geng X, Li Y, Wang L, et\u00a0al (2019) Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 3656\u20133663","DOI":"10.1609\/aaai.v33i01.33013656"},{"issue":"suppl 1","key":"842_CR9","doi-asserted-by":"publisher","first-page":"5228","DOI":"10.1073\/pnas.0307752101","volume":"101","author":"TL Griffiths","year":"2004","unstructured":"Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101(suppl 1):5228\u20135235","journal-title":"Proc Natl Acad Sci"},{"key":"842_CR10","doi-asserted-by":"crossref","unstructured":"Guo S, Lin Y, Feng N, et\u00a0al (2019) Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 922\u2013929","DOI":"10.1609\/aaai.v33i01.3301922"},{"key":"842_CR11","unstructured":"Hoffman M, Bach FR, Blei DM (2010) Online learning for latent dirichlet allocation. In: advances in neural information processing systems, Citeseer, pp 856\u2013864"},{"key":"842_CR12","doi-asserted-by":"crossref","unstructured":"Hu H, Lin Z, Feng J, et\u00a0al (2014) Smooth representation clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3834\u20133841","DOI":"10.1109\/CVPR.2014.484"},{"issue":"3","key":"842_CR13","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1137\/07070111X","volume":"51","author":"TG Kolda","year":"2009","unstructured":"Kolda TG, Bader BW (2009) Tensor decompositions and applications. 
SIAM Rev 51(3):455\u2013500","journal-title":"SIAM Rev"},{"key":"842_CR14","doi-asserted-by":"crossref","unstructured":"Li D, Zamani S, Zhang J, et\u00a0al (2019a) Integration of knowledge graph embedding into topic modeling with hierarchical dirichlet process. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp 940\u2013950","DOI":"10.18653\/v1\/N19-1099"},{"key":"842_CR15","doi-asserted-by":"crossref","unstructured":"Li X, Zhang J, Ouyang J (2019b) Dirichlet multinomial mixture with variational manifold regularization: Topic modeling over short texts. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 7884\u20137891","DOI":"10.1609\/aaai.v33i01.33017884"},{"key":"842_CR16","doi-asserted-by":"crossref","unstructured":"Li Z, Sergin ND, Yan H, et\u00a0al (2020) Tensor completion for weakly-dependent data on graph for metro passenger flow prediction. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 4804\u20134810","DOI":"10.1609\/aaai.v34i04.5915"},{"key":"842_CR17","doi-asserted-by":"crossref","unstructured":"Li Z, Yan H, Zhang C, et\u00a0al (2021) Tensor topic models with graphs and applications on individualized travel patterns. In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), IEEE, pp 2756\u20132761","DOI":"10.1109\/ICDE51399.2021.00320"},{"issue":"1","key":"842_CR18","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1109\/TPAMI.2012.88","volume":"35","author":"G Liu","year":"2012","unstructured":"Liu G, Lin Z, Yan S et al (2012) Robust recovery of subspace structures by low-rank representation. 
IEEE Trans Pattern Anal Mach Intell 35(1):171\u2013184","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"842_CR19","doi-asserted-by":"crossref","unstructured":"Liu H, Tong Y, Zhang P, et\u00a0al (2019) Hydra: A personalized and context-aware multi-modal transportation recommendation system. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 2314\u20132324","DOI":"10.1145\/3292500.3330660"},{"key":"842_CR20","doi-asserted-by":"crossref","unstructured":"Mei Q, Cai D, Zhang D, et\u00a0al (2008) Topic modeling with network regularization. In: Proceedings of the 17th international conference on World Wide Web, pp 101\u2013110","DOI":"10.1145\/1367497.1367512"},{"issue":"3","key":"842_CR21","first-page":"712","volume":"18","author":"K Mohamed","year":"2016","unstructured":"Mohamed K, C\u00f4me E, Oukhellou L et al (2016) Clustering smart card data for urban mobility analysis. IEEE Trans Intell Transp Syst 18(3):712\u2013728","journal-title":"IEEE Trans Intell Transp Syst"},{"key":"842_CR22","unstructured":"Newman D, Lau JH, Grieser K, et\u00a0al (2010) Automatic evaluation of topic coherence. In: Human language technologies: The 2010 annual conference of the North American chapter of the association for computational linguistics, pp 100\u2013108"},{"issue":"1","key":"842_CR23","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1145\/1007730.1007731","volume":"6","author":"L Parsons","year":"2004","unstructured":"Parsons L, Haque E, Liu H (2004) Subspace clustering for high dimensional data: a review. ACM SIGKDD Explorations Newsl 6(1):90\u2013105","journal-title":"ACM SIGKDD Explorations Newsl"},{"key":"842_CR24","doi-asserted-by":"crossref","unstructured":"Porteous I, Newman D, Ihler A, et\u00a0al (2008) Fast collapsed gibbs sampling for latent dirichlet allocation. 
In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 569\u2013577","DOI":"10.1145\/1401890.1401960"},{"key":"842_CR25","doi-asserted-by":"crossref","unstructured":"Ren J, Xie Q (2017) Efficient od trip matrix prediction based on tensor decomposition. In: 2017 18th IEEE International Conference on Mobile Data Management (MDM), IEEE, pp 180\u2013185","DOI":"10.1109\/MDM.2017.32"},{"key":"842_CR26","doi-asserted-by":"crossref","unstructured":"Shi H, Yao Q, Guo Q, et\u00a0al (2020) Predicting origin-destination flow via multi-perspective graph convolutional network. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp 1818\u20131821","DOI":"10.1109\/ICDE48307.2020.00178"},{"key":"842_CR27","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1016\/j.trb.2016.06.011","volume":"91","author":"L Sun","year":"2016","unstructured":"Sun L, Axhausen KW (2016) Understanding urban mobility patterns with a probabilistic tensor factorization framework. Transp Res Part B: Methodol 91:511\u2013524","journal-title":"Transp Res Part B: Methodol"},{"key":"842_CR28","doi-asserted-by":"publisher","first-page":"260","DOI":"10.1016\/j.trc.2018.03.004","volume":"90","author":"K Tang","year":"2018","unstructured":"Tang K, Chen S, Liu Z et al (2018) A tensor-based bayesian probabilistic model for citywide personalized travel time estimation. Transp Res Part C: Emerg Technol 90:260\u2013280","journal-title":"Transp Res Part C: Emerg Technol"},{"key":"842_CR29","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1016\/j.trb.2020.05.006","volume":"138","author":"Y Tang","year":"2020","unstructured":"Tang Y, Jiang Y, Yang H et al (2020) Modeling and optimizing a fare incentive strategy to manage queuing and crowding in mass transit systems. 
Transp Res Part B: Methodol 138:247\u2013267","journal-title":"Transp Res Part B: Methodol"},{"key":"842_CR30","doi-asserted-by":"crossref","unstructured":"Teh YW, Newman D, Welling M (2007) A collapsed variational bayesian inference algorithm for latent dirichlet allocation. Tech. rep., CALIFORNIA UNIV IRVINE SCHOOL OF INFORMATION AND COMPUTER SCIENCE","DOI":"10.21236\/ADA629956"},{"key":"842_CR31","doi-asserted-by":"crossref","unstructured":"Wang S, He L, Stenneth L, et\u00a0al (2015) Citywide traffic congestion estimation with social media. In: Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, p\u00a034","DOI":"10.1145\/2820783.2820829"},{"key":"842_CR32","doi-asserted-by":"crossref","unstructured":"Wang Y, Yin H, Chen H, et\u00a0al (2019) Origin-destination matrix prediction via graph convolution: a new perspective of passenger demand modeling. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 1227\u20131235","DOI":"10.1145\/3292500.3330877"},{"key":"842_CR33","unstructured":"Xiao H, Stibor T (2010) Efficient collapsed gibbs sampling for latent dirichlet allocation. In: Proceedings of 2nd asian conference on machine learning, JMLR Workshop and Conference Proceedings, pp 63\u201378"},{"key":"842_CR34","doi-asserted-by":"crossref","unstructured":"Yao L, Zhang Y, Wei B, et\u00a0al (2017) Incorporating knowledge graph embeddings into topic modeling. In: Thirty-first AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v31i1.10951"},{"key":"842_CR35","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1016\/j.trc.2019.05.042","volume":"105","author":"D Yi","year":"2019","unstructured":"Yi D, Su J, Liu C et al (2019) A machine learning based personalized system for driving state recognition. 
Transp Res Part C: Emerg Technol 105:241\u2013261","journal-title":"Transp Res Part C: Emerg Technol"},{"key":"842_CR36","doi-asserted-by":"crossref","unstructured":"Yu B, Yin H, Zhu Z (2018) Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp 3634\u20133640","DOI":"10.24963\/ijcai.2018\/505"},{"key":"842_CR37","doi-asserted-by":"publisher","first-page":"18,113","DOI":"10.1109\/ACCESS.2019.2894267","volume":"7","author":"K Yu","year":"2019","unstructured":"Yu K, He L, Philip SY et al (2019) Coupled tensor decomposition for user clustering in mobile internet traffic interaction pattern. IEEE Access 7:18,113-18,124","journal-title":"IEEE Access"},{"key":"842_CR38","doi-asserted-by":"crossref","unstructured":"Zhang C, Hu Q, Fu H, et\u00a0al (2017) Latent multi-view subspace clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4279\u20134287","DOI":"10.1109\/CVPR.2017.461"},{"issue":"1","key":"842_CR39","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1109\/TPAMI.2018.2877660","volume":"42","author":"C Zhang","year":"2018","unstructured":"Zhang C, Fu H, Hu Q et al (2018) Generalized latent multi-view subspace clustering. IEEE Trans Pattern Anal Mach Intell 42(1):86\u201399","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"11","key":"842_CR40","doi-asserted-by":"publisher","first-page":"3135","DOI":"10.1109\/TITS.2017.2679179","volume":"18","author":"J Zhao","year":"2017","unstructured":"Zhao J, Qu Q, Zhang F et al (2017) Spatio-temporal analysis of passenger travel patterns in massive smart card data. 
IEEE Trans Intell Transp Syst 18(11):3135\u20133146","journal-title":"IEEE Trans Intell Transp Syst"},{"issue":"102","key":"842_CR41","first-page":"627","volume":"116","author":"Z Zhao","year":"2020","unstructured":"Zhao Z, Koutsopoulos HN, Zhao J (2020) Discovering latent activity patterns from transit smart card data: A spatiotemporal topic model. Transp Res Part C: Emerg Technol 116(102):627","journal-title":"Transp Res Part C: Emerg Technol"},{"key":"842_CR42","first-page":"1","volume-title":"2017 IEEE SmartWorld","author":"R Zhong","year":"2017","unstructured":"Zhong R, Lv W, Du B et al (2017) Spatiotemporal multi-task learning for citywide passenger flow prediction. 2017 IEEE SmartWorld. Ubiquitous Intelligence Computing, Advanced Trusted Computed, Scalable Computing Communications, Cloud Big Data Computing, Internet of People and Smart City Innovation (SmartWorld\/SCALCOM\/UIC\/ATC\/CBDCom\/IOP\/SCI), pp 1\u20138"},{"issue":"4","key":"842_CR43","doi-asserted-by":"publisher","first-page":"1065","DOI":"10.1214\/17-BA1070","volume":"13","author":"M Zhou","year":"2018","unstructured":"Zhou M (2018) Nonparametric bayesian negative binomial factor analysis. Bayesian Anal 13(4):1065\u20131093","journal-title":"Bayesian Anal"},{"issue":"1","key":"842_CR44","first-page":"2237","volume":"13","author":"J Zhu","year":"2012","unstructured":"Zhu J, Ahmed A, Xing EP (2012) Medlda: maximum margin supervised topic models. J. Mach Learn Res 13(1):2237\u20132278","journal-title":"J. 
Mach Learn Res"}],"container-title":["Data Mining and Knowledge Discovery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-022-00842-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10618-022-00842-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-022-00842-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,25]],"date-time":"2022-07-25T04:30:57Z","timestamp":1658723457000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10618-022-00842-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,18]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,7]]}},"alternative-id":["842"],"URL":"https:\/\/doi.org\/10.1007\/s10618-022-00842-3","relation":{},"ISSN":["1384-5810","1573-756X"],"issn-type":[{"value":"1384-5810","type":"print"},{"value":"1573-756X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,18]]},"assertion":[{"value":"30 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 June 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of 
interest"}},{"value":"The code is publicly accessible by .","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}},{"value":"This paper satisfies the compliance with ethical standards. There is no potential conflicts of interest; The research does not involve Human Participants and\/or Animals; The data in this paper has been anonymized to protect data privacy; Informed consent was obtained from all individual participants.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"Informed consent was obtained from all individual participants included in the study.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"All individual participants have consented to the submission of the regular paper to the journal.","order":6,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}