{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T01:58:48Z","timestamp":1761789528458,"version":"build-2065373602"},"reference-count":27,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2019,1,22]],"date-time":"2019-01-22T00:00:00Z","timestamp":1548115200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>The proliferation of indoor and outdoor tracking devices has led to a vast amount of spatial data. Each object can be described by several trajectories that, once analysed, can yield to significant knowledge. In particular, pattern analysis by clustering generic trajectories can give insight into objects sharing the same patterns. Still, sequential clustering approaches fail to handle large volumes of data. Hence, the necessity of distributed systems to be able to infer knowledge in a trivial time interval. In this paper, we detail an efficient, scalable and distributed execution pipeline for clustering raw trajectories. The clustering is achieved via a fuzzy similarity relation obtained by the transitive closure of a proximity relation. Moreover, the pipeline is integrated in Spark, implemented in Scala and leverages the Core and Graphx libraries making use of Resilient Distributed Datasets (RDD) and graph processing. Furthermore, a new simple, but very efficient, partitioning logic has been deployed in Spark and integrated into the execution process. The objective behind this logic is to equally distribute the load among all executors by considering the complexity of the data. In particular, resolving the load balancing issue has reduced the conventional execution time in an important manner. Evaluation and performance of the whole distributed process has been analysed by handling the Geolife project\u2019s GPS trajectory dataset.<\/jats:p>","DOI":"10.3390\/a12020029","type":"journal-article","created":{"date-parts":[[2019,1,24]],"date-time":"2019-01-24T03:52:32Z","timestamp":1548301952000},"page":"29","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Distributed Execution Pipeline for Clustering Trajectories Based on a Fuzzy Similarity Relation"],"prefix":"10.3390","volume":"12","author":[{"given":"Soufiane","family":"Maguerra","sequence":"first","affiliation":[{"name":"LIM\/IOS, FSTM, Hassan II University of Casablanca, Mohammedia 20000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1301-5696","authenticated-orcid":false,"given":"Azedine","family":"Boulmakoul","sequence":"additional","affiliation":[{"name":"LIM\/IOS, FSTM, Hassan II University of Casablanca, Mohammedia 20000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lamia","family":"Karim","sequence":"additional","affiliation":[{"name":"National School of Applied Sciences Berrechid, Hassan 1st University, Berrechid 26002, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hassan","family":"Badir","sequence":"additional","affiliation":[{"name":"National School of Applied Sciences Tangier, Abdelmalek Essa\u00e2di University, T\u00e9touan 93000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,1,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zheng, Y., Zhang, L., Xie, X., and Ma, W.Y. (2009, January 20\u201324). Mining Interesting Locations and Travel Sequences From GPS Trajectories. Proceedings of the International conference on World Wide Web 2009, Madrid, Spain.","DOI":"10.1145\/1526709.1526816"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zheng, Y., Li, Q., Chen, Y., Xie, X., and Ma, W.Y. (2008, January 21\u201324). Understanding Mobility Based on GPS Data. Proceedings of the Ubicomp 2008, Seoul, Korea.","DOI":"10.1145\/1409635.1409677"},{"key":"ref_3","first-page":"32","article-title":"Geolife: A collaborative social networking service among user, location and trajectory","volume":"33","author":"Zheng","year":"2010","journal-title":"IEEE Data Eng. Bull."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Magdy, N., Sakr, M.A., Mostafa, T., and El-Bahnasy, K. (2015, January 12\u201314). Review on trajectory similarity measures. Proceedings of the 2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt.","DOI":"10.1109\/IntelCIS.2015.7397286"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zheng, V.W., Cao, B., Zheng, Y., Xie, X., and Yang, Q. (2010, January 11\u201315). Collaborative Filtering Meets Mobile Recommendation: A User-Centered Approach. Proceedings of the AAAI, Atlanta, GA, USA.","DOI":"10.1609\/aaai.v24i1.7577"},{"key":"ref_6","unstructured":"Ioannidis, Y.E. (1986, January 25\u201328). On the computation of the transitive closure of relational operators. Proceedings of the VLDB, Berkeley, CA, USA."},{"key":"ref_7","unstructured":"Brodie, M.L., and Mylopoulos, J. (2012). On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies, Springer Science & Business Media."},{"key":"ref_8","unstructured":"Zadeh, L.A. (1965). Fuzzy Logic and Its Applications, Springer."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0895-7177(93)90202-A","article-title":"A survey of fuzzy clustering","volume":"18","author":"Yang","year":"1993","journal-title":"Math. Comput. Model."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1016\/S0165-0114(98)00038-4","article-title":"Entropy-based fuzzy clustering and fuzzy modeling","volume":"113","author":"Yao","year":"2000","journal-title":"Fuzzy Sets Syst."},{"key":"ref_11","first-page":"14","article-title":"Fuzzy structural primitives for spatial data mining","volume":"5","author":"Boulmakoul","year":"2002","journal-title":"Complex Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Kondruk, N. (2017). Clustering method based on fuzzy binary relation. East.-Eur. J. Enterp. Technol., 2.","DOI":"10.15587\/1729-4061.2017.94961"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TSMC.1971.5408605","article-title":"Pattern classification based on fuzzy relations","volume":"SMC-1","author":"Tamura","year":"1971","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/S0165-0114(99)00146-3","article-title":"Cluster analysis based on fuzzy relations","volume":"120","author":"Yang","year":"2001","journal-title":"Fuzzy Sets Syst."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1016\/j.ejor.2004.03.018","article-title":"Cluster analysis based on fuzzy equivalence relation","volume":"166","author":"Liang","year":"2005","journal-title":"Eur. J. Oper. Res."},{"key":"ref_16","unstructured":"Houtsma, M.A., Apers, P.M., and Ceri, S. (1990, January 13\u201316). Distributed transitive closure computations: The disconnection set approach. Proceedings of the VLDB, Brisbane, Queensland, Australia."},{"key":"ref_17","unstructured":"Houtsma, M.A., Apers, P.M., and Schipper, G.L. (1993, January 19\u201323). Data fragmentation for parallel transitive closure strategies. Proceedings of the IEEE Ninth International Conference on Data Engineering, Vienna, Austria."},{"key":"ref_18","unstructured":"Gribkoff, E. (2019, January 19). Distributed Algorithms for the Transitive Closure. Available online: https:\/\/pdfs.semanticscholar.org\/57fd\/5969b2a454c90b57b12c49e90847fee079a8.pdf."},{"key":"ref_19","unstructured":"Boulmakoul, A., Maguerra, S., Karim, L., and Badir, H. (2017, January 24\u201325). A Scalable, Distributed and Directed Fuzzy Relational Algorithm for Clustering Semantic Trajectories. Proceedings of the Sixth International Conference on Innovation and New Trends in Information Systems, Casablanca, Morocco."},{"key":"ref_20","unstructured":"Maguerra, S., Boulmakoul, A., Karim, L., and Badir, H. (2018, January 2\u20133). Scalable Solution for Profiling Potential Cyber-criminals in Twitter. Proceedings of the Big Data & Applications 12th Edition of the Conference on Advances of Decisional Systems, Marrakech, Morocco."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"58939","DOI":"10.1109\/ACCESS.2018.2866364","article-title":"Spatio-temporal vessel trajectory clustering based on data mapping and density","volume":"6","author":"Li","year":"2018","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yi, D., Su, J., Liu, C., and Chen, W.H. (2018). Trajectory Clustering Aided Personalized Driver Intention Prediction for Intelligent Vehicles. IEEE Trans. Ind. Inform.","DOI":"10.1109\/TII.2018.2890141"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1016\/S0019-9958(65)90241-X","article-title":"Fuzzy sets","volume":"8","author":"Zadeh","year":"1965","journal-title":"Inf. Control"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/0165-0114(78)90012-X","article-title":"Fuzzy partitions and relations; an axiomatic basis for clustering","volume":"1","author":"Bezdek","year":"1978","journal-title":"Fuzzy Sets Syst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Boulmakoul, A., Karim, L., and Lbath, A. (arXiv, 2012). Moving object trajectories meta-model and spatio-temporal queries, arXiv.","DOI":"10.5121\/ijdms.2012.4203"},{"key":"ref_26","unstructured":"Cook, J.D. (2018, November 30). Converting Miles to Degrees Longitude or Latitude. Available online: https:\/\/www.johndcook.com\/blog\/2009\/04\/27\/converting-miles-to-degrees-longitude-or-latitude\/."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1145\/360825.360861","article-title":"A linear space algorithm for computing maximal common subsequences","volume":"18","author":"Hirschberg","year":"1975","journal-title":"Commun. ACM"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/12\/2\/29\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:27:56Z","timestamp":1760185676000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/12\/2\/29"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,22]]},"references-count":27,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2019,2]]}},"alternative-id":["a12020029"],"URL":"https:\/\/doi.org\/10.3390\/a12020029","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2019,1,22]]}}}