{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T22:30:31Z","timestamp":1775946631735,"version":"3.50.1"},"reference-count":30,"publisher":"National Library of Serbia","issue":"3","license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ComSIS","COMPUT SCI INF SYST","COMPUT SCI INFORM SY","COMPUTER SCI INFORM","COMSIS J"],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:p>Illegal gambling websites use advanced technology to evade regulations, posing cybersecurity challenges. To address this, we propose a machine learning method to identify these sites and analyze user behavior accurately. The method extracts key data from post messages in a real-world network environment, generating word vectors via Word2Vec with TF-IDF, which are then downscaled and feature-extracted using a Stacked Denoising Auto Encoder (SDAE). Next, this paper uses Agglomerative Clustering, improved through a combination of distance caching and heap optimization, to initially cluster post-template websites of the same type by clustering them into the same cluster. Then, multiple algorithms are integrated within each website cluster to cluster users' different operational behaviors into different clusters based on the cosine similarity consensus function voting secondary clustering. 
Results show improved detection of illegal gambling sites and classification of user activities, offering new insights for combating these sites.<\/jats:p>","DOI":"10.2298\/csis240930019z","type":"journal-article","created":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T09:17:15Z","timestamp":1741166235000},"page":"859-879","source":"Crossref","is-referenced-by-count":1,"title":["Identification and detection of illegal gambling websites and analysis of user behavior"],"prefix":"10.2298","volume":"22","author":[{"given":"Zhimin","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Information Engineering, Shanghai Maritime University, Shanghai, China"}]},{"given":"Dezhi","family":"Han","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Shanghai Maritime University, Shanghai, China"}]},{"given":"Songyang","family":"Wu","sequence":"additional","affiliation":[{"name":"Network Security Center, The Third Research Institute of the Ministry of Public Security, Shanghai, China"}]},{"given":"Wenqi","family":"Sun","sequence":"additional","affiliation":[{"name":"Network Security Center, The Third Research Institute of the Ministry of Public Security, Shanghai, China"}]},{"given":"Shuxin","family":"Shi","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Shanghai Maritime University, Shanghai, China"}]}],"member":"1078","reference":[{"key":"ref1","unstructured":"Ghelfi, M., Scattola, P., Giudici, G., Velasco, V.: Online gambling: A systematic review of risk and protective factors in the adult population. In: Proceedings of the Journal of Gambling Studies. vol. 39, pp. 1-27 (2023)"},{"key":"ref2","doi-asserted-by":"crossref","unstructured":"Kong, X., Wang, C., Li, Y., Hou, J., Jiang, T., Liu, Z.: Traffic classification based on cnn-lstm hybrid network. In: International Forum on Digital TV and Wireless Multimedia Communications. pp. 401-411. 
Springer Singapore, Singapore (2021)","DOI":"10.1007\/978-981-19-2266-4_31"},{"key":"ref3","doi-asserted-by":"crossref","unstructured":"Mu, J., He, H., Li, L., Pang, S., Liu, C.: A hybrid network intrusion detection model based on cnn-lstm and attention mechanism. In: International Conference on Frontiers in Cyber Security. pp. 214-229. Springer Singapore, Singapore (2021)","DOI":"10.1007\/978-981-19-0523-0_14"},{"key":"ref4","doi-asserted-by":"crossref","unstructured":"Alshingiti, Z., Alaqel, R., Al-Muhtadi, J., Haq, Q.E.U., Saleem, K., Faheem, M.H.: A deep learning-based phishing detection system using cnn, lstm, and lstm-cnn. Electronics 12(1), 232 (2023)","DOI":"10.3390\/electronics12010232"},{"key":"ref5","doi-asserted-by":"crossref","unstructured":"Alnemari, S., Alshammari, M.: Detecting phishing domains using machine learning. Applied Sciences 13(8), 4649 (2023)","DOI":"10.3390\/app13084649"},{"key":"ref6","doi-asserted-by":"crossref","unstructured":"Chen, Z., Fu, L., Yao, J., Guo, W., Plant, C., Wang, S.: Learnable graph convolutional network and feature fusion for multi-view learning. Information Fusion 95, 109-119 (2023)","DOI":"10.1016\/j.inffus.2023.02.013"},{"key":"ref7","doi-asserted-by":"crossref","unstructured":"Huang, Z., Ren, Y., Pu, X., Huang, S., Xu, Z., He, L.: Self-supervised graph attention networks for deep weighted multi-view clustering. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 37, pp. 7936-7943 (2023)","DOI":"10.1609\/aaai.v37i7.25960"},{"key":"ref8","doi-asserted-by":"crossref","unstructured":"Chen, Y., Zheng, R., Zhou, A., Liao, S., Liu, L.: Automatic detection of pornographic and gambling websites based on visual and textual content using a decision mechanism. Sensors 20(14), 3989 (2020)","DOI":"10.3390\/s20143989"},{"key":"ref9","doi-asserted-by":"crossref","unstructured":"Wang, C., Zhang, M., Shi, F., Xue, P., Li, Y.: A hybrid multimodal data fusion-based method for identifying gambling websites. 
Electronics 11(16), 2489 (2022)","DOI":"10.3390\/electronics11162489"},{"key":"ref10","doi-asserted-by":"crossref","unstructured":"Sun, G., Ye, F., Chai, T., Zhang, Z., Tong, X., Prasad, S.: Gambling domain name recognition via certificate and textual analysis. The Computer Journal 66(8), 1829-1839 (2023)","DOI":"10.1093\/comjnl\/bxac043"},{"key":"ref11","doi-asserted-by":"crossref","unstructured":"Singh, H., Kaur, P.: An effective clustering-based web page recommendation framework for e-commerce websites. SN Computer Science 2(4), 339 (2021)","DOI":"10.1007\/s42979-021-00736-z"},{"key":"ref12","doi-asserted-by":"crossref","unstructured":"Li, Y., Chu, X., Tian, D., Feng, J., Mu, W.: Customer segmentation using k-means clustering and the adaptive particle swarm optimization algorithm. Applied Soft Computing 113, 107924 (2021)","DOI":"10.1016\/j.asoc.2021.107924"},{"key":"ref13","doi-asserted-by":"crossref","unstructured":"Liu, L.: e-commerce personalized recommendation based on machine learning technology. Mobile Information Systems 2022(1), 1761579 (2022)","DOI":"10.1155\/2022\/1761579"},{"key":"ref14","doi-asserted-by":"crossref","unstructured":"Qiao, M., Wei, L., Han, D., et al.: Efficient multi-party psi and its application in port management. Computer Standards & Interfaces 91, 103884 (2025)","DOI":"10.1016\/j.csi.2024.103884"},{"key":"ref15","unstructured":"Jiang, T., Jia, L., Wan, C.M., et al.: The text modeling method of tibetan text combining word2vec and improved tf-idf. In: Proceedings of 2020 4th International Conference on Electrical, Mechanical and Computer Engineering (ICEMCE 2020). vol. 3, p. 8. IOP Publishing (2020)"},{"key":"ref16","doi-asserted-by":"crossref","unstructured":"Zhang, T., Wang, L.: Research on text classification method based on word2vec and improved tf-idf. In: Advances in Intelligent Systems and Interactive Applications: Proceedings of the 4th International Conference on Intelligent, Interactive Systems and Applications (IISA2019). 
pp. 199-205. Springer International Publishing (2020)","DOI":"10.1007\/978-3-030-34387-3_24"},{"key":"ref17","doi-asserted-by":"crossref","unstructured":"Xin, X., Han, D., Cui, M.: Daaps: A deformable-attention-based anchor-free person search model. Computers, Materials & Continua 77(2) (2023)","DOI":"10.32604\/cmc.2023.042308"},{"key":"ref18","doi-asserted-by":"crossref","unstructured":"Ni, Q., Fan, Z., Zhang, L., Nugent, C.D., Cleland, I., Zhang, Y., Zhou, N.: Leveraging wearable sensors for human daily activity recognition with stacked denoising autoencoders. Sensors 20, 5114 (2020)","DOI":"10.3390\/s20185114"},{"key":"ref19","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez-Garc\u00eda, M.E., Sancho-G\u00f3mez, J.L., Ros-Ros, A., Figueiras-Vidal, A.R.: Complete stacked denoising auto-encoders for regression. Neural Processing Letters 53, 787-797 (2021)","DOI":"10.1007\/s11063-020-10419-0"},{"key":"ref20","doi-asserted-by":"crossref","unstructured":"Shkaberina, G., Verenev, L., Tovbis, E., Rezova, N., Kazakovtsev, L.: Clustering algorithm with a greedy agglomerative heuristic and special distance measures. Algorithms 15, 191 (2022)","DOI":"10.3390\/a15060191"},{"key":"ref21","doi-asserted-by":"crossref","unstructured":"Ali, M.A., PP, F.R., Abd Elminaam, D.S.: An efficient heap based optimizer algorithm for feature selection. Mathematics 10, 2396 (2022)","DOI":"10.3390\/math10142396"},{"key":"ref22","doi-asserted-by":"crossref","unstructured":"Ezugwu, A.E., Ikotun, A.M., Oyelade, O.O., Abualigah, L., Agushaka, J.O., Eke, C.I., Akinyelu, A.A.: A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. 
Engineering Applications of Artificial Intelligence 110, 104743 (2022)","DOI":"10.1016\/j.engappai.2022.104743"},{"key":"ref23","doi-asserted-by":"crossref","unstructured":"Iffath, N., Mummadi, U.K., Taranum, F., Ahmad, S.S., Khan, I., Shravani, D.: Phishing website detection using ensemble learning models. In: AIP Conference Proceedings. vol. 3007, p. 1. AIP Publishing (2024)","DOI":"10.1063\/5.0192754"},{"key":"ref24","doi-asserted-by":"crossref","unstructured":"Chen, G.: Scalable spectral clustering with cosine similarity. In: 2018 24th International Conference on Pattern Recognition (ICPR). pp. 314-319. IEEE (2018)","DOI":"10.1109\/ICPR.2018.8546193"},{"key":"ref25","doi-asserted-by":"crossref","unstructured":"Pho, K.H., Akbarzadeh, H., Parvin, H., et al.: A multi-level consensus function clustering ensemble. Soft Computing 25, 13147-13165 (2021)","DOI":"10.1007\/s00500-021-06092-7"},{"key":"ref26","doi-asserted-by":"crossref","unstructured":"Alizade, M., Kheni, R., Price, S., Sousa, B.C., Cote, D.L., Neamtu, R.: A comparative study of clustering methods for nanoindentation mapping data. Integrating Materials and Manufacturing Innovation 13, 526-540 (2024)","DOI":"10.1007\/s40192-024-00349-3"},{"key":"ref27","doi-asserted-by":"crossref","unstructured":"Li, J., Han, D., Wu, Z., et al.: A novel system for medical equipment supply chain traceability based on alliance chain and attribute and role access control. Future Generation Computer Systems 142, 195-211 (2023)","DOI":"10.1016\/j.future.2022.12.037"},{"key":"ref28","doi-asserted-by":"crossref","unstructured":"Li, J., Han, D., Weng, T.H., et al.: A secure data storage and sharing scheme for port supply chain based on blockchain and dynamic searchable encryption. 
Computer Standards & Interfaces 91, 103887 (2025)","DOI":"10.1016\/j.csi.2024.103887"},{"key":"ref29","doi-asserted-by":"crossref","unstructured":"Han, D., Pan, N., Li, K.C.: A traceable and revocable ciphertext-policy attribute-based encryption scheme based on privacy protection. IEEE Transactions on Dependable and Secure Computing 19(1), 316-327 (2022)","DOI":"10.1109\/TDSC.2020.2977646"},{"key":"ref30","doi-asserted-by":"crossref","unstructured":"Han, D., Zhu, Y., Li, D., Liang, W., Souri, A., Li, K.C.: A blockchain-based auditable access control system for private data in service-centric iot environments. IEEE Transactions on Industrial Informatics 18(5), 3530-3540 (2022)","DOI":"10.1109\/TII.2021.3114621"}],"container-title":["Computer Science and Information Systems"],"original-title":[],"language":"en","deposited":{"date-parts":[[2025,7,18]],"date-time":"2025-07-18T09:17:42Z","timestamp":1752830262000},"score":1,"resource":{"primary":{"URL":"https:\/\/doiserbia.nb.rs\/Article.aspx?ID=1820-02142500019Z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025]]}},"URL":"https:\/\/doi.org\/10.2298\/csis240930019z","relation":{},"ISSN":["1820-0214","2406-1018"],"issn-type":[{"value":"1820-0214","type":"print"},{"value":"2406-1018","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025]]}}}