{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T16:21:38Z","timestamp":1762273298570,"version":"3.41.0"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,12,13]],"date-time":"2024-12-13T00:00:00Z","timestamp":1734048000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2024,12,31]]},"abstract":"<jats:p>The anonymity and untraceability benefits of the dark web increased its popularity exponentially. The cost of these technical benefits is that such anonymity has created a suitable womb for illicit activity. Hence\u2014in collaboration with cybersecurity practitioners and law-enforcement agencies\u2014the research community provided approaches for recognizing and classifying illicit activities. Most of these approaches exploit textual content from dark web markets, whereas few used images that originated from them. This article investigates alternative techniques for recognizing illegal activities from images. The significant contributions of our work are threefold: (a) We investigate label-agnostic learning techniques like One-Shot and Few-Shot learning that use Siamese Neural Networks. Our approach manages to handle small-scale datasets with promising accuracy. In particular, the Siamese Neural Network approach reaches 90.9% on 5-Shot experiments over a 10-class dataset. (b) This study\u2019s satisfactory findings facilitate the creation of potent tools to assist authorities in identifying illicit content on the Web. 
Moreover, our proof-of-concept approach demonstrated the ability to recognize illegal images using a limited number of files, reducing the time constraint in collecting illegal images. (c) We provide a complete labeled dataset of 3,570 images from 55 different categories from dark web markets that can be used for future research activities.<\/jats:p>","DOI":"10.1145\/3696458","type":"journal-article","created":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T16:41:46Z","timestamp":1726850506000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["<i>Few Images, Many Insights<\/i>: Illicit Content Detection Using a Limited Number of Images"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0724-3772","authenticated-orcid":false,"given":"Giuseppe","family":"Cascavilla","sequence":"first","affiliation":[{"name":"Tilburg University, Tilburg, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4689-3401","authenticated-orcid":false,"given":"Gemma","family":"Catolino","sequence":"additional","affiliation":[{"name":"University of Salerno, Fisciano, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3612-1934","authenticated-orcid":false,"given":"Mauro","family":"Conti","sequence":"additional","affiliation":[{"name":"University of Padova, Padova, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-1819-1412","authenticated-orcid":false,"given":"Dimos","family":"Mellios","sequence":"additional","affiliation":[{"name":"Tilburg University, Tilburg, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1230-8961","authenticated-orcid":false,"given":"Damian","family":"Tamburri","sequence":"additional","affiliation":[{"name":"Eindhoven University of Technology, Eindhoven, The Netherlands"}]}],"member":"320","published-online":{"date-parts":[[2024,12,13]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"35","volume-title":"15th 
Conference of the European Chapter of the Association for Computational Linguistics","volume":"1","author":"Nabki Mhd Wesam Al","year":"2017","unstructured":"Mhd Wesam Al Nabki, Eduardo Fidalgo, Enrique Alegre, and Ivan de Paz. 2017. Classifying illegal activities on Tor network based on web textual contents. In 15th Conference of the European Chapter of the Association for Computational Linguistics, Vol. 1, Long Papers, 35\u201343."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.01.029"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSEC.2024.3407859"},{"key":"e_1_3_2_5_2","doi-asserted-by":"crossref","first-page":"324","DOI":"10.5220\/0012049400003555","volume-title":"20th International Conference on Security and Cryptography - SECRYPT","author":"Cascavilla G.","year":"2023","unstructured":"G. Cascavilla, G. Catolino, M. Conti, D. Mellios, and D. Tamburri. 2023. When the few outweigh the many: Illicit content recognition with few-shot learning. In 20th International Conference on Security and Cryptography - SECRYPT. INSTICC, SciTePress, 324\u2013334."},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1109\/ICSME55016.2022.00055","volume-title":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","author":"Cascavilla G.","year":"2022","unstructured":"G. Cascavilla, G. Catolino, F. Ebert, D. A. Tamburri, and W. J. van den Heuvel. 2022. \u201cWhen the code becomes a crime scene\u201d towards dark web threat intelligence with software quality metrics. In 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME), 439\u2013443."},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","first-page":"620","DOI":"10.5220\/0011298600003283","volume-title":"19th International Conference on Security and Cryptography","volume":"1","author":"Cascavilla G.","year":"2022","unstructured":"G. Cascavilla, G. Catolino, and M. Sangiovanni. 2022. 
Illicit Darkweb classification via natural-language processing: Classifying illicit content of webpages based on textual information. In 19th International Conference on Security and Cryptography, Vol. 1 SECRYPT, INSTICC, SciTePress, 620\u2013626."},{"key":"e_1_3_2_8_2","article-title":"Cybercrime threat intelligence: A systematic multi-vocal literature review","volume":"105","author":"Cascavilla Giuseppe","year":"2021","unstructured":"Giuseppe Cascavilla, Damian A. Tamburri, and Willem-Jan Van Den Heuvel. 2021. Cybercrime threat intelligence: A systematic multi-vocal literature review. Computers & Security 105 (2021), 102258.","journal-title":"Computers & Security"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.145"},{"key":"e_1_3_2_10_2","first-page":"539","volume-title":"2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"1","author":"Chopra S.","year":"2005","unstructured":"S. Chopra, R. Hadsell, and Y. LeCun. 2005. Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, 539\u2013546."},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","first-page":"4271","DOI":"10.18653\/v1\/P19-1419","volume-title":"57th Annual Meeting of the Association for Computational Linguistics","author":"Choshen Leshem","year":"2019","unstructured":"Leshem Choshen, Dan Eldad, Daniel Hershcovich, Elior Sulem, and Omri Abend. 2019. The language of legal and illegal activity on the Darknet. In 57th Annual Meeting of the Association for Computational Linguistics. 
Association for Computational Linguistics, Florence, Italy, 4271\u20134279."},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1007\/978-3-031-06975-8_28","volume-title":"ICT Systems Security and Privacy Protection","author":"Covrig Bogdan","year":"2022","unstructured":"Bogdan Covrig, Enrique Barrueco Mikelarena, Constanta Rosca, Catalina Goanta, Gerasimos Spanakis, and Apostolis Zarras. 2022. Upside down: Exploring the ecosystem of Dark Web data markets. In ICT Systems Security and Privacy Protection. Weizhi Meng, Simone Fischer-H\u00fcbner, and Christian D. Jensen (Eds.), Springer International Publishing, Cham, 489\u2013506."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459665"},{"key":"e_1_3_2_14_2","first-page":"54","article-title":"Laying foundations for effective machine learning in law enforcement. Majura \u2013 A labelling schema for child exploitation materials","volume":"26","author":"Dalins Janis","year":"2018","unstructured":"Janis Dalins, Yuriy Tyshetskiy, Campbell Wilson, Mark J. Carman, and Douglas Boudry. 2018. Laying foundations for effective machine learning in law enforcement. Majura \u2013 A labelling schema for child exploitation materials. Digital Investigation 26 (2018), 40\u201354.","journal-title":"Digital Investigation"},{"key":"e_1_3_2_15_2","unstructured":"Ali Fayzi Mohammad Fayzi and Kourosh Dadashtabar Ahmadi. 2023. Dark Web activity classification using deep learning. arXiv:2306.07980. Retrieved from https:\/\/arxiv.org\/abs\/2306.07980"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2006.79"},{"key":"e_1_3_2_17_2","first-page":"22","article-title":"Classifying suspicious content in Tor Darknet through semantic attention keypoint filtering","volume":"30","author":"Fidalgo Eduardo","year":"2019","unstructured":"Eduardo Fidalgo, Enrique Alegre, Laura Fern\u00e1ndez-Robles, and V\u00edctor Gonz\u00e1lez-Castro. 2019. 
Classifying suspicious content in Tor Darknet through semantic attention keypoint filtering. Digital Investigation 30 (2019), 12\u201322.","journal-title":"Digital Investigation"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-67180-2_58"},{"key":"e_1_3_2_19_2","unstructured":"Yang Fu Yunchao Wei Guanshuo Wang Jiwei Li Xi Zhou Honghui Shi and Thomas S. Huang. 2018. One shot domain adaptation for person re-identification."},{"key":"e_1_3_2_20_2","unstructured":"Victor Garcia and Joan Bruna. 2017. Few-shot learning with graph neural networks. arXiv:1711.04043. Retrieved from https:\/\/arxiv.org\/abs\/1711.04043"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098193"},{"key":"e_1_3_2_22_2","first-page":"1","volume-title":"2022 International Conference on Computer Communication and Informatics (ICCCI)","author":"Gulati H.","year":"2022","unstructured":"H. Gulati, A. Saxena, N. Pawar, P. Tanwar, and S. Sharma. 2022. Dark Web in modern world theoretical perspective: A survey. In 2022 International Conference on Computer Communication and Informatics (ICCCI), 1\u201310."},{"key":"e_1_3_2_23_2","first-page":"1735","volume-title":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR \u201906)","volume":"2","author":"Hadsell R.","year":"2006","unstructured":"R. Hadsell, S. Chopra, and Y. LeCun. 2006. Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR \u201906), Vol. 2. IEEE, 1735\u20131742."},{"key":"e_1_3_2_24_2","first-page":"105","article-title":"Detecting and classifying online dark visual propaganda","volume":"89","author":"Hashemi Mahdi","year":"2019","unstructured":"Mahdi Hashemi and Margeret Hall. 2019. Detecting and classifying online dark visual propaganda. 
Image and Vision Computing 89 (2019), 95\u2013105.","journal-title":"Image and Vision Computing"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3322645.3322691"},{"key":"e_1_3_2_26_2","unstructured":"A. Hermans L. Beyer and B. Leibe. 2017. In defense of the triplet loss for person re-identification. arXiv:1703.07737. Retrieved from https:\/\/arxiv.org\/abs\/1703.07737"},{"key":"e_1_3_2_27_2","unstructured":"Nathan Hilliard Lawrence Phillips Scott Howland Art\u00ebm Yankov Courtney D. Corley and Nathan O. Hodas. 2018. Few-shot learning with metric-agnostic conditional embeddings. arXiv:1802.04376. Retrieved from https:\/\/arxiv.org\/abs\/1802.04376"},{"key":"e_1_3_2_28_2","unstructured":"Elad Hoffer and Nir Ailon. 2014. Deep metric learning using Triplet network. arXiv:1412.6622. Retrieved from https:\/\/arxiv.org\/abs\/1412.6622"},{"key":"e_1_3_2_29_2","unstructured":"Garth Griffin Juan Sanchez. 2019. Who\u2019s Afraid of the Dark? Hype Versus Reality on the Dark Web. Retrieved from https:\/\/www.recordedfuture.com\/blog\/dark-web-reality"},{"key":"e_1_3_2_30_2","first-page":"1","volume-title":"2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE)","author":"Khan M.","year":"2023","unstructured":"M. Khan, M. Saeed, A. El Saddik, and W. Gueaieb. 2023. ARTriViT: Automatic face recognition system using ViT-based Siamese neural networks with a triplet loss. In 2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE), 1\u20136."},{"key":"e_1_3_2_31_2","article-title":"Siamese neural networks for one-shot image recognition","volume":"2","author":"Koch Gregory","year":"2015","unstructured":"Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. 2015. Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop, Vol. 2. 
Lille.","journal-title":"ICML Deep Learning Workshop"},{"key":"e_1_3_2_32_2","article-title":"One shot learning of simple visual concepts","volume":"33","author":"Lake Brenden M.","year":"2011","unstructured":"Brenden M. Lake, Ruslan Salakhutdinov, Jason Gross, and Joshua B. Tenenbaum. 2011. One shot learning of simple visual concepts. Cognitive Science 33 (2011).","journal-title":"Cognitive Science"},{"key":"e_1_3_2_33_2","unstructured":"Zhenguo Li Fengwei Zhou Fei Chen and Hang Li. 2017. Meta-SGD: Learning to learn quickly for few shot learning. arXiv:1707.09835. Retrieved from https:\/\/arxiv.org\/abs\/1707.09835"},{"key":"e_1_3_2_34_2","unstructured":"Xingyu Lin Hao Wang Zhihao Li Yimeng Zhang Alan Yuille and Tai Sing Lee. 2017. Transfer of view-manifold learning to similarity perception of novel objects. arXiv:1704.00033. Retrieved from https:\/\/arxiv.org\/abs\/1704.00033"},{"key":"e_1_3_2_35_2","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten Laurens van der","year":"2008","unstructured":"Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9 (Nov. 2008), 2579\u20132605.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2016.7899663"},{"key":"e_1_3_2_37_2","unstructured":"Mateusz Ochal Massimiliano Patacchiola Jose Vazquez Amos Storkey and Sen Wang. 2021. Class Imbalance in Few-Shot Learning. Retrieved from https:\/\/openreview.net\/forum?id=j0yLJ-MsgJ"},{"key":"e_1_3_2_38_2","unstructured":"United Nations Office on Drugs and Crime. 2022. World Drug Report 2022. 
Retrieved from https:\/\/www.unodc.org\/res\/wdr2022\/MS\/WDR22_Booklet_2.pdf"},{"key":"e_1_3_2_39_2","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1007\/978-3-030-88583-0_12","volume-title":"Agile Processes in Software Engineering and Extreme Programming \u2013 Workshops","author":"Onyango Samuel","year":"2021","unstructured":"Samuel Onyango, Emilie Steenvoorden, Joram Scholten, and Slinger Jansen. 2021. Assessing the health of the Dark Web. In Agile Processes in Software Engineering and Extreme Programming \u2013 Workshops. Peggy Gregory and Philippe Kruchten (Eds.), 125\u2013134."},{"key":"e_1_3_2_40_2","article-title":"CovidExpert: A triplet Siamese neural network framework for the detection of COVID-19","volume":"37","author":"Ornob Tareque Rahman","year":"2023","unstructured":"Tareque Rahman Ornob, Gourab Roy, and Enamul Hassan. 2023. CovidExpert: A triplet Siamese neural network framework for the detection of COVID-19. Informatics in Medicine Unlocked 37 (2023), 101156.","journal-title":"Informatics in Medicine Unlocked"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.promfg.2020.01.025"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00755"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISI.2018.8587374"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","unstructured":"Replication-Package. 2023. When the Few Outweigh the Many: Illicit Content Recognition with Few-Shot Learning. Retrieved from 10.5281\/zenodo.7657482","DOI":"10.5281\/zenodo.7657482"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-023-08610-0"},{"key":"e_1_3_2_47_2","doi-asserted-by":"crossref","unstructured":"Amirreza Shaban Shray Bansal Zhen Liu Irfan Essa and Byron Boots. 2017. One-shot learning for semantic segmentation. arXiv:1709.03410. 
Retrieved from https:\/\/arxiv.org\/abs\/1709.03410","DOI":"10.5244\/C.31.167"},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1007\/978-3-031-15037-1_23","volume-title":"Brain Informatics","author":"Shaffi Noushath","year":"2022","unstructured":"Noushath Shaffi, Faizal Hajamohideen, Mufti Mahmud, Abdelhamid Abdesselam, Karthikeyan Subramanian, and Arwa Al Sariri. 2022. Triplet-loss based Siamese convolutional neural network for 4-way classification of Alzheimer\u2019s disease. In Brain Informatics. Mufti Mahmud, Jing He, Stefano Vassanelli, Andr\u00e9 van Zundert, and Ning Zhong (Eds.), Springer, Cham, 277\u2013287."},{"key":"e_1_3_2_49_2","doi-asserted-by":"crossref","unstructured":"Rahul Rama Varior Mrinal Haloi and Gang Wang. 2016. Gated Siamese convolutional neural network architecture for human re-identification. arXiv:1607.08378. Retrieved from https:\/\/arxiv.org\/abs\/1607.08378","DOI":"10.1007\/978-3-319-46484-8_48"},{"key":"e_1_3_2_50_2","article-title":"Matching networks for one shot learning","author":"Vinyals Oriol","year":"2016","unstructured":"Oriol Vinyals, Charles Blundell, Timothy P. Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. 2016. Matching networks for one shot learning. In Neural Information Processing Systems (NIPS).","journal-title":"Neural Information Processing Systems (NIPS)"},{"key":"e_1_3_2_51_2","doi-asserted-by":"crossref","unstructured":"Jiang Wang Yang Song Thomas Leung Chuck Rosenberg Jinbin Wang James Philbin Bo Chen and Ying Wu. 2014. Learning fine-grained image similarity with deep ranking. arXiv:1404.4661. Retrieved from https:\/\/arxiv.org\/abs\/1404.4661","DOI":"10.1109\/CVPR.2014.180"},{"key":"e_1_3_2_52_2","unstructured":"Yaqing Wang Quanming Yao James Kwok and Lionel M. Ni. 2019. Generalizing from a few examples: A survey on few-shot learning. arXiv:1904.05046. 
Retrieved from https:\/\/arxiv.org\/abs\/1904.05046"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696458","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3696458","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:10:14Z","timestamp":1750295414000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696458"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,13]]},"references-count":51,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,31]]}},"alternative-id":["10.1145\/3696458"],"URL":"https:\/\/doi.org\/10.1145\/3696458","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"type":"print","value":"2157-6904"},{"type":"electronic","value":"2157-6912"}],"subject":[],"published":{"date-parts":[[2024,12,13]]},"assertion":[{"value":"2023-12-11","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}