{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:27:16Z","timestamp":1760243236564,"version":"build-2065373602"},"reference-count":55,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2014,4,24]],"date-time":"2014-04-24T00:00:00Z","timestamp":1398297600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by human-kind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. 
Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.<\/jats:p>","DOI":"10.3390\/fi6020242","type":"journal-article","created":{"date-parts":[[2014,4,24]],"date-time":"2014-04-24T11:10:43Z","timestamp":1398337843000},"page":"242-260","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Exploiting Multimedia in Creating and Analysing Multimedia Web Archives"],"prefix":"10.3390","volume":"6","author":[{"given":"Jonathon","family":"Hare","sequence":"first","affiliation":[{"name":"Web and Internet Science Research Group, University of Southampton, Highfield, Southampton SO17 1PR, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Dupplaw","sequence":"additional","affiliation":[{"name":"Web and Internet Science Research Group, University of Southampton, Highfield, Southampton SO17 1PR, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paul","family":"Lewis","sequence":"additional","affiliation":[{"name":"Web and Internet Science Research Group, University of Southampton, Highfield, Southampton SO17 1PR, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wendy","family":"Hall","sequence":"additional","affiliation":[{"name":"Web and Internet Science Research Group, University of Southampton, Highfield, Southampton SO17 1PR, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kirk","family":"Martinez","sequence":"additional","affiliation":[{"name":"Web and Internet Science Research Group, University of Southampton, Highfield, Southampton SO17 1PR, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2014,4,24]]},"reference":[{"unstructured":"ARCOMEM: Archiving Community Memories. 
Available online: http:\/\/www.arcomem.eu\/.","key":"ref_1"},{"unstructured":"Tahmasebi, N., Demartini, G., Dupplaw, D., Hare, J., Ioannou, E., Jaimes, A., Lewis, P., Maynard, D., Peters, W., and Risse, T. Models and Architecture Definition\/Contribution to Models and Architecture Definition. ARCOMEM Deliverable D3.1\/D4.1. Available online: http:\/\/www.arcomem.eu\/wp-content\/uploads\/2012\/05\/D3_1.pdf.","key":"ref_2"},{"doi-asserted-by":"crossref","unstructured":"Hare, J.S., Dupplaw, D., Hall, W., Lewis, P., and Martinez, K. (2013, January 6). The Role of Multimedia in Archiving Community Memories. Proceedings of the 1st International Workshop on Archiving Community Memories, Lisbon, Portugal.","key":"ref_3","DOI":"10.3390\/fi6020242"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1108\/00220410710758977","article-title":"Facing the reality of semantic image retrieval","volume":"63","author":"Enser","year":"2007","journal-title":"J. Doc."},{"unstructured":"Hare, J.S., Sinclair, P.A.S., Lewis, P.H., Martinez, K., Enser, P.G., and Sandom, C.J. (2006, January 12). Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up Approaches. Proceedings of the 3rd European Semantic Web Conference, Budva, Montenegro.","key":"ref_5"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.1109\/34.895972","article-title":"Content-Based Image Retrieval at the End of the Early Years","volume":"22","author":"Smeulders","year":"2000","journal-title":"IEEE Trans. Pattern Anal. Mach. Intel."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1582","DOI":"10.1109\/TPAMI.2009.154","article-title":"Evaluating color descriptors for object and scene recognition","volume":"32","author":"Gevers","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intel."},{"unstructured":"Lazebnik, S., Schmid, C., and Ponce, J. (2006, January 17\u201322). Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, USA.","key":"ref_9"},{"unstructured":"Flickr Photo Sharing. Available online: http:\/\/www.flickr.com\/.","key":"ref_10"},{"doi-asserted-by":"crossref","unstructured":"Hare, J., Samangooei, S., and Lewis, P. (2011, January 17\u201320). Efficient Clustering and Quantisation of SIFT Features: Exploiting Characteristics of the SIFT Descriptor and Interest Region Detectors under Image Inversion. Proceedings of the 1st ACM International Conference on Multimedia Retrieval, Trento, Italy.","key":"ref_11","DOI":"10.1145\/1991996.1991998"},{"doi-asserted-by":"crossref","unstructured":"Zerr, S., Siersdorfer, S., Hare, J., and Demidova, E. (2012, January 12\u201316). Privacy-Aware Image Classification and Search. Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA.","key":"ref_12","DOI":"10.1145\/2348283.2348292"},{"doi-asserted-by":"crossref","unstructured":"Huiskes, M.J., and Lew, M.S. (2008, January 26\u201331). The MIR Flickr Retrieval Evaluation. Proceedings of the 2008 ACM International Conference on Multimedia Information Retrieval, Vancouver, BC, Canada.","key":"ref_13","DOI":"10.1145\/1460096.1460104"},{"doi-asserted-by":"crossref","unstructured":"Hare, J., and Lewis, P.H. (2013, January 16\u201319). Explicit diversification of image search. 
Proceedings of the 3rd ACM Conference on International Conference on Multimedia Retrieval, Dallas, TX, USA.","key":"ref_14","DOI":"10.1145\/2461466.2461513"},{"unstructured":"Zontone, P., Boato, G., Natale, F.G.B.D., Rosa, A.D., Barni, M., Piva, A., Hare, J., Dupplaw, D., and Lewis, P. (2009, January 25\u201329). Image Diversity Analysis: Context, Opinion and Bias. Proceedings of the First International Workshop on Living Web: Making Web Diversity a True Asset, Collocated with the 8th International Semantic Web Conference, Washington, DC, USA.","key":"ref_15"},{"doi-asserted-by":"crossref","unstructured":"Agrawal, R., Gollapudi, S., Halverson, A., and Ieong, S. (2009, January 9\u201312). Diversifying Search Results. Proceedings of the Second ACM International Conference on Web Search and Data Mining, Barcelona, Spain.","key":"ref_16","DOI":"10.1145\/1498759.1498766"},{"unstructured":"Ionescu, B., Men\u00e9ndez, M., M\u00fcller, H., and Popescu, A. (2013, January 18\u201319). Retrieving Diverse Social Images at MediaEval 2013: Objectives, Dataset and Evaluation. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_17"},{"unstructured":"Jain, N., Hare, J., Samangooei, S., Preston, J., Davies, J., Dupplaw, D., and Lewis, P.H. (2013, January 18\u201319). Experiments in Diversifying Flickr Result Sets. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_18"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1016\/S0031-3203(02)00052-3","article-title":"Automatic facial expression analysis: A survey","volume":"36","author":"Fasel","year":"2003","journal-title":"Pattern Recognit."},{"unstructured":"Tian, Y.l., Kanade, T., and Cohn, J.F. (2005). Handbook of Face Recognition, Springer.","key":"ref_20"},{"doi-asserted-by":"crossref","unstructured":"Pantic, M., Sebe, N., Cohn, J.F., and Huang, T. (2005, January 6\u201311). 
Affective Multimodal Human-Computer Interaction. Proceedings of the 13th annual ACM international conference on Multimedia, Singapore.","key":"ref_21","DOI":"10.1145\/1101149.1101299"},{"unstructured":"Wang, W., and He, Q. (2008, January 12\u201315). A Survey on Emotional Semantic Image Retrieval. Proceedings of the International Conference on Image Processing, San Diego, CA, USA.","key":"ref_22"},{"unstructured":"Zontone, P., Boato, G., Hare, J., Lewis, P., Siersdorfer, S., and Minack, E. (2010, January 16). Image and Collateral Text in Support of Auto-annotation and Sentiment Analysis. Proceedings of the 2010 Workshop on Graph-based Methods for Natural Language Processing, Uppsala, Sweden.","key":"ref_23"},{"unstructured":"Wang, W., Yu, Y., and Jiang, S. (2006, January 8\u201311). Image Retrieval by Emotional Semantics: A Study of Emotional Space and Feature Extraction. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Taipei, Taiwan.","key":"ref_24"},{"doi-asserted-by":"crossref","unstructured":"Yanulevskaya, V., van Gemert, J.C., Roth, K., Herbold, A.K., Sebe, N., and Geusebroek, J.M. (2008, January 12\u201315). Emotional Valence Categorization Using Holistic Image Features. Proceedings of the IEEE International Conference on Image Processing, San Diego, CA, USA.","key":"ref_25","DOI":"10.1109\/ICIP.2008.4711701"},{"doi-asserted-by":"crossref","unstructured":"Siersdorfer, S., Hare, J., Minack, E., and Deng, F. (2010, January 25\u201329). Analyzing and Predicting Sentiment of Images on the Social Web. Proceedings of the International Conference on Multimedia, Firenze, Italy.","key":"ref_26","DOI":"10.1145\/1873951.1874060"},{"unstructured":"Apache HBase Home. Available online: http:\/\/hbase.apache.org.","key":"ref_27"},{"unstructured":"Apache Hadoop Home. Available online: http:\/\/hadoop.apache.org.","key":"ref_28"},{"doi-asserted-by":"crossref","unstructured":"Kohlsch\u00fctter, C., Fankhauser, P., and Nejdl, W. 
(2010, January 3\u20136). Boilerplate Detection Using Shallow Text Features. Proceedings of the Third ACM International Conference on Web Search and Data Mining, New York, NY, USA.","key":"ref_29","DOI":"10.1145\/1718487.1718542"},{"unstructured":"Hare, J., Matthews, M., Dupplaw, D., and Samangooei, S. Readability4J\u2014Automated Webpage Information Extraction Engine. http:\/\/www.openimaj.org\/openimaj-web\/readability4j\/.","key":"ref_30"},{"unstructured":"Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V., Aswani, N., Roberts, I., Gorrell, G., Funk, A., Roberts, A., and Damljanovic, D. (2011). Text Processing with GATE, Version 6, Gateway Press.","key":"ref_31"},{"unstructured":"GATE General Architecture for Text Engineering. Available online: http:\/\/www.gate.ac.uk.","key":"ref_32"},{"unstructured":"Hare, J.S., Samangooei, S., and Dupplaw, D.P. (December, January 28). OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia Analysis and Indexing of Images. Proceedings of the 19th ACM international conference on Multimedia, Scottsdale, AZ, USA.","key":"ref_33"},{"unstructured":"OpenIMAJ Open Intelligent Multimedia Analysis in Java. Available online: http:\/\/www.openimaj.org.","key":"ref_34"},{"unstructured":"International Organization for Standardization (2009). Information and Documentation\u2014WARC File Format, ISO. ISO 28500:2009.","key":"ref_35"},{"doi-asserted-by":"crossref","unstructured":"Hare, J.S., Samangooei, S., Dupplaw, D.P., and Lewis, P.H. (2013, January 16\u201319). Twitter\u2019s Visual Pulse. Proceedings of the 3rd ACM International Conference on Multimedia Retrieval, Dallas, TX, USA.","key":"ref_36","DOI":"10.1145\/2461466.2461514"},{"doi-asserted-by":"crossref","unstructured":"Hare, J., Samangooei, S., Dupplaw, D., and Lewis, P. (2012, January 5\u20138). ImageTerrier: An Extensible Platform for Scalable High-Performance Image Retrieval. 
Proceedings of the ACM International Conference on Multimedia Retrieval, Hong Kong, China.","key":"ref_37","DOI":"10.1145\/2324796.2324844"},{"unstructured":"Gionis, A., Indyk, P., and Motwani, R. (1999, January 7\u201310). Similarity Search in High Dimensions via Hashing. Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, UK.","key":"ref_38"},{"doi-asserted-by":"crossref","unstructured":"Dong, W., Charikar, M., and Li, K. (2008, January 20\u201324). Asymmetric Distance Estimation with Sketches for Similarity Search in High-Dimensional Spaces. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Singapore, Singapore.","key":"ref_39","DOI":"10.1145\/1390334.1390358"},{"doi-asserted-by":"crossref","unstructured":"Dong, W., Wang, Z., Charikar, M., and Li, K. (2012, January 5\u20138). High-Confidence Near-Duplicate Image Detection. Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, Hong Kong, China.","key":"ref_40","DOI":"10.1145\/2324796.2324798"},{"unstructured":"Dupplaw, D.P., Hare, J.S., and Samangooei, S. Twitter\u2019s Visual Pulse Demo. Available online: https:\/\/www.youtube.com\/watch?v=CBk5nDd6CLU.","key":"ref_41"},{"unstructured":"MediaEval MediaEval Benchmarking Initiative for Multimedia Evaluation. Available online: http:\/\/www.multimediaeval.org.","key":"ref_42"},{"unstructured":"Reuter, T., Papadopoulos, S., Mezaris, V., Cimiano, P., de Vries, C., and Geva, S. (2013, January 18\u201319). Social Event Detection at MediaEval 2013: Challenges, Datasets, and Evaluation. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_43"},{"unstructured":"Samangooei, S., Hare, J., Dupplaw, D., Niranjan, M., Gibbins, N., Lewis, P., Davies, J., Jain, N., and Preston, J. (2013, January 18\u201319). 
Social Event Detection Via Sparse Multi-Modal Feature Selection and Incremental Density Based Clustering. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_44"},{"doi-asserted-by":"crossref","unstructured":"Parkhi, O., Vedaldi, A., and Zisserman, A. (2012, January 23\u201325). On-the-Fly Specific Person Retrieval. Proceedings of the 13th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Dublin, Ireland.","key":"ref_45","DOI":"10.1109\/WIAMIS.2012.6226775"},{"doi-asserted-by":"crossref","unstructured":"Psyllos, A., Anagnostopoulos, C.N., and Kayafas, E. (2012, January 24\u201327). M-SIFT: A New Method for Vehicle Logo Recognition. Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, Istanbul, Turkey.","key":"ref_46","DOI":"10.1109\/ICVES.2012.6294277"},{"doi-asserted-by":"crossref","unstructured":"Kalantidis, Y., Pueyo, L.G., Trevisiol, M., van Zwol, R., and Avrithis, Y. (2008, January 26\u201331). Scalable Triangulation-Based Logo Recognition. Proceedings of the 1st ACM International Conference on Multimedia Retrieval, Vancouver, BC, Canada.","key":"ref_47","DOI":"10.1145\/1991996.1992016"},{"unstructured":"Davies, J., Hare, J., Samangooei, S., Preston, J., Jain, N., Dupplaw, D., and Lewis, P.H. (2013, January 18\u201319). Identifying the Geographic Location of an Image with a Multimodal Probability Density Function. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_48"},{"unstructured":"Hauff, C., Thomee, B., and Trevisiol, M. (2013, January 18\u201319). Working Notes for the Placing Task at MediaEval 2013. Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain.","key":"ref_49"},{"doi-asserted-by":"crossref","unstructured":"Hare, J.S., Davies, J., Samangooei, S., and Lewis, P.H. (2014, January 1\u20134). Placing Photos with a Multimodal Probability Density Function. 
Proceedings of the International Conference on Multimedia Retrieval, Glasgow, Scotland, UK.","key":"ref_50","DOI":"10.1145\/2578726.2578768"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1006\/cviu.1995.1004","article-title":"Active shape models\u2014Their training and application","volume":"61","author":"Cootes","year":"1995","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1109\/34.927467","article-title":"Active appearance models","volume":"23","author":"Cootes","year":"2001","journal-title":"IEEE Trans. Pattern Anal. Mach. Intel."},{"doi-asserted-by":"crossref","unstructured":"Ekman, P., and Friesen, W. (1978). Facial Action Coding System: A Technique for the Measurement of Facial Movement, Consulting Psychologists Press.","key":"ref_53","DOI":"10.1037\/t27734-000"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1002\/asi.21043","article-title":"Collective indexing of emotions in images. A study in emotional information retrieval","volume":"60","author":"Schmidt","year":"2009","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"doi-asserted-by":"crossref","unstructured":"San Pedro, J., and Siersdorfer, S. (2009, January 20\u201324). Ranking and Classifying Attractiveness of Photos in Folksonomies. 
Proceedings of the 18th International World Wide Web Conference, Madrid, Spain.","key":"ref_55","DOI":"10.1145\/1526709.1526813"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/6\/2\/242\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:10:41Z","timestamp":1760217041000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/6\/2\/242"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,4,24]]},"references-count":55,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2014,6]]}},"alternative-id":["fi6020242"],"URL":"https:\/\/doi.org\/10.3390\/fi6020242","relation":{},"ISSN":["1999-5903"],"issn-type":[{"type":"electronic","value":"1999-5903"}],"subject":[],"published":{"date-parts":[[2014,4,24]]}}}