{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,9]],"date-time":"2025-04-09T04:30:57Z","timestamp":1744173057582,"version":"3.40.3"},"reference-count":37,"publisher":"World Scientific Pub Co Pte Ltd","issue":"03","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Soft. Eng. Knowl. Eng."],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:p> The advancement of the semantic web and Linked Open Data (LOD) cloud has led to the creation and integration of various knowledge bases defined by ontologies. A significant challenge within the LOD paradigm is identifying resources that refer to the same real-world object to enable large-scale data integration and sharing. In this context, instance matching has emerged as a key solution, linking co-referent instances from heterogeneous data sources using owl:sameAs links. Traditional approaches focus on schema-level matching but often fail to address property-level heterogeneity. Moreover, given the large scale of instances, examining all possible instance pairs is impractical. This paper proposes a scalable and efficient instance-matching approach using MongoDb (Humongous database) and Lucene. MongoDb stores instances at any scale and Lucene uses inverted indexes to identify matching candidates. Experiments on the instance matching track from the Ontology Alignment Evaluation Initiative (OAEI\u20192022) show that our approach matches the F-measure score of RE-Miner, the top performer in OAEI\u20192020, while surpassing all other participants in OAEI\u20192020, 2021 and 2022. Additionally, it operates 17 times faster than RE-Miner, four times faster than Lily and 15 times faster than LogMap, the fastest in OAEI\u20192020, 2021 and 2022, respectively. Moreover, we evaluate our approach on other knowledge bases from OAEI\u20192010. 
Once again, our approach gets highly competitive results compared to state-of-the-art approaches. <\/jats:p>","DOI":"10.1142\/s0218194025500111","type":"journal-article","created":{"date-parts":[[2025,3,18]],"date-time":"2025-03-18T09:11:21Z","timestamp":1742289081000},"page":"421-440","source":"Crossref","is-referenced-by-count":0,"title":["Scalable Instance Matching Using Lucene and Mongodb"],"prefix":"10.1142","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-6112-0141","authenticated-orcid":false,"given":"Siham","family":"Amrouch","sequence":"first","affiliation":[{"name":"LIM Laboratory, Computer Science Department, Mohamed Cherif Messaadia University, Souk Ahras, Algeria"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1756-1256","authenticated-orcid":false,"given":"Ryma","family":"Guefrouchi","sequence":"additional","affiliation":[{"name":"MISC Laboratory, Abdelhamid Mehri Constantine2 University, Constantine, Algeria"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2827-4946","authenticated-orcid":false,"given":"Nawel","family":"Zemmal","sequence":"additional","affiliation":[{"name":"Computer Science Department, Mohamed Cherif Messaadia University, Souk Ahras, Algeria"},{"name":"LabGED Laboratory, Computer Science Department, Badji Mokhtar University, Annaba, Algeria"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8939-8948","authenticated-orcid":false,"given":"Sadok Ben","family":"Yahia","sequence":"additional","affiliation":[{"name":"The Maersk Mc-Kinney Moller Institute, Centre for Industrial Software (CIS), University of Southern Denmark, 
Denmark"}]}],"member":"219","published-online":{"date-parts":[[2025,3,18]]},"reference":[{"key":"S0218194025500111BIB001","doi-asserted-by":"publisher","DOI":"10.4018\/jswis.2009081901"},{"key":"S0218194025500111BIB002","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2013.05.004"},{"key":"S0218194025500111BIB003","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2013.06.004"},{"key":"S0218194025500111BIB004","doi-asserted-by":"publisher","DOI":"10.14778\/2078331.2078332"},{"key":"S0218194025500111BIB005","doi-asserted-by":"publisher","DOI":"10.1007\/s13218-021-00713-x"},{"key":"S0218194025500111BIB006","first-page":"53","volume":"538","author":"Volz J.","year":"2009","journal-title":"LDOW"},{"key":"S0218194025500111BIB007","doi-asserted-by":"publisher","DOI":"10.1007\/s11280-006-0226-8"},{"key":"S0218194025500111BIB008","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2015.07.003"},{"key":"S0218194025500111BIB009","first-page":"211","volume-title":"Proc. 15th Int. Semantic Web Conf.","author":"Nassiri A.","year":"2020"},{"key":"S0218194025500111BIB010","doi-asserted-by":"publisher","DOI":"10.1145\/3418896"},{"key":"S0218194025500111BIB011","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963421"},{"key":"S0218194025500111BIB012","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00685-2_3"},{"key":"S0218194025500111BIB013","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.250581"},{"key":"S0218194025500111BIB014","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-13489-0_23"},{"key":"S0218194025500111BIB016","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-016-1620-z"},{"key":"S0218194025500111BIB017","doi-asserted-by":"publisher","DOI":"10.4018\/jswis.2011070103"},{"key":"S0218194025500111BIB018","doi-asserted-by":"publisher","DOI":"10.3233\/SW-150210"},{"key":"S0218194025500111BIB019","first-page":"369","volume-title":"Proc. 
1st Workshop about Linked Data on the Web","author":"Raimond Y.","year":"2008"},{"key":"S0218194025500111BIB020","first-page":"25","volume-title":"Proc. 3rd Int. Workshop on Social Data on the Web","author":"Sleeman J.","year":"2010"},{"volume-title":"MongoDB: The Definitive Guide: Powerful and Scalable Data Storage","year":"2019","author":"Bradshaw S.","key":"S0218194025500111BIB021"},{"key":"S0218194025500111BIB022","volume-title":"Information Retrieval","author":"J C.","year":"1979","edition":"2"},{"key":"S0218194025500111BIB023","first-page":"17","volume-title":"SIGIR 2012 Workshop on Open Source Information Retrieval","author":"Bia\u0142ecki A.","year":"2012"},{"key":"S0218194025500111BIB024","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41030-7_38"},{"key":"S0218194025500111BIB025","first-page":"154","volume-title":"Proc. 15th Int. Semantic Web Conf.","author":"Lima B.","year":"2020"},{"key":"S0218194025500111BIB026","first-page":"167","volume-title":"OM@ISWC","author":"Zou S.","year":"2021"},{"key":"S0218194025500111BIB027","first-page":"201","volume-title":"Proc. 15th Int. Semantic Web Conf.","author":"Jimenez-Ruiz E.","year":"2020"},{"key":"S0218194025500111BIB028","doi-asserted-by":"publisher","DOI":"10.3390\/e23050602"},{"key":"S0218194025500111BIB029","first-page":"187","volume-title":"Proc. 15th Int. Semantic Web Conf.","author":"Wang X.","year":"2020"},{"key":"S0218194025500111BIB030","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11964-9_3"},{"key":"S0218194025500111BIB031","first-page":"166","volume-title":"OM@ISWC","author":"Happi B. G. 
H.","year":"2022"},{"key":"S0218194025500111BIB032","doi-asserted-by":"publisher","DOI":"10.1145\/1132956.1132959"},{"key":"S0218194025500111BIB033","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-35176-1_29"},{"key":"S0218194025500111BIB034","doi-asserted-by":"publisher","DOI":"10.1109\/ACIT53391.2021.9677377"},{"issue":"3","key":"S0218194025500111BIB035","first-page":"432","volume":"19","author":"Amrouch S.","year":"2022","journal-title":"Int. Arab J. Inf. Technol."},{"key":"S0218194025500111BIB036","first-page":"42","volume-title":"Proc. 15th Int. Workshop on Ontology Matching","author":"Pour M.","year":"2020"},{"key":"S0218194025500111BIB037","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2009.04.001"},{"key":"S0218194025500111BIB038","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.202"}],"container-title":["International Journal of Software Engineering and Knowledge Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218194025500111","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,9]],"date-time":"2025-04-09T01:53:32Z","timestamp":1744163612000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218194025500111"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":37,"journal-issue":{"issue":"03","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["10.1142\/S0218194025500111"],"URL":"https:\/\/doi.org\/10.1142\/s0218194025500111","relation":{},"ISSN":["0218-1940","1793-6403"],"issn-type":[{"type":"print","value":"0218-1940"},{"type":"electronic","value":"1793-6403"}],"subject":[],"published":{"date-parts":[[2025,3]]}}}