{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:03:30Z","timestamp":1760101410944,"version":"3.41.2"},"reference-count":34,"publisher":"Emerald","issue":"3","license":[{"start":{"date-parts":[[2023,3,1]],"date-time":"2023-03-01T00:00:00Z","timestamp":1677628800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["DTA"],"published-print":{"date-parts":[[2024,7,19]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>In federated search, a query is sent simultaneously to multiple resources and each one of them returns a list of results. These lists are merged into a single list using the results merging process. In this work, the authors apply machine learning methods for results merging in federated patent search. Even though several methods for results merging have been developed, none of them were tested on patent data nor considered several machine learning models. Thus, the authors experiment with state-of-the-art methods using patent data and they propose two new methods for results merging that use machine learning models.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>The methods are based on a centralized index containing samples of documents from all the remote resources, and they implement machine learning models to estimate comparable scores for the documents retrieved by different resources. The authors examine the new methods in cooperative and uncooperative settings where document scores from the remote search engines are available and not, respectively. 
In uncooperative environments, they propose two methods for assigning document scores.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The effectiveness of the new results merging methods was measured against state-of-the-art models and found to be superior to them in many cases with significant improvements. The random forest model achieves the best results in comparison to all other models and presents new insights for the results merging problem.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>In this article, the authors prove that machine learning models can substitute for other standard methods and models that have been used for results merging for many years. The proposed methods outperformed state-of-the-art estimation methods for results merging and proved more effective for federated patent search.<\/jats:p><\/jats:sec>","DOI":"10.1108\/dta-06-2021-0156","type":"journal-article","created":{"date-parts":[[2023,3,1]],"date-time":"2023-03-01T05:53:05Z","timestamp":1677649985000},"page":"363-379","source":"Crossref","is-referenced-by-count":4,"title":["Machine learning methods for results merging in patent retrieval"],"prefix":"10.1108","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5370-9695","authenticated-orcid":false,"given":"Vasileios","family":"Stamatis","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4087-125X","authenticated-orcid":false,"given":"Michail","family":"Salampasis","sequence":"additional","affiliation":[]},{"given":"Konstantinos","family":"Diamantaras","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2023,3,1]]},"reference":[{"issue":"3","key":"key2024072308212345300_ref001","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1002\/asi.20283","article-title":"The FedLemur project: federated search in the real 
world","volume":"57","year":"2006","journal-title":"Journal of the American Society for Information Science and Technology"},{"key":"key2024072308212345300_ref002","doi-asserted-by":"crossref","unstructured":"Callan, J. (2002), \u201cDistributed information retrieval\u201d, in Croft, W.B. (Ed.), Advances in Information Retrieval, The Information Retrieval Series, Vol. 7, Springer, Boston, MA, pp. 127-150. doi: 10.1007\/0-306-47019-5_5","DOI":"10.1007\/0-306-47019-5_5"},{"issue":"2","key":"key2024072308212345300_ref003","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1145\/382979.383040","article-title":"Query-based sampling of text databases","volume":"19","year":"2001","journal-title":"ACM Transactions on Information Systems"},{"first-page":"21","article-title":"Searching distributed collections with inference networks","year":"1995","key":"key2024072308212345300_ref004"},{"key":"key2024072308212345300_ref005","doi-asserted-by":"crossref","unstructured":"Clarke, N.S. (2018), \u201cThe basics of patent searching\u201d, World Patent Information, Vol. 54, pp. S4-S10.","DOI":"10.1016\/j.wpi.2017.02.006"},{"first-page":"189","article-title":"Merging results from isolated search engines","year":"1999","key":"key2024072308212345300_ref006"},{"key":"key2024072308212345300_ref007","doi-asserted-by":"crossref","unstructured":"Giachanou, A. and Salampasis, M. (2014), \u201cIPC Selection using collection selection algorithms\u201d, in Lamas, D. and Buitelaar, P. (Eds), Multidisciplinary Information Retrieval, Springer International Publishing, Cham, Vol. 8849, IRFC, LNCS, pp. 
41-52.","DOI":"10.1007\/978-3-319-12979-2_4"},{"key":"key2024072308212345300_ref008","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1007\/s10791-015-9270-2","article-title":"Multilayer source selection as a tool for supporting patent search and classification","volume":"18","year":"2015","journal-title":"Information Retrieval Journal"},{"year":"2011","key":"key2024072308212345300_ref009","article-title":"A weighted curve fitting method for result merging in federated search"},{"year":"2012","key":"key2024072308212345300_ref010","article-title":"Mixture model with multiple centralized retrieval algorithms for result merging in federated search"},{"key":"key2024072308212345300_ref011","article-title":"New re-ranking approach in merging search results","volume":"43","year":"2019","journal-title":"Informatica"},{"issue":"1","key":"key2024072308212345300_ref012","doi-asserted-by":"crossref","first-page":"6067","DOI":"10.35940\/ijeat.A1922.109119","article-title":"Effect of technical domains and patent structure on patent information retrieval","volume":"9","year":"2019","journal-title":"International Journal of Engineering and Advanced Technology"},{"year":"2015","key":"key2024072308212345300_ref013","article-title":"An optimization framework for merging multiple result lists"},{"issue":"1","key":"key2024072308212345300_ref014","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1500000027","article-title":"Patent retrieval","volume":"7","year":"2013","journal-title":"Foundations and Trends in Information Retrieval"},{"year":"2013","key":"key2024072308212345300_ref015","article-title":"Leveraging conceptual Lexicon: query disambiguation using proximity information for patent retrieval"},{"key":"key2024072308212345300_ref016","unstructured":"Mao, J., Mukherjee, R., Raghavan, P. and Tsaparas, P. (2004), \u201cMethod and apparatus for merging result lists from multiple search engines\u201d, US patent No. 
US 6,728,704 B2."},{"first-page":"173","article-title":"Results merging algorithm using multiple regression models","year":"2007","key":"key2024072308212345300_ref017"},{"issue":"4","key":"key2024072308212345300_ref018","doi-asserted-by":"crossref","first-page":"1580","DOI":"10.1016\/j.ipm.2007.12.008","article-title":"A Results Merging algorithm for distributed information retrieval environments that combines regression methodologies with a selective download phase","volume":"44","year":"2008","journal-title":"Information Processing and Management"},{"key":"key2024072308212345300_ref019","unstructured":"Piroi, F., Lupu, M., Hanbury, A. and Veronika, Z. (2011), \u201cCLEF-IP 2011: retrieval in the intellectual property domain\u201d, CLEF 2011 Labs and Workshop, Notebook Papers, Amsterdam."},{"key":"key2024072308212345300_ref020","doi-asserted-by":"crossref","unstructured":"Salampasis, M. (2017), \u201cFederated patent search\u201d, in Lupu, M., Mayer, K., Kando, N. and Trippe, A.J. (Eds), Current Challenges in Patent Information Retrieval, pp. 213-240.","DOI":"10.1007\/978-3-662-53817-3_8"},{"key":"key2024072308212345300_ref021","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.wpi.2014.08.001","article-title":"PerFedPat: an integrated federated system for patent search","volume":"38","year":"2014","journal-title":"World Patent Information"},{"key":"key2024072308212345300_ref022","unstructured":"Salampasis, M., Paltoglou, G. and Giahanou, A. (2012), \u201cReport on the CLEF-IP 2012 experiments: search of topically organized patents\u201d, in Forner, P., Karlgren, J. and Womser-Hacker, C. 
(Eds), CLEF (Online Working Notes\/Labs\/Workshop), [Wil88], Peter Willett, CEUR Workshop Proceedings, Aachen, Germany."},{"key":"key2024072308212345300_ref023","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1007\/s10115-018-1322-7","article-title":"Patent retrieval: a literature review","volume":"61","year":"2019","journal-title":"Knowledge and Information Systems"},{"year":"2011","key":"key2024072308212345300_ref024","article-title":"Lambdamerge: merging the results of query reformulations"},{"issue":"3","key":"key2024072308212345300_ref025","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1508850.1508852","article-title":"Robust result merging using sample-based score estimates","volume":"27","year":"2009","journal-title":"ACM Transactions on Information Systems"},{"issue":"4","key":"key2024072308212345300_ref026","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1145\/944012.944017","article-title":"A semisupervised learning method to merge search engine results","volume":"21","year":"2003","journal-title":"ACM Transactions on Information Systems"},{"year":"2020","key":"key2024072308212345300_ref027","article-title":"Results merging in the patent domain"},{"key":"key2024072308212345300_ref028","unstructured":"Taylor, M., Radlinski, F. and Shokouhi, M. (2016), \u201cMerging search results\u201d, US patent No. 
US 9,495,460 B2."},{"year":"2010","key":"key2024072308212345300_ref029","article-title":"Quantifying the challenges in parsing patent claims"},{"key":"key2024072308212345300_ref030","doi-asserted-by":"crossref","first-page":"2604","DOI":"10.1007\/s11771-016-3322-7","article-title":"Artificial neural network-based merging score for meta search engine","volume":"23","year":"2016","journal-title":"Journal of Central South University"},{"first-page":"225","article-title":"The collection fusion problem","year":"1995","key":"key2024072308212345300_ref031"},{"year":"2010","key":"key2024072308212345300_ref032","article-title":"PRES: a score metric for evaluating recall-oriented information retrieval applications"},{"issue":"4","key":"key2024072308212345300_ref033","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3239571","article-title":"Anserini: reproducible ranking baselines using Lucene","volume":"10","year":"2018","journal-title":"Journal of Data and Information Quality"},{"key":"key2024072308212345300_ref034","unstructured":"MAREC Data Set [Online] (2009), available at: https:\/\/researchdata.tuwien.ac.at\/records\/2zx6e-5pr64 (accessed 15 April 2020)."}],"container-title":["Data Technologies and 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-06-2021-0156\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-06-2021-0156\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:15:11Z","timestamp":1753398911000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/dta\/article\/58\/3\/363-379\/1226309"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,1]]},"references-count":34,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,3,1]]},"published-print":{"date-parts":[[2024,7,19]]}},"alternative-id":["10.1108\/DTA-06-2021-0156"],"URL":"https:\/\/doi.org\/10.1108\/dta-06-2021-0156","relation":{},"ISSN":["2514-9288","2514-9288"],"issn-type":[{"type":"print","value":"2514-9288"},{"type":"electronic","value":"2514-9288"}],"subject":[],"published":{"date-parts":[[2023,3,1]]}}}