{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T21:08:00Z","timestamp":1770325680456,"version":"3.49.0"},"reference-count":15,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2019,8,6]],"date-time":"2019-08-06T00:00:00Z","timestamp":1565049600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2019,12,23]]},"abstract":"<jats:p>Real-world data is often interconnected, forming large and complex heterogeneous information networks (HINs) with multiple types of objects and links, such as bibliographic networks (DBLP) and knowledge bases (YaGo). Querying meta-paths requires exploring path instances, which can be computationally costly in large HINs. However, existing meta-path-based studies mostly focus on analytical applications of meta-paths rather than on systems that query meta-paths efficiently in large HINs. To bridge this gap, we present SparkHINlog, a system based on Apache Spark that handles meta-path queries efficiently on large-scale HINs. In SparkHINlog, we propose an algorithm that not only translates meta-paths into Datalog rules but also manages the working memory area of Datalog efficiently to increase the scalability of the system. To avoid the computing overhead of the join operations needed to discover path instances when evaluating these rules, we leverage motif finding, a tool in the GraphFrames library. With motif finding, SparkHINlog can reduce the time needed to evaluate the rules by finding paths on the graph instead of joining two relations. 
We conduct experimental comparisons with SparkDatalog, the state-of-the-art large-scale Datalog system, and verify the efficacy and effectiveness of our system in supporting meta-path queries.<\/jats:p>","DOI":"10.3233\/jifs-179362","type":"journal-article","created":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T11:28:29Z","timestamp":1565695709000},"page":"7555-7566","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["SparkHINlog: Extension of SparkDatalog for heterogeneous information network"],"prefix":"10.1177","volume":"37","author":[{"given":"Do","family":"Phuc","sequence":"first","affiliation":[{"name":"University of Information Technology, VNU-HCM"}]}],"member":"179","published-online":{"date-parts":[[2019,8,6]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Y.Li C.Shi P.S.Yu and Q.Chen HRank: A Path based Ranking Framework in Heterogeneous Information Network Web-Age Information Management in: 15th International Conference WAIM 2014 2014.","DOI":"10.1007\/978-3-319-08010-9_61"},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Y.Sun J.Han X.Yan P.Yu and T.Wu Path-Sim: Meta path-based top-k similarity search in heterogeneous information networks in VLDB 2011.","DOI":"10.14778\/3402707.3402736"},{"key":"e_1_3_2_4_2","unstructured":"C.Shi X.Kong Y.Huang P.S.Yu and B.Wu HeteSim: A General Framework for Relevance Measure in Heterogeneous Networks 2013. [Online]. Available: https:\/\/arxiv.org\/abs\/1309.7393. 
[Accessed 31 October 2018]."},{"key":"e_1_3_2_5_2","doi-asserted-by":"crossref","unstructured":"J.Kuck H.Zhuang X.Yan H.Cam and J.Han Query-based outlier detection in heterogeneous information networks in EDBT 2015.","DOI":"10.1109\/ICDM.2014.85"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"M.Ji Y.Sun M.Danilevsky J.Han and J.Gao Graph regularized transductive classification on heterogeneous information networks in ECML\/PKDD 2010.","DOI":"10.1007\/978-3-642-15880-3_42"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2016.2598561"},{"key":"e_1_3_2_8_2","unstructured":"J.Seo S.Guo and M.S.Lam SociaLite: Datalog Extensions for Efficient Social Network Analysis in IEEE ICDE 2014."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.14778\/2536274.2536290"},{"key":"e_1_3_2_10_2","volume-title":"Learning PySpark","author":"Drabas T.","year":"2017","unstructured":"T.Drabas and D.Lee, Learning PySpark, Packt, 2017."},{"key":"e_1_3_2_11_2","first-page":"993","article-title":"Latent dirichlet allocation","author":"Blei D.M.","year":"2013","unstructured":"D.M.Blei, A.Y.Ng and M.I.Jordan, Latent dirichlet allocation, Journal of Machine Learning Research (2013), 993\u20131022.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","unstructured":"L.Bellomarini E.Sallinger and G.Gottlob The Vadalog System: Datalog-based Reasoning for Knowledge Graphs in VLDB Endowment 2018.","DOI":"10.14778\/3213880.3213888"},{"issue":"12","key":"e_1_3_2_13_2","article-title":"Datalog and recursive query","volume":"5","author":"Green T.J.","year":"2012","unstructured":"T.J.Green, S.Huang, B.T.Loo and W.Zhou, Datalog and recursive query, Foundations and Trends in Databases 5(12) (2012).","journal-title":"Foundations and Trends in Databases"},{"key":"e_1_3_2_14_2","volume-title":"GraphX in Action","author":"Malak M.S.","year":"2016","unstructured":"M.S.Malak and R.East, GraphX in Action, 
Manning, 2016."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CSCI.2015.60"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/2960414.2960417"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179362","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-179362","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179362","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T18:20:59Z","timestamp":1770229259000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-179362"}},"subtitle":[],"editor":[{"given":"Ngoc Thanh","family":"Nguyen","sequence":"additional","affiliation":[]},{"given":"Edward","family":"Szczerbicki","sequence":"additional","affiliation":[]},{"given":"Bogdan","family":"Trawi\u0144ski","sequence":"additional","affiliation":[]},{"given":"Van Du","family":"Nguyen","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,8,6]]},"references-count":15,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019,12,23]]}},"alternative-id":["10.3233\/JIFS-179362"],"URL":"https:\/\/doi.org\/10.3233\/jifs-179362","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,6]]}}}