{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T08:59:36Z","timestamp":1776329976174,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,9,16]],"date-time":"2020-09-16T00:00:00Z","timestamp":1600214400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,16]],"date-time":"2020-09-16T00:00:00Z","timestamp":1600214400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Big graphs are part of the movement of \u201cNot Only SQL\u201d databases (also called NoSQL) focusing on the relationships between data, rather than the values themselves. The data is stored in vertices while the edges model the interactions or relationships between these data. They offer flexibility in handling data that is strongly connected to each other. The analysis of a big graph generally involves exploring all of its vertices. Thus, this operation is costly in time and resources because big graphs are generally composed of millions of vertices connected through billions of edges. Consequently, the graph algorithms are expansive compared to the size of the big graph, and are therefore ineffective for data exploration. Thus, partitioning the graph stands out as an efficient and less expensive alternative for exploring a big graph. This technique consists in partitioning the graph into a set of k sub-graphs in order to reduce the complexity of the queries. Nevertheless, it presents many challenges because it is an NP-complete problem. In this article, we present DPHV (Distributed Placement of Hub-Vertices) an efficient parallel and distributed heuristic for large-scale graph partitioning. An application on a real-world graphs demonstrates the feasibility and reliability of our method. The experiments carried on a 10-nodes Spark cluster proved that the proposed methodology achieves significant gain in term of time and outperforms JA-BE-JA, Greedy, DFEP.<\/jats:p>","DOI":"10.1186\/s40537-020-00357-y","type":"journal-article","created":{"date-parts":[[2020,9,16]],"date-time":"2020-09-16T07:05:01Z","timestamp":1600239901000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["DHPV: a distributed algorithm for large-scale graph partitioning"],"prefix":"10.1186","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1445-8018","authenticated-orcid":false,"given":"Wilfried Yves Hamilton","family":"Adoni","sequence":"first","affiliation":[]},{"given":"Tarik","family":"Nahhal","sequence":"additional","affiliation":[]},{"given":"Moez","family":"Krichen","sequence":"additional","affiliation":[]},{"given":"Abdeltif","family":"El byed","sequence":"additional","affiliation":[]},{"given":"Ismail","family":"Assayad","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,16]]},"reference":[{"key":"357_CR1","first-page":"2","volume":"9","author":"K Danai","year":"2017","unstructured":"Danai K, Christos F. Individual and collective graph mining: principles, algorithms, and applications. Synth Lect Data Mining Knowl Discov. 2017;9:2.","journal-title":"Synth Lect Data Mining Knowl Discov"},{"issue":"1","key":"357_CR2","doi-asserted-by":"crossref","first-page":"19","DOI":"10.5808\/GI.2017.15.1.19","volume":"15","author":"B Yoon","year":"2017","unstructured":"Yoon B, Kim S, Kim S. Use of graph database for the integration of heterogeneous biological data. Genomics Inf. 2017;15(1):19\u201327.","journal-title":"Genomics Inf"},{"key":"357_CR3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.bdr.2016.07.002","volume":"6","author":"S Aridhi","year":"2016","unstructured":"Aridhi S, Nguifo EM. Big graph mining: frameworks and techniques. Big Data Res. 2016;6:1\u201310.","journal-title":"Big Data Res"},{"issue":"4","key":"357_CR4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2746403","volume":"10","author":"M Jiang","year":"2016","unstructured":"Jiang M, Cui P, Beutel A, Faloutsos C, Yang S. Catching synchronized behaviors in large networks: a graph mining approach. ACM Trans Knowl Discov Data. 2016;10(4):1\u201327.","journal-title":"ACM Trans Knowl Discov Data"},{"issue":"1","key":"357_CR5","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1016\/j.tcs.2007.09.013","volume":"389","author":"VE Alekseev","year":"2007","unstructured":"Alekseev VE, Boliac R, Korobitsyn DV, Lozin VV. NP-hard graph problems and boundary classes of graphs. Theor Comput Sci. 2007;389(1):219\u201336.","journal-title":"Theor Comput Sci"},{"issue":"4","key":"357_CR6","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1137\/060666238","volume":"21","author":"K Cameron","year":"2008","unstructured":"Cameron K, Eschen EM, Ho\u00e1ng CT, Sritharan R. The complexity of the list partition problem for graphs. SIAM J Discrete Math. 2008;21(4):900\u201329.","journal-title":"SIAM J Discrete Math"},{"key":"357_CR7","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1109\/MCSE.2009.120","volume":"11","author":"J Cohen","year":"2009","unstructured":"Cohen J. Graph twiddling in a MapReduce world. Comput Sci Eng. 2009;11:29\u201341.","journal-title":"Comput Sci Eng"},{"issue":"2","key":"357_CR8","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1007\/s10619-019-07276-9","volume":"38","author":"HWY Adoni","year":"2020","unstructured":"Adoni HWY, Nahhal T, Krichen M, Aghezzaf B, Elbyed A. A survey of current challenges in partitioning and processing of graph-structured data in parallel and distributed systems. Distrib Parall Datab. 2020;38(2):495\u2013530.","journal-title":"Distrib Parall Datab"},{"key":"357_CR9","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1016\/j.engappai.2015.02.008","volume":"41","author":"S Aridhi","year":"2015","unstructured":"Aridhi S, Lacomme P, Ren L, Vincent B. A mapreduce-based approach for shortest path problem in large-scale networks. Eng Appl Artif Intellig. 2015;41:151\u201365.","journal-title":"Eng Appl Artif Intellig"},{"key":"357_CR10","first-page":"129","volume":"73","author":"BV Cherkassky","year":"1993","unstructured":"Cherkassky BV, Goldberg AV, Radzik T. Shortest paths algorithms: theory and experimental evaluation. Math Programm. 1993;73:129\u201374.","journal-title":"Math Programm"},{"key":"357_CR11","doi-asserted-by":"crossref","unstructured":"Adoni Wilfried YH, Nahhal T, Aghezzaf B, Elbyed A. MRA*: Parallel and distributed path in large-scale graph using mapReduce-A* based approach. In: Ubiquitous networking, lecture notes in computer science. Springer, Cham, May 2017, pp. 390\u2013401.","DOI":"10.1007\/978-3-319-68179-5_34"},{"key":"357_CR12","doi-asserted-by":"crossref","unstructured":"Adoni Wilfried YH, Nahhal T, Aghezzaf B, Elbyed A. The MapReduce-based approach to improve the shortest path computation in large-scale road networks. In: The case of A* algorithm. Journal of Big Data, 5, 2018.","DOI":"10.1186\/s40537-018-0125-8"},{"key":"357_CR13","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.bdr.2017.05.003","volume":"9","author":"S Aridhi","year":"2017","unstructured":"Aridhi S, Montresor A, Velegrakis Y. BLADYG: a graph processing framework for large dynamic graphs. Big Data Res. 2017;9:9\u201317.","journal-title":"Big Data Res"},{"key":"357_CR14","doi-asserted-by":"crossref","unstructured":"Vavilapalli VK, Seth S, Saha B, Curino C, O\u2019Malley O, Radia S, Reed B, Baldeschwieler E, Murthy AC, Douglas C, Agarwal S, Konar M, Evans R, Graves T, Lowe J, Shah H. Apache hadoop YARN: yet another resource negotiator. In: Proceedings of the 4th Annual Symposium on Cloud Computing, pp. 1\u201316, Santa Clara, California, 2013. ACM Press.","DOI":"10.1145\/2523616.2523633"},{"issue":"10\u201310","key":"357_CR15","first-page":"95","volume":"10","author":"M Zaharia","year":"2010","unstructured":"Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I. Spark: cluster computing with working sets. HotCloud. 2010;10(10\u201310):95.","journal-title":"HotCloud"},{"key":"357_CR16","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.knosys.2018.05.006","volume":"157","author":"BA Hammou","year":"2018","unstructured":"Hammou BA, Lahcen AA, Mouline S. APRA: an approximate parallel recommendation algorithm for Big Data. Knowl Based Syst. 2018;157:10\u20139.","journal-title":"Knowl Based Syst"},{"issue":"2","key":"357_CR17","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1109\/MC.2012.37","volume":"45","author":"E Brewer","year":"2012","unstructured":"Brewer E. Pushing the CAP: strategies for consistency and availability. Computer. 2012;45(2):23\u20139.","journal-title":"Computer"},{"key":"357_CR18","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1007\/978-3-319-49340-4_17","volume-title":"Handbook of big data technologies","author":"AP Appel","year":"2017","unstructured":"Appel AP, Moyano LG. Link and graph mining in the big data era. In: Zomaya AY, Sakr S, editors. Handbook of big data technologies. Cham: Springer; 2017. p. 583\u2013616."},{"issue":"4","key":"357_CR19","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1145\/358841.358850","volume":"23","author":"JL Bentley","year":"1980","unstructured":"Bentley JL. Multidimensional divide-and-conquer. Commun ACM. 1980;23(4):214\u201329.","journal-title":"Commun ACM"},{"key":"357_CR20","doi-asserted-by":"crossref","unstructured":"Shin K, Eliassi-Rad T, Faloutsos C. CoreScope: graph mining using k-core analysis patterns, anomalies and algorithms. In: 2016 IEEE 16th international conference on data mining (ICDM), pp. 469\u2013478, December 2016. ISSN: 2374-8486.","DOI":"10.1109\/ICDM.2016.0058"},{"key":"357_CR21","unstructured":"Guerrieri A. Distributed computing for large-scale graphs. Ph.D. thesis, University of Trento, 2015."},{"issue":"2","key":"357_CR22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2714568","volume":"10","author":"F Rahimian","year":"2015","unstructured":"Rahimian F, Payberah AH, Girdzijauskas S, Jelasity M, Haridi S. A distributed algorithm for large-scale graph partitioning. ACM Trans Autonom Adapt Syst. 2015;10(2):1\u201324.","journal-title":"ACM Trans Autonom Adapt Syst"},{"key":"357_CR23","unstructured":"Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C. PowerGraph: distributed graph-parallel computation on natural graphs. In: Proceedings of the 10th USENIX conference on operating systems design and implementation, OSDI\u201912, pages 17\u201330, Berkeley, CA, USA, 2012. USENIX Association."},{"key":"357_CR24","doi-asserted-by":"crossref","unstructured":"Rahimian F, Payberah AH, Girdzijauskas S, Haridi S. Distributed vertex-cut partitioning. In: IFIP international conference on distributed applications and interoperable systems. Springer, 2014, p 186\u2013200.","DOI":"10.1007\/978-3-662-43352-2_15"},{"key":"357_CR25","doi-asserted-by":"crossref","unstructured":"Yan D, Huang L, Jordan MI. Fast approximate spectral clustering. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, KDD \u201909, New York; 2009. ACM, p 907\u2013916.","DOI":"10.1145\/1557019.1557118"},{"key":"357_CR26","unstructured":"Martin Charles\u00a0H. and Ph.D. Spectral clustering: a quick overview, 2012."},{"issue":"2","key":"357_CR27","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1002\/j.1538-7305.1970.tb01770.x","volume":"49","author":"BW Kernighan","year":"1970","unstructured":"Kernighan BW, Lin S. An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J. 1970;49(2):291\u2013307.","journal-title":"Bell Syst Tech J"},{"key":"357_CR28","doi-asserted-by":"crossref","unstructured":"Fiduccia CM, Mattheyses RM. A Linear-time Heuristic for Improving Network Partitions. In: Proceedings of the 19th Design Automation Conference, DAC \u201982. Piscataway: IEEE Press; 1982, p 175\u2013181.","DOI":"10.1109\/DAC.1982.1585498"},{"key":"357_CR29","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1137\/S1064827595287997","volume":"20","author":"G Karypis","year":"1998","unstructured":"Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput. 1998;20:359\u201392.","journal-title":"SIAM J Sci Comput"},{"key":"357_CR30","doi-asserted-by":"crossref","unstructured":"Karypis G, Kumar V. Multilevel algorithms for multi-constraint graph partitioning. In: Proceedings of the 1998 ACM\/IEEE conference on supercomputing, SC \u201998. Washington: IEEE Computer Society; 1998, p 1\u201313.","DOI":"10.1109\/SC.1998.10018"},{"key":"357_CR31","doi-asserted-by":"crossref","unstructured":"Karypis G, Kumar V. Multilevel K-way hypergraph partitioning. In: Proceedings of the 36th annual ACM\/IEEE design automation conference, DAC \u201999, New York: ACM; 1999, p 343\u2013348.","DOI":"10.1145\/309847.309954"},{"key":"357_CR32","doi-asserted-by":"crossref","unstructured":"Schloegel K, Karypis G, Kumar V. Parallel multilevel algorithms for multi-constraint graph partitioning. In: Euro-par 2000 parallel processing, lecture notes in computer science. Berlin: Springer; 2000, p 296\u2013310.","DOI":"10.1007\/3-540-44520-X_39"},{"key":"357_CR33","unstructured":"Kyrola A, Blelloch G, Guestrin C. GraphChi: large-scale graph computation on just a PC. In: Proceedings of the 10th USENIX conference on operating systems design and implementation, OSDI\u201912. Berkeley: USENIX Association; 2012. , p 31\u201346."},{"key":"357_CR34","doi-asserted-by":"crossref","unstructured":"Tsourakakis C, Gkantsidis C, Radunovic B, Vojnovic M. FENNEL: streaming graph partitioning for massive scale graphs. In: Proceedings of the 7th ACM international conference on web search and data mining, WSDM \u201914. New York: ACM; 2014, p 333\u2013342.","DOI":"10.1145\/2556195.2556213"},{"issue":"6","key":"357_CR35","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1002\/sam.10090","volume":"3","author":"CC Aggarwal","year":"2010","unstructured":"Aggarwal CC, Zhao Y, Philip SY. A framework for clustering massive graph streams. Stat Anal Data Mining. 2010;3(6):399\u2013416.","journal-title":"Stat Anal Data Mining"},{"key":"357_CR36","doi-asserted-by":"crossref","unstructured":"Kao E, Gadepally V, Hurley M, Jones M, Kepner J, Mohindra S, Monticciolo P, Reuther A, Samsi S, Song W, Staheli D, Smith S. Streaming graph challenge: stochastic block partition. In: 2017 IEEE High performance extreme computing conference (HPEC). 2017, p 1\u201312.","DOI":"10.1109\/HPEC.2017.8091040"},{"key":"357_CR37","doi-asserted-by":"crossref","unstructured":"Stanton I, Kliot G. Streaming graph partitioning for large distributed graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD \u201912. New York: ACM; 2012, p 1222\u20131230.","DOI":"10.1145\/2339530.2339722"},{"issue":"5","key":"357_CR38","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1504\/IJBIC.2011.042257","volume":"3","author":"K Tashkova","year":"2011","unstructured":"Tashkova K, Koros\u0306ec P, S\u0306ilc J. A distributed multilevel ant-colony algorithm for the multi-way graph partitioning. Int J Bio-Inspired Comput. 2011;3(5):286\u201396.","journal-title":"Int J Bio-Inspired Comput"},{"key":"357_CR39","doi-asserted-by":"crossref","unstructured":"Ushijima-Mwesigwa H, Negre CFA, Mniszewski SM. Graph partitioning using quantum annealing on the D-wave system. In: Proceedings of the second international workshop on post moores era supercomputing, PMES\u201917. Denver: Association for Computing Machinery; 2017, p 22\u201329.","DOI":"10.1145\/3149526.3149531"},{"issue":"9","key":"357_CR40","doi-asserted-by":"crossref","first-page":"2625","DOI":"10.1109\/TPDS.2017.2671868","volume":"28","author":"H Meyerhenke","year":"2017","unstructured":"Meyerhenke H, Sanders P, Schulz C. Parallel graph partitioning for complex networks. IEEE Trans Parallel Distrib Syst. 2017;28(9):2625\u201338.","journal-title":"IEEE Trans Parallel Distrib Syst"},{"issue":"8","key":"357_CR41","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1145\/79173.79181","volume":"33","author":"LG Valiant","year":"1990","unstructured":"Valiant LG. A bridging model for parallel computation. Commun ACM. 1990;33(8):103\u201311.","journal-title":"Commun ACM"},{"issue":"7","key":"357_CR42","doi-asserted-by":"crossref","first-page":"817","DOI":"10.1016\/j.parco.2004.04.001","volume":"30","author":"ML Massie","year":"2004","unstructured":"Massie ML, Chun BN, Culler DE. The ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput. 2004;30(7):817\u201340.","journal-title":"Parallel Comput"},{"key":"357_CR43","doi-asserted-by":"crossref","unstructured":"Junghanns M, Petermann A, Teichmann N, Gomez K, Rahm E. Analyzing extended property graphs with Apache Flink. In: Proceedings of the 1st ACM SIGMOD workshop on network data analytics\u2014NDA \u201916. San Francisco: ACM Press; 2016, p 1\u20138.","DOI":"10.1145\/2980523.2980527"},{"key":"357_CR44","unstructured":"Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I. Graphx: graph processing in a distributed dataflow framework. In: 11th $$USENIX$$ symposium on operating systems design and implementation ($$OSDI$$ 14). 2014, p 599\u2013613."},{"key":"357_CR45","doi-asserted-by":"crossref","unstructured":"Shvachko K, Kuang H, Radia S, Chansler R. The hadoop distributed file system. In: Proceedings of the 2010 IEEE 26th symposium on mass storage systems and technologies (MSST). IEEE Computer Society, 2010, p 1\u201310.","DOI":"10.1109\/MSST.2010.5496972"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00357-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-020-00357-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00357-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,15]],"date-time":"2021-09-15T20:12:18Z","timestamp":1631736738000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-020-00357-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,16]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["357"],"URL":"https:\/\/doi.org\/10.1186\/s40537-020-00357-y","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-30206\/v2","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-30206\/v1","asserted-by":"object"}]},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,16]]},"assertion":[{"value":"19 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 September 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"76"}}