{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T02:28:17Z","timestamp":1747189697092,"version":"3.40.5"},"reference-count":35,"publisher":"World Scientific Pub Co Pte Ltd","issue":"16","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772185"],"award-info":[{"award-number":["61772185"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61572377"],"award-info":[{"award-number":["61572377"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Zhejiang Lab Research Project","award":["2019KC0AC01"],"award-info":[{"award-number":["2019KC0AC01"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J CIRCUIT SYST COMP"],"published-print":{"date-parts":[[2020,12,30]]},"abstract":"<jats:p> As one of the most popular frameworks for large-scale analytics processing, Hadoop is facing two challenges: both applications and storage devices become heterogeneous. However, existing data placement and job scheduling schemes pay little attention to such heterogeneity of either application I\/O requirements or I\/O device capability, thus can greatly degrade system efficiencies. In this paper, we propose ASPS, an Application and Storage-aware data Placement and job Scheduling approach for Hadoop clusters. The idea is to place application data and schedule application tasks considering both application I\/O requirements and storage device characteristics. Specifically, ASPS first introduces novel metrics to quantify I\/O requirements of applications. Then, based on the quantification, ASPS places data of different applications to the preferred storage devices. Finally, ASPS tries to launch jobs with high I\/O requirements on the nodes with the same type of faster devices to improve system efficiency. We have implemented ASPS in Hadoop framework. Experimental results show that ASPS can reduce the completion time of a single application by up to 36% and the average completion time of six concurrent applications by 27%, compared to existing data placement policies and job scheduling approaches. <\/jats:p>","DOI":"10.1142\/s0218126620502540","type":"journal-article","created":{"date-parts":[[2020,12,5]],"date-time":"2020-12-05T03:17:01Z","timestamp":1607138221000},"page":"2050254","source":"Crossref","is-referenced-by-count":0,"title":["Application and Storage-Aware Data Placement and Job Scheduling for Hadoop Clusters"],"prefix":"10.1142","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3677-1320","authenticated-orcid":false,"given":"Tao","family":"Li","sequence":"first","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha, P.\u00a0R.\u00a0China"}]},{"given":"Shuibing","family":"He","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou, P.\u00a0R.\u00a0China"}]},{"given":"Ping","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou, P.\u00a0R.\u00a0China"}]},{"given":"Siling","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou, P.\u00a0R.\u00a0China"}]},{"given":"Yanlong","family":"Yin","sequence":"additional","affiliation":[{"name":"Intelligent Computing System Research Center, Institute of Artificial Intelligence, Zhejiang Lab, Hangzhou, P.\u00a0R.\u00a0China"}]},{"given":"Cheng","family":"Xu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha, P.\u00a0R.\u00a0China"}]}],"member":"219","published-online":{"date-parts":[[2020,12,21]]},"reference":[{"key":"S0218126620502540BIB001","first-page":"137","volume-title":"Proc. USENIX Symp. Operating Systems Design and Implementation (OSDI)","author":"Dean J.","year":"2004"},{"key":"S0218126620502540BIB003","first-page":"1","volume-title":"Proc. IEEE Symp. Mass Storage Systems and Technologies (MSST)","author":"Shvachko K.","year":"2010"},{"key":"S0218126620502540BIB004","first-page":"101","volume-title":"Proc. 15th IEEE\/ACM Int. Symp. Cluster, Cloud Grid Computing (CCGrid)","author":"Islam N. S.","year":"2015"},{"key":"S0218126620502540BIB005","first-page":"267","volume-title":"Proc. 9th USENIX Conf. Networked Systems Design and Implementation (NSDI)","author":"Ananthanarayanan G.","year":"2012"},{"key":"S0218126620502540BIB008","first-page":"335","volume-title":"Proc. 40th Int. Conf. Parallel Processing (ICPP)","author":"Ibrahim S.","year":"2011"},{"key":"S0218126620502540BIB009","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1145\/2462902.2462904","volume-title":"Proc. 22nd Int. Symp. High-Performance Parallel and Distributed Computing (HPDC)","author":"Bu X.","year":"2013"},{"key":"S0218126620502540BIB010","first-page":"1","volume-title":"Proc. Int. Conf. High Performance Computing, Networking, Storage and Analysis (SC)","author":"Li X.","year":"2013"},{"key":"S0218126620502540BIB011","first-page":"1","volume-title":"Proc. USENIX Conf. USENIX Annual Technical Conf. (ATC)","author":"Ahmad F.","year":"2014"},{"key":"S0218126620502540BIB012","first-page":"137","volume-title":"Proc. 11th Int. Conf. Autonomic Computing (ICAC)","author":"Pettijohn E.","year":"2014"},{"key":"S0218126620502540BIB013","first-page":"502","volume-title":"Proc. 14th IEEE\/ACM Int. Symp. Cluster, Cloud and Grid Computing (CCGrid)","author":"Krish K.","year":"2014"},{"key":"S0218126620502540BIB014","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1109\/BigData.2014.7004234","volume-title":"Proc. 2014 IEEE Int. Conf. Big Data (Big Data)","author":"Krish K.","year":"2014"},{"key":"S0218126620502540BIB015","first-page":"1","volume-title":"Proc. 2016 Int. Conf. Supercomputing (ICS)","author":"Islam N. S.","year":"2016"},{"key":"S0218126620502540BIB016","first-page":"1","volume-title":"Workshops and PhD forum of the Int. Symp. Parallel & Distributed Processing (IPDPSW)","author":"Xie J.","year":"2010"},{"key":"S0218126620502540BIB017","first-page":"29","volume-title":"Proc. 8th USENIX Conf. Operating Systems Design and Implementation (OSDI)","author":"Zaharia M.","year":"2008"},{"key":"S0218126620502540BIB018","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1145\/2213836.2213840","volume-title":"Proc. 2012 ACM Int. Conf. Management of Data (SIGMOD)","author":"Kwon Y.","year":"2012"},{"key":"S0218126620502540BIB019","first-page":"189","volume-title":"Proc. 11th Int. Conf. Autonomic Computing (ICAC)","author":"Zaheilas N.","year":"2014"},{"key":"S0218126620502540BIB020","first-page":"61","volume-title":"Proc. Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS)","author":"Ahmad F.","year":"2012"},{"key":"S0218126620502540BIB021","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1145\/2600212.2600229","volume-title":"Proc. 23rd Int. Symp. High-Performance Parallel and Distributed Computing (HPDC)","author":"Li M.","year":"2014"},{"key":"S0218126620502540BIB022","first-page":"255","volume-title":"Proc. IEEE 22nd Int. Symp. Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)","author":"Krish K.","year":"2014"},{"key":"S0218126620502540BIB027","first-page":"117","volume-title":"Proc. 12th USENIX Symp. Operating Systems Design and Implementation (OSDI)","author":"Jyothi S. A.","year":"2016"},{"key":"S0218126620502540BIB028","first-page":"36:1","volume-title":"Proc. Eleventh European Conf. Computer Systems (EuroSys)","author":"Rasley J.","year":"2016"},{"key":"S0218126620502540BIB029","doi-asserted-by":"crossref","first-page":"1111","DOI":"10.14778\/3402707.3402746","volume":"4","author":"Herodotou H.","year":"2011","journal-title":"Proc. VLDB Endow."},{"key":"S0218126620502540BIB030","first-page":"1","volume-title":"Proc. Third ACM Symp. Cloud Computing (SOCC)","author":"Rasmussen A.","year":"2012"},{"key":"S0218126620502540BIB031","doi-asserted-by":"crossref","first-page":"1736","DOI":"10.14778\/2367502.2367513","volume":"5","author":"Shinnar A.","year":"2012","journal-title":"Proc. VLDB Endow."},{"key":"S0218126620502540BIB033","first-page":"41","volume-title":"Proc. IEEE 26th Int. Conf. Data Engineering Workshops (ICDEW)","author":"Huang S.","year":"2010"},{"key":"S0218126620502540BIB034","first-page":"613","volume-title":"Proc. 29th IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)","author":"He S.","year":"2015"},{"key":"S0218126620502540BIB035","first-page":"1133","volume-title":"Proc. 32nd IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)","author":"He S.","year":"2018"},{"key":"S0218126620502540BIB036","first-page":"1","volume-title":"Proc. IEEE 24th Int. Conf. Parallel and Distributed Systems (ICPADS)","author":"Pan F.","year":"2018"},{"key":"S0218126620502540BIB037","doi-asserted-by":"crossref","first-page":"1269","DOI":"10.1109\/TCAD.2015.2501286","volume":"35","author":"Zhou J.","year":"2015","journal-title":"IEEE Trans. Comput. Aided Design Integr. Circuits Syst."},{"key":"S0218126620502540BIB038","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.sysarc.2017.09.007","volume":"82","author":"Zhou J.","year":"2018","journal-title":"J. Syst. Arch."},{"key":"S0218126620502540BIB039","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126619501901"},{"key":"S0218126620502540BIB040","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126619501597"},{"key":"S0218126620502540BIB041","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/j.future.2018.10.046","volume":"93","author":"Zhou X.","year":"2019","journal-title":"Future Gener Comput. Syst."},{"key":"S0218126620502540BIB042","doi-asserted-by":"publisher","DOI":"10.1142\/S021812661930006X"},{"key":"S0218126620502540BIB043","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.jpdc.2005.06.014","volume":"66","author":"Ucar B.","year":"2006","journal-title":"J. Parallel Distrib. Comput."}],"container-title":["Journal of Circuits, Systems and Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218126620502540","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,1,4]],"date-time":"2021-01-04T08:27:05Z","timestamp":1609748825000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218126620502540"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,21]]},"references-count":35,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2020,12,30]]}},"alternative-id":["10.1142\/S0218126620502540"],"URL":"https:\/\/doi.org\/10.1142\/s0218126620502540","relation":{},"ISSN":["0218-1266","1793-6454"],"issn-type":[{"type":"print","value":"0218-1266"},{"type":"electronic","value":"1793-6454"}],"subject":[],"published":{"date-parts":[[2020,12,21]]}}}