{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T21:37:50Z","timestamp":1776375470705,"version":"3.51.2"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,9,1]],"date-time":"2022-09-01T00:00:00Z","timestamp":1661990400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,1]],"date-time":"2022-09-01T00:00:00Z","timestamp":1661990400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"national natural science foundation of china","doi-asserted-by":"publisher","award":["61872387"],"award-info":[{"award-number":["61872387"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cloud Comp"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Modern data centers have widely deployed lots of cluster computing applications such as MapReduce and Spark. Since the coflow\/task abstraction can exactly express the requirements of cluster computing applications, various task-based solutions have been proposed to improve application-level performance. However, most of solutions require modification of the applications to obtain task information, making them impractical in many scenarios. In this paper, we propose a Bayesian decision-based Task Prediction mechanism named BTP to identify task and predict the task-size category. First, we design an automatic identification mechanism to identify tasks without manually modifying the applications. Then we leverage bayesian decision to predict the task-size category. 
Through a series of large-scale NS2 simulations, we demonstrate that BTP can accurately identify task and predict the task-size category. More specifically, BTP achieves 96% precision and 92% recall while obtaining accuracy by up to 98%.<\/jats:p>","DOI":"10.1186\/s13677-022-00312-7","type":"journal-article","created":{"date-parts":[[2022,9,1]],"date-time":"2022-09-01T12:02:53Z","timestamp":1662033773000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["BTP: automatic identification and prediction of tasks in data center networks"],"prefix":"10.1186","volume":"11","author":[{"given":"Shaojun","family":"Zou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Ji","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7578-4490","authenticated-orcid":false,"given":"Jiawei","family":"Huang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,9,1]]},"reference":[{"key":"312_CR1","doi-asserted-by":"crossref","unstructured":"Dogar FR, Karagiannis T, Ballani H, Rowstron A (2014) Decentralized task-aware scheduling for data center networks In: Proc. ACM SIGCOMM, New York 431\u2013442.","DOI":"10.1145\/2740070.2626322"},{"issue":"1","key":"312_CR2","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1109\/TNET.2018.2890010","volume":"27","author":"S Liu","year":"2019","unstructured":"Liu S, Huang J, Zhou Y, Wang J, He T (2019) Task-aware TCP in data center networks. 
IEEE\/ACM Trans Networking 27(1):389\u2013404.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"4","key":"312_CR3","doi-asserted-by":"publisher","first-page":"1954","DOI":"10.1109\/TNET.2017.2669216","volume":"25","author":"W Bai","year":"2017","unstructured":"Bai W, Chen L, Chen K, et al. (2017) PIAS: Practical information-agnostic flow scheduling for commodity data centers. IEEE\/ACM Trans Networking 25(4):1954\u20131967.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"4","key":"312_CR4","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1017\/CBO9780511984037.006","volume":"11","author":"AL Yuille","year":"1996","unstructured":"Yuille AL, B\u00fclthoff HH (1996) Bayesian decision theory and psychophysics. Percept Bayesian Infer 11(4):123\u2013161.","journal-title":"Percept Bayesian Infer"},{"issue":"3","key":"312_CR5","first-page":"749","volume":"8","author":"J Huang","year":"2020","unstructured":"Huang J, Huang Y, Wang J, He T (2020) Adjusting packet size to mitigate TCP Incast in data center networks with COTS switches. IEEE Trans Cloud Comput 8(3):749\u2013763.","journal-title":"IEEE Trans Cloud Comput"},{"issue":"1","key":"312_CR6","first-page":"134","volume":"29","author":"S Zou","year":"2020","unstructured":"Zou S, Huang J, Wang J, He T (2020) Flow-aware adaptive pacing to mitigate TCP incast in data center networks. IEEE\/ACM Trans Networking 29(1):134\u2013147.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"5","key":"312_CR7","doi-asserted-by":"publisher","first-page":"2364","DOI":"10.1109\/TNET.2020.3012556","volume":"28","author":"T Zhang","year":"2020","unstructured":"Zhang T, Huang J, Chen K, Wang J, Chen J, Pan Y, Min G (2020) Rethinking fast and friendly transport in data center networks. 
IEEE\/ACM Trans Networking 28(5):2364\u20132377.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"1","key":"312_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13677-020-00160-3","volume":"9","author":"J Huang","year":"2020","unstructured":"Huang J, Li W, Li Q, Zhang T, Dong P, Wang J (2020) Tuning high flow concurrency for MPTCP in data center networks. J Cloud Comput 9(1):1\u201315.","journal-title":"J Cloud Comput"},{"key":"312_CR9","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/j.jnca.2019.01.024","volume":"131","author":"J Huang","year":"2019","unstructured":"Huang J, Li S, Han R, Wang J (2019) Receiver-driven fair congestion control for TCP outcast in data center networks. J Netw Comput Appl 131:75\u201388.","journal-title":"J Netw Comput Appl"},{"key":"312_CR10","doi-asserted-by":"crossref","unstructured":"Zeng G, Bai W, Chen G, Chen K, Han D, Zhu Y, Cui L (2019) Congestion control for cross-datacenter networks In: Proc. IEEE ICNP,\u00a0Piscataway 1\u201312.","DOI":"10.1109\/ICNP.2019.8888042"},{"key":"312_CR11","doi-asserted-by":"crossref","unstructured":"Hu S, Bai W, Zeng G, Wang Z, Qiao B, Chen K, Tan K, Wang Y (2020) Aeolus: A building block for proactive transport in datacenters In: Proc. ACM SIGCOMM,\u00a0New York 1\u201313.","DOI":"10.1145\/3387514.3405878"},{"issue":"2020","key":"312_CR12","doi-asserted-by":"publisher","first-page":"546","DOI":"10.1016\/j.future.2020.03.016","volume":"108","author":"S Zou","year":"2020","unstructured":"Zou S, Huang J, Jiang W, Wang J (2020) Achieving high utilization of flowlet-based load balancing in data center networks. Futur Gener Comput Syst 108(2020):546\u2013559.","journal-title":"Futur Gener Comput Syst"},{"key":"312_CR13","doi-asserted-by":"publisher","unstructured":"Liu J, Huang J, Lv W, Wang J (2020) APS: Adaptive packet spraying to isolate mix-flows in data center network. IEEE Trans Cloud Comput:1\u201314. 
https:\/\/doi.org\/10.1109\/TCC.2020.2985037.","DOI":"10.1109\/TCC.2020.2985037"},{"issue":"6","key":"312_CR14","doi-asserted-by":"publisher","first-page":"2338","DOI":"10.1109\/TNET.2019.2945863","volume":"27","author":"J Hu","year":"2019","unstructured":"Hu J, Huang J, Lv W, Zhou Y, Wang J, He T (2019) CAPS: Coding-based adaptive packet spraying to reduce flow completion time in data center. IEEE\/ACM Trans Networking 27(6):2338\u20132353.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"3","key":"312_CR15","doi-asserted-by":"publisher","first-page":"1183","DOI":"10.1109\/TNET.2021.3056601","volume":"29","author":"J Huang","year":"2021","unstructured":"Huang J, Lv W, Li W, Wang J, He T (2021) Mitigating packet reordering for random packet spraying in data center networks. IEEE\/ACM Trans Networking 29(3):1183\u20131196.","journal-title":"IEEE\/ACM Trans Networking"},{"issue":"12","key":"312_CR16","doi-asserted-by":"publisher","first-page":"8363","DOI":"10.1109\/TCOMM.2021.3118467","volume":"69","author":"S Zou","year":"2021","unstructured":"Zou S, Huang J, Wang J, He T (2021) RMC: Reordering marking and coding for fine-grained load balancing in data centers. IEEE Trans Commun 69(12):8363\u20138374.","journal-title":"IEEE Trans Commun"},{"key":"312_CR17","doi-asserted-by":"crossref","unstructured":"Hu S, Zhu Y, Cheng P, Guo C, Tan K, Padhye J, Chen K (2016) Deadlocks in datacenter networks: why do they form, and how to avoid them In: Proc. ACM HotNets,\u00a0New York 1\u20137.","DOI":"10.1145\/3005745.3005760"},{"key":"312_CR18","doi-asserted-by":"crossref","unstructured":"Hu S, Zhu Y, Cheng P, Guo C, Tan K, Padhye J, Chen K (2017) Tagger: Practical PFC deadlock prevention in data center networks In: Proc. 
ACM CoNEXT,\u00a0New York 451\u2013463.","DOI":"10.1145\/3143361.3143382"},{"key":"312_CR19","doi-asserted-by":"crossref","unstructured":"Chowdhury M, Zaharia M, Ma J, Jordan MI, Stoica I (2011) Managing data transfers in computer clusters with Orchestra In: Proc. ACM SIGCOMM,\u00a0New York 98\u2013109.","DOI":"10.1145\/2043164.2018448"},{"key":"312_CR20","doi-asserted-by":"crossref","unstructured":"Chowdhury M, Zhong Y, Stoica I (2014) Efficient coflow scheduling with Varys In: Proc. ACM SIGCOMM,\u00a0New York 443\u2013454.","DOI":"10.1145\/2740070.2626315"},{"key":"312_CR21","doi-asserted-by":"crossref","unstructured":"Zhao Y, Chen K, Bai W, et al. (2015) RAPIER: Integrating routing and scheduling for coflow-aware data center networks In: Proc. IEEE INFOCOM,\u00a0Piscataway 424\u2013432.","DOI":"10.1109\/INFOCOM.2015.7218408"},{"key":"312_CR22","doi-asserted-by":"crossref","unstructured":"Li Z, Zhang Y, Li D, Chen K, Peng Y (2016) OPTAS: Decentralized flow monitoring and scheduling for tiny tasks In: Proc. IEEE INFOCOM,\u00a0Piscataway 1\u20139.","DOI":"10.1109\/INFOCOM.2016.7524532"},{"key":"312_CR23","doi-asserted-by":"publisher","unstructured":"Li Z, Shen H (2022) Co-Scheduler: A coflow-aware data-parallel job scheduler in hybrid electrical\/optical datacenter networks. IEEE\/ACM Trans Networking:1\u201314. https:\/\/doi.org\/10.1109\/TNET.2022.3143232.","DOI":"10.1109\/TNET.2022.3143232"},{"key":"312_CR24","doi-asserted-by":"crossref","unstructured":"Chowdhury M, Stoica I (2015) Efficient coflow scheduling without prior knowledge In: Proc. ACM SIGCOMM, 393\u2013406.","DOI":"10.1145\/2829988.2787480"},{"key":"312_CR25","unstructured":"Jajoo A, Gandhi R, Hu YC (2016) Graviton: Twisting space and time to speed-up coflows In: Proc. USENIX HotCloud, 1\u20136."},{"key":"312_CR26","doi-asserted-by":"crossref","unstructured":"Zhang H, Chen L, Yi B, et al. (2016) CODA: Toward automatically identifying and scheduling coflows in the dark In: Proc. 
ACM SIGCOMM, 160\u2013173.","DOI":"10.1145\/2934872.2934880"},{"key":"312_CR27","doi-asserted-by":"crossref","unstructured":"Chen L, Lingys J, Chen K, Liu F (2018) AuTO: Scaling deep reinforcement learning to enable datacenter-scale automatic traffic optimization In: Proc. ACM SIGCOMM, 191\u2013205.","DOI":"10.1145\/3230543.3230551"},{"key":"312_CR28","doi-asserted-by":"crossref","unstructured":"Susanto H, Jin H, Chen K (2016) Stream: Decentralized opportunistic inter-coflow scheduling for datacenter networks In: Proc. IEEE ICNP, 1\u201310.","DOI":"10.1109\/ICNP.2016.7784423"},{"key":"312_CR29","doi-asserted-by":"crossref","unstructured":"Wei Z, Guo S, Liu G, Yang Y (2021) Coflow scheduling with unknown prior information in data center networks In: Proc. IEEE ICC, 1\u20136.","DOI":"10.1109\/ICC42927.2021.9500870"},{"key":"312_CR30","doi-asserted-by":"publisher","unstructured":"Liu L, Gao C, Wang P, Huang H, Li J, Xu H, Zhang W (2021) Bottleneck-aware non-clairvoyant coflow scheduling with Fai. IEEE Trans Cloud Comput:1\u201314. https:\/\/doi.org\/10.1109\/TCC.2021.3128360.","DOI":"10.1109\/TCC.2021.3128360"},{"key":"312_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.comnet.2021.108248","volume":"196","author":"F Zhang","year":"2021","unstructured":"Zhang F, Tang Y, Shan D, Wang H, Hu C (2021) RICH: Strategy-proof and efficient coflow scheduling in non-cooperative environments. J Netw Comput 196:1\u201313.","journal-title":"J Netw Comput"},{"key":"312_CR32","doi-asserted-by":"crossref","unstructured":"Sun P, Guo Z, Wang J, Li J, Lan J, Hu Y (2021) Deepweave: Accelerating job completion time with deep reinforcement learning-based coflow scheduling In: Proc. International Joint Conferences on Artificial Intelligence,\u00a0New York 3314\u20133320.","DOI":"10.24963\/ijcai.2020\/458"},{"key":"312_CR33","doi-asserted-by":"crossref","unstructured":"Yi B, Chen L, Xia J, Chen K (2017) Towards zero copy dataflows using rdma In: Proc. 
ACM SIGCOMM Poster and Demo,\u00a0New York 28\u201330.","DOI":"10.1145\/3123878.3131975"},{"key":"312_CR34","doi-asserted-by":"crossref","unstructured":"Viswanath P, Babu VS (2009) Rough-DBSCAN: A fast hybrid density based clustering method for large data sets. Pattern Recogn Lett, Elsevier 30(16):1477\u20131488.","DOI":"10.1016\/j.patrec.2009.08.008"},{"key":"312_CR35","unstructured":"Xing EP, Ng AY, Jordan MI, Russell SJ (2002) Distance metric learning with application to clustering with side-information In: Proc. ACM NIPS, 521\u2013528."},{"key":"312_CR36","unstructured":"Ester M, Kriegel H, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise In: Proc. ACM KDD,\u00a0New York 226\u2013231."},{"key":"312_CR37","unstructured":"Chen L, Chen K, Zhu J, Yu M, Porter G, Qiao C, Zhong S (2017) Enabling wide-spread communications on optical fabric with MegaSwitch In: Proc. USENIX NSDI,\u00a0New York 577\u2013593."},{"key":"312_CR38","unstructured":"Coflow benchmark based on facebook traces. https:\/\/github.com\/coflow\/coflow-benchmark.\u00a0Accessed May 2017."},{"key":"312_CR39","doi-asserted-by":"crossref","unstructured":"Meisner D, Sadler CM, Barroso LA, Weber W, Wenisch TF (2011) Power management of online data-intensive services In: Proc. International Symposium on Computer Architecture (ISCA),\u00a0New York 319\u2013330.","DOI":"10.1145\/2024723.2000103"},{"key":"312_CR40","unstructured":"Bai W, Chen L, Chen K, Wu H (2016) Enabling ECN in multi-service multi-queue data centers In: Proc. USENIX NSDI,\u00a0New York 537\u2013549."},{"key":"312_CR41","doi-asserted-by":"crossref","unstructured":"Zhang H, Zhang J, Bai W, Chen K, Chowdhury M (2017) Resilient datacenter load balancing in the wild In: Proc. 
ACM SIGCOMM,\u00a0\u00a0New York 253\u2013266.","DOI":"10.1145\/3098822.3098841"},{"key":"312_CR42","doi-asserted-by":"crossref","unstructured":"Zhang J, Bai W, Chen K (2019) Enabling ECN for datacenter networks with RTT variations In: ACM CoNEXT,\u00a0New York 233\u2013245.","DOI":"10.1145\/3359989.3365426"},{"key":"312_CR43","doi-asserted-by":"crossref","unstructured":"Bai W, Chen K, Chen L, Kim C, Wu H (2016) Enabling ECN over generic packet scheduling In: Proc. ACM CoNEXT,\u00a0New York 191\u2013204.","DOI":"10.1145\/2999572.2999575"},{"issue":"2","key":"312_CR44","doi-asserted-by":"publisher","first-page":"489","DOI":"10.1109\/TNET.2020.3032999","volume":"29","author":"W Bai","year":"2020","unstructured":"Bai W, Hu S, Chen K, Tan K, Xiong Y (2020) One more config is enough: saving (DC)TCP for high-speed extremely shallow-buffered datacenters. IEEE\/ACM Trans Networking 29(2):489\u2013502.","journal-title":"IEEE\/ACM Trans Networking"}],"container-title":["Journal of Cloud Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-022-00312-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13677-022-00312-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-022-00312-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,1]],"date-time":"2022-09-01T12:29:39Z","timestamp":1662035379000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofcloudcomputing.springeropen.com\/articles\/10.1186\/s13677-022-00312-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,1]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"al
ternative-id":["312"],"URL":"https:\/\/doi.org\/10.1186\/s13677-022-00312-7","relation":{},"ISSN":["2192-113X"],"issn-type":[{"value":"2192-113X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,1]]},"assertion":[{"value":"24 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 August 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 September 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"36"}}