{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,16]],"date-time":"2026-05-16T04:06:44Z","timestamp":1778904404126,"version":"3.51.4"},"reference-count":56,"publisher":"Wiley","issue":"6","license":[{"start":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:00:00Z","timestamp":1775001600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"},{"start":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:00:00Z","timestamp":1775001600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Softw Pract Exp"],"published-print":{"date-parts":[[2026,6]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>Modern stream processing engines are increasingly deployed on high\u2010core\u2010count servers with Non\u2010Uniform Memory Access (NUMA) architectures, where the cost of inter\u2010socket memory access poses a significant challenge to achieving low latency and high throughput. Existing approaches to operator placement either rely on static assignments that degrade under workload variations or employ dynamic migrations that incur excessive overhead due to blocking synchronization or global barriers.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>This paper introduces a lock\u2010free, NUMA\u2010aware operator rebinding mechanism that dynamically reallocates operator tasks across threads with minimal disruption. The mechanism uses an autonomic controller to detect imbalance in per\u2010thread queues and enacts rebinding via control messages and atomic updates, ensuring correctness without stalling execution. A two\u2010level policy is proposed, combining NUMA\u2010level partitioning with intra\u2010node thread\u2010level refinements, triggered by latency thresholds.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Extensive experiments using a 300\u2010query urban traffic analytics workload demonstrate that the proposed method achieves non\u2010negligible throughput improvement and reduces latency compared to state\u2010of\u2010the\u2010art static and METIS\u2010based approaches. Furthermore, it reduces latency variance by an order of magnitude, illustrating the importance of fine\u2010grained NUMA\u2010aware scheduling in memory\u2010bound stream processing.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1002\/spe.70064","type":"journal-article","created":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T07:10:25Z","timestamp":1775027425000},"page":"687-708","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Operator Rebinding for Stream Processing on\n                    <scp>NUMA<\/scp>\n                    Machines"],"prefix":"10.1002","volume":"56","author":[{"given":"Xiaorui","family":"Du","sequence":"first","affiliation":[{"name":"Technical University of Munich  Munich Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrea","family":"Piccione","sequence":"additional","affiliation":[{"name":"Huawei Munich Research Center  Munich Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adriano","family":"Pimpini","sequence":"additional","affiliation":[{"name":"DICII, University of Rome \u201cTor Vergata\u201d  Rome Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stefano","family":"Bortoli","sequence":"additional","affiliation":[{"name":"Huawei Munich Research Center  Munich Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0179-9868","authenticated-orcid":false,"given":"Alessandro","family":"Pellegrini","sequence":"additional","affiliation":[{"name":"DICII, University of Rome \u201cTor Vergata\u201d  Rome Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alois","family":"Knoll","sequence":"additional","affiliation":[{"name":"Technical University of Munich  Munich Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2026,4,1]]},"reference":[{"key":"e_1_2_11_2_1","volume-title":"What Every Programmer Should Know About Memory","author":"Ulrich D.","year":"2007"},{"key":"e_1_2_11_3_1","first-page":"3","volume-title":"In Memory Data Management and Analysis. Lecture Notes in Computer Science","author":"Lang H.","year":"2015"},{"key":"e_1_2_11_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2313874"},{"key":"e_1_2_11_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2004.1303026"},{"key":"e_1_2_11_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465282"},{"key":"e_1_2_11_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2013.295"},{"key":"e_1_2_11_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062366"},{"key":"e_1_2_11_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid59990.2024.00022"},{"key":"e_1_2_11_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/DS-RT62209.2024.00019"},{"key":"e_1_2_11_11_1","first-page":"1","article-title":"To Migrate or Not to Migrate: An Analysis of Operator Migration in Distributed Stream Processing","volume":"26","author":"Espen V.","year":"2023","journal-title":"IEEE Communications Surveys and Tutorials"},{"key":"e_1_2_11_12_1","volume-title":"Proceedings of the 1998 IEEE\/ACM High Performance Networking and Computing Conference. SC'98","author":"George K.","year":"1998"},{"issue":"2","key":"e_1_2_11_13_1","first-page":"1379","article-title":"Towards a Streaming SQL Standard","volume":"1","author":"Namit J.","year":"2008","journal-title":"Proceedings of the VLDB Endowment International Conference on Very Large Data Bases"},{"key":"e_1_2_11_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13222-022-00415-0"},{"key":"e_1_2_11_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2014.2311811"},{"key":"e_1_2_11_16_1","first-page":"206","volume-title":"Lecture Notes in Computer Science. Lecture Notes in Computer Science","author":"Jonatan L.","year":"2013"},{"key":"e_1_2_11_17_1","first-page":"46","volume-title":"Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques. SIMUTOOLS","author":"Romolo M.","year":"2016"},{"issue":"1","key":"e_1_2_11_18_1","first-page":"1","article-title":"A Conflict\u2010Resilient Lock\u2010Free Linearizable Calendar Queue","volume":"11","author":"Romolo M.","year":"2024","journal-title":"ACM Transactions on Parallel Computing"},{"key":"e_1_2_11_19_1","volume-title":"IEEE INFOCOM 2016\u2014The 35th Annual IEEE International Conference on Computer Communications","author":"Raphael E.","year":"2016"},{"key":"e_1_2_11_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3092819.3092823"},{"issue":"13","key":"e_1_2_11_21_1","first-page":"1329","article-title":"Database System Support of Simulation Data","volume":"9","author":"Hermano L.","year":"2016","journal-title":"Proceedings of the VLDB Endowment International Conference on Very Large Data Bases"},{"key":"e_1_2_11_22_1","volume-title":"Proceedings of the International Workshop on Real\u2010Time Business Intelligence and Analytics","author":"Daniele F.","year":"2018"},{"key":"e_1_2_11_23_1","first-page":"1","article-title":"Enactment of Adaptation in Data Stream Processing With Latency Implications\u2014A Systematic Literature Review","volume":"111","author":"Cui Q.","year":"2019","journal-title":"Information and Software Technology"},{"key":"e_1_2_11_24_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.6759"},{"key":"e_1_2_11_25_1","volume-title":"Proceedings of the 7th ACM International Conference on Distributed Event\u2010Based Systems","author":"Boris K.","year":"2013"},{"issue":"4","key":"e_1_2_11_26_1","first-page":"28","article-title":"Apache Flink: Stream and Batch Processing in a Single Engine","volume":"38","author":"Paris C.","year":"2015","journal-title":"Bulletin of the Technical Committee on Data Engineering"},{"key":"e_1_2_11_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007617"},{"key":"e_1_2_11_28_1","volume-title":"Proceedings of the 16th ACM International Conference on Distributed and Event\u2010Based Systems","author":"Espen V.","year":"2022"},{"key":"e_1_2_11_29_1","volume-title":"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data","author":"Bonaventura D. M.","year":"2020"},{"key":"e_1_2_11_30_1","first-page":"1","volume-title":"Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. PPoPP '16","author":"Tiziano D. M.","year":"2016"},{"key":"e_1_2_11_31_1","volume-title":"Proceedings of the 24th ACM International on Conference on Information and Knowledge Management","author":"Skat M. K. G.","year":"2015"},{"key":"e_1_2_11_32_1","first-page":"539","volume-title":"2022 USENIX Annual Technical Conference (USENIX ATC 22)","author":"Rong G.","year":"2022"},{"key":"e_1_2_11_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIC.2008.129"},{"key":"e_1_2_11_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2017.2723403"},{"key":"e_1_2_11_35_1","volume-title":"Proceedings 19th International Conference on Data Engineering. DE '04","author":"Shah M. A.","year":"2004"},{"key":"e_1_2_11_36_1","volume-title":"Proceedings of the 7th ACM International Conference on Distributed Event\u2010Based Systems","author":"Beate O.","year":"2013"},{"key":"e_1_2_11_37_1","volume-title":"Proceedings of the 12th ACM International Conference on Distributed and Event\u2010Based Systems","author":"Manisha L.","year":"2018"},{"key":"e_1_2_11_38_1","volume-title":"Proceedings of the 21st International Middleware Conference","author":"Albert J.","year":"2020"},{"key":"e_1_2_11_39_1","first-page":"783","volume-title":"Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation. OSDI '18","author":"Vasiliki K.","year":"2018"},{"key":"e_1_2_11_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.105"},{"key":"e_1_2_11_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-013-0335-9"},{"key":"e_1_2_11_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2016.0023"},{"key":"e_1_2_11_43_1","volume-title":"Proceedings of the 8th ACM International Conference on Distributed Event\u2010Based Systems","author":"Heinze T.","year":"2014"},{"key":"e_1_2_11_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCN.2010.5560127"},{"key":"e_1_2_11_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3361525.3361551"},{"key":"e_1_2_11_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2017.2762683"},{"key":"e_1_2_11_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2016.08.037"},{"key":"e_1_2_11_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2018.05.025"},{"key":"e_1_2_11_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2012.09.013"},{"key":"e_1_2_11_50_1","first-page":"26","volume-title":"Proceedings of the 5th GI\/ITG KuVS Fachgespr\u00e4ch Inter\u2010Vehicle Communication. Technical Reports, vol. CS\u20102017\u201003","author":"Zehe D.","year":"2017"},{"key":"e_1_2_11_51_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-48311-X_42"},{"key":"e_1_2_11_52_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.665"},{"key":"e_1_2_11_53_1","volume-title":"METIS: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill\u2010Reducing Orderings of Sparse Matrices. 97\u2013061","author":"George K.","year":"1997"},{"key":"e_1_2_11_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/WSC.2014.7020180"},{"key":"e_1_2_11_55_1","volume-title":"In: 2023 Winter Simulation Conference (WSC)","author":"Xiaorui D.","year":"2023"},{"key":"e_1_2_11_56_1","first-page":"181","volume-title":"Proceedings of the 2022 SUMO User Conference","author":"Zhuoxiao M.","year":"2022"},{"key":"e_1_2_11_57_1","volume-title":"Intel Threading Building Blocks: Outfitting C++ for Multi\u2010Core Processor Parallelism","author":"Reinders J.","year":"2007"}],"container-title":["Software: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.70064","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/spe.70064","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.70064","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,16]],"date-time":"2026-05-16T03:08:12Z","timestamp":1778900892000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/spe.70064"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,1]]},"references-count":56,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2026,6]]}},"alternative-id":["10.1002\/spe.70064"],"URL":"https:\/\/doi.org\/10.1002\/spe.70064","archive":["Portico"],"relation":{},"ISSN":["0038-0644","1097-024X"],"issn-type":[{"value":"0038-0644","type":"print"},{"value":"1097-024X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,1]]},"assertion":[{"value":"2025-07-28","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-03-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-04-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}