{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:33:39Z","timestamp":1750307619343,"version":"3.41.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2009,1,1]],"date-time":"2009-01-01T00:00:00Z","timestamp":1230768000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2009,1]]},"abstract":"<jats:p>Preemptive multitasking is widely used in many low-cost and real-time embedded applications for its superior hardware utilization. The frequent and asynchronous context switches, however, require the preservation and restoration of the task state, thus resulting in a large number of memory transfer instructions. As a consequence, task responsiveness and application throughput can be significantly deteriorated. To address this problem we propose a cross-layer customization framework which through the close cooperation of compiler, OS, and hardware architecture achieves rapid and low-cost task switch. Application information extracted during compile-time regarding state liveness is exploited in order to preserve a minimal amount of task state on task preemption. We introduce two complementary techniques to implement the application-aware state preservation. The first technique utilizes compiler-generated custom routines which preserve\/restore an extremely small live context at judiciously selected points in the application code. The second technique requires more sophisticated hardware support. It employs an OS-controlled register file mapping to achieve a rapid context switch. 
By mapping a small fraction of the register file in a single clock cycle, a context switch is achieved requiring no memory transfers for the majority of cases to preserve\/restore the live state. The effect of aggressively replicated register files, where each task is given its own replica, is achieved with the hardware cost of only adding from 25% to 50% extra physical registers. Through the utilization of these novel mechanisms, a significant improvement on task response time is achieved as the context-switch cost is minimized.<\/jats:p>","DOI":"10.1145\/1457255.1457261","type":"journal-article","created":{"date-parts":[[2009,2,10]],"date-time":"2009-02-10T16:42:19Z","timestamp":1234284139000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Cross-layer customization for rapid and low-cost task preemption in multitasked embedded systems"],"prefix":"10.1145","volume":"8","author":[{"given":"Xiangrong","family":"Zhou","sequence":"first","affiliation":[{"name":"ECE, University of Maryland, College Park"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Petrov","sequence":"additional","affiliation":[{"name":"ECE, University of Maryland, College Park"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2009,2,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/6448"},{"volume-title":"Proceedings of the 12th Euromicro Workshop on Parallel, Distributed and Network-Based Processing (PDP'04)","author":"Albrecht C.","key":"e_1_2_1_2_1","unstructured":"Albrecht, C., Hagenau, R., and Doring, A. 2004. Cooperative software multithreading to enhance utilization of embedded processors for network applications. In Proceedings of the 12th Euromicro Workshop on Parallel, Distributed and Network-Based Processing (PDP'04), IEEE, Los Alamitos, CA, 300--307."},{"volume-title":"ARM920T technical reference manual","author":"Ltd","key":"e_1_2_1_3_1","unstructured":"ARM Ltd. ARM920T technical reference manual. ARM Ltd."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Baker T. Snyder J. and Whalley D. 1995. Fast context switches: Compiler and architectural support for preemptive scheduling. In Microprocessors and Microsystems 35--42.","DOI":"10.1016\/0141-9331(95)93086-X"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/513829.513855"},{"key":"e_1_2_1_6_1","first-page":"563","article-title":"Mantis os: An embedded multithreaded operating system for wireless micro sensor platforms","volume":"10","author":"Bhatti S.","year":"2005","unstructured":"Bhatti, S., Carlson, J., Dai, H., Deng, J., Rose, J., Sheth, A., Shucker, B., Gruenwald, C., Torgerson, A., and Han, R. 2005. Mantis os: An embedded multithreaded operating system for wireless micro sensor platforms. Mob. Netw. Appl. (Special Issue on Wireless Sensor Networks) 10, 4, 563--579.","journal-title":"Mob. Netw. Appl. 
(Special Issue on Wireless Sensor Networks)"},{"key":"e_1_2_1_7_1","unstructured":"Bovet D. and Cesati M. 2002. Understanding the Linux Kernel 2nd Ed. O'Reilly Sebastopol CA."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/6.402166"},{"volume-title":"Proceedings of the Symposium on Operating System Design and Implementation. 45--58","author":"Chandra A.","key":"e_1_2_1_9_1","unstructured":"Chandra, A., Adler, M., Goyal, P., and Shenoy, P. 2000. Surplus fair scheduling: A proportional-share cpu scheduling algorithm for symmetric multiprocessors. In Proceedings of the Symposium on Operating System Design and Implementation. 45--58."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2005.275"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339682"},{"key":"e_1_2_1_13_1","volume-title":"Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools.","author":"Fisher J.","year":"2005","unstructured":"Fisher, J., Faraboschi, P., and Young, C. 2005. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Morgan Kaufman, New York, NY."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1128020.1128563"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.620482"},{"key":"e_1_2_1_16_1","unstructured":"Hill J. and Culler D. 2001. A wireless embedded sensor architecture for system-level optimization. Tech. rep. 
University of California Berkeley."},{"key":"e_1_2_1_17_1","unstructured":"Hinton G. Sager D. Upton M. Boggs D. Carmean D. Kyker A. and Roussel P. 2001. The microarchitecture of the pentium 4 processor. Intel Tech. J."},{"volume-title":"Intel XScale Microarchitecture","author":"Intel Corporation","key":"e_1_2_1_18_1","unstructured":"Intel Corporation. Intel XScale Microarchitecture. Intel Corporation."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.755465"},{"volume-title":"Proceedings of the 30th Annual International Symposium on Microarchitecture (MICRO'30)","author":"Lee C.","key":"e_1_2_1_20_1","unstructured":"Lee, C., Potkonjak, M., and Mangione-Smith, W. H. 1997. Mediabench: A tool for evaluating and synthesizing multimedia and communications systems. In Proceedings of the 30th Annual International Symposium on Microarchitecture (MICRO'30). 
IEEE, Los Alamitos, CA, 330--335."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MDM.2006.151"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.931894"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/762483.762484"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2005.21"},{"volume-title":"Proceedings of the 9th International Symposium on High-Performance Computer Architecture (HPCA'03)","author":"Redstone J.","key":"e_1_2_1_25_1","unstructured":"Redstone, J., Eggers, S., and Levy, H. 2003. Mini-threads: Increasing tlp on small-scale smt processors. In Proceedings of the 9th International Symposium on High-Performance Computer Architecture (HPCA'03), IEEE, Los Alamitos, CA, 19--30."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/619003.620354"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2003.1261391"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1086297.1086335"},{"key":"e_1_2_1_29_1","unstructured":"WINDRIVER. 
VxWorks http:\/\/www.windriver.com."}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1457255.1457261","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1457255.1457261","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:45:48Z","timestamp":1750250748000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1457255.1457261"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2009,1]]}},"alternative-id":["10.1145\/1457255.1457261"],"URL":"https:\/\/doi.org\/10.1145\/1457255.1457261","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2009,1]]},"assertion":[{"value":"2007-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2008-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-02-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}