{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T05:44:03Z","timestamp":1761975843390},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T00:00:00Z","timestamp":1600041600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T00:00:00Z","timestamp":1600041600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cloud Comp"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Real-time data streaming fetches live sensory segments of the dataset in the heterogeneous distributed computing environment. This process assembles data chunks at a rapid encapsulation rate through a streaming technique that bundles sensor segments into multiple micro-batches and extracts into a repository, respectively. Recently, the acquisition process is enhanced with an additional feature of exchanging IoT devices\u2019 dataset comprised of two components: (i) sensory data and (ii) metadata. The body of sensory data includes record information, and the metadata part consists of logs, heterogeneous events, and routing path tables to transmit micro-batch streams into the repository. Real-time acquisition procedure uses the Directed Acyclic Graph (DAG) to extract live query outcomes from in-place micro-batches through MapReduce stages and returns a result set. However, few bottlenecks affect the performance during the execution process, such as (i) homogeneous micro-batches formation only, (ii) complexity of dataset diversification, (iii) heterogeneous data tuples processing, and (iv) linear DAG workflow only. As a result, it produces huge processing latency and the additional cost of extracting event-enabled IoT datasets. Thus, the Spark cluster that processes Resilient Distributed Dataset (RDD) in a fast-pace using Random access memory (RAM) defies expected robustness in processing IoT streams in the distributed computing environment. This paper presents an IoT-enabled Directed Acyclic Graph (I-DAG) technique that labels micro-batches at the stage of building a stream event and arranges stream elements with event labels. In the next step, heterogeneous stream events are processed through the I-DAG workflow, which has non-linear DAG operation for extracting queries\u2019 results in a Spark cluster. The performance evaluation shows that I-DAG resolves homogeneous IoT-enabled stream event issues and provides an effective stream event heterogeneous solution for IoT-enabled datasets in spark clusters.<\/jats:p>","DOI":"10.1186\/s13677-020-00195-6","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T10:02:59Z","timestamp":1600077779000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["IoT-enabled directed acyclic graph in spark cluster"],"prefix":"10.1186","volume":"9","author":[{"given":"Jahwan","family":"Koo","sequence":"first","affiliation":[]},{"given":"Nawab Muhammad","family":"Faseeh Qureshi","sequence":"additional","affiliation":[]},{"given":"Isma Farah","family":"Siddiqui","sequence":"additional","affiliation":[]},{"given":"Asad","family":"Abbas","sequence":"additional","affiliation":[]},{"given":"Ali Kashif","family":"Bashir","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,14]]},"reference":[{"issue":"2","key":"195_CR1","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1145\/1083784.1083789","volume":"34","author":"M Gaber","year":"2005","unstructured":"Gaber M, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Rec 34(2):18\u201326.","journal-title":"ACM Sigmod Rec"},{"issue":"5","key":"195_CR2","first-page":"402","volume":"78","author":"PJ Denning","year":"1990","unstructured":"Denning PJ (1990) The science of computing: Saving all the bits. American Sci 78(5):402\u2013405.","journal-title":"American Sci"},{"issue":"2","key":"195_CR3","doi-asserted-by":"publisher","first-page":"432","DOI":"10.1109\/TBC.2018.2822869","volume":"64","author":"M Vega","year":"2018","unstructured":"Vega M, Perra C, De Turck F, Liotta A (2018) A review of predictive quality of experience management in video streaming services. IEEE Trans Broadcast 64(2):432\u2013445.","journal-title":"IEEE Trans Broadcast"},{"key":"195_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jnca.2017.12.001","volume":"103","author":"M de Assuncao","year":"2018","unstructured":"de Assuncao M, da Silva Veith A, Buyya R (2018) Distributed data stream processing and edge computing: A survey on resource elasticity and future directions. J Netw Comput Appl 103:1\u201317.","journal-title":"J Netw Comput Appl"},{"issue":"1","key":"195_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2674026.2674028","volume":"16","author":"G Krempl","year":"2014","unstructured":"Krempl G, \u017eliobaite I, Brzezi\u0144ski D, H\u00fcllermeier E, Last M, Lemaire V, Noack T, Shaker A, Sievi S, Spiliopoulou M, et al. (2014) Open challenges for data stream mining research. ACM SIGKDD explor newsl 16(1):1\u201310.","journal-title":"ACM SIGKDD explor newsl"},{"issue":"1","key":"195_CR6","first-page":"97","volume":"26","author":"X Wu","year":"2013","unstructured":"Wu X, Zhu X, Wu G-Q, Ding W (2013) Data mining with big data. IEEE Trans Knowl Data Eng 26(1):97\u2013107.","journal-title":"IEEE Trans Knowl Data Eng"},{"unstructured":"Streaming SQL Analytics for Kafka & Kinesis.https:\/\/sqlstream.com. Accessed 11 Dec 2019.","key":"195_CR7"},{"unstructured":"Software Inc. TGlobal Leader in Integration and Analytics Software. https:\/\/www.tibco.com\/. Accessed 11 Dec 2019.","key":"195_CR8"},{"unstructured":"Inc. IComputer hardware company. http:\/\/www.ibm.com. Accessed 11 Dec 2019.","key":"195_CR9"},{"unstructured":"striimstream with two i\u2019s for integration and intelligence. https:\/\/www.striim.com\/. Accessed 11 Dec 2019.","key":"195_CR10"},{"unstructured":"Apache OrgWelcome to The Apache Software Foundation!. https:\/\/www.apache.org\/. Accessed 11 Dec 2019.","key":"195_CR11"},{"key":"195_CR12","volume-title":"Apache Flume: Distributed Log Collection for Hadoop","author":"S Hoffman","year":"2013","unstructured":"Hoffman S (2013) Apache Flume: Distributed Log Collection for Hadoop. Packt Publishing Ltd, USA."},{"issue":"11","key":"195_CR13","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1145\/2934664","volume":"59","author":"M Zaharia","year":"2016","unstructured":"Zaharia M, Xin R, Wendell P, Das T, Armbrust M, Dave A, Meng X, Rosen J, Venkataraman S, Franklin M, et al. (2016) Apache spark: a unified engine for big data processing. Commun ACM 59(11):56\u201365.","journal-title":"Commun ACM"},{"key":"195_CR14","volume-title":"Learning Storm","author":"A Jain","year":"2014","unstructured":"Jain A, Nalya A (2014) Learning Storm. Packt Publishing Ltd, USA."},{"unstructured":"Apache nifiWelcome to The Apache Nifi. http:\/\/nifi.apache.org. Accessed 11 Dec 2019.","key":"195_CR15"},{"unstructured":"Apache ApexWelcome to The Apache Apex.","key":"195_CR16"},{"unstructured":"Apache KafkaWelcome to The Apache Kafka. http:\/\/kafka.apache.org. Accessed 11 Dec 2019.","key":"195_CR17"},{"unstructured":"Apache SamzaWelcome to The Apache Samza. samza.apache.org. Accessed 11 Dec 2019.","key":"195_CR18"},{"unstructured":"Apache FlinkWelcome to The Apache Flink. http:\/\/flink.apache.org. Accessed 11 Dec 2019.","key":"195_CR19"},{"unstructured":"Apache BeamWelcome to The Apache Beam. http:\/\/beam.apache.org. Accessed 11 Dec 2019.","key":"195_CR20"},{"unstructured":"Apache IgniteWelcome to The Apache Ignite. http:\/\/ignite.apache.org. Accessed 11 Dec 2019.","key":"195_CR21"},{"key":"195_CR22","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1109\/ICDEW.2010.5452751","volume-title":"2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)","author":"N Tatbul","year":"2010","unstructured":"Tatbul N (2010) Streaming data integration: Challenges and opportunities In: 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010), 155\u2013158.. IEEE, USA."},{"key":"195_CR23","doi-asserted-by":"publisher","first-page":"414","DOI":"10.1007\/978-3-540-74469-6_41","volume-title":"International Conference on Database and Expert Systems Applications","author":"Y Watanabe","year":"2007","unstructured":"Watanabe Y, Yamada S, Kitagawa H, Amagasa T (2007) Integrating a stream processing engine and databases for persistent streaming data management In: International Conference on Database and Expert Systems Applications, 414\u2013423.. Springer, USA."},{"issue":"15","key":"195_CR24","doi-asserted-by":"publisher","first-page":"2787","DOI":"10.1016\/j.comnet.2010.05.010","volume":"54","author":"L Atzori","year":"2010","unstructured":"Atzori L, Iera A, Morabito G (2010) The internet of things: A survey. Comput Netw 54(15):2787\u20132805.","journal-title":"Comput Netw"},{"key":"195_CR25","doi-asserted-by":"publisher","first-page":"3185","DOI":"10.1109\/ICC.2014.6883811","volume-title":"2014 IEEE International Conference on Communications (ICC)","author":"S Vural","year":"2014","unstructured":"Vural S, Navaratnam P, Wang N, Wang C, Dong L, Tafazolli R (2014) In-network caching of internet-of-things data In: 2014 IEEE International Conference on Communications (ICC), 3185\u20133190.. IEEE, USA."},{"issue":"4","key":"195_CR26","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1145\/1107499.1107504","volume":"34","author":"M Stonebraker","year":"2005","unstructured":"Stonebraker M, \u00c7etintemel U, Zdonik S (2005) The 8 requirements of real-time stream processing. ACM Sigmod Rec 34(4):42\u201347.","journal-title":"ACM Sigmod Rec"},{"doi-asserted-by":"crossref","unstructured":"Gaur P, Tahiliani M (2015) Operating systems for iot devices: A critical survey In: 2015 IEEE Region 10 Symposium, 33\u201336. IEEE.","key":"195_CR27","DOI":"10.1109\/TENSYMP.2015.17"},{"issue":"2","key":"195_CR28","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1109\/JSEN.2015.2483499","volume":"16","author":"Y-S Kang","year":"2015","unstructured":"Kang Y-S, Park I-H, Rhee J, Lee Y-H (2015) Mongodb-based repository design for iot-generated rfid\/sensor big data. IEEE Sensors J 16(2):485\u2013497.","journal-title":"IEEE Sensors J"},{"issue":"1","key":"195_CR29","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1109\/MCC.2014.22","volume":"1","author":"R Ranjan","year":"2014","unstructured":"Ranjan R (2014) Streaming big data processing in datacenter clouds. IEEE Cloud Comput 1(1):78\u201383.","journal-title":"IEEE Cloud Comput"},{"key":"195_CR30","first-page":"1","volume":"2","author":"S Kamburugamuve","year":"2013","unstructured":"Kamburugamuve S, Fox G, Leake D, Qiu J (2013) Survey of distributed stream processing for large stream sources. Grids Ucs Indiana Edu 2:1\u201316.","journal-title":"Grids Ucs Indiana Edu"},{"key":"195_CR31","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1007\/1-4020-2258-1_11","volume-title":"Web Content Caching and Distribution","author":"J Lemon","year":"2004","unstructured":"Lemon J, Wang Z, Yang Z, Cao P (2004) Stream engine: A new kernel interface for high-performance internet streaming servers In: Web Content Caching and Distribution, 159\u2013170.. Springer, USA."},{"key":"195_CR32","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1145\/1851476.1851583","volume-title":"Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing","author":"C Liew","year":"2010","unstructured":"Liew C, Atkinson M, van Hemert J, Han L (2010) Towards optimising distributed data streaming graphs using parallel streams In: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 725\u2013736.. ACM, USA."},{"key":"195_CR33","doi-asserted-by":"publisher","first-page":"158","DOI":"10.1145\/3312614.3312647","volume-title":"Proceedings of the International Conference on Omni-Layer Intelligent Systems","author":"A Saad","year":"2019","unstructured":"Saad A, Park S (2019) Decentralized directed acyclic graph based dlt network In: Proceedings of the International Conference on Omni-Layer Intelligent Systems, 158\u2013163.. ACM, USA."},{"key":"195_CR34","first-page":"1","volume-title":"2018 IEEE Global Communications Conference (GLOBECOM)","author":"N Qureshi","year":"2018","unstructured":"Qureshi N, Bashir A, Siddiqui I, Abbas A, Choi K, Shin D (2018) A knowledge-based path optimization technique for cognitive nodes in smart grid In: 2018 IEEE Global Communications Conference (GLOBECOM), 1\u20136.. IEEE, USA."},{"issue":"4","key":"195_CR35","doi-asserted-by":"publisher","first-page":"2225","DOI":"10.1007\/s11277-018-5936-6","volume":"106","author":"N Qureshi","year":"2019","unstructured":"Qureshi N, Siddiqui I, Unar M, Uqaili M, Nam C, Shin D, Kim J, Bashir A, Abbas A (2019) An aggregate mapreduce data block placement strategy for wireless iot edge nodes in smart grid. Wirel Pers Commun 106(4):2225\u20132236.","journal-title":"Wirel Pers Commun"},{"issue":"4","key":"195_CR36","doi-asserted-by":"publisher","first-page":"1969","DOI":"10.1007\/s11277-018-5739-9","volume":"106","author":"I Siddiqui","year":"2019","unstructured":"Siddiqui I, Qureshi N, Shaikh M, Chowdhry B, Abbas A, Bashir A, Lee S-J (2019) Stuck-at fault analytics of iot devices using knowledge-based data processing strategy in smart grid. Wirel Pers Commun 106(4):1969\u20131983.","journal-title":"Wirel Pers Commun"},{"key":"195_CR37","doi-asserted-by":"publisher","first-page":"218","DOI":"10.1016\/j.ins.2019.06.032","volume":"502","author":"F Zhu","year":"2019","unstructured":"Zhu F, Wu W, Zhang Y, Chen X (2019) Privacy-preserving authentication for general directed graphs in industrial IoT. Inf Sci 502:218\u2013228.","journal-title":"Inf Sci"},{"issue":"30","key":"195_CR38","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1016\/j.ifacol.2018.11.213","volume":"51","author":"I Kotilevets","year":"2018","unstructured":"Kotilevets I, Ivanova I, Romanov I, Magomedov S, Nikonov V, Pavelev S (2018) Implementation of directed acyclic graph in blockchain network to improve security and speed of transactions. IFAC-PapersOnLine 51(30):693\u2013696.","journal-title":"IFAC-PapersOnLine"},{"key":"195_CR39","doi-asserted-by":"publisher","first-page":"2189","DOI":"10.1109\/TNNLS.2019.2929068","volume":"31","author":"C Deng","year":"2020","unstructured":"Deng C, Yang E, Liu T, Tao D (2020) Two-Stream Deep Hashing With Class-Specific Centers for Supervised Image Search. IEEE Trans Neural Netw Learn Syst 31:2189\u20132201. https:\/\/doi.org\/10.1109\/TNNLS.2019.2929068.","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"issue":"2","key":"195_CR40","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1080\/00031305.1984.10483182","volume":"38","author":"J Saw","year":"1984","unstructured":"Saw J, Yang M, Mo T (1984) Chebyshev inequality with estimated mean and variance. Am Stat 38(2):130\u2013132.","journal-title":"Am Stat"},{"key":"195_CR41","doi-asserted-by":"publisher","first-page":"3324","DOI":"10.1109\/IJCNN.2014.6889806","volume-title":"2014 International Joint Conference on Neural Networks (IJCNN)","author":"P Duda","year":"2014","unstructured":"Duda P, Jaworski M, Pietruczuk L, Rutkowski L (2014) A novel application of hoeffding\u2019s inequality to decision trees construction for data streams In: 2014 International Joint Conference on Neural Networks (IJCNN), 3324\u20133330.. IEEE, USA."},{"key":"195_CR42","doi-asserted-by":"publisher","first-page":"522","DOI":"10.23919\/ICACT.2019.8701970","volume-title":"2019 21st International Conference on Advanced Communication Technology (ICACT)","author":"N Qureshi","year":"2019","unstructured":"Qureshi N, Siddiqui I, Abbas A, Bashir A, Choi K, Kim J, Shin D (2019) Dynamic container-based resource management framework of spark ecosystem In: 2019 21st International Conference on Advanced Communication Technology (ICACT), 522\u2013526.. IEEE, USA."},{"doi-asserted-by":"crossref","unstructured":"Siddiqui IF, Qureshi NMF, Chowdhry BS, Uqaili MA (2020) Pseudo-Cache-Based IoT Small Files Management Framework in HDFS Cluster. Wirel Personal Commun.","key":"195_CR43","DOI":"10.1007\/s11277-020-07312-3"},{"doi-asserted-by":"crossref","unstructured":"Qureshi NMF, Siddiqui IF, Abbas A, Bashir AK, Nam CS, Chowdhry BS, Uqaili MA (2020) Stream-Based Authentication Strategy Using IoT Sensor Data in Multi-homing Sub-aqueous Big Data Network. Wirel Personal Commun.1\u201313.","key":"195_CR44","DOI":"10.1007\/s11277-020-07215-3"},{"issue":"1","key":"195_CR45","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1007\/s11277-019-06264-7","volume":"106","author":"IF Siddiqui","year":"2019","unstructured":"Siddiqui IF, Qureshi NMF, Chowdhry BS, Uqaili MA (2019) Edge-node-aware adaptive data processing framework for smart grid. Wirel Personal Commun 106(1):179\u2013189.","journal-title":"Wirel Personal Commun"},{"key":"195_CR46","first-page":"2","volume":"5","author":"NMF Qureshi","year":"1374","unstructured":"Qureshi NMF, Shin DR, Siddiqui IF, Chowdhry BS (1374) Storage-tag-aware scheduler for hadoop cluster. IEEE Access 5:2\u201313755.","journal-title":"IEEE Access"},{"issue":"9","key":"195_CR47","first-page":"4063","volume":"10","author":"NMF Qureshi","year":"2016","unstructured":"Qureshi NMF, Shin DR (2016) RDP: A storage-tier-aware Robust Data Placement strategy for Hadoop in a Cloud-based Heterogeneous Environment. TIIS 10(9):4063\u20134086.","journal-title":"TIIS"}],"container-title":["Journal of Cloud Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-020-00195-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13677-020-00195-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-020-00195-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T19:22:52Z","timestamp":1723576972000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofcloudcomputing.springeropen.com\/articles\/10.1186\/s13677-020-00195-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,14]]},"references-count":47,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["195"],"URL":"https:\/\/doi.org\/10.1186\/s13677-020-00195-6","relation":{},"ISSN":["2192-113X"],"issn-type":[{"type":"electronic","value":"2192-113X"}],"subject":[],"published":{"date-parts":[[2020,9,14]]},"assertion":[{"value":"21 January 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 August 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"There is not any conflict of interest among the authors related to the content disclosed at the disposal of manuscript and the authors have a mutual consenton all concerned points related to the manuscript.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"50"}}