{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T22:58:59Z","timestamp":1762210739037,"version":"3.41.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2012,1,9]],"date-time":"2012-01-09T00:00:00Z","timestamp":1326067200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGCOMM Comput. Commun. Rev."],"published-print":{"date-parts":[[2013,1,9]]},"abstract":"<jats:p>Internet traffic measurement and analysis has long been used to characterize network usage and user behaviors, but faces the problem of scalability under the explosive growth of Internet traffic and high-speed access. Scalable Internet traffic measurement and analysis is difficult because a large data set requires matching computing and storage resources. Hadoop, an open-source computing platform of MapReduce and a distributed file system, has become a popular infrastructure for massive data analytics because it facilitates scalable data processing and storage services on a distributed computing system consisting of commodity hardware. In this paper, we present a Hadoop-based traffic monitoring system that performs IP, TCP, HTTP, and NetFlow analysis of multi-terabytes of Internet traffic in a scalable manner. From experiments with a 200-node testbed, we achieved 14 Gbps throughput for 5 TB files with IP and HTTP-layer analysis MapReduce jobs. 
We also explain the performance issues related to traffic analysis MapReduce jobs.<\/jats:p>","DOI":"10.1145\/2427036.2427038","type":"journal-article","created":{"date-parts":[[2013,1,11]],"date-time":"2013-01-11T15:42:48Z","timestamp":1357918968000},"page":"5-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":147,"title":["Toward scalable internet traffic measurement and analysis with Hadoop"],"prefix":"10.1145","volume":"43","author":[{"given":"Yeonhee","family":"Lee","sequence":"first","affiliation":[{"name":"Dept. of Computer Engineering, Chungnam National University, Daejon, South Korea"}]},{"given":"Youngseok","family":"Lee","sequence":"additional","affiliation":[{"name":"Dept. of Computer Engineering, Chungnam National University, Daejon, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2012,1,9]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"2016","author":"Paper Cisco White","year":"2011","journal-title":"Cisco Visual Networking Index: Forecast and Methodology"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1544012.1544024"},{"volume-title":"USENIX OSDI","year":"2004","author":"Dean J.","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945450"},{"key":"e_1_2_1_5_1","unstructured":"Hadoop http:\/\/hadoop.apache.org\/."},{"key":"e_1_2_1_6_1","unstructured":"T. White Hadoop: the Definitive Guide O'Reilly 3rd ed. 2012"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"e_1_2_1_8_1","unstructured":"Tcpdump http:\/\/www.tcpdump.org."},{"key":"e_1_2_1_9_1","unstructured":"Wireshark http:\/\/www.wireshark.org."},
{"key":"e_1_2_1_10_1","unstructured":"CAIDA CoralReef Software Suite http:\/\/www.caida.org\/tools\/measurement\/coralreef."},{"key":"e_1_2_1_11_1","unstructured":"M. Roesch Snort - Lightweight Intrusion Detection for Networks USENIX LISA 1999."},{"key":"e_1_2_1_12_1","unstructured":"Bro http:\/\/www.bro-ids.org."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/1776434.1776443"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-13315-2_24"},{"key":"e_1_2_1_15_1","unstructured":"Cisco NetFlow http:\/\/www.cisco.com\/web\/go\/netflow."},{"volume-title":"USENIX LISA","year":"2000","author":"Fullmer M.","key":"e_1_2_1_16_1"},{"volume-title":"USENIX Conference on System Administration","year":"2000","author":"Plonka D.","key":"e_1_2_1_17_1"},{"key":"e_1_2_1_18_1","unstructured":"QoSient LLC argus: network audit record generation and utilization system http:\/\/www.qosient.com\/argus\/."},{"key":"e_1_2_1_19_1","unstructured":"Arbor Networks http:\/\/www.arbornetworks.com."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/1986282.1986289"},{"key":"e_1_2_1_21_1","unstructured":"RIPE Hadoop PCAP https:\/\/labs.ripe.net\/Members\/wnagele\/large-scale-pcap-data-analysis-using-apache-hadoop Nov. 2011."},
{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1879141.1879169"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2079327.2079334"},{"key":"e_1_2_1_24_1","unstructured":"CNU Project on traffic analysis in Hadoop https:\/\/sites.google.com\/a\/networks.cnu.ac.kr\/dnlab\/research\/hadoop."}],"container-title":["ACM SIGCOMM Computer Communication Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2427036.2427038","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2427036.2427038","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:49:00Z","timestamp":1750236540000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2427036.2427038"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,1,9]]},"references-count":24,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1,9]]}},"alternative-id":["10.1145\/2427036.2427038"],"URL":"https:\/\/doi.org\/10.1145\/2427036.2427038","relation":{},"ISSN":["0146-4833"],"issn-type":[{"type":"print","value":"0146-4833"}],"subject":[],"published":{"date-parts":[[2012,1,9]]},"assertion":[{"value":"2012-01-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}