{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T07:07:16Z","timestamp":1760425636705,"version":"3.41.0"},"publisher-location":"New York, New York, USA","reference-count":31,"publisher":"ACM Press","license":[{"start":{"date-parts":[[2018,1,1]],"date-time":"2018-01-01T00:00:00Z","timestamp":1514764800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003170","name":"Stiftelsen f\u00f6r Kunskaps- och Kompetensutveckling","doi-asserted-by":"publisher","award":["HITS"],"award-info":[{"award-number":["HITS"]}],"id":[{"id":"10.13039\/501100003170","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018]]},"DOI":"10.1145\/3229543.3229548","type":"proceedings-article","created":{"date-parts":[[2018,8,1]],"date-time":"2018-08-01T19:07:07Z","timestamp":1533150427000},"page":"21-27","source":"Crossref","is-referenced-by-count":11,"title":["Efficient Distribution-Derived Features for High-Speed Encrypted Flow Classification"],"prefix":"10.1145","author":[{"given":"Johan","family":"Garcia","sequence":"first","affiliation":[{"name":"Karlstad University, Karlstad, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Topi","family":"Korhonen","sequence":"additional","affiliation":[{"name":"Karlstad University, Karlstad, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","reference":[{"key":"key-10.1145\/3229543.3229548-1","doi-asserted-by":"crossref","unstructured":"John Aitchison. 1986. The statistical analysis of compositional data. Chapman and Hall London.","DOI":"10.1007\/978-94-009-4109-0"},{"key":"key-10.1145\/3229543.3229548-2","doi-asserted-by":"crossref","unstructured":"Lucien Birg&#233; and Yves Rozenholc. 
2006. How many bins should be put in a regular histogram. ESAIM: Probability and Statistics 10 (2006), 24--45.","DOI":"10.1051\/ps:2006001"},{"key":"key-10.1145\/3229543.3229548-3","unstructured":"Anderson Santos da Silva, Cristian Cleder Machado, Rodolfo Vebber Bisol, Lisandro Zambenedetti Granville, and Alberto Schaeffer-Filho. 2015. Identification and selection of flow features for accurate traffic classification in SDN. In Network Computing and Applications (NCA), 2015 IEEE 14th International Symposium on. IEEE, 134--141."},{"key":"key-10.1145\/3229543.3229548-4","doi-asserted-by":"crossref","unstructured":"Tapio Elomaa and Juho Rousu. 1999. General and efficient multisplitting of numerical attributes. Machine learning 36, 3 (1999), 201--244.","DOI":"10.1023\/A:1007674919412"},{"key":"key-10.1145\/3229543.3229548-5","doi-asserted-by":"crossref","unstructured":"Tom Fawcett. 2006. An introduction to ROC analysis. Pattern Recogn. Lett. 27, 8 (jun 2006), 861--874.","DOI":"10.1016\/j.patrec.2005.10.010"},{"key":"key-10.1145\/3229543.3229548-6","unstructured":"U Fayyad and K Irani. 1993. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Artificial Intelligence, Vol. 2. 1022--1027."},{"key":"key-10.1145\/3229543.3229548-7","unstructured":"David Freedman and Persi Diaconis. 1981. On the histogram as a density estimator: L2 theory. Zeitschrift f\u00fcr Wahrscheinlichkeitstheorie und verwandte Gebiete 57, 4 (1981), 453--476."},{"key":"key-10.1145\/3229543.3229548-8","doi-asserted-by":"crossref","unstructured":"Mikel Galar, Alberto Fernandez, Edurne Barrenechea, Humberto Bustince, and Francisco Herrera. 2011. An overview of ensemble methods for binary classifiers in multi-class problems: Experimental study on one-vs-one and one-vs-all schemes. 
Pattern Recognition 44, 8 (2011), 1761--1776.","DOI":"10.1016\/j.patcog.2011.01.017"},{"key":"key-10.1145\/3229543.3229548-9","unstructured":"Johan Garcia and Anna Brunstrom. 2018. Clustering-based Separation of Media Transfers in DPI-classified Cellular Video and VoIP Traffic. In 2018 IEEE Wireless Communications and Networking Conference, (WCNC). IEEE, 1--6."},{"key":"key-10.1145\/3229543.3229548-10","doi-asserted-by":"crossref","unstructured":"Johan Garcia, Topi Korhonen, Ricky Andersson, and Filip Vastlund. 2018. Towards Video Flow Classification at a Million Encrypted Flows Per Second. In 2018 IEEE Advanced Information Networking and Applications (AINA) Conference.","DOI":"10.1109\/AINA.2018.00061"},{"key":"key-10.1145\/3229543.3229548-11","doi-asserted-by":"crossref","unstructured":"Salvador Garcia, Julian Luengo, Jos\u00e9 Antonio S\u00e1ez, Victoria Lopez, and Francisco Herrera. 2013. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning. IEEE Transactions on Knowledge and Data Engineering 25, 4 (2013), 734--750.","DOI":"10.1109\/TKDE.2012.35"},{"key":"key-10.1145\/3229543.3229548-12","doi-asserted-by":"crossref","unstructured":"Santiago Egea G\u00f3mez, Bel\u00e9n Carro Mart\u00ednez, Antonio J S\u00e1nchez-Esguevillas, and Luis Hern\u00e1ndez Callejo. 2017. Ensemble network traffic classification: Algorithm comparison and novel ensemble scheme proposal. Computer Networks 127 (2017), 68--80.","DOI":"10.1016\/j.comnet.2017.07.018"},{"key":"key-10.1145\/3229543.3229548-13","unstructured":"Robert M Gray. 1990. Vector quantization. In Readings in speech recognition. Elsevier, 75--100."},{"key":"key-10.1145\/3229543.3229548-14","unstructured":"Kevin H Knuth. 2006. Optimal data-based binning for histograms. arXiv preprint physics\/0605197 (2006)."},{"key":"key-10.1145\/3229543.3229548-15","unstructured":"Andrey Kolmogorov. 1933. Sulla determinazione empirica di una legge di distribuzione. Inst. Ital. Attuari, Giorn. 
4 (1933), 83--91."},{"key":"key-10.1145\/3229543.3229548-16","unstructured":"Yeon-sup Lim, Hyun-chul Kim, Jiwoong Jeong, Chong-kwon Kim, Ted Taekyoung Kwon, and Yanghee Choi. 2010. Internet traffic classification demystified: on the sources of the discriminative power. In Proceedings of the 6th International Conference (Co-NEXT '10). ACM."},{"key":"key-10.1145\/3229543.3229548-17","doi-asserted-by":"crossref","unstructured":"H. Liu, F. Hussain, C.L. Tan, and M. Dash. 2002. Discretization: An Enabling Technique. Data Mining and Knowledge Discovery 6, 4 (2002), 393--423.","DOI":"10.1023\/A:1016304305535"},{"key":"key-10.1145\/3229543.3229548-18","doi-asserted-by":"crossref","unstructured":"Frank J Massey Jr. 1951. The Kolmogorov-Smirnov test for goodness of fit. Journal of the American statistical Association 46, 253 (1951), 68--78.","DOI":"10.1080\/01621459.1951.10500769"},{"key":"key-10.1145\/3229543.3229548-19","doi-asserted-by":"crossref","unstructured":"Thuy TT Nguyen and Grenville Armitage. 2008. A survey of techniques for internet traffic classification using machine learning. IEEE Communications Surveys & Tutorials 10, 4 (2008), 56--76.","DOI":"10.1109\/SURV.2008.080406"},{"key":"key-10.1145\/3229543.3229548-20","doi-asserted-by":"crossref","unstructured":"Lizhi Peng, Bo Yang, and Yuehui Chen. 2015. Effective packet number for early stage internet traffic identification. Neurocomputing 156 (2015), 252--267.","DOI":"10.1016\/j.neucom.2014.12.053"},{"key":"key-10.1145\/3229543.3229548-21","unstructured":"Lizhi Peng, Bo Yang, Yuehui Chen, and Zhenxiang Chen. 2015. Effectiveness of Statistical Features for Early Stage Internet Traffic Identification. International Journal of Parallel Programming (2015), 1--17."},{"key":"key-10.1145\/3229543.3229548-22","doi-asserted-by":"crossref","unstructured":"Jeffrey D Scargle, Jay P Norris, Brad Jackson, and James Chiang. 2013. Studies in astronomical time series analysis. VI. Bayesian block representations. 
The Astrophysical Journal 764, 2 (2013).","DOI":"10.1088\/0004-637X\/764\/2\/167"},{"key":"key-10.1145\/3229543.3229548-23","unstructured":"David W Scott. 1979. On optimal and data-based histograms. Biometrika 66, 3 (1979), 605--610."},{"key":"key-10.1145\/3229543.3229548-24","unstructured":"Tuncay Soylu, O\u011fuzhan Erdem, Aydin Carus, and Edip S G\u00fcner. 2017. Simple CART based real-time traffic classification engine on FPGAs. In ReConFigurable Computing and FPGAs (ReConFig), 2017 International Conference on. IEEE, 1--8."},{"key":"key-10.1145\/3229543.3229548-25","unstructured":"Dougal J Sutherland. 2016. Scalable, Flexible and Active Learning on Distributions. Ph.D. Dissertation. Carnegie Mellon University Pittsburgh United States."},{"key":"key-10.1145\/3229543.3229548-26","unstructured":"Vincent F Taylor, Riccardo Spolaor, Mauro Conti, and Ivan Martinovic. 2016. Appscanner: Automatic fingerprinting of smartphone apps from encrypted network traffic. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on. 439--454."},{"key":"key-10.1145\/3229543.3229548-27","doi-asserted-by":"crossref","unstructured":"Petr Velan, Milan \u010cerm\u00e1k, Pavel \u010celeda, and Martin Dra\u0161ar. 2015. A survey of methods for encrypted traffic classification and analysis. International Journal of Network Management 25, 5 (2015), 355--374.","DOI":"10.1002\/nem.1901"},{"key":"key-10.1145\/3229543.3229548-28","doi-asserted-by":"crossref","unstructured":"Yu Wang, Yang Xiang, Jun Zhang, and Shunzheng Yu. 2012. Internet traffic clustering with constraints. In Wireless Communications and Mobile Computing Conference (IWCMC), 2012 8th International. IEEE, 619--624.","DOI":"10.1109\/IWCMC.2012.6314275"},{"key":"key-10.1145\/3229543.3229548-29","doi-asserted-by":"crossref","unstructured":"Ming Xu, Wenbo Zhu, Jian Xu, and Ning Zheng. 2015. Towards selecting optimal features for flow statistical based network traffic classification. 
In Network Operations and Management Symposium (APNOMS), 2015 17th Asia-Pacific. IEEE, 479--482.","DOI":"10.1109\/APNOMS.2015.7275371"},{"key":"key-10.1145\/3229543.3229548-30","doi-asserted-by":"crossref","unstructured":"Zhenlong Yuan, Y. Xue, and Y. Dong. 2013. Harvesting unique characteristics in packet sequences for effective application classification. In 2013 IEEE Conference on Communications and Network Security (CNS). 341--349.","DOI":"10.1109\/CNS.2013.6682724"},{"key":"key-10.1145\/3229543.3229548-31","doi-asserted-by":"crossref","unstructured":"Shuyuan Zhao, Yongzheng Zhang, and Peng Chang. 2017. Network Traffic Classification Using Tri-training Based on Statistical Flow Characteristics. In Trustcom\/BigDataSE\/ICESS, 2017 IEEE. IEEE, 323--330.","DOI":"10.1109\/Trustcom\/BigDataSE\/ICESS.2017.254"}],"event":{"name":"the 2018 Workshop","start":{"date-parts":[[2018,8,24]]},"sponsor":["SIGCOMM, ACM Special Interest Group on Data Communication"],"location":"Budapest, Hungary","end":{"date-parts":[[2018,8,24]]},"acronym":"NetAI'18"},"container-title":["Proceedings of the 2018 Workshop on Network Meets AI & ML - NetAI'18"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3229543.3229548","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/dl.acm.org\/ft_gateway.cfm?id=3229548&ftid=1992144&dwn=1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:43Z","timestamp":1750210783000},"score":1,"resource":{"primary":{"URL":"http:\/\/dl.acm.org\/citation.cfm?doid=3229543.3229548"}},"subtitle":[],"proceedings-subject":"Network Meets AI & ML","short-title":[],"issued":{"date-parts":[[2018]]},"references-count":31,"URL":"https:\/\/doi.org\/10.1145\/3229543.3229548","relation":{},"subject":[],"published":{"date-parts":[[2018]]}}}