{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T04:17:36Z","timestamp":1727929056665},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T00:00:00Z","timestamp":1727913600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T00:00:00Z","timestamp":1727913600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>An application programming interface (API) usage specification, which includes the conditions, calling sequences, and semantic relationships of the API, is important for verifying its correct usage, which is in turn critical for ensuring the security and availability of the target program. However, existing techniques either mine the co-occurring relationships of multiple APIs without considering their semantic relationships, or they use data flow and control flow information to extract semantic beliefs on API pairs but are difficult to incorporate when mining specifications for multiple APIs. Hence, we propose an API specification mining approach that efficiently extracts a relatively complete list of the API combinations and semantic relationships between APIs. This approach analyzes a target program in two stages. The first stage uses frequent API set mining based on frequent common API identification and filtration to extract the maximal set of frequent context-sensitive API sequences. 
In the second stage, the API relationship graph is constructed using three semantic relationships extracted from the symbolic path information, and the specifications containing semantic relationships for multiple APIs are mined. The experimental results on six popular open-source code bases of different scales show that the proposed two-stage approach not only yields better results than existing typical approaches, but can also effectively discover the specifications along with the semantic relationships for multiple APIs. Instance analysis shows that the analysis of security-related API call violations can assist in the cause analysis and patching of software vulnerabilities.<\/jats:p>","DOI":"10.1186\/s42400-024-00224-w","type":"journal-article","created":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T02:01:31Z","timestamp":1727920891000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Discovering API usage specifications for security detection using two-stage code mining"],"prefix":"10.1186","volume":"7","author":[{"given":"Zhongxu","family":"Yin","sequence":"first","affiliation":[]},{"given":"Yiran","family":"Song","sequence":"additional","affiliation":[]},{"given":"Guoxiao","family":"Zong","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,10,3]]},"reference":[{"issue":"10","key":"224_CR1","doi-asserted-by":"publisher","first-page":"984","DOI":"10.1109\/TSE.2018.2816639","volume":"45","author":"P Bian","year":"2018","unstructured":"Bian P et al (2018a) Detecting bugs by discovering expectations and their violations. IEEE Trans Softw Eng 45(10):984\u20131001","journal-title":"IEEE Trans Softw Eng"},{"key":"224_CR2","doi-asserted-by":"crossref","unstructured":"Bian P et al. (2018) \u201cNar-miner: Discovering negative association rules from code for bug detection\u201d. 
In: Proceedings of the 2018 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. pp. 411\u2013422.","DOI":"10.1145\/3236024.3236032"},{"issue":"1","key":"224_CR3","first-page":"51","volume":"24","author":"R-y Chang","year":"2012","unstructured":"Chang R-y, Podgurski A (2012) Discovering programming rules and violations by mining interprocedural dependences. J Softw: Evolut Process 24(1):51\u201366","journal-title":"J Softw: Evolut Process"},{"issue":"5","key":"224_CR4","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1109\/TSE.2008.24","volume":"34","author":"R-Y Chang","year":"2008","unstructured":"Chang R-Y, Podgurski A, Yang J (2008) Discovering neglected conditions in software by mining dependence graphs. IEEE Trans Softw Eng 34(5):579\u2013596","journal-title":"IEEE Trans Softw Eng"},{"key":"224_CR5","doi-asserted-by":"publisher","DOI":"10.3970\/cmc.2018.02574","author":"L Chen","year":"2018","unstructured":"Chen L et al (2018) Automatic mining of security-sensitive functions from source code. Comput, Mater Continua. https:\/\/doi.org\/10.3970\/cmc.2018.02574","journal-title":"Comput, Mater Continua"},{"key":"224_CR6","doi-asserted-by":"crossref","unstructured":"Dyer R et al. (2013) \u201cBoa: A language and infrastructure for analyzing ultra-large-scale software repositories\u201d. In: 2013 35th international conference on software engineering (ICSE). IEEE. pp. 422\u2013431.","DOI":"10.1109\/ICSE.2013.6606588"},{"key":"224_CR7","unstructured":"Grahne G and Zhu J  (2003) \u201cEfficiently using prefix-trees in mining frequent itemsets.\u201d In: FIMI. Vol. 90 pp 65."},{"key":"224_CR8","unstructured":"Grahne G  and Zhu J (2003) \u201cHigh performance mining of maximal frequent itemsets\u201d. In: 6th International workshop on high performance data mining. Vol. 16. pp 34."},{"key":"224_CR9","doi-asserted-by":"publisher","unstructured":"He  B et al. 
\u201cVetting SSL Usage in Applications with SSLINT\u201d. In: 2015 IEEE Symposium on Security and Privacy. 2015, pp. 519\u2013534. doi: https:\/\/doi.org\/10.1109\/SP.2015.38.","DOI":"10.1109\/SP.2015.38"},{"key":"224_CR10","unstructured":"Henkel J  et al. (2019) \u201cEnabling Open-World Specification Mining via Unsupervised Learning\u201d. In: arXiv preprint arXiv:1904.12098"},{"key":"224_CR11","doi-asserted-by":"crossref","unstructured":"Huan J et al. (2004) \u201cSpin: mining maximal frequent subgraphs from graph databases\u201d. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. pp 581\u2013586.","DOI":"10.1145\/1014052.1014123"},{"key":"224_CR12","unstructured":"Jana S, Kang Y J, Roth S, et al. (2016) Automatically detecting error handling bugs using error specifications[C]\/\/25th USENIX Security Symposium (USENIX Security 16). pp 345\u2013362."},{"key":"224_CR13","doi-asserted-by":"crossref","unstructured":"Kang Y,  Ray B and Jana S . (2016) \u201cApex: Automated inference of error specifications for c apis\u201d. In: Proceedings of the 31st IEEE\/ACM international conference on automated software engineering, pp 472\u2013482.","DOI":"10.1145\/2970276.2970354"},{"key":"224_CR14","doi-asserted-by":"crossref","unstructured":"Karp RM and Tarjan RE . (1980) \u201cLinear expected-time algorithms for connectivity problems\u201d. In: Proceedings of the twelfth annual ACM symposium on Theory of computing. pp 368\u2013377.","DOI":"10.1145\/800141.804686"},{"key":"224_CR15","doi-asserted-by":"crossref","unstructured":"Lee G et al. \u201cApproximate maximal frequent pattern mining with weight conditions and error tolerance\u201d. In: International Journal of Pattern Recognition and Artificial Intelligence 30.06 (2016), p. 
1650012.","DOI":"10.1142\/S0218001416500129"},{"key":"224_CR16","doi-asserted-by":"publisher","first-page":"4267","DOI":"10.1007\/s00500-017-2820-3","volume":"22","author":"G Lee","year":"2018","unstructured":"Lee G, Yun U (2018) Performance and characteristic analysis of maximal frequent pattern mining methods using additional factors. Soft Comput 22:4267\u20134273","journal-title":"Soft Comput"},{"key":"224_CR17","doi-asserted-by":"crossref","unstructured":"Lemieux C , Park D , and Beschastnikh  I . (2015) \u201cGeneral LTL specification mining (T)\u201d. In: 2015 30th IEEE\/ACM international conference on automated software engineering (ASE). IEEE., pp 81\u201392.","DOI":"10.1109\/ASE.2015.71"},{"key":"224_CR18","doi-asserted-by":"crossref","unstructured":"Liang B et al. (2016) \u201cAntMiner: mining more bugs by reducing noise interference\u201d. In: Proceedings of the 38th international conference on software engineering. pp 333\u2013344.","DOI":"10.1145\/2884781.2884870"},{"issue":"5","key":"224_CR19","doi-asserted-by":"publisher","first-page":"306","DOI":"10.1145\/1095430.1081755","volume":"30","author":"Z Li","year":"2005","unstructured":"Li Z, Zhou Y (2005) PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code. ACM SIGSOFT Softw Eng Notes 30(5):306\u2013315","journal-title":"ACM SIGSOFT Softw Eng Notes"},{"key":"224_CR20","doi-asserted-by":"crossref","unstructured":"Lv T, Li R, Yang Y, et al. Rtfm! automatic assumption discovery and verification derivation from library document for api misuse detection[C]\/\/Proceedings of the 2020 ACM SIGSAC conference on computer and communications security. 2020 pp 1837-1852","DOI":"10.1145\/3372297.3423360"},{"key":"224_CR21","unstructured":"MicrochipTech. MicrochipTech mbedtls examples. https:\/\/github.com\/MicrochipTech\/mbedtls-examples. 2019."},{"key":"224_CR22","doi-asserted-by":"crossref","unstructured":"Nguyen  HA et al. 
(2014) \u201cMining preconditions of APIs in large-scale code corpus\u201d. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering. pp. 166\u2013177.","DOI":"10.1145\/2635868.2635924"},{"key":"224_CR23","doi-asserted-by":"crossref","unstructured":"Nguyen HA et al. (2015) \u201cConsensus-based mining of API preconditions in big code\u201d. In: Companion Proceedings of the 2015 ACM SIGPLAN international conference on systems, programming, languages and applications: software for humanity. pp 5\u20136.","DOI":"10.1145\/2814189.2816271"},{"key":"224_CR24","doi-asserted-by":"crossref","unstructured":"Ramanathan MK, Grama  A , and Jagannathan S. (2007) \u201cStatic specification inference using predicate mining\u201d. In: ACM SIGPLAN Notices 42.6, pp 123\u2013134.","DOI":"10.1145\/1273442.1250749"},{"key":"224_CR25","unstructured":"Ramos DA and Engler D (2015) \u201cUnder-constrained symbolic execution: Correctness checking for real code\u201d. In: 24th USENIX Security Symposium (USENIX Security 15), pp 49\u201364."},{"key":"224_CR26","doi-asserted-by":"crossref","unstructured":"Schlichtig M, Sassalla S, Narasimhan K, et al. (2022) Fum-a framework for api usage constraint and misuse classification[C]\/\/2022 IEEE international conference on software analysis, evolution and reengineering (SANER). IEEE, pp 673\u2013684.","DOI":"10.1109\/SANER53432.2022.00085"},{"key":"224_CR27","doi-asserted-by":"crossref","unstructured":"Shastry B  et al. (2016) \u201cTowards vulnerability discovery using staged program analysis\u201d. In: detection of intrusions and malware, and vulnerability assessment: 13th international conference, DIMVA 2016, San Sebasti\u00e1n, Spain, July 7\u20138, Proceedings 13. Springer. 2016, pp 78\u201397.","DOI":"10.1007\/978-3-319-40667-1_5"},{"key":"224_CR28","unstructured":"Tamaskar SD, Raut AB. 
Approach for Mining in Lossless Representation of Closed Itemsets[J].\u00a02016(11)."},{"key":"224_CR29","doi-asserted-by":"crossref","unstructured":"Wang X, Zhao L. APICAD: Augmenting API Misuse Detection through Specifications from Code and Documents[C]\/\/2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 2023: 245\u2013256.","DOI":"10.1109\/ICSE48619.2023.00032"},{"key":"224_CR30","doi-asserted-by":"crossref","unstructured":"Yamaguchi F, Wressnegger C, Gascon H, et al. Chucky: Exposing missing checks in source code for vulnerability discovery[C]\/\/Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. 2013: pp 499-510","DOI":"10.1145\/2508859.2516665"},{"issue":"2","key":"224_CR31","first-page":"1013","volume":"63","author":"Z Yin","year":"2020","unstructured":"Yin Z et al (2020) A security sensitive function mining approach based on precondition pattern analysis. Comput, Mater Continua 63(2):1013\u20131029","journal-title":"Comput, Mater Continua"},{"key":"224_CR32","unstructured":"Yun I et al. (2016) \u201cAPISan: Sanitizing API Usages through Semantic Cross-Checking.\u201d In: Usenix Security Symposium. pp. 363\u2013378."},{"key":"224_CR33","doi-asserted-by":"publisher","first-page":"304","DOI":"10.1016\/j.eswa.2016.01.049","volume":"54","author":"U Yun","year":"2016","unstructured":"Yun U, Lee G (2016) Incremental mining of weighted maximal frequent itemsets from dynamic databases. Expert Syst Appl 54:304\u2013327","journal-title":"Expert Syst Appl"},{"issue":"5","key":"224_CR34","doi-asserted-by":"publisher","first-page":"439","DOI":"10.1111\/exsy.12158","volume":"33","author":"U Yun","year":"2016","unstructured":"Yun U, Lee G, Lee K-M (2016) Efficient representative pattern mining based on weight and maximality conditions. 
Expert Syst 33(5):439\u2013462","journal-title":"Expert Syst"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00224-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-024-00224-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00224-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T02:03:22Z","timestamp":1727921002000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-024-00224-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,3]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["224"],"URL":"https:\/\/doi.org\/10.1186\/s42400-024-00224-w","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,3]]},"assertion":[{"value":"3 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 February 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 October 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All authors disclosed no relevant relationships.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"30"}}