{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T09:31:02Z","timestamp":1767864662476,"version":"3.49.0"},"reference-count":19,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,3,25]],"date-time":"2023-03-25T00:00:00Z","timestamp":1679702400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006168","name":"National Nuclear Security Administration","doi-asserted-by":"publisher","award":["DE-AC05-00OR22725"],"award-info":[{"award-number":["DE-AC05-00OR22725"]}],"id":[{"id":"10.13039\/100006168","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>In many non-canonical data science scenarios, obtaining, detecting, attributing, and annotating enough high-quality training data is the primary barrier to developing highly effective models. Moreover, in many problems that are not sufficiently defined or constrained, manually developing a training dataset can often overlook interesting phenomena that should be included. To this end, we have developed and demonstrated an iterative self-supervised learning procedure, whereby models are successfully trained and applied to new data to extract new training examples that are added to the corpus of training data. Successive generations of classifiers are then trained on this augmented corpus. 
Using low-frequency acoustic data collected by a network of infrasound sensors deployed around the High Flux Isotope Reactor and Radiochemical Engineering Development Center at Oak Ridge National Laboratory, we test the viability of our proposed approach to develop a powerful classifier with the goal of identifying vehicles from continuously streamed data and differentiating these from other sources of noise such as tools, people, airplanes, and wind. Using a small collection of exhaustively manually labeled data, we test several implementation details of the procedure and demonstrate its success regardless of the fidelity of the initial model used to seed the iterative procedure. Finally, we demonstrate the method\u2019s ability to update a model to accommodate changes in the data-generating distribution encountered during long-term persistent data collection.<\/jats:p>","DOI":"10.3390\/data8040064","type":"journal-article","created":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T04:05:40Z","timestamp":1679889940000},"page":"64","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Improving an Acoustic Vehicle Detector Using an Iterative Self-Supervision Procedure"],"prefix":"10.3390","volume":"8","author":[{"given":"Birdy","family":"Phathanapirom","sequence":"first","affiliation":[{"name":"Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA"}]},{"given":"Jason","family":"Hite","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1405-147X","authenticated-orcid":false,"given":"Kenneth","family":"Dayman","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA"}]},{"given":"David","family":"Chichester","sequence":"additional","affiliation":[{"name":"Idaho National Laboratory, Idaho Falls, ID 83415, 
USA"}]},{"given":"Jared","family":"Johnson","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Hite, J., Dayman, K., Rao, N., Greulich, C., Sen, S., Chichester, D., Nicholson, A., Archer, D., Willis, M., and Garishvili, I. (2020, January 6\u20139). Automated Vehicle Detection in a Nuclear Facility Using Low-Frequency Acoustic Sensors. Proceedings of the 23rd International Conference on Information Fusion, Rustenburg, South Africa.","DOI":"10.23919\/FUSION45008.2020.9190452"},{"key":"ref_2","unstructured":"Zou, T., Sun, H., Tian, X., and Zhang, E. (2003, January 8\u201313). Modeling A self-Learning detection engine automatically for IDS. Proceedings of the IEEE International Conference on Robotics, Intelligent Systems and Signal Processing, Changsha, China."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"105729","DOI":"10.1016\/j.compbiomed.2022.105729","article-title":"Pseudo-Labeling Generative Adversarial Networks for Medical Image Classification","volume":"147","author":"Mao","year":"2022","journal-title":"Comput. Biol. Med."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"102158","DOI":"10.1016\/j.media.2021.102158","article-title":"Semi-supervised learning with progressive unlabeled data excavation for label-efficient surgical workflow recognition","volume":"73","author":"Shi","year":"2021","journal-title":"Med. Image Anal."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1621","DOI":"10.1080\/08839514.2021.1988443","article-title":"Semi-Supervised Self-Training of Hate and Offensive Speech from Social Media","volume":"35","author":"Alsafari","year":"2021","journal-title":"Appl. Artif. 
Intell."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1007\/s10994-019-05855-6","article-title":"A survey on semi-supervised learning","volume":"109","author":"Hoos","year":"2020","journal-title":"Mach. Learn."},{"key":"ref_7","unstructured":"Cozman, F.G., Cohen, I., and Cirelo, M. (2002, January 14\u201316). Unlabeled Data Can Degrade Classification Performance of Generative Classifiers. Proceedings of the Fifteenth International Florida Artificial Intelligence Research Society Conference, Pensacola Beach, FL, USA."},{"key":"ref_8","unstructured":"Han, L., Ye, H.J., and Zhan, D.C. (2023). On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1109\/TASSP.1980.1163359","article-title":"Time-Frequency Representation of Digital Signals","volume":"28","author":"Portnoff","year":"1980","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_10","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_11","first-page":"98","article-title":"Empirical comparison of \u2018hard\u2019 and \u2018soft\u2019 label propagation for relational classification","volume":"Volume 4894","author":"Galstyan","year":"2007","journal-title":"Inductive Logic Programming: 17th International Conference, ILP 2007, Corvallis, OR, USA, 19\u201321 June 2007, Revised Selected Papers 17"},{"key":"ref_12","first-page":"125","article-title":"Properties and benefits of calibrated classifiers","volume":"Volume 3202","author":"Cohen","year":"2004","journal-title":"Knowledge Discovery in Databases: PKDD 2004, 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Pisa, Italy, 20\u201324 September 2004"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3215","DOI":"10.1109\/JSEN.2012.2192425","article-title":"Semi-Supervised Learning Techniques in Artificial Olfaction: A Novel Approach to Classification Problems and Drift Counteraction","volume":"12","author":"Fattoruso","year":"2012","journal-title":"IEEE Sens. J."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1109\/JSEN.2013.2285919","article-title":"Drift Compensation for Electronic Nose by Semi-Supervised Domain Adaption","volume":"14","author":"Liu","year":"2013","journal-title":"IEEE Sens. J."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Liu, T., Li, D., Chen, J., Chen, Y., Yang, T., and Cao, J. (2018). Gas-Sensor Drift Counteraction with Adaptive Active Learning for an Electronic Nose. Sensors, 18.","DOI":"10.3390\/s18114028"},{"key":"ref_16","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_17","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). 
Attention Is All You Need. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Subakan, C., Ravanelli, M., Cornell, S., Bronzi, M., and Zhong, J. (2021, January 6\u201311). Attention is all you need in speech separation. Proceedings of the ICASSP 2021\u20132021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9413901"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J.H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer. [2nd ed.].","DOI":"10.1007\/978-0-387-84858-7"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/8\/4\/64\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:03:03Z","timestamp":1760122983000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/8\/4\/64"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,25]]},"references-count":19,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["data8040064"],"URL":"https:\/\/doi.org\/10.3390\/data8040064","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,25]]}}}