{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,3]],"date-time":"2026-07-03T16:55:15Z","timestamp":1783097715244,"version":"3.54.6"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2019,3,29]],"date-time":"2019-03-29T00:00:00Z","timestamp":1553817600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2019,3,29]]},"abstract":"<jats:p>Over the years, activity sensing and recognition has been shown to play a key enabling role in a wide range of applications, from sustainability and human-computer interaction to health care. While many recognition tasks have traditionally employed inertial sensors, acoustic-based methods offer the benefit of capturing rich contextual information, which can be useful when discriminating complex activities. Given the emergence of deep learning techniques and leveraging new, large-scale multimedia datasets, this paper revisits the opportunity of training audio-based classifiers without the onerous and time-consuming task of annotating audio data. We propose a framework for audio-based activity recognition that can make use of millions of embedding features from public online video sound clips. Based on the combination of oversampling and deep learning approaches, our framework does not require further feature processing or outliers filtering as in prior work. We evaluated our approach in the context of Activities of Daily Living (ADL) by recognizing 15 everyday activities with 14 participants in their own homes, achieving 64.2% and 83.6% averaged within-subject accuracy in terms of top-1 and top-3 classification respectively. 
Individual class performance was also examined in the paper to further study the co-occurrence characteristics of the activities and the robustness of the framework.<\/jats:p>","DOI":"10.1145\/3314404","type":"journal-article","created":{"date-parts":[[2019,4,2]],"date-time":"2019-04-02T11:57:40Z","timestamp":1554206260000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":80,"title":["Audio-Based Activities of Daily Living (ADL) Recognition with Large-Scale Acoustic Embeddings from Online Videos"],"prefix":"10.1145","volume":"3","author":[{"given":"Dawei","family":"Liang","sequence":"first","affiliation":[{"name":"University of Texas at Austin, Austin, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Edison","family":"Thomaz","sequence":"additional","affiliation":[{"name":"University of Texas at Austin, Austin, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2019,3,29]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Mart\u00edn Abadi Ashish Agarwal Paul Barham Eugene Brevdo Zhifeng Chen Craig Citro Greg S. 
Corrado Andy Davis Jeffrey Dean Matthieu Devin Sanjay Ghemawat Ian Goodfellow Andrew Harp Geoffrey Irving Michael Isard Yangqing Jia Rafal Jozefowicz Lukasz Kaiser Manjunath Kudlur Josh Levenberg Dandelion Man\u00e9 Rajat Monga Sherry Moore Derek Murray Chris Olah Mike Schuster Jonathon Shlens Benoit Steiner Ilya Sutskever Kunal Talwar Paul Tucker Vincent Vanhoucke Vijay Vasudevan Fernanda Vi\u00e9gas Oriol Vinyals Pete Warden Martin Wattenberg Martin Wicke Yuan Yu and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). https:\/\/www.tensorflow.org\/ Software available from tensorflow.org."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCNC.2013.6488584"},{"key":"e_1_2_1_3_1","volume-title":"Soundnet: Learning sound representations from unlabeled video. In Advances in Neural Information Processing Systems. 892--900.","author":"Aytar Yusuf","year":"2016"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.953"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/11428572_4"},{"key":"e_1_2_1_6_1","unstructured":"Fran\u00e7ois Chollet et al. 2015. Keras. https:\/\/keras.io. 
(2015)."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2011.06.073"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2005.854103"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858036.2858528"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2502081.2502245"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1814433.1814450"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952261"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/11538059_91"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.pmcj.2010.11.005"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","DOI":"10.1109\/TCE.2012.6227479","article-title":"Environmental audio scene and activity recognition through mobile-based crowdsourcing","volume":"58","author":"Hwang Kyuwoong","year":"2012","journal-title":"IEEE Transactions on Consumer Electronics"},{"key":"e_1_2_1_17_1","volume-title":"Audio Set classification with attention model: A probabilistic perspective. arXiv preprint arXiv:1711.00927","author":"Kong Qiuqiang","year":"2017"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964897.1964918"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2750858.2804262"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3242587.3242609"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3025453.3025773"},{"key":"e_1_2_1_22_1","first-page":"1","article-title":"Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning","volume":"18","author":"Lema\u00eetre Guillaume","year":"2017","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_23_1","unstructured":"Alexander Liu Joydeep Ghosh and Cheryl E Martin. 2007. 
Generative Oversampling for Mining Imbalanced Datasets.. In DMIN. 66--72."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555816.1555834"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1460412.1460445"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541831.2541832"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2509352.2509396"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_29_1","first-page":"1541","article-title":"Activity recognition from accelerometer data","volume":"5","author":"Ravi Nishkam","year":"2005","journal-title":"Aaai"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/PerComW.2013.6529487"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISWC.2012.12"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13636-018-0137-5"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2017.2657381"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2655045"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/PERCOMW.2015.7134104"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2750858.2807545"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2678025.2701405"},{"key":"e_1_2_1_38_1","volume-title":"Eighth Annual Conference of the International Speech Communication Association.","author":"Wyatt Danny","year":"2007"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TST.2014.6838194"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370216.2370269"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous 
Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3314404","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3314404","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:53:30Z","timestamp":1750204410000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3314404"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,29]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,3,29]]}},"alternative-id":["10.1145\/3314404"],"URL":"https:\/\/doi.org\/10.1145\/3314404","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,29]]},"assertion":[{"value":"2018-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-03-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}