{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T13:05:45Z","timestamp":1775912745626,"version":"3.50.1"},"reference-count":21,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2021,5,12]],"date-time":"2021-05-12T00:00:00Z","timestamp":1620777600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>A series of eating behaviors, including chewing and swallowing, is considered to be crucial to the maintenance of good health. However, most such behaviors occur within the human body, and highly invasive methods such as X-rays and fiberscopes must be utilized to collect accurate behavioral data. A simpler method of measurement is needed in healthcare and medical fields; hence, the present study concerns the development of a method to automatically recognize a series of eating behaviors from the sounds produced during eating. The automatic detection of left chewing, right chewing, front biting, and swallowing was tested through the deployment of the hybrid CTC\/attention model, which uses sound recorded through 2ch microphones under the ear and weak labeled data as training data to detect the balance of chewing and swallowing. N-gram based data augmentation was first performed using weak labeled data to generate many weak labeled eating sounds to augment the training data. The detection performance was improved through the use of the hybrid CTC\/attention model, which can learn the context. 
In addition, the study confirmed a similar detection performance for open and closed foods.<\/jats:p>","DOI":"10.3390\/s21103378","type":"journal-article","created":{"date-parts":[[2021,5,12]],"date-time":"2021-05-12T22:46:14Z","timestamp":1620859574000},"page":"3378","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Automatic Detection of Chewing and Swallowing"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7938-5254","authenticated-orcid":false,"given":"Akihiro","family":"Nakamura","sequence":"first","affiliation":[{"name":"Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu 432-8011, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7243-2270","authenticated-orcid":false,"given":"Takato","family":"Saito","sequence":"additional","affiliation":[{"name":"NTT DOCOMO, Inc., Tokyo 100-6150, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4022-0505","authenticated-orcid":false,"given":"Daizo","family":"Ikeda","sequence":"additional","affiliation":[{"name":"NTT DOCOMO, Inc., Tokyo 100-6150, Japan"}]},{"given":"Ken","family":"Ohta","sequence":"additional","affiliation":[{"name":"NTT DOCOMO, Inc., Tokyo 100-6150, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3921-4298","authenticated-orcid":false,"given":"Hiroshi","family":"Mineno","sequence":"additional","affiliation":[{"name":"Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu 432-8011, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7633-9340","authenticated-orcid":false,"given":"Masafumi","family":"Nishimura","sequence":"additional","affiliation":[{"name":"Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu 432-8011, 
Japan"}]}],"member":"1968","published-online":{"date-parts":[[2021,5,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"117","DOI":"10.2188\/jea.16.117","article-title":"Eating fast leads to obesity: Findings based on self-administered questionnaires among middle-aged Japanese men and women","volume":"16","author":"Otsuka","year":"2006","journal-title":"J. Epidemiol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1111\/j.1741-2358.2012.00666.x","article-title":"Chewing number is related to incremental increases in body weight from 20 years of age in Japanese middle-aged adults","volume":"30","author":"Fukuda","year":"2013","journal-title":"Gerodontology"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1111\/joor.12290","article-title":"Endoscopic evaluation of food bolus formation and its relationship with the number of chewing cycles","volume":"42","author":"Fukatsu","year":"2015","journal-title":"J. Oral Rehabil."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Risako, A., Furuya, J., and Suzuki, T. (2011). Videoendoscopic measurement of food bolus formation for quantitative evaluation of masticatory function. J. Prosthodont. Res., 171\u2013178.","DOI":"10.1016\/j.jpor.2010.12.002"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1772","DOI":"10.1109\/TBME.2014.2306773","article-title":"Automatic Ingestion Monitor: A Novel Wearable Device for Monitoring of Ingestive Behavior","volume":"61","author":"Fontana","year":"2014","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_6","unstructured":"Nguyen, D.T., Cohen, E., Pourhomayoun, M., and Alshurafa, N. (2017, January 13\u201317). SwallowNet: Recurrent Neural Network Detects and Characterizes Eating Patterns. 
Proceedings of the 2017 IEEE International Conference on Pervasive Computing and Communications Workshops, Kailua-Kona, HI, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3130902","article-title":"EarBit: Using Wearable Sensors to Detect Eating","volume":"1","author":"Bedri","year":"2017","journal-title":"Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Farooq, M., and Sazonov, E. (2016). Automatic Measurement of Chew Count and Chewing Rate during Food Intake. Electronics, 5.","DOI":"10.3390\/electronics5040062"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lee, J.T., Park, E., Hwang, J.M., Jung, T.D., and Park, D. (2020). Machine learning analysis to automatically measure response time of pharyngeal swallowing reflex in videofluoroscopic swallowing study. Sci. Rep., 14735.","DOI":"10.1038\/s41598-020-71713-4"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"806","DOI":"10.1109\/JSEN.2015.2469095","article-title":"AutoDietary: A Wearable Acoustic Sensor System for Food Intake Recognition in Daily Life","volume":"16","author":"Bi","year":"2015","journal-title":"IEEE Sens. J."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Olubanjo, T., and Ghovanloo, M. (2014, January 4\u20139). Real-time swallowing detection based on tracheal acoustics. Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy.","DOI":"10.1109\/ICASSP.2014.6854430"},{"key":"ref_12","unstructured":"Ando, J., Saito, T., Kawasaki, S., Katagiri, M., Ikeda, D., Mineno, H., Tsunakawa, T., Nishida, M., and Nishimura, M. (2018, January 4\u20137). Dietary and Conversational Behavior Monitoring by Using Sound Information. 
Proceedings of the 2018 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP\u201918), Honolulu, HI, USA."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Billah, M.M., Abe, T., Nakamura, A., Saito, T., Ikeda, D., Mineno, H., and Nishimura, M. (2019, January 15\u201318). Estimation of Number of Chewing Strokes and Swallowing Event by Using LSTM-CTC and Throat Microphone. Proceedings of the 2019 IEEE 8th Global Conference on Consumer Electronics (GCCE 2019), Osaka, Japan.","DOI":"10.1109\/GCCE46687.2019.9015226"},{"key":"ref_14","unstructured":"Zhang, Y., Qin, J., Park, S.D., Han, W., Chiu, C.C., Pang, R., Le, V.Q., and Wu, Y. (2020). Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Graves, A., Fernandez, S., Gomez, F., and Schmidhuber, J. (2006, January 25\u201329). Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. Proceedings of the 23rd International Conference on Machine Learning, New York, NY, USA.","DOI":"10.1145\/1143844.1143891"},{"key":"ref_16","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (2015, January 7\u20139). Neural Machine Translation by Jointly Learning to Align and Translate. Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_17","unstructured":"Chorowski, J., Bahdanau, D., Cho, K., and Bengio, Y. (2014, January 8\u201314). End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results. Proceedings of the NIPS 2014 Workshop on Deep Learning, Montreal, QC, Canada."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1240","DOI":"10.1109\/JSTSP.2017.2763455","article-title":"Hybrid CTC\/Attention Architecture for End-to-End Speech Recognition","volume":"11","author":"Watanabe","year":"2017","journal-title":"IEEE J. Sel. Top. 
Signal Process."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Dinkel, H., and Yu, K. (2020, January 4\u20138). Duration Robust Weakly Supervised Sound Event Detection. Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing, Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9053459"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Park, S.D., Chan, W., Zhang, Y., Chiu, C.C., Zoph, B., Cubuk, E.D., and Le, Q.V. (2019, January 15\u201319). SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. Proceedings of the Interspeech 2019, Graz, Austria.","DOI":"10.21437\/Interspeech.2019-2680"},{"key":"ref_21","unstructured":"Zhang, H., Cisse, M., Dauphin, Y.N., and Lopez-Paz, D. (2018, April 30\u2013May 3). mixup: Beyond Empirical Risk Minimization. Proceedings of the ICLR 2018, Vancouver, BC, Canada."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/10\/3378\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:59:58Z","timestamp":1760162398000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/10\/3378"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,12]]},"references-count":21,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2021,5]]}},"alternative-id":["s21103378"],"URL":"https:\/\/doi.org\/10.3390\/s21103378","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,12]]}}}