{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:08:45Z","timestamp":1750306125441,"version":"3.41.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2017,3,21]],"date-time":"2017-03-21T00:00:00Z","timestamp":1490054400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["61375014, 61533015, U1613214, 61333019 and 61401455"],"award-info":[{"award-number":["61375014, 61533015, U1613214, 61333019 and 61401455"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2017,5,31]]},"abstract":"<jats:p>Successful computer-aided diagnosis systems typically rely on training datasets containing sufficient and richly annotated images. However, detailed image annotation is often time consuming and subjective, especially for medical images, which becomes the bottleneck for the collection of large datasets and then building computer-aided diagnosis systems. In this article, we design a novel computer-aided endoscopy diagnosis system to deal with the multi-classification problem of electronic endoscopy medical records (EEMRs) containing sets of frames, while labels of EEMRs can be mined from the corresponding text records using an automatic text-matching strategy without human special labeling. 
With unambiguous EEMR labels and ambiguous frame labels, we propose a simple but effective pooling scheme called Multi-class Latent Concept Pooling, which learns a codebook from EEMRs with different classes step by step and encodes EEMRs based on a soft weighting strategy. In our method, a computer-aided diagnosis system can be extended to new unseen classes with ease and applied to the standard single-instance classification problem even though detailed annotated images are unavailable. In order to validate our system, we collect 1,889 EEMRs with more than 59K frames and successfully mine labels for 348 of them. The experimental results show that our proposed system significantly outperforms the state-of-the-art methods. Moreover, we apply the learned latent concept codebook to detect the abnormalities in endoscopy images and compare it with a supervised learning classifier, and the evaluation shows that our codebook learning method can effectively extract the true prototypes related to different classes from the ambiguous data.<\/jats:p>","DOI":"10.1145\/3051481","type":"journal-article","created":{"date-parts":[[2017,3,23]],"date-time":"2017-03-23T16:19:44Z","timestamp":1490285984000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Multi-Class Latent Concept Pooling for Computer-Aided Endoscopy Diagnosis"],"prefix":"10.1145","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3730-6401","authenticated-orcid":false,"given":"Shuai","family":"Wang","sequence":"first","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences and University of Chinese Academy of Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yang","family":"Cong","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of 
Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huijie","family":"Fan","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Baojie","family":"Fan","sequence":"additional","affiliation":[{"name":"College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lianqing","family":"Liu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yunsheng","family":"Yang","sequence":"additional","affiliation":[{"name":"Chinese PLA General Hospital, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yandong","family":"Tang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huaici","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haibin","family":"Yu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,3,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.120"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2005.859472"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85990-1_72"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Ylan Boureau Francis Bach Yann Lecun and Jean Ponce. 2010. Learning mid-level features for recognition. In CVPR. 2559--2566.","DOI":"10.1109\/CVPR.2010.5539963"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1053\/j.gastro.2009.10.053"},{"volume-title":"Liyuan Li, Kap Luk Chan, Shuicheng Yan, Weijia Shen, That Mon Htwe, Jiang Liu, Joo Hwee Lim, and Eng Hui Ong.","year":"2010","author":"Chu Xinqi","key":"e_1_2_1_6_1"},{"volume-title":"Smith","year":"2014","author":"Codella Noel","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2006.873158"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2014.09.010"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995434"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2012.11.021"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.177"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"T. Deselaers L. Pimenidis and H. Ney. 2008. Bag-of-visual-words models for adult image classification and filtering. In ICPR. 1--4.","DOI":"10.1109\/ICPR.2008.4761366"},
{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995454"},{"volume-title":"Liang-Tien Chia, and Peilin Zhao.","year":"2010","author":"Gao Shenghua","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1117\/1.1695563"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2015.2419251"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2015.2437998"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2007.913128"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.113"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compmedimag.2009.11.005"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Herve Jegou Matthijs Douze Cordelia Schmid and Patrick Perez. 2010. Aggregating local descriptors into a compact image representation. In CVPR. 3304--3311.","DOI":"10.1109\/CVPR.2010.5540039"},
{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.235"},{"volume-title":"Hinton","year":"2012","author":"Krizhevsky Alex","key":"e_1_2_1_24_1"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2011.2172438"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2008.2010526"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2012.2185807"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1118\/1.4905164"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2008.926061"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2014.2314959"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10916-014-0109-y"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2016.03.009"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00535-011-0419-5"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2013.2258676"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Florent Perronnin Jorge Sanchez and Thomas Mensink. 2010. Improving the fisher kernel for large-scale image classification. In ECCV. 119--133.","DOI":"10.1007\/978-3-642-15561-1_11"},
{"key":"e_1_2_1_36_1","first-page":"e90","article-title":"Content-adaptive region-based color texture descriptors for medical images","volume":"27","author":"Riaz F.","year":"2015","journal-title":"Leukemia"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2012.2212440"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0636-x"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2014.2310123"},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Bernhard Scholkopf John Platt and Thomas Hofmann. 2007. Efficient sparse coding algorithms. In NIPS. 801--808.","DOI":"10.7551\/mitpress\/7503.003.0105"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2002.806597"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907019"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2011.2171977"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2016.2530141"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2006.889767"},{"volume-title":"Hauptmann","year":"2015","author":"Xu Zhongwen","key":"e_1_2_1_46_1"},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","unstructured":"Jianchao Yang Kai Yu Yihong Gong and T. Huang. 2009. Linear spatial pyramid matching using sparse coding for image classification. In CVPR. 1794--1801.","DOI":"10.1109\/CVPR.2009.5206757"},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Jianchao Yang Kai Yu and Thomas Huang. 2010. Efficient highly over-complete sparse coding using a mixture model. In ECCV. 113--126.","DOI":"10.1007\/978-3-642-15555-0_9"},
{"key":"e_1_2_1_49_1","first-page":"444","article-title":"Key point detection by max pooling for tracking","volume":"45","author":"Yu X.","year":"2015","journal-title":"IEEE Trans. Cybern."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2015.2418534"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.146"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995484"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3051481","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3051481","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:36:56Z","timestamp":1750217816000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3051481"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,3,21]]},"references-count":52,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,5,31]]}},"alternative-id":["10.1145\/3051481"],"URL":"https:\/\/doi.org\/10.1145\/3051481","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2017,3,21]]},"assertion":[{"value":"2016-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2017-03-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}