{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,18]],"date-time":"2024-10-18T04:27:49Z","timestamp":1729225669944,"version":"3.27.0"},"reference-count":0,"publisher":"IOS Press","isbn-type":[{"value":"9781643685489","type":"electronic"}],"license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,16]]},"abstract":"<jats:p>Self-supervised learning and knowledge distillation intersect to achieve exceptional performance on downstream tasks across diverse network capacities. This paper introduces MIM-HD, which implements enhancements for masked image modeling (MIM) distillation, in two key aspects. First, a vision transformer head-level relation adaptive distillation approach is proposed, allowing the student to dynamically draw multi-source knowledge from the teacher based on its evolving state, compatible with scenarios where teacher-student transformer block head count differs. Second, to address the overemphasis on the encoder and neglect of the decoder role in maintaining representation consistency in previous MIM distillations, a dual-view decoding strategy for latent visual representations is introduced, reusing the teacher\u2019s decoder to alleviate MIM burdens on smaller networks. MIM-HD effectiveness is demonstrated through evaluations on ADE20K (mIoU) and ImageNet-1K (Acc), achieving +1.4% and +0.5% improved performance, respectively, compared to state-of-the-art methods, with substantial advantages on smaller pre-training datasets. Moreover, MIM-HD achieves superior efficiency, reducing pre-training epochs from 300 to 100.<\/jats:p>","DOI":"10.3233\/faia240493","type":"book-chapter","created":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T12:40:14Z","timestamp":1729168814000},"source":"Crossref","is-referenced-by-count":0,"title":["MIM-HD: Making Smaller Masked Autoencoder Better with Efficient Distillation"],"prefix":"10.3233","author":[{"given":"Zherui","family":"Zhang","sequence":"first","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Changwei","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences)"},{"name":"Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rongtao","family":"Xu","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenhao","family":"Xu","sequence":"additional","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shibiao","family":"Xu","sequence":"additional","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Guo","sequence":"additional","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiguang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoqiang","family":"Teng","sequence":"additional","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenbo","family":"Xu","sequence":"additional","affiliation":[{"name":"Artificial Intelligence, Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"7437","container-title":["Frontiers in Artificial Intelligence and Applications","ECAI 2024"],"original-title":[],"link":[{"URL":"https:\/\/ebooks.iospress.nl\/pdf\/doi\/10.3233\/FAIA240493","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T12:40:14Z","timestamp":1729168814000},"score":1,"resource":{"primary":{"URL":"https:\/\/ebooks.iospress.nl\/doi\/10.3233\/FAIA240493"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"ISBN":["9781643685489"],"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/faia240493","relation":{},"ISSN":["0922-6389","1879-8314"],"issn-type":[{"value":"0922-6389","type":"print"},{"value":"1879-8314","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]}}}