{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T16:40:11Z","timestamp":1776357611484,"version":"3.51.2"},"reference-count":87,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T00:00:00Z","timestamp":1776297600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T00:00:00Z","timestamp":1776297600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004004","name":"Universit\u00e0 degli Studi di Trento","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004004","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2026,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Concept Bottleneck Models (CBMs) are neural networks designed to conjoin high performance with\n                    <jats:italic>ante-hoc<\/jats:italic>\n                    interpretability. CBMs work by first mapping inputs (e.g., images) to high-level concepts (e.g., visible objects and their properties) and then use these to solve a downstream task (e.g., tagging or scoring an image) in an interpretable manner. Their performance and interpretability, however,\n                    <jats:italic>hinge on the quality of the concepts they learn<\/jats:italic>\n                    . The go-to strategy for ensuring good quality concepts is to leverage expert annotations, which are expensive to collect and seldom available in applications. 
Researchers have recently addressed this issue by introducing \u201cVLM-CBM\u201d architectures that replace manual annotations with weak supervision from foundation models. It is however unclear what the impact of doing so is on the quality of the learned concepts. To answer this question, we put state-of-the-art VLM-CBMs to the test, analyzing their learned concepts empirically using a selection of significant metrics. Our results show that, depending on the task, VLM supervision can noticeably differ from expert annotations, and that concept accuracy and quality are not strongly correlated. Our code is available at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/debryu\/CQA\" ext-link-type=\"uri\">https:\/\/github.com\/debryu\/CQA<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1007\/s10994-026-06999-y","type":"journal-article","created":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T15:39:14Z","timestamp":1776353954000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["If Concept Bottlenecks are the Question, are Foundation Models the 
Answer?"],"prefix":"10.1007","volume":"115","author":[{"given":"Nicola","family":"Debole","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pietro","family":"Barbiero","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francesco","family":"Giannini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrea","family":"Passerini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stefano","family":"Teso","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Emanuele","family":"Marconato","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2026,4,16]]},"reference":[{"key":"6999_CR1","unstructured":"Abbasi\u00a0Yadkori, Y., et al. (2024). To believe or not to believe your LLM: Iterative prompting for estimating epistemic uncertainty. NeurIPS"},{"key":"6999_CR2","unstructured":"Alvarez\u00a0Melis, D., & Jaakkola, T. (2018). Towards robust interpretability with self-explaining neural networks. NeurIPS"},{"key":"6999_CR3","doi-asserted-by":"crossref","unstructured":"Ansel, J., et al. (2024). PyTorch 2: Faster machine learning through dynamic python bytecode transformation and graph compilation. In: ASPLOS","DOI":"10.1145\/3620665.3640366"},{"key":"6999_CR4","unstructured":"Bahadori, M.T., & Heckerman, D. (2021). Debiasing concept-based explanations with causal analysis. In: ICLR"},{"key":"6999_CR5","doi-asserted-by":"crossref","unstructured":"Barbiero, P., et al. (2022). Entropy-based logic explanations of neural networks. In: AAAI","DOI":"10.1609\/aaai.v36i6.20551"},{"key":"6999_CR6","unstructured":"Barbiero, P., et al. (2023). Interpretable neural-symbolic concept reasoning. 
In: ICML"},{"key":"6999_CR7","doi-asserted-by":"crossref","unstructured":"Barbiero, P., et al. (2024). Relational concept bottleneck models. NeurIPS","DOI":"10.52202\/079017-2468"},{"key":"6999_CR8","unstructured":"Barbiero, P., et al. (2025). Neural interpretable reasoning. arXiv:2502.11639"},{"key":"6999_CR9","unstructured":"Bontempelli, A., et al. (2023). Concept-level debugging of part-prototype networks. In: ICLR"},{"key":"6999_CR10","unstructured":"Bortolotti, S., et al. (2025). Shortcuts and identifiability in concept-based models from a neuro-symbolic lens. arXiv:2502.11245"},{"key":"6999_CR11","unstructured":"Breiman, L., et al. (1984). Classification and regression trees. CRC Press"},{"key":"6999_CR12","unstructured":"Calanzone, D., et al. (2025). Logically consistent language models via neuro-symbolic integration. In: ICLR"},{"key":"6999_CR13","doi-asserted-by":"crossref","unstructured":"Chauhan, K., et al. (2023). Interactive concept bottleneck models. In: AAAI","DOI":"10.1609\/aaai.v37i5.25736"},{"key":"6999_CR14","unstructured":"Chen, C., et al. (2019). This looks like that: deep learning for interpretable image recognition. NeurIPS"},{"key":"6999_CR15","doi-asserted-by":"crossref","unstructured":"Chen, Z., et al. (2020). Concept whitening for interpretable image recognition. Nature Machine Intelligence","DOI":"10.1038\/s42256-020-00265-z"},{"key":"6999_CR16","unstructured":"Contributors, X. (2023). XTuner: A toolkit for efficiently fine-tuning LLM. https:\/\/github.com\/InternLM\/xtuner"},{"key":"6999_CR17","doi-asserted-by":"crossref","unstructured":"Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning","DOI":"10.1023\/A:1022627411411"},{"key":"6999_CR18","unstructured":"De\u00a0Felice, G., et al. (2025). Causally reliable concept bottleneck models. arXiv:2503.04363"},{"key":"6999_CR19","unstructured":"Debot, D., et al. (2024). Interpretable concept-based memory reasoning. 
arXiv:2407.15527"},{"key":"6999_CR20","unstructured":"Dominici, G., et al. (2024a). AnyCBMs: How to turn any black box into a concept bottleneck model"},{"key":"6999_CR21","unstructured":"Dominici, G., et al. (2024b). Causal concept graph models: Beyond causal opacity in deep learning. arXiv:2405.16507"},{"key":"6999_CR22","unstructured":"Dominici, G., et al. (2024c). Counterfactual concept bottleneck models. arXiv:2402.01408"},{"key":"6999_CR23","unstructured":"Eastwood, C., & Williams, C.K. (2018). A framework for the quantitative evaluation of disentangled representations. In: ICLR"},{"key":"6999_CR24","unstructured":"Espinosa\u00a0Zarlenga, M., et al. (2023). Learning to receive help: Intervention-aware concept embedding models. NeurIPS"},{"key":"6999_CR25","unstructured":"Espinosa\u00a0Zarlenga, M., et al. (2024). Learning to receive help: Intervention-aware concept embedding models. NeurIPS"},{"key":"6999_CR26","unstructured":"Feng, J., et al. (2024). Bayesian concept bottleneck models with llm priors. arXiv:2410.15555"},{"key":"6999_CR27","unstructured":"Fokkema, H., et al. (2025). Sample-efficient learning of concepts with theoretical guarantees: From data to concepts without interventions. arXiv:2502.06536"},{"key":"6999_CR28","unstructured":"Furby, J., et al. (2023). Towards a deeper understanding of concept bottleneck models through end-to-end explanation. In: Workshop on representation learning for responsible human-centric AI @ AAAI"},{"key":"6999_CR29","unstructured":"Grattafiori, A., et al. (2024). The llama 3 herd of models. arXiv preprint arXiv:2407.21783"},{"key":"6999_CR30","unstructured":"Havasi, M., et al. (2022). Addressing leakage in concept bottleneck models. In: NeurIPS"},{"key":"6999_CR31","doi-asserted-by":"crossref","unstructured":"He, K., et al. (2016). Deep residual learning for image recognition. In: CVPR","DOI":"10.1109\/CVPR.2016.90"},{"key":"6999_CR32","unstructured":"Higgins, I., et al. (2018). 
Towards a definition of disentangled representations. arXiv:1812.02230"},{"key":"6999_CR33","unstructured":"Huang, L., et al. (2023). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM TOIS"},{"key":"6999_CR34","unstructured":"Hurst, A., et al. (2024). Gpt-4o system card. arXiv:2410.21276"},{"key":"6999_CR35","unstructured":"Ismail, A.A., et al. (2023). Concept bottleneck generative models. In: ICLR"},{"key":"6999_CR36","doi-asserted-by":"crossref","unstructured":"Ji, Y., et al. (2025). A comprehensive survey on self-interpretable neural networks. arXiv:2501.15638","DOI":"10.1109\/JPROC.2025.3635153"},{"key":"6999_CR37","unstructured":"Kazhdan, D., et al. (2021). Is disentanglement all you need? Comparing concept-based & disentanglement approaches. arXiv:2104.06917"},{"key":"6999_CR38","unstructured":"Kim, H., & Mnih, A. (2018). Disentangling by factorising. In: ICML"},{"key":"6999_CR39","unstructured":"Kim, B., et al. (2018). Interpretability beyond feature attribution: Quantitative testing with concept activation vectors. In: ICML"},{"key":"6999_CR40","unstructured":"Kim, E., et al. (2023). Probabilistic concept bottleneck models. In: ICML"},{"key":"6999_CR41","unstructured":"Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980"},{"key":"6999_CR42","unstructured":"Koh, P.W., et al. (2020). Concept bottleneck models. In: ICML"},{"key":"6999_CR43","doi-asserted-by":"crossref","unstructured":"Laguna, S., et al. (2024). Beyond concept bottleneck models: How to make black boxes intervenable? NeurIPS","DOI":"10.52202\/079017-2699"},{"key":"6999_CR44","unstructured":"Lai, S., et al. (2024). Faithful vision-language interpretation via concept bottleneck models. In: ICLR"},{"key":"6999_CR45","doi-asserted-by":"crossref","unstructured":"Lertvittayakumjorn, P., et al. (2020) Find: human-in-the-loop debugging deep text classifiers. 
In: EMNLP","DOI":"10.18653\/v1\/2020.emnlp-main.24"},{"key":"6999_CR46","doi-asserted-by":"crossref","unstructured":"Li, O., et al. (2018). Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions. In: AAAI","DOI":"10.1609\/aaai.v32i1.11771"},{"key":"6999_CR47","unstructured":"Li, S., et al. (2024). On erroneous agreements of clip image embeddings. arXiv:2411.05195"},{"key":"6999_CR48","doi-asserted-by":"crossref","unstructured":"Liu, Z., et al. (2015). Deep learning face attributes in the wild. In: ICCV","DOI":"10.1109\/ICCV.2015.425"},{"key":"6999_CR49","doi-asserted-by":"crossref","unstructured":"Liu, S., et al. (2024). Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection. In: ECCV","DOI":"10.1007\/978-3-031-72970-6_3"},{"key":"6999_CR50","unstructured":"Lockhart, J., et al. (2022). Towards learning to explain with concept bottleneck models: Mitigating information leakage. arXiv:2211.03656"},{"key":"6999_CR51","unstructured":"Mahinpei, A., et al. (2021). Promises and pitfalls of black-box concept learning models. In: Workshop on theoretic foundation, criticism, and application trend of explainable AI @ ICML"},{"key":"6999_CR52","doi-asserted-by":"crossref","unstructured":"Marconato, E., et al. (2022). GlanceNets: Interpretable, Leak-proof Concept-based Models. In: NeurIPS","DOI":"10.52202\/068431-1542"},{"key":"6999_CR53","doi-asserted-by":"crossref","unstructured":"Marconato, E., et al. (2023). Interpretability is in the mind of the beholder: A causal framework for human-interpretable representation learning. Entropy","DOI":"10.3390\/e25121574"},{"key":"6999_CR54","doi-asserted-by":"crossref","unstructured":"Marconato, E., et al. (2024). Not all neuro-symbolic concepts are created equal: Analysis and mitigation of reasoning shortcuts. NeurIPS","DOI":"10.52202\/075280-3170"},{"key":"6999_CR55","unstructured":"Margeloiu, A., et al. (2021). 
Do concept bottleneck models learn as intended? arXiv:2105.04289"},{"key":"6999_CR56","doi-asserted-by":"crossref","unstructured":"Mikriukov, G., et al. (2023). Evaluating the stability of semantic concept representations in CNNs for robust explainability. In: World conference on explainable artificial intelligence","DOI":"10.1007\/978-3-031-44067-0_26"},{"key":"6999_CR57","unstructured":"Montero, M., et al. (2022). Lost in latent space: Examining failures of disentangled models at combinatorial generalisation. NeurIPS"},{"key":"6999_CR58","unstructured":"Moreira, R., et al. (2024). Diconstruct: Causal concept-based explanations through black-box distillation. arXiv:2401.08534"},{"key":"6999_CR59","unstructured":"Oikarinen, T., et al. (2023). Label-free concept bottleneck models. In: ICLR"},{"key":"6999_CR60","unstructured":"Poeta, E., et al. (2023). Concept-based explainable artificial intelligence: A survey. arXiv:2312.12936"},{"key":"6999_CR61","unstructured":"Radford, A., et al. (2021). Learning transferable visual models from natural language supervision. In: ICML"},{"key":"6999_CR62","unstructured":"Rajendran, G., et al. (2024). From causal to concept-based representation learning. NeurIPS"},{"key":"6999_CR63","unstructured":"Raman, N., et al. (2023). Do concept bottleneck models obey locality? In: XAI in action: Past, present, and future applications"},{"key":"6999_CR64","doi-asserted-by":"crossref","unstructured":"Rao, S., et al. (2024). Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery","DOI":"10.1007\/978-3-031-72980-5_26"},{"key":"6999_CR65","unstructured":"Sahu, P., et al. (2022). Unpacking large language models with conceptual consistency. arXiv:2209.15093"},{"key":"6999_CR66","doi-asserted-by":"crossref","unstructured":"Sawada, Y., & Nakamura, K. (2022a). Concept bottleneck model with additional unsupervised concepts. 
IEEE Access","DOI":"10.1109\/ACCESS.2022.3167702"},{"key":"6999_CR67","unstructured":"Sawada, Y., & Nakamura, K. (2022b). C-senn: Contrastive self-explaining neural network. arXiv:2206.09575"},{"key":"6999_CR68","doi-asserted-by":"crossref","unstructured":"Sch\u00f6lkopf, B., et al. (2000). New support vector algorithms. Neural Computation","DOI":"10.1162\/089976600300015565"},{"key":"6999_CR69","doi-asserted-by":"crossref","unstructured":"Sch\u00f6lkopf, B., et al. (2021). Toward causal representation learning. IEEE","DOI":"10.1109\/JPROC.2021.3058954"},{"key":"6999_CR70","unstructured":"Schrodi, S., et al. (2024). Concept bottleneck models without predefined concepts. arXiv:2407.03921"},{"key":"6999_CR71","unstructured":"Schwalbe, G. (2022). Concept embedding analysis: A review. arXiv:2203.13909"},{"key":"6999_CR72","unstructured":"Shin, S., et al. (2023). A closer look at the intervention procedure of concept bottleneck models. In: ICML"},{"key":"6999_CR73","doi-asserted-by":"crossref","unstructured":"Srivastava, D., et al. (2024). VLG-CBM: Training concept bottleneck models with vision-language guidance. In: NeurIPS","DOI":"10.52202\/079017-2510"},{"key":"6999_CR74","doi-asserted-by":"crossref","unstructured":"Stammer, W., et al. (2021). Right for the right concept: Revising neuro-symbolic concepts by interacting with their explanations. In: CVPR","DOI":"10.1109\/CVPR46437.2021.00362"},{"key":"6999_CR75","unstructured":"Steinmann, D., et al. (2024). Learning to intervene on concept bottlenecks. In: ICML"},{"key":"6999_CR76","unstructured":"Suter, R., et al. (2019). Robustly disentangled causal mechanisms: Validating deep representations for interventional robustness. In: ICML"},{"key":"6999_CR77","doi-asserted-by":"crossref","unstructured":"Teso, S., et al. (2023). Leveraging explanations in interactive machine learning: An overview. 
Frontiers in Artificial Intelligence","DOI":"10.3389\/frai.2023.1066049"},{"key":"6999_CR78","unstructured":"Vandenhirtz, M., et al. (2024). Stochastic concept bottleneck models. arXiv:2406.19272"},{"key":"6999_CR79","unstructured":"Wah, C., et al. (2011). The caltech-ucsd birds-200-2011 dataset"},{"key":"6999_CR80","unstructured":"Wong, E., et al. (2021). Leveraging sparse linear layers for debuggable deep networks. In: ICML"},{"key":"6999_CR81","doi-asserted-by":"crossref","unstructured":"Yang, Y., et al. (2023). Language in a bottle: Language model guided concept bottlenecks for interpretable image classification. In: CVPR","DOI":"10.1109\/CVPR52729.2023.01839"},{"key":"6999_CR82","unstructured":"Yeh, C.-K., et al. (2020). On completeness-aware concept-based explanations in deep neural networks. NeurIPS"},{"key":"6999_CR83","doi-asserted-by":"crossref","unstructured":"Yuan, Y., et al. (2024). Do LLMs overcome shortcut learning? An evaluation of shortcut challenges in large language models. arXiv:2410.13343","DOI":"10.32388\/1BZH25"},{"key":"6999_CR84","unstructured":"Yuksekgonul, M., et al. (2023). Post-hoc concept bottleneck models. In: ICLR"},{"key":"6999_CR85","unstructured":"Zarlenga, M.E., et al. (2022). Concept embedding models: Beyond the accuracy-explainability trade-off. In: NeurIPS"},{"key":"6999_CR86","unstructured":"Zarlenga, M.E., et al. (2023). Towards robust metrics for concept representation evaluation. In: AAAI"},{"key":"6999_CR87","unstructured":"Zhang, R., et al. (2024). The decoupling concept bottleneck model. 
PAMI"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-06999-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-026-06999-y","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-06999-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T15:39:59Z","timestamp":1776353999000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-026-06999-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,16]]},"references-count":87,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2026,5]]}},"alternative-id":["6999"],"URL":"https:\/\/doi.org\/10.1007\/s10994-026-06999-y","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,16]]},"assertion":[{"value":"16 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 January 2026","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2026","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 April 2026","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no Conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"97"}}