{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T04:51:44Z","timestamp":1774673504706,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1012547","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T00:00:00Z","timestamp":1733097600000}}],"reference-count":45,"publisher":"Public Library of Science (PLoS)","issue":"11","license":[{"start":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T00:00:00Z","timestamp":1731283200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NIH","award":["R35 GM122547"],"award-info":[{"award-number":["R35 GM122547"]}]},{"DOI":"10.13039\/100011495","name":"Massachusetts Life Sciences Center","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100011495","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Image-based cell profiling is a powerful tool that compares perturbed cell populations by measuring thousands of single-cell features and summarizing them into profiles. Typically a sample is represented by averaging across cells, but this fails to capture the heterogeneity within cell populations. We introduce CytoSummaryNet: a Deep Sets-based approach that improves mechanism of action prediction by 30\u201368% in mean average precision compared to average profiling on a public dataset. CytoSummaryNet uses self-supervised contrastive learning in a multiple-instance learning framework, providing an easier-to-apply method for aggregating single-cell feature data than previously published strategies. Interpretability analysis suggests that the model achieves this improvement by downweighting small mitotic cells or those with debris and prioritizing large uncrowded cells. The approach requires only perturbation labels for training, which are readily available in all cell profiling datasets. CytoSummaryNet offers a straightforward post-processing step for single-cell profiles that can significantly boost retrieval performance on image-based profiling datasets.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1012547","type":"journal-article","created":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T13:45:44Z","timestamp":1731332744000},"page":"e1012547","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":6,"title":["Capturing cell heterogeneity in representations of cell populations for image-based profiling using contrastive learning"],"prefix":"10.1371","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4982-9648","authenticated-orcid":true,"given":"Robert","family":"van Dijk","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1138-5036","authenticated-orcid":true,"given":"John","family":"Arevalo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1829-8397","authenticated-orcid":true,"given":"Mehrtash","family":"Babadi","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1555-8261","authenticated-orcid":true,"given":"Anne E.","family":"Carpenter","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3150-3025","authenticated-orcid":true,"given":"Shantanu","family":"Singh","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2024,11,11]]},"reference":[{"key":"pcbi.1012547.ref001","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1038\/s41573-020-00117-w","article-title":"Image-based profiling for drug discovery: due for a machine-learning upgrade?","volume":"20","author":"SN Chandrasekaran","year":"2021","journal-title":"Nat Rev Drug Discov"},{"key":"pcbi.1012547.ref002","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1016\/j.chembiol.2018.01.015","article-title":"Repurposing High-Throughput Image Assays Enables Biological Activity Prediction for Drug Discovery.","volume":"25","author":"J Simm","year":"2018","journal-title":"Cell Chem Biol"},{"key":"pcbi.1012547.ref003","doi-asserted-by":"crossref","first-page":"1967","DOI":"10.1038\/s41467-023-37570-1","article-title":"Predicting compound activity from phenotypic profiles and chemical structures.","volume":"14","author":"N Moshkov","year":"2023","journal-title":"Nat Commun"},{"key":"pcbi.1012547.ref004","doi-asserted-by":"crossref","first-page":"858","DOI":"10.1038\/s42003-022-03763-5","article-title":"Integrating cell morphology with gene expression and chemical structure to aid mitochondrial toxicity detection","volume":"5","author":"S Seal","year":"2022","journal-title":"Commun Biol"},{"key":"pcbi.1012547.ref005","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1002\/cyto.a.23987","article-title":"Label-Free Leukemia Monitoring by Computer Vision.","volume":"97","author":"M Doan","year":"2020","journal-title":"Cytometry A"},{"key":"pcbi.1012547.ref006","doi-asserted-by":"crossref","first-page":"ar49","DOI":"10.1091\/mbc.E21-11-0538","article-title":"Cell Painting predicts impact of lung cancer variants","volume":"33","author":"JC Caicedo","year":"2022","journal-title":"Mol Biol Cell"},{"key":"pcbi.1012547.ref007","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1016\/j.cell.2010.04.033","article-title":"Cellular heterogeneity: do differences make a difference?","volume":"141","author":"SJ Altschuler","year":"2010","journal-title":"Cell"},{"key":"pcbi.1012547.ref008","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.copbio.2016.03.015","article-title":"Single-cell states versus single-cell atlases\u2014two classes of heterogeneity that differ in meaning and method","volume":"39","author":"KA Janes","year":"2016","journal-title":"Curr Opin Biotechnol"},{"key":"pcbi.1012547.ref009","first-page":"105","article-title":"Tumor heterogeneity: causes and consequences","volume":"1805","author":"A Marusyk","year":"2010","journal-title":"Biochim Biophys Acta"},{"key":"pcbi.1012547.ref010","doi-asserted-by":"crossref","first-page":"3070","DOI":"10.1158\/0008-5472.CAN-15-3052","article-title":"Combination Therapy Targeting BCL6 and Phospho-STAT3 Defeats Intratumor Heterogeneity in a Subset of Non-Small Cell Lung Cancers","volume":"77","author":"D Deb","year":"2017","journal-title":"Cancer Res"},{"key":"pcbi.1012547.ref011","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1038\/s41568-019-0180-2","article-title":"Unravelling tumour heterogeneity by single-cell profiling of circulating tumour cells","volume":"19","author":"L Keller","year":"2019","journal-title":"Nat Rev Cancer"},{"key":"pcbi.1012547.ref012","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.ccell.2019.12.001","article-title":"An Integrated Gene Expression Landscape Profiling Approach to Identify Lung Tumor Endothelial Cell Heterogeneity and Angiogenic Candidates","volume":"37","author":"J Goveia","year":"2020","journal-title":"Cancer Cell"},{"key":"pcbi.1012547.ref013","doi-asserted-by":"crossref","first-page":"2082","DOI":"10.1038\/s41467-019-10154-8","article-title":"Capturing single-cell heterogeneity via data fusion improves image-based profiling.","volume":"10","author":"MH Rohban","year":"2019","journal-title":"Nat Commun"},{"key":"pcbi.1012547.ref014","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1038\/nmeth.4397","article-title":"Data-analysis strategies for image-based cell profiling","volume":"14","author":"JC Caicedo","year":"2017","journal-title":"Nat Methods"},{"key":"pcbi.1012547.ref015","doi-asserted-by":"crossref","first-page":"1491","DOI":"10.1101\/gr.190595.115","article-title":"Defining cell types and states with single-cell genomics","volume":"25","author":"C. Trapnell","year":"2015","journal-title":"Genome Res"},{"key":"pcbi.1012547.ref016","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1177\/1087057113503553","article-title":"Comparison of methods for image-based profiling of cellular morphological responses to small-molecule treatment","volume":"18","author":"V Ljosa","year":"2013","journal-title":"J Biomol Screen"},{"key":"pcbi.1012547.ref017","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1177\/2472555218818053","article-title":"Unbiased Phenotype Detection Using Negative Controls.","volume":"24","author":"A Janosch","year":"2019","journal-title":"SLAS Discov."},{"key":"pcbi.1012547.ref018","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1038\/nmeth.1375","article-title":"An approach for extensibly profiling the molecular states of cellular subpopulations.","volume":"6","author":"H Loo L-","year":"2009","journal-title":"Nat Methods"},{"key":"pcbi.1012547.ref019","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1038\/msb.2010.25","article-title":"Clustering phenotype populations by genome-wide RNAi and multiparametric imaging","volume":"6","author":"F Fuchs","year":"2010","journal-title":"Mol Syst Biol"},{"key":"pcbi.1012547.ref020","doi-asserted-by":"crossref","first-page":"193907","DOI":"10.1109\/ACCESS.2020.3031549","article-title":"Contrastive Representation Learning: A Framework and Review.","volume":"8","author":"PH Le-Khac","year":"2020","journal-title":"IEEE Access."},{"key":"pcbi.1012547.ref021","doi-asserted-by":"crossref","first-page":"1981","DOI":"10.1038\/s41596-023-00840-9","article-title":"Optimizing the Cell Painting assay for image-based profiling","volume":"18","author":"BA Cimini","year":"2023","journal-title":"Nat Protoc"},{"key":"pcbi.1012547.ref022","article-title":"A Framework for Multiple-Instance Learning.","author":"Lozano-P\u00e9rez Maron","year":"1997","journal-title":"Adv Neural Inf Process Syst"},{"key":"pcbi.1012547.ref023","article-title":"Towards a Neural Statistician.","author":"H Edwards","year":"2016","journal-title":"arXiv [stat.ML]"},{"key":"pcbi.1012547.ref024","article-title":"PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.","author":"CR Qi","year":"2017","journal-title":"arXiv [cs.CV]."},{"key":"pcbi.1012547.ref025","article-title":"Deep Sets.","author":"M Zaheer","year":"2017","journal-title":"arXiv [cs.LG]."},{"key":"pcbi.1012547.ref026","article-title":"A versatile information retrieval framework for evaluating profile strength and similarity.","author":"AA Kalinin","year":"2024","journal-title":"bioRxiv"},{"key":"pcbi.1012547.ref027","article-title":"Cell Painting Gallery: an open resource for image-based profiling.","author":"E Weisbart","year":"2024","journal-title":"ArXiv"},{"key":"pcbi.1012547.ref028","doi-asserted-by":"crossref","first-page":"1757","DOI":"10.1038\/nprot.2016.105","article-title":"Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes.","volume":"11","author":"A Bray M-","year":"2016","journal-title":"Nat Protoc."},{"key":"pcbi.1012547.ref029","doi-asserted-by":"crossref","first-page":"e80999","DOI":"10.1371\/journal.pone.0080999","article-title":"Multiplex cytological profiling assay to measure diverse cellular states.","volume":"8","author":"SM Gustafsdottir","year":"2013","journal-title":"PLoS One"},{"key":"pcbi.1012547.ref030","article-title":"A Decade in a Systematic Review: The Evolution and Impact of Cell Painting.","author":"S Seal","year":"2024","journal-title":"bioRxiv"},{"key":"pcbi.1012547.ref031","article-title":"UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.","author":"L McInnes","year":"2018","journal-title":"arXiv [stat.ML]."},{"key":"pcbi.1012547.ref032","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1016\/j.cels.2022.10.001","article-title":"Morphology and gene expression profiling provide complementary information for mapping cell state","volume":"13","author":"GP Way","year":"2022","journal-title":"Cell Syst"},{"key":"pcbi.1012547.ref033","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cels.2023.03.003","article-title":"A global genetic interaction network by single-cell imaging and machine learning","volume":"14","author":"F Heigwer","year":"2023","journal-title":"Cell Syst"},{"key":"pcbi.1012547.ref034","article-title":"High-dimensional phenotyping to define the genetic basis of cellular morphology.","author":"M Tegtmeyer","year":"2023","journal-title":"bioRxiv"},{"key":"pcbi.1012547.ref035","article-title":"Learning representations for image-based profiling of perturbations.","author":"N Moshkov","year":"2022","journal-title":"bioRxiv"},{"key":"pcbi.1012547.ref036","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1038\/s44320-024-00029-6","article-title":"PIFiA: self-supervised approach for protein functional annotation from single-cell imaging data","volume":"20","author":"A Razdaibiedina","year":"2024","journal-title":"Mol Syst Biol"},{"key":"pcbi.1012547.ref037","article-title":"Supervised Contrastive Learning.","author":"P Khosla","year":"2020","journal-title":"arXiv [cs.LG]."},{"key":"pcbi.1012547.ref038","unstructured":"Chen T, Kornblith S, Norouzi M, Hinton G. A Simple Framework for Contrastive Learning of Visual Representations. In: Iii HD, Singh A, editors. Proceedings of the 37th International Conference on Machine Learning. PMLR; 13\u201318 Jul 2020. pp. 1597\u20131607."},{"key":"pcbi.1012547.ref039","doi-asserted-by":"crossref","unstructured":"Chakraborty S, Tomsett R, Raghavendra R, Harborne D, Alzantot M, Cerutti F, et al. Interpretability of deep learning models: A survey of results. 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld\/SCALCOM\/UIC\/ATC\/CBDCom\/IOP\/SCI). 2017. pp. 1\u20136.","DOI":"10.1109\/UIC-ATC.2017.8397411"},{"key":"pcbi.1012547.ref040","article-title":"Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models.","author":"W Samek","year":"2017","journal-title":"arXiv [cs.AI]"},{"key":"pcbi.1012547.ref041","author":"CR Qi","year":"2016","journal-title":"PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation"},{"key":"pcbi.1012547.ref042","article-title":"Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations.","author":"SN Chandrasekaran","year":"2022","journal-title":"bioRxiv"},{"key":"pcbi.1012547.ref043","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1038\/d41573-019-00144-2","article-title":"Machine learning brings cell imaging promises into focus","volume":"18","author":"A. Mullard","year":"2019","journal-title":"Nat Rev Drug Discov"},{"key":"pcbi.1012547.ref044","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1186\/s12859-021-04344-9","article-title":"CellProfiler 4: improvements in speed, utility and usability","volume":"22","author":"DR Stirling","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1012547.ref045","article-title":"Decoupled Weight Decay Regularization.","author":"I Loshchilov","year":"2018"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1012547","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T00:00:00Z","timestamp":1733097600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012547","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T14:12:28Z","timestamp":1733148748000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012547"}},"subtitle":[],"editor":[{"given":"Virginie","family":"Uhlmann","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,11,11]]},"references-count":45,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,11,11]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1012547","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.11.14.567038","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,11]]}}}