{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T03:55:13Z","timestamp":1774670113160,"version":"3.50.1"},"reference-count":18,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,7,2]],"date-time":"2020-07-02T00:00:00Z","timestamp":1593648000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,2]],"date-time":"2020-07-02T00:00:00Z","timestamp":1593648000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>Image-based high throughput (HT) screening provides a rich source of information on dynamic cellular response to external perturbations. The large quantity of data generated necessitates computer-aided quality control (QC) methodologies to flag imaging and staining artifacts. Existing image- or patch-level QC methods require separate thresholds to be simultaneously tuned for each image quality metric used, and also struggle to distinguish between artifacts and valid cellular phenotypes. As a result, extensive time and effort must be spent on per-assay QC feature thresholding, and valid images and phenotypes may be discarded while image- and cell-level artifacts go undetected.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Results<\/jats:title>\n<jats:p>We present a novel cell-level QC workflow built on machine learning approaches for classifying artifacts from HT image data. 
First, a phenotype sampler based on unlabeled clustering collects a comprehensive subset of cellular phenotypes, requiring only the inspection of a handful of images per phenotype for validity. A set of one-class support vector machines are then trained on each biologically valid image phenotype, and used to classify individual objects in each image as valid cells or artifacts. We apply this workflow to two real-world large-scale HT image datasets and observe that the ratio of artifact to total object area (<jats:italic>AR<\/jats:italic><jats:sub><jats:italic>cell<\/jats:italic><\/jats:sub>) provides a single robust assessment of image quality regardless of the underlying causes of quality issues. Gating on this single intuitive metric, partially contaminated images can be salvaged and highly contaminated images can be excluded before image-level phenotype summary, enabling a more reliable characterization of cellular response dynamics.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusions<\/jats:title>\n<jats:p>Our cell-level QC workflow enables identification of artificial cells created not only by staining or imaging artifacts but also by the limitations of image segmentation algorithms. The single readout <jats:italic>AR<\/jats:italic><jats:sub><jats:italic>cell<\/jats:italic><\/jats:sub> that summarizes the ratio of artifacts contained in each image can be used to reliably rank images by quality and more accurately determine QC cutoff thresholds. Machine learning-based cellular phenotype clustering and sampling reduces the amount of manual work required for training example collection. 
Our QC workflow automatically handles assay-specific phenotypic variations and generalizes to different HT image assays.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s12859-020-03603-5","type":"journal-article","created":{"date-parts":[[2020,7,2]],"date-time":"2020-07-02T13:40:52Z","timestamp":1593697252000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["A cell-level quality control workflow for high-throughput image analysis"],"prefix":"10.1186","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8481-601X","authenticated-orcid":false,"given":"Minhua","family":"Qiu","sequence":"first","affiliation":[]},{"given":"Bin","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Frederick","family":"Lo","sequence":"additional","affiliation":[]},{"given":"Steven","family":"Cook","sequence":"additional","affiliation":[]},{"given":"Jason","family":"Chyba","sequence":"additional","affiliation":[]},{"given":"Doug","family":"Quackenbush","sequence":"additional","affiliation":[]},{"given":"Jason","family":"Matzen","sequence":"additional","affiliation":[]},{"given":"Zhizhong","family":"Li","sequence":"additional","affiliation":[]},{"given":"Puiying Annie","family":"Mak","sequence":"additional","affiliation":[]},{"given":"Kaisheng","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Yingyao","family":"Zhou","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,7,2]]},"reference":[{"key":"3603_CR1","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1016\/j.tcb.2016.03.008","volume":"26","author":"MM Usaj","year":"2016","unstructured":"Usaj MM, et al. High-content screening for quantitative cell biology. Trends Cell Biol. 
2016;26:598\u2013611.","journal-title":"Trends Cell Biol"},{"key":"3603_CR2","doi-asserted-by":"publisher","first-page":"R100","DOI":"10.1186\/gb-2006-7-10-r100","volume":"7","author":"AE Carpenter","year":"2006","unstructured":"Carpenter AE, et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 2006;7:R100.","journal-title":"Genome Biol"},{"key":"3603_CR3","doi-asserted-by":"publisher","first-page":"655","DOI":"10.1039\/C5NP00113G","volume":"33","author":"V Fetz","year":"2016","unstructured":"Fetz V, Prochnow H, Bronstrup M, Sasse F. Target identification by image analysis. Nat Prod Rep. 2016;33:655\u201367.","journal-title":"Nat Prod Rep"},{"key":"3603_CR4","doi-asserted-by":"publisher","first-page":"1194","DOI":"10.1126\/science.1100709","volume":"306","author":"ZE Perlman","year":"2004","unstructured":"Perlman ZE, et al. Multidimensional drug profiling by automated microscopy. Science. 2004;306:1194\u20138.","journal-title":"Science"},{"key":"3603_CR5","doi-asserted-by":"publisher","first-page":"877","DOI":"10.1126\/science.352.6288.877","volume":"352","author":"E Pennisi","year":"2016","unstructured":"Pennisi E. \u2018Cell painting\u2019 highlights responses to drugs and toxins. Science. 2016;352:877\u20138.","journal-title":"Science"},{"key":"3603_CR6","doi-asserted-by":"publisher","first-page":"266","DOI":"10.1177\/1087057111420292","volume":"17","author":"MA Bray","year":"2012","unstructured":"Bray MA, Fraser AN, Hasaka TP, Carpenter AE. Workflow and metrics for image quality control in large-scale high-content screens. J Biomol Screen. 2012;17:266\u201374.","journal-title":"J Biomol Screen"},{"key":"3603_CR7","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1186\/s12859-018-2087-4","volume":"19","author":"SJ Yang","year":"2018","unstructured":"Yang SJ, et al. Assessing microscope image focus quality with deep learning. BMC Bioinformatics. 
2018;19:77.","journal-title":"BMC Bioinformatics"},{"key":"3603_CR8","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1038\/nmeth.4397","volume":"14","author":"JC Caicedo","year":"2017","unstructured":"Caicedo JC, et al. Data-analysis strategies for image-based cell profiling. Nat Methods. 2017;14:849\u201363.","journal-title":"Nat Methods"},{"issue":"4","key":"3603_CR9","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1016\/j.cell.2010.04.033","volume":"141","author":"SJ Altschuler","year":"2010","unstructured":"Altschuler SJ, Wu LF. Cellular heterogeneity: do differences make a difference? Cell. 2010;141(4):559\u201363.","journal-title":"Cell"},{"key":"3603_CR10","doi-asserted-by":"publisher","first-page":"482","DOI":"10.1186\/1471-2105-9-482","volume":"9","author":"TR Jones","year":"2008","unstructured":"Jones TR, et al. CellProfiler Analyst: data exploration and analysis software for complex image-based screens. BMC Bioinformatics. 2008;9:482.","journal-title":"BMC Bioinformatics"},{"key":"3603_CR11","doi-asserted-by":"publisher","first-page":"3028","DOI":"10.1093\/bioinformatics\/btp524","volume":"25","author":"P Ramo","year":"2009","unstructured":"Ramo P, Sacher R, Snijder B, Begemann B, Pelkmans L. CellClassifier: supervised learning of cellular phenotypes. Bioinformatics. 2009;25:3028\u201330.","journal-title":"Bioinformatics"},{"key":"3603_CR12","doi-asserted-by":"publisher","first-page":"3210","DOI":"10.1093\/bioinformatics\/btw390","volume":"32","author":"D Dao","year":"2016","unstructured":"Dao D, et al. CellProfiler analyst: interactive data exploration, analysis and classification of large biological image sets. Bioinformatics. 2016;32:3210\u20132.","journal-title":"Bioinformatics"},{"key":"3603_CR13","doi-asserted-by":"publisher","first-page":"328","DOI":"10.1002\/sca.4950230506","volume":"23","author":"JT Thong","year":"2001","unstructured":"Thong JT, Sim KS, Phang JC. Single-image signal-to-noise ratio estimation. Scanning. 
2001;23:328\u201336.","journal-title":"Scanning"},{"key":"3603_CR14","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825\u201330.","journal-title":"J Mach Learn Res"},{"key":"3603_CR15","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1007\/978-3-540-48247-5_28","volume":"1704","author":"MM Breunig","year":"1999","unstructured":"Breunig MM, Kriegel HP, Ng RT, Sander J. OPTICS-OF: identifying local outliers. Princ Data Min Knowl Discov. 1999;1704:262\u201370.","journal-title":"Princ Data Min Knowl Discov"},{"key":"3603_CR16","first-page":"04597","volume":"1505","author":"O Ronneberger","year":"2015","unstructured":"Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. ArXiv. 2015;1505:04597.","journal-title":"ArXiv"},{"key":"3603_CR17","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1038\/s41592-018-0261-2","volume":"16","author":"T Falk","year":"2019","unstructured":"Falk T, et al. U-net: deep learning for cell counting, detection, and morphometry. Nat Methods. 2019;16:67\u201370.","journal-title":"Nat Methods"},{"issue":"3","key":"3603_CR18","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1083\/jcb.106.3.761","volume":"106","author":"P Boukamp","year":"1988","unstructured":"Boukamp P, et al. Normal keratinization in a spontaneously immortalized aneuploid human keratinocyte cell line. J Cell Biol. 
1988;106(3):761\u201371.","journal-title":"J Cell Biol"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03603-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03603-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03603-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T23:14:31Z","timestamp":1625181271000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03603-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,2]]},"references-count":18,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3603"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03603-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,2]]},"assertion":[{"value":"5 June 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 June 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not Applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not Applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for 
publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"280"}}