{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:57:52Z","timestamp":1760237872381,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,6,29]],"date-time":"2020-06-29T00:00:00Z","timestamp":1593388800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>This paper proposes a new method to generate edited topics or clusters to analyze images for prioritizing quality issues. The approach is associated with a new way for subject matter experts to edit the cluster definitions by \u201czapping\u201d or \u201cboosting\u201d pixels. We refer to the information entered by users or experts as \u201chigh-level\u201d data, and our model is apparently the first to allow for the possibility of errors coming from the experts. A collapsed Gibbs sampler is proposed that permits efficient processing for datasets involving tens of thousands of records. Numerical examples illustrate the benefits of the high-level data in improving accuracy, as measured by Kullback\u2013Leibler (KL) distance. The numerical examples include a tungsten inert gas example from the literature. 
In addition, a novel laser aluminum alloy image application illustrates the assignment of welds to groups that correspond to part conformance standards.<\/jats:p>","DOI":"10.3390\/informatics7030021","type":"journal-article","created":{"date-parts":[[2020,6,29]],"date-time":"2020-06-29T11:17:17Z","timestamp":1593429437000},"page":"21","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Expert Refined Topic Models to Edit Topic Clusters in Image Analysis Applied to Welding Engineering"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9522-3252","authenticated-orcid":false,"given":"Theodore T.","family":"Allen","sequence":"first","affiliation":[{"name":"Integrated Systems Engineering, The Ohio State University, 1971 Neil Avenue, Columbus, OH 43210-1271, USA"}]},{"given":"Hui","family":"Xiong","sequence":"additional","affiliation":[{"name":"Intel Corporation, 2501 NW 229th Ave, Hillsboro, OR 97124, USA"}]},{"given":"Shih-Hsien","family":"Tseng","sequence":"additional","affiliation":[{"name":"Department of Business Administration, Chung Yuan Christian University, 200 Chung Pei Road, Chung Li District, Taoyuan City 32023, Taiwan"}]}],"member":"1968","published-online":{"date-parts":[[2020,6,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1080\/00224065.2011.11917851","article-title":"A Bayesian model for integrating multiple sources of lifetime information in system-reliability assessments","volume":"43","author":"Reese","year":"2011","journal-title":"J. Qual. 
Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1198\/004017007000000227","article-title":"Special Issue on Statistics in Information Technology","volume":"49","author":"Nair","year":"2007","journal-title":"Technometrics"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1002\/qre.517","article-title":"An integrated model for statistical and vision monitoring in manufacturing transitions","volume":"19","author":"Nembhard","year":"2003","journal-title":"Qual. Reliab. Eng. Int."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/asmb.2123","article-title":"A directed topic model applied to call center improvement","volume":"32","author":"Allen","year":"2016","journal-title":"Appl. Stoch. Models Bus. Ind."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"250","DOI":"10.1287\/deca.2017.0360","article-title":"Timely decision analysis enabled by efficient social media modeling","volume":"14","author":"Allen","year":"2017","journal-title":"Decis. Anal."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1080\/00224065.2011.11917848","article-title":"A review and perspective on control charting with image data","volume":"43","author":"Megahed","year":"2011","journal-title":"J. Qual. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1080\/00224065.2011.11917856","article-title":"Analyzing the effect of process parameters on the shape of 3D profiles","volume":"43","author":"Colosimo","year":"2011","journal-title":"J. Qual. 
Technol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1080\/00401706.1997.10485116","article-title":"Monitoring wafer map data from integrated circuit fabrication processes for spatially clustered defects","volume":"39","author":"Hansen","year":"1997","journal-title":"Technometrics"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1111\/j.1467-9876.2005.00493.x","article-title":"Design and analysis of variable fidelity experimentation applied to engine valve heat treatment process design","volume":"54","author":"Huang","year":"2005","journal-title":"J. R. Stat. Soc. Ser. C (Appl. Stat.)"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1016\/j.cie.2011.01.018","article-title":"Data mining for quality control: Burr detection in the drilling process","volume":"60","author":"Ferreiro","year":"2011","journal-title":"Comput. Ind. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1016\/j.neuroimage.2017.10.034","article-title":"Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank","volume":"166","author":"Jenkinson","year":"2018","journal-title":"Neuroimage"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1080\/00224065.2010.11917805","article-title":"Simultaneous identification of premodeled and unmodeled variation patterns","volume":"42","author":"Apley","year":"2010","journal-title":"J. Qual. Technol."},{"key":"ref_13","first-page":"993","article-title":"Latent Dirichlet Allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1466","DOI":"10.1080\/01621459.2016.1174132","article-title":"Multiple imputation of missing categorical and continuous values via Bayesian mixture models with local dependence","volume":"111","author":"Murray","year":"2016","journal-title":"J. Am. Stat. 
Assoc."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1080\/01621459.2016.1255636","article-title":"Mixture models with a prior on the number of components","volume":"113","author":"Miller","year":"2018","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Van Havre, Z., White, N., Rousseau, J., and Mengersen, K. (2015). Overfitting Bayesian mixture models with an unknown number of components. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0131739"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1109\/TNNLS.2018.2844399","article-title":"Variational Bayesian learning for Dirichlet process mixture of inverted Dirichlet distributions in non-Gaussian image feature modeling","volume":"30","author":"Ma","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"690","DOI":"10.1002\/asmb.2075","article-title":"A Simple Approach for Multi-fidelity Experimentation Applied to Financial Engineering","volume":"31","author":"Tseng","year":"2015","journal-title":"Appl. Stoch. Models Bus. Ind."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1198\/004017006000000471","article-title":"Mining and tracking massive text data: Classification, construction of tracking statistics, and inference under misclassification","volume":"49","author":"Jeske","year":"2007","journal-title":"Technometrics"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1198\/004017007000000245","article-title":"Large-scale Bayesian logistic regression for text categorization","volume":"49","author":"Genkin","year":"2007","journal-title":"Technometrics"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1002\/qre.999","article-title":"Review of multinomial and multiattribute quality control charts","volume":"25","author":"Topalidou","year":"2009","journal-title":"Qual. 
Reliab. Eng. Int."},{"key":"ref_22","first-page":"17","article-title":"A correlated topic model of science","volume":"1","author":"Blei","year":"2007","journal-title":"Ann. Appl. Stat."},{"key":"ref_23","unstructured":"Blei, D.M., and Mcauliffe, J.D. (2007, January 3\u20136). Supervised topic models. Proceedings of the 20th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/01969727308546046","article-title":"A fuzzy relative of the ISODATA process and its use in detecting compact well-separated Clust","volume":"3","author":"Dunn","year":"1973","journal-title":"J. Cybern."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/S0165-0114(97)00307-2","article-title":"Detection of welding flaws from radiographic images with fuzzy clustering methods","volume":"108","author":"Liao","year":"1999","journal-title":"Fuzzy Sets Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1016\/S0952-1976(01)00032-X","article-title":"Knowledge discovery from process operational data using PCA and fuzzy clustering","volume":"14","author":"Sebzalli","year":"2001","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1109\/TASE.2014.2327029","article-title":"Image-based process monitoring using low-rank tensor decomposition","volume":"12","author":"Yan","year":"2014","journal-title":"IEEE Trans. Autom. Sci. Eng."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1080\/08982112.2017.1357077","article-title":"Jump regression, image processing, and quality control","volume":"30","author":"Qiu","year":"2018","journal-title":"Qual. Eng."},{"key":"ref_29","first-page":"1","article-title":"Learning author-topic models from text corpora","volume":"28","author":"Chemudugunta","year":"2010","journal-title":"ACM Trans. Inf. Syst. 
(TOIS)"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Ihianle, I.K., Naeem, U., Islam, S., and Tawil, A.-R. (2018). A Hybrid Approach to Recognising Activities of Daily Living from Object Use in the Home Environment. Informatics, 5.","DOI":"10.3390\/informatics5010006"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Arun, R., Suresh, V., Madhavan, V.C.E., and Murty, N.M. (2010). On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer.","DOI":"10.1007\/978-3-642-13657-3_43"},{"key":"ref_32","unstructured":"Jeffus, L., and Bower, L. (2009). Welding Skills, Processes and Practices for Entry-Level Welders, Cengage Learning."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1775","DOI":"10.1016\/j.neucom.2008.06.011","article-title":"A Density-Based Method for Adaptive LDA Model Selection","volume":"72","author":"Cao","year":"2009","journal-title":"Neurocomputing"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1192","DOI":"10.1016\/j.physa.2018.08.050","article-title":"Application of R\u00e9nyi and Tsallis entropies to topic modeling optimization","volume":"512","author":"Koltsov","year":"2018","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5228","DOI":"10.1073\/pnas.0307752101","article-title":"Finding scientific topics","volume":"101","author":"Griffiths","year":"2004","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_36","unstructured":"Waal, A.D., and Barnard, E. (2008). Evaluating topic models with stability. Human Language Technologies, Meraka Institute."},{"key":"ref_37","first-page":"74","article-title":"What is Wrong with Topic Modeling? And how to fix it using search-based software engineering","volume":"98","author":"Amritanshu","year":"2008","journal-title":"Inf. Softw. 
Technol."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Chuang, J., Roberts, M.E., Stewart, B.M., Weiss, R., Tingley, D., Grimmer, J., and Heer, J. (June, January 31). TopicCheck: Interactive alignment for assessing topic model stability. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA.","DOI":"10.3115\/v1\/N15-1018"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Koltcov, S., Nikolenko, S.I., Koltsova, O., Filippov, V., and Bodrunova, S. (2016, January 12\u201314). Stable Topic Modeling with Local Density Regularization. Proceedings of the Third International Conference on Internet Science, Florence, Italy.","DOI":"10.1007\/978-3-319-45982-0_16"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1007\/s101110200002","article-title":"Can we ever escape from data overload? A cognitive systems diagnosis","volume":"4","author":"Woods","year":"2002","journal-title":"Cogn. Technol. Work"},{"key":"ref_41","first-page":"424","article-title":"Probabilistic topic models","volume":"427","author":"Steyvers","year":"2007","journal-title":"Handb. Latent Semant. Anal."},{"key":"ref_42","first-page":"464","article-title":"Integrating out multinomial parameters in Latent Dirichlet Allocation and naive Bayes for collapsed Gibbs sampling","volume":"4","author":"Carpenter","year":"2010","journal-title":"Rapp. Tech."},{"key":"ref_43","unstructured":"(2020, May 18). Fashion MNIST. Available online: https:\/\/www.kaggle.com\/zalando-research\/fashionmnist#fashion-mnisttest.csv."},{"key":"ref_44","unstructured":"(2020, June 16). Digit Recognizer. 
Available online: https:\/\/www.kaggle.com\/c\/digit-recognizer\/data."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1016\/j.jmapro.2019.07.020","article-title":"Automated defect classification of Aluminium 5083 TIG welding using HDR camera and neural networks","volume":"45","author":"Bacioiu","year":"2019","journal-title":"J. Manuf. Process."},{"key":"ref_46","unstructured":"Mimno, D., Wallach, H.M., Talley, E., Leenders, M., and McCallum, A. Optimizing Semantic Coherence in Topic Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing."},{"key":"ref_47","unstructured":"Newman, D., Lau, J.H., Grieser, K., and Baldwin, T. (2010). Automatic Evaluation of Topic Coherence. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Association for Computational Linguistics."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/7\/3\/21\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:44:27Z","timestamp":1760175867000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/7\/3\/21"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,29]]},"references-count":47,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["informatics7030021"],"URL":"https:\/\/doi.org\/10.3390\/informatics7030021","relation":{},"ISSN":["2227-9709"],"issn-type":[{"type":"electronic","value":"2227-9709"}],"subject":[],"published":{"date-parts":[[2020,6,29]]}}}