{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T13:02:50Z","timestamp":1777899770825,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":63,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,5,16]],"date-time":"2022-05-16T00:00:00Z","timestamp":1652659200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,5,16]]},"DOI":"10.1145\/3522664.3528620","type":"proceedings-article","created":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T16:30:14Z","timestamp":1666024214000},"page":"217-228","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":35,"title":["Code smells for machine learning applications"],"prefix":"10.1145","author":[{"given":"Haiyin","family":"Zhang","sequence":"first","affiliation":[{"name":"AI for Fintech Research, ING, Amsterdam, Netherlands"}]},{"given":"Lu\u00eds","family":"Cruz","sequence":"additional","affiliation":[{"name":"Delft University of Technology, Delft, Netherlands"}]},{"given":"Arie","family":"van Deursen","sequence":"additional","affiliation":[{"name":"Delft University of Technology, Delft, Netherlands"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2017.8258038"},{"key":"e_1_3_2_1_2_1","unstructured":"International Organization for Standardization\/International Electrotechnical Commission et al. 2001. ISO\/IEC 9126--Software Engineering-Product Quality."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379597.3387473"},{"key":"e_1_3_2_1_4_1","unstructured":"MPA Haakman. 2020. 
Studying the Machine Learning Lifecycle and Improving Code Quality of Machine Learning Applications. (2020)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-09993-1"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380395"},{"key":"e_1_3_2_1_7_1","volume-title":"NIPS MLSys Workshop.","author":"Hynes Nick","year":"2017","unstructured":"Nick Hynes, D Sculley, and Michael Terry. 2017. The data linter: Lightweight, automated sanity checking for ml data sets. In NIPS MLSys Workshop."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3338955"},{"key":"e_1_3_2_1_9_1","volume-title":"MLSmellHound: A Context-Aware Code Analysis Tool. In 2022 IEEE\/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER).","author":"Kannan Jai","year":"2022","unstructured":"Jai Kannan, Scott Barnett, Andrew Simmons, Lu\u00eds Cruz, and Akash Agarwal. 2022. MLSmellHound: A Context-Aware Code Analysis Tool. In 2022 IEEE\/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)."},{"key":"e_1_3_2_1_10_1","volume-title":"Ijcai","volume":"14","author":"Ron","unstructured":"Ron Kohavi et al. 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Ijcai, Vol. 14. Montreal, Canada, 1137--1145."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2020.110610"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-65854-0_4"},{"key":"e_1_3_2_1_13_1","unstructured":"Kent Beck Martin Fowler. 2018. Refactoring: Improving the Design of Existing Code."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-017-9535-z"},{"key":"e_1_3_2_1_15_1","volume-title":"Pitfalls Analyzer: Quality Control for Model-Driven Data Science Pipelines. 
In 2019 ACM\/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems (MODELS). IEEE, 12--22","author":"Rajbahadur Gopi Krishnan","year":"2019","unstructured":"Gopi Krishnan Rajbahadur, Gustavo Ansaldi Oliva, Ahmed E Hassan, and Juergen Dingel. 2019. Pitfalls Analyzer: Quality Control for Model-Driven Data Science Pipelines. In 2019 ACM\/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems (MODELS). IEEE, 12--22."},{"key":"e_1_3_2_1_16_1","unstructured":"D. Sculley Gary Holt D. Golovin Eugene Davydov Todd Phillips D. Ebner Vinay Chaudhary M. Young J. Crespo and Dan Dennison. 2015. Hidden Technical Debt in Machine Learning Systems. In NIPS."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3382494.3410680"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2012.89"},{"key":"e_1_3_2_1_19_1","volume-title":"An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 238--250","author":"Tang Yiming","year":"2021","unstructured":"Yiming Tang, Raffi Khatchadourian, Mehdi Bagherzadeh, Rhia Singh, Ajani Stewart, and Anita Raja. 2021. An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 238--250."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/WAIN52551.2021.00011"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2013.08.002"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3213846.3213866"},{"key":"e_1_3_2_1_23_1","unstructured":"Christian Haller. My Machine Learning Model Is Perfect. URL: https:\/\/towardsdatascience.com\/my-machine-learning-model-is-perfect-9a7928e0f604"},{"key":"e_1_3_2_1_24_1","unstructured":"Cheng-Tao Chu. Machine Learning Done Wrong. 
URL: https:\/\/ml.posthaven.com\/machine-learning-done-wrong"},{"key":"e_1_3_2_1_25_1","unstructured":"What are common mistakes when working with neural networks? URL: https:\/\/www.kaggle.com\/general\/196487"},{"key":"e_1_3_2_1_26_1","unstructured":"Top 10 Coding Mistakes Made by Data Scientists. URL: https:\/\/www.kdnuggets.com\/2019\/04\/top-10-coding-mistakes-data-scientists.html"},{"key":"e_1_3_2_1_27_1","volume-title":"A PyTorch Tools, best practices & Styleguide. URL: https:\/\/github.com\/IgorSusmelj\/pytorch-styleguide","author":"Susmelj Igor","year":"2022","unstructured":"Igor Susmelj, Lucas Vandroux, Daniel Bourke (2022). A PyTorch Tools, best practices & Styleguide. URL: https:\/\/github.com\/IgorSusmelj\/pytorch-styleguide"},{"key":"e_1_3_2_1_28_1","unstructured":"EffectiveTensorflow. URL: https:\/\/github.com\/vahidk\/EffectiveTensorflow"},{"key":"e_1_3_2_1_29_1","volume-title":"Pandas Style Guide. URL: https:\/\/github.com\/joshlk\/pandas_style_guide","author":"Levy-Kramer Josh","year":"2021","unstructured":"Josh Levy-Kramer (2021). Pandas Style Guide. URL: https:\/\/github.com\/joshlk\/pandas_style_guide"},{"key":"e_1_3_2_1_30_1","unstructured":"Scikit-Learn Documentation. URL: https:\/\/scikit-learn.org\/stable\/common_pitfalls.html"},{"key":"e_1_3_2_1_31_1","unstructured":"PyTorch Documentation. Reproducibility. URL: https:\/\/pytorch.org\/docs\/stable\/notes\/randomness.html"},{"key":"e_1_3_2_1_32_1","unstructured":"Alexandra Deis. In-place Operations in PyTorch. URL: https:\/\/towardsdatascience.com\/in-place-operations-in-pytorch-f91d493e970e"},{"key":"e_1_3_2_1_33_1","unstructured":"GitHub Commit. URL: https:\/\/github.com\/bamos\/dcgan-completion.tensorflow\/commit\/e8b930501dffe01db423b6ca1c65d3ac54f27223"},{"key":"e_1_3_2_1_34_1","volume-title":"Inplace operator in Python. URL: https:\/\/www.tutorialspoint.com\/inplace-operator-in-python","author":"Sam Samual","year":"2018","unstructured":"Samual Sam (2018). Inplace operator in Python. 
URL: https:\/\/www.tutorialspoint.com\/inplace-operator-in-python"},{"key":"e_1_3_2_1_35_1","unstructured":"Github Commit - Tensor Flow. URL: https:\/\/github.com\/tensorflow\/models\/commit\/90f63a1e1653"},{"key":"e_1_3_2_1_36_1","unstructured":"Pandas Documentation. Essential basic functionality - Iteration. URL: https:\/\/pandas.pydata.org\/pandas-docs\/stable\/user_guide\/basics.html#iteration"},{"key":"e_1_3_2_1_37_1","unstructured":"Vectorization Part 2: Why and What? URL: https:\/\/www.quantifisolutions.com\/vectorization-part-2-why-and-what\/"},{"key":"e_1_3_2_1_38_1","unstructured":"Scikit-Learn Documentation. URL: https:\/\/scikit-learn.org\/stable\/modules\/preprocessing.html"},{"key":"e_1_3_2_1_39_1","unstructured":"Stack Overflow. GridSearchCV extremely slow on small dataset in scikit-learn. URL: https:\/\/stackoverflow.com\/questions\/17455302\/gridsearchcv-extremely-slow-on-small-dataset-in-scikit-learn\/23813876#23813876"},{"key":"e_1_3_2_1_40_1","unstructured":"Feature scaling. URL: https:\/\/en.wikipedia.org\/wiki\/Feature_scaling"},{"key":"e_1_3_2_1_41_1","unstructured":"TensorFlow Documentation. Backend: clear_session. URL: https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/keras\/backend\/clear_session"},{"key":"e_1_3_2_1_42_1","unstructured":"Stack Overflow. Tensorflow OOM on GPU. URL: https:\/\/stackoverflow.com\/questions\/42495930\/tensorflow-oom-on-gpu"},{"key":"e_1_3_2_1_43_1","unstructured":"Stack Overflow. Tensorflow NaN bug? URL: https:\/\/stackoverflow.com\/questions\/33712178\/tensorflow-nan-bug"},{"key":"e_1_3_2_1_44_1","unstructured":"Stack Overflow. TensorFlow's ReluGrad claims input is not finite. URL: https:\/\/stackoverflow.com\/questions\/33699174\/tensorflows-relugrad-claims-input-is-not-finite"},{"key":"e_1_3_2_1_45_1","unstructured":"Stack Overflow. Tensorflow - Convolutionary Net: Grayscale vs Black\/White training. 
URL: https:\/\/stackoverflow.com\/questions\/39487825\/tensorflow-convolutionary-net-grayscale-vs-black-white-training"},{"key":"e_1_3_2_1_46_1","unstructured":"Stack Overflow. Implement MLP in tensorflow. URL: https:\/\/stackoverflow.com\/questions\/35078027\/implement-mlp-in-tensorflow"},{"key":"e_1_3_2_1_47_1","unstructured":"Weight Initialization Techniques in Neural Networks. URL: https:\/\/towardsdatascience.com\/weight-initialization-techniques-in-neural-networks-26c649eb3b78"},{"key":"e_1_3_2_1_48_1","unstructured":"Stack Overflow. Best practices for generating a random seeds to seed Pytorch? URL: https:\/\/stackoverflow.com\/questions\/57416925\/best-practices-for-generating-a-random-seeds-to-seed-pytorch"},{"key":"e_1_3_2_1_49_1","unstructured":"Stack Overflow. Keras Regression using Scikit Learn StandardScaler with Pipeline and without Pipeline. URL: https:\/\/stackoverflow.com\/questions\/43816718\/keras-regression-using-scikit-learn-standardscaler-with-pipeline-and-without-pip\/43816833#43816833"},{"key":"e_1_3_2_1_50_1","unstructured":"Ask a Data Scientist: Data Leakage. URL: https:\/\/insidebigdata.com\/2014\/11\/26\/ask-data-scientist-data-leakage\/"},{"key":"e_1_3_2_1_51_1","unstructured":"Data Leakage. URL: https:\/\/www.kaggle.com\/alexisbcook\/data-leakage"},{"key":"e_1_3_2_1_52_1","unstructured":"Pandas Documentation. URL: https:\/\/pandas.pydata.org\/pandas-docs\/stable\/user_guide\/indexing.html#indexing-view-versus-copy"},{"key":"e_1_3_2_1_53_1","unstructured":"Stack Overflow. Extrapolate values in Pandas DataFrame. URL: https:\/\/stackoverflow.com\/questions\/22491628\/extrapolate-values-in-pandas-dataframe\/35959909#35959909"},{"key":"e_1_3_2_1_54_1","unstructured":"Stack Overflow. Why does one use of iloc() give a SettingWithCopyWarning but the other doesn't? 
URL: https:\/\/stackoverflow.com\/questions\/53806570\/why-does-one-use-of-iloc-give-a-settingwithcopywarning-but-the-other-doesnt\/53807453#53807453"},{"key":"e_1_3_2_1_55_1","unstructured":"Stack Overflow. Convert pandas dataframe to NumPy array. URL: https:\/\/stackoverflow.com\/questions\/13187778\/convert-pandas-dataframe-to-numpy-array\/54508052#54508052"},{"key":"e_1_3_2_1_56_1","unstructured":"Stack Overflow. Does np.dot automatically transpose vectors? URL: https:\/\/stackoverflow.com\/questions\/54160155\/does-np-dot-automatically-transpose-vectors\/54161169#54161169"},{"key":"e_1_3_2_1_57_1","unstructured":"Linear Algebra (numpy.dot). NumPy Documentation. URL: https:\/\/numpy.org\/doc\/stable\/reference\/generated\/numpy.dot.html#numpy.dot"},{"key":"e_1_3_2_1_58_1","unstructured":"Yuval Greenfield. Most Common Neural Net PyTorch Mistakes. URL: https:\/\/medium.com\/missinglink-deep-learning-platform\/most-common-neural-net-pytorch-mistakes-456560ada037"},{"key":"e_1_3_2_1_59_1","unstructured":"Stack Overflow. Is this a right way to train and test the model using Pytorch? URL: https:\/\/stackoverflow.com\/questions\/67066452\/is-this-a-right-way-to-train-and-test-the-model-using-pytorch\/67067242#67067242"},{"key":"e_1_3_2_1_60_1","unstructured":"Why does detach reduce the allocated memory? URL: https:\/\/discuss.pytorch.org\/t\/why-does-detach-reduce-the-allocated-memory\/43836"},{"key":"e_1_3_2_1_61_1","unstructured":"Dot product. Wikipedia. URL: https:\/\/en.wikipedia.org\/wiki\/Dot_product"},{"key":"e_1_3_2_1_62_1","unstructured":"Stack Overflow. What is the rationale for all comparisons returning false for IEEE754 NaN values? 
URL: https:\/\/stackoverflow.com\/questions\/1565164\/what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values"},{"key":"e_1_3_2_1_63_1","unstructured":"Broadcasting the good and the ugly URL: https:\/\/effectivemachinelearning.com\/PyTorch\/3._Broadcasting_the_good_and_the_ugly"}],"event":{"name":"CAIN '22: 1st Conference on AI Engineering - Software Engineering for AI","location":"Pittsburgh Pennsylvania","acronym":"CAIN '22","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering","IEEE TCSC IEEE Technical Committee on Scalable Computing"]},"container-title":["Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522664.3528620","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3522664.3528620","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:34Z","timestamp":1750183774000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522664.3528620"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,16]]},"references-count":63,"alternative-id":["10.1145\/3522664.3528620","10.1145\/3522664"],"URL":"https:\/\/doi.org\/10.1145\/3522664.3528620","relation":{},"subject":[],"published":{"date-parts":[[2022,5,16]]},"assertion":[{"value":"2022-10-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}