{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T15:54:43Z","timestamp":1780329283980,"version":"3.54.1"},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2023,4,2]],"date-time":"2023-04-02T00:00:00Z","timestamp":1680393600000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Deutsche Forschungs Gemeinschaft","award":["398967434"],"award-info":[{"award-number":["398967434"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,4,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Machine learning has shown extensive growth in recent years and is now routinely applied to sensitive areas. To allow appropriate verification of predictive models before deployment, models must be deterministic. Solely fixing all random seeds is not sufficient for deterministic machine learning, as major machine learning libraries default to the usage of nondeterministic algorithms based on atomic operations.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Various machine learning libraries released deterministic counterparts to the nondeterministic algorithms. We evaluated the effect of these algorithms on determinism and runtime. Based on these results, we formulated a set of requirements for deterministic machine learning and developed a new software solution, the mlf-core ecosystem, which aids machine learning projects to meet and keep these requirements. We applied mlf-core to develop deterministic models in various biomedical fields including a single-cell autoencoder with TensorFlow, a PyTorch-based U-Net model for liver-tumor segmentation in computed tomography scans, and a liver cancer classifier based on gene expression profiles with XGBoost.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The complete data together with the implementations of the mlf-core ecosystem and use case models are available at https:\/\/github.com\/mlf-core.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad164","type":"journal-article","created":{"date-parts":[[2023,4,2]],"date-time":"2023-04-02T13:10:55Z","timestamp":1680441055000},"source":"Crossref","is-referenced-by-count":9,"title":["mlf-core: a framework for deterministic machine learning"],"prefix":"10.1093","volume":"39","author":[{"given":"Lukas","family":"Heumos","sequence":"first","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"},{"name":"Institute of Computational Biology, Helmholtz Zentrum M\u00fcnchen , Munich 85764, Germany"},{"name":"Institute of Lung Biology and Disease and Comprehensive Pneumology Center, Helmholtz Zentrum M\u00fcnchen, Member of the German Center for Lung Research (DZL) , Munich 81377, Germany"},{"name":"TUM School of Life Sciences Weihenstephan, Technical University of Munich , Freising 85354, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Philipp","family":"Ehmele","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Hamburg , Hamburg 20146, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6950-6929","authenticated-orcid":false,"given":"Luis","family":"Kuhn Cuellar","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kevin","family":"Menden","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Edmund","family":"Miller","sequence":"additional","affiliation":[{"name":"Department of Biological Sciences and Center for Systems Biology, The University of Texas at Dallas , Richardson, TX 75205, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Steffen","family":"Lemke","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7049-9474","authenticated-orcid":false,"given":"Gisela","family":"Gabernet","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sven","family":"Nahnsen","sequence":"additional","affiliation":[{"name":"Quantitative Biology Center (QBiC), Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72076, Germany"},{"name":"Biomedical Data Science, Department for Computer Science, Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72074, Germany"},{"name":"Institute of Bioinformatics and Medical Informatics, Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72074, Germany"},{"name":"Faculty of Medicine, Eberhard Karls University of T\u00fcbingen , T\u00fcbingen 72016, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2023,4,2]]},"reference":[{"key":"2023041120315437800_","author":"Abadi","year":"2015"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3389360","article-title":"Algorithms for efficient reproducible floating point summation","volume":"46","author":"Ahrens","year":"2020","journal-title":"ACM Trans Math Softw"},{"key":"2023041120315437800_","year":"2016"},{"key":"2023041120315437800_","article-title":"Imputation of single-cell gene expression with an autoencoder neural network","volume":"8","author":"Badsha","year":"2020","journal-title":"Quant Biol (Beijing, China)"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1001\/jama.2017.18391","article-title":"Big data and machine learning in health care","volume":"319","author":"Beam","year":"2018","journal-title":"JAMA"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.1038\/s41592-019-0458-z","article-title":"Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction","volume":"16","author":"Belthangady","year":"2019","journal-title":"Nat Methods"},{"key":"2023041120315437800_","author":"Bilic"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1016\/j.jhep.2012.12.005","article-title":"The burden of liver disease in Europe: a review of available epidemiological data","volume":"58","author":"Blachier","year":"2013","journal-title":"J Hepatol"},{"key":"2023041120315437800_","first-page":"1531","article-title":"Research on error accumulative sum of single precision floating point","volume":"33","author":"Chen","year":"2013","journal-title":"J Comput Appl"},{"key":"2023041120315437800_","first-page":"785","author":"Chen","year":"2016"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"532","DOI":"10.3390\/genes11050532","article-title":"Sparsity-Penalized stacked denoising autoencoders for imputing single-cell RNA-Seq data","volume":"11","author":"Chi","year":"2020","journal-title":"Genes"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"15497","DOI":"10.1038\/s41598-018-33860-7","article-title":"Automatic liver tumor segmentation in CT with fully convolutional neural networks and object-based postprocessing","volume":"8","author":"Chlebus","year":"2018","journal-title":"Sci Rep"},{"key":"2023041120315437800_","first-page":"424","author":"\u00c7i\u00e7ek","year":"2016"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1145\/2812803","article-title":"Repeatability in computer systems research","volume":"59","author":"Collberg","year":"2016","journal-title":"Commun ACM"},{"key":"2023041120315437800_","volume-title":"Efficient Reproducible Floating Point Summation and BLAS","author":"Demmel","year":"2016"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1016\/j.ejca.2008.10.026","article-title":"New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1)","volume":"45","author":"Eisenhauer","year":"2009","journal-title":"Eur J Cancer"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"2557","DOI":"10.1053\/j.gastro.2007.04.061","article-title":"Hepatocellular carcinoma: epidemiology and molecular carcinogenesis","volume":"132","author":"El-Serag","year":"2007","journal-title":"Gastroenterology"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat Commun"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1038\/s41587-020-0439-x","article-title":"The nf-core framework for community-curated bioinformatics pipelines","volume":"38","author":"Ewels","year":"2020","journal-title":"Nat Biotechnol"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"E14","DOI":"10.1038\/s41586-020-2766-y","article-title":"Transparency and reproducibility in artificial intelligence","volume":"586","author":"Haibe-Kains","year":"2020","journal-title":"Nature"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1132","DOI":"10.1038\/s41592-021-01256-7","article-title":"Reproducibility standards for machine learning in the life sciences","volume":"18","author":"Heil","year":"2021","journal-title":"Nat Methods"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1251","DOI":"10.1109\/TMI.2009.2013851","article-title":"Comparison and evaluation of methods for liver segmentation from CT datasets","volume":"28","author":"Heimann","year":"2009","journal-title":"IEEE Trans Med Imaging"},{"key":"2023041120315437800_","first-page":"3207","author":"Henderson"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1038\/nmeth.4662","article-title":"FateID infers cell fate bias in multipotent progenitors from single-cell RNA-seq data","volume":"15","author":"Herman","year":"2018","journal-title":"Nat Methods"},{"key":"2023041120315437800_","author":"Heumos","year":"2020"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1080\/10618600.2017.1305277","article-title":"Letter-Value plots: boxplots for large data","volume":"26","author":"Hofmann","year":"2017","journal-title":"J Comput Graph Stat"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1126\/science.359.6377.725","article-title":"Artificial intelligence faces reproducibility crisis","volume":"359","author":"Hutson","year":"2018","journal-title":"Science"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"605132","DOI":"10.3389\/fbioe.2020.605132","article-title":"RA-UNet: a hybrid deep attention-aware network to extract liver and tumor in CT scans","volume":"8","author":"Jin","year":"2020","journal-title":"Front Bioeng Biotechnol"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"5125","DOI":"10.1016\/j.eswa.2013.03.019","article-title":"Consumer credit risk: individual probability estimates using machine learning","volume":"40","author":"Kruppa","year":"2013","journal-title":"Expert Syst Appl"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1038\/s41592-019-0494-8","article-title":"scGen predicts single-cell perturbation responses","volume":"16","author":"Lotfollahi","year":"2019","journal-title":"Nat Methods"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"242","DOI":"10.3389\/fgene.2018.00242","article-title":"Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification","volume":"9","author":"Mamoshina","year":"2018","journal-title":"Front Genet"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1128","DOI":"10.1038\/s41592-021-01241-0","article-title":"The AIMe registry for artificial intelligence in biomedical research","volume":"18","author":"Matschinske","year":"2021","journal-title":"Nat Methods"},{"key":"2023041120315437800_","author":"McInnes","year":"2018"},{"key":"2023041120315437800_","article-title":"Docker: lightweight linux containers for consistent development and deployment","volume":"2014","author":"Merkel","year":"2014","journal-title":"Linux J"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1386","DOI":"10.1038\/s41592-021-01275-4","article-title":"Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms","volume":"18","author":"Moebel","year":"2021","journal-title":"Nat Methods"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"e200029","DOI":"10.1148\/ryai.2020200029","article-title":"Checklist for artificial intelligence in medical imaging (CLAIM): a guide for authors and reviewers","volume":"2","author":"Mongan","year":"2020","journal-title":"Radiol Artif Intell"},{"key":"2023041120315437800_","author":"Nagarajan"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1320","DOI":"10.1038\/s41591-020-1041-y","article-title":"Minimum information about clinical artificial intelligence modeling: the MI-CLAIM checklist","volume":"26","author":"Norgeot","year":"2020","journal-title":"Nat Med"},{"key":"2023041120315437800_","author":"Gundersen"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jbi.2017.07.010","article-title":"Reproducibility of studies on text mining for citation screening in systematic reviews: evaluation and checklist","volume":"73","author":"Olorisade","year":"2017","journal-title":"J Biomed Inform"},{"key":"2023041120315437800_","first-page":"8024","volume-title":"Advances in Neural Information Processing Systems.","author":"Paszke","year":"2019"},{"key":"2023041120315437800_","author":"Pham"},{"key":"2023041120315437800_","author":"Rocklin","year":"2015"},{"key":"2023041120315437800_","first-page":"234","author":"Ronneberger"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/s00146-014-0539-6","article-title":"Crime detection and criminal identification in India using data mining techniques","volume":"30","author":"Tayal","year":"2015","journal-title":"AI Soc"},{"key":"2023041120315437800_","author":"Toreini","year":"2020"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"e1006826","DOI":"10.1371\/journal.pcbi.1006826","article-title":"Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas","volume":"15","author":"van IJzendoorn","year":"2019","journal-title":"PLoS Comput Biol"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1016\/j.celrep.2018.03.046","article-title":"Machine learning detects pan-cancer RAS pathway activation in the cancer genome atlas","volume":"23","author":"Way","year":"2018","journal-title":"Cell Rep"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci Data"},{"key":"2023041120315437800_","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1145\/2934664","article-title":"Apache spark: a unified engine for big data processing","volume":"59","author":"Zaharia","year":"2016","journal-title":"Commun ACM"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad164\/49726584\/btad164.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/4\/btad164\/49852154\/btad164.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/4\/btad164\/49852154\/btad164.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,9]],"date-time":"2023-12-09T20:56:44Z","timestamp":1702155404000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad164\/7099608"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2023,4,1]]},"references-count":50,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,4,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad164","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,4,1]]},"published":{"date-parts":[[2023,4,1]]},"article-number":"btad164"}}