{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:30:06Z","timestamp":1750221006645,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":32,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,7,25]],"date-time":"2019-07-25T00:00:00Z","timestamp":1564012800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,7,25]]},"DOI":"10.1145\/3292500.3332294","type":"proceedings-article","created":{"date-parts":[[2019,7,26]],"date-time":"2019-07-26T13:17:26Z","timestamp":1564147046000},"page":"3239-3240","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks"],"prefix":"10.1145","author":[{"given":"Charles H.","family":"Martin","sequence":"first","affiliation":[{"name":"Calculation Consulting, San Francisco, CA, USA"}]},{"given":"Michael W.","family":"Mahoney","sequence":"additional","affiliation":[{"name":"University of California, Berkeley, Berkeley, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,7,25]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"M. Advani and S. Ganguli. 2016. Statistical Mechanics of High-Dimensional Inference. Technical Report Preprint: arXiv:1601.04650.  M. Advani and S. Ganguli. 2016. Statistical Mechanics of High-Dimensional Inference. Technical Report Preprint: arXiv:1601.04650."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00220-007-0389-x"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"crossref","unstructured":"A. Auffinger and S. Tang. 2016. Extreme eigenvalues of sparse heavy tailed random matrices. 
Stochastic Processes and their Applications Vol. 126 11 (2016) 3310--3330.  A. Auffinger and S. Tang. 2016. Extreme eigenvalues of sparse heavy tailed random matrices. Stochastic Processes and their Applications Vol. 126 11 (2016) 3310--3330.","DOI":"10.1016\/j.spa.2016.04.029"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1608103113"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1039\/C7CP01108C"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"crossref","unstructured":"J. P. Bouchaud and M. Potters. 2003. Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management .Cambridge University Press.  J. P. Bouchaud and M. Potters. 2003. Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management .Cambridge University Press.","DOI":"10.1017\/CBO9780511753893"},{"volume-title":"Statistical Mechanics of Neural Networks. Ft","author":"Cowan J. D.","key":"e_1_3_2_2_7_1"},{"volume-title":"Annual Advances in Neural Information Processing Systems 27: Proceedings of the 2014 Conference. 2933--2941","author":"Dauphin Y. N.","key":"e_1_3_2_2_8_1"},{"volume-title":"Statistical mechanics of learning","author":"Engel A.","key":"e_1_3_2_2_9_1","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139164542"},{"key":"e_1_3_2_2_10_1","unstructured":"N. Golmant N. Vemuri Z. Yao V. Feinberg A. Gholami K. Rothauge M. W. Mahoney and J. Gonzalez. 2018. On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent. Technical Report. Preprint: arXiv:1811.12941.  N. Golmant N. Vemuri Z. Yao V. Feinberg A. Gholami K. Rothauge M. W. Mahoney and J. Gonzalez. 2018. On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent. Technical Report. 
Preprint: arXiv:1811.12941."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00114010"},{"volume-title":"On the distribution of the largest eigenvalue in principal components analysis. The Annals of Statistics","year":"2001","author":"Johnstone I. M.","key":"e_1_3_2_2_12_1"},{"key":"e_1_3_2_2_14_1","unstructured":"Q. Liao B. Miranda A. Banburski J. Hidary and T. Poggio. 2018. A surprising linear relationship predicts test performance in deep networks. Technical Report Preprint: arXiv:1807.09659.  Q. Liao B. Miranda A. Banburski J. Hidary and T. Poggio. 2018. A surprising linear relationship predicts test performance in deep networks. Technical Report Preprint: arXiv:1807.09659."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/0025-5564(74)90031-5"},{"key":"e_1_3_2_2_16_1","unstructured":"M. W. Mahoney. February 2019. Seminar at ACM SF-SIG. https:\/\/www.youtube.com\/watch?v=2qF8TezRwS0.  M. W. Mahoney. February 2019. Seminar at ACM SF-SIG. https:\/\/www.youtube.com\/watch?v=2qF8TezRwS0."},{"key":"e_1_3_2_2_17_1","unstructured":"M. W. Mahoney. September 2018. Seminar at Simons Institute. https:\/\/simons.berkeley.edu\/talks\/9--24-mahoney-deep-learning.  M. W. Mahoney. September 2018. Seminar at Simons Institute. https:\/\/simons.berkeley.edu\/talks\/9--24-mahoney-deep-learning."},{"key":"e_1_3_2_2_18_1","unstructured":"C. H. Martin. December 2018a. Seminar at ICSI. https:\/\/www.youtube.com\/watch?v=6Zgul4oygMc.  C. H. Martin. December 2018a. Seminar at ICSI. https:\/\/www.youtube.com\/watch?v=6Zgul4oygMc."},{"key":"e_1_3_2_2_19_1","unstructured":"C. H. Martin. June 2018b. Seminar at LBNL. https:\/\/www.youtube.com\/watch?v=_Ni5UDrVwYU.  C. H. Martin. June 2018b. Seminar at LBNL. https:\/\/www.youtube.com\/watch?v=_Ni5UDrVwYU."},{"key":"e_1_3_2_2_20_1","unstructured":"C. H. Martin and M. W. Mahoney. 2017. 
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior. Technical Report Preprint: arXiv:1710.09553.  C. H. Martin and M. W. Mahoney. 2017. Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior. Technical Report Preprint: arXiv:1710.09553."},{"key":"e_1_3_2_2_21_1","unstructured":"C. H. Martin and M. W. Mahoney. 2018. Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning. Technical Report Preprint: arXiv:1810.01075.  C. H. Martin and M. W. Mahoney. 2018. Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning. Technical Report Preprint: arXiv:1810.01075."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"crossref","unstructured":"C. H. Martin and M. W. Mahoney. 2019a. Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks. Technical Report Preprint: arXiv:1901.08278.  C. H. Martin and M. W. Mahoney. 2019a. Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks. Technical Report Preprint: arXiv:1901.08278.","DOI":"10.1137\/1.9781611976236.57"},{"volume-title":"Proceedings of the 36st International Conference on Machine Learning.","author":"Martin C. H.","key":"e_1_3_2_2_23_1"},{"volume-title":"Proceedings of the 34th International Conference on Machine Learning. 2798--2806","author":"Pennington J.","key":"e_1_3_2_2_24_1"},{"key":"e_1_3_2_2_25_1","unstructured":"J. Pennington S. S. Schoenholz and S. Ganguli. 2018. The Emergence of Spectral Universality in Deep Networks. Technical Report Preprint: arXiv:1802.09979.  J. Pennington S. S. Schoenholz and S. Ganguli. 2018. The Emergence of Spectral Universality in Deep Networks. Technical Report Preprint: arXiv:1802.09979."},{"key":"e_1_3_2_2_26_1","unstructured":"T. 
Poggio Q. Liao B. Miranda A. Banburski X. Boix and J. Hidary. 2018. Theory IIIb: Generalization in Deep Networks. Technical Report Preprint: arXiv:1806.11379.  T. Poggio Q. Liao B. Miranda A. Banburski X. Boix and J. Hidary. 2018. Theory IIIb: Generalization in Deep Networks. Technical Report Preprint: arXiv:1806.11379."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevA.45.6056"},{"key":"e_1_3_2_2_28_1","unstructured":"C. J. Shallue J. Lee J. Antognini J. Sohl-Dickstein R. Frostig and G. E. Dahl. 2018. Measuring the Effects of Data Parallelism on Neural Network Training. Technical Report. Preprint: arXiv:1811.03600.  C. J. Shallue J. Lee J. Antognini J. Sohl-Dickstein R. Frostig and G. E. Dahl. 2018. Measuring the Effects of Data Parallelism on Neural Network Training. Technical Report. Preprint: arXiv:1811.03600."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1063\/1.881142"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-04174-1"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1103\/RevModPhys.65.499"},{"key":"e_1_3_2_2_32_1","unstructured":"Z. Yao A. Gholami Q. Lei K. Keutzer and M. W. Mahoney. 2018. Hessian-based Analysis of Large Batch Training and Robustness to Adversaries. Technical Report. Preprint: arXiv:1802.08241.   Z. Yao A. Gholami Q. Lei K. Keutzer and M. W. Mahoney. 2018. Hessian-based Analysis of Large Batch Training and Robustness to Adversaries. Technical Report. Preprint: arXiv:1802.08241."},{"key":"e_1_3_2_2_33_1","unstructured":"C. Zhang S. Bengio M. Hardt B. Recht and O. Vinyals. 2016. Understanding deep learning requires rethinking generalization. Technical Report Preprint: arXiv:1611.03530.  C. Zhang S. Bengio M. Hardt B. Recht and O. Vinyals. 2016. Understanding deep learning requires rethinking generalization. 
Technical Report Preprint: arXiv:1611.03530."}],"event":{"name":"KDD '19: The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Anchorage AK USA","acronym":"KDD '19"},"container-title":["Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3292500.3332294","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3292500.3332294","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:25:56Z","timestamp":1750206356000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3292500.3332294"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,25]]},"references-count":32,"alternative-id":["10.1145\/3292500.3332294","10.1145\/3292500"],"URL":"https:\/\/doi.org\/10.1145\/3292500.3332294","relation":{},"subject":[],"published":{"date-parts":[[2019,7,25]]},"assertion":[{"value":"2019-07-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}