{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T02:51:32Z","timestamp":1765507892439,"version":"3.48.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2026,1,31]]},"abstract":"<jats:p>\n                    <jats:italic toggle=\"yes\">Aim<\/jats:italic>\n                    . The code review team at Meta is continuously improving the code review process. In this work, we report on three randomized controlled experimental trials to improve code reviewer recommendation.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:italic toggle=\"yes\">Method<\/jats:italic>\n                    . To evaluate the recommenders, we conduct three A\/B tests, which are a type of randomized controlled experimental trial. The experimental unit is either the code diff (Meta\u2019s term for a pull request) or all the diffs that an author creates during the experimental period. We set goal metrics, i.e., those we expect to improve, and guardrail metrics, i.e., those that we do not want to negatively impact, analogous to safety metrics in medical trials. We test the outcomes using a\n                    <jats:italic toggle=\"yes\">t<\/jats:italic>\n                    -test, Wilcoxon test, or Fisher test, depending on the type of data.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:italic toggle=\"yes\">Expt. 1<\/jats:italic>\n                    . We developed a new recommender,\n                    <jats:monospace>RevRecV2<\/jats:monospace>\n                    , based on features that had been successfully used in the literature and that could be calculated with low latency. 
In an A\/B test on 82k diffs in Spring 2022, we found that the new recommender was more accurate and had lower latency. The new recommender did not impact the amount of time a diff was under review. The results allowed us to roll out the recommender to all of Meta in Summer 2022.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:italic toggle=\"yes\">Expt. 2<\/jats:italic>\n                    . Reviewer workload is not evenly distributed; our goal was to reduce the workload of top reviewers. Based on the literature and using historical data, we conducted backtests to determine the best measure of reviewer workload. We then ran an A\/B test on 28k diff authors in Winter 2023 on a workload-balanced recommender,\n                    <jats:monospace>RevRecWL<\/jats:monospace>\n                    . Our A\/B test led to mixed results. When a low-workload reviewer had reasonable expertise, authors selected them; however, the top-recommended low-workload reviewer was often not selected. There was no impact on our guardrail metric, the amount of time to perform a review. This workload-balancing recommender replaced the recommender from the first experiment in production at Meta.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:italic toggle=\"yes\">Expt. 3<\/jats:italic>\n                    . Engineers at Meta often select a team rather than an individual reviewer to review a diff. We suspected the bystander effect might be slowing down reviews of these diffs because no single individual was assigned the review. On diffs that only had a team assigned, we randomly selected one of the top three recommended reviewers to review the diff with\n                    <jats:monospace>BystanderRecRnd<\/jats:monospace>\n                    . We conducted an A\/B test on 12.5k authors in Spring 2023 and found a large decrease in the amount of time it took for diffs to be reviewed. 
We did not find that reviewers rushed reviews. The results were strong enough to roll this recommender out to all diffs that only have a team assigned for review.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:italic toggle=\"yes\">Implications<\/jats:italic>\n                    . Beyond our direct findings, our results suggest there can be a discrepancy between historical backtesting and A\/B test experimental findings, and that more A\/B tests are necessary to test recommenders in production. Outcome measures beyond accuracy are important; this is especially true in understanding how recommenders change a reviewer\u2019s workload. We also see that the latency in displaying a recommendation can have a large impact on how often authors select recommendations, making the reporting of latency an important metric for future work.\n                  <\/jats:p>","DOI":"10.1145\/3736405","type":"journal-article","created":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T12:38:41Z","timestamp":1748349521000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving Code Reviewer Recommendation: Accuracy, Latency, Workload, and Bystanders"],"prefix":"10.1145","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1137-4297","authenticated-orcid":false,"given":"Peter C.","family":"Rigby","sequence":"first","affiliation":[{"name":"Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada\u00a0and Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-1587-0578","authenticated-orcid":false,"given":"Seth","family":"Rogers","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-2848-3442","authenticated-orcid":false,"given":"Sadruddin","family":"Saleem","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0087-8759","authenticated-orcid":false,"given":"Parth","family":"Suresh","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-0710-7906","authenticated-orcid":false,"given":"Daniel","family":"Suskin","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-9549-8695","authenticated-orcid":false,"given":"Patrick","family":"Riggs","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9432-1045","authenticated-orcid":false,"given":"Chandra","family":"Maddila","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1358-4124","authenticated-orcid":false,"given":"Nachiappan","family":"Nagappan","sequence":"additional","affiliation":[{"name":"Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7987-7598","authenticated-orcid":false,"given":"Audris","family":"Mockus","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, Knoxville, Tennessee, USA\u00a0and Meta Platforms Inc, Menlo\u00a0Park, California, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,12,11]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3416508.3417115"},{"issue":"3","key":"e_1_3_2_3_2","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1109\/TSE.2008.89","article-title":"Variability and reproducibility in software engineering: A study of four 
companies that developed the same system","volume":"35","author":"Anda Bente C. D.","year":"2008","unstructured":"Bente C. D. Anda, Dag I. K. Sj\u00f8berg, and Audris Mockus. 2008. Variability and reproducibility in software engineering: A study of four companies that developed the same system. IEEE Transactions on Software Engineering 35, 3 (2008), 407\u2013429.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3340449"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.5555\/2486788.2486882"},{"key":"e_1_3_2_6_2","first-page":"931","volume-title":"Proceedings of the 2013 International Conference on Software Engineering","author":"Vipin Balachandran","year":"2013","unstructured":"Vipin Balachandran. 2013. Reducing human effort and improving quality in peer code reviews using automatic static analysis and reviewer recommendation. In Proceedings of the 2013 International Conference on Software Engineering. IEEE Press, 931\u2013940."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/WCRE.2013.6671287"},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1145\/2025113.2025119","volume-title":"Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering","author":"Bird Christian","year":"2011","unstructured":"Christian Bird, Nachiappan Nagappan, Brendan Murphy, Harald Gall, and Premkumar Devanbu. 2011. Don\u2019t touch my code!: Examining the effects of ownership on software quality. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. 
ACM, 4\u201314."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2576451"},{"issue":"23","key":"e_1_3_2_10_2","first-page":"81","article-title":"From RankNet to LambdaRank to LambdaMART: An overview","volume":"11","author":"Burges Christopher J. C.","year":"2010","unstructured":"Christopher J. C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An overview. Learning 11, 23\u2013581 (2010), 81.","journal-title":"Learning"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3558945"},{"key":"e_1_3_2_12_2","first-page":"117","volume-title":"Best Kept Secrets of Peer Code Review","author":"Cohen J.","year":"2006","unstructured":"J. Cohen. 2006. Best Kept Secrets of Peer Code Review. Smart Bear Inc., Austin, TX, 117."},{"key":"e_1_3_2_13_2","volume-title":"Experimental and Quasi-Experimental Designs for Generalized Causal Inference","author":"Cook Thomas D.","year":"2002","unstructured":"Thomas D. Cook, Donald Thomas Campbell, and William Shadish. 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Vol. 1195, Houghton Mifflin, Boston, MA."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1147\/sj.153.0182"},{"issue":"4","key":"e_1_3_2_15_2","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1037\/a0023304","article-title":"The bystander-effect: A meta-analytic review on bystander intervention in dangerous and non-dangerous emergencies","volume":"137","author":"Fischer Peter","year":"2011","unstructured":"Peter Fischer, Joachim I. Krueger, Tobias Greitemeyer, Claudia Vogrincic, Andreas Kastenm\u00fcller, Dieter Frey, Moritz Heene, Magdalena Wicher, and Martina Kainbacher. 2011. The bystander-effect: A meta-analytic review on bystander intervention in dangerous and non-dangerous emergencies. 
Psychological Bulletin 137, 4 (2011), 517.","journal-title":"Psychological Bulletin"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/2786805.2786870"},{"key":"e_1_3_2_17_2","first-page":"341","volume-title":"Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering","author":"Fritz Thomas","year":"2007","unstructured":"Thomas Fritz, Gail C. Murphy, and Emily Hill. 2007. Does a programmer\u2019s activity indicate knowledge of code? In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering. ACM, 341\u2013350."},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1109\/IWPSE.2005.21","volume-title":"8th International Workshop on Principles of Software Evolution (IWPSE \u201905)","author":"Girba Tudor","year":"2005","unstructured":"Tudor Girba, Adrian Kuhn, Mauricio Seeberger, and St\u00e9phane Ducasse. 2005. How developers drive software evolution. In Proceedings of the 8th International Workshop on Principles of Software Evolution (IWPSE \u201905). IEEE, 113\u2013122."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","unstructured":"Laura MacLeod, Michaela Greiler, Margaret-Anne Storey, Christian Bird, and Jacek Czerwonka. 2018. Code reviewing in the trenches: Understanding challenges, best practices, and tool needs. IEEE Software 35, 4 (2018), 34\u201342. DOI: 10.1109\/MS.2017.265100500","DOI":"10.1109\/MS.2017.265100500"},{"key":"e_1_3_2_20_2","volume-title":"Balance Expertise, Workload and Turnover into Code Review Recommendation","author":"Hajari Fahimeh","year":"2022","unstructured":"Fahimeh Hajari. 2022. Balance Expertise, Workload and Turnover into Code Review Recommendation. Ph.\u2009D. Dissertation. 
Concordia University."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2024.3366753"},{"key":"e_1_3_2_22_2","doi-asserted-by":"crossref","DOI":"10.1002\/9780470691922","volume-title":"Randomized Controlled Trials: Questions, Answers and Musings","author":"Jadad Alejandro R.","year":"2007","unstructured":"Alejandro R. Jadad and Murray W. Enkin. 2007. Randomized Controlled Trials: Questions, Answers and Musings. John Wiley & Sons."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2016.10.006"},{"key":"e_1_3_2_24_2","first-page":"1","volume-title":"Proceedings of the 4th International Workshop on Conducting Empirical Studies in Industry","author":"Juristo Natalia","year":"2016","unstructured":"Natalia Juristo. 2016. Experiences conducting experiments in industry: The ESEIL FiDiPro project. In Proceedings of the 4th International Workshop on Conducting Empirical Studies in Industry, 1\u20133."},{"key":"e_1_3_2_25_2","volume-title":"Basics of Software Engineering Experimentation","author":"Juristo Natalia","year":"2013","unstructured":"Natalia Juristo and Ana M. Moreno. 2013. Basics of Software Engineering Experimentation. Springer Science & Business Media."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1017\/9781108653985"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2868367"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/2661685.2661687"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380335"},{"key":"e_1_3_2_30_2","first-page":"503","volume-title":"Proceedings of the 24th International Conference on Software Engineering (ICSE \u201902)","author":"Mockus Audris","year":"2002","unstructured":"Audris Mockus and James D. Herbsleb. 2002. Expertise browser: A quantitative approach to identifying expertise. In Proceedings of the 24th International Conference on Software Engineering (ICSE \u201902). 
IEEE, 503\u2013512."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579527"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/268411.268421"},{"key":"e_1_3_2_33_2","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1145\/1985793.1985860","volume-title":"Proceedings of the 33rd International Conference on Software Engineering","author":"Rahman Foyzur","year":"2011","unstructured":"Foyzur Rahman and Premkumar Devanbu. 2011. Ownership, experience and defects: A fine-grained study of authorship. In Proceedings of the 33rd International Conference on Software Engineering. ACM, 491\u2013500."},{"key":"e_1_3_2_34_2","first-page":"222","volume-title":"Proceedings of the 2016 IEEE\/ACM 38th International Conference on Software Engineering Companion (ICSE-C)","author":"Rahman Mohammad Masudur","year":"2016","unstructured":"Mohammad Masudur Rahman, Chanchal K. Roy, and Jason A. Collins. 2016. CORRECT: Code reviewer recommendation in GitHub based on cross-project and technology experience. In Proceedings of the 2016 IEEE\/ACM 38th International Conference on Software Engineering Companion (ICSE-C). IEEE, 222\u2013231."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/2491411.2491444"},{"issue":"4","key":"e_1_3_2_36_2","first-page":"35","article-title":"Peer review on open-source software projects: Parameters, statistical models, and theory","volume":"23","author":"Rigby Peter C.","year":"2014","unstructured":"Peter C. Rigby, Daniel M. German, Laura Cowen, and Margaret-Anne Storey. 2014. Peer review on open-source software projects: Parameters, statistical models, and theory. 
ACM Transactions on Software Engineering and Methodology 23, 4 (2014), 35.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_3_2_37_2","first-page":"541","volume-title":"Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE)","author":"Rigby Peter C.","year":"2011","unstructured":"Peter C. Rigby and Margaret-Anne Storey. 2011. Understanding broadcast based peer review on open source software projects. In Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE). IEEE, 541\u2013550."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-018-9646-1"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3183519.3183525"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2005.97"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377813.3381365"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884852"},{"key":"e_1_3_2_43_2","first-page":"141","volume-title":"Proceedings of the 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","author":"Thongtanunam Patanamon","year":"2015","unstructured":"Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Raula Gaikovina Kula, Norihiro Yoshida, Hajimu Iida, and Ken-ichi Matsumoto. 2015. Who should review my code? A file location-based code-reviewer recommendation approach for modern code review. In Proceedings of the 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). 
IEEE, 141\u2013150."},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-29044-2"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2016.01.004"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2015.2500238"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP58684.2023.00020"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3736405","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T02:50:19Z","timestamp":1765507819000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3736405"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,11]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,31]]}},"alternative-id":["10.1145\/3736405"],"URL":"https:\/\/doi.org\/10.1145\/3736405","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"type":"print","value":"1049-331X"},{"type":"electronic","value":"1557-7392"}],"subject":[],"published":{"date-parts":[[2025,12,11]]},"assertion":[{"value":"2023-12-28","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-21","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}