{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:20:43Z","timestamp":1750220443927,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,3,8]]},"DOI":"10.1145\/3437963.3441809","type":"proceedings-article","created":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T04:34:28Z","timestamp":1615005268000},"page":"301-309","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study"],"prefix":"10.1145","author":[{"given":"Dara","family":"Bahri","sequence":"first","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Yi","family":"Tay","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Che","family":"Zheng","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Cliff","family":"Brunk","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Donald","family":"Metzler","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Andrew","family":"Tomkins","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,3,8]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Tensorflow: A system for large-scale machine learning. 
In 12th {USENIX} Symposium on Operating Systems Design and Implementation (OSDI '16). 265--283.","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th {USENIX} Symposium on Operating Systems Design and Implementation (OSDI '16). 265--283."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-2311"},{"key":"e_1_3_2_1_3_1","volume-title":"The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction. arXiv preprint arXiv:1906.01733","author":"Alikaniotis Dimitrios","year":"2019","unstructured":"Dimitrios Alikaniotis and Vipul Raheja. 2019. The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction. arXiv preprint arXiv:1906.01733 (2019)."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1178"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II.","author":"Badaskar Sameer","year":"2008","unstructured":"
Sameer Badaskar, Sachin Agarwal, and Shilpa Arora. 2008. Identifying real or fake articles: Towards better language modeling. In Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II."},{"key":"e_1_3_2_1_6_1","volume-title":"Real or Fake? Learning to Discriminate Machine from Human Generated Text. arXiv preprint arXiv:1906.03351","author":"Bakhtin Anton","year":"2019","unstructured":"Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc-Aurelio Ranzato, and Arthur Szlam. 2019. Real or Fake? Learning to Discriminate Machine from Human Generated Text. arXiv preprint arXiv:1906.03351 (2019)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1935826.1935849"},{"key":"e_1_3_2_1_8_1","unstructured":"Tom B Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)."},{"key":"e_1_3_2_1_9_1","volume-title":"Efficient and effective spam filtering and re-ranking for large web datasets. Information retrieval 14, 5","author":"Cormack Gordon V","year":"2011","unstructured":"Gordon V Cormack, Mark D Smucker, and Charles LA Clarke. 2011. Efficient and effective spam filtering and re-ranking for large web datasets. 
Information retrieval 14, 5 (2011), 441--465."},{"key":"e_1_3_2_1_10_1","unstructured":"Sumanth Dathathri Andrea Madotto Janice Lan Jane Hung Eric Frank Piero Molino Jason Yosinski and Rosanne Liu. 2019. Plug and Play Language Models: a Simple Approach to Controlled Text Generation. arXiv:1912.02164 [cs.CL]"},{"key":"e_1_3_2_1_11_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_12_1","volume-title":"Hierarchical neural story generation. arXiv preprint arXiv:1805.04833","author":"Fan Angela","year":"2018","unstructured":"Angela Fan, Mike Lewis, and Yann Dauphin. 2018. Hierarchical neural story generation. arXiv preprint arXiv:1805.04833 (2018)."},{"key":"e_1_3_2_1_13_1","volume-title":"GLTR: Statistical Detection and Visualization of Generated Text. arXiv preprint arXiv:1906.04043","author":"Gehrmann Sebastian","year":"2019","unstructured":"Sebastian Gehrmann, Hendrik Strobelt, and Alexander M Rush. 2019. 
GLTR: Statistical Detection and Visualization of Generated Text. arXiv preprint arXiv:1906.04043 (2019)."},{"key":"e_1_3_2_1_14_1","volume-title":"Detection of artificial texts. RCDL?2009 Proceedings. Petrozavodsk","author":"Grechnikov EA","year":"2009","unstructured":"EA Grechnikov, GG Gusev, AA Kustarev, and AM Raigorodsky. 2009. Detection of artificial texts. RCDL?2009 Proceedings. Petrozavodsk (2009), 306--308."},{"key":"e_1_3_2_1_15_1","unstructured":"Karl Moritz Hermann Tomas Kocisky Edward Grefenstette Lasse Espeholt Will Kay Mustafa Suleyman and Phil Blunsom. 2015. Teaching machines to read and comprehend. In Advances in neural information processing systems. 1693--1701."},{"key":"e_1_3_2_1_16_1","volume-title":"The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751","author":"Holtzman Ari","year":"2019","unstructured":"Ari Holtzman, Jan Buys, Maxwell Forbes, and Yejin Choi. 2019. The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751 (2019)."},{"volume-title":"Write with transformer","year":"2019","key":"e_1_3_2_1_17_1","unstructured":"Huggingface. 2019. Write with transformer. 2019. (2019). https:\/\/transformer.huggingface.co\/"},{"key":"e_1_3_2_1_18_1","volume-title":"Ctrl: A conditional transformer language model for controllable generation. 
arXiv preprint arXiv:1909.05858","author":"Keskar Nitish Shirish","year":"2019","unstructured":"Nitish Shirish Keskar, Bryan McCann, Lav R Varshney, Caiming Xiong, and Richard Socher. 2019. Ctrl: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858 (2019)."},{"key":"e_1_3_2_1_19_1","volume-title":"Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. arXiv preprint arXiv:1903.06059","author":"Kool Wouter","year":"2019","unstructured":"Wouter Kool, Herke Van Hoof, and Max Welling. 2019. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. arXiv preprint arXiv:1903.06059 (2019)."},{"key":"e_1_3_2_1_20_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1464"},{"key":"e_1_3_2_1_22_1","volume-title":"CEAS","volume":"17","author":"Metsis Vangelis","year":"2006","unstructured":"Vangelis Metsis, Ion Androutsopoulos, and Georgios Paliouras. 
2006. Spam filtering with naive bayes-which naive bayes?. In CEAS, Vol. 17. Mountain View, CA, 28--69."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1135777.1135794"},{"key":"e_1_3_2_1_24_1","volume-title":"Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 311--318","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 311--318."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_3_2_1_26_1","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. (2018)."},{"key":"e_1_3_2_1_27_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. (2019)."},{"key":"e_1_3_2_1_28_1","volume-title":"The Next Word. 
The New Yorker","author":"Seabrook John","year":"2019","unstructured":"John Seabrook. 2019. The Next Word. The New Yorker (2019), 52--63."},{"key":"e_1_3_2_1_29_1","volume-title":"Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203","author":"Solaiman Irene","year":"2019","unstructured":"Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, and Jasmine Wang. 2019. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203 (2019)."},{"volume-title":"Autocompletion with deep learning","year":"2019","key":"e_1_3_2_1_30_1","unstructured":"TabNine. 2019. Autocompletion with deep learning. 2019. (2019). https:\/\/tabnine.com\/blog\/deep\/"},{"key":"e_1_3_2_1_31_1","volume-title":"Reverse Engineering Configurations of Neural Text Generation Models. arXiv preprint arXiv:2004.06201","author":"Tay Yi","year":"2020","unstructured":"Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, and Andrew Tomkins. 2020. Reverse Engineering Configurations of Neural Text Generation Models. arXiv preprint arXiv:2004.06201 (2020)."},{"key":"e_1_3_2_1_32_1","volume-title":"Diverse beam search: Decoding diverse solutions from neural sequence models. 
arXiv preprint arXiv:1610.02424","author":"Vijayakumar Ashwin K","year":"2016","unstructured":"Ashwin K Vijayakumar, Michael Cogswell, Ramprasath R Selvaraju, Qing Sun, Stefan Lee, David Crandall, and Dhruv Batra. 2016. Diverse beam search: Decoding diverse solutions from neural sequence models. arXiv preprint arXiv:1610.02424 (2016)."},{"key":"e_1_3_2_1_33_1","unstructured":"Nick Walton. 2019. AI Dungeon. 2019. (2019). http:\/\/www.aidungeon.io\/"},{"key":"e_1_3_2_1_34_1","volume-title":"GPT-2 Neural Network Poetry","author":"Walton Nick","year":"2019","unstructured":"Nick Walton. 2019. GPT-2 Neural Network Poetry. 2019. (2019). https:\/\/www.gwern.net\/GPT-2"},{"key":"e_1_3_2_1_35_1","volume-title":"Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319","author":"Welleck Sean","year":"2019","unstructured":"Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, and Jason Weston. 2019. Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319 (2019)."},{"key":"e_1_3_2_1_36_1","volume-title":"HuggingFace's Transformers: State-of-the-art Natural Language Processing. ArXiv abs\/1910.03771","author":"Wolf Thomas","year":"2019","unstructured":"Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, R\u00e9mi Louf, Morgan Funtowicz, and Jamie Brew. 2019. 
HuggingFace's Transformers: State-of-the-art Natural Language Processing. ArXiv abs\/1910.03771 (2019)."},{"key":"e_1_3_2_1_37_1","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)."},{"key":"e_1_3_2_1_38_1","volume-title":"Defending Against Neural Fake News. arXiv preprint arXiv:1905.12616","author":"Zellers Rowan","year":"2019","unstructured":"Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, and Yejin Choi. 2019. Defending Against Neural Fake News. 
arXiv preprint arXiv:1905.12616 (2019)."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209978.3210080"}],"event":{"name":"WSDM '21: The Fourteenth ACM International Conference on Web Search and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Virtual Event Israel","acronym":"WSDM '21"},"container-title":["Proceedings of the 14th ACM International Conference on Web Search and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437963.3441809","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3437963.3441809","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:36Z","timestamp":1750193256000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437963.3441809"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,8]]},"references-count":39,"alternative-id":["10.1145\/3437963.3441809","10.1145\/3437963"],"URL":"https:\/\/doi.org\/10.1145\/3437963.3441809","relation":{},"subject":[],"published":{"date-parts":[[2021,3,8]]},"assertion":[{"value":"2021-03-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}