{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T07:33:09Z","timestamp":1723015989363},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning due to the huge search spaces and the delayed rewards. Learning input-specific RL policies is a more efficient alternative, but so far depends on handcrafted rewards, which are difficult to design and yield poor performance. We propose RELIS, a novel RL paradigm that learns a reward function with Learning-to-Rank (L2R) algorithms at training time and uses this reward function to train an input-specific RL policy at test time. We prove that RELIS guarantees to generate near-optimal summaries with appropriate L2R and RL algorithms. Empirically, we evaluate our approach on extractive multi-document summarisation. We show that RELIS reduces the training time by two orders of magnitude compared to the state-of-the-art models while performing on par with them.<\/jats:p>","DOI":"10.24963\/ijcai.2019\/326","type":"proceedings-article","created":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T07:46:05Z","timestamp":1564299965000},"page":"2350-2356","source":"Crossref","is-referenced-by-count":2,"title":["Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation"],"prefix":"10.24963","author":[{"given":"Yang","family":"Gao","sequence":"first","affiliation":[{"name":"Dept. of Computer Science, Royal Holloway, University of London"}]},{"given":"Christian M.","family":"Meyer","sequence":"additional","affiliation":[{"name":"Ubiquitous Knowledge Processing Lab (UKP-TUDA), Technische Universität Darmstadt"}]},{"given":"Mohsen","family":"Mesgar","sequence":"additional","affiliation":[{"name":"Ubiquitous Knowledge Processing Lab (UKP-TUDA), Technische Universität Darmstadt"}]},{"given":"Iryna","family":"Gurevych","sequence":"additional","affiliation":[{"name":"Ubiquitous Knowledge Processing Lab (UKP-TUDA), Technische Universität Darmstadt"}]}],"member":"10584","event":{"number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2019","name":"Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}","start":{"date-parts":[[2019,8,10]]},"theme":"Artificial Intelligence","location":"Macao, China","end":{"date-parts":[[2019,8,16]]}},"container-title":["Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T07:48:34Z","timestamp":1564300114000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2019\/326"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2019\/326","relation":{},"subject":[],"published":{"date-parts":[[2019,8]]}}}