{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,11]],"date-time":"2025-04-11T13:48:29Z","timestamp":1744379309892},"reference-count":0,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,2,22]],"date-time":"2018-02-22T00:00:00Z","timestamp":1519257600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGIR Forum"],"published-print":{"date-parts":[[2018,2,22]]},"abstract":"<jats:p>Document understanding for the purpose of assessing the relevance of a document or passage to a query based only on the document content appears to be a familiar goal for information retrieval community, however, this problem has remained largely intractable, despite repeated attacks over many years. This is while people are able to assess the relevance quite well, though unfamiliar topics and complex documents can defeat them. This assessment may require the ability to understand language, images, document structure, videos, audio, and functional elements. In turn, understanding of these elements is built on background information about the world, such as human behavior patterns, and even more fundamental truths such as the existence of time, space, and people. All this comes naturally to people, but not to computers.<\/jats:p>\n          <jats:p>Recently, large-scale machine learning has altered the landscape. Deep learning has greatly advanced machine understanding of images and language. Since document and query understanding incorporate these elements, deep learning can hold great promise. But it comes with a drawback: general purpose representations (like CNNs for images) have proved somewhat elusive for text. In particular, embeddings act as a distributed representation not just of semantic information but also application-specific learning, which are hard to transfer. In short, conditions seem right for a renewed attempt on the fundamental document understanding problem.<\/jats:p>","DOI":"10.1145\/3190580.3190585","type":"journal-article","created":{"date-parts":[[2018,2,23]],"date-time":"2018-02-23T16:40:01Z","timestamp":1519404001000},"page":"27-31","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Toward Document Understanding for Information Retrieval"],"prefix":"10.1145","volume":"51","author":[{"given":"Mostafa","family":"Dehghani","sequence":"first","affiliation":[{"name":"University of Amsterdam"}]}],"member":"320","published-online":{"date-parts":[[2018,2,22]]},"container-title":["ACM SIGIR Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3190580.3190585","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,18]],"date-time":"2023-04-18T04:18:10Z","timestamp":1681791490000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3190580.3190585"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,22]]},"references-count":0,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,2,22]]}},"alternative-id":["10.1145\/3190580.3190585"],"URL":"https:\/\/doi.org\/10.1145\/3190580.3190585","relation":{},"ISSN":["0163-5840"],"issn-type":[{"value":"0163-5840","type":"print"}],"subject":[],"published":{"date-parts":[[2018,2,22]]},"assertion":[{"value":"2018-02-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}