{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T02:45:02Z","timestamp":1752461102221},"reference-count":0,"publisher":"Cambridge University Press (CUP)","issue":"4","license":[{"start":{"date-parts":[[1996,12,1]],"date-time":"1996-12-01T00:00:00Z","timestamp":849398400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[1996,12]]},"abstract":"<jats:p>The paper presents background and motivation for a processing model that segments discourse into units that are simple, non-nested clauses, prior to the recognition of clause internal phrasal constituents, and experimental results in support of this model. One set of results is derived from a statistical reanalysis of the Swedish empirical data in Strangert, Ejerhed and Huber 1993 concerning the linguistic structure of major prosodic units. The other set of results is derived from experiments in segmenting part of speech annotated Swedish text corpora into clauses, using a new clause segmentation algorithm. The clause segmented corpus data is taken from the Stockholm Ume\u00e5 Corpus (SUC), 1 M words of Swedish texts from different genres, part of speech annotated by hand, and from the Ume\u00e5 corpus DAGENS INDUSTRI 1993 (DI93), 5 M words of Swedish financial newspaper text, processed by fully automatic means consisting of tokenizing, lexical analysis, and probabilistic POS tagging. The results of these two experiments show that the proposed clause segmentation algorithm is 96% correct when applied to manually tagged text, and 91% correct when applied to probabilistically tagged text.<\/jats:p>","DOI":"10.1017\/s1351324997001629","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T13:36:03Z","timestamp":1027776963000},"page":"355-364","source":"Crossref","is-referenced-by-count":3,"title":["Finite state segmentation of discourse into clauses"],"prefix":"10.1017","volume":"2","author":[{"given":"EVA","family":"EJERHED","sequence":"first","affiliation":[]}],"member":"56","published-online":{"date-parts":[[1996,12,1]]},"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324997001629","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,12]],"date-time":"2019-05-12T19:47:39Z","timestamp":1557690459000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324997001629\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1996,12]]},"references-count":0,"journal-issue":{"issue":"4","published-print":{"date-parts":[[1996,12]]}},"alternative-id":["S1351324997001629"],"URL":"https:\/\/doi.org\/10.1017\/s1351324997001629","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[1996,12]]}}}