{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T05:10:20Z","timestamp":1695877820605},"reference-count":26,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2006,10,30]],"date-time":"2006-10-30T00:00:00Z","timestamp":1162166400000},"content-version":"vor","delay-in-days":6511,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Softw Pract Exp"],"published-print":{"date-parts":[[1989,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, several authors have presented algorithms that locate instances of a given string, or set of strings, within a text. Recently, authors have given less consideration to the complementary problem of processing a text to find out what strings appear in the text, without any preconceived notion of what strings might be present. A system called PATRICIA, which was developed two decades ago, is an implementation of a solution to this problem. The design of PATRICIA is very tightly bound to the assumptions that individual string elements are bits and that the user of the system can provide complete lists of starting and stopping places for strings. This paper presents an approach that drops these assumptions. Our method allows different definitions of indivisible string elements for different applications, and the only information the user provides for the determination of the beginning and ends of strings is a specification of a maximum length for output strings.<\/jats:p><jats:p>This paper also describes a portable C implementation of the method, called PORTREP. The primary data structure of PORTREP is a trie represented as a ternary tree. 
PORTREP has a method for eliminating redundancy from the output, and it can function with a bounded number of nodes by employing a heuristic process that reuses seldom\u2010visited nodes. Theoretical analysis and empirical studies, reported here, give confidence in the efficiency of the algorithms. PORTREP has the ability to form the basis for a variety of text\u2010analysis applications, and this paper considers one such application, automatic document indexing.<\/jats:p>","DOI":"10.1002\/spe.4380190107","type":"journal-article","created":{"date-parts":[[2006,11,17]],"date-time":"2006-11-17T20:54:12Z","timestamp":1163796852000},"page":"63-77","source":"Crossref","is-referenced-by-count":0,"title":["Portrep: A portable repeated string finder"],"prefix":"10.1002","volume":"19","author":[{"given":"Leslie P.","family":"Jones","sequence":"first","affiliation":[]},{"suffix":"Jr.","given":"Edward W.","family":"Gassie","sequence":"additional","affiliation":[]},{"given":"Sridhar","family":"Radhakrishnan","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2006,10,30]]},"reference":[{"key":"e_1_2_1_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(79)90041-0"},{"key":"e_1_2_1_3_2","doi-asserted-by":"publisher","DOI":"10.1137\/0206024"},{"key":"e_1_2_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/360825.360855"},{"key":"e_1_2_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/359842.359859"},{"key":"e_1_2_1_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/0096-3003(80)90037-5"},{"key":"e_1_2_1_7_2","volume-title":"LEX\u2014a lexical analyzer generator","author":"Lesk M.","year":"1975"},{"key":"e_1_2_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/321479.321481"},{"key":"e_1_2_1_9_2","volume-title":"Fundamentals of Data Structures","author":"Horowitz E.","year":"1982"},{"key":"e_1_2_1_10_2","volume-title":"The C Programming Language","author":"Kernighan B. 
W.","year":"1975"},{"key":"e_1_2_1_11_2","article-title":"INDEX: the statistical basis for an automatic conceptual phrase\u2010indexing system","author":"Jones L.","journal-title":"J. ASIS."},{"key":"e_1_2_1_12_2","doi-asserted-by":"crossref","unstructured":"L. Jones, C. de Bessonet and S. Kundu, \u2018ALLOY: an amalgamation of expert linguistic and statistical indexing methods\u2019, Proceedings of the Eleventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Grenoble, France, 1988.","DOI":"10.1145\/62437.62450"},{"key":"e_1_2_1_13_2","volume-title":"Louisiana Civil Code","year":"1986"},{"key":"e_1_2_1_14_2","volume-title":"Introduction to Modern Information Retrieval","author":"Salton G.","year":"1983"},{"key":"e_1_2_1_15_2","doi-asserted-by":"crossref","unstructured":"J. Fagan, \u2018Automatic phrase indexing for document retrieval: an examination of syntactic and nonsyntactic methods\u2019, Proceedings of the Tenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, 1987, pp. 91\u2013101.","DOI":"10.1145\/42005.42016"},{"key":"e_1_2_1_16_2","doi-asserted-by":"crossref","unstructured":"W. B. Croft and David D. Lewis, \u2018An approach to natural language processing for document retrieval\u2019, Proceedings of the Tenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, 1987.","DOI":"10.1145\/42005.42009"},{"issue":"2","key":"e_1_2_1_17_2","first-page":"99","article-title":"FASIT: a fully automatic syntactically based indexing system","volume":"32","author":"Dillon M.","year":"1983","journal-title":"J. 
ASIS"},{"key":"e_1_2_1_18_2","doi-asserted-by":"crossref","unstructured":"A. Smeaton, \u2018Incorporating syntactic information into a document retrieval strategy: an investigation\u2019, Proceedings of the Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Pisa, Italy, 1986.","DOI":"10.1145\/253168.253193"},{"key":"e_1_2_1_19_2","doi-asserted-by":"crossref","unstructured":"R. Tong, L. Appelbaum, V. Askman and J. Cunningha, \u2018Conceptual information retrieval using RUBRIC\u2019, Proceedings of the Tenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, 1987.","DOI":"10.1145\/42005.42033"},{"key":"e_1_2_1_20_2","volume-title":"High Technology Law Journal","year":"1987"},{"key":"e_1_2_1_21_2","unstructured":"Proceedings of the First International Conference on Artificial Intelligence and Law, Boston, Massachusetts, 1987."},{"key":"e_1_2_1_22_2","unstructured":"D. Kraft and B. Boyce, Techniques for the Evaluation of Decision Alternatives in Libraries and Information Agencies, chapter 3, in preparation"},{"key":"e_1_2_1_23_2","unstructured":"S. Cater, The topological information retrieval system and the topological paradigm: a unification of the major models in information retrieval, Ph.D. Thesis, Department of Computer Science, Louisiana State University, 1986."},{"key":"e_1_2_1_24_2","volume-title":"The INTEGRAL family of reversible compressors","author":"de Maine P. A. 
D."},{"key":"e_1_2_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/JRPROC.1952.273898"},{"key":"e_1_2_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1977.1055714"},{"key":"e_1_2_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/214762.214771"}],"container-title":["Software: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fspe.4380190107","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.4380190107","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,27]],"date-time":"2023-09-27T08:23:10Z","timestamp":1695802990000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/spe.4380190107"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1989,1]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[1989,1]]}},"alternative-id":["10.1002\/spe.4380190107"],"URL":"https:\/\/doi.org\/10.1002\/spe.4380190107","archive":["Portico"],"relation":{},"ISSN":["0038-0644","1097-024X"],"issn-type":[{"value":"0038-0644","type":"print"},{"value":"1097-024X","type":"electronic"}],"subject":[],"published":{"date-parts":[[1989,1]]}}}