{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T18:16:43Z","timestamp":1757701003200,"version":"3.38.0"},"reference-count":42,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2007,8,1]],"date-time":"2007-08-01T00:00:00Z","timestamp":1185926400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2007,8]]},"abstract":"<jats:p> As more and more documents become available on the internet, finding documents that fit users' needs from databases containing millions of documents is becoming increasingly important. Since a scientific document is a structured text, it has some useful features that can be used to improve retrieval performance. In this work, we investigate three such features: fonts, position and cited references. While past research has used these three features individually to improve document searching, no existing research discusses how to integrate these three together to improve retrieval performance. This work first investigates the relationships among them, and then uses these three features to design a novel retrieval method based on the discovered relationships. Extensive experiments have been carried out with real scientific documents to show its effectiveness. Our empirical results show that using the location factor alone achieves the same performance as considering location and font factors simultaneously. We also observed that citation similarity is useful only when the similarity is high. Based on these two clues, we developed a method to combine the content vector and reference vector conditionally, and as a result, this integrated approach does, indeed, improve search performance. <\/jats:p>","DOI":"10.1177\/0165551506075324","type":"journal-article","created":{"date-parts":[[2007,4,20]],"date-time":"2007-04-20T00:13:43Z","timestamp":1177028023000},"page":"492-508","source":"Crossref","is-referenced-by-count":1,"title":["Using position, fonts and cited references to retrieve scientific documents"],"prefix":"10.1177","volume":"33","author":[{"given":"Yen-Liang","family":"Chen","sequence":"first","affiliation":[{"name":"Department of Information Management, National Central University, Taiwan, R.O.C.,"}]},{"given":"Li-Chen","family":"Cheng","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Central University, Taiwan, R.O.C."}]},{"given":"Yun-Ling","family":"Cheng","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Central University, Taiwan, R.O.C."}]}],"member":"179","published-online":{"date-parts":[[2007,8,1]]},"reference":[{"key":"atypb1","doi-asserted-by":"crossref","unstructured":"A. Odlyzko, The rapid evolution of scholarly communication, Learned Publishing 15(1) (2002) 7\u201419.","DOI":"10.1087\/095315102753303634"},{"volume-title":"Modern Information Retrieval","year":"1999","author":"R. Baeza-Yates","key":"atypb2"},{"volume-title":"Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer","year":"1988","author":"G. Salton","key":"atypb3"},{"key":"atypb4","doi-asserted-by":"publisher","DOI":"10.1145\/321439.321441"},{"volume-title":"The SMART Retrieval System","year":"1971","author":"G. Salton","key":"atypb5"},{"key":"atypb6","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"atypb7","unstructured":"G. Salton and M.J. McGill, Text analysis and automatic indexing. In: G. Salton and M.J. McGill, Introduction to Modern Information Retrieval, (McGraw Hill , New York, 1983), 52\u2014117."},{"key":"atypb8","doi-asserted-by":"publisher","DOI":"10.1108\/eb026526"},{"volume-title":"Proceedings of the 22nd Conference on Research and Development in Information Retrieval (SIGIR'99)","author":"J.L. Herlocker","key":"atypb9"},{"key":"atypb10","doi-asserted-by":"publisher","DOI":"10.1093\/applin\/7.1.57"},{"key":"atypb11","doi-asserted-by":"publisher","DOI":"10.1080\/00220671.1988.10885818"},{"key":"atypb12","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(98)00110-X"},{"key":"atypb13","doi-asserted-by":"publisher","DOI":"10.1147\/rd.24.0354"},{"volume-title":"Proceedings of the Tenth ACM Conference on Hypertext and Hypermedia: Returning to our Diverse Roots","author":"C. Chen","key":"atypb14"},{"volume-title":"Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"G. Jeh","key":"atypb15"},{"key":"atypb16","doi-asserted-by":"publisher","DOI":"10.1002\/asi.5090140103"},{"key":"atypb17","doi-asserted-by":"publisher","DOI":"10.1109\/ADL.2000.848380"},{"key":"atypb18","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630240406"},{"key":"atypb19","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(1999)50:9<799::AID-ASI9>3.0.CO;2-G"},{"volume-title":"Proceedings of Conferentie Informatiewetenschap","author":"J.G. Kircz","key":"atypb20"},{"key":"atypb21","doi-asserted-by":"publisher","DOI":"10.1087\/095315101753141365"},{"key":"atypb22","doi-asserted-by":"publisher","DOI":"10.1087\/095315102753303652"},{"key":"atypb23","doi-asserted-by":"publisher","DOI":"10.1145\/321510.321519"},{"key":"atypb24","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2003.1217626"},{"issue":"4","key":"atypb25","first-page":"399","volume":"28","author":"D.R. Radev","year":"2002","journal-title":"Linguistics"},{"volume-title":"Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics","author":"T. Yoshimi","key":"atypb26"},{"key":"atypb27","doi-asserted-by":"publisher","DOI":"10.1126\/science.178.4060.471"},{"key":"atypb28","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(198909)40:5<342::AID-ASI7>3.0.CO;2-U"},{"key":"atypb29","doi-asserted-by":"publisher","DOI":"10.1007\/BF02129604"},{"key":"atypb30","doi-asserted-by":"crossref","unstructured":"J.M. Kleinberg , Authoritative sources in a hyperlinked environment, Journal of the ACM 46(5) (1999) 604\u201432.","DOI":"10.1145\/324133.324140"},{"volume-title":"The PageRank Citation Ranking: Bringing Order to the Web. Technical Report","year":"1998","author":"L. Page","key":"atypb31"},{"volume-title":"Proceedings of the 17th National Conference on Artificial Intelligence: Workshop on Artificial Intelligence for Web Search","author":"A. Strehl","key":"atypb32"},{"volume-title":"Proceedings of the 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"S.K.M. Wong","key":"atypb33"},{"volume-title":"Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"G.W. Furnas","key":"atypb34"},{"key":"atypb35","doi-asserted-by":"publisher","DOI":"10.1137\/S0036144598347035"},{"key":"atypb36","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.01.002"},{"issue":"1","key":"atypb37","first-page":"56","volume":"42","author":"A. Kontostathis","year":"2006","journal-title":"Management"},{"issue":"6","key":"atypb38","first-page":"749","volume":"38","author":"X. Tai","year":"2002","journal-title":"Management"},{"issue":"4","key":"atypb39","first-page":"409","volume":"28","author":"S. Teufel","year":"2002","journal-title":"Linguistics"},{"volume-title":"Proceedings of the 12th International World Wide Web Conference WWW '03 International Workshop on Mobile Web Technologies WF7. World Wide Web Consortium WWW-C","author":"S. Dominich","key":"atypb40"},{"key":"atypb41","doi-asserted-by":"publisher","DOI":"10.1016\/0048-7333(89)90016-4"},{"key":"atypb42","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9639.1980.tb00369.x"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551506075324","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551506075324","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T22:51:51Z","timestamp":1740783111000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551506075324"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,8]]},"references-count":42,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2007,8]]}},"alternative-id":["10.1177\/0165551506075324"],"URL":"https:\/\/doi.org\/10.1177\/0165551506075324","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"type":"print","value":"0165-5515"},{"type":"electronic","value":"1741-6485"}],"subject":[],"published":{"date-parts":[[2007,8]]}}}